Department of Computer Science

Thursday, June 25, 2009

Algorithm

For a detailed presentation of the various points of view around the definition of "algorithm" see Algorithm characterizations. For examples of simple addition algorithms specified in the detailed manner described in Algorithm characterizations, see Algorithm examples.
While there is no generally accepted formal definition of "algorithm", an informal definition could be "a process that performs some sequence of operations." For some people, a program is only an algorithm if it stops eventually. For others, a program is only an algorithm if it stops before a given number of calculation steps.
A prototypical example of an "algorithm" is Euclid's algorithm to determine the greatest common divisor of two integers (X and Y) which are greater than one. We follow a series of steps: In step 1, we divide X by Y and find the remainder, which we call R1. If R1=0, we stop and say that Y is the greatest common divisor of X and Y. Otherwise we move to step 2, where we divide Y by R1 and find the remainder, which we call R2. If R2=0, we stop and say that R1 is the greatest common divisor of X and Y. If not, we continue, dividing each remainder by the one that follows it, until Rn=0. Then Rn-1 is the greatest common divisor of X and Y. This procedure is known to always stop, and the number of subtractions needed is always smaller than the larger of the two numbers.
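The division steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name euclid_gcd and the decision to return the list of remainders alongside the result are assumptions made for this example, not part of any quoted source.

```python
def euclid_gcd(x: int, y: int) -> tuple[int, list[int]]:
    """Return (gcd, remainders) for positive integers x and y.

    The remainders list holds R1, R2, ..., Rn in the order they are
    produced; the loop stops as soon as a remainder of 0 appears.
    """
    if x < 1 or y < 1:
        raise ValueError("both inputs must be positive integers")
    remainders = []
    while y != 0:
        r = x % y              # divide and keep only the remainder
        remainders.append(r)
        x, y = y, r            # the divisor and remainder become the next pair
    # x now holds the last non-zero divisor, which is the greatest common divisor
    return x, remainders


if __name__ == "__main__":
    print(euclid_gcd(1071, 462))
```

For 1071 and 462 the remainders are 147, 21 and 0, so the last non-zero remainder, 21, is the greatest common divisor.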
We can derive clues to the issues involved and an informal meaning of the word from the following quotation from Boolos & Jeffrey (1974, 1999):
No human being can write fast enough or long enough or small enough to list all members of an enumerably infinite set by writing out their names, one after another, in some notation. But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n. Such instructions are to be given quite explicitly, in a form in which they could be followed by a computing machine, or by a human who is capable of carrying out only very elementary operations on symbols (Boolos & Jeffrey 1974, 1999, p. 19)
The words "enumerably infinite" mean "countable using integers perhaps extending to infinity." Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a process that "creates" output integers from an arbitrary "input" integer or integers that, in theory, can be chosen from 0 to infinity. Thus we might expect an algorithm to be an algebraic equation such as y = m + n, with two arbitrary "input variables" m and n that produce an output y. As we see in Algorithm characterizations, the word algorithm implies much more than this, something on the order of (for our addition example):
Precise instructions (in language understood by "the computer") for a "fast, efficient, good" process that specifies the "moves" of "the computer" (machine or human, equipped with the necessary internally-contained information and capabilities) to find, decode, and then munch arbitrary input integers/symbols m and n, symbols + and = ... and (reliably, correctly, "effectively") produce, in a "reasonable" time, output-integer y at a specified place and in a specified format.
The concept of algorithm is also used to define the notion of decidability. That notion is central for explaining how formal systems come into being starting from a small set of axioms and rules. In logic, the time that an algorithm requires to complete cannot be measured, as it is not apparently related to our customary physical dimension. From such uncertainties, which characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.

Formalization
Algorithms are essential to the way computers process information. Many computer programs contain algorithms that specify the instructions a computer should perform, in a particular order, to carry out a given task, such as calculating employees’ paychecks or printing students’ report cards. Thus, an algorithm can be considered to be any sequence of operations that can be simulated by a Turing-complete system. Authors who assert this thesis include Savage (1987) and Gurevich (2000):
...Turing's informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine (Gurevich 2000:1)...according to Savage [1987], an algorithm is a computational process defined by a Turing machine. (Gurevich 2000:3)
Typically, when an algorithm is associated with processing information, data is read from an input source, written to an output device, and/or stored for further processing. Stored data is regarded as part of the internal state of the entity performing the algorithm. In practice, the state is stored in one or more data structures.
For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).
Because an algorithm is a precise list of precise steps, the order of computation will always be critical to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting "from the top" and going "down to the bottom", an idea that is described more formally by flow of control.
So far, this discussion of the formalization of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task by discrete, "mechanical" means. Unique to this conception of formalized algorithms is the assignment operation, which sets the value of a variable. It derives from the intuition of "memory" as a scratchpad.
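To make the role of assignment concrete, here is a small sketch that sums a list in two ways. The function names are hypothetical and chosen only for this illustration. The first version treats the variable total as a scratchpad that is overwritten at each step; the second expresses the same computation without any reassignment, in the spirit of the functional style.

```python
from functools import reduce


def total_imperative(values):
    """Sum the values by repeatedly overwriting one variable."""
    total = 0                  # assignment: write an initial value to the scratchpad
    for v in values:
        total = total + v      # assignment: replace the stored value with a new one
    return total


def total_functional(values):
    """Sum the values by folding a function over them, with no reassignment."""
    return reduce(lambda acc, v: acc + v, values, 0)


if __name__ == "__main__":
    print(total_imperative([1, 2, 3]), total_functional([1, 2, 3]))
```

Both functions compute the same result; they differ only in whether the intermediate state lives in a named, mutable variable.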
For some alternate conceptions of what constitutes an algorithm, see functional programming and logic programming.

Termination
Some writers restrict the definition of algorithm to procedures that eventually finish. In such a category Kleene places the "decision procedure or decision method or algorithm for the question" (Kleene 1952:136). Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method" (Knuth 1997:5) or "calculation procedure or algorithm" (Kleene 1952:137); however, Kleene notes that such a method must eventually exhibit "some object" (Kleene 1952:137).
Minsky makes the pertinent observation, with regard to determining whether an algorithm will eventually terminate (from a particular starting state):
But if the length of the process is not known in advance, then "trying" it may not be decisive, because if the process does go on forever — then at no time will we ever be sure of the answer (Minsky 1967:105).
As it happens, no other method can do any better, as was shown by Alan Turing with his celebrated result on the undecidability of the so-called halting problem. There is no algorithmic procedure for determining of arbitrary algorithms whether or not they terminate from given starting states. The analysis of algorithms for their likelihood of termination is called termination analysis.
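Minsky's point about "trying" a process can be illustrated with a step budget. The sketch below uses the Collatz iteration purely as a stand-in example (it is not mentioned in the sources quoted here): whether it reaches 1 for every positive starting value is an open question. The function name collatz_steps and the max_steps parameter are assumptions of this sketch.

```python
from typing import Optional


def collatz_steps(n: int, max_steps: int) -> Optional[int]:
    """Run the Collatz iteration from n for at most max_steps steps.

    Returns the number of steps taken to reach 1, or None when the
    budget is exhausted. None does not mean "never terminates"; it
    only means that trying for this long was not decisive.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps == max_steps:
            return None        # budget used up: the answer is still unknown
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    print(collatz_steps(27, 50))    # budget too small to decide
    print(collatz_steps(27, 200))   # larger budget: the run finishes
```

A larger budget can turn an inconclusive run into a conclusive one, but no finite budget can show that a run would go on forever.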
See the examples of (im-)"proper" subtraction at partial function for more about what can happen when an algorithm fails for certain of its input numbers — e.g., (i) non-termination, (ii) production of "junk" (output in the wrong format to be considered a number) or no number(s) at all (halt ends the computation with no output), (iii) wrong number(s), or (iv) a combination of these. Kleene proposed that the production of "junk" or failure to produce a number is solved by having the algorithm detect these instances and produce e.g., an error message (he suggested "0"), or preferably, force the algorithm into an endless loop (Kleene 1952:322). Davis does this to his subtraction algorithm — he fixes his algorithm in a second example so that it is proper subtraction (Davis 1958:12-15). Along with the logical outcomes "true" and "false" Kleene also proposes the use of a third logical symbol "u" — undecided (Kleene 1952:326) — thus an algorithm will always produce something when confronted with a "proposition". The problem of wrong answers must be solved with an independent "proof" of the algorithm e.g., using induction:
We normally require auxiliary evidence for this (that the algorithm correctly defines a mu recursive function), e.g., in the form of an inductive proof that, for each argument value, the computation terminates with a unique value (Minsky 1967:186).
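The difference between a partial function and its "proper" repair can be sketched for subtraction on the natural numbers. This is not Davis's or Kleene's construction, only a minimal illustration; the function names are hypothetical, and None stands in here for "no number produced".

```python
from typing import Optional


def partial_subtract(m: int, n: int) -> Optional[int]:
    """Subtraction on natural numbers, undefined (None) when m < n."""
    if m < 0 or n < 0:
        raise ValueError("inputs must be natural numbers")
    if m < n:
        return None            # no natural number is the answer
    return m - n


def proper_subtract(m: int, n: int) -> int:
    """Proper subtraction: defined for every pair, yielding 0 when m < n."""
    if m < 0 or n < 0:
        raise ValueError("inputs must be natural numbers")
    return m - n if m >= n else 0


if __name__ == "__main__":
    print(partial_subtract(3, 5), proper_subtract(3, 5))
```

The proper version always yields a number, at the cost of returning 0 in cases where ordinary subtraction has no answer among the natural numbers.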

Expressing algorithms
Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, and programming languages. Natural language expressions of algorithms tend to be verbose and ambiguous, and are rarely used for complex or technical algorithms. Pseudocode and flowcharts are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements, while remaining independent of a particular implementation language. Programming languages are primarily intended for expressing algorithms in a form that can be executed by a computer, but are often used as a way to define or document algorithms.
There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite state machine and state transition table), as flowcharts (see more at state diagram), or as a form of rudimentary machine code or assembly code called "sets of quadruples" (see more at Turing machine).
Sometimes it is helpful in the description of an algorithm to supplement small "flow charts" (state diagrams) with natural-language and/or arithmetic expressions written inside "block diagrams" to summarize what the "flow charts" are accomplishing.

History of Computer Science

The early foundations of what would become computer science predate the invention of the modern digital computer. Machines for calculating fixed numerical tasks, such as the abacus, have existed since antiquity. Wilhelm Schickard built the first mechanical calculator in 1623.[6] Charles Babbage designed a difference engine in Victorian times[7] helped by Ada Lovelace.[8] Around 1900, punch-card machines[9] were introduced. However, all of these machines were constrained to perform a single task, or at best some subset of all possible tasks.
During the 1940s, as newer and more powerful computing machines were developed, the term computer came to refer to the machines rather than their human predecessors. As it became clear that computers could be used for more than just mathematical calculations, the field of computer science broadened to study computation in general. Computer science began to be established as a distinct academic discipline in the 1950s and early 1960s, with the creation of the first computer science departments and degree programs.[4][10] Since practical computers became available, many applications of computing have become distinct areas of study in their own right.
Although many initially believed it impossible that computers themselves could actually be a scientific field of study, in the late fifties it gradually became accepted among the greater academic population.[11] It is the now well-known IBM brand that formed part of the computer science revolution during this time. IBM (short for International Business Machines) released the IBM 704 and later the IBM 709 computers, which were widely used during the exploration period of such devices. "Still, working with the IBM [computer] was frustrating...if you had misplaced as much as one letter in one instruction, the program would crash, and you would have to start the whole process over again".[11] During the late 1950s, the computer science discipline was very much in its developmental stages, and such issues were commonplace.
Time has seen significant improvements in the usability and effectiveness of computer science technology. Modern society has seen a significant shift from computers being used solely by experts or professionals to a more widespread user base.

Computer science (or computing science) is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems.[1][2][3] It is frequently described as the systematic study of algorithmic processes that describe and transform information; the fundamental question underlying computer science is, "What can be (efficiently) automated?"[4] Computer science has many sub-fields; some, such as computer graphics, emphasize the computation of specific results, while others, such as computational complexity theory, study the properties of computational problems. Still others focus on the challenges in implementing computations. For example, programming language theory studies approaches to describing computations, while computer programming applies specific programming languages to solve specific computational problems, and human-computer interaction focuses on the challenges in making computers and computations useful, usable, and universally accessible to people.
The general public sometimes confuses computer science with vocational areas that deal with computers (such as information technology), or thinks that it relates to their own experience of computers, which typically involves activities such as gaming, web-browsing, and word-processing. However, the focus of computer science is more on understanding the properties of the programs used to implement software such as games and web-browsers, and using that understanding to create new programs or improve existing ones.[5]

Field of Computer

As a discipline, computer science spans a range of topics from theoretical studies of algorithms and the limits of computation to the practical issues of implementing computing systems in hardware and software.[17][18] The Computer Sciences Accreditation Board (CSAB) – which is made up of representatives of the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers Computer Society, and the Association for Information Systems – identifies four areas that it considers crucial to the discipline of computer science: theory of computation, algorithms and data structures, programming methodology and languages, and computer elements and architecture. In addition to these four areas, CSAB also identifies fields such as software engineering, artificial intelligence, computer networking and communication, database systems, parallel computation, distributed computation, computer-human interaction, computer graphics, operating systems, and numerical and symbolic computation as being important areas of computer science.[17]

Theory of Computation

The study of the theory of computation is focused on answering fundamental questions about what can be computed, and what amount of resources are required to perform those computations. In an effort to answer the first question, computability theory examines which computational problems are solvable on various theoretical models of computation. The second question is addressed by computational complexity theory, which studies the time and space costs associated with different approaches to solving a computational problem.
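As a rough illustration of how different approaches to the same problem can differ in time cost, the sketch below counts the number of items examined by two ways of searching a sorted list. The function names and the choice of "items examined" as the cost measure are assumptions of this example.

```python
def linear_search_steps(items, target) -> int:
    """Count the items examined when scanning from left to right."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target) -> int:
    """Count the items examined when halving a sorted list repeatedly."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1       # target can only be in the upper half
        else:
            hi = mid - 1       # target can only be in the lower half
    return steps


if __name__ == "__main__":
    data = list(range(1024))
    print(linear_search_steps(data, 1023), binary_search_steps(data, 1023))
```

On a sorted list of 1024 items, looking for the last element takes 1024 examinations with the linear scan and 11 with the halving approach.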
The famous "P=NP?" problem, one of the Millennium Prize Problems,[19] is an open problem in the theory of computation.

The broader field of theoretical computer science encompasses both the classical theory of computation and a wide range of other topics that focus on the more abstract, logical, and mathematical aspects of computing.

In theoretical computer science, automata theory is the study of abstract machines and problems which they are able to solve. Automata theory is closely related to formal language theory as the automata are often classified by the class of formal languages they are able to recognize.
An automaton is a mathematical model for a finite state machine (FSM). An FSM is a machine that, given an input of symbols, "jumps", or transitions, through a series of states according to a transition function (which can be expressed as a table). In the common "Mealy" variety of FSMs, this transition function tells the automaton which state to go to next given a current state and a current symbol.
The input is read symbol by symbol, until it is consumed completely (similar to a tape with a word written on it, which is read by a reading head of the automaton; the head moves forward over the tape, reading one symbol at a time). Once the input is depleted, the automaton is said to have stopped.
Depending on the state in which the automaton stops, the automaton is said to either accept or reject the input. If it lands in an accept state, the automaton accepts the word. If, on the other hand, it lands in a reject state, the word is rejected. The set of all the words accepted by an automaton is called the language accepted by the automaton.
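The reading and accepting behaviour described above can be sketched with a transition table stored as a dictionary. The particular machine here, which accepts binary words containing an even number of 1s, and the names TRANSITIONS, START, ACCEPT and accepts are assumptions chosen for illustration.

```python
# Transition table: (current state, current symbol) -> next state
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}
START = "even"
ACCEPT = {"even"}


def accepts(word: str) -> bool:
    """Read the word symbol by symbol and report whether it is accepted."""
    state = START
    for symbol in word:
        # A symbol outside the alphabet {"0", "1"} raises KeyError
        state = TRANSITIONS[(state, symbol)]
    # The input is consumed; acceptance depends only on the final state
    return state in ACCEPT


if __name__ == "__main__":
    print(accepts("1001"), accepts("10"))
```

The language accepted by this machine is the set of all binary words with an even number of 1s, including the empty word.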
Note, however, that, in general, an automaton need not have a finite number of states, or even a countable number of states. Thus, for example, the quantum finite automaton has an uncountable infinity of states, as the set of all possible states is the set of all points in complex projective space. Thus, quantum finite automata, as well as finite state machines, are special cases of a more general idea, that of a topological automaton, where the set of states is a topological space, and the state transition functions are taken from the set of all possible functions on the space. Topological automata are often called M-automata, and are simply the augmentation of a semiautomaton with a set of accept states, where set intersection determines whether the initial state is accepted or rejected.
In general, an automaton need not strictly accept or reject an input; it may accept it with some probability between zero and one. Again this is illustrated by the quantum finite automaton, which only accepts input with some probability. This idea is again a special case of a more general notion, the geometric automaton or metric automaton, where the set of states is a metric space, and a language is accepted by the automaton if the distance between the initial point, and the set of accept states is sufficiently small with respect to the metric.

Departments of major Universities

Despite its name, a significant amount of computer science does not involve the study of computers themselves. Because of this, several alternative names have been proposed. Certain departments of major universities prefer the term computing science, to emphasize precisely that difference. Danish scientist Peter Naur suggested the term datalogy, to reflect the fact that the scientific discipline revolves around data and data treatment, while not necessarily involving computers. The first scientific institution to use the term was the Department of Datalogy at the University of Copenhagen, founded in 1969, with Peter Naur being the first professor in datalogy. The term is used mainly in the Scandinavian countries. Also, in the early days of computing, a number of terms for the practitioners of the field of computing were suggested in the Communications of the ACM – turingineer, turologist, flow-charts-man, applied meta-mathematician, and applied epistemologist.[20] Three months later in the same journal, comptologist was suggested, followed next year by hypologist.[21] The term computics has also been suggested.[22] Informatik was a term used in Europe with more frequency.
The renowned computer scientist Edsger Dijkstra stated, "Computer science is no more about computers than astronomy is about telescopes." The design and deployment of computers and computer systems is generally considered the province of disciplines other than computer science. For example, the study of computer hardware is usually considered part of computer engineering, while the study of commercial computer systems and their deployment is often called information technology or information systems. However, there has been much cross-fertilization of ideas between the various computer-related disciplines. Computer science research has also often crossed into other disciplines, such as cognitive science, economics, mathematics, physics (see quantum computing), and linguistics.
Computer science is considered by some to have a much closer relationship with mathematics than many scientific disciplines, with some observers saying that computing is a mathematical science.[4] Early computer science was strongly influenced by the work of mathematicians such as Kurt Gödel and Alan Turing, and there continues to be a useful interchange of ideas between the two fields in areas such as mathematical logic, category theory, domain theory, and algebra.
The relationship between computer science and software engineering is a contentious issue, which is further muddied by disputes over what the term "software engineering" means, and how computer science is defined. David Parnas, taking a cue from the relationship between other engineering and science disciplines, has claimed that the principal focus of computer science is studying the properties of computation in general, while the principal focus of software engineering is the design of specific computations to achieve practical goals, making the two separate but complementary disciplines.[23]
The academic, political, and funding aspects of computer science tend to depend on whether a department formed with a mathematical emphasis or with an engineering emphasis. Computer science departments with a mathematics emphasis and with a numerical orientation consider alignment with computational science. Both types of departments tend to make efforts to bridge the field educationally if not across all research.

Computer science education
Some universities teach computer science as a theoretical study of computation and algorithmic reasoning. These programs often feature the theory of computation, analysis of algorithms, formal methods, concurrency theory, databases, computer graphics and systems analysis, among others. They typically also teach computer programming, but treat it as a vessel for the support of other fields of computer science rather than a central focus of high-level study.
Other colleges and universities, as well as secondary schools and vocational programs that teach computer science, emphasize the practice of advanced programming rather than the theory of algorithms and computation in their computer science curricula. Such curricula tend to focus on those skills that are important to workers entering the software industry. The practical aspects of computer programming are often referred to as software engineering. However, there is a lot of disagreement over the meaning of the term, and whether or not it is the same thing as programming.