[function] What is the difference between a "function" and a "procedure"?

This is a well-known old question, but I'd like to share some more insights about modern programming language research and design.

Basic answer

Traditionally (in the sense of structured programming) and informally, a procedure is a reusable structural construct to have "input" and to do something programmable. When something is needed to be done within a procedure, you can provide (actual) arguments to the procedure in a procedure call coded in the source code (usually in a kind of an expression), and the actions coded in the procedures body (provided in the definition of the procedure) will be executed with the substitution of the arguments into the (formal) parameters used in the body.

A function is more than a procedure because return values can also be specified as the "output" in the body. Function calls are more or less same to procedure calls, except that you can also use the result of the function call, syntactically (usually as a subexpression of some other expression).

Traditionally, procedure calls (rather than function calls) are used to indicate that no output must be interested, and there must be side effects to avoid the call being no-ops, hence emphasizing the imperative programming paradigm. Many traditional programming languages like Pascal provide both "procedures" and "functions" to distinguish this intentional difference of styles.

(To be clear, the "input" and "output" mentioned above are simplified notions based on the syntactic properties of functions. Many languages additionally support passing arguments to parameters by reference/sharing, to allow users transporting information encoded in arguments during the calls. Such parameter may even be just called as "in/out parameter". This feature is based on the nature of the objects being passed in the calls, which is orthogonal to the properties of the feature of procedure/function.)

However, if the result of a function call is not needed, it can be just (at least logically) ignored, and function definitions/function calls should be consistent to procedure definitions/procedure calls in this way. ALGOL-like languages like C, C++ and Java, all provide the feature of "function" in this fashion: by encoding the result type void as a special case of functions looking like traditional procedures, there is no need to provide the feature of "procedures" separately. This prevents some bloat in the language design.

Since SICP is mentioned, it is also worth noting that in the Scheme language specified by RnRS, a procedure may or may not have to return the result of the computation. This is the union of the traditional "function" (returning the result) and "procedure" (returning nothing), essentially same to the "function" concept of many ALGOL-like languages (and actually sharing even more guarantees like applicative evaluations of the operands before the call). However, old-fashion differences still occur even in normative documents like SRFI-96.

I don't know much about the exact reasons behind the divergence, but as I have experienced, it seems that language designers will be happier without specification bloat nowadays. That is, "procedure" as a standalone feature is unnecessary. Techniques like void type is already sufficient to mark the use where side effects should be emphasized. This is also more natural to users having experiences on C-like languages, which are popular more than a few decades. Moreover, it avoids the embarrassment in cases like RnRS where "procedures" are actually "functions" in the broader sense.

In theory, a function can be specified with a specified unit type as the type of the function call result to indicate that result is special. This distinguishes the traditional procedures (where the result of a call is uninterested) from others. There are different styles in the design of a language:

  • As in RnRS, just marking the uninterested results as "unspecified" value (of unspecified type, if the language has to mention it) and it is sufficient to be ignored.
  • Specifying the uninterested result as the value of a dedicated unit type (e.g. Kernel's #inert) also works.
  • When that type is a further a bottom type, it can be (hopefully) statically verified and prevented used as a type of expression. The void type in ALGOL-like languages is exactly an example of this technique. ISO C11's _Noreturn is a similar but more subtle one in this kind.

Further reading

As the traditional concept derived from math, there are tons of black magic most people do not bother to know. Strictly speaking, you won't be likely get the whole things clear as per your math books. CS books might not provide much help, either.

With concerning of programming languages, there are several caveats:

  • Functions in different branches of math are not always defined having same meanings. Functions in different programming paradigms may also be quite different (even sometimes the syntaxes of function call look similar). Sometimes the reasons to cause the differences are same, but sometimes they are not.
    • It is idiomatic to model computation by mathematical functions and then implement the underlying computation in programming languages. Be careful to avoid mapping them one to one unless you know what are being talked about.
  • Do not confuse the model with the entity be modeled.
    • The latter is only one of the implementation to the former. There can be more than one choices, depending on the contexts (the branches of math interested, for example).
    • In particular, it is more or less similarly absurd to treat "functions" as "mappings" or subsets of Cartesian products like to treat natural numbers as Von-Neumann encoding of ordinals (looking like a bunch of {{{}}, {}}...) besides some limited contexts.
  • Mathematically, functions can be partial or total. Different programming languages have different treatment here.
    • Some functional languages may honor totality of functions to guarantee the computation within the function calls always terminate in finite steps. However, this is essentially not Turing-complete, hence weaker computational expressiveness, and not much seen in general-purpose languages besides semantics of typechecking (which is expected to be total).
    • If the difference between the procedures and functions is significant, should there be "total procedures"? Hmm...
  • Constructs similar to functions in calculi used to model the general computation and the semantics of the programming languages (e.g. lambda abstractions in lambda calculi) can have different evaluation strategies on operands.
    • In traditional the reductions in pure calculi as well in as evaluations of expressions in pure functional languages, there are no side effects altering the results of the computations. As a result, operands are not required to be evaluated before the body of the functions-like constructs (because the invariant to define "same results" is kept by properties like ß-equivalence guaranteed by Church-Rosser property).
    • However, many programming languages may have side effects during the evaluations of expressions. That means, strict evaluation strategies like applicative evaluation are not the same to non-strict evaluation ones like call-by-need. This is significant, because without the distinction, there is no need to distinguish function-like (i.e. used with arguments) macros from (traditional) functions. But depending on the flavor of theories, this still can be an artifact. That said, in a broader sense, functional-like macros (esp. hygienic ones) are mathematical functions with some unnecessary limitations (syntactic phases). Without the limitations, it might be sane to treat (first-class) function-like macros as procedures...
    • For readers interested in this topic, consider some modern abstractions.
  • Procedures are usually considered out of the scope of traditional math. However, in calculi modeling the computation and programming language semantics, as well as contemporary programming language designs, there can be quite a big family of related concepts sharing the "callable" nature. Some of them are used to implement/extend/replace procedures/functions. There are even more subtle distinctions.