[r] Explicitly calling return in a function or not

My question is: Why is not calling return faster

It’s faster because return is a (primitive) function in R, which means that using it in code incurs the cost of a function call. Compare this to most other programming languages, where return is a keyword, but not a function call: it doesn’t translate to any runtime code execution.

That said, calling a primitive function in this way is pretty fast in R, and calling return incurs a minuscule overhead. This isn’t the argument for omitting return.
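
Both claims can be checked roughly, as a sketch: `is.primitive` confirms that `return` is a primitive function, and a crude `system.time` loop compares two otherwise identical functions. The names `with_return` and `without_return` are made up for illustration. Timings vary by machine and R version, and the difference may be too small to measure reliably.

```r
# `return` is a (primitive) function object
is.primitive(return)

# Two hypothetical functions that differ only in the explicit `return`
with_return = function (x) {
    return(x + 1)
}

without_return = function (x) {
    x + 1
}

# Crude timing loop; absolute numbers depend on machine and R version
n = 1e5
system.time(for (i in seq_len(n)) with_return(i))
system.time(for (i in seq_len(n)) without_return(i))
```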

or better, and thus preferable?

Because there’s no reason to use it.

Because it’s redundant, and it doesn’t add useful redundancy.

To be clear: redundancy can sometimes be useful. But most redundancy isn’t of this kind. Instead, it’s of the kind that adds visual clutter without adding information: it’s the programming equivalent of a filler word or chartjunk.

Consider the following example of an explanatory comment, which is universally recognised as bad redundancy because the comment merely paraphrases what the code already expresses:

# Add one to the result
result = x + 1

Using return in R falls in the same category, because R is a functional programming language, and in R every function call has a value. This is a fundamental property of R. And once you see R code from the perspective that every expression (including every function call) has a value, the question then becomes: “why should I use return?” There needs to be a positive reason, since the default is not to use it.
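
To illustrate the “every expression has a value” point, here is a small sketch. The variable and function names are made up for the example.

```r
# `if` is an expression, so its value can be assigned directly
label = if (5 > 3) 'bigger' else 'smaller'

# A braced block evaluates to its last expression
total = {
    a = 1
    b = 2
    a + b
}

# A function body is just an expression, so the function's value
# is the value of that expression
add_one = function (x) x + 1

label       # "bigger"
total       # 3
add_one(2)  # 3
```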

One such positive reason is to signal early exit from a function, say in a guard clause:

f = function (a, b) {
    if (! precondition(a)) return() # same as `return(NULL)`!
    calculation(b)
}

This is a valid, non-redundant use of return. However, such guard clauses are rare in R compared to other languages, and since every expression has a value, a regular if does not require return:

sign = function (num) {
    if (num > 0) {
        1
    } else if (num < 0) {
        -1
    } else {
        0
    }
}

We can even rewrite f like this:

f = function (a, b) {
    if (precondition(a)) calculation(b)
}

… where if (cond) expr is the same as if (cond) expr else NULL.
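
This equivalence can be checked directly. In the sketch below, `precondition` and `calculation` are hypothetical stand-ins, since the examples above leave them undefined.

```r
# Hypothetical stand-ins for the helpers used above
precondition = function (a) a > 0
calculation = function (b) b * 2

f = function (a, b) {
    if (precondition(a)) calculation(b)
}

f(1, 21)            # 42
is.null(f(-1, 21))  # TRUE: `if` without `else` yields NULL when the condition is false
```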

Finally, I’d like to forestall three common objections:

  1. Some people argue that using return adds clarity, because it signals “this function returns a value”. But as explained above, every function returns something in R. Thinking of return as a marker of returning a value isn’t just redundant, it’s actively misleading.

  2. Relatedly, the Zen of Python has a marvellous guideline that should always be followed:

    Explicit is better than implicit.

    How does dropping redundant return not violate this? Because the return value of a function in a functional language is always explicit: it’s its last expression. This is again the same argument about explicitness vs redundancy.

    In fact, if you want explicitness, use it to highlight the exception to the rule: mark functions that don’t return a meaningful value, which are only called for their side-effects (such as cat). Except R has a better marker than return for this case: invisible. For instance, I would write

    save_results = function (results, file) {
        # … code that writes the results to a file …
        invisible()
    }
    
  3. But what about long functions? Won’t it be easy to lose track of what is being returned?

    Two answers: first, not really. The rule is clear: the last expression of a function is its value. There’s nothing to keep track of.

    But more importantly, the problem in long functions isn’t the lack of explicit return markers. It’s the length of the function. Long functions almost (?) always violate the single responsibility principle, and even when they don’t, they will benefit from being broken apart for readability.
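
Returning to the `invisible` marker from the second point: the sketch below fills in a hypothetical body for `save_results` (writing one result per line with `writeLines` to a temporary file is an assumption made purely for illustration) and uses `withVisible` to show that the result is marked invisible.

```r
# Hypothetical implementation: writes one result per line
save_results = function (results, file) {
    writeLines(as.character(results), file)
    invisible()
}

path = tempfile(fileext = '.txt')
outcome = withVisible(save_results(c(1, 2, 3), path))

outcome$visible   # FALSE: nothing is printed when called at the console
outcome$value     # NULL
readLines(path)   # "1" "2" "3"
```

Called for its side effect, `save_results(...)` prints nothing at the console, which is the signal that there is no meaningful return value.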