[c++] What is external linkage and internal linkage?

I want to understand the external linkage and internal linkage and their difference.

I also want to know the meaning of

const variables internally link by default unless otherwise declared as extern.

This question is related to c++ c++-faq

The answer is


Before talking about the question, it is better to know the term translation unit, program and some basic concepts of C++ (actually linkage is one of them in general) precisely. You will also have to know what is a scope.

I will emphasize some key points, esp. those missing in previous answers.

Linkage is a property of a name, which is introduced by a declaration. Different names can denote same entity (typically, an object or a function). So talking about linkage of an entity is usually nonsense, unless you are sure that the entity will only be referred by the unique name from some specific declarations (usually one declaration, though).

Note an object is an entity, but a variable is not. While talking about the linkage of a variable, actually the name of the denoted entity (which is introduced by a specific declaration) is concerned. The linkage of the name is in one of the three: no linkage, internal linkage or external linkage.

Different translation units can share the same declaration by header/source file (yes, it is the standard's wording) inclusion. So you may refer the same name in different translation units. If the name declared has external linkage, the identity of the entity referred by the name is also shared. If the name declared has internal linkage, the same name in different translation units denotes different entities, but you can refer the entity in different scopes of the same translation unit. If the name has no linkage, you simply cannot refer the entity from other scopes.

(Oops... I found what I have typed was somewhat just repeating the standard wording ...)

There are also some other confusing points which are not covered by the language specification.

  1. Visibility (of a name). It is also a property of declared name, but with a meaning different to linkage.
  2. Visibility (of a side effect). This is not related to this topic.
  3. Visibility (of a symbol). This notion can be used by actual implementations. In such implementations, a symbol with specific visibility in object (binary) code is usually the target mapped from the entity definition whose names having the same specific linkage in the source (C++) code. However, it is usually not guaranteed one-to-one. For example, a symbol in a dynamic library image can be specified only shared in that image internally from source code (involved with some extensions, typically, __attribute__ or __declspec) or compiler options, and the image is not the whole program or the object file translated from a translation unit, thus no standard concept can describe it accurately. Since symbol is not a normative term in C++, it is only an implementation detail, even though the related extensions of dialects may have been widely adopted.
  4. Accessibility. In C++, this is usually about property of class members or base classes, which is again a different concept unrelated to the topic.
  5. Global. In C++, "global" refers something of global namespace or global namespace scope. The latter is roughly equivalent to file scope in the C language. Both in C and C++, the linkage has nothing to do with scope, although scope (like linkage) is also tightly concerned with an identifier (in C) or a name (in C++) introduced by some declaration.

The linkage rule of namespace scope const variable is something special (and particularly different to the const object declared in file scope in C language which also has the concept of linkage of identifiers). Since ODR is enforced by C++, it is important to keep no more than one definition of the same variable or function occurred in the whole program except for inline functions. If there is no such special rule of const, a simplest declaration of const variable with initializers (e.g. = xxx) in a header or a source file (often a "header file") included by multiple translation units (or included by one translation unit more than once, though rarely) in a program will violate ODR, which makes to use const variable as replacement of some object-like macros impossible.


Basically

  • extern linkage variable is visible in all files
  • internal linkage variable is visible in single file.

Explain: const variables internally link by default unless otherwise declared as extern

  1. by default, global variable is external linkage
  2. but, const global variable is internal linkage
  3. extra, extern const global variable is external linkage

A pretty good material about linkage in C++

http://www.goldsborough.me/c/c++/linker/2016/03/30/19-34-25-internal_and_external_linkage_in_c++/


As dudewat said external linkage means the symbol (function or global variable) is accessible throughout your program and internal linkage means that it is only accessible in one translation unit.

You can explicitly control the linkage of a symbol by using the extern and static keywords. If the linkage is not specified then the default linkage is extern (external linkage) for non-const symbols and static (internal linkage) for const symbols.

// In namespace scope or global scope.
int i; // extern by default
const int ci; // static by default
extern const int eci; // explicitly extern
static int si; // explicitly static

// The same goes for functions (but there are no const functions).
int f(); // extern by default
static int sf(); // explicitly static 

Note that instead of using static (internal linkage), it is better to use anonymous namespaces into which you can also put classes. Though they allow extern linkage, anonymous namespaces are unreachable from other translation units, making linkage effectively static.

namespace {
  int i; // extern by default but unreachable from other translation units
  class C; // extern by default but unreachable from other translation units
}

I think Internal and External Linkage in C++ gives a clear and concise explanation:

A translation unit refers to an implementation (.c/.cpp) file and all header (.h/.hpp) files it includes. If an object or function inside such a translation unit has internal linkage, then that specific symbol is only visible to the linker within that translation unit. If an object or function has external linkage, the linker can also see it when processing other translation units. The static keyword, when used in the global namespace, forces a symbol to have internal linkage. The extern keyword results in a symbol having external linkage.

The compiler defaults the linkage of symbols such that:

Non-const global variables have external linkage by default
Const global variables have internal linkage by default
Functions have external linkage by default


  • A global variable has external linkage by default. Its scope can be extended to files other than containing it by giving a matching extern declaration in the other file.
  • The scope of a global variable can be restricted to the file containing its declaration by prefixing the declaration with the keyword static. Such variables are said to have internal linkage.

Consider following example:

1.cpp

void f(int i);
extern const int max = 10;
int n = 0;
int main()
{
    int a;
    //...
    f(a);
    //...
    f(a);
    //...
}
  1. The signature of function f declares f as a function with external linkage (default). Its definition must be provided later in this file or in other translation unit (given below).
  2. max is defined as an integer constant. The default linkage for constants is internal. Its linkage is changed to external with the keyword extern. So now max can be accessed in other files.
  3. n is defined as an integer variable. The default linkage for variables defined outside function bodies is external.

2.cpp

#include <iostream>
using namespace std;

extern const int max;
extern int n;
static float z = 0.0;

void f(int i)
{
    static int nCall = 0;
    int a;
    //...
    nCall++;
    n++;
    //...
    a = max * z;
    //...
    cout << "f() called " << nCall << " times." << endl;
}
  1. max is declared to have external linkage. A matching definition for max (with external linkage) must appear in some file. (As in 1.cpp)
  2. n is declared to have external linkage.
  3. z is defined as a global variable with internal linkage.
  4. The definition of nCall specifies nCall to be a variable that retains its value across calls to function f(). Unlike local variables with the default auto storage class, nCall will be initialized only once at the start of the program and not once for each invocation of f(). The storage class specifier static affects the lifetime of the local variable and not its scope.

NB: The keyword static plays a double role. When used in the definitions of global variables, it specifies internal linkage. When used in the definitions of the local variables, it specifies that the lifetime of the variable is going to be the duration of the program instead of being the duration of the function.

Hope that helps!


Linkage determines whether identifiers that have identical names refer to the same object, function, or other entity, even if those identifiers appear in different translation units. The linkage of an identifier depends on how it was declared. There are three types of linkages:

  1. Internal linkage : identifiers can only be seen within a translation unit.
  2. External linkage : identifiers can be seen (and referred to) in other translation units.
  3. No linkage : identifiers can only be seen in the scope in which they are defined. Linkage does not affect scoping

C++ only : You can also have linkage between C++ and non-C++ code fragments, which is called language linkage.

Source :IBM Program Linkage


In terms of 'C' (Because static keyword has different meaning between 'C' & 'C++')

Lets talk about different scope in 'C'

SCOPE: It is basically how long can I see something and how far.

  1. Local variable : Scope is only inside a function. It resides in the STACK area of RAM. Which means that every time a function gets called all the variables that are the part of that function, including function arguments are freshly created and are destroyed once the control goes out of the function. (Because the stack is flushed every time function returns)

  2. Static variable: Scope of this is for a file. It is accessible every where in the file
    in which it is declared. It resides in the DATA segment of RAM. Since this can only be accessed inside a file and hence INTERNAL linkage. Any
    other files cannot see this variable. In fact STATIC keyword is the only way in which we can introduce some level of data or function
    hiding in 'C'

  3. Global variable: Scope of this is for an entire application. It is accessible form every where of the application. Global variables also resides in DATA segment Since it can be accessed every where in the application and hence EXTERNAL Linkage

By default all functions are global. In case, if you need to hide some functions in a file from outside, you can prefix the static keyword to the function. :-)


In C++

Any variable at file scope and that is not nested inside a class or function, is visible throughout all translation units in a program. This is called external linkage because at link time the name is visible to the linker everywhere, external to that translation unit.

Global variables and ordinary functions have external linkage.

Static object or function name at file scope is local to translation unit. That is called as Internal Linkage

Linkage refers only to elements that have addresses at link/load time; thus, class declarations and local variables have no linkage.