There are lots of different takes here on the mechanics of the code in question, the "How", but for me none of it made sense until I understood the "Why". This should be especially helpful for new programmers.
Take file "ab.py":
def a():
print('A function in ab file');
a()
And a second file "xy.py":
import ab
def main():
print('main function: this is where the action is')
def x():
print ('peripheral task: might be useful in other projects')
x()
if __name__ == "__main__":
main()
What is this code actually doing?
When you execute xy.py
, you import ab
. The import statement runs the module immediately on import, so ab
's operations get executed before the remainder of xy
's. Once finished with ab
, it continues with xy
.
The interpreter keeps track of which scripts are running with __name__
. When you run a script - no matter what you've named it - the interpreter calls it "__main__"
, making it the master or 'home' script that gets returned to after running an external script.
Any other script that's called from this "__main__"
script is assigned its filename as its __name__
(e.g., __name__ == "ab.py"
). Hence, the line if __name__ == "__main__":
is the interpreter's test to determine if it's interpreting/parsing the 'home' script that was initially executed, or if it's temporarily peeking into another (external) script. This gives the programmer flexibility to have the script behave differently if it's executed directly vs. called externally.
Let's step through the above code to understand what's happening, focusing first on the unindented lines and the order they appear in the scripts. Remember that function - or def
- blocks don't do anything by themselves until they're called. What the interpreter might say if mumbled to itself:
"__main__"
in the __name__
variable.__name__ == "ab.py"
.a()
; I just learned that. Printing 'A function in ab file'."__main__"
!x()
; ok, printing 'peripheral task: might be useful in other projects'.if
statement. Well, the condition has been met (the variable __name__
has been set to "__main__"
), so I'll enter the main()
function and print 'main function: this is where the action is'.The bottom two lines mean: "If this is the "__main__"
or 'home' script, execute the function called main()
". That's why you'll see a def main():
block up top, which contains the main flow of the script's functionality.
Why implement this?
Remember what I said earlier about import statements? When you import a module it doesn't just 'recognize' it and wait for further instructions - it actually runs all the executable operations contained within the script. So, putting the meat of your script into the main()
function effectively quarantines it, putting it in isolation so that it won't immediately run when imported by another script.
Again, there will be exceptions, but common practice is that main()
doesn't usually get called externally. So you may be wondering one more thing: if we're not calling main()
, why are we calling the script at all? It's because many people structure their scripts with standalone functions that are built to be run independent of the rest of the code in the file. They're then later called somewhere else in the body of the script. Which brings me to this:
But the code works without it
Yes, that's right. These separate functions can be called from an in-line script that's not contained inside a main()
function. If you're accustomed (as I am, in my early learning stages of programming) to building in-line scripts that do exactly what you need, and you'll try to figure it out again if you ever need that operation again ... well, you're not used to this kind of internal structure to your code, because it's more complicated to build and it's not as intuitive to read.
But that's a script that probably can't have its functions called externally, because if it did it would immediately start calculating and assigning variables. And chances are if you're trying to re-use a function, your new script is related closely enough to the old one that there will be conflicting variables.
In splitting out independent functions, you gain the ability to re-use your previous work by calling them into another script. For example, "example.py" might import "xy.py" and call x()
, making use of the 'x' function from "xy.py". (Maybe it's capitalizing the third word of a given text string; creating a NumPy array from a list of numbers and squaring them; or detrending a 3D surface. The possibilities are limitless.)
(As an aside, this question contains an answer by @kindall that finally helped me to understand - the why, not the how. Unfortunately it's been marked as a duplicate of this one, which I think is a mistake.)