Imports in Python

Learn em once so you never have to learn them again...

Example

Let's setup an example file structure to ground our example in:

main.py
packages/
    - subpackageA
        - __init__.py
        - A.py
        - AA.py
    - subpackageB
        - __init__.py
        - B.py
        - BB.py

Let's say we want to import all members of A.py from main.py. We can simply use an absolute path from main.py to the module

main.py
from packages.subpackageA import A
...

Why can we do this? This is because, by default, the parentw directory of the entrypoint script is added to sys.path. For instance in, python some_dir/main.py main.py is the entrypoint script and some_dir is added to sys.path.

What is sys.path?

Sys.path is a list of directories where Python looks for packages. When we use absolute paths, Python appends this absolute path(i.e 'packages.subpackageA') to one of the paths in sys.path and then uses this full path to check if it can find a module there. If the module isn't found like this, it continues appending this absolute path to other paths in sys.path until the module is found

When we use absolute paths, Python appends this absolute path(packages.subpackageA) on one of the paths in sys.path and then uses this full path to check if it can find a module there. If the module isn't found like this, it continues appending this absolute path to other paths in sys.path.

Since sys.path only has the current directory of the script being executed, A.py cannot import B.py unless it uses the full absolute path. For example if A wanted to import B it would not be able to do so with simply from subpackageB import B . Instead it would have to be:

A.py
from packages.subpackageB import B
...

This makes sense because the directory that A resides in is not in sys.path. We can add it, but keeping absolute paths relative to the parent directory of the entrypoint seems like the best way to keep everything organized and neat.

Use absolute imports when writing python code!

What about __init__.py?

__init__.py let's you control how you want to child modules and subpackages. For instance, imagine that the whole purpose of subpackageA is to expose a function in AA.py called some_important_function. Importing this function from main.py would look something like this:

main.py
from packages.subpackageA.AA import some_important_function

But this is a messy way to import the main_function. We don't want people fiddling around in A. Instead we can modify packages/subpackageA/__init__.py to look something like this:

packages/subpackageA/__init__.py
from packages.subpackageA.AA import some_important_function

Then we can simply do this:

main.py
from packages.subpackageA import some_important_function

This is a little bit neater and serves as a kind of encapsulation for subpackageA. Essentially we can think of __init__.py files as analogous to index.js files.

If the folder structure of the package matches the structure of the API -- __init__.py files are not necessary.

Writing Packages

So sys.path only includes the directory of the script being executed. This complicates writing packages for third parties where we don't know where the entrypoint script resides.

Our solution to this problem is based on the important fact that sys.env is shared in a Python application. Let's consider that we wanted to export subpackageA to pip. Here's what we'd want to do to make sure our package is ready to be exported.

  1. Make all absolute imports relative to the __init__.py below.

- subpackageA
    - __init__.py
    - A.py
    - AA.py
    - subsubpackageA
        - AAA.py

For instance, if A needs to import AAA it would use from subpackageA import AAA

2. Add this code to the top-level __init__.py :

__init__.py
import os
import sys

path = os.path.abspath(__file__)
dir_path = os.path.dirname(path)
sys.path.insert(0, dir_path)

This adds the path to the package to sys.path which means that our absolute imports relative to the top level __init__.py will continue to work. Of course, you need to do this at top-level __init__ file of the package so that the path is added before execution moves to another module.

Lastly, here are some links for this topic:

Last updated

Was this helpful?