I have never worked on, or built myself, a properly structured Python codebase before. I aim to do that today.
Here's the structure I have attempted to implement:
My-Fancy-Project-Name
    /src
        /bin
            /binary1/main.py
            /binary2/my_binary.py
            /binary3
                my_scripy.py
                aux_file.txt
            /binaryother
                binary4.py
                binary5.py
                binary6.py
        /lib
            /lib1
                __init__.py
                lib1.py
            /libio
                __init__.py
                libio.py
            /libexception
                __init__.py
                exceptions.py
This structure is loosely inspired by systems programming languages. It may not be a suitable project structure in the world of Python.
My reasoning for this structure is as follows:
The root directory for my whole "project" lives one level above My-Fancy-Project-Name. It contains all kinds of stuff, not all written in the same language, to do various things which are all related to a single project objective.
Broadly speaking this consists of:
- data collection and webscraping
- data preprocessing stages
- data analysis stages and visualization
- auxiliary stuff, including some metrics visualization
- a webserver to serve preprocessed data to other external clients
- a documentation folder
- some shared stuff such as a couple of data folders containing small size data files
- a .venv virtual environment
Within the root directory there are other directories. One of those is called My-Fancy-Project-Name. (It isn't really, this is just an example name.)
The point being: My-Fancy-Project-Name collects together the Python code which all relates to one part of the project. Specifically, all the code for "data collection via webscraping" is contained in this directory.
There are a number of scripts which act like executables, each performing some function. One of those is the main webscraper; the others are smaller webscrapers which each perform a specific task, acting like small utilities. These "executables" are stored in a bin directory.
There is common code which is shared by each of the Python scripts ("executables"). This "library" code lives in a folder called lib.
Both bin and lib live in a folder called src. The intention is to add a doc folder which is adjacent to src.
I am running into issues where Python doesn't know how to "get to" the library folder code. This is unsurprising, because I would not expect Python to search any path other than the system-defined paths and the current directory from where the Python script is executed.
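To make the problem concrete, here is a minimal sketch that rebuilds a cut-down copy of the layout in a temporary directory and runs one of the scripts. The helper name run_demo and the import line `from lib.libio import libio` are assumptions made for illustration (the latter is just how I would like the import to look); they are not taken from my real code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_demo():
    """Rebuild a cut-down src/bin + src/lib layout and run one bin script.

    Returns (exit code, first sys.path entry seen by the script, stderr).
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp).resolve() / "src"
        script_dir = src / "bin" / "binary1"
        lib_dir = src / "lib" / "libio"
        script_dir.mkdir(parents=True)
        lib_dir.mkdir(parents=True)

        # Shared "library" code, mirroring src/lib/libio from the tree above.
        (lib_dir / "__init__.py").write_text("")
        (lib_dir / "libio.py").write_text("def read():\n    return 'data'\n")

        # The "executable": report where Python looks first, then try the import.
        main = script_dir / "main.py"
        main.write_text(
            "import sys\n"
            "print(sys.path[0])\n"
            "from lib.libio import libio  # hypothetical import\n"
        )

        # Run with src as the working directory to show that this does not help.
        result = subprocess.run(
            [sys.executable, str(main)],
            capture_output=True,
            text=True,
            cwd=src,
        )
        return result.returncode, result.stdout.strip(), result.stderr


if __name__ == "__main__":
    code, first_path, err = run_demo()
    print("exit code:", code)
    print("sys.path[0]:", first_path)
    lines = err.strip().splitlines()
    print("last line of traceback:", lines[-1] if lines else "")
```

When a script is run by path, the first entry of sys.path is the directory containing that script (here src/bin/binary1), not the working directory. So src is never searched and the import of lib fails, even when the script is launched from inside src.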
My question is, how should I structure my project in a sensible way so that I can have a folder of shared library code, called lib, which contains sub-modules or packages(?), which can be accessed by a set of Python "executables" (scripts) contained inside a bin directory?
Is this something that pyproject.toml or setuptools is meant to solve?