As a member of data science team, we do not only analyse / explore data with terrific algorithm, but also create some reports which can help others departments to make decisions or map out strategies. For creating reports, we create a python package which contains multiple elements. In this blog, I’ll talk about how we create the python package with following points:
- Structure of python package
- Detail for each element
Structure of python package
Detail for each element
init.py
The __init__.py
files are required to make Python treat the directories as
containing packages; this is done to prevent directories with a common name,
such as string, from unintentionally hiding valid modules that occur later on
the module search path. In the simplest case, __init__.py
can just be an
empty file, but it can also execute initialization code for the package or set
the __all__
variable. If you want more information, click here.
SQL file
In the SQL file, we specify all queries that we need for the report, with following format:
As you read, we put date_start
and date_end
as input, search stores
data
directly from database, and use all of them to filter sales
data.
Python file setup.py
setup.py
tells you that the module/package you are about to install has been
packaged and distributed with Distutils, which is the standard for distributing
Python Modules.
Python file job.py
This file describes how to create report with SQL file above.
This .py
script defines start date and end date of report with datetime.date
,
connects to Teradata database with teradata
package, read
SQL queries with pkg_resources.resource_filename()
, import SQL query’s input
with .format()
, and extract data with pandas.read_sql()
.
Other files
-
Test file
job_test.py
We write tests (e.g. unit tests) in this file. I’ll write another blog about how to write unit test with python in the near furture. -
.gitlab-ci.yml
&environment.yml
They are used for automating job with gitlab runner. If you are interesting in it, you can find more information here.
Reference
- “6.4. Packages”, doc.python.org. [Online]. Available: https://docs.python.org/3/tutorial/modules.html#packages
- TimHill, “Tree image”, pixabay. [Online]. Available: https://pixabay.com/photos/tree-sun-sunshine-summer-meadow-2916763
- Silfverstrom, “What is setup.py?”, stackoverflow. [Online]. Available: https://stackoverflow.com/questions/1471994/what-is-setup-py