I have an application that has been written in Python and it is used for generating various MIS reports for a specified month. It is now written without using the OOPs concepts or any kind of design patterns. I would like to refactor the application so that it will be maintainable and extensible. At this point, I have three reports and it will be increased in the future. There is a plan to move all the reports (about 25) to this platform so that the reporting system will act like an independent microservice.
Now the code is written in a procedural way as follows:
app.py
conn = get_db_connection()
cur = conn.cursor()
workbook = excel.create_workbook(filename)
excel.write_report_details(workbook, filename, year, month)
for current_month in range(int(month), 0, -1):
cur.execute('SELECT get_data(%s,%s,%s); ', (str(current_month), str(year)))
data = json.loads(json.dumps(cur.fetchall()))
row = data[0][0]
data_report1 = row['data_report1']
data_report2 = row['data_report2']
data_report3 = row['data_report3']
data_report4 = row['data_report4']
data_report5 = row['data_report5']
excel.write_report1(data_report1,workbook. year, month, logger) -> creates excel sheet in workbook and writes data
excel.write_report2(data_report2, workbook, year, month, logger, type='one_type') -> creates excel sheet and writes data based on the type
excel.write_report2(data_report2, workbook, year, month, logger, type='another_type') -> creates excel sheet and writes data based on the type
workbook.save(filename)
conn.close()
here the excel.py
is another python file from which the functions and imported to the main app.py
.
This looks simple, but as the number of reports increases, it will be difficult to manage the code as every time I need to add a method and all of these methods are in a single file. I have done some refactoring and moved some functions from excel.py
to another file and imported them to the excel
file. Also, there are multiple configurations like the report file names, report column headers, etc, which are specific to each of the reports.
Now my idea is to create a common Report
and extend this class for each of the reports. The base class will contain a method generate_report()
and this will be overridden in each of the sub-classes. As I need to generate all these reports, I have confused about whether I need to create individual objects for all these subclasses or if is there any alternative method for doing this. Also, if there is any better architectural pattern, please suggest one. I have checked the Factory Pattern, but not sure it is the correct pattern that I need to follow here.
This doesn’t really seem like a job for a plethora of subclasses.
If you really want general purpose solution, instead of baking a specific notion of a configuration into a subclasses, use a first-class notion of configuration (a configuration object and/or data structure), at one extreme perhaps even stored in and loaded from a text file, to define the shape of the worksheet data and their data types and column headers (if you cannot derive these from the SQL query).
Then anyone can edit or add to the configuration without necessarily modifying & rebuilding the program. In theory, this is a simplification.
The downside is that this moves into the DSL business, which can be a slippery slope and get out of control, by itself turning into a general purpose programming language and making the editors of configuration (i.e. the users of the program) into programmers of a possibly not so well defined language. Therefore in going this route, will want to find a good balance between simple configuration specification and the individual capabilities that need to be coded and added into the program to implement those specifications.
1
Before going nuts with objects and patterns start with the simplest separation you can.
def gatherData(generateReports) {
conn = get_db_connection()
cur = conn.cursor()
workbook = excel.create_workbook(filename)
excel.write_report_details(workbook, filename, year, month)
for current_month in range(int(month), 0, -1):
cur.execute('SELECT get_data(%s,%s,%s); ', (str(current_month), str(year)))
data = json.loads(json.dumps(cur.fetchall()))
row = data[0][0]
generateReports(row, excel, year, month, logger)
workbook.save(filename)
conn.close()
Now the code that is likely to change is separated from the code that isn’t. Writing the first version of generateReports
should be a simple copy n paste. Write other versions as you need them. Oh sure you can make this fancier with subclasses and what not but however that is solved keep it hidden from here.
Huh, come to think of it, I believe this is called the hole-in-the-middle pattern.