I’m writing a Python app that instructs a database to perform various processing steps on partitioned tables. The processing can take long enough that there might be a timeout, so I surround my calls to the database with try: ... except: blocks like:
while True:
    try:
        process_one(table, logger, cursor)
    except OperationalError as oe:
        logger.error(oe)
        con, cursor = retry_connection(logger, dbset)
    else:
        break
Where process_one() is a call to a PostgreSQL function, like:
def process_one(table, logger, cursor):
    logger.info('Processing table %s', table)
    cursor.execute('SELECT do_something(%(table)s)', {'table': table})
But if I have a sequence of processing functions to execute, and don’t want to repeat a procedure when there is a connection issue, this turns into:
while True:
    try:
        process_one(...)
        ...

while True:
    try:
        process_two(...)
        ...
Would the DRY ideal be to have a function that performs the while True / try / except / break loop and takes the different processing functions as a parameter?
def try_process(process, processparams):
    while True:
        try:
            process(processparams)
        except OperationalError as oe:
            ...
Yes, that design makes sense (based on the information given here).
Assuming that each process has exactly the same error-handling requirements, having a single function that does that error handling, so you don’t have to repeat it for each process, is a good design.
One can imagine that in the future you might want to add additional error-handling logic, and having this in one function keeps things manageable even if you end up with quite complex logic that handles a whole bunch of different error states.
You probably shouldn’t have this retry logic at all. Instead, you should resolve what’s failing: in ordinary usage you shouldn’t be getting OperationalError exceptions. If your queries take too long to run, increase the timeout. If at all possible, fix the reasons your code is failing instead of retrying.
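For example, if the timeouts come from PostgreSQL’s statement_timeout setting, it can be raised for the session that runs the long jobs (the 30-minute value below is purely illustrative, not a recommendation for your workload):

# Raise the per-statement timeout for this session only; pick a value that
# comfortably covers the longest expected partition job.
cursor.execute("SET statement_timeout = '30min'")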
(To be clear, there are cases where retries make sense. It’s possible you are in one of them, but it’s relatively rare.)
Also, you should pretty much never retry in an infinite loop. You really want to retry a limited number of times and then give up. If something is seriously wrong, you don’t want to sit in an infinite loop retrying it; you want to fail so that someone is notified there is a problem.
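A minimal sketch of what that could look like, reusing retry_connection() and dbset from your question; max_retries and the psycopg2 import are my assumptions, since you don’t say which driver you use:

from psycopg2 import OperationalError

def try_process(process, logger, dbset, cursor, max_retries=3):
    """Run process(cursor), reconnecting and retrying a limited number of times."""
    for attempt in range(1, max_retries + 1):
        try:
            return process(cursor)
        except OperationalError as oe:
            logger.error('Attempt %d/%d failed: %s', attempt, max_retries, oe)
            if attempt == max_retries:
                raise  # give up so the failure is surfaced instead of looping forever
            con, cursor = retry_connection(logger, dbset)

# e.g. try_process(lambda cur: process_one(table, logger, cur), logger, dbset, cursor)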
Assuming that you should be retrying, I think you may have chosen the wrong level of abstraction. You are going to have many different functions that make database calls you want to retry. However, these all go through a small number of functions in the database API (mostly the execute method). So instead of adding retry logic to every one of your functions, it would make more sense to wrap the relatively small number of Postgres API functions with retry logic.
Instead of writing a lot of code like this:
def process_one(table, logger, cursor):
    logger.info('Processing table %s', table)
    cursor.execute('SELECT do_something(%(table)s)', {'table': table})

try_process(process_one, table, logger, cursor)
Write a replacement cursor class:
class Cursor:
    def __init__(self, cursor):
        self._cursor = cursor

    def execute(self, *args):
        for _ in range(NUMBER_OF_RETRIES):
            try:
                return self._cursor.execute(*args)
            except OperationalError:
                pass  # retry
And use this wrapper class instead of the raw PostgreSQL cursor. Since most problematic statements go through the execute function, this handles most of the cases and avoids having to insert retry logic all over the place.
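Usage would then look roughly like this (assuming psycopg2 is the driver and that dbset is a dict of connection parameters, which your question doesn’t spell out):

import psycopg2

con = psycopg2.connect(**dbset)
cursor = Cursor(con.cursor())       # wrap the real cursor once
process_one(table, logger, cursor)  # call sites stay unchanged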
Yes, looks fine, although you wouldn’t need to require the process function to always take a single parameter:
def try_process(process):
    while True:
        try:
            process()
        except OperationalError as oe:
            ...
And if you need to call it with a parameter, you use a lambda:
try_process(lambda: process(processparams))
(I would probably add a limit to the number of retries, though.)
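If you prefer not to use a lambda, functools.partial from the standard library does the same job of pre-binding the argument:

from functools import partial

try_process(partial(process, processparams))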