How does one add observability to Python code without hurting code quality?

  softwareengineering

We’re working on an agile project and designing as we go on a new Python command-line app / systemd service for some fancy in-house project. Right now, we’re supposed to be adding an observability / logging layer to the program. According to the product owner and the guy on our team who knows the most about observability (not me), we’re supposed to send basically every function call we make to an OTEL (OpenTelemetry) collector so we can 100% recreate what the program did.

This is supposed to create hierarchical spans of work so we can use a dashboard to see what it did (it’ll basically look like a Gantt chart / timeline of all the work, so we can see bottlenecks or whatever).

I’m of the opinion that this is rather insane and if I log every function call I’ll ruin my app’s performance and denude the program of any semblance of thread safety (not that that has been a concern thus far).

Furthermore, it seems like, in order to create all these child spans for OTEL collectors, you need to have some global variables watching everything (which also makes multithreading a bigger chore and muddies up the whole program).

I’m not opposed to a global logger; since that doesn’t really keep any state, it doesn’t hurt too much. What kinds of patterns do people use when adding observability to Python code that don’t:

  • add extra, unnecessary variables to every function
  • add shaky and unsafe read/write global variables
  • slow the program to a crawl on network IO.
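As a rough illustration of what a pattern meeting those three constraints could look like, here is a stdlib-only sketch, not a recommendation of a specific library setup. The names `traced`, `shutdown`, `exported`, and `parse_config` are hypothetical. A decorator records each call without changing the function's signature, and events go onto a bounded thread-safe queue that a background thread drains, so the caller never waits on the network. The list `exported` stands in for the real send to a collector.

```python
import functools
import queue
import threading
import time

# Hypothetical names for illustration; a stand-in for a batching exporter.
_events = queue.Queue(maxsize=10_000)  # bounded, thread-safe buffer
exported = []  # stand-in for the network call to a collector


def traced(func):
    """Record the name and duration of each call; the signature is unchanged."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            event = (func.__name__, time.perf_counter() - start)
            try:
                _events.put_nowait(event)  # never blocks the caller
            except queue.Full:
                pass  # drop telemetry rather than stall the program
    return wrapper


def _export_loop():
    while True:
        event = _events.get()
        if event is None:  # sentinel: stop the exporter thread
            break
        exported.append(event)  # a real exporter would batch and send here


_exporter = threading.Thread(target=_export_loop, daemon=True)
_exporter.start()


def shutdown():
    """Flush everything queued so far and stop the background thread."""
    _events.put(None)
    _exporter.join()


@traced
def parse_config(path):
    return {"path": path}
```

Calling `parse_config("a.toml")` behaves exactly as it would undecorated; the timing event only appears in `exported` once the background thread has picked it up (or after `shutdown()`). The trade-off in this sketch is that telemetry is dropped when the queue is full, which is a design choice, not a requirement.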

Right now, I’ve got a pretty good rating with the linter and the complexity of the program is low. It seems like a violently abhorrent tradeoff to add observability into a relatively simple program just so managers can tick a box – but I don’t want to get into the philosophy of this, just figure out a way to do it without making the program look like crap.
