Question

I cannot decide which approach is best performance-wise while still being maintainable (in the sense of having clear logic).

The question is set in the context of a Django web app, but I figure it applies to any similar scenario.

In my scenario, we visit a route that displays the many matches of a league or tournament associated with a particular season:

URL: season/<season_id>/

Associated Django ORM query: season.match_set.all()

A season has many divisions and, of course, matches are played between teams. The client can filter by division and/or by team. These filters can also be included in the URL (so users can share a link that is already filtered), e.g. season/<season_id>/#division=<division_name>, which shows only the matches belonging to the specified division.

However, even when visiting a route including a filter, the entire query is executed: season.match_set.all().

This is where I cannot decide. In terms of efficiency, it would be much better to fetch only the matches related to that division:

season.match_set.filter(division=division)
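
To make this second approach concrete, here is a minimal sketch of how the filters could be turned into arguments for filter(). The helper name build_match_filters and the lookups division__name and teams__name are assumptions about my models, not something fixed. It also assumes the filter reaches the server as a query-string parameter (?division=...), since the part of a URL after # is not sent with the request.

```python
def build_match_filters(params):
    """Translate optional request parameters into ORM filter keyword arguments.

    `params` is any mapping with a .get() method, such as Django's request.GET.
    The lookup names below are hypothetical and depend on the actual models.
    """
    filters = {}

    division = params.get("division")
    if division:
        filters["division__name"] = division

    team = params.get("team")
    if team:
        filters["teams__name"] = team

    return filters


# In a Django view this could be used roughly as:
#   matches = season.match_set.filter(**build_match_filters(request.GET))
# With no parameters the dict is empty, and filter() without arguments
# returns the same rows as all().
```

The point is that the unfiltered and filtered cases share a single code path, so the view does not need separate branches for each filter combination.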

However, it may be quite common for users to use the filters on the page, switch between them, and so on. With the second approach, each switch would mean an additional request and therefore an extra database hit to retrieve the filtered matches. This would not happen with the first approach, since we have the whole data set from the beginning: just one request and one database hit (although a heavier one).
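
For comparison, this is roughly what the first approach amounts to once all matches are loaded: switching filters is a pass over data already in memory. It is sketched in Python only to keep one language in this post (in the browser the same logic would be JavaScript), and the record shape (dicts with "division" and "teams" keys) and the sample data are made up for illustration.

```python
def filter_matches(matches, division=None, team=None):
    """Return the matches that satisfy every filter that is set."""
    result = []
    for match in matches:
        if division is not None and match["division"] != division:
            continue
        if team is not None and team not in match["teams"]:
            continue
        result.append(match)
    return result


# Hypothetical data standing in for the result of season.match_set.all().
all_matches = [
    {"id": 1, "division": "Division 1", "teams": ("Lions", "Bears")},
    {"id": 2, "division": "Division 1", "teams": ("Wolves", "Lions")},
    {"id": 3, "division": "Division 2", "teams": ("Hawks", "Owls")},
]

division_1 = filter_matches(all_matches, division="Division 1")
lions_in_division_1 = filter_matches(all_matches, division="Division 1", team="Lions")
```

No further requests are needed after the initial load, at the cost of transferring every match up front.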

We could try to optimize the second approach by storing filtered data as it is requested. For example, if we have a season with three divisions and the user filters by Division 1 (request 1), we store the result somewhere (on the client side, I figure). If they then filter by Division 2 (request 2), we do the same and add it to the stored data. Finally, if the user filters by Division 1 again, we just read it from the stored data and spare ourselves request 3.
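
A minimal sketch of that idea, again in Python for illustration: results are stored per division and the fetch only runs on a miss. The class name DivisionMatchCache and the fake_fetch function (which stands in for the real request) are hypothetical.

```python
class DivisionMatchCache:
    """Remember the matches already fetched for each division."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable: division name -> list of matches
        self._by_division = {}     # division name -> cached list of matches

    def get(self, division):
        # Only call the fetch function the first time a division is requested.
        if division not in self._by_division:
            self._by_division[division] = self._fetch(division)
        return self._by_division[division]


requests_made = []

def fake_fetch(division):
    """Stand-in for the real request; records each call so we can count them."""
    requests_made.append(division)
    return [f"{division} match"]


cache = DivisionMatchCache(fake_fetch)
cache.get("Division 1")   # request 1
cache.get("Division 2")   # request 2
cache.get("Division 1")   # served from the stored data, no request 3
```

Note that this sketch never invalidates anything, so stored results can go stale if matches change on the server, which is part of what worries me about this approach.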

However, I have concerns about keeping the logic and code clear, as I mentioned before, because this last optimization can easily become convoluted and unreliable.

My question: what is the go-to approach? This is a fairly common scenario, so I figure there must be a consensus on which is more efficient: fetching all database entries in a single request, or performing multiple requests and database queries and getting the data as it is requested?

Data filtering & requests: fetch all entries or split data?
