Evaluating the cost of a query in database

We are working on a project using Google App Engine + Google Cloud Datastore. The language used is Python.

At some points of our API, we check if some arg is passed in request, and we make some query on database if arg isn’t null.

Example:

result = None
if request.name != '':
    result = ndb.Key('Users', request.name).get()
else:
    return "Missing args."

We all agree to not make query if there is no data to query. However, the get method returns null if no result was found. In this specific scenario, App Engine + Datastore, couldn’t we avoid the principle of not make some unnecessary query in favor of reduce cyclomatic complexity of code? My point is, all the nodes of architecture (app server, database, etc) are in a balanced infrastructure, the cost of a simple query really matters?

Example:

result = ndb.Key('Users', request.name).get()
if result:
    print 'Keep running...'
else:
    return "User not found."

Obviously this is a very small part of code to talk about complexity.

1

The cost of reaching out across a network is extremely expensive relative to doing a check against memory. The added complexity of a single if and break/return is 1 (most calculations incorrectly double-count the if and the return as 1 each.)

If you aren’t doing this much, it’s not going to matter much either way but it you are doing this many times (e.g thousands a second) it can add up. This matters more when you are paying the number accesses to the DB as is the case in many cloud offerings. It’s hard to justify that extra cost for such a small reduction in complexity.

The cost of making an unnecessary query may indeed be significant. Although this is not always so. It depends on your data, expected usage of the app, etc.

Performance aside, the second example is conceptually worse. The missing name appears to be an error state, which is properly handled in the application rather than passed on to the database.

If you simply pass on the null to the database and let the query fail, you end up providing less useful error information as output, because you can’t distinguish between name being blank, and a name being provided but not found.

Trả lời

Email của bạn sẽ không được hiển thị công khai. Các trường bắt buộc được đánh dấu *