Is there an official name for the “one object disease” anti-pattern (iterative single object operations on databases, services etc.)?

It is caused by the naive programming paradigm: focus on just a single object, do something with it, and if you have to work with many objects, you loop, iterate and traverse, repeating the operation on each object you come across. This is usually OK for rather small collections in memory.

It is wrong in an environment where either a set-based approach can be handled much better, or every individual operation has costly overhead, so it’s better to group them. Typical examples are databases and services, which require connections and round trips for each operation and can handle set operations much better.

For example, a naively coded data access layer will load a collection of aggregate root/parent objects, then loop over the collection to retrieve child objects for each, and then over each child object to retrieve “grandchild” objects by individual queries, instead of making one query for all parents’ and then one for all children’s subsequent data rows/objects. Because N parents require 1 query for the parents and then N more for their children, this problem is known as SELECT N+1, and it frequently brings database and application performance to its knees.

Bad:

var parents = dbaccess.SelectParentsFromDb(fromDateTime, toDateTime);
foreach(var parent in parents)
{
    parent.Children = dbaccess.GetChildrenForParent(parent.Id);
}

Good:

var parents = dbaccess.SelectParentsFromDb(fromDateTime, toDateTime);
var parentIds = parents.Select(p => p.Id).ToArray();
var allChildren = dbaccess.GetChildrenForParents(parentIds);
foreach(var parent in parents)
{
    parent.Children = allChildren
        .Where(c => c.ParentId == parent.Id)
        .ToList();
}
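
The Where filter in the “Good” version rescans allChildren once per parent. That happens in memory and is far cheaper than N database round trips, but for large result sets it may be worth grouping the children once. A minimal sketch, assuming hypothetical Parent and Child classes shaped like the ones implied above (Id, ParentId, Children), with in-memory lists standing in for the results of the two queries:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-ins for the results of the two set-based queries.
var parents = new List<Parent>
{
    new Parent { Id = 1 },
    new Parent { Id = 2 },
    new Parent { Id = 3 },
};
var allChildren = new List<Child>
{
    new Child { Id = 10, ParentId = 1 },
    new Child { Id = 11, ParentId = 1 },
    new Child { Id = 12, ParentId = 2 },
};

// Group all children by parent id in a single pass.
var childrenByParent = allChildren.ToLookup(c => c.ParentId);

foreach (var parent in parents)
{
    // A lookup yields an empty sequence for a missing key,
    // so parents without children simply get an empty list.
    parent.Children = childrenByParent[parent.Id].ToList();
}

foreach (var parent in parents)
{
    Console.WriteLine($"Parent {parent.Id}: {parent.Children.Count} children");
}

// Hypothetical entity shapes, assumed for illustration only.
class Parent
{
    public int Id { get; set; }
    public List<Child> Children { get; set; } = new List<Child>();
}

class Child
{
    public int Id { get; set; }
    public int ParentId { get; set; }
}
```

The number of database queries stays at two either way; the lookup only changes the in-memory assignment from one scan per parent to a single grouping pass.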

In direct database coding, like with PL/SQL or T-SQL, this anti-pattern would be visible as a DB cursor to iterate over rows, rather than using set-based SQL or DML commands. This is the main reason why cursors are usually considered bad in database programming.

With web services or other remote sources, similar disadvantages may occur: huge overhead and time loss through a large number of single-object queries/operations.
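
To make the round-trip cost concrete, here is a rough sketch that only counts calls. FakeRemoteService, GetItem and GetItems are hypothetical names invented for this illustration, not a real API; a real service would add network latency to every counted call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ids = Enumerable.Range(1, 100).ToArray();

// One object at a time: one round trip per id.
var perItem = new FakeRemoteService();
var singleResults = ids.Select(id => perItem.GetItem(id)).ToList();

// Batched: all ids travel in a single request.
var batched = new FakeRemoteService();
var batchResults = batched.GetItems(ids);

Console.WriteLine($"Per-item round trips: {perItem.RoundTrips}");
Console.WriteLine($"Batched round trips: {batched.RoundTrips}");

// Hypothetical stand-in for a remote service; it only counts calls.
class FakeRemoteService
{
    public int RoundTrips { get; private set; }

    // Simulates a single-object request.
    public string GetItem(int id)
    {
        RoundTrips++;
        return $"item-{id}";
    }

    // Simulates a batch request carrying many ids at once.
    public List<string> GetItems(IEnumerable<int> ids)
    {
        RoundTrips++;
        return ids.Select(id => $"item-{id}").ToList();
    }
}
```

Both variants return the same 100 items, but the first makes 100 calls and the second makes one. Whether a batch endpoint exists, and how large a batch it accepts, depends on the service in question.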

Sometimes developers insist on this “handle one object, repeat for multiple” approach, at least as long as they don’t recognize the consequences; for this reason, I started to call this anti-pattern “one object disease”.

Is there a more “official”, general name for it?

As Robert Harvey’s comment suggests, I would expect to see this referred to as an “N + 1” problem, as a general term for similar scenarios where an O(n) or even O(n^2) implementation is used when an O(1) solution is available.

A “Schlemiel the Painter’s algorithm” is similar but not exactly what you’re describing here.
