Does a repository that returns a graph of entities violate SRP?

  softwareengineering

I’m working with the following scenario:

A Post entity has many Image entities.

I also have a repository for each entity:

  • PostRepository
  • ImageRepository

Since these entities are tightly related, when I get a Post I also want to return its list of images.

I have two possible ways of doing this:

  1. The orchestrator (in this case a PostService) calls each repository independently to fetch the data.
    PROs:
    • Certainly doesn’t violate SRP.
    CONs:
    • Creates extra dependencies: PostService might not otherwise need to depend on ImageRepository.
    • Unnecessary calls to the database.
  2. The PostRepository returns the whole graph: the Post with its Images.
    PROs:
    • Reduces dependencies.
    • Simplifies the database query.
    CONs:
    • Might violate SRP.
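
To make the two options concrete, here is a minimal Python sketch. The in-memory `POSTS` and `IMAGES` data, the method names `find`, `find_by_post` and `find_with_images`, and the field names are all assumptions for illustration; a real repository would query a database instead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    id: int
    post_id: int
    url: str


@dataclass
class Post:
    id: int
    title: str
    images: List[Image] = field(default_factory=list)


# Hypothetical in-memory stand-ins for the database tables.
POSTS = {1: "Hello"}
IMAGES = [Image(1, 1, "a.png"), Image(2, 1, "b.png")]


class ImageRepository:
    def find_by_post(self, post_id: int) -> List[Image]:
        return [img for img in IMAGES if img.post_id == post_id]


class PostRepository:
    def find(self, post_id: int) -> Post:
        # Used by option 1: returns the Post only, without images.
        return Post(post_id, POSTS[post_id])

    def find_with_images(self, post_id: int) -> Post:
        # Option 2: the repository assembles the whole graph itself.
        images = [img for img in IMAGES if img.post_id == post_id]
        return Post(post_id, POSTS[post_id], images)


class PostService:
    """Option 1: the service depends on both repositories."""

    def __init__(self, posts: PostRepository, images: ImageRepository):
        self.posts = posts
        self.images = images

    def get_post(self, post_id: int) -> Post:
        post = self.posts.find(post_id)
        post.images = self.images.find_by_post(post_id)
        return post


option_1 = PostService(PostRepository(), ImageRepository()).get_post(1)
option_2 = PostRepository().find_with_images(1)
```

Both calls produce the same Post graph; the difference is only in which class knows about images.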

In a write scenario I clearly prefer the 1st option, since I want to keep validation of the entities completely segregated. But for a read scenario I think the 2nd option could be better; however, I couldn’t find any good argument or reading on this.

So which option is better for the reading scenario?

2

In my experience, at least in application development, it is seldom necessary to decide this beforehand.

Instead, I would recommend starting with one repo class and seeing how far it takes you. When the code base grows and you get indications that splitting the repo in two would be beneficial, refactor immediately. Such indications are:

  • the class becomes “too large”: readability decreases, and it takes more and more time to find the right place in that class for extensions and changes

  • you want to mock out either the db access for images or posts (independently from each other) for testing the other in isolation

  • you want to reuse the repo functionality of images or posts on its own.

  • the API of your repos gets simpler to use when split up into two classes.

There is also a third alternative to the two designs you suggested: let PostRepository provide an API which can return the whole graph, but let it internally use an ImageRepository for that, injected when PostRepository is constructed. That may be useful when the orchestration between the two repos is so simple that it does not seem worth introducing a separate orchestrating class.
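
A minimal Python sketch of this third alternative could look like the following. The storage is in-memory, and the method names `find_by_post` and `find_with_images` as well as the constructor arguments are assumptions made for the example, not part of the original question.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Image:
    post_id: int
    url: str


@dataclass
class Post:
    id: int
    title: str
    images: List[Image] = field(default_factory=list)


class ImageRepository:
    def __init__(self, rows: List[Image]):
        self._rows = rows  # hypothetical stand-in for an image table

    def find_by_post(self, post_id: int) -> List[Image]:
        return [img for img in self._rows if img.post_id == post_id]


class PostRepository:
    def __init__(self, titles: Dict[int, str], image_repo: ImageRepository):
        self._titles = titles  # hypothetical stand-in for a post table
        self._image_repo = image_repo  # injected at construction time

    def find_with_images(self, post_id: int) -> Post:
        # Image access is delegated, so ImageRepository can still be
        # mocked or reused on its own.
        post = Post(post_id, self._titles[post_id])
        post.images = self._image_repo.find_by_post(post_id)
        return post


image_repo = ImageRepository([Image(1, "a.png"), Image(2, "b.png")])
post_repo = PostRepository({1: "Hello", 2: "World"}, image_repo)
post = post_repo.find_with_images(1)
```

Callers only depend on `PostRepository`, while the image access stays in its own class.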

The SRP is not an end in itself, it is a means to an end. So apply it when your code requires it, not “just in case”.

Of course, you have to make sure you don’t miss the point in time when you have to refactor. And things may look different when you are trying to design a library with a stable (=hard to refactor) API.

2

Yes, it is a violation of SRP. The repositories are designed with the intent to each handle one particular entity, and you’re having your repository handle two (or more) entity types at the same time.

However, it is not wrong to do this*, because it can noticeably reduce the performance cost. If you handle each entity in its own repository, you need two database calls instead of one, which is more expensive.

This is one of those cases where practical considerations (performance due to networked database calls) outweigh the theoretical (perfectly adhering to SRP).
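
As an illustration of fetching the graph in a single call, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`post`, `image`, `post_id`, `url`) and the function name `find_post_with_images` are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE image (id INTEGER PRIMARY KEY, post_id INTEGER, url TEXT);
    INSERT INTO post VALUES (1, 'Hello');
    INSERT INTO image VALUES (1, 1, 'a.png'), (2, 1, 'b.png');
""")


def find_post_with_images(conn, post_id):
    # One round trip: the JOIN returns one row per image of the post.
    rows = conn.execute(
        """
        SELECT p.id, p.title, i.url
        FROM post AS p
        LEFT JOIN image AS i ON i.post_id = p.id
        WHERE p.id = ?
        ORDER BY i.id
        """,
        (post_id,),
    ).fetchall()
    if not rows:
        return None  # no such post
    return {
        "id": rows[0][0],
        "title": rows[0][1],
        # A post without images yields a single row with url = NULL.
        "images": [url for _, _, url in rows if url is not None],
    }


graph = find_post_with_images(conn, 1)
```

The one-entity-per-repository version would issue one query against `post` and a second against `image` to build the same result.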


*There are other solutions to this problem. I’m merely pointing out that returning a graph is acceptable because of the performance benefits you get from doing so, as opposed to sticking with a pure one-entity-per-repository approach.

What you are experiencing is normal, and it is due to the dual nature of the application: write and read. In your case something does in fact have two responsibilities, but not what you think: the model is used both to write and to read. So the problem is not the Repository but the Aggregates.

CQRS solves this modeling problem. It splits the model in two: write/command and read/query.

In your case you would have the Post and Image Aggregates with their corresponding repositories on the write side. For the read side you would have a specially designed ReadModel that gives you a list of posts and their images in an optimized format. This would be its single responsibility. This ReadModel would be kept up to date by listening to changes from the Aggregates on the write side. The most common solution is to use Domain events.

As a note, you create a ReadModel per use case. In this way the ReadModel is simple, fast and perfectly fitted for one job. For example, for the Newest posts widget on the Homepage you have a NewestPostsHomepageReadModel that contains only 10 rows, the newest 10 posts, each post having only one image, the main image.
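
A minimal in-memory sketch of such a read model in Python might look like this. The `PostPublished` event, its fields, and the `on_post_published` handler name are assumptions for illustration; how events are actually dispatched from the write side is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PostPublished:
    """Hypothetical domain event emitted by the write side."""
    post_id: int
    title: str
    main_image_url: Optional[str]


@dataclass(frozen=True)
class PostRow:
    post_id: int
    title: str
    main_image_url: Optional[str]


class NewestPostsHomepageReadModel:
    def __init__(self, limit: int = 10):
        self._limit = limit
        self._rows: List[PostRow] = []

    def on_post_published(self, event: PostPublished) -> None:
        # Put the newest post first and drop everything beyond the limit.
        row = PostRow(event.post_id, event.title, event.main_image_url)
        self._rows.insert(0, row)
        del self._rows[self._limit:]

    def newest(self) -> List[PostRow]:
        # Reads are a plain copy of the prepared rows: no joins needed.
        return list(self._rows)


read_model = NewestPostsHomepageReadModel()
for n in range(1, 13):
    read_model.on_post_published(PostPublished(n, f"Post {n}", f"{n}.png"))
```

After twelve events the read model still holds only the ten newest posts, each with its single main image.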

CQRS is not trivial, but it makes your models a lot simpler, as they should be.
