SOA and the N + 1 Selects Problem


February 17, 2009


Service Oriented Architectures (or microservices, as they're called now) add a new twist to an old problem. Imagine this simple scenario:

You are an architect at an online retailer. Your business owners have thrown down the gauntlet. Well, not really; they've just added a new requirement: they'd like to see a list of all orders for customers matching some specific criteria (e.g. city, name, registration date, etc.). For example, show me all orders for customers named "Holmes" from "Pittsburgh, PA". Further, this isn't for a static report, but rather a dynamic screen in their admin UI.

Sounds simple, right? In the old days of integrated applications, it would be - just join the customer and order tables and apply said criteria. Voila!
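To make that concrete, here's a rough sketch of the integrated approach - one query, with the join done by the database. The table and column names (and the JDBC wiring) are made up for illustration:

```java
import java.sql.*;
import java.util.*;

// Integrated application: one SQL join answers the whole question.
// Schema names below are hypothetical.
public class IntegratedOrderQuery {
    public List<String> ordersForCustomers(Connection db, String lastName, String city)
            throws SQLException {
        String sql =
            "SELECT c.customer_id, c.last_name, o.order_id, o.order_date " +
            "FROM customer c " +
            "JOIN orders o ON o.customer_id = c.customer_id " +
            "WHERE c.last_name = ? AND c.city = ?";
        List<String> rows = new ArrayList<>();
        try (PreparedStatement ps = db.prepareStatement(sql)) {
            ps.setString(1, lastName);   // e.g. "Holmes"
            ps.setString(2, city);       // e.g. "Pittsburgh, PA"
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(rs.getString("last_name") + " -> order " + rs.getLong("order_id"));
                }
            }
        }
        return rows;
    }
}
```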

Ahh...but it's a brave new world of SOA, and some old tricks no longer apply. In your architecture, instead of one integrated application, you have two loosely coupled services, Customer and Order. Both services are deployed independently and have their own data - i.e. the Customer service has a Customer data store and the Order service an Order store.

The question, then: how can you answer a business question about customers and orders, given that the data spans multiple services?

Here are four solutions I've come across:

1. Retrieve from Service: A pure SOA architecture may dictate that access to another service's data necessitates a call through that service's exposed interface - and this is where you come face-to-face with the n + 1 selects problem. In this architecture, your Customer service would first have to query the customer data store for the list of customers matching the criteria. It would then, for each customer, call the Order service to return all orders associated with that customer. Therefore, for n customers matching the criteria, n + 1 queries would need to be executed (on top of n remote calls to the Order service!). The performance costs here are significant, and I imagine unsatisfactory in most cases. (There's a sketch of this pattern after the list below.)

2. Retrieve Directly: Another option would be to open read-only access to the Order data store from the Customer service. This would improve performance by saving the remote calls to the Order service (and perhaps the associated data binding), but still wouldn't circumvent the "n + 1 selects" problem, since without a database join from customers to orders, a query of orders for each customer is still necessary.

3. Cache Locally: A third route would be to cache another service's data locally (perhaps through some pub-sub, Master Data Management solution). In the scenario above, some customer-specific data could be cached in the Order store, such that finding all orders for customers living in "Pittsburgh" could be handled by the Order service alone. Alternatively, there could be a special search service which caches both customer and order data locally and is used exclusively to handle queries across data stores. (A sketch of this approach also follows the list below.)

4. Retrieve from Central Database: A fourth solution, not depicted on the diagram, would be to have only one master database from which both services read and write. Doing so would improve performance and perhaps reduce complexity, but much of the loose-coupling and separation-of-concerns benefit of your SOA would not be realized - for example, a change to the orders table could end up breaking the Customer service.
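
To make option 1 concrete, here's a minimal sketch of the n + 1 pattern inside the Customer service. CustomerRepository and OrderServiceClient are hypothetical stand-ins for the local customer data store and the Order service's exposed interface:

```java
import java.util.*;

// Hypothetical stand-ins for the Customer data store and the remote Order service.
interface CustomerRepository {
    List<Customer> findByLastNameAndCity(String lastName, String city); // 1 local query
}
interface OrderServiceClient {
    List<Order> getOrdersForCustomer(long customerId);                  // 1 remote call per customer
}
record Customer(long id, String lastName, String city) {}
record Order(long id, long customerId) {}

class CustomerOrderReport {
    private final CustomerRepository customers;
    private final OrderServiceClient orderService;

    CustomerOrderReport(CustomerRepository customers, OrderServiceClient orderService) {
        this.customers = customers;
        this.orderService = orderService;
    }

    // Option 1: one query for the matching customers, then one remote call (and
    // one query on the Order side) for each of the n matches -- n + 1 in total.
    Map<Customer, List<Order>> ordersByCustomer(String lastName, String city) {
        Map<Customer, List<Order>> result = new LinkedHashMap<>();
        for (Customer c : customers.findByLastNameAndCity(lastName, city)) {
            result.put(c, orderService.getOrdersForCustomer(c.id()));
        }
        return result;
    }
}
```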
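
And a sketch of option 3: the Order service subscribes to customer-change events and keeps a small, denormalized snapshot of just the customer fields it needs, so the whole query runs locally. The event and store types are again made up for illustration:

```java
import java.util.*;

// Hypothetical event published by the Customer service (e.g. over a message bus)
// whenever a customer is created or updated.
record CustomerChangedEvent(long customerId, String lastName, String city) {}

// Hypothetical local store inside the Order service holding a denormalized
// snapshot of just the customer fields needed for order queries.
interface CustomerSnapshotStore {
    void upsert(long customerId, String lastName, String city);
    List<Long> findCustomerIds(String lastName, String city);
}
interface OrderRepository {
    List<Long> findOrderIdsByCustomerIds(Collection<Long> customerIds); // one local query
}

class OrderSearch {
    private final CustomerSnapshotStore snapshots;
    private final OrderRepository orders;

    OrderSearch(CustomerSnapshotStore snapshots, OrderRepository orders) {
        this.snapshots = snapshots;
        this.orders = orders;
    }

    // Pub-sub handler: keep the local snapshot in sync as customer data changes.
    void onCustomerChanged(CustomerChangedEvent event) {
        snapshots.upsert(event.customerId(), event.lastName(), event.city());
    }

    // The cross-service question is now answered entirely inside the Order service.
    List<Long> orderIdsForCustomers(String lastName, String city) {
        return orders.findOrderIdsByCustomerIds(snapshots.findCustomerIds(lastName, city));
    }
}
```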

You may argue that this is a straw-man example - in the real world, these services wouldn't exist in this form (the service granularity is wrong, and so on). Possibly true. However, any SOA will have multiple services, and some of those services will need access to data that spans other services, so the problem will still remain in some form.

Anyway, thanks for reading, and I'd be very interested to hear your thoughts or solutions!

I'm an "old" programmer who has been blogging for almost 20 years now. In 2017, I started Highline Solutions, a consulting company that helps with software architecture and full-stack development. I have two degrees from Carnegie Mellon University, one practical (Information and Decision Systems) and one not so much (Philosophy - thesis here). Pittsburgh, PA is my home where I live with my wife and 3 energetic boys.
I recently released a web app called TechRez, a "better resume for tech". The idea is that instead of sending out the same-old static PDF resume that's jam packed with buzz words and spans multiple pages, you can create a TechRez, which is modern, visual, and interactive. Try it out for free!
Comments (10)
Chris Burnley
February 17, 2009
I've come across this problem many times where I work (a bank). The solution is usually to cache locally with pub/sub updates. The overhead, obviously, is creating your own storage for the cached data.
todd
February 18, 2009
You could create a service layer that was in charge of matching the customers and orders so everyone didn't have to create the same logic. And of course there should be batch interfaces so more than one item can be returned at a time.
Ben
February 18, 2009
Thanks for the comments!

@Chris - agreed that local caching is typically the best option for performance and perhaps complexity, but like you said, the costs of implementing this are non-trivial and there are big decisions to make (e.g. cache in memory or in a database? is staleness an issue? how to implement the pub-sub mechanism? etc.).

@Todd - A service layer like this could work, especially if there are big reuse opportunities. Good point about the batch interface.
Ariel
February 19, 2009
If those are the requirements of the application, then it means those services are not loosely coupled and should not be separate services.
Ryan
February 20, 2009
There’s a very simple fifth solution, which is to add a “loadOrdersByCustomers” call to the Order service. Then you’re down to two selects: load customers by criteria + load orders for a set of customers. As within a single service/project/application, loose coupling is a great design goal until it begins to impede something else, like performance in this case. Then you start adding specialized code for specific use cases. Your scenario assumes that the Order service is written without consideration of outside requirements, and you’ve shown that that’s a bad idea.
Ben
February 23, 2009
Thanks for the comments, Ariel and Ryan.

@Ariel - good point…the example is a bit fabricated, so in the real world, two such services might not make sense if these were indeed the requirements. In general, however, I believe the problem is general to most service oriented architectures - most likely you will encounter a requirement that necessitates the merging of data across services.

@Ryan - good catch! This is a fifth solution; however, it seems like it too would have some performance impact, and would also be fairly cumbersome from an implementation perspective, since the "join" logic would need to exist in your application code rather than your database. Your first query would return a set of customers. Presumably, your second query would then use the customer IDs from the first query in the WHERE clause (like "WHERE customer_id IN (1, 5, 23, …)") and find all the orders associated with those customers. The business tier, then, would be responsible for merging the orders lists with the customer list, which could hurt performance (and maintainability).
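
Roughly, I'd picture the business-tier "join" looking something like this - the types and the loadOrdersByCustomers operation are made up for illustration:

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch of Ryan's two-query approach. loadOrdersByCustomers is the hypothetical
// batch operation added to the Order service (backed by a
// "... WHERE customer_id IN (1, 5, 23, ...)" query on the Order side).
interface CustomerRepository {
    List<Customer> findByLastNameAndCity(String lastName, String city);
}
interface OrderServiceClient {
    List<Order> loadOrdersByCustomers(Set<Long> customerIds);
}
record Customer(long id, String lastName, String city) {}
record Order(long id, long customerId) {}

class BatchedCustomerOrderReport {
    Map<Customer, List<Order>> ordersByCustomer(CustomerRepository customers,
                                                OrderServiceClient orderService,
                                                String lastName, String city) {
        // Query 1: customers matching the criteria.
        List<Customer> matches = customers.findByLastNameAndCity(lastName, city);
        Set<Long> ids = matches.stream().map(Customer::id).collect(Collectors.toSet());

        // Query 2 (one remote call): all orders for that set of customers.
        Map<Long, List<Order>> ordersByCustomerId = orderService.loadOrdersByCustomers(ids)
                .stream()
                .collect(Collectors.groupingBy(Order::customerId));

        // The "join" now lives in the business tier rather than the database.
        Map<Customer, List<Order>> result = new LinkedHashMap<>();
        for (Customer c : matches) {
            result.put(c, ordersByCustomerId.getOrDefault(c.id(), List.of()));
        }
        return result;
    }
}
```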
Steve Ayers
February 25, 2009
This is a cool post, at least to me, since I have come across this issue twice in the past six months, each presenting one of the above predicaments.

In one scenario, I have a central database, which I'm guessing is going to be the case the lion's share of the time this problem presents itself. In this case, a simple join of the two tables was all it took. I understand the tight coupling of the two services, but there are always going to be tradeoffs in any solution, and obviously the correct approach is contingent on the circumstances of the problem. In this instance, a little tight coupling, in my opinion, is far better than the n+1 performance issue. I attempted the solution incorporating n+1 and the time it took was exponentially higher.

In the other scenario, we have more of an According-to-Hoyle SOA environment, in that one set of data resides in a different repository from another set. In this instance, caching seems to be a viable option. The configuration overhead might be an issue (albeit slight in my opinion), as would staleness, but you run the risk of staleness in any scenario; the only variable is how stale we're talking.

For instance, if I go with an n+1-esque scenario, or even the common service that pairs both together, by the time this logic completes, the data could be completely invalid anyway. n+1, as I've said, is an insane performance hit, and a common 'matching' service dictates that essentially ALL orders be retrieved, since you have no way of applying additional customer criteria to the order table. In my mind, the solution that makes the most sense is always the one that offers high performance gains while keeping some level of maintainability and minimal overhead. Mr. Obvious, I know, but people do not always think that way.

Also, I do not think the granularity of these services is far off by any means. This is a perfectly acceptable level of granularity and is very similar to many I’ve run across.

Cool posts, Ben. Love the website too, keep up the good work. Hope all is well.
Ben
February 26, 2009
Thanks Steve! (and good to hear from you)

Great points. Agreed…the central database option definitely seems to be the best in terms of performance and complexity. One way, maybe, to preserve some separation of concerns is to enforce some access rules at the DB level to, say, only allow read access to the Customer service, but not write access.

And good point about staleness being an issue not just with the caching option, but also with the n+1 (especially since global transactions aren’t all that viable in SOA, so while orders are being retrieved for n customers, new orders could be coming in).
Vinay Rajadhyaksha
March 27, 2009
This is probably an extension of the solution put forth by Ryan. How about developing two services? Service 1 retrieves customers based on filter criteria; service 2 retrieves orders for a set of customers. Note that service 2 itself would be responsible for grouping the orders for a given customer. Additionally, the service need not be a web service: if the customer service and order service are based on the same underlying technology, that technology could be leveraged for defining the service interface. Alternatively, I would look at hosting the two services on the same box, allowing me to leverage in-VM calls for local clients and use the service interface for remote clients. The advantage here is that we have a right-grained service, which is sufficiently loosely coupled and provides options for performance improvement. It does not retrieve information directly from the database, allowing service-specific business logic wrappers to be retained; it does not need a caching solution (in a typical scenario, for performance reasons, the order and customer databases would be independent); and it addresses the N+1 select issue.

Let me know your thoughts.
Erik
May 09, 2010
How about using an "Entity Abstraction" pattern with "Event-Driven Messaging" for synchronization?

(This is very much solution 3 (cache locally), but it is an even more loosely coupled solution implemented as one “Task service” and one “Entity service” for each service in the above diagram.)