Who should build the RESTful client between two applications which both offer APIs?


Application A creates Widgets. Managers have decided that information regarding these Widgets is needed in Application B.

Application A has a RESTful API that allows client applications to pull data from it.
Application B has a RESTful API that allows client applications to push data to it.

In other words, the two applications have APIs exposed, and one of them is going to need to contact the other in order to exchange the Widget data. Real-time transfers would be nice to have, but hourly transfers would be acceptable.

As a GENERAL practice, which side should be responsible for building a client to call the other application?

  1. Application A devs should build a client to push the Widget data to Application B’s API.
  2. Application B devs should build a client to poll for new Widget data and pull it from Application A’s API.

Assume almost everything else is equal. Both applications are equally large and critical, both are well supported, and both have many existing clients that call their APIs.


Each application should publish its own client*

If you test your API, you need a client to connect to it for the test. So you already have a client right there; publish it.

If each caller writes its own client, the client ends up being written multiple times: once for every app that uses the API.

When you update the API, you also update your tests and the client.

*Obviously there are many programming languages in the world, you wouldn’t expect the API creator to write and publish a client for each. Standards like REST, SOAP or ebXML make it easier for people to write clients where there isn’t one published for your language.

But when you have many of your own microservices, you’re either writing them all in the same language or you know which languages are in use. It makes sense to keep the client or clients with the API project.
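As a rough sketch, a client published alongside the API project might look like this (names like `WidgetClient` and the `/widgets` path are assumptions for illustration, not from the question; the transport is injectable so the API’s own test suite can exercise the very same client it publishes):

```python
import json
import urllib.request


class WidgetClient:
    """Minimal client shipped alongside Application A's API (hypothetical)."""

    def __init__(self, base_url, opener=None):
        self.base_url = base_url.rstrip("/")
        # Injectable opener: the API's tests can pass a fake transport, so the
        # exact client code consumers use is also the code the tests run.
        self._open = opener or urllib.request.urlopen

    def get_widgets(self, since=None):
        """Fetch widgets, optionally only those changed after `since`."""
        url = f"{self.base_url}/widgets"
        if since is not None:
            url += f"?since={since}"
        with self._open(url) as resp:
            return json.load(resp)
```

Because the opener is injectable, updating the API, its tests, and the client stays one change in one repository.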

If your question is push vs. pull…

Push is superior because it means you don’t need to poll, and polling is expensive. It also means the sender controls the rate, rather than being overloaded with requests for data.

You do need to think about missed messages if you are pushing, but if you are using some off-the-shelf infrastructure like a message queue, the hard parts have been done for you.

If not, then also have a pull mechanism. The receiver can use it to check for and catch up on missed messages. But primarily you expect the push to work.
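A sketch of that push-plus-pull shape, assuming each update carries a monotonically increasing sequence number (the numbering scheme and names here are assumptions, not something from the question):

```python
class Receiver:
    """Receives pushed updates; falls back to pull to fill gaps (sketch)."""

    def __init__(self, pull_range):
        # pull_range(start, end) stands in for a call to the sender's pull API,
        # returning the updates for that sequence-number range.
        self.pull_range = pull_range
        self.last_seq = 0
        self.store = {}

    def on_push(self, update):
        seq = update["seq"]
        if seq > self.last_seq + 1:
            # Gap detected: some pushes were missed, so catch up via pull.
            for missed in self.pull_range(self.last_seq + 1, seq - 1):
                self.store[missed["seq"]] = missed
        self.store[seq] = update
        self.last_seq = max(self.last_seq, seq)
```

The push path does all the work in the normal case; the pull API is only hit when the receiver notices a hole in the sequence.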


In general, it’s easier to pull data than to push it. If something goes wrong when pushing, the side doing the pushing won’t always know that the push failed. For example, if pushing to an API produces an error, it’s often not clear whether the data should be pushed again. When a client pulls data and something goes wrong calling the API or handling the results, there is usually enough context to determine whether a retry can be executed, and reads are usually (and generally should be) safe to repeat many times.
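Because repeating a read has no side effects, the pulling side can wrap its fetch in a simple retry with backoff. A minimal sketch, where the `fetch` callable stands in for whatever call hits Application A’s API:

```python
import time


def pull_with_retry(fetch, attempts=3, base_delay=0.01):
    """Retry a read; safe because repeating a pull has no side effects."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
```

The equivalent logic on the push side is much harder to get right, because after a failure the pusher has to decide whether the data already arrived.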

The one issue with this is that it may not always be clear when the client needs the data. If it’s something on a schedule, that’s easy enough. Otherwise, you have two main options: polling and events. Polling is easy to get right, so if your SLA (the time between a change in A and when B needs to know about it) is not very tight, it can work.
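A polling tick can be as simple as asking for everything after the last cursor value seen. In this sketch the `seq` field and the `fetch_since` callable are assumed conventions, not anything from the question:

```python
def poll_once(fetch_since, state):
    """One polling tick: fetch items newer than the stored cursor (sketch)."""
    new_items = fetch_since(state["cursor"])
    for item in new_items:
        # Advance the high-water mark so the next tick only sees new data.
        state["cursor"] = max(state["cursor"], item["seq"])
    return new_items
```

You would schedule this at whatever interval your SLA allows; the cursor makes repeated polls cheap and idempotent.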

If you need ‘near-real-time’ data, then you can either pull on demand as needed or use some sort of event mechanism to inform clients of changes. That could mean using queues/topics or having Application B host an endpoint for update notifications.

You could put all the information about the update in the notification itself, but my personal experience with that approach hasn’t been great (I can elaborate on that experience if needed). My preference is for the application that is the source of the data to expose simple, safe APIs and also host topics on which it publishes update notifications. Ideally, each notification contains the URI to call to retrieve the data associated with that update. The notification can also carry some or all of the data itself; partial data can be useful to the client side for determining whether an event is relevant. But I would still provide the APIs even if you pass all relevant data on the event, because retrying events can be complicated and messy.
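That pattern might look like the following sketch, where the event shape (`type`, `uri`, and any partial fields) is a hypothetical convention and `fetch(uri)` stands in for a call to the source application’s safe read API:

```python
def handle_notification(event, fetch):
    """Consume an update notification; fetch full data only when relevant."""
    # Partial data on the event lets the consumer filter cheaply...
    if event.get("type") != "widget.updated":
        return None
    # ...and the URI points at the authoritative record for the update.
    return fetch(event["uri"])
```

If the fetch fails, the consumer can simply retry it, which is far simpler than replaying the event itself.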

While it’s completely possible to pass all the data in the update on a topic or queue, I don’t think this is always a good choice. If the amount of data associated with the update is small and simple, then it’s probably fine. I would tend to avoid this approach for the following reasons, especially if the data associated with the update is large and/or complex:

  1. The more complex your content, the more likely it is that a consumer will fail to parse it. Without some sort of poison-message handling, the consumer will either stop progressing past that message or silently drop the update.
  2. I have found it rare for developers who are new to messaging to understand that reads are destructive (or at least not safe) in that context. Very often the message is acknowledged before it has been successfully processed.
  3. Passing around large messages on these systems can result in excessive resource requirements.
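The poison-message concern in point 1 is usually handled by parking unparseable messages on a dead-letter queue so the consumer keeps progressing. A minimal in-memory sketch (real brokers provide this out of the box):

```python
import json


def drain(queue, process, dead_letter):
    """Process a queue of raw messages, parking poison messages (sketch)."""
    while queue:
        raw = queue.pop(0)
        try:
            msg = json.loads(raw)
        except json.JSONDecodeError:
            dead_letter.append(raw)  # poison message: park it and move on
            continue
        process(msg)
```

The dead-letter list can then be inspected and replayed by a human instead of wedging the whole consumer.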


All else being equal, it seems like it would be technically better for Application A to push the data to B, because it’s more timely and avoids the need for a polling interface.

