In the world of container-managed transactions, Java offers a widely misused feature: XA transactions.
In the simplest terms, an XA transaction is a global transaction that spans multiple resources. The implementation provides a transaction manager that lets applications interact with multiple resources as a single unit of work.
Many applications end up using XA adapters because they deal with multiple databases, or merely because they might interact with multiple databases in the future. Container-managed transactions (CMT) hide transaction management, so developers are not bothered with its nuances until something goes wrong. And with XA, a lot can go wrong.
I planned to explain why, but then I found this very relevant post, "Distributed transactions are evil", so I'll leave it at that.
Instead, let's focus on the solution, since interacting with multiple resources is a fact of life. Keeping all the data an application needs in a single huge database is neither a good idea nor a practical one (let's not talk about big data).
1. One of the most important considerations when building enterprise applications is correctly identifying which data should reside in which database. What matters is keeping all related data together. For example, an online store can keep customer profile data in one database, the products it sells in another, and the orders in a third or in either of the first two. The intent is to keep all associated entities in a single place.
2. If a particular interaction involves operations on multiple databases, it is probably better to implement our own transaction management. If we have distributed our entities correctly, as in point 1, we should be able to run transactions on each database independently as separate workflow steps. Each database interaction can commit or roll back as one unit of work, since all related entities belong to that single transaction. The success or failure of the previous database interaction then decides whether the next database call is made.
3. It is also good to keep database interactions idempotent. A fair way of doing that is to read before writing. Alternatively, many tools, such as Hibernate, provide saveOrUpdate-style methods that offer this functionality out of the box.
4. If the application uses EJBs, and most will, the container transaction management should be set to NOT_SUPPORTED. This is extremely important to ensure that the container does not try to wrap the interactions in a transaction context.
5. The database drivers should be non-XA. This reduces the overhead associated with database transactions and also guards against someone "enhancing" the code in the future and forgetting the design consideration in point 1. If entities overlap across databases, the change will either fail fast or look difficult enough to implement that someone will catch the miss.
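To make the idea of running each database interaction as its own workflow step a little more concrete, here is a minimal sketch using plain JDBC local transactions. The names `LocalTx`, `Work` and `PlaceOrderWorkflow`, and the split into a customer database and an order database, are assumptions made up for illustration. The sketch also assumes non-XA data sources and a caller that is not already inside a container transaction.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper: runs one unit of work in a local (non-XA) transaction on one database. */
final class LocalTx {
    /** Work performed against a single database connection. */
    interface Work {
        void run(Connection connection) throws SQLException;
    }

    private LocalTx() {}

    static void run(DataSource dataSource, Work work) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            connection.setAutoCommit(false); // we decide when to commit, not the driver
            try {
                work.run(connection);
                connection.commit();
            } catch (SQLException | RuntimeException e) {
                try {
                    connection.rollback(); // undo everything done in this step
                } catch (SQLException rollbackFailure) {
                    e.addSuppressed(rollbackFailure);
                }
                throw e; // tell the caller that this step failed
            }
        }
    }
}

/** Hypothetical workflow for the online store example: one step per database. */
final class PlaceOrderWorkflow {
    private final DataSource customerDb; // assumed non-XA data source
    private final DataSource orderDb;    // assumed non-XA data source

    PlaceOrderWorkflow(DataSource customerDb, DataSource orderDb) {
        this.customerDb = customerDb;
        this.orderDb = orderDb;
    }

    void placeOrder(LocalTx.Work updateCustomer, LocalTx.Work saveOrder) throws SQLException {
        // Step 1 commits or rolls back on its own; an exception stops the workflow here.
        LocalTx.run(customerDb, updateCustomer);
        // Step 2 is reached only if step 1 committed.
        LocalTx.run(orderDb, saveOrder);
    }
}
```

Note that if the second step fails, the first one stays committed. This is why the steps should be idempotent, so that the whole workflow can simply be run again.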
If the application reads data from a resource such as a message queue (MQ) and then processes it for storage in one or more databases, building it needs a little more consideration. Developers may reasonably ask whether, with CMT set to NOT_SUPPORTED, a failed database interaction could cause the application to lose data.
A careful choice of MQ provider can solve this. Many MQ providers support syncpoints, where a message returns to the queue if the listener reading it throws an error, because the provider never received an acknowledgement.
So the database interactions should be written such that, when one fails and the application wants to prevent data loss, the listener throws an exception.
This is where point 3 above comes in handy: if the database interactions are idempotent, message redelivery lets the business interaction complete successfully end to end.
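The interplay between redelivery and idempotent writes can be sketched with a small in-memory simulation. Nothing here is a real MQ API: `RedeliveringQueue` is a made-up stand-in for a provider that puts a message back when the listener throws, `OrderListener` is a hypothetical listener, and the `orderId=payload` message format is an assumption for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for a queue with syncpoint-style redelivery (not a real MQ API). */
final class RedeliveringQueue {
    private final Deque<String> messages = new ArrayDeque<>();

    void put(String message) {
        messages.addLast(message);
    }

    int size() {
        return messages.size();
    }

    /** Delivers messages until the queue is empty or maxAttempts is reached; returns the attempts made. */
    int drain(Consumer<String> listener, int maxAttempts) {
        int attempts = 0;
        while (!messages.isEmpty() && attempts < maxAttempts) {
            String message = messages.pollFirst();
            attempts++;
            try {
                listener.accept(message);   // a normal return acts as the acknowledgement
            } catch (RuntimeException e) {
                messages.addFirst(message); // no acknowledgement: the message returns to the queue
            }
        }
        return attempts;
    }
}

/** Hypothetical listener that stores each order at most once, however often it is delivered. */
final class OrderListener implements Consumer<String> {
    private final Map<String, String> orderTable = new HashMap<>(); // stands in for a database table
    private int failuresToSimulate;

    OrderListener(int failuresToSimulate) {
        this.failuresToSimulate = failuresToSimulate;
    }

    @Override
    public void accept(String message) {
        String[] parts = message.split("=", 2); // assumed format: orderId=payload
        String orderId = parts[0];
        if (!orderTable.containsKey(orderId)) { // read before writing keeps the write idempotent
            orderTable.put(orderId, parts[1]);
        }
        if (failuresToSimulate > 0) {
            failuresToSimulate--;
            // Simulates a later step failing; throwing asks the queue to redeliver.
            throw new IllegalStateException("simulated failure after the write");
        }
    }

    int storedOrders() {
        return orderTable.size();
    }
}
```

With a listener created as `new OrderListener(2)`, a single message is delivered three times but stored only once, because the redelivered copies find the order already written.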