Distributed Transactions with Entity Framework and Multiple Databases
Entity Framework has come a long way, and it has plenty of options to support transactions. But what happens if you are working with multiple databases and want to retain transactional control - i.e. you require 2-phase commit when making changes to two (or more) different databases?
If you are working with PaaS services like Azure SQL Database and App Service, the advice here does not apply - that is an entirely different underlying infrastructure - skip to the end for help on that front. This post deals with solutions running on physical or virtualized servers.
So where you still run on tin or have IaaS-hosted solutions, you can deal with this scenario. For many it won't come up, but every now and again some enterprise will fling in that curve ball. If you can't engineer yourself out of this position, you can luckily still wrap multiple contexts inside a transaction - a Microsoft Distributed Transaction, to be specific.
This comes at a price, as you need to step back a little in time to when you used to manually open your database connections. But that is about as far back as you need to go: once you have opened your connections to the multiple databases, you initialize each EF context, passing in the relevant open connection and being sure to tell the context that it does not own the connection.
The constructor is documented by Microsoft here.
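For reference, a minimal sketch of what such a context might look like (DbAlphaContext and Alphatable are the sample's own types; the base call is EF6's DbContext(DbConnection, bool) constructor overload):

using System.Data.Common;
using System.Data.Entity;

public class DbAlphaContext : DbContext
{
    // contextOwnsConnection: false tells EF not to close or dispose the
    // connection - we manage its lifetime (and its enlistment in the
    // ambient transaction) ourselves.
    public DbAlphaContext(DbConnection existingConnection, bool contextOwnsConnection)
        : base(existingConnection, contextOwnsConnection)
    {
    }

    public DbSet<Alphatable> Alphatables { get; set; }
}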
So where does the transaction come in? You start off by creating a TransactionScope, and here you have the option of starting a new transaction, or enlisting in an existing transaction to support nesting behavior. Within your scope you can now open your database connections and pass these connections into the contexts. At this point you can continue with EF as you see fit - be it executing TSQL or LINQ syntax. Whatever you do is being done within a transaction wrapped around all the connections you opened. The constructor for the TransactionScope is documented here.
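As a quick sketch of that nesting behavior (the Do* helpers are hypothetical stand-ins for EF work):

// Required starts a new transaction here because none is ambient yet.
using (var outer = new TransactionScope(TransactionScopeOption.Required))
{
    DoAlphaWork(); // hypothetical helper

    // Required enlists this inner scope in the outer (ambient) transaction;
    // RequiresNew would instead suspend it and start an independent one.
    using (var inner = new TransactionScope(TransactionScopeOption.Required))
    {
        DoBetaWork(); // hypothetical helper
        inner.Complete();
    }

    // Every scope must call Complete, or the whole transaction rolls back.
    outer.Complete();
}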
The entire sample:
using System;
using System.Configuration;
using System.Data.SqlClient;
using System.Transactions;

/// <summary>
/// Enlist an MSDTC transaction around two EF DbContext calls,
/// each targeting a different database.
/// </summary>
/// <param name="args"></param>
static void Main(string[] args)
{
    try
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.RepeatableRead,
            Timeout = TimeSpan.FromMinutes(5)
        };
        using (var scope = new TransactionScope(TransactionScopeOption.RequiresNew, options))
        {
            using (var connAlphaDb = new SqlConnection(ConfigurationManager.ConnectionStrings["AlphaConn"].ConnectionString))
            {
                connAlphaDb.Open();
                using (var connBetaDb = new SqlConnection(ConfigurationManager.ConnectionStrings["BetaConn"].ConnectionString))
                {
                    connBetaDb.Open();

                    // Both databases are open - start doing some EF against both of them.
                    // contextOwnsConnection: false leaves the connection lifetime to us.
                    using (var context = new DbAlphaContext(connAlphaDb, contextOwnsConnection: false))
                    {
                        var alphaRow = new Alphatable { Ts = DateTime.Now };
                        context.Alphatables.Add(alphaRow);
                        context.SaveChanges();
                    }
                    using (var context = new DbBetaContext(connBetaDb, contextOwnsConnection: false))
                    {
                        var betaRow = new Betatable { Ts = DateTime.Now };
                        context.Betatables.Add(betaRow);
                        context.SaveChanges();
                    }
                }
            }
            // Commit across both databases; if Complete is never called,
            // disposing the scope rolls everything back.
            scope.Complete();
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
    }
}
So what about commit and rollback? You commit manually by calling Complete, and rollback occurs when your transaction scope is disposed without Complete having been called. Rollback can also be triggered manually if the situation requires it: calling Dispose yourself on the scope will do it, as will calling Transaction.Current.Rollback. I tend to let the using block dispose of the transaction scope itself.
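A minimal sketch of the manual route (somethingLooksWrong is a hypothetical validation flag):

using (var scope = new TransactionScope(TransactionScopeOption.RequiresNew))
{
    // ... EF work against one or more contexts ...

    if (somethingLooksWrong) // hypothetical validation flag
    {
        // Explicitly doom the ambient transaction. Disposing the scope
        // without calling Complete has the same net effect.
        Transaction.Current.Rollback();
    }
    else
    {
        scope.Complete();
    }
} // Dispose runs here; without Complete, everything rolls back.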
So what happens whilst this transaction is open? It's blocking by default, and if you place a breakpoint before you call Complete on your scope, you will see that on your SQL databases. Query the relevant tables with a nolock hint if you want to see the uncommitted dirty data. Continue, and the commit is made, releasing the locks on the underlying resources. But you do get to define the type of blocking you create, the same as if you wrote a stored procedure and defined the isolation level for any transactions within. TransactionScope is exactly the same: you can provide a set of options at initialization, one of which defines your isolation level - the default being Serializable. That's the daddy of blocking, so do consider whether lowering the isolation level is appropriate.
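One way to peek at that dirty data while a breakpoint holds the transaction open (a sketch; Alphatables assumes EF's default pluralized table name for the sample's Alphatable entity):

// Suppress the ambient transaction so this connection does not enlist
// in the distributed transaction we are inspecting.
using (var peek = new TransactionScope(TransactionScopeOption.Suppress))
using (var conn = new SqlConnection(ConfigurationManager.ConnectionStrings["AlphaConn"].ConnectionString))
using (var cmd = new SqlCommand("SELECT * FROM dbo.Alphatables WITH (NOLOCK)", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine(reader["Ts"]); // uncommitted, dirty data
        }
    }
}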
That's all there is to it. If you have worked with middleware platforms like BizTalk, or ETL tools like SSIS, then MSDTC won't be a revelation to you. But if you spend most of your time dealing with a single database in a solution, then you probably have not had cause to use TransactionScope in this manner, especially since EF6 already provides cool ways to begin or use an existing transaction, which will see the use of TransactionScope start to fade out.
So what are the gotchas? In a local development environment getting this running is pretty straightforward - just start the Distributed Transaction Coordinator (MSDTC) service in your local services (net start msdtc from an elevated prompt will also do it).
When you start moving to a distributed setup, things get trickier. The simplest setup would be communication between a single web server and a database server, but a more realistic setup would be multiple web servers talking to a clustered SQL Server - and keep in mind the transaction may be wrapping databases on entirely different SQL clusters. All parties need an appropriate configuration, be it the web server flowing the transaction or a SQL Server coordinating it.
However, configuration itself remains straightforward: open Component Services, find Distributed Transaction Coordinator, right-click and go to Properties.
The first thing here is enabling Network DTC Access. The settings are explained here. I've tended toward ensuring an identical configuration on all servers to get going and then refining from there - you will see that you can vary settings depending on whether the node is intended to coordinate the transaction or initiate a request.
We are not quite done yet - you will need to consider firewalls. DTC traffic requires port 135 open in both directions, which is straightforward enough (obviously you can change this to a non-standard port if you prefer). DTC uses RPC under the hood, so dynamic ports will need to remain open. You or your network team are probably not that keen on all the dynamic ports being open, so this can be done in a limited manner: for example, I have allowed 200 ports (5001-5200), which limits the dynamic communication to within that range. This is still in Component Services, but on the computer node properties.
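For illustration, the firewall side could be scripted with netsh rather than clicked through (a sketch - the rule names are arbitrary, and the port range must match what you configured in Component Services):

rem Allow the RPC endpoint mapper that DTC uses to negotiate ports.
netsh advfirewall firewall add rule name="DTC RPC endpoint mapper" dir=in action=allow protocol=TCP localport=135
rem Allow the restricted dynamic RPC range configured above.
netsh advfirewall firewall add rule name="DTC dynamic RPC range" dir=in action=allow protocol=TCP localport=5001-5200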
(On that same properties screen you may spot an extra cheeky little MSDTC tab - this is where you go if you are working in an enterprise that has offloaded MSDTC coordination to some dedicated servers; that would be specified as the case there, else leave it local.)
The final caveat is that in a clustered enterprise setup, you will likely have the MSDTC service itself clustered. In that scenario you just need to ensure you are configuring the correct MSDTC instance.
This is a great article covering the investigation of MSDTC issues, showing how to use and review MSDTC logging: https://blogs.msdn.microsoft.com/puneetgupta/2009/02/06/the-hidden-tool-msdtc-transaction-tracing/
The one gap that jumps out here is what happens if you are planning a PaaS-based implementation - at that point all bets are off in terms of DTC as we have known it until now. This article explains the PaaS-based alternative for working between App Services and Azure SQL databases (including across multiple logical servers): https://docs.microsoft.com/en-us/azure/sql-database/sql-database-elastic-transactions-overview. It's a good read and worth the time if you are planning a migration. There are plenty of conditions to be met, including getting up to .NET Framework 4.6.1, and the supported scenarios don't cover everything you could do with MSDTC transaction scopes, so hybrid implementations could run into a few dead ends.