XA/Distributed transactions

Notă: Version requirement

XA related functions have been introduced in PECL/mysqlnd_ms version 1.6.0-alpha.

Notă: Early adaptors wanted

The feature is currently under development. There may be issues and/or feature limitations. Do not use in production environments, although early lab tests indicate reasonable quality.

Please, contact the development team if you are interested in this feature. We are looking for real life feedback to complement the feature.

Below is a list of some feature restrictions.

  • The feature is not yet compatible with the MySQL Fabric support . This limitation is soon to be lifted.

    XA transaction identifier are currently restricted to numbers. This limitation will be lifted upon request, it is a simplification used during the initial implementation.

Notă: MySQL server restrictions

The XA support by the MySQL server has some restrictions. Most noteably, the servers binary log may lack changes made by XA transactions in case of certain errors. Please, see the MySQL manual for details.

XA/Distributed transactions can spawn multiple MySQL servers. Thus, they may seem like a perfect tool for sharded MySQL clusters, for example, clusters managed with MySQL Fabric. PECL/mysqlnd_ms hides most of the SQL commands to control XA transactions and performs automatic administrative tasks in cases of errors, to provide the user with a comprehensive API. Users should setup the plugin carefully and be well aware of server restrictions prior to using the feature.

Example #1 General pattern for XA transactions

<?php
$mysqli 
= new mysqli("myapp""username""password""database");

/* BEGIN */
mysqlnd_ms_xa_begin($mysqli/* xa id */);

/* run queries on various servers */
$mysqli->query("UPDATE some_table SET col_a = 1");
...

/* COMMIT */
mysqlnd_ms_xa_commit($link1);
?>

XA transactions use the two-phase commit protocol. The two-phase commit protocol is a blocking protocol. During the first phase participating servers begin a transaction and the client carries out its work. This phase is followed by a second voting phase. During voting, the servers first make a firm promise that they are ready to commit the work even in case of their possible unexpected failure. Should a server crash in this phase, it will still recall the aborted transaction after recover and wait for the client to decide on whether it shall be committed or rolled back.

Should a client that has initiated a global transaction crash after all the participating servers gave their promise to be ready to commit, then the servers must wait for a decision. The servers are not allowed to unilaterally decide on the transaction.

A client crash or disconnect from a participant, a server crash or server error during the fist phase of the protocol is uncritical. In most cases, the server will forget about the XA transaction and its work is rolled back. Additionally, the plugin tries to reach out to as many participants as it can to instruct the server to roll back the work immediately. It is not possible to disable this implicit rollback carried out by PECL/mysqlnd_ms in case of errors during the first phase of the protocol. This design decision has been made to keep the implementation simple.

An error during the second phase of the commit protocol can develop into a more severe situation. The servers will not forget about prepared but unfinished transactions in all cases. The plugin will not attempt to solve these cases immediately but waits for optional background garbage collection to ensure progress of the commit protocol. It is assumed that a solution will take significant time as it may include waiting for a participating server to recover from a crash. This time span may be longer than a developer and end user expects when trying to commit a global transaction with mysqlnd_ms_xa_commit(). Thus, the function returns with the unfinished global transaction still requiring attention. Please, be warned that at this point, it is not yet clear whether the global transaction will be committed or rolled back later on.

Errors during the second phase can be ignored, handled by yourself or solved by the build-int garbage collection logic. Ignoring them is not recommended as you may experience unfinished global transactions on your servers that block resources virtually indefinitely. Handling the errors requires knowing the participants, checking their state and issuing appropriate SQL commands on them. There are no user API calls to expose this very information. You will have to configure a state store and make the plugin record its actions in it to receive the desired facts.

Please, see the quickstart and related plugin configuration file settings for an example how to configure a state. In addition to configuring a state store, you have to setup some SQL tables. The table definitions are given in the description of the plugin configuration settings.

Setting up and configuring a state store is also a precondition for using the built-in garbage collection for XA transactions that fail during the second commit phase. Recording information about ongoing XA transactions is an unavoidable extra task. The extra task consists of updating the state store after each and every operation that changes the state of the global transaction itself (started, committed, rolled back, errors and aborts), the addition of participants (host, optionally user and password required to connect) and any changes to a participants state. Please note, depending on configuration and your security policies, these recordings may be considered sensitive. It is therefore recommended to restrict access to the state store. Unless the state store itself becomes overloaded, writing the state information may contribute noteworthy to the runtime but should overall be only a minor factor.

It is possible that the effort it takes to implement your own routines for handling XA transactions that failed during the second commit phase exceeds the benefits of using the XA feature of PECL/mysqlnd_ms in the first place. Thus, the manual focussed on using the built-on garbage collection only.

Garbage collection can be triggered manually or automatically in the background. You may want to call mysqlnd_ms_xa_gc() immediately after a commit failure to attempt to solve any failed but still open global transactions as soon as possible. You may also decide to disable the automatic background garbage collection, implement your own rule set for invoking the built-in garbage collection and trigger it when desired.

By default the plugin will start the garbage collection with a certain probability in the extensions internal RSHUTDOWN method. The request shutdown is called after your script finished. Whether the garbage collection will be triggered is determined by computing a random value between 1...1000 and comparing it with the configuration setting probability (default: 5). If the setting is greater or equal to the random value, the garbage collection will be triggered.

Once started, the garbage collection acts upon up to max_transactions_per_run (default: 100) global transactions recorded. Records include successfully finished but also unfinished XA transactions. Records for successful transactions are removed and unfinished transactions are attempted to be solved. There are no statistics that help you finding the right balance between keeping garbage collection runs short by limiting the number of transactions considered per run and preventing the garbage collection to fall behind, resulting in many records.

For each failed XA transaction the garbage collection makes max_retries (default: 5) attempts to finish it. After that PECL/mysqlnd_ms gives up. There are two possible reasons for this. Either a participating server crashed and has not become accessible again within max_retries invocations of the garbage collection, or there is a situation that the built-in garbage collection cannot cope with. Likely, the latter would be considered a bug. However, you can manually force more garbage collection runs calling mysqlnd_ms_xa_gc() with the appropriate parameter set. Should even those function runs fail to solve the situation, then the problem must be solved by an operator.

The function mysqlnd_ms_get_stats() provides some statistics on how many XA transactions have been started, committed, failed or rolled back.