Friday, February 24, 2017

Automatic and Programmatic Two Phase Commit



Sybase:  An Organization with a Good Plan that Went Wrong.

The organization that I will consider here is Sybase, a relational database vendor that is still current.  Back in 1990, I was working for one of Sybase's main competitors, Ingres Software, which also built and marketed a relational database system.  In the 1990s, distributed (client-server) computing was all the rage, and this extended to the database realm as well.  At the time I was the Manager of Distributed Ingres, which involved managing three product lines:  Ingres-Net, which allowed database clients to communicate with database servers over a host of networks; Ingres-Gateways, which allowed Ingres applications to store and access information in non-Ingres databases (these are database gateways, not communications gateways); and Ingres-Star, our distributed database management system, capable of both distributed queries (with a sophisticated performance-based distributed query optimizer) and now distributed transactions.

In the realm of distributed transactions, back in 1987, Sybase had developed a two phase commit capability for their database.  Two phase commit is the protocol required to execute distributed transactions, so that updates to databases in multiple geographic locations either all commit or all roll back.  Sybase had an excellent plan.  They could outdo any competitor in the market with a two phase commit capability that only they had, allowing Sybase database management systems to do distributed transactions.  Oracle didn't have two phase commit at the time, and neither did Ingres or Informix (or the other UNIX-based database vendors, which IBM was not at the time).  That is, Sybase was the first vendor in the open systems market to be able to do distributed transactions, and it held this lead for three years.
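For readers who have not seen the protocol, here is a minimal sketch of two phase commit in Python.  It is an illustration only, not Sybase's or Ingres's implementation:  the Participant class, its can_commit flag, and the two_phase_commit function are all hypothetical names made up for this example, and a real system would also need durable logging and recovery from coordinator failure.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one database site in a distributed transaction."""
    name: str
    can_commit: bool = True   # simulates whether the site can promise to commit
    state: str = "active"     # active -> prepared -> committed, or aborted
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: the site votes yes only if it can guarantee a later commit.
        self.state = "prepared" if self.can_commit else "aborted"
        self.log.append(self.state)
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"
        self.log.append(self.state)

    def rollback(self) -> None:
        self.state = "aborted"
        self.log.append(self.state)


def two_phase_commit(participants) -> bool:
    """Return True if every site committed, False if the transaction aborted."""
    # Phase 1 (voting): ask every site to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (decision): the vote was unanimous, so every site commits.
        for p in participants:
            p.commit()
        return True
    # A single "no" vote aborts the transaction at every site.
    for p in participants:
        p.rollback()
    return False


if __name__ == "__main__":
    sites = [Participant("london"), Participant("tokyo", can_commit=False)]
    print(two_phase_commit(sites), [p.state for p in sites])
```

The point of the two phases is that no site commits until every site has promised that it can, which is what keeps the databases in different locations consistent with each other.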

Along came our team at Ingres to change all of this.  We developed a two phase commit capability that surpassed the Sybase capability because it was transparent.  That is, a distributed transaction ran just the same as a regular transaction; the database user didn't even need to know that the databases being updated were in different locations.  I positioned this as "automatic two phase commit," which killed interest in the Sybase product, which I depositioned as "programmatic two phase commit."  The problem Sybase had was that, in order to use two phase commit with their product, you had to explicitly program the two phase commit protocol.  It was not transparent.  When we announced our automatic two phase commit product, it was so well received in the market that we got front page press in almost all of the important trade magazines and newspapers.  We had executed a coup that severely hurt Sybase.  That was my job:  I was the Manager of Distributed Ingres.
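The difference between the two styles can be sketched in Python.  This is an illustration under assumptions, not the actual Sybase or Ingres interface:  Site, transfer_programmatic, and TransparentSession are hypothetical names invented for this example.  In the programmatic style the application drives both phases itself; in the automatic style the application only issues updates, and the session finds the right site and runs the protocol for it.

```python
class Site:
    """Hypothetical database site holding account balances."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)  # committed data
        self.pending = {}               # uncommitted changes

    def update(self, account, delta):
        self.pending[account] = self.pending.get(account, 0) + delta

    def prepare(self):
        # Vote yes only if no pending change would overdraw an account.
        return all(self.balances.get(a, 0) + d >= 0
                   for a, d in self.pending.items())

    def commit(self):
        for account, delta in self.pending.items():
            self.balances[account] = self.balances.get(account, 0) + delta
        self.pending = {}

    def rollback(self):
        self.pending = {}


def transfer_programmatic(src, dst, account_from, account_to, amount):
    """Programmatic style: the application codes both phases by hand."""
    src.update(account_from, -amount)
    dst.update(account_to, amount)
    if src.prepare() and dst.prepare():   # phase 1, written by the programmer
        src.commit()                      # phase 2, written by the programmer
        dst.commit()
        return True
    src.rollback()
    dst.rollback()
    return False


class TransparentSession:
    """Automatic style: callers issue updates; the protocol runs on exit."""

    def __init__(self, sites):
        # The catalog maps each account to its site, so callers never name one.
        self.catalog = {acct: s for s in sites for acct in s.balances}
        self.touched = []
        self.committed = False

    def update(self, account, delta):
        site = self.catalog[account]
        if site not in self.touched:
            self.touched.append(site)
        site.update(account, delta)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Phase 1: every touched site votes (skipped if the block raised).
        ok = exc_type is None and all([s.prepare() for s in self.touched])
        # Phase 2: commit everywhere or roll back everywhere.
        for s in self.touched:
            if ok:
                s.commit()
            else:
                s.rollback()
        self.committed = ok
        return False
```

With the automatic style, the calling code looks the same whether the accounts live at one site or two:

```python
london = Site("london", {"alice": 100})
tokyo = Site("tokyo", {"bob": 20})
with TransparentSession([london, tokyo]) as tx:
    tx.update("alice", -30)
    tx.update("bob", 30)
```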

So in this case an innovative competitor, Ingres, came into the market and introduced new technology that made Sybase's product obsolete and changed the database market.  Sybase's plan became obsolete as well.  Customers were now asking for automatic two phase commit.

Impact of the Socio-Technical Plan for Two Phase Commit

In terms of the sociotechnical plan, that is, people's interaction with the technology, ours was much easier to use.  With Sybase's product you had to be a programmer.  With our product you didn't have to be technical at all:  you just said update, and the distributed update was done automatically.  The sociotechnical plan was to make two phase commit easy to use and to require no programming capability.  The distributed system automatically did the work for you.

Relevance:  The Need for Two Phase Commit and Distributed Transactions Today

This technology is still relevant because the world is distributed.  Computers everywhere are connected on networks and need to work together to get tasks done (distributed computing).  This is the case with the internet and with any installation of computers and networks in any organization.  Companies continue to need to be able to do transactions in a networked environment.  That is, they need distributed transaction capability.

Three Forces That Could Affect Automatic Two Phase Commit

At the time, there were three forces that Ingres needed to consider upon release of the automatic two phase commit capability.

One, other vendors could follow on and develop their own automatic two phase commit capability.  Oracle did just that several years later, with Oracle 7.  Sybase eventually developed automatic two phase commit as well, also several years later.

Two, standards could emerge that allowed heterogeneous distributed transaction capability.  At the time that Ingres released automatic two phase commit, it only worked on Ingres databases.  That is, what we had was homogeneous two phase commit.  The market also needed the ability to update multiple databases of different types in different locations.  This standard never came about, and heterogeneous transactions exist only in small measure, from database companies that have modified a database gateway or two to do a restricted type of heterogeneous transaction.  The market went a different route, to the internet, cloud computing, and big data.  Big data is really about reads:  you read massive amounts of information and then analyze that information for patterns.  It is not very update oriented.

Three, transaction managers that handle distributed transactions, like IBM's CICS, could take over this role.


References

These are all books from my bookshelf, purchased when I was in the database industry.  Date, from IBM, wrote the classic text on relational database systems.

Bal, H. E. (1990).  Programming Distributed Systems, First Edition.  Silicon Press, Summit, New Jersey.

Ceri, S., Pelagatti, G. (1984).  Distributed Databases:  Principles and Systems.  McGraw-Hill,  New York, New York.

Date, C. J. (1983).  An Introduction to Database Systems:  Volume II.  Addison-Wesley,  Reading, Massachusetts.

Date, C. J. (1986). An Introduction to Database Systems:  Volume I, Fourth Edition.  Addison-Wesley, Reading, Massachusetts.

Khanna, R. (1994).  Distributed Computing:  Implementation and Management Strategies.  Prentice Hall, Englewood Cliffs, New Jersey.
