Talk:Commitment ordering

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
WikiProject Databases / Computer science  (Rated C-class, Low-importance)
WikiProject icon This article is within the scope of WikiProject Databases, a collaborative effort to improve the coverage of database related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Low  This article has been rated as Low-importance on the project's importance scale.
Taskforce icon
This article is supported by WikiProject Computer science (marked as Low-importance).
 
WikiProject Computer science (Rated C-class, Low-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Low  This article has been rated as Low-importance on the project's importance scale.
 

Cleanup tag[edit]

  • I don't know about the topic but this article needs a severe cleanup per the Manual of style. For starters, it needs a single opening paragraph that encapsulates the main concepts, then start with section breaks and the table of contents. Thatcher131 (talk) 18:12, 7 September 2006 (UTC)


Pls check last cleanup and further advise if needed. Thnx. Comps 20:07, 8 September 2006 (UTC)


Hi Thatcher131 (talk),

I believe your specific comments for the cleanup of the article Commitment ordering have been met by 8 September 2006 (UTC).

The tag that you have put in the article is now too vague to provide guidance, and requests the addition of more specific comments, if needed. Please advise. Thank you.

CC: Thatcher131 (talk)

Comps 12:06, 20 September 2006 (UTC)

  • Well, it definitely looks like a Wikipedia article now, so I removed the cleanup tag. I still don't quite understand it, though, it's written at a higher level than I can fully grasp. It might benefit from rewriting the first paragraph to be even more basic; "Commitment ordering is programmed into large databases that are accessed simultaneously by multiple users to avoid transaction conflicts and loss of data" or something like that, and giving one or two examples like recording sales transactions at large department stores with many registers, or airline reservation and ticketing systems, or something like that. (Assuming I've even understood the principle correctly) :) Thatcher131 11:21, 21 September 2006 (UTC)

Thank you for your input. This is a technical-mathematical article. It is not an easy reading. An effort has been made to write it completely without mathematical notation and to reduce mathematical terminology without harming accuracy, while capturing the main results about the subject (in the reference). This, to make it accessible also to people not fluent in the language of mathematics. However, some mathematical terms like graph, cycle in graph, necessary condition, serializability, etc., could not be eliminated. As such every sentence and every word are important, and have been carefully considered. Comps 17:37, 21 September 2006 (UTC)



"Random comment" in Nov 19, 2010 has been moved from here to a new section at the bottom --Comps (talk) 10:08, 20 November 2010 (UTC)

See Talk:Global serializability#Are the experts experts?[edit]

for additional discussion relating to Commitment ordering. - JCLately (talk) 16:54, 20 May 2008 (UTC)

Single Source[edit]

This article only has one source, namely Mr Raz (= User:Comps?!). Therefore, this wikipedia article is very close to self-publishing. There are very little references to the work of Mr Raz (for example, CiteSeer produces not even one citation for Yoav Raz), so the topic might even be considered insignificant (which of course does not mean it is wrong). --85.178.29.119 (talk) 09:43, 5 September 2008 (UTC)

I'm sorry, but this is incorrect: All published references (with external links; articles and patents) have been reviewed independently by expert reviewers. Both journals and conferences are very selective and prestigious. The fact that all publications are by the same author does not categorize the subject as "single source" (BTW, no other significant CO articles exist). It is definitely multi-(independent) source! It seems that the fact that no other significant articles on CO exist is probably due to the fact that these references quite completely cover the subject. Also, not much published research now on concurrency control (CC) in general. CC is considered by many as "well understood..." Are you an expert in CC to classify this as "insignificant?" For sure not! Look in CiteSeer for the subject, and you'll find. Maybe you complain to them you cannot find by author... Sufficient respected references for CO and Mr Raz exist on the web. Have you done homework? Comps (talk) 16:03, 5 September 2008 (UTC)
It seems you have put the "single source" tag twice. You have used a new IP, but both point to same network in Germany. I explained it to you shortly the first time in the undo comment. Another user reverted your second time. I hope that this time you understand. Comps (talk) 16:29, 5 September 2008 (UTC)
I accept you are an expert on CO, but please take a minute to reflect on Wikipedia in its own right. As widely advertised, Wikipedia does not want to be a place for promoting personal views or theories that are new or scarcely known, the most extreme form of which are original research, and so-called fringe theories. I would not attribute either of the latter to CO and I accept the importance of practicable solutions for concurrency control in distributed environments (I wouldn't have come across the article if I didn't, would I?).
However, from a merely formal point of view, your article is amendable in that it is hard for a reader to estimate the value of the theory as the article appears like a "singularity" among others of its field. In fact, the addition of other sources (for example publications of third party research testing the theory) would only benefit to the topic. A positive example of an article addressing a relatively new technology is, for example MapReduce.
Therefore, I think the single source tag should be regarded as a chance for improvements on the article, rather than being a "stain" on CO.
Please take a look at Wikipedia:Neutral_point_of_view#Undue_weight for further arguments. --85.178.30.168 (talk) 21:32, 5 September 2008 (UTC)
I do not see the relevance of what you write. CO is the ONLY general solution for autonomous databases. This by itself makes it very important. This is a solution for a problem that has been researched for many years without proper solution. I do not see your point. It is NOT "original research" at any case (a Wikipedia term: Not published by a credible source before being posted in Wikipedia). It is well reviewed and published in academic journal and conferences. It is Not a "point of view" either. It is a well established technique. It is either correct and effective or not, and should be judged by this only. Also "Single source" is incorrect. Independent prestigious committees have reviewed multiple articles: VLDB, ACM PODS, IPL, IEEE RIDE, The US Patent office. Each is an independent source. It happened that all these papers have been written by a single person.
Commitment ordering is referenced by multiple articles. I even saw a PhD thesis on distributed systems based on CO (95?). Wikipedia is not the place to bring "references that reference." An expert in the field knows how to find it. It is a "singularity" in the sense that it is the only general solution. When people understood it they stopped looking further. 16 years have passed since VLDB 92 with several more CO articles published. Not a word about incorrectness of any of the CO claims. Look at Global serializability and its discussion. It may help you understand.
At any case, I remove the wrong tag third time (the last removed by another reader). I do not think it can improve the article, since I do not believe any other source exists to be added. You have the exhaustive list of references, as far as I know, and I have been watching it for sixteen years now. Comps (talk) 23:25, 5 September 2008 (UTC)

@User:Comps(Mr Raz?): your article is (as also recognized by User:JCLately) very hard to understand (which, in my opinion, is due to the fact that it is just a very complicated way of saying "one needs to maintain a partial ordering of the updates", but this is not the place for academic dispute). However, having read your statement on Talk:Global_serializability, I am sure I misconceive CO as well. Why don't you just add a reference to the PhD thesis quoted above. Have you rigorously proven your solution is unique (in the very sense that no other general solution exists)?

@Everybody: I hope that someone digs up a reliable third party analysis of CO, so this article may be straightened out in that the main concepts of CO become clearer and that it is pointed out more clearly what the technological advances are that are covered by patent claims. Right now, however, the article doesn't analyze/describe CO but, rather, proposes it.

So long, --85.178.55.100 (talk) 09:44, 8 September 2008 (UTC)

I can agree with you only on one thing: That you have completely misunderstood the article, as well as its references, if you have looked them up. This article is the simplest (without Math notation), that I managed to come with, and still be accurate. However, you are not alone: I found that many database professionals do not understand concurrency control theory, while thinking they do. Apparently it is not trivial. I do not know if this gives you any consolation, and I urge you to have a closer look at the subject's principles. E.g., see Serializability.
"one needs to maintain a partial ordering of the updates" is a completely incorrect statement, and has no meaning within the subject's terminology.
The mentioned thesis can be found in http://srg.cs.uiuc.edu/Bib/LXiao-PhD.thesis.pdf and has no place in the article references just because it references and uses CO, since I do not see it provides any original contribution to the subject of CO itself. Its contribution is in another domain.
All this said, any professional criticism, finding mistakes and inaccuracies if exist, by a person who understands the subject, is highly welcome. Comps (talk) 19:04, 8 September 2008 (UTC)
Furthermore, you do not seem to understand the academic review process at all: The prestigious committees of experts that reviewed these articles typically accepted only about 10% of the submissions at that time (i.e., about 90% were rejected). It is very prestigious for a paper to be included in these journal and conferences after such scrutiny. It means that many experts thought it was both correct and important, to be included in the respective years' issues/conferences... Comps (talk) 14:28, 9 September 2008 (UTC)
All this teaches that editing Wikipedia, even for a tag, should be done after thinking and researching it over and over again. Good will to improve, without sufficient knowledge either in the subject, or in the Wikipedia definitions, or in both, especially together with stubbornness (3 times!), can cause unnecessary waste of time. Comps (talk) 15:45, 9 September 2008 (UTC)

Comment removed from article[edit]

The comment (see below) for The CO voting deadlock theorem has been removed. Though it describes a situation where no vote occurs, this is still a situation of missing votes and a deadlock. It is resolved by two competing mechanisms: 1. local cycle resolution described below, and 2. atomic commitment protocol timeout for registered transactions with no votes (unlikely to win over local mechanism). Thus this is a different situation from the usual, but not a real exception to the theorem, if we are a little more flexible with the term "voting deadlock."

The removed comment:

Comment: Formally, a rare exception exists, not of concern: When all the databases involved with the cycle are SS2PL based, it is possible (rarely) that no vote occurs. In such case no voting deadlock can happen. The highest probability instance of such rare event involves two transactions on two simultaneous opposite cycles. Such global deadlocks/cycles are not of concern since they overlap with local cycles, which are resolved locally (portions of local cycles generate a global one; to see this, split each global transaction (node) to local subtransactions (its portions confined each to a single database); a directed edge exists between transactions if an edge exists between any respective local subtransactions; a cycle is local if all its edges originate from a cycle among subtransactions of the same database, and global if not; global and local can overlap: a same cycle among transactions can result from several different cycles among subtransactions, and be both local and global). The formal issue is resolved by modifying the definition of a global cycle, and excluding this rare type of cycles, which consist of portions of local cycles.

Comps (talk) 14:33, 24 February 2009 (UTC)

Comment returned modified. Comps (talk) 23:58, 7 March 2009 (UTC)

Comment on the comment:

The situation described can happen only if each local sub-transactions can have more than one process or thread. In this case some local atomic commitment protocol (separate from the global) is needed for local atomicity before voting for the global. Now some local cycles may translate to local voting-deadlocks for the local protocol, and be resolved either by a local cycle elimination mechanism or automatically by the local atomic commitment protocol (racing). Comps (talk) 16:35, 14 July 2009 (UTC)


See User talk:Jwoodger#Commitment ordering questions[edit]

  1. Citation method
  2. Assessment tag

-- Comps (talk) 14:20, 5 September 2009 (UTC)

Random comment - Copied from above to this new section[edit]

Let me make a random comment here, as an arbitrary but technical user. I *still* don't understand the article. The very first thing it should do is define what commitment ordering is, before launching into why it was an open problem or how wonderful it is as an algorithm / protocol. I understand ACID properties. I understand database precedence properties and locking schemes and commitment protocols. I have a degree in math and would understand the math. Instead, I feel like the article reads more like a sales pitch trying to convince readers of the benefits of the protocol without FIRST defining the protocol and how it works. I hope that makes sense: I've read many technical and difficult articles here that were still quite understandable. 165.125.160.16 (talk) 05:41, 19 November 2010 (UTC)

Let's see if I can help.
The third sentence in the article reads: "In a CO compliant schedule the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions". This is a definition at the very beginning of the article, and I wonder how you missed it.
In the very first section after the overview section you find the formal definition of CO:
  • Definition - Commitment ordering
Let be two committed transactions in a schedule, such that is in a conflict with ( precedes ). The schedule has the Commitment ordering (CO) property, if for every two such transactions commits before commits.


Regarding the protocol: First, it is not in the very beginning since such a long article has a lead section summarizing the entire article, and then a general background section. Than we dive to business with theoretical part which is a must to prove what the protocol has to do. The protocol is trivial, but just describing it will have you wonder "how come, why?".
The protocol is highlighted in a paragraph at the end of the section Enforcing global CO:
The Global CO algorithm comprises enforcing (local) CO in each participating database system by ordering commits of local transactions (see Enforcing CO locally below) and enforcing the voting strategy in the theorem above (for global transactions).
The protocol has a distributed part which relies on the local part. First the distributed is described: A certain voting condition is proven to be needed for global serializability. This is the main task. After you understand the condition, you just (trivially) apply it: You locally maintain the needed order of voting, as the condition says! That's it. Getting CO locally can be done in many ways (e.g., SS2PL), and a local generic algorithm is given, which spans the entire CO schedule class and can be combined with every local CC type, since it only checks order of commits, after the transactions are in ready state. The article structure and sections, with table of content, allow you to find any needed part, but a linear, patient reading first helps understanding. It is like a small (concentrated) book on a completely new subject.
I do not agree that the article includes any sales pitch: Most of the claims are proven inside the article (in words, which as mathematician you should know is perfectly OK), and for CO-related that is not proven here (quite little: CO as a necessary condition to global serializability across autonomous databases, and MVCO theory; avoided here due to needed length), explicit references to proofs are given.
I agree that the article is probably not easy. But this is the best I have managed to do in terms of readability and comprehension since I started it in 2006. I strongly recommend that you read the article Serializability before this one. It gives all the background, the building blocks, and terminology. I hope this helps. --- Comps (talk) 20:30, 19 November 2010 (UTC)

Text here copied from above to this new section --Comps (talk) 09:50, 20 November 2010 (UTC)

Parenthetical referencing is inline citation (not footnotes only)[edit]

The article uses Parenthetical referencing. See Wikipedia:Citing sources#Parenthetical referencing. Thus why the tag? --Comps (talk) 19:18, 9 November 2011 (UTC)

Copy editing: Handle with caution[edit]

This article is a mathematical. No Math formulas but very careful phrasing. Make sure content does not change meaning. Pls do not change text if meaning is unclear to you. In this case please discuss it here. Thanks. --Comps (talk) 20:12, 9 November 2011 (UTC)

Importance - Back to High[edit]

User:Ruud Koot has changed the article's importance from High to low. No justification exists to this change since Commitment ordering is increasingly becoming important and increasingly utilized. The following text is a quotation from the deletion discussion of the article which took place recently:

"1. the following quotations on CO appear in a 2009 book: Philip A. Bernstein, Eric Newcomer (2009): Principles of Transaction Processing, 2nd Edition, Morgan Kaufmann (Elsevier), June 2009, ISBN 978-1-55860-623-4
Quotations:
  • "Not all concurrency control algorithms use locks... Three other techniques are timestamp ordering, serialization graph testing, and commit ordering. Timestamp ordering assigns each transaction a timestamp and ensures that conflicting operations execute in timestamp order. Serialization graph testing tracks conflicts and ensures that the serialization graph is acyclic. Commit ordering ensures that conflicting operations are consistent with the relative order in which their transactions commit, which can enable interoperability of systems using different concurrency control mechanisms." (quotation from page 145)
  • "Commit ordering is presented in Raz (1992)." (page 360)
Bold fonts in source
Phil Bernstein is a known researcher and authority in database concurrency control. --Comps (talk) 23:22, 19 November 2011 (UTC)
2. The textbook on transaction concurrency control: Gerhard Weikum, Gottfried Vossen (2001): Transactional Information Systems, Elsevier, ISBN 1-55860-508-8 , has two section dedicated to and named commitment ordering: One starts in page 102 and the other in page 700. The book references Yoav Raz's CO articles for the sections. Both authors are known researchers in this area.--Comps (talk) 23:47, 19 November 2011 (UTC)
Commitment ordering is a central element today in Concurrency control theory. It has enjoys increasing utilization and interest (see History of commitment ordering)."

The first quotation specifies it as one of the major Concurrency control existing methods, with the special capability of providing interoperability to all other methods. Thus no doubt about its high importance.

Google search for

"commitment ordering" transaction yields 16,600 results
"commit order" transaction 242,000 results

Google scholar search for

"commitment ordering" transaction 99 results
"commit order" transaction 873 results

Importance is changed back to high. 94.230.86.34 (talk) 04:58, 18 December 2011 (UTC)

Looking at google counts for mid-class articles C Sharp 3.0 13,100,000, CiteSeer 354,000, Data integration 33,200,000, Concurrent computing 17,800,000. I'll be changing it back to low. Stuartyeates (talk) 05:39, 18 December 2011 (UTC)
Google search was not the main reason (see above), just support. I can be selective as well. Some from High importance in computer science.
Klee-Minty cube 6340
Patent visualisation 4130
Inheritance semantics 23,700
Floyd-Warshall algorithm 31,300
PCP theorem 46,200
Birthday attack 55,000
Please change back to High. Basing only on search results you miss the point. 94.230.86.34 (talk) 16:53, 18 December 2011 (UTC)
You ignore what CO does in Concurrency control of transactions:
It is crucial to Distributed serializability and Global serializability.
It plays an important and increasing role in Software transactional memory (multi-core) and distributed Snapshot isolation.
It clearly explains the huge success of its special case SS2PL as a distributed solution, and more 94.230.86.34 (talk) 17:10, 18 December 2011 (UTC)
One needs good understanding in the area to properly evaluate its importance. 94.230.86.34 (talk) 17:16, 18 December 2011 (UTC)

This is an academic paper, not an encyclopedia entry.[edit]

This article is incomprehensible.

This article has paragraphs that start with 'Furthermore' or 'In addition'. This is not appropriate for an encyclopedia.

This article has sentences like "Each not-CO-compliant database system is augmented with a CO component (the Commitment Order Coordinator - COCO) which orders the commitment events for CO compliance, with neither data-access nor any other transaction operation interference. As such CO provides a low overhead, general solution for global serializability (and distributed serializability)". See my first point.

I am aware that this article survived an AfD. The content may well be notable. As a qualified 'computer programmer' (not that that counts for much), I have to honestly say that I can't tell. It is poorly written. See my first point.

This article needs to be changed. It needs to be easily understood. It needs other editors to rewrite it so that it can be understood.

It needs a 'Simple english' approach.

Cheers, Colonel Tom 10:58, 22 December 2011 (UTC)

Notability, or back to AfD[edit]

Now that the author Comps is blocked for sockpuppetry, it may be easier to discuss whether there are any secondary sources that establish notability of this topic. If not, let's take it back to AfD, where it may be easier to discuss without his presence. And if we keep it, let's prune it back to what secondary sources say about the topic. Dicklyon (talk) 05:16, 23 December 2011 (UTC)

Gerhard Weikum & Gottfried Vossen (2002). Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery (ScienceDirect Google Books) spends two pages on the topic. I don't seem to have access to the full text, but this would probably give a good indication of the depth to which this article should discuss the topic.
Professional NoSQL has a short blurb that could further help to establish notability.
I had considered nominating the article again, but I'd probably still vote "weak keep and rewrite" again. —Ruud 14:19, 23 December 2011 (UTC)
Sounds good. A rewrite to a much smaller article seems appropriate. You up for it? Dicklyon (talk) 21:51, 23 December 2011 (UTC)

The Raz infection is spreading[edit]

A search finds what appears to be Mr. Raz repeating the content here into:

Serializability

Global serializability

Concurrency control

Distributed concurrency control

Two-phase locking

Schedule (computer science)

Global concurrency control

Deadlock

Federated database system

Locks with ordered sharing

VO

Software transactional memory

Deadlock prevention algorithms


He also considers himself notable personally: List of computer scientists

And there's another sock-puppet aborning: User:Dingo1729/History of commitment ordering

Personally I think that Global concurrency control and Distributed concurrency control should be replaced by a link to Concurrency control; Global serializability should be replaced by a link to Global serializability; Deadlock prevention algorithms should be replaced by a link to Deadlock; Locks with ordered sharing should be simply deleted; the entry for Raz in List of computer scientists should be removed; and the entry for databases in VO should be removed. None of the proposed removals add anything not already contained in the proposed links except more puffery by Mr. Raz.


Ivan

  • This is outrageous. "The Raz infection is spreading" reminds of "The Jews infection" in Nazi slur. No place for such in Wikipedia. Similar "discussion" in 2011, Wikipedia_talk:WikiProject_Computer_science/Archive_10#User:Comps_.2F_Commitment_ordering, about "neutrality disputed" has been baseless as well, with no any fact given, no explanation, and no conclusion. This "discussion" was archived in March 2012. The only effect I see is multiple unexplained tags contaminating Wikipedia, and damage to the Snapshot isolation article. No place for unsupported such tags in Wikipedia, and should be urgently removed. Initiators of such discussions should be honest and conclude (in the discussions) that accusations made were unsubstantiated. 65.96.201.116 (talk) 14:39, 15 August 2012 (UTC)


  • outrageous? really? I came across this discussion accidentally as I corrected a wrong statement in the article on deadlocks (And yes - definitely wrong - no question). The same wrong statement can be found in this article. Now that I see this list of "infected" articles I had a look on these - and it is true!: All these articles are heavily infiltrated with misleading (at least) and even wrong statements. These infitrations obviously have been written by someone who implicitely makes numerous general assumtions that are not generally valid. For example: (1)- all transactions are distributed transactions with different processes that reach a "ready" state before actually committing.(2)- the same term (e.g. 2PL) can be used for a synchronization protocol and the set of histories it generates. This leads to wrong conclusions. (3)- The author seems to have in mind a multidatabase-environment with different synchronization protocols on the global and local level. However, he/she never explicitely says that, so it is never clear which level he/she is currently talking about. All the proposals made in the previous paragraph are completely justified. The last statement is from 2012 - nothing has happened since then. We are in 2014 and the whole topic on synchronization is still infiltrated with many misleading statements. The previously mentioned list of articles should be corrected and large parts should be deleted. JohnTB (talk) 12:45, 12 June 2014 (UTC)
  • It is indeed outrageous, and you, User:JohnTB, simply do not understand the material: (1) For the atomic transaction model that was used in the literature for decades this is correct. Distributed transactions are defined by different processes that need to commit. For this purpose they first reach the "ready state," explicitly or implicitly. (2) This confusion has been common in the literature, and my papers clarified this issue, though not completely prevented it later in published academic articles by some others. The terminology issue is discussed in the article Two-phase locking (3) The general assumptions have been clearly explained, and the assumptions for sync are minimal and allow a coherent and logical understanding of the material. Satisfactory for a person that understood the background information in Serializability. Be specific with text if you think anything is inaccurate. --- Also please identify yourself to show your credentials, since what you write does not make sense to me. Also please do not make changes here or in related articles before a thorough discussion. Yoav Raz (talk) 16:45, 29 September 2014 (UTC)
  • sorry. I did not realize that there was a response. I simply dont use wikipedia much, and I certaily dont want to be involved in any kind of edit wars. And yes: I don't understand what you are writing, even though I am expert in this area. So you might be able to explain this sentence in the article: "In addition, locking based global deadlocks are resolved automatically in a CO based multi-database environment, an important side-benefit (including the special case of a completely SS2PL based environment; a previously unnoticed fact for SS2PL)." To my understanding strict 2 phase locking (the protocol, not the set of histories it generates!) may result in deadlocks. A transaction caught in a deadlock never reaches even the request to commit, so this cannot be treated in a special commitment protocol. So, maybe you can explain this to me.

JohnTB (talk) 13:11, 17 May 2018 (UTC)