This is the talk page for discussing improvements to the Virtual synchrony article.
- 1 Heavy revision needed
- 2 External Links
- 3 Help needed for the illustrations
- 4 Article "okay" as is? No.
- 5 More detail, please
- 6 Protocol for removing tags?
- 7 asynchronous, message queue and buffer seem like pre-requisites.
- 8 Locking paragraph.
- 9 Article still biased?
- 10 Better figure(s) required
- 11 Other references
- 12 The quality issue (again)
Heavy revision needed
In the talk page below, there are (seemingly) years of complaints about this page, and despite the claim of one person that the article was written by one person, it shows obvious signs of the too many authors problem common to Wikipedia articles that have been edited in haphazard ways by many people. I did some work on it during a break in the SOCC conference in Seattle, but I can see now that what this really needs is a complete rewrite by Yair Amir (Spread), Ken Birman (Isis2) or some other person who writes well and knows the model well. The current revision isn't really very satisfactory.
- Just noticed this. First, thanks for jumping in and just fixing the language. But let me ask a question. One group of people are complaining that people like Yair and me would be biased and shouldn't touch the page. Now here you go saying Yair or I should revise the page "heavily". Which group is right?
- I had understood that Wikipedia really centers on unbiased editors. So if the technical content is correct, my feeling is that I should stay disengaged. At a glance the page can definitely be improved at this point (your comment about too many authors seems about right to me). So one person doing a top-to-bottom rewrite would certainly help. But if that person was me, wouldn't that just double down on the biased editor concerns?
- Seriously, I'll do what I should to help. But I need to understand: what am I supposed to do, under Wikipedia policy? Ken Birman (talk) 12:58, 10 November 2014 (UTC)
External Links
The references currently in the article are as follows, but of course I'm a co-author on a few of them. Just to head off any kind of flaming, I don't have anything to sell, not even the book -- I did write it, but I don't get a penny of royalties on this book. Springer has a deal under which they will cut the price of a book quite deeply if the author declines royalties, and a book like this never makes much money anyhow, so I took the deal. And indeed, the book is cheaper as a result.
- Reliable Distributed Systems: Technologies, Web Services and Applications. K.P. Birman. Springer Verlag (1997). Book, covers all of the replication technologies in question.
- Distributed Systems: Principles and Paradigms (2nd Edition). Andrew S. Tanenbaum, Maarten van Steen (2002). Book, famous authors; also covers all of the replication technologies in question.
- "The process group approach to reliable distributed computing". K.P. Birman, Communications of the ACM (CACM) 36:12 (Dec. 1993). Easiest single article to read, for general audiences.
- "Group communication specifications: a comprehensive study" Gregory V. Chockler, Idit Keidar, Roman Vitenberg. ACM Computing Surveys 33:4 (2001). A relatively mathematical treatment but very clear and rigorous.
- "Practical Impact of Group Communication Theory." Andre Schiper. Future Directions in Distributed Computing. Springer Verlag Lecture Notes in Computer Science 2584 (July 2005). Talks about the history of data replication and fault-tolerance; very balanced and suitable for a non-expert.
- "The part-time parliament". Leslie Lamport. ACM Transactions on Computer Systems (TOCS), 16:2 (1998). Introduces the Paxos protocols, one of the three main classes of replication techniques. Some may find this hard to follow; the mathematical notation is a bit dense.
- "A review of experiences with reliable multicast" Software, Practice and Experience. 29:9 (July 1999) K. P. Birman. Describes some case-studies including stock exchanges, air traffic control systems.
- "Exploiting virtual synchrony in distributed systems". K.P. Birman and T. Joseph. Proceedings of the 11th ACM Symposium on Operating systems principles (SOSP), Austin Texas, Nov. 1987. First use of the term, but not necessarily the easiest source for learning more.
Help needed for the illustrations
Can anyone help with the formatting of the 3 figures in the section that illustrates virtual synchrony using time-line diagrams? These seem too small in the gallery format, but are too large to include in-line. I'm happy to do the work if you want to suggest a particular way of doing this but don't feel like doing it yourself. Ken Birman 16:33, 13 June 2007 (UTC)
Draft time-line diag:
Article "okay" as is? No.
This article has numerous problems. First of all, the editor who created the article, User:Ken Birman, has a conflict of interest (WP:COI) in that he is the developer of Virtual synchrony software, as he has stated. The article does not contain in-line citations. Its tone is inappropriate for an encyclopedia: a casual discussion in some areas, abruptly introduced technical language in others, sometimes both at once, so the article reads in places like a casual discussion among insiders or buyers, with unfocused run-on descriptions of the process and no sense of who reads Wikipedia articles ("Developers of distributed computer systems often need a way to replicate data for sharing between programs running on multiple machines, connected by a network. Virtual synchrony is one of three major technologies for solving this problem. The key idea is to create a form of distributed state machine associated with the replicated data item"). The examples are awkwardly introduced (probably due to COI), and the prose needs to be thoroughly edited. The sections need to be cohesively structured, with internal order, for the reader of the article--WP:MOS. I have tagged these articles with requests for this clean-up in case there are interested Wikipedia editors who can improve these articles, in particular starting with structure and prose before moving on to technical accuracy. Ken Birman has made it clear he does not want me to edit the articles. KP Botany 18:23, 23 June 2007 (UTC)
On the OR: I included this tag because of the COI between the primary editor and the topic and the failure to include in-line citations, coupled with the disorganized structure of the article, that altogether make difficult the direct verification of information and original research. KP Botany 18:33, 23 June 2007 (UTC)
In addition, the complete rewrite tag is more appropriate at this point than the clarity tag due to the overall unstructured nature of the article and the lack of internal direction within sections. The clarity tag may need to be re-added at some point if the technical jargon is not dealt with at a level appropriate for Wikipedia. KP Botany 18:37, 23 June 2007 (UTC)
- Sure. Just to point it out, though: I did edit the article to address the concerns you raised, and you then chose not to comment further. Under the rules as I understand them, you need to do more than sort of glare at the article with unarticulated issues. Your role is to explain, in as clear a way as you can, what issues concern you -- "I don't like it" not being a good way to express those concerns, whereas "I read this, but I honestly can't understand it" being an example of something that would be acceptable under the Wiki rules. Given that I had revised the article and you then remained silent, after two weeks, I assumed that your objections had been addressed. This is a reasonable assumption, since you didn't express your concerns.
- With respect to your concerns, I'll simply reiterate that hundreds of people have worked with virtual synchrony as a model, and while this does create a point of view, it isn't obvious to me that it implies a conflict of interest. You say that I am "the developer of virtual synchrony" but the actual fact is that (nearly 20 years ago) I was one of the first people to work with the model... and by now, this is a model with an industry standard behind it, used in systems and products developed independently all over the world, by companies as large as IBM and Microsoft and by researchers in places as widespread as Israel, Switzerland, Italy, Norway, and France (and the US too). I don't have some sort of monopoly on the model. As a general matter, it can be difficult to write the initial version of a technical contribution if one lacks the necessary insight into the area as a whole. Necessarily, any Wiki article strikes a compromise between technical detail and readability. It is a misunderstanding to assume that the point of the COI/POV rules is to ensure that only people lacking technical background can write Wiki articles. The goal of those rules, as written, is to prevent people from using Wikipedia for self-promotion (especially for marketing products). As you know from prior dialog on this, I have nothing to sell here. Moreover, while you seem not to accept that, I'm not trying to promote myself.
- In fact my main reason for adding the article is that in the past six months or so, at least a dozen people have commented to me that a Wiki entry of this sort was lacking, and suggested that I would be the natural person to write the first version.
- You've said "I don't want you to edit the article". Actually, I never said that. Feel free. But something else troubles me about this comment: this is actually the second time you've put words in my mouth (the previous time being on the Articles for Deletion discussion). With all due respect, it would be helpful if you could stop doing that. If you want to quote people, you really should try and do that with cut-and-paste.
- As indicated previously, I would be happy to see one of the editors tackle this. In fact I'm quite interested to see what they come up with. Ken Birman 23:03, 25 June 2007 (UTC)
- No, I don't need to do any more as when I articulated my ideas you made it clear for me to keep away from your article. That means, the responsibility is yours, as it's about the articles, and you don't want me near them, so there is no point in us discussing anything more here, other than for me to note as I did that the articles don't go anywhere near being wikified or being okay. The tags may alert other editors to this, hopefully you won't chase them away, too. You could try reading WP:MOS until someone comes by to help. 05:17, 26 June 2007 (UTC)
- Reading your talk page, I can see that you've had a history of these very aggressive interactions with people on Wikipedia. Under the circumstances, I think that waiting for a Wiki editor who has the time and inclination to work on this makes sense. Now, I never once said that you should keep clear of "my" article. But let's not go there. You seem to feel a need to (mis)characterize what others say, and I can see now by browsing your history that I'm not the first person to run into this sort of thing. So: When a Wiki editor takes interest in the question, I'll be happy to assist in any way they request. Ken Birman 15:36, 26 June 2007 (UTC)
More detail, please
As with gossip protocol, I found this entry useful and informative. Granted, I already have a fair bit of knowledge of networking, but I can't see someone without CS networking experience wanting to learn about this topic anyway (and if they do, they'll need to catch up on prerequisites first.) I thus find KP Botany's comments unreasonable and over-the-top. The article could certainly benefit from some copy-editing, but a rewrite? No. Nor do I see any conflict of interest.
I do however think the article could use some more detail on how these protocols work. I was following along with the basic principles, and then the article ended without giving me a sense of what type of messages the hosts need to exchange in order to implement this type of synchrony. Obviously this article shouldn't give full details, but something more is needed. JensAlfke 00:00, 4 July 2007 (UTC)
- The article is improperly formatted, disorganized in outline, written in variable and largely incorrect tense, unnecessarily filled with poorly explained jargon, and not written for a general audience. Write an introductory sentence to a concept, then elaborate within the paragraph; place these sentences in a logical order, developing the concept; use the introductory paragraph to introduce the concepts in the same order as the entire article develops them. The initial paragraph starts out strong, then gets bogged down in specific details that are poorly elaborated and don't belong in an introduction, which should include an outline of the article. For a Wikipedia article, if you read the introductory paragraph you should come away with knowledge, not with confusion. There's a lot more missing than what type of messages the hosts need to exchange. If you want to see what it should and could look like, check out these articles: Sei Whale, El Greco, WP:MOS. If you have a specific example of how my comments here are "unreasonable and over-the-top," please do quote directly from the article, and one of my comments, and Wikipedia MOS, and show precisely how my comments are incorrect. But coming here and piling on me after Ken decided the two of us should just cool off, and I agreed, has no place in this article discussion--without specific examples, your complaint about me is simply a complaint about me, so take it to my talk page or omit it, and get to the article. KP Botany 17:54, 4 July 2007 (UTC)
- JensAlfke, I could definitely add that sort of detail but I'm worried that it might take the article a bit beyond what one normally expects in an encyclopedia. Could you suggest some examples of comparable articles that I could look at, to see how this was handled? It seems to me that there is a delicate balance between explaining the basic idea and departing into a very detailed discussion. Also, with "models" of this sort, there are always multiple possible implementations, so you get a whole issue of which one to present and how to explain the existence of other options.
- KP Botany, seems to me that less than a week ago we agreed that I wouldn't react to your comments, and you would focus on pages unrelated to these two. I just want to urge you to follow your own posted decision to stay away from these pages. Let other people get involved. Cool off. Focus on other things. There is an actual WP policy on conflict and you just aren't abiding by it. Ken Birman 19:11, 4 July 2007 (UTC)
- Ken, it seems you didn't actually read what I wrote. These articles remain deeply problematic in their style, in their accuracy, in their choice of material. I will be monitoring them until they get cleaned up. I suggest you delete your comments about me from this talk page, so others won't think that open-season-on-KP Botany is needed to support the article. It is either a well-written and appropriate article for Wikipedia or not. KP Botany 01:14, 5 July 2007 (UTC)
- KP Botany, as we've discussed at some length, you and I have a history of ending up in conflict. WP:conflicts expresses a policy on precisely this sort of thing. That policy makes it clear that none of us is critical and that the simple and best solution when conflicts arise is for both parties to cool off by not interacting at all. You and I agreed to cease interactions, but now you are ignoring your own statement that you had other articles to focus on and would keep yourself scarce. At any rate I did read your comments and I am not ignoring them. However, I do not plan to revive the senseless argument that characterized our previous round of interactions. I'll be more than happy to discuss things with anyone who wants to do so in a polite manner, but I'm really not interested in being treated abusively, misquoted, etc. And I don't plan to resume a debate with you over this either. We clearly have personality issues with one another, and Wiki has a zillion other people who can edit these pages. I'll help any of them out (if they approach things constructively). But you yourself said you had no plans to do so, or to post anything further here. And I think that's exactly right: give it a rest and pursue some of the other topics of interest to you... Ken Birman 11:07, 10 July 2007 (UTC)
Protocol for removing tags?
For about six months now, this page has been tagged for Point of View, Original Research and Factual Accuracy. After a brief period of discussion and editing activity, nothing has happened for six months.
Here's my question: as a matter of policy, do such tags remain on the page forever, or is there some procedure for agreeing to remove them? It seems safe to assume that KPBotany, who placed them there, would not consider the matter resolved, since nothing substantive changed during this period. However, perhaps that particular reviewer will never find comfort with this material -- such things do sometimes happen. Ken Birman (talk) 18:03, 21 December 2007 (UTC)
- Ken, such tags are a matter of local consensus. As most of the text has been contributed by one person, namely you -- the author of Virtual Synchrony -- the OR, POV, and FA tags should probably remain. It's too bad that no one stepped up to the plate to help with the editing. Perhaps you can attract the attention of editors on the Paxos algorithm page, as they might be interested in the topic and would probably do a good job, given that that particular article has become much better over the last year or so. Bestchai (talk) 03:26, 25 January 2008 (UTC)
- Perhaps this article could be condensed into an automatic way to resolve an edit conflict. When I was introduced to dumb terminals in 1988, two people could not write to the same file at the same time. Now, I can almost see a way where this could usually be automatic. IOW, if two people did not alter exactly the same text in a file. In an edit conflict, current wiki software sends one user's edit to another. It's a lot of work to do that stuff manually. Know what, though? If two users do write into the same bytes, then one needs more authority (latest submission--timeouts in effect). As wiki is, now, I would prefer to serialize write locks, like they used to. event queue seems to be majorly relevant. BrewJay (talk) 09:25, 14 July 2008 (UTC)
- In fact this topic has been studied: the most famous work is by Jerry Popek, years ago, in a system called LOCUS (sadly, he just passed away last week), and more recently by Doug Terry at Xerox PARC and then Microsoft. Microsoft has a new file system under wraps that does exactly what you have in mind. Very very cool. And definitely related to the topic of event ordering and time in a distributed system. A bit less specifically related to virtual synchrony, though. Ken Birman (talk) 14:17, 9 August 2008 (UTC)
I wonder if the Michigan Terminal System is in here somewhere. It's largely dead. On that system, not only were write locks serialized, you could not lock a file for reading if someone had it locked for writing, and if someone had it locked for rename|destroy, then you couldn't lock it for writing. BrewJay (talk) 09:33, 14 July 2008 (UTC)
asynchronous, message queue and buffer seem like pre-requisites.
That's all I have to say on this at the moment. This article is much better than other work I've seen from Birman, so I'm not redirecting it. I am requesting in-line citations on the numbers, though. See WP:NOR. BrewJay (talk) 11:41, 14 July 2008 (UTC)
Maybe I'm missing the boat on the PAXOS method that he's trying to attack. I don't see any reason to believe that a CPU operating in the gigahertz range can't route a packet to the network and save it to disk, when those things operate at about 31 megahertz. BrewJay (talk) 15:23, 14 July 2008 (UTC)
- Hi guys. I'll take that as a kind of back-handed compliment. I'm not as good a writer as some of the editors who hang out on Wikipedia, so I'm kind of counting on you folks to fix things up. This said, it is really important not to edit a thing if you haven't learned the background (I don't mean that as an insult; this is just basic Wiki policy, and makes good sense).
- I'm seeing some evidence of confusion here, so I'll make an offer. Email me your physical mail addresses -- email@example.com -- and I'll mail you a free copy of the textbook I've written on reliable distributed systems. Sure, my writing is dreadful -- I honestly realize that. And sure, these are tough things to learn just by reading a book or a research paper. But I've done my very best to explain things clearly and the book has more pages and more pictures. Maybe once you've got it all clear in your heads, you can rewrite the section better than I did.
- Also, to be very clear, I'm not attacking Paxos at all. Virtual synchrony actually includes the core Paxos mechanism -- the two are "the same" at their core. But I would definitely say that if you read the history of this area, and the papers people have written, there are many out there who do see these as competing solutions. Oddly, Leslie Lamport (the inventor of Paxos) and I are friends, and neither of us sees it that way. In fact Leslie is working on a model he calls "virtually synchronous Paxos" that makes the connections very explicit. But the knee-jerk response from Joe Random out there is that they compete and that one must be better than the other. The idea that they are somehow the same is apparently hard for poor Joe to grasp, even if Leslie and I see it that way!
- At any rate, email me and I'll be happy to provide that free book. Ask and you shall receive more than you actually wanted! And I do appreciate any editing help you can provide... Ken Birman (talk) 14:14, 9 August 2008 (UTC)
Locking paragraph.
- Locking. Many systems need some form of locking or synchronization mechanism. Locking can easily be implemented on top of a virtual synchrony subsystem. For example, a system can associate a token with each group, and make the rule that to hold the lock, a process must gain "ownership" of the token. Multicasts are used to request and to pass the token.
- That's how it was left for me. If I understand the serialization article correctly, then the term is now applied to network communication as well as to buffers prepared for direct memory access. So, the system is assuming that its own devices are functioning, unless it waits for acknowledgement of those multicast lock assertions. BrewJay (talk) 11:19, 15 July 2008 (UTC)
- In the interests of performance, I'm led to believe that some UNIX systems don't enforce locking mechanisms, at least not at the file-system level, which means that a programmer should tread carefully when she ignores a lock. I think this makes proving reliability more difficult. BrewJay (talk) 11:28, 15 July 2008 (UTC)
- I've corrected the language in the locking section, which had a confusing sentence (a consequence of what looks like an editing mistake, actually -- something about ignoring flush messages). To clarify I added a paragraph; I hope this helps. You are correct about UNIX file systems, but I would suggest that you NOT add anything on this point to the main page. Yes, your statement is true and yes, it relates to locking, but locking done by a dozen active processes managing an air traffic control system is just a different enough question from locking a file that it only confuses things to start discussing that type of file system locking here. Anyhow, the file system has to implement the locking mechanism. (Turtles all the way down). If you study that problem, you end up back in the world of virtual synchrony and Paxos! The original file system locking work, by the way, was done by Jo Mei Chang at SUN in 1988 or so, and is very much in the same spirit as virtual synchrony or the more modern Google "Chubby" locking service!
- At any rate, it is really important to keep an article such as this limited in scope. Virtual synchrony is a communications model used when building a service that has multiple processes or agents running on different nodes in a computer network. Let's not confuse it by talking about devices, CPU talking to disk, etc. Those are just different problems! Ken Birman (talk) 14:07, 9 August 2008 (UTC)
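For anyone trying to picture the token scheme quoted at the top of this section, here is a minimal sketch. It assumes a toy in-process `Group` whose `multicast` delivers every message to every member in the same order; the class names, the "request"/"release" message kinds, and the rule that a release hands the token to the oldest waiting requester are all hypothetical choices made for illustration, not taken from the article or from any real virtual synchrony library. Failure handling (what happens if the token holder crashes) is deliberately left out.

```python
from collections import deque


class Member:
    """One group member; applies multicasts in the order they are delivered."""

    def __init__(self, name):
        self.name = name
        self.holder = None      # current token owner, as seen by this member
        self.waiting = deque()  # pending requesters, in delivery order

    def deliver(self, kind, sender):
        if kind == "request":
            if self.holder is None:
                # Token is free: the requester becomes the owner.
                self.holder = sender
            elif sender != self.holder and sender not in self.waiting:
                self.waiting.append(sender)
        elif kind == "release" and sender == self.holder:
            # Pass the token to the oldest waiting requester, if any.
            self.holder = self.waiting.popleft() if self.waiting else None


class Group:
    """Toy stand-in for an ordered multicast (an assumption of this sketch):
    every member sees the same messages in the same order."""

    def __init__(self, names):
        self.members = {n: Member(n) for n in names}

    def multicast(self, kind, sender):
        for member in self.members.values():
            member.deliver(kind, sender)

    def holder(self):
        # Because all members saw the same sequence, they must agree.
        views = {m.holder for m in self.members.values()}
        assert len(views) == 1, "members disagree about the token owner"
        return views.pop()
```

With a group of A, B and C, a "request" from B gives B the token; later requests from C and A queue up behind it, and each "release" from the current owner hands the token to the next in line. Every member reaches the same conclusion locally because every member processed the same multicasts in the same order.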
Article still biased?
IMHO, this article is still deeply flawed. Most importantly, there are fundamentally different assumptions between virtual synchrony and other algorithms (e.g. Paxos). Specifically, virtual synchrony has stronger assumptions about delivery of messages - it seems to me that this can only be guaranteed (with unreliable network transports) by extra acknowledgement messages. Thus the claimed performance gains of virtual synchrony are achieved by sleight of hand. DrInequality (talk) 00:00, 8 August 2010 (UTC)
- It would help if you could be very specific about your concern, since this is written in a general way and I don't know what line you are referring to (or, if referring to a published paper, which paper and which lines within that paper). But the broad answer is that you are incorrect, although the question is a sensible one to pose. Virtual synchrony makes exactly the same assumptions about message delivery as Paxos. Clearly something has caused you to be confused either about Paxos or about virtual synchrony. Both of them require a layer that sends acknowledgements and does flow control and congestion control; this isn't really an "assumption" or "sleight of hand" but just a form of layering: first, you deal with reliability on a point-to-point basis, and then you layer your more elaborate protocols over these point-to-point ones. But we don't assume anything -- in particular, we don't assume that the recovery mechanism will be successful. Thus a protocol instance could try and try to send a message from A to B, over this layer, and yet the message being sent might never get through, even though both A and B are running (the network could be "partitioned", to use a technical term). In such cases we force a reconfiguration: either the sender will end up exiting, or the destination will be removed from the system, based on a majority vote within the membership service, which itself is built over exactly the same unreliable message passing layer. At any rate, no sleight of hand, and in fact not only is the model identical, but Dahlia Malkhi has written a very nice paper in which the two models are merged into a single more general model (but the assumptions remain unchanged).
This paper is available from the MSR technical reports web site (look for Birman, Malkhi and Van Renesse) or the Cornell web site (http://www.cs.cornell.edu/projects.quicksilver/pubs.html) and is quite formal; indeed, we've used a mechanical theorem prover called NuPRL to automate the model and to verify proofs within it. Ken Birman (talk) 00:41, 22 November 2011 (UTC)
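The layering described in the reply above can be sketched roughly as follows. This is only an illustration under loud assumptions: `LossyLink`, `reliable_send`, `MembershipService`, the retry limit, and the rule "remove a member once more than half of the current view has reported it" are all hypothetical names and simplifications introduced here, and in this toy the membership service is an ordinary in-memory object rather than something built over the unreliable layer itself.

```python
class LossyLink:
    """Hypothetical unreliable channel: drops the first `drops` sends,
    or every send if `partitioned` is True."""

    def __init__(self, drops=0, partitioned=False):
        self.drops = drops
        self.partitioned = partitioned

    def send(self, msg):
        # Returns True only when an acknowledgement came back.
        if self.partitioned:
            return False
        if self.drops > 0:
            self.drops -= 1
            return False
        return True


def reliable_send(link, msg, max_tries=3):
    """Point-to-point layer: retransmit until acknowledged.
    Success is not assumed; the caller is told when it gives up."""
    for _ in range(max_tries):
        if link.send(msg):
            return True
    return False


class MembershipService:
    """Toy membership layer: a member is removed from the view once a
    majority of the current view has reported it unreachable."""

    def __init__(self, members):
        self.view = set(members)
        self.suspicions = {}  # suspect -> set of accusers

    def report(self, accuser, suspect):
        if accuser not in self.view or suspect not in self.view:
            return self.view
        accusers = self.suspicions.setdefault(suspect, set())
        accusers.add(accuser)
        if len(accusers) > len(self.view) // 2:
            # Reconfiguration: install a new view without the suspect.
            self.view = self.view - {suspect}
            self.suspicions.pop(suspect)
            for others in self.suspicions.values():
                others.discard(suspect)
        return self.view
```

In this sketch a sender whose `reliable_send` returns False would call `report`, and the destination disappears from the view only after enough other members agree. The point is just that the upper layers react to a failed delivery by reconfiguring, instead of assuming the lower layer always succeeds.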
Better figure(s) required
This article requires better illustration by figures (compare with the Paxos alg page to see what I mean). It should show standard operation messages and typical failure cases. This is a key requirement for this type of article. DrInequality (talk) 00:04, 8 August 2010 (UTC)
Totally agree, but as indicated earlier, I did a lot of work on this, and the argument for having multiple eyes/hands is a strong one. But if someone takes this on and isn't sure how a figure should look, drop me a note and I'll be happy to help out. Ken Birman (talk) 21:47, 22 August 2010 (UTC)
Other references
- Absolutely. But it isn't a sole-editor situation, so if you mean I should do that edit, I prefer not to: I wrote the first version, quite a while ago. Now you own it; the best response to conflict of interest is to have other people do the future edits. So go right ahead! JGroups does use the same model (it originated as a Java rewrite of Horus, actually). So, totally appropriate suggestion, and I encourage it. Ken Birman (talk) 22:45, 21 October 2014 (UTC)
The quality issue (again)
I stumbled upon this article while searching the web for materials on virtual synchrony.
Basically the article is written in marketing talk: it claims that VS is superior to other methods; makes cluster communication easy (that particular statement even ends with exclamation point which is rarely seen on Wikipedia), etc. without going into much detail on implementation or underlying principles.
Currently I am using the book Reliable Distributed Systems: Technologies, Web Services, and Applications by Kenneth Birman to learn VS. The book is very well written and the structure of the material is great. VS and the underlying techniques are covered in sections 15-18 (some prerequisites were probably introduced in earlier sections).
Now this article's primary author is Ken Birman, what a twist!
Ken many thanks for the book, I think it is among the best ones to learn VS, but you've done a really sloppy job on Wikipedia, sorry to tell that.
- Why does one thing always have to be better than another? Virtual synchrony is what got used in some systems, like the European air traffic system and the NYSE and the CORBA standard; Paxos in others, like Chubby. What makes one better and one worse? Why is that a main goal in an article?
- These days even Leslie favors the virtual synchrony membership protocol over his original Paxos protocol because it finalizes the prior view before installing a new view. He calls the resulting protocol "virtually synchronous Paxos". Also, virtual synchrony can support a safe form of optimistic delivery that performs much better than Paxos/SafeSend, although you do need to call Flush before externally visible actions. So I guess those are benefits. But you seem to be asking me to do a sales pitch. Why not just improve the article? I have a sense that you are actually claiming that Paxos is better. Why do you feel that way? Virtual synchrony gbcast (SafeSend) bisimulates Paxos, and existed years earlier too. So in a sense, aren't you comparing apples and oranges?
- By the way, I didn't recall writing about cluster communication, or adding an exclamation point. In fact I only wrote the first version of the article. Since then, others have edited it. Your issue is most likely with a subsequent edit. Maybe a fan of Isis2. But look, this is Wikipedia. Just improve it!