Wikipedia:Huggle/No you can't have instant reverts

LATENCY FOR BEGINNERS

There is a single, definitive state of the wiki at any one moment -- that stored in the master database.
Web clients, such as web browsers and Huggle, cannot magically see this state immediately.

One database server is not sufficient to handle Wikipedia's traffic.
So there are multiple database servers, holding copies of the wiki's state.

These are always slightly out of date, often by several seconds.
There is then another delay across the Internet, often several more seconds.

This is a fact of life in any networked system.
If you've ever played online FPS games you'll know the importance of ping times.

Gamers want ping times to be stable and below about 100ms, so that other players' actions seem instant.
Anything over about 300ms is generally considered unplayable.
Latency is critical. If you shoot someone, you expect them to die now, not in ten seconds' time.

Latency is less critical in other applications. Wikimedia's servers generally respond within 500 to 3000ms, with occasional spikes up to 10000ms or more. This is fine for readers; they can wait 3 seconds for an article to display. On pages with the pending changes extension, response times in excess of 20000ms are normal.


Expecting Huggle to carry out actions "instantly" is like expecting an FPS to be playable with a ping of 10000.
Now do you see why your desire for instant feedback is unsatisfiable?


OK, so we accept that:

  • things will often take several seconds to happen
  • this time will be unpredictable and widely variable
  • there is nothing the client can do about it

But these things should at least always happen predictably, in the same order, right?
Well, not exactly.


CONCURRENCY FOR BEGINNERS

Consider the following situation:

  • Another client (probably a web browser) has edited the wiki
  • We were notified of this through the IRC recent changes feed
  • The feed only contains very basic information, so we now want to retrieve details of the change

So we ask the server for the information, right?
It's not as simple as that.
There are now five actors involved in the system, each with associated, greatly variable, delays.

Any request to change the wiki is sent to the master server.
The change is committed and propagated out to the other database servers.
Read-only requests, such as viewing a page, go to these servers.
The recent changes feed naturally comes from the master server.

And the crucial point:
The delay in getting data to the IRC feed and the delay in getting it to the database servers are both variable, and there is no guarantee which will happen first.
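
The race between the two delays can be sketched with a small simulation. This is only an illustration: the delay ranges are made-up numbers, the uniform distribution is an assumption, and `simulate_race` is a hypothetical name, not anything in Huggle or MediaWiki.

```python
import random

def simulate_race(trials=10_000, seed=0):
    """Count how often the feed notification beats replication, and vice versa."""
    rng = random.Random(seed)
    feed_first = 0
    for _ in range(trials):
        # Hypothetical delays in seconds; the real distributions are unknown.
        feed_delay = rng.uniform(0.1, 3.0)         # master -> IRC feed -> client
        replication_delay = rng.uniform(0.1, 3.0)  # master -> database server
        if feed_delay < replication_delay:
            feed_first += 1
    return feed_first, trials - feed_first

feed_first, replica_first = simulate_race()
print(f"feed arrived first: {feed_first}, database server updated first: {replica_first}")
```

Both counts come out well above zero. Neither ordering can be assumed by the client.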

Sometimes the feed notification reaches us before the revision has reached the database server that answers our request:

So we ask the server for details of the revision, and the server says, "What revision? There is no such revision."

The information we're getting from the wiki isn't even consistent in itself! What possible hope do we have of maintaining a consistent state in our client?!
Retrying after a few seconds may or may not work. We don't know. It all depends on whether the database servers have received the revision of interest from the master by that time.
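
A minimal sketch of that retry, assuming a toy database server that simply fails for its first few polls. `LaggingReplica`, `fetch_with_retry` and the doubling wait are all hypothetical choices for illustration; this is not Huggle's actual code.

```python
import time

class LaggingReplica:
    """Toy database server that has not yet received the revision."""
    def __init__(self, polls_until_ready):
        self.remaining = polls_until_ready

    def get_revision(self, rev_id):
        if self.remaining > 0:
            self.remaining -= 1
            return None  # "What revision? There is no such revision."
        return {"revid": rev_id}

def fetch_with_retry(replica, rev_id, attempts=4, first_wait=0.5, sleep=time.sleep):
    """Ask repeatedly, waiting longer each time; give up after `attempts` tries."""
    wait = first_wait
    for attempt in range(attempts):
        revision = replica.get_revision(rev_id)
        if revision is not None:
            return revision
        if attempt < attempts - 1:
            sleep(wait)
            wait *= 2  # back off so a lagging server is not hammered
    return None  # still missing; the caller has to cope

no_sleep = lambda seconds: None  # skip real waiting in this demonstration
found = fetch_with_retry(LaggingReplica(2), 12345, sleep=no_sleep)
gave_up = fetch_with_retry(LaggingReplica(10), 12345, sleep=no_sleep)
print(found, gave_up)
```

The second call shows the uncomfortable part: after a bounded number of tries the client still has nothing, and all the user sees is a diff that took a long time and then failed to load.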

But you never had this problem yourself while using Huggle. Because it was solved for you.
All you saw was some diffs randomly taking much longer to load than others. And you still complained.


CONCURRENCY PART TWO

Now let's consider another scenario:

  • Someone has edited the wiki
  • We have examined the change and wish to revert it
  • Someone else also wishes to revert it

Now there are six actors involved in the system.
It's tricky enough just to ensure people don't overwrite each other's changes without realizing (the whole "edit conflict" concept).
But MediaWiki handles that. We should be able to revert by just sending the edit and hoping for the best, right?
After all, if someone beat us to it MediaWiki will just discard our edit and tell us about it.

Well... that's kind of true, actually. It's not quite that simple, which is why Huggle sometimes reverts things it shouldn't. But usually yes.
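
The "send it and hope for the best" picture amounts to a base-revision check on the server. The sketch below is a toy model under that assumption: `ToyWiki` is hypothetical, and it ignores automatic merging and identical submissions, which the real software handles differently.

```python
class ToyWiki:
    """Toy page store: an edit is accepted only if based on the latest revision."""
    def __init__(self, text):
        self.revisions = [text]  # list index doubles as revision id

    @property
    def latest_id(self):
        return len(self.revisions) - 1

    def save(self, base_id, new_text):
        if base_id != self.latest_id:
            return "conflict"  # someone else saved since we loaded the page
        self.revisions.append(new_text)
        return "saved"

wiki = ToyWiki("good text")
wiki.save(0, "VANDALISM")                      # revision 1
ours = wiki.save(1, "good text")               # based on 1, arrives first
theirs = wiki.save(1, "good text, tidied up")  # also based on 1, arrives second
print(ours, theirs)
```

Which of the two clients ends up with "saved" depends only on whose request reaches the master first.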

But again, you're not satisfied with that. You want a client with an incomplete state picture to nonetheless produce completely consistent output messages in all situations. Can't be done.

Look at the following three diagrams (you might want to view them full size). They depict almost the same process, but with subtle differences that mean both the outcome and the messages that can be displayed to the user are different.

(Because the diagrams are complex enough without it, here we assume edits come to us, the client, directly from the master server without the intermediary of the IRC feed.)

In this diagram, the user successfully reverts. Someone else tries to but fails:

In this diagram, the user tries to revert but fails, because someone else did. Note that the only differences between this and the previous diagram are the timings involved:

In this diagram, the user is about to revert but receives notification that an identical revert has already been done. Obviously the message in this case is different to that in the above example, but again, the only difference is in timings, and the client has no way of knowing beforehand which of these situations is the case:

Ixfd64 thinks all these situations -- and more -- should produce identical messages. This is not possible.
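
The three situations can be reduced to a function of timings alone. This is a deliberately simplified sketch: `revert_outcome`, its parameters and the wording of the returned strings are all hypothetical, and times are in seconds on the master's clock, which a real client cannot observe.

```python
def revert_outcome(click_at, uplink, their_arrival, downlink):
    """Classify our revert attempt purely from timings.

    click_at      -- when our user clicks Revert
    uplink        -- delay for our request to reach the master
    their_arrival -- when the other client's revert reaches the master
    downlink      -- delay for news of their revert to reach us
    """
    notified_at = their_arrival + downlink
    if notified_at <= click_at:
        return "already reverted, and we knew before sending"
    if click_at + uplink < their_arrival:
        return "our revert succeeded"
    return "our revert failed, someone else got there first"

first = revert_outcome(click_at=5.0, uplink=1.0, their_arrival=7.0, downlink=1.0)
second = revert_outcome(click_at=5.0, uplink=1.0, their_arrival=5.5, downlink=1.0)
third = revert_outcome(click_at=5.0, uplink=1.0, their_arrival=3.5, downlink=1.0)
print(first, second, third, sep="\n")
```

The user did exactly the same thing in all three calls. Only `their_arrival` changed, and the client cannot know it in advance.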

"Huggle does nothing - it just displays the "Reverting <page name>..." message, which then disappears"

Two users attempted to revert. MediaWiki accepted the first edit and then also silently accepted the second edit because it was identical, performing a null edit. You were the second user. Huggle therefore has no error message to display, and yet no edit of yours to credit you with either.
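
A toy model of that silent acceptance, assuming only that a save whose text matches the current text records nothing. `ToyPage` is a hypothetical stand-in, not MediaWiki's implementation.

```python
class ToyPage:
    """Toy page: saving text identical to the current text records no revision."""
    def __init__(self, text):
        self.history = [("creator", text)]  # (user, text), oldest first

    def save(self, user, text):
        if text == self.history[-1][1]:
            return False  # null edit: accepted without error, nothing recorded
        self.history.append((user, text))
        return True

page = ToyPage("good text")
page.save("Vandal", "VANDALISM")
first = page.save("SomeoneElse", "good text")  # the revert that lands first
second = page.save("You", "good text")         # identical, so nothing happens
print(first, second, [user for user, _ in page.history])
```

No error is raised for the second save, and "You" never appears in the history, which is why there is neither an error message nor an edit to show for it.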

"Huggle prints a message saying that the page has already been rolled back"

The scenario in the second diagram above.

"Huggle prints a message saying that the target revision is identical to the current one"

The scenario in the third diagram above.

"Huggle tries to revert the user who reverted the page"

Imagine the third diagram above with the client end shifted even further down, so that the client has received notification of another client's revert before the user even clicks the revert button. On receiving such notification, Huggle will update the User field and briefly disable the Revert button. The diff display won't update immediately, because Huggle has to fetch the diff, and sometimes that will fail (remember our first timeline diagram?).

If this happens, either you didn't read the interface properly, or notification of the new edit was so slow to arrive that the Revert button re-enabled and you clicked it despite the interface changes. In neither case can the same message as any of the above be given.

"If the user being reverted has made more than one edit in a row, then Huggle may revert the article to the user's second-to-last edit (which would most likely be vandalism)"

The above happened. Then, because you can't roll back revisions by multiple users, Huggle reverted to the selected revision instead. It would have warned you first about reverting to a revision by the same user, unless you turned the warning off, or the warnings are broken. Either way, you still can't have the same message as in any of the other cases.
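
The difference between the two kinds of revert can be sketched as follows. Rollback undoes the whole run of consecutive edits by the most recent editor; the fallback shown here, restoring the revision just before the selected one, is a guess at the behaviour described above. Both function names and the history format are hypothetical.

```python
def rollback_target(history):
    """Newest revision not made by the latest editor, or None if there is none.

    history is a list of (rev_id, user) pairs, oldest first.
    """
    top_user = history[-1][1]
    for rev_id, user in reversed(history):
        if user != top_user:
            return rev_id
    return None

def previous_revision(history, selected_id):
    """Revision immediately before the selected one, whoever made it."""
    ids = [rev_id for rev_id, _ in history]
    index = ids.index(selected_id)
    return ids[index - 1] if index > 0 else None

history = [(1, "Alice"), (2, "Vandal"), (3, "Vandal")]
by_rollback = rollback_target(history)       # skips both of Vandal's edits
by_fallback = previous_revision(history, 3)  # lands on Vandal's earlier edit
print(by_rollback, by_fallback)
```

Rollback lands on Alice's revision 1. The fallback lands on revision 2, the vandal's second-to-last edit, which is what the warning about reverting to a revision by the same user is meant to catch.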