In the first of a series looking at this year's eight ongoing Google Summer of Code projects, the Signpost caught up with developer Harry Burt, which wasn't too tricky, given that he is also the regular writer of this report. Burt explained what his project was about, his success so far – final submissions are due in a month's time – and what impact it might have on the Wikimedia community:
The English-language version of a map of South Sudan created during 2011; alongside it on Wikimedia Commons are more than a dozen duplicates containing translated labels.
On 9 July 2011, South Sudan declared independence, and during that buzz, an Italian Wikimedian found his map showing the borders of the new nation had been translated into a dozen other languages, among them English, Greek, Catalan, and Macedonian. These copies were then uploaded onto Wikimedia Commons as separate files. Of course, one would expect the map to change significantly over the next decade. More often than not, these kinds of change are picked up first by editors of the larger projects, who rapidly update their own versions of the map. To do so takes, say, 20 minutes; but to replicate that same change across Catalan, Greek, Macedonian? Hours of work – and dozens of separate uploads.
My project, named "TranslateSvg", changes this workflow – for SVG format files at least – firstly by making it easier to translate those files (thus reducing the all-too-common sight of English-language diagrams in use on non-English wikis), and secondly by embedding the new translations within the same SVG file. When boundaries change, a single update will propagate to all language versions instantly. It's the smaller projects that will benefit the most, picking up those image updates that are performed every day by users of larger projects, but there are gains for larger wikis too from the reverse process.
I would say that progress has been good so far: the main hurdle remaining is code review, and it's during that period that the project will either sink or swim. If the latter, TranslateSvg could find its way onto Wikimedia Commons before the end of the year.
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.
MediaWiki 1.20wmf8 begins deployment cycle: 1.20wmf8 – the eighth release to Wikimedia wikis from the 1.20 branch – was deployed to its first wikis on July 23 and will be deployed to all wikis by August 1. The release incorporates approximately two hundred changes to the MediaWiki software that powers Wikipedia, comprising 94 "core" changes plus a similar number of patches for WMF-deployed extensions. Among the changes (the product of a fortnight of development time, albeit one disrupted by Wikimania) are improvements to the way Special:MyPage handles additional URL parameters such as &redirect=no (bug #35060) and a fix to the "newbie" mode of Special:Contributions so that it excludes new bots from the display (revision #14674). A release to external sites including the same selection of bug fixes and new features is not expected for some time.
History bug marked resolved: After an investigation lasting nearly two months, developers are now confident they have fully resolved bug #37225. The bug, which intermittently caused null revisions to be erroneously recorded – and with non-zero byte change numbers – had prompted dozens of comments across many wikis, including at least 5 threads on the English Wikipedia's technical village pump alone. It was eventually traced to a small change in the exact SQL command being given to the database upon a page save, a problem which has since been repaired. The fix is included in the 1.20wmf8 release, but it is not known if it will also be applied (or indeed needs to be applied) retroactively to clear old revisions from the database.
Gerrit, but faster: The speed of Gerrit received a significant boost on Friday, when its databases were transferred from the Tampa, Florida datacentre to the WMF's other datacentre in Ashburn, Virginia, where the Gerrit frontend is hosted (wikitech-l mailing list). Developers have already noted a marked improvement in load times for all sorts of web-based Gerrit operations. The improvement comes amid lengthy discussions about the viability of sticking with Gerrit in the long term and, more specifically this week, about whether there is a "serious alternative" to the code review system, which has been in place since March (also wikitech-l).
Requested moves bot sidelined: RM Bot, the automated bot responsible for listing requested move discussions, made its last edit on Wednesday. The PHP-based bot, which has been running since November 2009, is operated by HardBoiledEggs, a contributor who has recently become inactive. The community is discussing a return to manually updating the listing page until HardBoiledEggs returns; in addition, current and potential bot operators are advised that the existing and "indispensable" bot is in need of a new owner.
More dumps on the Internet Archive?: WMF developer and dumps guru Ariel Glenn this week blogged about her efforts to improve the quantity, quality and timeliness of WMF database exports (colloquially known as "dumps") hosted on the US-based Internet Archive. Wrote Glenn, "When we look back on this period of our digital history, the Archive will surely be recognized as one of the great repositories of knowledge, a project that changed forever the course of the Internet. Naturally we want Wikimedia XML dumps to be a part of this repository", before explaining her work on a new suite aimed at facilitating the upload process.
&action=info?: An old system for supplying human-readable metadata about a page could be reinvigorated if a formal Request for Comment (RFC) receives support from developers (wikitech-l mailing list). The &action=info system, which would complement existing pages such as &action=history, could eventually list dozens of pieces of information about a page, including such details as creation time, creator and number of revisions. Enabling the page would still require local community consensus; the current discussion relates to having such functionality in the code should projects then wish to use it.
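As a rough illustration of how such per-page views are addressed, the sketch below builds `index.php` URLs that differ only in their `action` parameter. The `page_action_url` helper is hypothetical, and the base URL is used purely as an example; whether `action=info` returns anything useful on a given wiki depends on the outcome described above.

```python
from urllib.parse import urlencode


def page_action_url(base: str, title: str, action: str) -> str:
    """Build an index.php URL for one page and one action.

    Spaces in the title are replaced with underscores, the usual
    form for page titles in MediaWiki URLs.
    """
    query = urlencode({"title": title.replace(" ", "_"), "action": action})
    return f"{base}?{query}"


if __name__ == "__main__":
    base = "https://en.wikipedia.org/w/index.php"  # example wiki only
    for action in ("history", "info"):
        print(page_action_url(base, "South Sudan", action))
```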
New staff member: Long-time editor S ("yes his name really is just 'S'") Page has joined the Wikimedia Foundation as a software engineer in its Editor Engagement Experiments (E3) engineering team, WMF Director of Engineering Alolita Sharma announced this week (wikitech-l mailing list). A former ski instructor and sometime road sweeper, Page will help develop new technologies aimed at bringing in and then retaining new editors; it is not yet known what his precise focus might be.