
User:Mill 1/sandbox3


Round 1: Breaking up the Deaths in Years

Period: September 2018 – October 2020
Articles: Deaths in January 1996[1] – Deaths in December 2005

The first phase of this project involved splitting each long and unwieldy dpy into twelve shorter and more manageable dpm's. I began this process on 1 September 2018 by standardizing the format of sections and list entries across all the dpy's.[2] This allowed me to start adding more deceased people ('entries') to the lists. At first, I focused on filling the gaps, that is, days of death without any entries, but later I decided to increase the minimum number of entries per day to two.[3] I also applied these changes to the 24 existing dpm's of 2004 and 2005.
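The check for days that fall below the minimum number of entries can be sketched in Python. This is only an illustration under stated assumptions, not the method used in the project: the function name `days_below_minimum` and the representation of a month as a mapping from day number to entry count are both hypothetical.

```python
import calendar


def days_below_minimum(year, month, entries_per_day, minimum=2):
    """Return the days of the month that have fewer than `minimum` entries.

    `entries_per_day` is a hypothetical mapping {day number: entry count};
    days missing from the mapping are treated as having zero entries.
    """
    # monthrange returns (weekday of the first day, number of days in month)
    _, days_in_month = calendar.monthrange(year, month)
    return [
        day
        for day in range(1, days_in_month + 1)
        if entries_per_day.get(day, 0) < minimum
    ]


# Made-up counts for February 1996 (a leap year, so 29 days)
counts = {day: 3 for day in range(1, 30)}
counts[5] = 1    # only one entry on 5 February
del counts[17]   # no entries at all on 17 February
print(days_below_minimum(1996, 2, counts))  # [5, 17]
```

Raising the `minimum` argument reproduces a stricter criterion, such as the later increase described in note 3.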

More issues

However, I soon realized that there were other issues that needed to be fixed before I could proceed with the additions. In each dpm, some entries did not have a biography page ('bio') to link to and had to be removed. Moreover, the deceased people were grouped into day sub-sections within each month, and many of them were in the wrong sub-section and had to be moved. The reason for these errors was often that the date of death in the corresponding bio had changed but the list had not been updated to match.
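The two clean-up steps, removing entries without a bio and moving entries listed under the wrong day, could be expressed roughly as follows. This is a sketch only: the tuple layout `(name, listed_day, bio_date)` and the function name `check_entries` are assumptions made for illustration, and the names in the example are placeholders rather than real entries.

```python
from datetime import date


def check_entries(entries, year, month):
    """Split a month's entries into those to move and those to remove.

    Each entry is a hypothetical (name, listed_day, bio_date) tuple, where
    bio_date is the date of death given in the bio (a datetime.date), or
    None when there is no bio to link to.
    """
    moves, removals = [], []
    for name, listed_day, bio_date in entries:
        if bio_date is None:
            # No bio to link to: the entry has to be removed.
            removals.append(name)
        elif (bio_date.year, bio_date.month, bio_date.day) != (year, month, listed_day):
            # The bio disagrees with the list: record where the entry belongs.
            moves.append((name, listed_day, bio_date))
    return moves, removals


# Placeholder data for a January 1996 list
entries = [
    ("Person A", 3, date(1996, 1, 3)),    # correctly placed
    ("Person B", 3, date(1996, 1, 13)),   # listed under the wrong day
    ("Person C", 7, None),                # no bio
]
moves, removals = check_entries(entries, 1996, 1)
print([name for name, _, _ in moves])
print(removals)
```

Because each move records the full date from the bio, an entry whose bio points to a different month or year is flagged as well.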

Adding an actual entry

And this was all before I could even start adding the missing entries. For reference, the wikitext structure of an entry looks like this:
*[[Name]], age, country of citizenship + reason for notability, cause of death (if known).<ref>Reference citing the date of death</ref>
which could be rendered as:

So for each entry, I had to find the relevant information in the corresponding bio and add it to the appropriate sub-section.
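Assembling an entry from the pieces of information found in a bio might look like the sketch below. It follows the entry structure quoted above. The function name `format_entry` and its parameters are hypothetical, and the example arguments are the placeholders from that structure, not real data.

```python
def format_entry(name, age, description, cause=None, reference=""):
    """Build the wikitext for one list entry.

    `description` stands for "country of citizenship + reason for
    notability"; `cause` is left out of the entry when it is not known.
    """
    parts = [f"[[{name}]]", str(age), description]
    if cause:
        parts.append(cause)
    body = ", ".join(parts)
    return f"*{body}.<ref>{reference}</ref>"


print(format_entry(
    "Name",
    70,
    "country of citizenship + reason for notability",
    cause="cause of death",
    reference="Reference citing the date of death",
))
```

Keeping the cause of death optional mirrors the "(if known)" qualifier in the entry structure.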

Compiling the list of entry candidates

But before I could even start adding entries, I had to figure out which entries were eligible for addition. Who had died on a given date?[5] I tried using some of the advanced search queries available in Wikipedia. But this still meant running the searches manually and processing the results by hand.

To the rescue: Excel

Screenshot of the Excel tool which generated most of the wikitext

Clearly, this was too much work to do by hand. It was no surprise that these lists were so incomplete and inaccurate. So before I started, I improved the functionality of an Excel tool that could generate most of the wikitext for me.
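The general idea of generating wikitext from tabular data can be illustrated in Python. This is not the Excel tool itself, whose internals are not described here. It is a minimal sketch that groups ready-made entry lines into day sub-sections. The `===day===` heading style and the function name `build_month_wikitext` are assumptions made for illustration.

```python
from collections import defaultdict


def build_month_wikitext(rows):
    """Group (day, entry_wikitext) pairs into day sub-sections.

    Days are emitted in ascending order; entries keep their input order
    within a day. The "===day===" heading style is an assumption.
    """
    by_day = defaultdict(list)
    for day, entry in rows:
        by_day[day].append(entry)

    lines = []
    for day in sorted(by_day):
        lines.append(f"==={day}===")
        lines.extend(by_day[day])
        lines.append("")  # blank line between sub-sections
    return "\n".join(lines).rstrip()


# Placeholder rows, as they might come from a spreadsheet export
rows = [
    (2, "*[[Person C]], ..."),
    (1, "*[[Person A]], ..."),
    (1, "*[[Person B]], ..."),
]
print(build_month_wikitext(rows))
```

Days without any rows produce no sub-section in this sketch, so such gaps would still have to be detected separately.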

  1. ^ Technically the year 1996 was handled in other rounds but it is included here because I forked the dpy into new articles.
  2. ^ Cite error: The named reference sw1 was invoked but never defined (see the help page).
  3. ^ Eventually, I raised this minimum to three entries per day, and also applied similar criteria for the number of references per day and so on.
  4. ^ A reference citing the date of death of Lesley Cunliffe
  5. ^ An alternative method would have been to go through everyone listed in the category of deaths of a specific year. However, this would have required processing the months of a whole year at once. And I would still have to check the bio's for the exact date of death of each person. Also, I discovered that many bio's had incorrect categories for the year of death (and birth).