The Anome is a second-wave Wikipedian.

Interesting reading

To do


  • look at KnotInfo stuff, and in particular why their Alexander polynomials seem to be a factor t different from everyone else's

    Answer from the article: "Since this is only unique up to multiplication by the Laurent monomial ±t^n, one often fixes a particular unique form"


  • Add exotic Unicode math "fonts" to the global titleblacklist



  • Sort mangling of double-blank-line-spaced paras on tag insertion (done)
  • Sort mangling of double-blank-line-spaced paras on tag removal
  • Add Category:Townlands of Northern Ireland by county to the bot input list (done)
  • Rewrite the Anomebot's internal API access library to use Python's "requests" HTTP library under the hood, to allow the use of persistent HTTPS connections
  • Fix bot exclusion stuff: complex spec, painful to implement -- do a simplified version that just detects the {{bot}} tag
  • OSM semantic bridge: see
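For the simplified bot-exclusion check in the list above, a minimal sketch might look like this. It assumes the tag in question is the {{bots}} / {{nobots}} exclusion template and deliberately ignores the allow/deny parameters of the full spec: any such tag on the page means "skip it". The name `has_bot_exclusion_tag` is hypothetical, not part of the existing bot code.

```python
import re

# Assumption: the exclusion templates are spelled {{bots}} or {{nobots}},
# optionally with parameters. Parameters are matched but never interpreted.
BOT_TAG_RE = re.compile(r"\{\{\s*(?:no)?bots\s*(?:\|[^{}]*)?\}\}", re.IGNORECASE)


def has_bot_exclusion_tag(wikitext: str) -> bool:
    """Return True if the page carries any bot-exclusion tag at all."""
    return BOT_TAG_RE.search(wikitext) is not None


if __name__ == "__main__":
    print(has_bot_exclusion_tag("Some text {{nobots}} more text"))
    print(has_bot_exclusion_tag("{{coord missing|Turkey}}"))
```

This errs on the side of caution: a page that explicitly allows the bot would still be skipped, which is the price of not implementing the full spec.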


Short descriptions



Custom political navbox colors

See Category:Navboxes using background colours and Category:Political ideology templates: PetScan

See also: Category:Navboxes using background colours and Category:Political party templates by country: PetScan

Python code to remove color styling from navbox template wikitext: User:The Anome/
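As a rough illustration of the idea (not the code on the linked subpage), the following sketch strips colour declarations from navbox style parameters. It assumes one parameter per line, that the relevant parameters have names ending in "style" (titlestyle, groupstyle and so on), and that only the CSS properties background, background-color and color need removing. The function name `strip_navbox_colors` is hypothetical.

```python
import re

# A template parameter line such as "| titlestyle = background: #fdd;".
STYLE_PARAM_RE = re.compile(r"^(\s*\|\s*\w*style\s*=)(.*)$")

# One CSS declaration setting a foreground or background colour.
# The lookbehind stops "color" matching inside e.g. "border-color".
COLOR_DECL_RE = re.compile(
    r"(?<![\w-])(?:background(?:-color)?|color)\s*:[^;]*;?\s*",
    re.IGNORECASE,
)


def strip_navbox_colors(wikitext: str) -> str:
    """Remove colour declarations from *style parameters, line by line."""
    kept = []
    for line in wikitext.splitlines():
        match = STYLE_PARAM_RE.match(line)
        if match is None:
            kept.append(line)  # not a style parameter: keep untouched
            continue
        remaining = COLOR_DECL_RE.sub("", match.group(2)).strip()
        if remaining:
            kept.append(f"{match.group(1)} {remaining}")
        # A style parameter left with no declarations is dropped entirely.
    return "\n".join(kept)
```

Inline style attributes inside list content, and parameters that share a line, are left alone by this sketch and would need separate handling.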

Discussion: Wikipedia_talk:WikiProject_Templates#Advertising_colors

Articles for creation

Articles for:


Articles for splitting

Geodata to-do


  • Apply the same format-preserving handling to the tag-remover part of the bot

Quality control

  • Look into autodetection of low-resolution geocoding of fine-grained objects (villages, buildings, landmarks...). Ping User:Abductive.


  • Harvest unused lat/long data from {{infobox settlement}}, and replace with {{coord}}: see here for monitoring script.

    Not many articles have this, so this is likely to affect a couple of hundred articles at most. Still, every little helps.

  • Monitor Category:Pages with malformed coordinate tags
  • Why do Republic of Dagestan etc. articles escape the {{coord missing}} sorter?
  • Possible low-hanging fruit for geocoding: the following categories have thousands of non-geocoded articles that are not getting matched by my current software, and may benefit from special-purpose matching heuristics:
    • Category:Brazil articles missing geocoordinate data (was 3000+ articles, now 2,478 as of 2015-03-25) -- ??
      • Note: most of these appear to be rivers -- just matched 500+ of these by translating GNS names
    • Category:Iran articles missing geocoordinate data (13,000+ articles) -- transliteration problems, presumably
      • It looks like a lot of this might be repetition of the same location in multiple places: the bot's code gets 7000+ multi-matches for Iran
      • See also this paper: Freeman, A. T.; Condon, S. L.; Ackerman, C. M. (2006). "Cross linguistic name matching in English and Arabic: a 'one to many mapping' extension of the Levenshtein edit distance algorithm". Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics: 471. doi:10.3115/1220835.1220895.
      • And this: A verified Arabic-IPA mapping for Arabic transcription ...
      • And this:
    • Category:Pakistan articles missing geocoordinate data (~3500 articles) -- transliteration problems, presumably
      • Note: 700+ multimatches from bot code
    • Category:Philippines articles missing geocoordinate data (2000+ articles) -- ??
      • Note: mostly universities, schools, other locatable organizations, very little here looks bot-matchable.
    • Category:Romania articles missing geocoordinate data (7000+ articles)
      • Note: apparently mostly rivers
    • Category:South Korea articles missing geocoordinate data (1700+ articles) -- not sure what's going on here: fixed my FIPS 10-4 mapping, but that doesn't go very far towards fixing the problem
      • Note: insignificant number (< 100) of multimatches
      • This may be a matter of transliteration: McCune–Reischauer vs. Revised Romanization
    • Category:Turkey articles missing geocoordinate data (5000+ articles) -- lots of places with the same name in different regions (e.g. 17 villages all called "Akpınar"), the same problem as was found with Polish placenames. (Also: why is Akçakoca failing to be caught?) The bot code finds 3000+ multi-matches for Turkey.
      • This is also due to non-standard naming conventions in the hierarchy of Turkish article categories: see, for example, Category:Ankara Province.
      • I've now used spatial disambiguation to resolve some 2000+ of these.

Total is over 27,000 possibles: even doing a fraction of these would make a big dent in the backlog.
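The spatial disambiguation mentioned for the Turkish multi-matches could be sketched roughly as follows. This is one plausible approach under stated assumptions, not a description of the bot's actual code: it assumes each same-named gazetteer candidate is a (lat, lon) pair, that a reference point is available for the article (for instance the centroid of the province its category implies), and that a 50 km cutoff is reasonable. The names `haversine_km`, `disambiguate` and `max_km` are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def disambiguate(candidates, ref_lat, ref_lon, max_km=50.0):
    """Return the only candidate within max_km of the reference point.

    Returns None when no candidate, or more than one, is that close,
    so an ambiguous multi-match stays unresolved rather than guessed.
    """
    nearby = [
        (lat, lon)
        for lat, lon in candidates
        if haversine_km(lat, lon, ref_lat, ref_lon) <= max_km
    ]
    return nearby[0] if len(nearby) == 1 else None


if __name__ == "__main__":
    # Made-up coordinates for three same-named places, for illustration only.
    same_name = [(40.0, 33.0), (38.0, 27.0), (41.0, 36.0)]
    print(disambiguate(same_name, ref_lat=39.9, ref_lon=32.9))
```

Refusing to pick when two candidates fall inside the radius keeps false positives down, at the cost of leaving some articles in the backlog.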

Tools of interest

AI scenarios

From , the following list of AI/logic problem scenarios:

The Baby Scenario, the Bus Ride Scenario, the Chess Board Scenario, the Ferryboat Connection Scenario, the Furniture Assembly Scenario, the Hiding Turkey Scenario, the Kitchen Sink Scenario, the Russian Turkey Scenario, the Stanford Murder Mystery, the Stockholm Delivery Scenario, the Stolen Car Scenario, the Stuffy Room Scenario, the Ticketed Car Scenario, the Walking Turkey Scenario, and the Yale Shooting Anomaly.

We should have articles on all of these that meet the notability criteria. -- The Anome (talk) 15:10, 1 March 2013 (UTC)

Given it's been over two years since this suggestion and that there is still an article on the Yale Shooting Anomaly by that name, it seems fair for you to go ahead and create them. Unless you already tried and were prevented. In which case it might be helpful to state why, here or on a page linked from here. Fallacies and dilemmas and toy problems from AI do seem to be important enough to report, certainly more so than every minor character from The Simpsons. Though perhaps some of these have acquired other names since that article you cite? In which case you might consider redirects from all those names. — Preceding unsigned comment added by (talkcontribs) at 19:22, 7 July 2015 (UTC)


Character blacklists

Work in progress:

[\x{1D400}-\x{1D7FF}]  # characters from Unicode block Mathematical Alphanumeric Symbols
[\x{2100}-\x{214F}]    # characters from Unicode block Letterlike Symbols
[\x{2460}-\x{24FF}]    # characters from Unicode block Enclosed Alphanumerics
[\x{1F100}-\x{1F1FF}]  # characters from Unicode block Enclosed Alphanumeric Supplement
[\x{FF00}-\x{FFEF}]    # characters from Unicode block Halfwidth and Fullwidth Forms
[\x{2580}-\x{259F}]    # characters from Unicode block Block Elements
[\x{2500}-\x{257F}]    # characters from Unicode block Box Drawing
[\x{1D00}-\x{1D7F}]    # characters from Unicode block Phonetic Extensions
[\x{0250}-\x{02AF}]    # characters from Unicode block IPA Extensions

See this diff for some usernames using these characters, and this diff for adding these to AmandaNP's bot. See also meta:Talk:Title blacklist for global discussion.
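To try the work-in-progress ranges above against sample titles or usernames, they can be translated into a single Python character class. This is only a testing sketch: the PCRE-style \x{...} escapes become \u / \U escapes, and the function name `find_blacklisted` is hypothetical.

```python
import re

# The same nine ranges as above, combined into one character class.
BLACKLIST_RE = re.compile(
    "["
    "\U0001D400-\U0001D7FF"  # Mathematical Alphanumeric Symbols
    "\u2100-\u214F"          # Letterlike Symbols
    "\u2460-\u24FF"          # Enclosed Alphanumerics
    "\U0001F100-\U0001F1FF"  # Enclosed Alphanumeric Supplement
    "\uFF00-\uFFEF"          # Halfwidth and Fullwidth Forms
    "\u2580-\u259F"          # Block Elements
    "\u2500-\u257F"          # Box Drawing
    "\u1D00-\u1D7F"          # Phonetic Extensions
    "\u0250-\u02AF"          # IPA Extensions
    "]"
)


def find_blacklisted(title: str) -> list:
    """Return the code points (as 'U+XXXX') of blacklisted characters in title."""
    return [f"U+{ord(ch):04X}" for ch in BLACKLIST_RE.findall(title)]


if __name__ == "__main__":
    print(find_blacklisted("Plain title"))
    print(find_blacklisted("\U0001D400\U0001D41B\U0001D41C"))  # bold "Abc"
```

Reporting the offending code points, rather than a bare yes/no, makes it easier to see which block is responsible for a false positive, such as a legitimate ™ from Letterlike Symbols.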