User:The Anome

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 74.76.208.129 (talk) at 19:40, 5 November 2016. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

{{Sockpuppet|JarlaxleArtemis}} The Anome is a second-wave Wikipedian.

The Anome abides.

Interesting reading

To do

Geodata to-do

...

  • Monitor Category:Pages with malformed coordinate tags
  • Why do Republic of Dagestan etc. articles escape the {{coord missing}} sorter?
  • Possible low-hanging fruit for geocoding: the following categories have thousands of non-geocoded articles that are not getting matched by my current software, and may benefit from special-purpose matching heuristics:
    • Category:Brazil articles missing geocoordinate data (was 3000+ articles, now 2,478 as of 2015-03-25) -- ??
      • Note: most of these appear to be rivers -- just matched 500+ of these by translating GNS names
    • Category:Iran articles missing geocoordinate data (13,000+ articles) -- transliteration problems, presumably
      • It looks like a lot of this might be repetition of the same location in multiple places: the bot's code gets 7000+ multi-matches for Iran
      • See also this paper: Freeman, A. T.; Condon, S. L.; Ackerman, C. M. (2006). "Cross linguistic name matching in English and Arabic: a "one to many mapping" extension of the Levenshtein edit distance algorithm". Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics: 471. doi:10.3115/1220835.1220895.
      • And this: A verified Arabic-IPA mapping for Arabic transcription ... http://eprints.whiterose.ac.uk/79653/1/brierley14jss.pdf
      • And this: http://geonames.nga.mil/gns/html/romanization.html
    • Category:Pakistan articles missing geocoordinate data (~3500 articles) -- transliteration problems, presumably
      • Note: 700+ multimatches from bot code
    • Category:Philippines articles missing geocoordinate data (2000+ articles) -- ??
      • Note: mostly universities, schools, other locatable organizations, very little here looks bot-matchable.
    • Category:South Korea articles missing geocoordinate data (1700+ articles) -- not sure what's going on here: fixed my FIPS 10-4 mapping, but that doesn't go very far towards fixing the problem
      • Note: insignificant number (< 100) of multimatches
      • This may be a matter of transliteration: McCune–Reischauer vs. Revised Romanization
    • Category:Turkey articles missing geocoordinate data (5000+ articles) -- lots of places with the same names but in different regions (e.g. 17 villages all called "Akpınar"), the same problem as was found with Polish placenames. (Also: why is Akçakoca failing to be caught?) The bot code finds 3000+ multi-matches for Turkey.
      • This is also due to non-standard naming conventions for the hierarchy of Turkish article categories: see, for example, Category:Ankara Province.
      • I've now used spatial disambiguation to resolve some 2000+ of these.
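
The notes above do not say how the spatial disambiguation works, so the following is only a minimal sketch of one plausible approach, not the bot's actual code. It assumes that each same-named gazetteer candidate comes with coordinates and that the article's parent region (for example, its province) has a known reference point. The function names, the 100 km threshold and the candidate coordinates are all hypothetical, chosen purely for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def disambiguate(candidates, reference, max_km=100.0):
    """Pick the one candidate lying within max_km of the reference point.

    candidates: list of (name, lat, lon) tuples sharing the same name.
    reference:  (lat, lon) of the region the article is believed to be in.
    Returns the unique nearby candidate, or None when zero or several
    candidates are within range (the match is still ambiguous).
    """
    ref_lat, ref_lon = reference
    nearby = [c for c in candidates
              if haversine_km(c[1], c[2], ref_lat, ref_lon) <= max_km]
    return nearby[0] if len(nearby) == 1 else None


# Hypothetical multi-match: three same-named villages with made-up coordinates.
candidates = [
    ("Akpınar", 40.0, 33.0),
    ("Akpınar", 37.0, 30.0),
    ("Akpınar", 41.0, 36.0),
]
# Approximate reference point for the parent province (illustrative only).
reference = (39.93, 32.86)

match = disambiguate(candidates, reference)
```

Returning None for anything other than exactly one nearby candidate keeps the sketch conservative: an ambiguous case is left for manual review rather than geocoded wrongly.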

The total is over 27,000 possibles: even doing a fraction of these would make a big dent in the backlog.
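
For the transliteration problems suspected in the Iran, Pakistan and South Korea categories, a fuzzy name comparison may help. The sketch below is an assumption about how that could look, not the current software: it uses the plain Levenshtein edit distance after stripping diacritics and case, and it does not implement the "one to many mapping" extension described in the Freeman et al. paper. The helper names and the tiny gazetteer are hypothetical.

```python
import unicodedata


def normalize(name):
    """Case-fold a name and drop combining marks (e.g. 'ç' becomes 'c')."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def levenshtein(a, b):
    """Classic edit distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        current = [i]
        for j, ch_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                  # delete ch_a
                current[j - 1] + 1,               # insert ch_b
                previous[j - 1] + (ch_a != ch_b)  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(article_title, gazetteer_names, max_distance=1):
    """Return every gazetteer name within max_distance edits of the title.

    More than one result corresponds to a multi-match that still needs
    disambiguating by some other means.
    """
    target = normalize(article_title)
    return [name for name in gazetteer_names
            if levenshtein(target, normalize(name)) <= max_distance]


# Hypothetical gazetteer entries; "Pusan" is the McCune–Reischauer spelling of "Busan".
gazetteer = ["Pusan", "Akcakoca", "Seoul"]
busan_hits = fuzzy_matches("Busan", gazetteer)
akcakoca_hits = fuzzy_matches("Akçakoca", gazetteer)
```

One limitation of this normalization: characters with no Unicode decomposition, such as the Turkish dotless "ı", pass through unchanged and would need an explicit mapping table.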

AI scenarios

From http://plato.stanford.edu/entries/logic-ai/ , the following list of AI/logic problem scenarios:

The Baby Scenario, the Bus Ride Scenario, the Chess Board Scenario, the Ferryboat Connection Scenario, the Furniture Assembly Scenario, the Hiding Turkey Scenario, the Kitchen Sink Scenario, the Russian Turkey Scenario, the Stanford Murder Mystery, the Stockholm Delivery Scenario, the Stolen Car Scenario, the Stuffy Room Scenario, the Ticketed Car Scenario, the Walking Turkey Scenario, and the Yale Shooting Anomaly.

We should have articles on all of these that meet the notability criteria. -- The Anome (talk) 15:10, 1 March 2013 (UTC)

Given it's been over two years since this suggestion and that there is still an article on the Yale Shooting Anomaly by that name, it seems fair for you to go ahead and create them. Unless you already tried and were prevented. In which case it might be helpful to state why, here or on a page linked from here. Fallacies and dilemmas and toy problems from AI do seem to be important enough to report, certainly more so than every minor character from The Simpsons. Though perhaps some of these have acquired other names since that article you cite? In which case you might consider redirects from all those names.

Filters