User talk:The Anomebot2/Archive 1

From Wikipedia, the free encyclopedia

I am a computer program, so there's not much point in talking to me. However, you might want to talk to my owner, User:The Anome at User talk:The Anome.

Latest improvements

  • I have now improved the graph traversal code to deal better with the special case of territories of other countries which have their own ISO 3166 2-character code. For example Aruba, which is a constituent country of the federacy of the Netherlands, but is autonomous enough to have its own code, AW, will now be listed as within AW, not NL. This is currently only of theoretical interest, since these entries are currently eliminated by other heuristics, but when I get round to sorting them out, everything should work correctly. -- The Anome 18:11, 20 August 2006 (UTC)[reply]
  • I have added extra code to stop the addition of lat/long tags to articles which already contain OSGB tags: UK articles added by the bot from Aberaeron to Biggleswade did not use this check, and may therefore contain duplicate geotags. -- The Anome 18:36, 24 August 2006 (UTC)[reply]
Hi The Anome. Maybe it's been fixed by now, but there are cases where title co-ordinates are being duplicated such as in this edit.--85.158.139.99 (talk) 16:04, 16 May 2008 (UTC)[reply]
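
Editorial illustration of the ISO 3166 handling described in the first bullet above: a minimal sketch, not the bot's actual code (which is not shown on this page). The containment table, the code table and the function name `region_code` are all hypothetical; the idea is simply to walk up from a place towards its parent territories and stop at the first one that has its own two-letter code.

```python
# Hypothetical containment graph: each entity points to its parent territory.
PARENT = {
    "Oranjestad": "Aruba",
    "Aruba": "Netherlands",
    "Amsterdam": "Netherlands",
}

# Hypothetical table of entities that have their own ISO 3166 two-letter code.
ISO_CODE = {
    "Aruba": "AW",
    "Netherlands": "NL",
}


def region_code(entity):
    """Walk up the containment graph and return the nearest ISO code found."""
    seen = set()
    while entity is not None and entity not in seen:
        seen.add(entity)  # guard against cycles in bad data
        if entity in ISO_CODE:
            return ISO_CODE[entity]  # stop at the first territory with a code
        entity = PARENT.get(entity)
    return None  # no coded ancestor found
```

With this ordering a place in Aruba resolves to AW before the walk ever reaches NL, which is the behaviour the comment describes.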

Australian towns

  • Noticing your excellent work on adding coords to Australian towns. You are doing a great job :-) Thanks--Arktos talk 20:20, 19 August 2006 (UTC)[reply]

Progress so far...

Progress so far:

GNS/category fusion dataset:

  • inspected 17424 articles that were uniquely taggable and not recorded as being geotagged in a recent dump
  • added geodata tags to 13290 of them: the others were already geotagged in some other way, or otherwise untaggable, and were not updated

de:/en: collation dataset:

  • Pending for inspection, 8240 articles recorded as having geotags in de: but not in en: -- a subset of these articles now dumped
  • Double-check: early versions did not spot {{geodis}}. Only a few pages got marked by mistake: fixed by hand
  • dumping second run of ~1600 articles: towns, cities, lakes and islands only, degrees-and-minutes resolution only

-- The Anome 10:50, 8 September 2006 (UTC)[reply]

  • Now doing a new pass over recently-added and recently-categorized articles, based on the 20061104 dump set. -- The Anome 01:50, 20 November 2006 (UTC)[reply]
  • Adding various new feature types: bays, glaciers, fjords, volcanoes, etc. -- The Anome 02:01, 21 November 2006 (UTC)[reply]
    • New: total Anomebot2 edit count is 23304 geotags added.

forests:

  • I tried matching forests with GNS entries: there do not appear to be enough unambiguously-identifiable and well-tagged forest articles in en: to match anything

combined multilingual dataset:

  • I'm currently scanning about 14,000 records generated from the multilingual CSV dataset from de:Wikipedia:WikiProjekt_Georeferenzierung/Wikipedia-World/en
  • On the basis of a bit of informal inspection, about one third of entries seem to be new, so this will probably result in 4,000-5,000 new geotags being added

-- The Anome 00:56, 30 December 2006 (UTC)[reply]

To do:

  • add mountain passes to the "mountain" class, prior to the next database dump analysis.
  • get rid of instances of (rarely used) template {{mmuk maphot}}, either by replacing with dms geotags, or by using a standard osgb template

-- The Anome 22:43, 31 December 2006 (UTC)[reply]

Interwiki link sorting

I beg you to change the sorting order or disable this ability. It is frustrating not to find Suomi before Svenska where it logically belongs because it is moved before Français (see Jyväskylä for example). Not too many readers are familiar with the language ISO codes; they most likely read the list alphabetically. Yeah, I know, there's no consensus on how the links should be listed, but still. Anyway, thanks for the coordinates thingy.--JyriL talk 12:40, 22 August 2006 (UTC)

The problem is that no-one can agree on the correct order. Some like them in English-language name order (in which case suomi should be sorted under 'F'), others in native-language name order (in which case, the Albanian language should be sorted under 'S', and where do you sort 中文, relative to Latin alphabets, in that case: under H or Z, or before or after all Latin characters, and if so why -- perhaps before, because their script's older, for example? -- or use raw Unicode collating order? And what about Indian or African languages, some of which have sounds not expressible in English: where would you put the !Kung language in the sorting order?) Rather than trying to reorder the tags in every article, which would result in a mess of different conventions, partial combinations of both, and no order at all, and massive database churn as competing attempts are made to impose the "correct" order on millions of interwikis within articles, it would be better to sort this out in the page-rendering code.
So, given that I have to put them in some sort of order when I move them to the bottom of the article, I've chosen to sort them in ISO code order, which is neatest in the source text, and this has the advantage of annoying both the big-endians and the little-endians equally, and so of not taking sides in this controversy. Thus the cosmic balance is preserved.
Seriously, since this should be a page-rendering, rather than an article-formatting issue, why not file a bug in the MediaWiki Bugzilla about this? The sorting code would be trivial, and could be made configurable for those people who really, really care about the ordering of the interwiki tags one way or another. -- The Anome 23:37, 22 August 2006 (UTC)[reply]
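
Editorial illustration of the two orderings discussed in this thread: a minimal sketch using a hand-picked sample of links. The list `LINKS` and both function names are hypothetical, and neither function is claimed to be the bot's or MediaWiki's code.

```python
# Sample interwiki links as (language code, native language name) pairs.
LINKS = [
    ("sv", "Svenska"),
    ("fi", "Suomi"),
    ("fr", "Français"),
    ("de", "Deutsch"),
]


def by_code(links):
    """Sort by ISO language code, the order the bot is said to use."""
    return sorted(links, key=lambda link: link[0])


def by_native_name(links):
    """Sort by native-language name, the order readers see in the sidebar."""
    return sorted(links, key=lambda link: link[1].casefold())
```

In code order "fi" lands before "fr", so Suomi appears ahead of Français; in native-name order Suomi sits next to Svenska. Native-name sorting is only well defined here because every sample name is in the Latin alphabet, which is the difficulty raised above for 中文.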

Geodata

What is the purpose of the geodata you have added to some Irish towns like Enniskerry and Blessington? I don't see any result on the page. Cheers ww2censor 14:37, 23 August 2006 (UTC)[reply]

It's displayed at the top of the article, to the right of the title. -- The Anome 18:32, 24 August 2006 (UTC)[reply]

Wonderful Bot

This bot has done great so far! How often do crazy people go and press the emergency shutoff? Can you add locations for Anjouan please? Keep it up! Felixboy 13:49, 25 August 2006 (UTC)[reply]

Thanks! I've fixed Anjouan by hand, for now, but there are many, many more candidate geodata tags of various classes yet to be added, so watch this space. -- The Anome 22:37, 24 August 2006 (UTC)

So do you do like whole regions at once and not move on until that area is a lot better or do you wake up every day and say what the heck let's edit here? Felixboy 14:59, 28 August 2006 (UTC)

Not so much regions as classes of feature, but, pretty much, yes. I'm progressively improving my filtering code, and each time I add another set of filter criteria, I manually QA a sample before proceeding with the entire dump. So far I've mostly steered clear of U.S. features because they are not in the GNS -- they are in GNIS, instead -- so I may do the U.S. after I've done the rest of the world. -- The Anome 10:54, 8 September 2006 (UTC)[reply]

Stevenston / Ardrossan

Your bot has given Stevenston a latitude of 55 38 N and Ardrossan 55 37 N although Ardrossan actually lies to the NORTH of Stevenston, with Saltcoats in between.

The difference is small, however. Just seems v odd to me. --NSH001 00:04, 25 August 2006 (UTC)[reply]

The problem is with the source data: these annotations can only be as accurate as the original NIMA GNS data. Ideally, the coarse lat/long data will be replaced in the long term by higher-resolution data referenced to each country's national geodetic system. -- The Anome 20:40, 25 August 2006 (UTC)

Some things

Hi Anome, nice work. You can also use my CSV data. Many articles in the German Wikipedia have a geotag, but the same article in the English Wikipedia has none. The biggest problem in the English Wikipedia is that many geocoordinates have no "region" and "type". Is it possible for your bot to fix these geotags? It would be a great help. -- Stefan Kühn 19:58, 26 August 2006 (UTC)

I'm doing this now. -- The Anome 10:56, 8 September 2006 (UTC)[reply]

Interwiki Bug

Hi Mr Bot.

I am considering pressing the emergency shutdown, since you mess up the interwiki links. The interwiki links are not alphabetical after language code; they are alphabetical after the language you see in the side bar. Here is an example how you messed up Vättern [1].

Fred-Chess 22:23, 26 August 2006 (UTC)[reply]

I did some checking: as far as I can see, as of the last discussion, which I believe was Wikipedia:Language_order_poll, there was no consensus, but the leading choice was, by a tiny margin, alphabetical by language code order, which is the ordering used by this bot.
I think the lack of a clear consensus points clearly to this being a rendering issue, not an article-formatting issue: otherwise, every time the consensus changed, hundreds of thousands of articles would need to be re-formatted. -- The Anome 23:10, 26 August 2006 (UTC)[reply]
You are arbitrarily changing the formatting of hundreds of thousands of pages. No matter the poll, suomi is put as "s" by several bots, and is so in all pages I have checked, including Venezuela, Bill Gates, Sweden, etc. I suggest you stop changing an accepted practice, since it will now take some time to clean up after you.
Fred-Chess 08:30, 27 August 2006 (UTC)[reply]
I've now changed the bot code to preserve the ordering of existing interwikis -- I'm tempted to have a long argument with you on this, but life's too short to waste on arguing the point. (However, please see my comments above, to see why this issue is almost impossible to resolve: for example, what letter would you sort 中文 under -- C, Z, H, before A, or after Z?) -- The Anome 11:55, 27 August 2006 (UTC)[reply]
Ok I see your point -- sorry if I came across too strongly. But I do think that it is imperative to have interwikis in a uniform manner, and that preference should be given to the most common practice.
I am not familiar with how ZH is usually sorted.
Fred-Chess 12:08, 27 August 2006 (UTC)[reply]

Coordinates of provinces

Hi, I have just realized that "you" have inserted coordinates for the Province of Maputo in Mozambique. Although coordinates are for specific points, I can understand using them for cities and towns, now provinces....can you give a coordinate to an area? Is it the coordinate for the capital, or for the geographic center or for some corner or for a randomly selected point? Could "you" please elaborate? Thank you Teixant 18:52, 10 September 2006 (UTC)

It's purely a representative point for centering maps on. -- The Anome 23:28, 15 September 2006 (UTC)

Bonete

Hi, I saw your bot added the coordinates to Bonete. Where did you get them from? They seem to be wrong. Check the talk page. Good wiking, Mariano(t/c) 08:10, 10 October 2006 (UTC)

Coordinates for Brahmagiri very wrong

The coordinates you added to Brahmagiri are very wrong. Please look into this problem with your system. --BostonMA talk 04:33, 14 October 2006 (UTC)[reply]

The bot-generated coordinates are only given to a precision of a few minutes of arc. This map [2] suggests that there is at least something called Brahmagiri around this location. -- The Anomebot2 22:21, 3 November 2006 (UTC)[reply]
Um, there might be a village or something there named Brahmagiri, however, the article is about a mountain on the border between Kerala and Karnataka, and is probably over 1500 km from whatever it is your map points to. --BostonMA talk 22:45, 3 November 2006 (UTC)[reply]

Multiple place names

I noticed that the bot inserted coordinates for Furnace, Llanelli which were actually those for Furnace, Scotland another of the 4 places called Furnace in the UK. Is there a problem with duplicate place names?--JBellis 19:16, 14 October 2006 (UTC)[reply]

The bot's code goes to some effort to prevent this sort of error from happening by detecting and filtering duplicates, and it has been tested extensively, but it's not infallible. If you've found any similar errors, I'd appreciate hearing about them. -- The Anomebot2 22:15, 3 November 2006 (UTC)[reply]

Joseph Beuys

Hi Anome (and bot)

I took the liberty of reverting your edit to this article. I'm not sure which landmark you intended to tag. Happy editing. Valentinian (talk) / (contribs) 23:25, 29 December 2006 (UTC)[reply]

Interwiki link problem

In this edit the "pdc:" link was moved up and out of the other language links. JonHarder talk 16:40, 2 January 2007 (UTC)[reply]

Bot problem?

Hello again: I notice you've added a message to my bot's talk page, complaining about incorrect coordinates being added, but you haven't told me which coordinates you consider to be a problem. Considering the bot has annotated over 25,000 articles with coordinates, there is, in spite of extensive testing and checking, bound to be the occasional error in any large data set, for which I apologize. However, it's hard for me to fix the problem unless you can tell me which articles you have identified as having been tagged in error, or with bad coordinates. Please let me know, and I will try to fix any problems which may exist. -- The Anome 00:19, 14 January 2007 (UTC)[reply]

Right, I think I've got it. I think the article you were referring to was Skull and Bones, and the location was 4°21′39″N 75°54′27″W / 4.36083°N 75.90750°W / 4.36083; -75.90750, which is in Colombia. The source of this was the German Wikipedia, and it is, I believe, a single-character typo for 44°21′39″N 75°54′27″W / 44.36083°N 75.90750°W / 44.36083; -75.90750, which is (I think) the coordinates of Deer Island, mentioned in the article. You were quite right to remove it: articles should not be geocoded unless they are about a place, and I missed this one in my manual checks to remove non-place articles; the typo in the source just added extra confusion. -- The Anome 00:29, 14 January 2007 (UTC)[reply]
Just to clarify your last question in your comment on my talk page: for just this reason, the bot keeps a log of all the pages it visits, and won't revisit a page it has edited, unless I manually remove that page from its visit log. It also won't add a tag to any page that already has one (it checks for a wide number of geotag variants.) -- The Anome 00:40, 14 January 2007 (UTC)[reply]
By the way, if you're happy with the explanation above, would you like to remove the RfC, and the vandalism warning on the bot's talk page? I'd greatly appreciate it. -- The Anome 00:32, 14 January 2007 (UTC)[reply]
Hi. I raised the RfC just to try to find out what was the correct procedure. At first glance it looked like it was worse than it probably was, so sorry if i overreacted. --Rebroad 00:50, 14 January 2007 (UTC)[reply]
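
Editorial illustration of the coordinate arithmetic in the Skull and Bones example above: a minimal sketch of the standard degrees-minutes-seconds to decimal conversion. The function name `dms_to_decimal` is hypothetical. It shows why dropping a single digit ("4" for "44") moves the point by exactly 40 degrees of latitude.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# The two latitudes discussed above, and the shared longitude.
typo_lat = dms_to_decimal(4, 21, 39, "N")
intended_lat = dms_to_decimal(44, 21, 39, "N")
longitude = dms_to_decimal(75, 54, 27, "W")
```

Rounded to five places these agree with the decimal forms quoted in the comment (4.36083, 44.36083 and -75.90750).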

Hi, I'm a bit perplexed why/where title coordinates were added to Monsoon? This doesn't make any sense, since a monsoon is a phenomenon that occurs in more than one place—not just in India. I have removed the edit.+mwtoews 00:44, 19 January 2007 (UTC)

That's truly weird. Thanks for reverting it. The data added on that pass came from a merge of geodata from a number of other languages, so an editor of one of the other language articles on Monsoon must have added that tag to the corresponding article -- I can't imagine why. I catch most of these by hand, by manually reviewing the article titles in the update data files, but that one clearly slipped through. -- The Anome 17:21, 10 February 2007 (UTC)[reply]

Wrong locations

I've noticed that some of the places you've placed coordinates on are just plain wrong. For example, the coords in Okura, Yamagata point to south-west Tokyo (200 miles south of Yamagata). The coords of Ōgata, Akita are not only more than 400 miles off, but on the wrong island. Just by looking at the latitudes in the edit summaries between 2:18 and 2:37 this morning (JST), it seems that several others are messed up. What did you use as the source? Are non-Japanese locations prone to the same errors by this bot?? Neier 22:38, 27 January 2007 (UTC)[reply]

(See reply on my talk page) -- The Anome 17:18, 10 February 2007 (UTC)[reply]

Shifting stub templates

Hi - I notice that on occasions Anomebot shifts stub templates - for instance, here. I suspect that you've set the bot up so that the coordinates go above the categories and below any templates... the problem with that is that stub templates are meant to go below the categories! Any way of fixing it so that the templates and cats are in the correct order (coor then cat then stub)? Grutness...wha? 07:59, 8 February 2007 (UTC)[reply]

My understanding of the normal formatting of articles is that the order should be main text, stub notices, category tags, interlanguage tags, so this behaviour is by design. One good reason for this is that tags that are not displayed inline, like category and interwiki tags, are snipped from the article text before the rest of the rendering is done. Whitespace around category tags can lead to multiple whitespace lines in the body of the article, leading to excess whitespace in the rendered page between the main text and the stub notice. The ordering given above prevents that. -- The Anome 17:14, 10 February 2007 (UTC)[reply]

New bot pass started

I've now regenerated new bot data files using the GNS/Wikipedia matching/merging program, using the latest en: Wikipedia dump files and the most recent NGA GNS data. Since there have been repeated problems with Japanese locations in each of the last two runs, I have simply dropped all Japanese locations from this dataset after generation. The bot is now running again, inserting the new NGA GNS locations. As ever, I will be performing periodic quality checks during the run. -- The Anome 17:14, 10 February 2007 (UTC)

Update: I have now also removed all entries without region codes: the new, separate, Serbia and Montenegro were not being tagged with ISO 3166 region codes, as the codes were not yet defined when I wrote the program: I'll fix this now... -- The Anome 17:26, 10 February 2007 (UTC)

Now fixed in the match/merge program: I'm regenerating the data now. -- The Anome 17:33, 10 February 2007 (UTC)[reply]

More wrong locations

During my spot-checking of the latest bot additions, I've noticed an increased number of incorrect edits in cases of multiple places with the same name in the United Kingdom when only one of these has been created in Wikipedia. This has only occurred recently, after the creation of a large number of articles for much smaller places in the UK. These need more looking at: the GNS per-country multiple-name check should be catching these, but clearly isn't -- perhaps the GNS coverage on tiny places is not as thorough as I thought. I've just added a simple heuristic to catch most of these, at the cost of some false positives, by stopping any articles with commas in their titles from being annotated.

This may also be what has been happening in Japan (which has also been dealt with by a simple heuristic -- in this case not updating any locations in Japan). I'm considering sub-binning by region, for example, counties in the UK, to attempt to improve disambiguation even further -- this is not as easy as it sounds, and there are several ways of doing this: more thought required. -- The Anome 10:51, 12 February 2007 (UTC)[reply]

This is an automated message to all bot operators

Please take a few moments and fill in the data for your bot on Wikipedia:Bots/Status. Thank you. Betacommand (talk • contribs • Bot) 19:50, 12 February 2007 (UTC)

Wrong locations analysis

OK, I've looked into this in more detail. As I suspected, there are two independent sources of error:

  • Firstly, the GNS data is less comprehensive than I thought: it only has one Colby in the UK, for example, when there are actually three. This stops one of the duplicated-name checks from working, in cases where only one of the several possible names has been created, preventing the duplicate article check (which works on suffix-stripped names) from catching it. I can add an extra check for this, possibly by looking to see if a disambiguation page with the base name exists, and rejecting the name if it does; there will be some false positives, but this should reject quite a number of possible errors.
  • Secondly, the GNS same-name check was not catching some real duplicates in the GNS, because of differences in orthography: for example, two places called Okura in Japan were listed in the GNS, once as local name "Okūra" with Latin name "Okura", and once with both names the same, "Okura". I thought I was catching these by checking for variant names in the GNS: I'll have to be more aggressive, by using both forms listed in the single record, and by also using the name generated by accent-stripping the local form.

As stated above, both forms of error are currently mitigated by excluding names with commas (a sign of disambiguation), and eliminating articles about places in Japan (to avoid the orthography problem). I'll implement more fine-grained fixes later, prior to the next pass of the match/merge analysis program. -- The Anomebot2 15:31, 13 February 2007 (UTC)
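
Editorial illustration of the checks discussed in this section: a minimal sketch combining the comma heuristic, a unique-name test against the gazetteer, and the proposed disambiguation-page test. The function name, its parameters and the sample data in the usage note are hypothetical; the real match/merge program is not shown on this page and will differ.

```python
from collections import Counter


def taggable_titles(titles, gns_names, disambiguation_pages):
    """Return the titles that pass three conservative ambiguity checks.

    titles: candidate article titles
    gns_names: names of all gazetteer records for one country
    disambiguation_pages: base names that have a disambiguation page
    """
    name_counts = Counter(gns_names)  # how many records share each name
    accepted = []
    for title in titles:
        if "," in title:
            continue  # a comma suggests a disambiguated title
        if name_counts[title] != 1:
            continue  # name is absent from, or duplicated in, the gazetteer
        if title in disambiguation_pages:
            continue  # other places of this name exist outside the gazetteer
        accepted.append(title)
    return accepted
```

For example, with titles `["Colby", "Furnace, Llanelli", "Biggleswade", "Okura"]`, gazetteer names `["Colby", "Biggleswade", "Okura", "Okura", "Furnace"]` and an assumed disambiguation set `{"Colby"}`, only "Biggleswade" survives. All three checks err towards rejecting, which matches the stated trade-off of accepting some false positives.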

Update: I've now fixed the second problem by getting all the name variant forms from the GNS record, and ignoring accents when making exclusion checks, but not when making inclusion checks.
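
Editorial illustration of "ignoring accents when making exclusion checks, but not when making inclusion checks": a minimal sketch using Python's standard `unicodedata` module. The function names are hypothetical and this is one possible way to do it, not necessarily the way the bot does.

```python
import unicodedata


def strip_accents(name):
    """Remove combining marks, so that "Okūra" becomes "Okura"."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def names_collide(a, b):
    """Exclusion check: treat names as the same place name if they differ
    only by accents or letter case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()


def names_match(a, b):
    """Inclusion check: require an exact match before tagging."""
    return a == b
```

The asymmetry is deliberate: a loose comparison when looking for reasons to reject a name, and a strict one when deciding to accept it.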

And I've just realized that some of the problems with the Japanese articles are to do with Japanese naming conventions: Ogata-machi, Ogata-mura, Ogata-cho, (also Ōgata-cho, Ōgata-chō...), are all different places listed separately in the GNS, but could all be called "Ogata" in English. I think this may need special-case suffix-stripping for Japanese names, in the same code path as the accent-stripping code just added. -- The Anome 10:37, 16 February 2007 (UTC)[reply]

Japanese place suffixes

Japanese place suffixes include:

-cho
-fu
-gun
-horo
-ichi
-juku
-ken
-koku
-ku
-kyo
-machi
-mura
-shi
-shu
-shuku
-son

...and there appear to be yet more: -ichiokacho, -ichiba, etc, etc. Probably best for now just to leave Japanese place names out entirely. -- The Anome 13:38, 16 February 2007 (UTC)[reply]
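
Editorial illustration of the special-case suffix-stripping idea: a minimal sketch that uses only the suffixes listed above and only handles hyphenated romanized names. The function name `base_name` is hypothetical. As the comment above notes, the list is incomplete, so this is not a reliable disambiguator.

```python
import unicodedata

# Suffixes taken from the list above; known to be incomplete.
SUFFIXES = frozenset({
    "cho", "fu", "gun", "horo", "ichi", "juku", "ken", "koku",
    "ku", "kyo", "machi", "mura", "shi", "shu", "shuku", "son",
})


def base_name(name):
    """Strip accents, then drop one trailing "-suffix" if it is a known one."""
    plain = "".join(
        ch for ch in unicodedata.normalize("NFD", name)
        if not unicodedata.combining(ch)
    )
    stem, hyphen, suffix = plain.rpartition("-")
    if hyphen and suffix.lower() in SUFFIXES:
        return stem
    return plain  # no hyphen, or an unrecognised suffix: leave unchanged
```

Ogata-machi, Ogata-mura and Ōgata-chō all reduce to "Ogata", which is exactly why they cannot safely be told apart by name alone, while a form such as "Ogata-ichiokacho" passes through unchanged because its suffix is not in the list.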

Coming soon...

A new global dump is now available from de:Wikipedia:WikiProjekt Georeferenzierung, dated 2007-02-07. I'll analyze the interwiki geodata in this, and then feed the resulting updates to the bot, after the current GNS geodata pass has finished. -- The Anomebot2 15:31, 13 February 2007 (UTC)[reply]

Now partially done. Some of the interwiki data has geotags for things like companies, typically geotagging their headquarters building, something I don't think is appropriate, so for now I've filtered these down to only include entries that have geotag types defined, where the type is city, isle, waterbody, or mountain. -- The Anome 10:20, 16 February 2007 (UTC)[reply]
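
Editorial illustration of the type filter just described: a minimal sketch that keeps only records whose geotag type is one of the four named above. The record layout (dictionaries with `title` and `type` keys) and the function name are assumptions made for the example.

```python
# The four geotag types named in the comment above.
ALLOWED_TYPES = frozenset({"city", "isle", "waterbody", "mountain"})


def filter_by_type(records):
    """Keep records that define a type and whose type is in the allowed set."""
    return [r for r in records if r.get("type") in ALLOWED_TYPES]
```

Records with no type at all are dropped along with those of other types, matching "only include entries that have geotag types defined".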

New subproject

I think I've now pretty much mined out the potential for rapid progress by using GNS tagging and interwiki tagging, since these are now mostly dependent on the rate of progress of category tagging and interwiki tagging on articles with geodata on other-language Wikipedia editions: this can continue in future, but it will be a slow process of organic growth driven by manual tagging. I did a random sample yesterday, and I estimate that about a third of all geotaggable articles on en: are now tagged.

I now intend to start tagging U.S. locations using USGS Geographic Names Information System data. [3] Although a large number of U.S. articles are already tagged, a great many are not. More soon. -- The Anome 08:46, 21 February 2007 (UTC)

I'm just doing a short sample run of GNIS tagging, will review by hand when the toolserver is back up. -- The Anome 00:19, 23 February 2007 (UTC)[reply]

Automated message to bot owners

As a result of discussion on the village pump and mailing list, bots are now allowed to edit up to 15 times per minute. The following is the new text regarding bot edit rates from Wikipedia:Bot Policy:

Until new bots are accepted they should wait 30-60 seconds between edits, so as to not clog the recent changes list and user watchlists. After being accepted and a bureaucrat has marked them as a bot, they can edit at a much faster pace. Bots doing non-urgent tasks should edit approximately once every ten seconds, while bots who would benefit from faster editing may edit approximately once every four seconds.

Also, to eliminate the need to spam the bot talk pages, please add Wikipedia:Bot owners' noticeboard to your watchlist. Future messages which affect bot owners will be posted there. Thank you. --Mets501 05:11, 22 February 2007 (UTC)[reply]
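
Editorial illustration of the edit rates quoted above: a minimal throttling sketch. The function names and the `approved`/`urgent` flags are hypothetical, and the lower bound of the 30-60 second range is used for unapproved bots.

```python
import time


def min_interval(approved, urgent):
    """Seconds to wait between edits under the quoted policy text."""
    if not approved:
        return 30.0  # lower end of the 30-60 second range for new bots
    return 4.0 if urgent else 10.0


def run_edits(edits, interval, sleep=time.sleep):
    """Call each edit function in turn, pausing `interval` seconds between them."""
    for index, edit in enumerate(edits):
        if index:
            sleep(interval)  # no need to wait before the first edit
        edit()
```

One edit every four seconds works out to the 15 edits per minute mentioned in the message. The `sleep` parameter exists only so the loop can be exercised without real delays.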

Data source

While I have no reason to doubt the data, there is no source reference added as part of the text inserted by the bot, nor is there any explanation here as to where the data comes from. While I can check every edit made by the bot to the many municipal pages on my watchlist, it would serve the entire Wikipedia community if a source were made available to allow editors to more readily verify the additions. Alansohn 19:27, 23 February 2007 (UTC)

The U.S. data currently being added by the bot is entirely from the GNIS database. -- The Anome 19:52, 23 February 2007 (UTC)[reply]
Update: I have now started source-coding the bot-generated tags, and documented the format on the Wikipedia:WikiProject Geographical coordinates page. -- The Anome 11:50, 14 March 2007 (UTC)[reply]

Geo microformat

You might be interested in the Geo microformat, Wikipedia Project Microformats and this attempt to apply geo in Wikipedia (currently held up by the decimals vs. deg-min-sec issue; discussion here). Andy Mabbett 15:06, 13 March 2007 (UTC)[reply]

Progress to date

As of March 2007, this bot has now added over 44,000 standard geotag records to the English-language Wikipedia, from GNS, GNIS, OSGB coordinates in UK articles, and geotag data from other-language Wikipedias. -- The Anome 11:42, 14 March 2007 (UTC)[reply]

New subproject?

Places in China are currently very poorly covered on the en: Wikipedia, partly because of the lack of access from the PRC limiting the number of editors available to do the work, and partly because of the inability of most en: editors to read Chinese. This can be seen by the very low density of geodata points in China, in spite of it being one of the most populous countries in the world. If I can, I'll try to do some machine-work to help improve this. -- The Anome 13:05, 14 March 2007 (UTC)[reply]

Unneeded coords for dissolved municipalities

This is a bit tricky, so please bear with my explanation.

Japan is currently undergoing lots of mergers and dissolution of municipalities to save money on administration budgets. Towns and cities from a year ago may no longer exist today.

A town that no longer exists should probably not have coordinates listed (for instance, non-extant cities should not be shown in the Wikipedia layer in Google Earth). It seems that the Anomebot2 doesn't do Japanese locations at this point in time, but for the future I have this suggestion: Could you check to make sure the article is not listed in Category:Dissolved municipalities of Japan before adding geodata? Amake 14:16, 23 March 2007 (UTC)[reply]

As you say, Japanese locations are currently not being touched by the bot, so this should not arise at the moment. In the longer term, I think this needs more consideration, before we can come up with a consistent policy about geolocation of placenames that no longer exist. -- The Anome 08:49, 3 April 2007 (UTC)[reply]
I think that coordinates should be included, for anywhere which you could point to on a map; that includes former towns and dissolved municipalities. Andy Mabbett 09:50, 3 April 2007 (UTC)[reply]
I think that would create lots of redundancy and confusion for visualization services like Google Earth. If a town no longer exists then it doesn't need coordinates. If you want to see where it was, follow a link to the current municipality and check its coordinates. Amake 23:39, 3 April 2007 (UTC)[reply]
Perhaps the old names should, in most cases, redirect to the name of the new municipality, with perhaps a mention in the history section of that article. This would completely sort this problem, since redirects don't get geotagged by the bot. -- The Anome 00:37, 5 April 2007 (UTC)[reply]
While that's certainly an interesting possibility, you can also argue that dissolved municipalities are "notable" enough to be their own articles, in the context of "political reform/reorganization in Japan." There's also the fact that it would be quite an undertaking to change all the articles, and would require a lot of debate before gaining any consensus. Amake 01:48, 5 April 2007 (UTC)[reply]
Any such redundancy would be for Google Earth to resolve; we should concentrate on building this resource. Andy Mabbett 02:07, 5 April 2007 (UTC)[reply]
Yes, but the fact remains that these places no longer exist. This bot just re-added geodata to a bunch of dissolved municipalities (Seto, Ehime, Hakata, Ehime, Ikina, Ehime to name just a few) that I'll have to go revert now, because the equivalent article in the es Wikipedia has geodata. PLEASE PLEASE PLEASE have the bot check for Category:Dissolved municipalities of Japan before adding geodata for Japanese articles! Or at least properly prevent the bot from touching articles about locations in Japan. Amake 13:03, 12 April 2007 (UTC)[reply]
It now does so. Thanks. -- The Anome 09:29, 1 June 2007 (UTC)[reply]
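
Editorial illustration of the category check requested in this thread: a minimal sketch that looks for the category link in an article's wikitext. The regular expression and function name are hypothetical, and the bot's actual check may work differently (for example, from dump category tables).

```python
import re

# Matches [[Category:Dissolved municipalities of Japan]] with an optional
# sort key, tolerating extra spaces and letter-case differences.
DISSOLVED = re.compile(
    r"\[\[\s*Category\s*:\s*Dissolved municipalities of Japan\s*(\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def is_dissolved_municipality(wikitext):
    """Return True if the article is in the dissolved-municipalities category."""
    return DISSOLVED.search(wikitext) is not None
```

A caller would skip geotagging whenever this returns True.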

India info box

Some of the India related articles already have coordinates filled in the info box (see Template:Infobox Indian Jurisdiction). These coordinates are displayed at the top corner of the page. Does this bot check this before updating the coordinates in such pages? If not, the page would display the coordinates twice. --(Sumanth|Talk) 11:19, 4 April 2007 (UTC)

Yes, it does several checks for pre-existing geodata, and it should detect the coordinates in these articles, and not mark them redundantly. -- The Anome 00:29, 5 April 2007 (UTC)[reply]
Thanks. I can safely ignore Anomebot2 changes in my watchlist. --(Sumanth|Talk) 03:35, 5 April 2007 (UTC)[reply]
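
Editorial illustration of a pre-existing geodata check: a minimal sketch that scans wikitext for a few template patterns. The pattern list is an assumption made for the example: `coor ...` covers templates such as the {{coor title dm}} tag mentioned on this page, `coord` is a commonly used coordinate template, and the infobox parameter name `latd` is assumed rather than confirmed here. The bot is said to check "a wide number of geotag variants", so its real list is certainly longer.

```python
import re

# Hypothetical, deliberately short list of patterns that indicate a geotag.
GEOTAG_PATTERNS = [
    re.compile(r"\{\{\s*coor[ _]", re.IGNORECASE),    # e.g. {{coor title dm|...}}
    re.compile(r"\{\{\s*coord\s*\|", re.IGNORECASE),  # e.g. {{coord|...}}
    # Infobox with a latitude parameter filled in (parameter name assumed).
    re.compile(
        r"\{\{\s*Infobox Indian Jurisdiction\b[^}]*\blatd\s*=\s*\d",
        re.IGNORECASE,
    ),
]


def has_geotag(wikitext):
    """Return True if any known geotag pattern appears in the wikitext."""
    return any(pattern.search(wikitext) for pattern in GEOTAG_PATTERNS)
```

An infobox that is present but has no latitude filled in does not count, so such an article would still be eligible for tagging.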

French communes and stations coordinates

Hi, I was wondering if The Anomebot2 was malfunctioning, as all the coordinates I have seen it add have been consistently a mile or two off. I've had to follow it up on several occasions (others I abandoned) and modify its contributions. Captain Scarlet and the Mysterons 09:34, 5 April 2007 (UTC)

Can you give some examples, please? (See below.)
I just spotted Gare de Cornavin: you're right, that's a fair way off. I'll investigate. Do you have any other good examples? -- The Anome

Railway stations

There are 49815 railway stations listed in the GNS database, with many different name formats, such as "Halte XXX", "Gare de XXX", "XXX Station", "XXX Stansiyası", and so on. There are 4409 Wikipedia articles with names ending with "railway station" (and 3 ending with "railroad station"). More to come. -- The Anome 05:46, 6 April 2007 (UTC)[reply]

There's remarkably little correlation between the two datasets. Grr. -- The Anome 20:57, 6 April 2007 (UTC)[reply]
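
Editorial illustration of one way the differing station name formats could be reduced to a common key before matching: a minimal sketch covering only the formats named above. The function name and patterns are hypothetical, and given the poor correlation reported above, name normalization alone is unlikely to be sufficient.

```python
import re

# Prefix and suffix forms taken from the examples in the comment above.
_PREFIX = re.compile(r"^(?:Gare de|Halte)\s+", re.IGNORECASE)
_SUFFIX = re.compile(
    r"\s+(?:railway station|railroad station|Station|Stansiyası)$",
    re.IGNORECASE,
)


def station_key(name):
    """Reduce a station name to a lower-cased base name for comparison."""
    name = _PREFIX.sub("", name.strip())
    name = _SUFFIX.sub("", name)
    return name.casefold()
```

Under this sketch "Gare de Cornavin" and a title of the form "Cornavin railway station" would both reduce to "cornavin".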


wrong geodata

About 10-20 km to south/west from exact position

18:42, 7 April 2007 The Anomebot2 (Talk | contribs) (Adding geodata: {{coor title dm}})

The geodata for Naphegy are wrong. Tamas Szabo 07:36, 8 April 2007 (UTC)

I've now changed it to use the location given by the maps in the article. Thanks for letting me know. -- The Anome 09:40, 11 April 2007 (UTC)[reply]

Your bot made an error when it moved coordinates from a picture caption in dewiki to enwiki as the main coordinates. In fact, adding coordinates to a 1297 km long railway does not make sense to me. --Jklamo 13:16, 11 April 2007 (UTC)

Thanks for letting me know. I've now reverted that edit. -- The Anome 21:20, 11 April 2007 (UTC)[reply]

Your bot made an error by trying to add specific coords to this article, since the islands spread over 17 degrees in longitude and nearly 2000 km in length. Please try to revise the bot to prevent future occurrences of this issue. I see that the previous comment also mentions the same issue in regards to a lengthy railway. Please make sure that the bot only adds coords to articles about localized areas (certainly less than 1 degree in longitudinal or latitudinal extent). Thanks. --Seattle Skier (talk) 19:17, 11 April 2007 (UTC)[reply]

Thanks for letting me know, and reverting that edit. I try hard to catch things like this. Fortunately, in more than 50,000 edits so far, only a small fraction have led to complaints, partly as the result of multiple passes of machine checking and human QA checks of a random sample of output data. Unfortunately, there's no way for the bot to catch this particular error: in the cases of interwiki geotags like this, it's simply relying on the judgment of the human being who added the tag in the source wiki. However, once a mistake has been fixed, the bot will not revisit the page, so that particular mistake will not be repeated. -- The Anome 21:27, 11 April 2007 (UTC)[reply]

Tiny Villages in Ireland

Well done Anome!! I have been recording a string of villages and hamlets in remote areas and your Bot just sweeps along after me adding in the coordinates. Excellent stuff. Regards (Sarah777 13:01, 27 May 2007 (UTC))[reply]

Maybe I'm dense: but exactly what is the charter of this bot?

I happened to notice that Sun Microsystems now has a latitude and longitude on its page, and traced that down to an addition by this bot -- although the idea seems mostly harmless, I was sort of curious what the reason for doing this was.

To my surprise, there's no actual explanation on the user page of what this bot is intending to do. One thing it does say is that it will "automatically ignore articles that appear to be about people or organizations" -- is that different from corporations?

So: can I suggest just a brief description on the user page of why this bot does what it does, and can I also ask whether the Sun entry should be there or not? Thanks,--NapoliRoma 21:13, 27 June 2007 (UTC)[reply]

Thanks for finding that. I would imagine that the location was that of Sun's corporate headquarters, as tagged by the German-language Wikipedia, where this is common practice, and interwiki'd here by the bot. It's not common practice here: the bot is meant to catch this sort of thing by inspecting category tags, but clearly missed this one. This is a bug to be fixed: I'll add it to my list. -- The Anome 05:51, 30 July 2007 (UTC)[reply]
Thanks!--NapoliRoma 14:40, 30 July 2007 (UTC)[reply]
You're welcome. I've now taken a look at the code, and can confirm that I've added code since then which would have caught this special case. -- The Anome 17:31, 30 July 2007 (UTC)[reply]

French communes

Hi, I think there is a problem with this bot and the French communes: all of the coordinates I have seen were wrong. Some examples: [4], [5], [6], [7], [8]... Code-Binaire 10:23, 5 July 2007 (UTC)

Thank you for spotting this! This is an East-West problem: the coordinates given are accurate, apart from a sign error in the East-West coordinate. This is almost certainly caused by faulty parsing of data from the French-language Wikipedia. I'll add it to my list of things to be fixed. -- The Anome 05:47, 30 July 2007 (UTC)[reply]
Update: all of these, and the other similar mis-tagged articles have now been fixed, either automatically or by hand. -- The Anome 07:51, 11 September 2007 (UTC)[reply]
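A sign error of this kind is easy to reproduce when hemisphere letters are mapped to signs. The following is a minimal sketch, not the bot's actual parser: the `deg|min|hemisphere` input format, the `parse_dm` name, and the treatment of the French `O` (ouest) as west are all assumptions made for illustration, and the thread does not say what the real cause was.

```python
import re

# Hypothetical "deg|min|N/S|deg|min|E/W" format, loosely modelled on the
# pipe-separated arguments of coordinate templates.
_PAIR = re.compile(r"(\d+)\|(\d+)\|([NS])\|(\d+)\|(\d+)\|([EWO])")

# Sign for each hemisphere letter. "O" (French "ouest") is an assumption;
# mapping a letter like this to the wrong sign is one way to flip east/west.
_SIGN = {"N": 1, "S": -1, "E": 1, "W": -1, "O": -1}


def parse_dm(text):
    """Return (lat, lon) in signed decimal degrees; south and west are negative."""
    match = _PAIR.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate pair: {text!r}")
    lat_d, lat_m, ns, lon_d, lon_m, ew = match.groups()
    lat = _SIGN[ns] * (int(lat_d) + int(lat_m) / 60)
    lon = _SIGN[ew] * (int(lon_d) + int(lon_m) / 60)
    return lat, lon
```

With a mapping like this, `4|30|W` and `4|30|O` both give a negative longitude, while `4|30|E` gives a positive one; rejecting unknown letters outright, rather than defaulting to a positive sign, makes this class of error visible.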

Mongolia wrong coords

I've corrected your Bayankhongor city coordinates in the article, and added seconds as well. Check your geodatabase. Bogomolov.PL 05:27, 9 July 2007 (UTC)

Unfortunately, your "corrected" coordinates are not valid, still less located in Mongolia, as they have a latitude of > 90°. Perhaps you have transposed latitude and longitude? -- The Anome 16:25, 1 August 2007 (UTC)[reply]
I've now fixed the transposition: however, your coordinates do not match up with those of a number of online map services: perhaps something like 46°41′48″N 100°9′1″E / 46.69667°N 100.15028°E / 46.69667; 100.15028 might be better? -- The Anome 16:39, 1 August 2007 (UTC)[reply]

The transposition was my mistake. Delüün seems to have moved (the settlement changed its location; in a nomadic country this happens), and your coords point to Delüün's former location, so I fixed the situation. —Preceding unsigned comment added by Bogomolov.PL (talk • contribs) 08:31, August 28, 2007 (UTC)
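The two checks discussed in this thread, a latitude that cannot exceed 90° and a likely latitude/longitude swap, can be sketched as follows. This is an illustrative sketch only; the function names are hypothetical and nothing here is taken from the bot's code.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def is_valid(lat, lon):
    """True if the pair lies within the usual latitude and longitude ranges."""
    return -90 <= lat <= 90 and -180 <= lon <= 180


def looks_transposed(lat, lon):
    """True if the pair is invalid as given but valid once the values are swapped."""
    return not is_valid(lat, lon) and is_valid(lon, lat)


# The suggested Bayankhongor values from the thread, 46°41′48″N 100°9′1″E:
lat = dms_to_decimal(46, 41, 48, "N")
lon = dms_to_decimal(100, 9, 1, "E")
print(round(lat, 5), round(lon, 5))   # 46.69667 100.15028
print(looks_transposed(lon, lat))     # True: a "latitude" of about 100° is impossible
```

A swap can only be detected this way when the longitude is larger than 90° in magnitude; for places where both values are below 90°, a transposition passes the range check and needs some other test, such as comparing against the expected country.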

Norwood, Gauteng mistagging notification

Note: the bot recently tagged Norwood, Gauteng with the location (I assume) of a different Norwood, a suburb of Cape Town. I removed the tag and left a note on Talk:Norwood, Gauteng#Coordinate for later addition to Norwood, Western Cape. —Piet Delport 17:09, 15 July 2007 (UTC)

Thanks! I've now created a new article stub for Norwood, Western Cape to carry the coordinates, and double-checked the location. -- The Anome 16:20, 1 August 2007 (UTC)[reply]
As noted there, thank you for the good work. :) —Piet Delport 06:56, 2 August 2007 (UTC)[reply]

What is this? --- Realest4Life 15:22, 15 August 2007 (UTC)[reply]

The result of interwiki-link confusion between the album Trilla, and Trilla, France. Now fixed. -- The Anome 22:43, 24 August 2007 (UTC)[reply]

Hi, your bot assigned coordinates to the Dutch Tweede Kamer. I am not sure whether it should, as the article is about the second house of the Dutch parliament (i.e. more the institution than the location). I reverted it with an edit summary. Arnoutf 16:43, 15 August 2007 (UTC)

Barnstar

The Geography Barnstar
Awarded for helping Wikipedia users find their places in the world. You also steal the articles I want to put coordinates on. ;) SpencerT♦C 02:28, 12 February 2008 (UTC)[reply]

Source code

Is the source code for geotagging process available? Thanks. --Drh08 (talk) 00:06, 19 February 2008 (UTC)[reply]

Unfortunately, no. In its current form, it's not clean enough to release. -- The Anomebot2 (talk) 15:13, 18 March 2008 (UTC)
How about history logs? That is, a log of all articles automatically geotagged? --Drh08 (talk) 20:02, 22 April 2008 (UTC)[reply]
  • I am currently rewriting oscoor_a.htm, the resource called by {{oscoor}}. I don't mind if the source code for your bot is obscenely dirty, I would love to see your code for converting grid refs to WGS84 lat/long. Please. — RHaworth (Talk | contribs) 06:33, 4 January 2009 (UTC)[reply]

Disambiguation pages

Hi, I saw that the bot had added co-ordinates to King, Wisconsin which, as a disambiguation page, presumably should not have co-ordinates. I have reverted the change, but I thought I'd let you know in case there's a bug/shortcoming that needs fixing. (Though as the bot's edit was one year ago, possibly the problem's been fixed in the meantime.)--86.149.49.56 (talk) 13:10, 24 February 2008 (UTC)

I see what it was doing...I added the coordinates to the correct article. SpencerT♦C 23:02, 6 March 2008 (UTC)[reply]
I've now fixed the bot to deal with some new cases of geotags, which should help catch disambiguation pages like this in future. -- The Anome (talk) 15:09, 13 March 2008 (UTC)[reply]

Hmmm....

Can you have the bot remove {{Locateme}} tags from the talk pages of the articles once it adds coordinates? i.e. Caves, Aude. SpencerT♦C 23:35, 17 March 2008 (UTC)[reply]

It would be a real pain to do so: it would slow the bot down by a factor of 2, and it already has a massive workload. I don't think that {{Locateme}} is actually that useful: it's not needed for automatic maintenance, and it's not useful for encouraging drive-by coordinate entry sitting on the talk page, because no-one will see it. For enthusiasts looking to add coordinates, the Maybe-Checker does a much better job.
As I recall it, {{Locateme}} tags were originally on the pages themselves, my bot removed them automatically, and all was well. Then a decision was made that the Locateme tags were to be moved to the talk pages, stopping this from working. Perhaps the people who insisted on putting them there in the first place could write a bot to remove them? -- The Anomebot2 (talk) 14:28, 18 March 2008 (UTC)[reply]
Is there a place where you can put in requests for a bot? I don't really like the Maybe-Checker as much, but that's personal preference. I also don't really like the {{locateme}} tags, though I sometimes use them, but I prefer to find lists of places that don't generally have coordinates, and add them through that. If you want me to go through the bot's contribs for now and remove tags, I can do that. One sec, I'll be back later for more, I need to go. SpencerT♦C 22:06, 18 March 2008 (UTC)
Yes, there is: Wikipedia:Bot requests.
I think {{locateme}} should go, since they are not visible in the article, thus failing to stimulate drive-by geotagging, and tagging articles for human attention which could be tagged by a bot wastes human editors' time.
Instead, I'd like to propose something like a {{geolocation-stub}} template, placed at the bottom of the article and formatted in the same way as other stub notices, saying something like "this article about a location does not have geographic coordinates: please add some". I could then use my bot to add these to only those articles which (a) it can recognize as being about geographic features, and (b) are not currently tagged, and (c) cannot be automatically matched to features listed in the GNS database. -- The Anome (talk) 09:50, 19 March 2008 (UTC)[reply]
I've moved it from there to {{No geolocation}}, since it isn't a stub template. Hope that doesn't cause any problems. Grutness...wha? 00:40, 20 March 2008 (UTC)[reply]

Italy comuni

Apologies if this has been fixed but just FYI - Avise should be around 45°43′N 7°8′E in NW Italy, but on 00:46, 14 February 2007 the bot put it at 45|43|00|N|43|07|16|E in the Caucasus! Incidentally, you shouldn't need to worry too much about the Italian comuni any more, we're getting on top of infoboxing them, and there's a coord in the infobox. FlagSteward (talk) 20:17, 19 March 2008 (UTC)[reply]

Thanks for the catch. Unfortunately, the NGA GEOnet and interwiki data I use as a source is not always accurate, and the bot's matching heuristics are not 100% perfect, so the occasional error like this will slip through now and again. Although I hand-check a random sample of edits, I'm also very grateful for error reports. When I find an error, I try to track it down to find its source, and, if possible, fix the logic that led to it to prevent any recurrences. Unfortunately, this one was added early last year before I added source-tracking in the tags, and I'm not sure what the original data source was. It was probably taken from another Wikipedia language edition via an interwiki link, since Avise is not in the current GNS database. -- The Anomebot2 (talk) 23:22, 19 March 2008 (UTC)

Coordinates from other wikis

I've noticed that the bot routinely adds coordinates taken from the interwikied articles. I am curious as to what checks and balances exist to make sure that these coordinates are correct. It just seems all too easy for a typo in one wiki to propagate to many other wikis with the help of the bot. I haven't seen this happening thus far, but I am nevertheless curious if this potential problem has been given any thought. Cheers,—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 16:15, 24 March 2008 (UTC)

I guess I should have read through the post immediately above :) That answers my question. If you could address the one immediately below, I'd appreciate it.—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 15:07, 25 March 2008 (UTC)
At the moment, the bot only propagates data from other wikis to en:, and not vice versa, and will only edit any article once. In addition, it should not change any article with existing geocode data on en:, and will always prefer GNS data to interwiki data, where both are available for a previously-uncoded article.
In the long run, I hope to be able to work on validating Wikipedia's data against multiple independent sources. -- The Anome (talk) 14:10, 21 April 2008 (UTC)[reply]
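The precedence rules described in this reply (never touch an article that already has geodata, and prefer GNS data over interwiki data) can be sketched as a small selection function. This is a hedged illustration; `choose_coordinates` and its arguments are hypothetical names, not part of the bot.

```python
def choose_coordinates(existing, gns, interwiki):
    """Pick the coordinates to add to an en: article, or None to leave it alone.

    Each argument is either None or a (lat, lon) tuple:
      existing  - geodata already present in the en: article
      gns       - a match from the GNS database
      interwiki - coordinates found in another language edition
    """
    if existing is not None:
        return None                    # never overwrite existing geodata
    if gns is not None:
        return ("GNS", gns)            # GNS is preferred when both are available
    if interwiki is not None:
        return ("interwiki", interwiki)
    return None                        # nothing to add


print(choose_coordinates(None, (45.7, 7.1), (45.7, 43.1)))  # ('GNS', (45.7, 7.1))
print(choose_coordinates((45.7, 7.1), None, (45.7, 43.1)))  # None
```

Returning the source alongside the coordinates reflects the source-tracking mentioned elsewhere on this page, which makes it possible to trace a bad value back to where it came from.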

Also, here is an unrelated bug for you to fix—[9] (note what happened to the interwiki link). Perhaps the bot should stick to fixing the coordinates and leave the cat-stub-iwiki flow alone?—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); 16:23, 24 March 2008 (UTC)[reply]

Thanks for catching that. That's a clear bug in my code: I need to think carefully about how best to fix it. -- The Anome (talk) 14:09, 21 April 2008 (UTC)[reply]

Where to get coordinates for {{coor title dms|||}}?

Greetings! I would like to start using the template {{coor title dms|||}} in some placename articles. What source(s) should I use to find the exact dms coordinates of specific settlements (in Central Asia, say)? Thanks for your attention to this novice's query. --Zlerman (talk) 15:56, 21 May 2008 (UTC)

Slight bug (possibly fixed by now)

Just came across Nam Ngum Dam; it added the coords but didn't remove the locateme tag from the talk, so it was still showing up as needing coords. Cheers, JeremyMcCracken (talk) (contribs) 07:06, 23 May 2008 (UTC)


Lakes

Is there a way your bot could tag some or all of the articles in Category:Wikipedia infobox lake articles without coordinates? --- User:Docu

Something like that is now on the way... see below. -- The Anome (talk) 09:56, 28 May 2008 (UTC)[reply]
In the category, there are about 450 lakes of the US. Some (e.g. Amawalk Reservoir) have a unique name. Maybe GNIS could be used for those. -- User:Docu

Unmatchables

I've now identified roughly 55,000 town/city/village/commune/etc. articles which are candidates for geolocation, but can't currently be resolved by any of my existing automated geolocation tagging/matching programs. A quick test of a random sample of 33 of these articles showed that, after removing articles tagged in other ways and otherwise ineligible articles, 24 of these were taggable as needing location information. This suggests that there are roughly 24/33*55,000 = 40,000 such articles that are currently eligible for tagging in this way. -- The Anome (talk) 09:56, 28 May 2008 (UTC)
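The extrapolation above can be checked in a few lines. The standard-error figure is an addition for illustration, using the ordinary binomial approximation and assuming the 33 articles were a simple random sample; it is not part of the original estimate.

```python
import math

candidates = 55_000        # articles that could not be matched automatically
sample_size = 33           # articles inspected by hand
taggable_in_sample = 24    # of those, articles that genuinely need coordinates

# Point estimate: scale the sample proportion up to the whole candidate set.
estimate = candidates * taggable_in_sample / sample_size
print(estimate)            # 40000.0

# Rough uncertainty (binomial standard error of the sample proportion).
p = taggable_in_sample / sample_size
standard_error = math.sqrt(p * (1 - p) / sample_size)
se_articles = candidates * standard_error
print(round(standard_error, 3))   # about 0.078
```

With only 33 articles sampled, one standard error corresponds to a little over 4,000 articles, so "roughly 40,000" is best read as an order-of-magnitude figure.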

Moving stub templates

Hi - I've just noticed this edit. Is there any reason why this bot is moving stub templates from their correct place (after the categories - see WP:STUB) and putting them earlier in the article? It'd be greatly appreciated if you could alter your settings so that this doesn't happen! Grutness...wha? 01:09, 31 August 2008 (UTC)

Hm - I just noticed that I asked you about this before, about a year ago. At the time you wrote: My understanding of the normal formatting of articles is that the order should be main text, stub notices, category tags, interlanguage tags, so this behaviour is by design. One good reason for this is that tags that are not displayed inline, like category and interwiki tags, are snipped from the article text before the rest of the rendering is done. Whitespace around category tags can lead to multiple whitespace lines in the body of the article, leading to excess whitespace in the rendered page between the main text and the stub notice. The ordering given above prevents that.
Your understanding of the normal formatting of articles is - as it was then - incorrect. Stub templates are always placed after categories but before interwikis. The reason for that is primarily so that stub categories are listed last in the list of categories. There are other ways around the problem of whitespace around category tags, such as leaving less white space around the coord template and stub templates. It's annoying to have to keep re-editing every article after your bot passes through. Grutness...wha? 01:17, 31 August 2008 (UTC)
If it's causing annoyance, I'll re-code the content-sorting part of the bot to treat stub templates specially. -- The Anome (talk) 20:35, 31 August 2008 (UTC)[reply]
That would be excellent - thank you. Grutness...wha? 23:12, 31 August 2008 (UTC)[reply]
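The ordering asked for in this thread (main text, then categories, then stub templates, then interwiki links) can be sketched as a simple line classifier. This is an assumption-laden illustration rather than the bot's content-sorting code: the regular expressions are simplified, and `reorder_footer` is a hypothetical name.

```python
import re

# Simplified patterns; real wikitext has more variants than these.
CATEGORY = re.compile(r"^\[\[Category:[^\]]+\]\]$", re.IGNORECASE)
STUB = re.compile(r"^\{\{(?:[^{}]*-)?stub\}\}$", re.IGNORECASE)
INTERWIKI = re.compile(r"^\[\[[a-z]{2,3}(?:-[a-z]+)*:[^\]]+\]\]$")


def reorder_footer(wikitext):
    """Rebuild a page as: body, categories, stub templates, interwiki links."""
    body, categories, stubs, interwikis = [], [], [], []
    for line in wikitext.splitlines():
        stripped = line.strip()
        if CATEGORY.match(stripped):
            categories.append(stripped)
        elif STUB.match(stripped):
            stubs.append(stripped)
        elif INTERWIKI.match(stripped):
            interwikis.append(stripped)
        else:
            body.append(line)
    # Drop blank lines left behind at the end of the body.
    while body and not body[-1].strip():
        body.pop()
    sections = ["\n".join(body)]
    for group in (categories, stubs, interwikis):
        if group:
            sections.append("\n".join(group))
    # One blank line between groups keeps the rendered whitespace predictable.
    return "\n\n".join(sections) + "\n"
```

Note that this sketch moves any line consisting solely of a category, stub template or interwiki link, wherever it appears in the page, and it treats each group as a block so that no stray blank lines end up between the main text and the stub notice.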

coord data for rivers - bot adding incorrect data

I’m not sure what ‘coord’ information the bot is adding to river articles; the markup suggests the source, but the locations suggest the mouth. Anyway, at least two additions, to River Bain, North Yorkshire and River Tame, Greater Manchester have been to the wrong rivers. Mr Stephen (talk) 22:01, 31 August 2008 (UTC)[reply]

Thanks for spotting these. Yes, the NGA GNS data I use generally geocodes rivers at their mouths. Unfortunately, the source data itself is not 100% accurate, and the matching algorithms I use involve a number of heuristics, so when the bot geocodes large numbers of articles, a few bad codes will often be added amongst the good ones.
My aim is to keep this error rate well below 1%, so that the bot-added entries tend to improve the overall average quality of Wikipedia's geocode data, rather than reduce it. I manually spot-check the bot's output to try to verify this. Very occasionally, there will be a burst of bad values because of systematic data-source or processing errors: when this happens, I will fix these, by hand if necessary. -- The Anome (talk) 09:33, 1 September 2008 (UTC)[reply]
Fair enough. TBH it looks like the wrong river was picked off a DAB page. Mr Stephen (talk) 17:58, 1 September 2008 (UTC)[reply]
Unless we can agree a convention for, say, always using the source and note to that effect next to the coordinates, I don't see the sense of adding coordinates, which denote a point, to articles about a linear feature like a (potentially very long) river. Ideally, of course, we'd have infobox fields for source, each major confluence, and mouth. I've raised the issue on a project page. Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits 10:14, 9 September 2008 (UTC)[reply]

off target

This addition is off by almost 10 degrees both in lat and long. Please check your sources. =Nichalp «Talk»= 13:27, 4 September 2008 (UTC)[reply]

Thanks for catching that. I've replied on your talk page. -- The Anome (talk) 13:44, 4 September 2008 (UTC)[reply]

Removed coordinates from Tokyo City

I removed the coordinates The Anomebot2 added to Tokyo City, a dissolved municipality. See Wikipedia:WikiProject Japan/Districts and municipalities#Geodata: "... dissolved municipalities ... should not have geodata." If you can prevent the bot from adding them again, it will be appreciated. Fg2 (talk) 11:02, 8 September 2008 (UTC)[reply]

That's curious: the bot has already been programmed to detect and ignore dissolved municipalities. I'll take a look to find out what went wrong. -- The Anome (talk) 02:39, 9 September 2008 (UTC)[reply]
Update: Tokyo City hadn't been categorized as having been dissolved, so the bot didn't spot it (the appropriate categories look like this: Category:Dissolved municipalities of Ehime Prefecture; the bot simply looks for the word "dissolved" in a category name). Can you supply a list of dissolved municipalities, which I could add to the bot's placename blacklist, or suggest another heuristic which might detect these cases without explicit categorization? -- The Anome (talk) 02:42, 9 September 2008 (UTC)
Thanks. I've put the article in the category and called this to the attention of WikiProject:Japan, asking either for a list or heuristic, or to put articles in categories. Fg2 (talk) 03:58, 9 September 2008 (UTC)[reply]
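The check described above, looking for the word "dissolved" in any category name, combined with an explicit placename blacklist for uncategorized cases, might look roughly like this. The function name and the blacklist contents are hypothetical; only the "dissolved" test comes from the thread.

```python
def should_skip(title, categories, blacklist=frozenset()):
    """True if the article is blacklisted or sits in a 'dissolved' category."""
    if title in blacklist:
        return True
    return any("dissolved" in name.lower() for name in categories)


# Hypothetical blacklist entry, for an article lacking the category.
blacklist = {"Tokyo City"}

print(should_skip("Tokyo City", ["Category:History of Tokyo"], blacklist))  # True
print(should_skip("Example town",
                  ["Category:Dissolved municipalities of Ehime Prefecture"]))  # True
print(should_skip("Example town", ["Category:Towns in Ehime Prefecture"]))   # False
```

As the thread shows, the category test only works when the article has actually been categorized, which is why a blacklist or a further heuristic is needed as a fallback.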

Coordinates to a topic

Greetings, Anomebot2 and its owner. I've just reverted an addition of a "missing coordinates" template to "City status in the United Kingdom". (I don't suppose it meant to request the coordinates of the UK.) I thought you'd be interested; and should like to ask about how the article has happened to be considered eligible for tagging. Just curious. Waltham, The Duke of 02:21, 4 October 2008 (UTC)[reply]

Thanks; bug reports like these are really useful.
Just to give you some background: the bot uses a list generated by a program that scans a Wikipedia dump, then follows links through the category graph, using heuristics to spot only "good" inter-category links that appear to refer to relevant types of article. This graph is then used to generate a least-depth category forest rooted at a set of top-level "seed" categories, with articles at its leaves, which is then trimmed down further using a variety of ad-hoc heuristics. Next, the bot itself scans articles just before editing them, using a further set of heuristics to eliminate false positives. Finally, I perform spot checks to find bad edits, and tweak the heuristics where possible to eliminate similar errors.
Unfortunately, all this failed to spot this particular case. I've spotted one more similar example, Municipal politics in the Netherlands, and fixed it as well. At the moment, I can't think of a rule that would catch articles like these without eliminating many good article matches; I'll take another look later to see if I can think of anything. -- The Anome (talk) 10:02, 4 October 2008 (UTC)[reply]
Well, to be honest, there's a great lot of city and town names in the article; it's not unreasonable that the robot should have been confused. Anyway, thanks for the enlightening explanation. Coordinates are useful, so it's nice to have the 'bot plugging the holes, even if there might be a false positive now and then (which shouldn't take long to remove anyway). Keep up the good work! Waltham, The Duke of 15:10, 4 October 2008 (UTC)[reply]
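A least-depth forest of the kind described above can be built with a breadth-first search from the seed categories, since breadth-first search reaches every node by a shortest path. The sketch below is an illustration under assumptions: the graph is a plain dictionary of already-filtered "good" links, and the function and category names are made up.

```python
from collections import deque


def least_depth_forest(subcategories, seeds):
    """Map each reachable category to (parent, depth) at its minimum depth.

    subcategories: dict of category name -> iterable of child category names,
                   assumed to contain only links that passed the heuristics.
    seeds:         top-level categories that become the roots of the forest.
    """
    placed = {seed: (None, 0) for seed in seeds}
    queue = deque(seeds)
    while queue:
        category = queue.popleft()
        depth = placed[category][1]
        for child in subcategories.get(category, ()):
            if child not in placed:          # first visit is the shallowest
                placed[child] = (category, depth + 1)
                queue.append(child)
    return placed


graph = {
    "Cities": ["Cities in France", "Cities by country"],
    "Cities by country": ["Cities in France"],
    "Cities in France": ["Cities"],          # cycles are harmless here
}
forest = least_depth_forest(graph, ["Cities"])
print(forest["Cities in France"])            # ('Cities', 1)
```

Recording each category only on first visit also breaks cycles, which matters because the category graph is not a strict tree.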
Another bug: I reverted this edit, since the coordinates already appeared in the infobox.--BillFlis (talk) 17:24, 9 October 2008 (UTC)[reply]

Hi. I reverted your edit to this article as I am not sure what the coordinates should point at. If this project ever gets built, it will extend for almost 28 kilometers. One of the lines would be a loop, so there isn't even an end to point to. Davidelit (talk) 12:24, 5 October 2008 (UTC)

Thanks. -- The Anome (talk) 12:25, 5 October 2008 (UTC)[reply]

Discussion of coord missing

Please direct me to the discussion of the roll out of the template "coord missing". cygnis insignis 18:07, 5 October 2008 (UTC)[reply]

Coords for a river??

The bot added a coord missing template for Buffalo Creek (Illinois). Just how or why would you add coordinates for a river? It's not a single point. —Preceding unsigned comment added by Andyross (talkcontribs) 21:30, 6 October 2008 (UTC)[reply]

See Wikipedia:WikiProject_Geographical_coordinates/Linear, there it is proposed to put the coordinates at the estuary/mouth for a river. --Berland (talk) 10:56, 7 October 2008 (UTC)[reply]
If this is to be true, it should be specified, like it is in an infobox, not just generic coordinates at the top right corner. Murderbike (talk) 07:50, 17 October 2008 (UTC)[reply]

Funny placement

This probably wasn't the best place to put the template. --Closedmouth (talk) 02:00, 7 October 2008 (UTC)[reply]

And again here. --Closedmouth (talk) 02:01, 7 October 2008 (UTC)[reply]
Thanks for spotting this. I've now fixed the bug which was causing this. -- The Anome (talk) 23:07, 7 October 2008 (UTC)[reply]

Urban township

You added the "missing coordinates" geodata template to Urban township (Ohio) recently. Just to let you know, the subject of that article is not a physical place. Rather, it's a governmental concept; a way to organize unincorporated areas in the US state of Ohio (and perhaps other jurisdictions as well). Therefore, geodata cannot ever be applied to the article. -- JeffBillman (talk) 00:56, 8 October 2008 (UTC)[reply]

Thanks. The bot is programmed with a number of heuristics to try to spot meta-articles like this, which describe the type of object that belongs to a category, rather than one of the objects that belong to the category. Unfortunately, none of them caught this article.
I've now added Category:Administrative divisions to this and several other similar articles, to stop this from happening again; the bot will now catch these at run-time when it inspects articles immediately before editing, and skip those articles, leaving a note in its logfile. -- The Anome (talk) 13:51, 8 October 2008 (UTC)[reply]

Sacramento Public Library

The bot added a coordinates needed tag to Sacramento Public Library. As this is a whole system of public libraries, there isn't one single coordinate that applies unless it's for the central branch. This coordinate stuff is new to me, but I'd like to learn, so I'm asking how best to handle this, both for the bot and Wikipedia -- add coordinates for the central branch? Remove the tag? Remove the tag and add something so the bot knows not to re-add? Thanks!--Fabrictramp | talk to me 16:31, 12 October 2008 (UTC)[reply]

If an article isn't taggable by a single coordinate, you can just remove the "coord missing" tag. The bot should not re-tag articles that have already been tagged, but adding it to any category that uses words like "organizations", "corporations", "departments", "companies" or "systems" will also stop the bot from tagging it. -- The Anome (talk) 18:43, 12 October 2008 (UTC)
Thanks much!--Fabrictramp | talk to me 18:48, 12 October 2008 (UTC)[reply]

Adding coords from image

How about adding the geodata for an article from the images in the article if they are geotagged -- PlaneMad|YakYak 15:40, 13 October 2008 (UTC)[reply]

That's an excellent idea. I'll add it to the queue. -- The Anome (talk) 16:01, 13 October 2008 (UTC)[reply]
Super -- PlaneMad|YakYak 16:24, 13 October 2008 (UTC)[reply]
Could be a slight snag sometimes, as the image geotags contain the camera position and not the object position. -- Klaus with K (talk) 11:55, 20 October 2008 (UTC)[reply]

Getting a bit overenthusiastic about lighthouses

I've had to revert a number of articles related to lighthouses which aren't about locatable things. You might want to think about toning it down a bit. Mangoe (talk) 00:28, 14 October 2008 (UTC)[reply]

Can you give me some examples, so I can add some rules to the bot to prevent it from making this particular kind of mistake again? -- The Anome (talk) 00:38, 14 October 2008 (UTC)[reply]
The three that I've noticed were National Historic Lighthouse Preservation Act, United States Lighthouse Board, and Stephen Pleasonton. Mangoe (talk) 02:59, 14 October 2008 (UTC)[reply]

Revert Trusty system - a system and not a place

Had to revert Trusty system, as this article is about a system and not necessarily a place. I don't know why your bot thinks it is a place, as none of the categories indicate it is. I had to revert another article yesterday that this bot inappropriately put the geo coords message on - can't remember the name. Perhaps the bot is overenthusiastic. —Mattisse (Talk) 03:09, 14 October 2008 (UTC)

Coord missing

I blocked the bot, since it is adding the {{coord missing}} template to pages. I looked around and the only bot approval I can find is the Wikipedia:Bots/Requests for approval/The Anomebot2, and that only includes adding known coordinates to pages.

And I doubt you will get bot approval for adding {{coord missing}}.

--David Göthberg (talk) 06:26, 14 October 2008 (UTC)[reply]

For example, this was inappropriate on Overwhelmingly Large Telescope - it's a cancelled project, with an undetermined coordinate. While a wikipedian could indeed stump up a billion dollars to get a coordinate by funding it, this seems unlikely. —Preceding unsigned comment added by Speedevil (talkcontribs) 12:11, 14 October 2008 (UTC)[reply]

    • Regarding the Big Dig: yes, it's very eligible for tagging, and I've now tagged it with coordinates. -- The Anome (talk) 23:28, 14 October 2008 (UTC)[reply]
This bot has tagged some articles with Coord Missing that are not geographical entities that can be assigned coords; rather, the article discusses a general classification of geographical entities that are spread over a considerable area. Examples of mistagging (now removed) are Prefecture-level cities, Sub-prefecture-level city, and County-level city. Rincewind42 (talk) 14:00, 16 October 2008 (UTC)
In Khoroo I've erased this tag, as this article describes a type of territorial divisions. Bogomolov.PL (talk) 14:29, 16 October 2008 (UTC)[reply]
  • Another strange 'un. Is the algorithm used for {{coord missing}} available? If so, I might be able to determine what might be causing the false positives. -- Fullstop (talk) 22:24, 7 January 2009 (UTC)[reply]
    • Thanks for spotting that. It's based on the names of categories and traversal of their link relationships in the category graph, as extracted from dumps, using a combination of whitelist and blacklist patterns, with a bit of extra pattern matching of both raw and rendered page contents happening just before the edit occurs to catch a few more cases that are not caught by the previous processing. The reason why this particular article got marked is because it belonged to the "Towers in India" and "Towers in Iran" categories, that belong to the wider "<structure>s in <countryname>" classification. I could perhaps add a new filter to spot articles that "belong" to two different country categories, and exclude them, but such articles are rare, and tend to get filtered out by other heuristics. -- The Anome (talk) 10:13, 9 January 2009 (UTC)[reply]
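The extra filter floated in this reply, excluding articles that belong to "<structure>s in <countryname>" categories for two different countries, could be sketched like this. The pattern, the function names and the `known_countries` set are assumptions for illustration; the real classification may work quite differently.

```python
import re

# Matches names such as "Towers in India"; the greedy first group means the
# place is whatever follows the last " in ".
_IN_PLACE = re.compile(r"^.+ in (?P<place>.+)$")


def countries_of(categories, known_countries):
    """Return the set of known countries named by '<things> in <country>' categories."""
    found = set()
    for name in categories:
        match = _IN_PLACE.match(name)
        if match and match.group("place") in known_countries:
            found.add(match.group("place"))
    return found


def spans_several_countries(categories, known_countries):
    """True if the categories place the article in more than one country."""
    return len(countries_of(categories, known_countries)) > 1


known = {"India", "Iran"}   # hypothetical; a real list would cover all countries
print(spans_several_countries(["Towers in India", "Towers in Iran"], known))    # True
print(spans_several_countries(["Towers in India", "Towers in Mumbai"], known))  # False
```

Checking against a list of country names avoids counting a city-level and a country-level category for the same place as two countries.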

UK categorization errors

Unfortunately, the category tree for UK regions isn't as uniform as that for U.S. regions, and some assumptions made by the category graph analysis code were insufficient to get it right. I've tweaked the algorithm some more to catch this particular class of error. Sorry about that. -- The Anome (talk) 03:38, 30 November 2008 (UTC)[reply]

Coordinates

Nice work, but I take it you can't get the bot to add it to the infobox display, which is what I thought you were going to do, like Kangel. I've now got 3900 articles to add them to manually! The Bald One White cat 10:20, 18 December 2008 (UTC)

I might be able to get round to it one day, but there are lots of other tasks already pending. Have you considered becoming a bot operator yourself? Writing bot code in Python is straightforward, although you will have to get used to the problems of dealing with free-form input data that does not have any formal specification and no guarantees of being correctly formatted in any way. -- The Anome (talk) 10:24, 9 January 2009 (UTC)
The bot is adding coordinates with only degrees and minutes, omitting the seconds, which leaves the position on the map kilometers away from the real location. Could this be corrected? Thanks. Sonsaz (talk) 22:42, 4 January 2009 (UTC)
I'm sorry, but the data extracted from parsing the map data really isn't accurate enough to give better resolution. The point chosen is the approximate midpoint of the name as written on the public domain U.N. maps, but the positioning of the names is generally chosen to make them fit neatly on the map, rather than to accurately locate the feature in question. This error is often several minutes of arc. Given this, it's pointless to generate coordinates with higher precision than minutes of arc. Still, a value to within a few kilometers is much better than nothing; if you have more accurate data that can be used under Wikipedia's copyright rules, I'd be interested to have a pointer to the source. -- The Anome (talk) 10:19, 9 January 2009 (UTC)[reply]
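To illustrate the precision argument: rounding to whole minutes of arc can be done as below, and one minute of latitude is about 1.85 km on the ground, so a source that is itself several minutes off gains nothing from quoting seconds. The function name is hypothetical and the sketch is not taken from the bot.

```python
def to_degrees_minutes(value, positive, negative):
    """Round signed decimal degrees to whole minutes: (degrees, minutes, hemisphere)."""
    hemisphere = positive if value >= 0 else negative
    total_minutes = round(abs(value) * 60)
    # divmod carries 60 minutes over into the next degree.
    degrees, minutes = divmod(total_minutes, 60)
    return degrees, minutes, hemisphere


print(to_degrees_minutes(46.69667, "N", "S"))    # (46, 42, 'N')
print(to_degrees_minutes(100.15028, "E", "W"))   # (100, 9, 'E')
print(to_degrees_minutes(-0.9999, "E", "W"))     # (1, 0, 'W')
```

Rounding the total number of minutes before splitting into degrees and minutes avoids producing an invalid value such as 0°60′.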