Wikipedia talk:Articles for deletion/Anybot's algae articles: Difference between revisions

Kurt Shaped Box (talk | contribs)
→‎AfD notices: Is a block warranted?

Revision as of 02:34, 25 June 2009

AfD notices

Is there some straightforward automated method of putting AfD notices on all the articles, or is that too burdensome a task at this magnitude? Melburnian (talk) 07:59, 22 June 2009 (UTC)

Anyone with a bot account could do it with AWB. But is there any point? 4000 edits to save half a dozen deletions and restorations? Hesperian 10:35, 22 June 2009 (UTC)
Well, for my own contributions it was nine, so we're over half a dozen right there. For me the scary part is that without going through my contributions I had no real way of knowing which articles I had worked on. I guess I'd notice either a deletion or an AfD notice on my watchlist, so maybe it is OK either way, but I'd lean towards trying to err on the side of notifying, if we think there is a chance it would let someone know who would otherwise never notice or find out about this well after the fact. Kingdon (talk) 03:57, 24 June 2009 (UTC)
This is a good idea. The articles would then appear on watchlists. I want to see the articles disappear ASAP, but I agree that deleting an article someone has improved is tough for the writer. One of the least appreciated jobs on Wikipedia is sourcing articles. I would not like to read half a dozen sources, pick the most authoritative, add it to an article, then have the article deleted without my knowledge. Still, I don't like to delay the removal of 2 billion years of misinformation. I like what Hesperian said, something about calling an animal a plant only far more wrong. --69.226.103.13 (talk) 18:17, 24 June 2009 (UTC)
This error has now been fixed in most articles; the rest are being fixed. Martin (Smith609 – Talk) 21:14, 24 June 2009 (UTC)
You mean you're running the bot now? Is this agreed upon? It seems we're still discussing the issue. --69.226.103.13 (talk) 22:29, 24 June 2009 (UTC)

You're introducing more and new errors and deleting stuff from the wrong place. --69.226.103.13 (talk) 22:57, 24 June 2009 (UTC)

(after edit conflict) Yes, it seems that he ran the bot this afternoon. Care to take a look at some of the edits and see what you think? As I said, making a small number of test edits and waiting for feedback would have been wise before doing this - but I suppose that the end result is either going to be 'slightly better than before' or 'as broken as before'... --Kurt Shaped Box (talk) 23:02, 24 June 2009 (UTC)

I have reblocked the bot. Given the context (discussion proceeding on the understanding that the bot was blocked; concerns about unauthorised access; the BAG notified that the bot was blocked, but not notified of the unblocking; recommendations that the bot be blocked and/or deapproved; expressed opposition to the bot being deployed to fix its own errors; no BAG approval for the new task; apparently no test run), I think it was inappropriate for the bot operator to unilaterally unblock and deploy. And then there is the problem that it was introducing new errors. Hesperian 23:40, 24 June 2009 (UTC)

My finger was actually hovering over the block button a few minutes ago, though I decided that I would wait for Smith to comment here before taking any action. Yes, I fully agree that it was an unwise move on his part to unleash the bot in this manner. Speaking frankly, I'm beginning to wonder if Smith is sufficiently familiar with this field and these particular organisms to be able to program a bot capable of outputting non-error-ridden stubs. I'm 100% certain that his intentions are good and I mean no insult to the man - but this just doesn't seem to be working. --Kurt Shaped Box (talk) 23:59, 24 June 2009 (UTC)


I appreciate your concerns; however:
  • The discussion taking place hinges on the quality of the articles the bot has created.
  • The first run of 'bot version 1' contained major errors, spotted in February.
  • The second (April?) run of 'bot version 2' fixed many of these errors (whilst some escaped my attention and still need addressing). These edits stood for several months without attracting complaint.
  • The third (June?) run of 'bot version 1', which took place without my knowledge, reintroduced the errors which had been fixed in the second run.
  • Editors had expressed the opinion that errors should be fixed ASAP; restoring the version produced by 'bot version 2' goes a long way to fixing many of the most important errors, and allows editors to form a fair view of what the bot created when it was originally run.
  • Remember, the bot only edited articles that humans had not touched. The only 'new' amendment it made was to fix higher taxonomy errors which had occurred because I had assumed that Algaebase only contained Algae; therefore they could only be improving the articles.
Further,
  • Unauthorised access is now impossible.
There appears to be consensus that errors in the bot should be fixed ASAP - something which I wholeheartedly agree with. Running the bot in this fashion cannot increase the number of errors; nor can it remove any human input (except my own); nor does it make it any harder to delete some or all articles, should this be the outcome of the ongoing debate.
Perhaps I am being short-sighted, but I cannot see what harm will be done by using the bot in this 'roll-back' capacity, and I do see some benefits. Can anyone provide me an example of how it is harming WP that I have missed? By blocking the bot account, you are effectively saying that you don't want the errors that previous scripts created to be fixed; as far as I can tell this is the opposite sentiment to the dominant one thus far. If the edits had been performed by a user who wasn't called 'Anybot', would there have been any reason to block that user?
Martin (Smith609 – Talk) 02:34, 25 June 2009 (UTC)

Response to bot owner's post

Delete all that contain errors, improve bot code in consultation with 'experts', and run bot again. I had been unaware of the lengthy discussion at WikiProject Algae; first I should note that the original version of the bot contained some errors, which a later version of the bot corrected as soon as they were pointed out. The original version seems to have been run since April, replicating some of the errors, which has inflamed the discussion.

All articles I checked, except for a small number corrected by other editors, contain huge errors, and a diversity of errors. If it is required that each of the 4000+ articles be checked individually before being deleted, how can that be done, and by whom? You said you don't have time to recode the bot until October. Should Wikipedia maintain these articles as they are until October if the 4000 cannot be individually checked for errors? I checked hundreds and found not a single accurate article created by anybot.
A user named FingersOnRoids informed you of many of the errors in February. These errors were not removed from the articles or the code; I know this because I found the same errors.[1]
I checked articles written by the bot before, during, and after the April problems, including its initial run of articles and later corrections made by the bot. These are all bad and all contain a variety of errors. If edited by a bot, all need to be reduced to a stub that says “Thisgenus is”, because many of the taxonomy boxes are wrong and the errors are so diverse. In addition to cyanobacteria identified as eukaryotes, there are all different types of macroalgae, algae, angiosperms and at least one fungus identified as diatoms.
The lengthy discussion on WikiProject Plants was linked on your discussion page on June 16.

Now, in my opinion, articles that contain small errors (e.g. the wrong tense) but cite a reliable source are better than no article at all - and if all such pages were deleted from WP the encyclopedia would probably shrink by a factor of two. As evidenced by the work of some dedicated IP editors, the existence of a skeleton article is often the seed from which a useful and correct article is developed. And as all of the articles use information attributed to a reliable source, it is possible for people to check the data against the facts (nobody should ever use WP as a reliable source in itself). Again, this makes the articles more useful than many other unsourced articles on WP.

I identified no articles that contained only a wrong tense. I tried to find systematic errors that would allow the keeping of say, all articles about diatoms or coccoliths or red algae, with a bot fixing. Each group had huge errors in the taxonomy boxes, in the writing of the descriptions where included, in writing of extinct species as living taxa, in addition to errors such as incorrect species counts, and the error in every article attributing AlgaeBase’s verified data to the number of species in the genus. Again, other than stubifying to "Thisgenus is an organism" I don't see how it can be done. The organisms are not even all photosynthetic.
AlgaeBase does not list whether a species is extinct or extant. Every species would have to be checked for this in other sources. This is why I suggested asking phycologists to check the information. A number of species are fairly well known and are getting a couple of hundred page hits a month (bad page hits, sadly). I recognized a common extinct species when I went to check strange information in a paper I was editing.
The main IP editor corrected only the higher level taxonomy. Another IP editor corrected some more serious errors. These IPs did not correct the entire article. Their articles would require identification and additional editing. I offered to edit these if a list is made.
The source is reliable, but the data were not gathered in a reliable manner. Therefore all data would have to be verified. Who would do this? When? How many more Wikipedia mirrors would copy the wrong information before it was done?
Source material that copies the name of an angiosperm then calls it an alga, or copies the information about a macroalga then fills a taxonomy box with information about a diatom, is not usable and is not useful. This was discussed at WikiProject Plants. The seagrass that was called an alga in its article, Thalassia, is also a major marine angiosperm.

However, I am embarrassed that widespread errors do exist. Systematic errors - such as the use of 'alga' instead of 'cyanobacterium' - are very easy to fix automatically. If I had a list of the errors that have been spotted, so that I could easily understand what is said that is wrong, and what should be said, I could re-code the bot until it got everything right, and then put it up for retesting (hopefully it is now notorious enough that people will be willing to check its output). At that point it would be possible to run the bot again and create error-free articles. In the meantime, perhaps it is a good idea to delete articles which contain factual errors. (I will never support the deletion of any article which details a notable subject, and contains factually correct information attributed to a reliable source.)

Cyanobacteria morphologically described as red algae within their articles is a data-gathering error that lies within the code or, more probably, within the underlying algorithm.
For example, the article Aphanocapsaopsis is from the AlgaeBase page that lists two species of uncertain taxonomies.[2] Our article calls the organism an alga, lists it as a eukaryotic cyanobacterium, then goes on to say it produces carpospores and tetraspores! There are also golden algae (Hibberdia) reproducing with tetraspores and carpospores. This information is NOT from AlgaeBase.
I don’t think “deleting articles which detail a notable subject and containing factually correct information attributed to a reliable source” is going to be an issue with any of anybot’s articles. Please read the nature of the mistakes on the WP:Plants page, which lists a variety of genera with a variety and number of errors. There are so many errors and so many different types of errors. Every article, as far as I can tell, has at least two major errors making it entirely useless.
When these errors were initially pointed out to you, beginning in February, you did not properly recode the bot to correct the errors or correct the articles with the errors already in them. You were asked to fix the cyanobacteria articles in March. You did not fix them. You told me when I started pointing out these errors that you do not have time to fix them until October.
I just do not think that you wrote an algorithm that can scrape a database. I would be concerned using your bot to rewrite the articles without carefully checking the algorithm. I am willing to do that if you post the algorithm and if the other writers agree that this is a useful solution. I don’t think it can wait until October. I think it needed fixing in February when first brought to your attention, or at least the cyanobacteria needed fixing in March when mentioned by Rkitko.

I think that the worst case scenario would be to delete articles willy-nilly and thereby deplete WP. We have the potential to use the Algaebase material to generate useful information - if it's not entirely up to date, then neither are most textbooks; and if the classification needs systematically updating, the bot can do that as taxonomy is updated. If this is done regularly, WP can keep up to date and become as useful a resource as Algaebase is today. Let's be careful to produce the best quality output we can before the deadline. Martin (Smith609 – Talk) 13:39, 22 June 2009 (UTC)

The worst case scenario in my opinion is that one more Wikipedia mirror copies anybot's cyanobacteria articles.
I do not care if the articles are not entirely up to date, but 2 billion years is too much. The taxonomies of single-celled marine photosynthetic organisms are difficult. I offered a solution to this difficulty, namely, using existing taxonomies on Wikipedia. Find the family or order, and simply use the higher level taxonomy already used. Again, you don't have time for this until October.
AlgaeBase is maintained by phycologists. Its primary use is to other scientists who understand the data. It is an amazing resource on the internet. For Wikipedia to incorporate some of its information, editors must understand the nature of the database.
--69.226.103.13 (talk) 16:47, 22 June 2009 (UTC)
  • I guess I'm one of the "experts" who would be asked to fix the bot, and I'd have to say "no thanks". It just doesn't seem worth the effort any more (and this isn't just true of AnyBot, I've been saying the same about other proposed bots to mass-create species articles). Kingdon (talk) 04:00, 24 June 2009 (UTC)
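
Illustrative note: the suggestion made earlier in this thread, to find a genus's family or order and reuse the higher-level taxonomy already used on Wikipedia, could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the bot's actual code: the `EXISTING_TAXONOMY` table, its two sample entries, and the `higher_taxonomy` function are all hypothetical names invented for the example. A genus whose family or order has no match is left for a human to check rather than guessed at.

```python
from typing import Optional

# Hypothetical snapshot of higher-level taxonomy already present in existing
# Wikipedia taxoboxes, keyed by family or order. Sample entries only.
EXISTING_TAXONOMY = {
    "Hydrocharitaceae": {"kingdom": "Plantae", "order": "Alismatales"},
    "Nostocales": {"domain": "Bacteria", "phylum": "Cyanobacteria"},
}


def higher_taxonomy(ranks_from_source: dict) -> Optional[dict]:
    """Return taxobox ranks built from an already-vetted family or order.

    ranks_from_source holds whatever ranks the database record supplies,
    e.g. {"genus": ..., "family": ..., "order": ...}. Returns None when
    neither the family nor the order is known, so that the article can be
    flagged for manual review instead of being filled in automatically.
    """
    for rank in ("family", "order"):
        name = ranks_from_source.get(rank)
        if name in EXISTING_TAXONOMY:
            merged = dict(EXISTING_TAXONOMY[name])  # copy the vetted ranks
            merged[rank] = name
            if "genus" in ranks_from_source:
                merged["genus"] = ranks_from_source["genus"]
            return merged
    return None
```

With the sample table, a record for Thalassia in the family Hydrocharitaceae would come back under Plantae rather than being labelled an alga, while a record with an unrecognised family would return `None` and be held back for review.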