User talk:Lemmey
Open Source
LemmeyBOT has gone open source
- LemmeyBOT.py Runs RefHistoryFix on each article in the broken references category
- RefHistoryFix.py Attempt to fix broken references on the specified article by looking at article history
- LemmeyBOT2.py Runs RefHistoryFix2 on each article in the broken references category
- RefHistoryFix2.py Attempt to fix broken references on the specified article by looking at current versions of linked articles
- RefHistoryFix3.py Attempt to fix broken references on the specified article by looking at a specific article or text file
- Whoipedia.py A magical replacement of the Wikipedia class that doesn't require an editor to login
- basic.py An example bot that uses the whoipedia class. Finally a bot that anyone can edit (with)...
- TheDaily.py A method for posting current events to the proper pages.
LemmyBot Task 2c: Bastards
My observation has been that Bastard refs are often created by an editor cutting a text snippet from one article and pasting it into another. The source article where the parent ref can be found can often be identified from the context surrounding the bastard, found by text search for the ref name and/or other pasted-in text, or identified from clues left in the edit summary. -- Boracay Bill (talk) 23:18, 1 June 2008 (UTC)
- Yes, I saw that. I modified the bot to copy identically named refs from the current version of another article. I also modified it to check each of the pages linked in the article. The bot had great success with articles such as The Wire Season 3 and Criticisms of Tony Blair, where it matched references in The Wire and Tony Blair. I had to run this manually due to the possibility of false positives (mostly on music album sources), but it was very quickly done. I estimate I fixed approximately 100 references in 75 articles this way. However, this method has run its course and there are still many articles remaining (any of the small towns in North Dakota) where it's obvious there is no source, there never has been a source, and none can be found on any of the linked pages. At some point any editor would call bullshit and add a fact-needed tag; why should the bot be held to a higher standard when it has already gone so far beyond what the average editor would do? --Lemmey talk 01:55, 2 June 2008 (UTC)
- Examples of copying from another page
- Obviously it's very easy to approve the change when it's a very unique reference name (reuters010908) or the articles have similar names. --Lemmey talk 02:01, 2 June 2008 (UTC)
- You check every linked article for a matching named ref? That seems like a reasonable solution to me. Gimmetrow 02:52, 2 June 2008 (UTC)
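The approach discussed in this thread (copying an identically named ref from a linked article) could be sketched roughly as follows. This is not LemmeyBOT's actual code: it is a minimal Python 3 illustration, the function names `orphaned_ref_names` and `restore_from_linked` are hypothetical, the citation text in the usage lines is a placeholder, and it assumes ref names are always double-quoted. It returns the source article for each restored ref so a human can review possible false positives.

```python
import re

# Assumption: ref names are always written with double quotes.
# A full definition looks like <ref name="foo">citation text</ref>.
DEFINITION_RE = re.compile(
    r'<ref\s+name\s*=\s*"([^"]+)"\s*>(.*?)</ref>',
    re.DOTALL | re.IGNORECASE,
)
# A reuse with no body looks like <ref name="foo" />.
REUSE_RE = re.compile(r'<ref\s+name\s*=\s*"([^"]+)"\s*/>', re.IGNORECASE)


def orphaned_ref_names(wikitext):
    """Return names that are reused in wikitext but never defined there."""
    defined = {m.group(1) for m in DEFINITION_RE.finditer(wikitext)}
    used = {m.group(1) for m in REUSE_RE.finditer(wikitext)}
    return used - defined


def restore_from_linked(wikitext, linked_texts):
    """Fill orphaned refs from linked articles.

    linked_texts maps article title -> current wikitext of that article.
    Returns (new_wikitext, sources), where sources maps each restored
    ref name to the title it was copied from.
    """
    sources = {}
    for name in sorted(orphaned_ref_names(wikitext)):
        for title, other in linked_texts.items():
            match = next(
                (m for m in DEFINITION_RE.finditer(other) if m.group(1) == name),
                None,
            )
            if match is None:
                continue
            reuse = re.compile(
                r'<ref\s+name\s*=\s*"%s"\s*/>' % re.escape(name), re.IGNORECASE
            )
            # Replace only the first empty reuse with the full definition;
            # a function is used so backslashes in the citation stay literal.
            wikitext = reuse.sub(lambda _m, d=match.group(0): d, wikitext, count=1)
            sources[name] = title
            break
    return wikitext, sources


article = 'A claim.<ref name="reuters010908" /> More.<ref name="bbc1">BBC</ref>'
linked = {"Tony Blair": 'Bio.<ref name="reuters010908">Placeholder citation</ref>'}
new_text, sources = restore_from_linked(article, linked)
print(sources)
```

A distinctive name such as reuters010908 makes a match easy to trust; a generic name would still need a manual check before saving.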
Houston Chronicle is blacklisted?
How can a major metropolitan newspaper like the Houston Chronicle not be allowed as a reference source? It was removed from Ivan Dixon. Please explain. --69.22.254.108 (talk) 18:04, 8 June 2008 (UTC)
- Post chron was added to the wikimedia spamlist by another user, User:JzG/unreliable sources. Wikipedia won't let me restore a reference that includes a url in the spamlist; try it, the page simply will not save. I replaced the tag with a fact tag so that some other user may find an acceptable source for the statement. Go to the spam list page if you want the site to be removed from that list. I don't make the rules, I just follow them. --Lemmey talk 02:42, 9 June 2008 (UTC)
- The "Post Chronicle" is not a major newspaper in Houston or elsewhere. It is an online news/tabloid gossip site that has been deemed unacceptable as a source. FCYTravis (talk) 08:19, 9 June 2008 (UTC)
- I'll put this more clearly: I don't care about the source quality or the statement it supports. LemmeyBOT only tries to restore 'lost' references; if the page can't save due to a 'Spam Blacklist', the ref gets replaced with a fact tag. I don't make the spamlist, I don't manage the spamlist, I don't even read the articles LemmeyBOT edits. I have absolutely no opinion on the post, and as there is absolutely nothing I can do, all issues regarding its inclusion should be handled elsewhere. --Lemmey talk 15:40, 9 June 2008 (UTC)
- To be most clear: "Post Chronicle" != Houston Chronicle; the official site for the latter, chron.com, is not blacklisted (and, as you observe, its blacklisting or its being deemed inappropriate as a "reference source" would disquiet). Joe 07:15, 10 June 2008 (UTC)
(dedent) Either way
- Nothing I can do
- Just to be clear, LemmeyBOT is blocked and I'm not really interested in defending myself from the baseless and unproven accusations in some kind of policy shouting match I'd have to win just to get the approval to continue exactly what I've already been approved to do. --Lemmey talk 12:51, 10 June 2008 (UTC)
- Thanks for the reply and all the input. My fault -- I confused postchronicle with Houston Chronicle (which had absorbed the Houston Post). Houston Chronicle is a completely different thing from postchronicle. In the words of Emily Litella, "Never mind!" :) --69.22.254.108 (talk) 16:45, 10 June 2008 (UTC)
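The fallback described in this thread (try to save the restored reference, and if the spam blacklist rejects the save, swap the reference for a fact tag) might look something like the sketch below. Everything here is an assumption for illustration: `SpamBlacklistError`, `save_restored_ref`, `fake_save` and the `blacklisted.example` domain are hypothetical names, and the real bot framework's save call and error type are not shown on this page.

```python
class SpamBlacklistError(Exception):
    """Hypothetical error: the wiki refused the save because of a listed URL."""


FACT_TAG = "{{fact}}"


def save_restored_ref(page_text, restored_ref, save):
    """Try to save page_text; on a blacklist rejection, retry with a fact tag.

    save is any callable that takes the page text and raises
    SpamBlacklistError when the text contains a blacklisted URL.
    Returns the text that was actually saved.
    """
    try:
        save(page_text)
        return page_text
    except SpamBlacklistError:
        # The restored reference cannot be saved, so flag the statement
        # as needing a source instead of leaving a broken ref behind.
        fallback = page_text.replace(restored_ref, FACT_TAG)
        save(fallback)
        return fallback


def fake_save(text):
    """Stand-in for a real save call, rejecting one made-up domain."""
    if "blacklisted.example" in text:
        raise SpamBlacklistError("blacklisted.example")


ref = '<ref name="x">http://blacklisted.example/story</ref>'
print(save_restored_ref("Claim." + ref, ref, fake_save))
```

The function makes no judgement about source quality; it only reacts to whether the save succeeds, which matches the behaviour described above.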
RCP
from BeautifulSoup import BeautifulSoup
import datetime
import urllib
import wikipedia
import time
import re

##Get soft numbers
url = "http://www.realclearpolitics.com/epolls/maps/obama_vs_mccain/"
tag = "map-legend2"
f = urllib.urlopen(url)
html = f.read()
f.close()
soup = BeautifulSoup(html)
##print soup.prettify()
# Narrow the tree to the map legend and drop its images.
soup = soup.find(id=tag)
images = soup.findAll('img')
[image.extract() for image in images]
candidates = soup.findAll("div", {"class": re.compile('^candidate')})
for candidate in candidates:
    if candidate.find("p", {"class": "candidate-name"}):
        # Named candidate: first <p> holds name and total, second holds the breakdown.
        firstPTag, secondPTag = candidate.findAll('p')
        nametot = firstPTag.string
        # Second attribute is taken to be the style; strip the leading "color:".
        style = str(firstPTag.attrs[1][1])[len("color:"):]
        # Assumes the last three characters of the label are the total.
        name = nametot[:-3]
        total = nametot[len(name):]
        print name, total, style
        solid, leaning = secondPTag.findAll('strong')
        print "Solid:", solid.string
        print "Leaning:", leaning.string
    else:
        # Toss-up block: a single <p> with name and total only.
        tossupPTag = candidate.find('p')
        nametot = tossupPTag.string
        name = nametot[:-3]
        total = nametot[len(name):]
        style = str(tossupPTag.attrs[1][1])[len("color:"):]
        print name, total, style
warning
Per my post at WP:AN/I, I am giving you a warning over this. Just because someone else is being an ass doesn't mean it's a good idea. The next gratuitous, off-topic comment, or personal attack, will result in a two week block of you and your bot. You are out of chances. - BanyanTree 03:03, 11 June 2008 (UTC)
- Your post was at WP:AN, not WP:AN/I.
- The bot is already blocked indefinitely. A two week block would effectively unblock the bot after two weeks.
- The comment is not off topic if the topic is expanded by other contributors. i.e. Realm of consideration in criminal law. --Lemmey talk 03:15, 11 June 2008 (UTC)
- Thanks for the info. I hadn't kept track of the thread.
- Fortunately, Wikipedia is not a criminal court. Cheers, BanyanTree 03:18, 11 June 2008 (UTC)
- Not a criminal court? Mind if I quote you on that? With all the hoopla recently, excuse me if I became confused. Also you may want to consider restoring my comments as they are WP:obvious satire and Israel is my second favorite nation. --Lemmey talk 03:20, 11 June 2008 (UTC)