User talk:West.andrew.g: Difference between revisions

From Wikipedia, the free encyclopedia
(One intermediate revision by the same user not shown)
Line 43: Line 43:


Please take a look at my [[User:Terra_Novus/Pending_Changes_Compromise#Competition_on_Wikipedia_Vandalism_Detection|write-up]] on a recent competition in detecting Wikipedia vandalism. I'm interested in your thoughts on the nine tools submitted. Would it be possible to run the sample data set against [[Wikipedia:STiki|STiki]] and see how it stacks up? Even if STiki doesn't come out #1, it has a huge advantage in already having a real-time Wikipedia implementation. As you can see at the [[User:Terra_Novus/Pending_Changes_Compromise|top of the linked page]], my goal is to use such a tool to flag high risk changes for review as a pending change. This will obviously be more sophisticated than my original edit filter proposal, but the concept is the same. If you're not aware, there has been [[Wikipedia:Pending_changes/Straw_poll_on_interim_usage|massive interest]] and [[Wikipedia talk:Pending_changes/Straw_poll_on_interim_usage|concern]] about the new pending changes feature on Wikipedia. Thanks! —[[User:UncleDouggie|UncleDouggie]] ([[User talk:UncleDouggie|talk]]) 06:29, 27 September 2010 (UTC)
:: Hello UncleDouggie. It was actually my intention to enter that PAN-CLEF competition, but short notice and some realities got the better of me. Since that time, I have run STiki against the test set and it would have finished in second place in the competition -- so STiki is indeed a competitive tool. Further, I have worked closely with Bo and Luca (authors of WikiTrust, who *did* finish second in the competition). When our feature-sets are used in combination, we do extremely well together, and would comfortably have taken first place. We're working to make APIs available so STiki can integrate that logic.

:: I've read the technical paper summarizing the PAN-CLEF competition. I've heard the first place individual basically implemented the work of the competition's author, Martin Potthast ("Automatic Vandalism Detection for Wikipedia...") -- which used Natural Language Processing (NLP). STiki has some trivial NLP features -- so the infrastructure is already in place to use them -- so I could easily implement what these individuals found to be the best ones (Though I also suspect that ClueBot et al. might be taking care of some of the low-hanging fruit on the NLP front). If STiki were going to be used in a more official capacity -- this would certainly be enough motivation for me to get something like this done. Thanks, [[User:West.andrew.g|West.andrew.g]] ([[User talk:West.andrew.g#top|talk]]) 15:55, 27 September 2010 (UTC)

Revision as of 15:58, 27 September 2010


Talk page for West.andrew.g:


STiki on Linux

How should I run STiki on Linux? I downloaded the .jar file, but have no clue what to do with it. This wouldn't require compiling, would it? Any help would be much appreciated. RadManCF open frequency 21:43, 15 July 2010 (UTC)

Hi there RadMan. If you download the "GUI/executable" version of STiki (as opposed to the "source") -- you won't need to compile. Since you have a *.JAR file -- you certainly got the "executable" version. All you need to do is issue the command "java -jar /path_to/STiki_exec...jar" in a terminal (fill in the requisite portions as they apply to your system), and the STiki GUI should display. There are a few details along these lines in the README file in the *.ZIP you downloaded. Thanks, and let me know if you have any other questions. West.andrew.g (talk) 17:41, 17 July 2010 (UTC)

Response to unblock request

The Arbitration Committee has reviewed your block and the information you have submitted privately, and is prepared to unblock you conditionally. The conditions of your unblock are as follows:

  1. You provide a copy of the code you used for your "research" to Danese Cooper, Chief Technical Officer and to any other developer or member of her staff whom she identifies. [Note - this step has been completed]
  2. You review any future research proposals with the following groups: the wikiresearch-L mailing list <https://lists.wikimedia.org/mailman/listinfo/wiki-research-l>; the wikimedia-tech mailing list for any research relating in whole or in part to technical matters; and your faculty advisor and/or University's research ethics committee for any research that involves responses by humans, whether directly or as an indirect effect of the experiment. Please note that your recent research measured human responses to technical processes; you should be prepared to provide evidence that those aspects have been reviewed in advance of conducting any similar research.
  3. Should this project, the Wikimedia Foundation, or an inter-project group charged with cross-site research be developed, they may establish global requirements for research which may supersede the requirements in (2) above.
  4. Any bots you develop for use on this project, whether for research or other purposes, must be reviewed by the Bot Approvals Group (WP:BAG) in advance of use, unless otherwise approved by the WMF technical staff.
  5. You must identify all accounts that are under your control by linking them to your main account. The accounts used in your July 2010 research will remain blocked.

Please confirm below that you agree to abide by these conditions when participating in this project. Once you have done so, a member of the Arbitration Committee will unblock.

For the Arbitration Committee,
Risker (talk) 12:55, 11 August 2010 (UTC)

I agree to these conditions, and offer a sincere apology to the community. Thanks, West.andrew.g (talk) 13:04, 11 August 2010 (UTC)

Using STiki in other Wikipedias

Hello. I'm a sysop on the Turkish Wikipedia, and I currently operate a pywikipedia bot, Khutuck Bot, there. Is it possible to run STiki on tr.wiki by changing the code you have released so far? Turkish Wikipedia has been running low on RC patrollers lately, and a tool like STiki would be a great aid for us. Please reply to me at tr:User:Khutuck, tr:User:Khutuck Bot or User:Khutuck Bot. Thank you for the lovely tool. Khutuck Bot (talk) 22:44, 27 August 2010 (UTC)

Hi Khutuck, and sorry for the slow response. It is not difficult to implement STiki for different projects, but it is not trivial, either. First, a server is required to host the back-end component (which will need a static IP address). Second, there must be some way to identify some portion of vandalism in an ex post facto fashion. For en.STiki, I use the common format of rollback strings for this purpose. Third, there would need to be language changes in the interface, the "bad word" regexes, and perhaps some of the parsing. Fourth, I am willing to support anyone in such a venture -- but they need to have the coding (Java) skills to understand what is going on. If you are still interested, please let me know. Thanks, West.andrew.g (talk) 16:34, 1 September 2010 (UTC)
Thank you for the detailed explanation. I'll read the code again to better understand these four issues. Sadly I only have basic coding knowledge, but I've been trying to learn Java lately. Is it possible, with minor code changes, to run the back-end component on my own PC with a dynamic IP, for my own use only? If so, STiki will be a multi-language wiki tool :) Khutuck Bot (talk) 20:26, 1 September 2010 (UTC)
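
As an illustration of the rollback-string idea mentioned in the reply above, here is a minimal Python sketch. It assumes the English Wikipedia rollback summary has the shape "Reverted edits by [[Special:Contributions/USER|USER]] ([[User talk:USER|talk]]) to last version by OTHER"; the pattern and the function name `rolled_back_user` are hypothetical and are not taken from STiki's (Java) source. Another language edition would need its own localized pattern, which is part of the porting effort described above.

```python
import re

# Assumed shape of an en.wiki rollback edit summary. Other wikis localize
# this message, so the pattern would have to be adapted per project.
ROLLBACK_RE = re.compile(
    r"^Reverted edits by \[\[Special:Contributions/(?P<offender>[^|\]]+)\|[^\]]*\]\] "
    r"\(\[\[User talk:[^|\]]+\|talk\]\]\) "
    r"to last (?:version|revision) by (?P<restored>.+?)\.?$"
)


def rolled_back_user(summary):
    """Return the user whose edits were rolled back, or None if the
    summary does not look like a rollback summary."""
    match = ROLLBACK_RE.match(summary.strip())
    return match.group("offender") if match else None


if __name__ == "__main__":
    example = (
        "Reverted edits by [[Special:Contributions/193.130.87.54|193.130.87.54]] "
        "([[User talk:193.130.87.54|talk]]) to last version by Graham87"
    )
    print(rolled_back_user(example))
    print(rolled_back_user("fixed a typo"))
```

Edits whose later revert carries such a summary can then be labeled as vandalism after the fact; edits reverted with a free-form summary are simply not caught by this heuristic.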

Vandalism reversion mistake

Hi Andrew, 193.130.87.54 made two vandalism edits to the sky article, but you only reverted one of them. I've just cleaned the damage up. You might want to modify STiki to take into account consecutive vandalism edits, or do something to minimise the risk of this problem. Graham87 03:21, 16 September 2010 (UTC)

Thanks Graham (also for the grammar nit on STiki's homepage)! Incorporating "rollback" in place of "revert" is on my TODO list for STiki. It's easy for those who have the rollbacker right, but it's a little more complicated to build "in-software rollback" for those who don't have it. I figure this functionality will take care of most multi-edit vandalism. Either way, it's my own clumsiness that I didn't notice. Thanks, West.andrew.g (talk) 04:56, 16 September 2010 (UTC)
On reverting consecutive edits without rollback, see this thread at the technical village pump from April 2007. I'm not sure if the problem still exists; I subsequently encountered it at demographic transition, but I don't know of any more recent cases. It seems to require a vandal and a user to be editing two sections of an article at exactly the same time. Graham87 14:22, 16 September 2010 (UTC)
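
The difference between reverting one edit and rolling back a run of consecutive edits, discussed in this thread, can be sketched in Python as follows. This is only an assumption about how an "in-software rollback" might choose its target, not STiki's implementation; the function name `rollback_target` and the `(rev_id, user)` history format are hypothetical.

```python
def rollback_target(history):
    """Pick the revision a rollback should restore.

    `history` is a list of (rev_id, user) tuples, newest first.
    Returns (target_rev_id, reverted_rev_ids), where the reverted ids are
    all consecutive newest edits by the most recent editor, or None if
    the history is empty or every revision is by that same editor.
    """
    if not history:
        return None
    offender = history[0][1]
    reverted = []
    for rev_id, user in history:
        if user != offender:
            # First revision by someone else: this is the version to restore.
            return rev_id, reverted
        reverted.append(rev_id)
    return None


if __name__ == "__main__":
    history = [
        (105, "193.130.87.54"),
        (104, "193.130.87.54"),
        (103, "Graham87"),
        (102, "SomeoneElse"),
    ]
    print(rollback_target(history))
```

Undoing only revision 105 would leave the vandalism from revision 104 in place, which is the situation reported above; restoring revision 103 removes both edits. The sketch does not address the simultaneous-editing case mentioned in the last comment.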

Competition on Wikipedia Vandalism Detection

Please take a look at my write-up on a recent competition in detecting Wikipedia vandalism. I'm interested in your thoughts on the nine tools submitted. Would it be possible to run the sample data set against STiki and see how it stacks up? Even if STiki doesn't come out #1, it has a huge advantage in already having a real-time Wikipedia implementation. As you can see at the top of the linked page, my goal is to use such a tool to flag high-risk changes for review as a pending change. This will obviously be more sophisticated than my original edit filter proposal, but the concept is the same. If you're not aware, there has been massive interest and concern about the new pending changes feature on Wikipedia. Thanks! UncleDouggie (talk) 06:29, 27 September 2010 (UTC)

Hello UncleDouggie. It was actually my intention to enter that PAN-CLEF competition, but short notice and some realities got the better of me. Since that time, I have run STiki against the test set and it would have finished in second place in the competition -- so STiki is indeed a competitive tool. Further, I have worked closely with Bo and Luca (authors of WikiTrust, who *did* finish second in the competition). When our feature-sets are used in combination, we do extremely well together, and would comfortably have taken first place. We're working to make APIs available so STiki can integrate that logic.
I've read the technical paper summarizing the PAN-CLEF competition. I've heard the first-place individual basically implemented the work of the competition's author, Martin Potthast ("Automatic Vandalism Detection for Wikipedia...") -- which used Natural Language Processing (NLP). STiki already has some trivial NLP features, so the infrastructure to use them is in place, and I could easily implement the ones these individuals found to be the best (though I also suspect that ClueBot et al. might be taking care of some of the low-hanging fruit on the NLP front). If STiki were going to be used in a more official capacity -- this would certainly be enough motivation for me to get something like this done. Thanks, West.andrew.g (talk) 15:55, 27 September 2010 (UTC)