Wikipedia:Scripts: Difference between revisions

Revision as of 17:14, 30 December 2005

This is about general scripts. For user scripts, see Wikipedia:WikiProject User scripts. For Greasemonkey user scripts, see Wikipedia:Tools/Greasemonkey user scripts. For other tools, see Wikipedia:Tools.

This page serves as a central repository for scripts and scripting requests on Wikipedia. Please feel free to improve any of these scripts but make sure to test your changes first. Like the rest of Wikipedia, all material here is under the GFDL. See licensing for further details.

Guidelines

KISS

Remember to Keep It Simple, Stupid: do one thing and do it well. This is a place for simple scripts, not whole programs.

Style

This style guide is by no means a rule; however, keep in mind that your script must be easy to paste in here and easy for others to copy.

<pre>
 #!/usr/bin/perl
 use strict;

 print "Hello World\n";
</pre>


Which would render as:

#!/usr/bin/perl
use strict;

print "Hello World\n";

Licence

All material on Wikipedia is by default under the GNU Free Documentation License. This licence is designed for documentation and written works, not for software. If you want your work to be of the greatest use to the public, please consider dual-licensing it: release it under a free software licence such as the GNU General Public License in addition to the GFDL. Like the GFDL, the GPL is published by the Free Software Foundation. To dual-license, state explicitly in a comment at the start of your script that it is dual-licensed under the GPL.
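As an illustration, a dual-licensing notice at the top of a script might look like the sketch below. The wording is modelled on the notices used by scripts further down this page; the file name hello.py and the greeting function are hypothetical and exist only to give the comment something to sit above.

```python
#!/usr/bin/env python3
# hello.py - prints a greeting (hypothetical example script)
#
# This script is dual-licensed under the GFDL and the GPL,
# version 2 or (at your option) any later version.
# See http://www.gnu.org/licenses/gpl.txt for more details.


def greeting(name="World"):
    """Return the greeting text; kept separate so it is easy to reuse."""
    return "Hello " + name


if __name__ == "__main__":
    print(greeting())
```

The notice comes before any code, so anyone copying the script from this page takes the licence statement along with it.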

IRC channel scripts

mIRC

This mIRC script allows you to double-click on [[wiki-links]] and {{templates}} within mIRC, opening a browser window at that article. It supports links to all existing languages and Wikimedia projects, such as Meta: and commons:, and lets users change which language Wikipedia is opened by default when a wiki link is clicked in mIRC.

To install, copy the complete text between the horizontal lines below, open up mIRC, click the "scripts editor" icon (or go to 'Tools' -> 'Remote', or press Alt+R), and paste it all there in the "Remote" section.

This script supersedes the older "Wikipedia-style Linking" script, which can be found below if needed; the older script has fewer features and is less efficient. If you have another Wikipedia link script already installed, delete it before installing this one (selecting its lines in the script editor and pressing Delete is sufficient).

; Wikilinks 1.3.1
; By CXI, based on a script by Masterhomer
; Just hit Alt-R and paste the following lines of code into a new script.
; * Supports {{links}} and [[links]], and [[piped|links]]
; * Supports multi-word links and multiple links per line
; * Supports any Wikipedia language, but defaults to English. This can be changed with:
;   /set %wiki-lang CODE (for example, /set %wiki-lang fr will default to French Wikipedia)
; Language/project links such as [[fr:Accueil]] will work no matter what the default language is.

; If you're having any trouble making links open in a new window, try /set %wiki-uopt -n

; This script is dual-licensed under the GPL,
; version 2 or (at your option) any later version.
; See http://www.gnu.org/licenses/gpl.txt for more details.

on ^$*:HOTLINK:/[\{\}\[\]]{2}/:#,?,=:{ return }
on *:HOTLINK:*:#,?,=:{ 
  tokenize 32 $remove($1,[[,]],{{,}})
  if ($regex($hotline,(.*\[\[(.*? $+ $1 $+ .*?)(\|.*)?\]\].*|.*(\{\{.*? $+ $1 $+ .*?)\}\}.*))) {
    url %wiki-uopt $+(http://,$iif(%wiki-lang,$ifmatch,en),.wikipedia.org/wiki/,$replace($gettok($regml(2),1,124),{{,Template:,$chr(32),_))
  }
}


; Wikipedia-style Linking, Version 2.3.0
; Originally created by MadEwokHerd, modified by nobody and Mark_Ryan,
;   cleaned up by toad, majorly improved by Timwi, added extra functions by Netoholic
; fixed by Masterhomer, commons link fix by CryptoDerk
;  (now supports links with m:, meta:, commons:, WP:, {{template}}, and language codes)
; color stripping added by MadEwokHerd at CryptoDerk's request
on ^*:hotlink:*:*:{
  tokenize 32 $strip($1-)
  set %hotline $strip($hotline)
  if (*[[* iswm $1) || (*]]* iswm $1) || ($+(*,[,[,*,$1,*,],],*) iswm %hotline) { return }
  elseif (*{{* iswm $1) || (*}}* iswm $1) || ($+(*,\{,\{,*,$1,*,\},\},*) iswm %hotline) { return }
  else { halt }
}

on *:hotlink:*:*:{
  tokenize 32 $strip($1-)
  set %hotline $strip($hotline)
  var %page,%template,%dummy,%dummy2,%dummyc,%openb,%closeb
  if (*[[* iswm $1) || (*]]* iswm $1) || ($+(*,[,[,*,$1,*,],],*) iswm %hotline) {
    %openb = $pos($1,[[)
    %closeb = $pos($1,]])
    if (%openb == $null) {
      if (%closeb == $null) {
        %dummy = $regex(%hotline,\[\[([^\]]* $+ $1 $+ [^\]]*)\]\])
      }
      else {
        %dummyc = %closeb - 1
        %dummy = $regex(%hotline,\[\[([^\]]* $+ $left($1,%dummyc) $+ )\]\])
      }
    }
    elseif (%closeb == $null) {
      %dummyc = $len($1) - %openb
      %dummyc = %dummyc - 1
      %dummy = $regex(%hotline,\[\[( $+ $right($1,%dummyc) $+ [^\]]*)\]\])
    }
    else {
      %dummy = $regex($1,\[\[([^\]]*)\]\])
    }
  }
  else {
    %template = yes
    %openb = $pos($1,{{)
    %closeb = $pos($1,}})
    if (%openb == $null) {
      if (%closeb == $null) {
        %dummy = $regex(%hotline,\{\{([^\}]* $+ $1 $+ [^\}]*)\}\})
      }
      else {
        %dummyc = %closeb - 1
        %dummy = $regex(%hotline,\{\{([^\}]* $+ $left($1,%dummyc) $+ )\}\})
      }
    }
    elseif (%closeb == $null) {
      %dummyc = $len($1) - %openb
      %dummyc = %dummyc - 1
      %dummy = $regex(%hotline,\{\{( $+ $right($1,%dummyc) $+ [^\}]*)\}\})
    }
    else {
      %dummy = $regex($1,\{\{([^\}]*)\}\})
    }
  }

  %page = $regml(1)
  if (%template == yes) { %page = Template: $+ %page }

  if ($left(%page,2) == m:) {
    %dummy2 = $len(%page) - 2
    url -an http://meta.wikimedia.org/wiki/ $+ $replace($right(%page,%dummy2),$chr(32),_)
  }
  elseif ($left(%page,5) == meta:) {
    %dummy2 = $len(%page) - 5
    url -an http://meta.wikimedia.org/wiki/ $+ $replace($right(%page,%dummy2),$chr(32),_)
  }
  elseif ($left(%page,8) == commons:) {
    %dummy2 = $len(%page) - 8
    url -an http://commons.wikimedia.org/wiki/ $+ $replace($right(%page,%dummy2),$chr(32),_)
  }
  elseif ($left(%page,3) == WP:) {
    %dummy2 = $len(%page) - 3
    url -an http://en.wikipedia.org/wiki/ $+ $replace(%page,$chr(32),_)
  }
  elseif ($right($left(%page,3),1) == :) {
    %dummy2 = $len(%page) - 3
    url -an http:// $+ $left(%page,2) $+ .wikipedia.org/wiki/ $+ $replace($right(%page,%dummy2),$chr(32),_)
  }
  elseif ($right($left(%page,4),1) == :) {
    %dummy2 = $len(%page) - 4
    url -an http:// $+ $left(%page,3) $+ .wikipedia.org/wiki/ $+ $replace($right(%page,%dummy2),$chr(32),_)
  }
  else {
    url -an http://en.wikipedia.org/wiki/ $+ $replace(%page,$chr(32),_)
  }
}


; Wikipedia WikiLink Hotlinker, Version 1.0.1
; By Masterhomer
; Just hit Alt-R and paste the following three lines of code into mIRC. 
; Not as advanced as the above one, but it works and it's fast.
; Works for all WikiMedia projects and all Wikipedia.
; Thanks to CXI for pointing this out.
; Licensed under the MIT Licence. 

on ^*:HOTLINK:*[[*:#: { return }
on ^*:HOTLINK:*]]*:#: { return }
on *:HOTLINK:*:#: if ([[ isin $hotline) { url $+(http://en.wikipedia.org/wiki/,$replace($gettok($gettok($hotline,2,91),1,93),$chr(32),_)) }


; Tea time., Version 1.2
; Makes you talk like this. On IRC. 
; Ex: i'm using the internet changes to I'm using the Internet.
; Spell check soon..
; Version 1.2 prevents the script adding punctuation to lines with URLs ([[Wikipedia talk:Scripts#the Talk. Like. This. script....|see talk]])
; Inspired by IRC User Austin
; Licensed under the MIT Licence.

 on *:INPUT:#: { 
   if (/* iswm $1) || (http:// isin $1-) { $input }
   elseif (($asc($right($1-,1)) > 96) && ($asc($right($1-,1)) < 123)) || (($asc($right($1-,1)) > 64) && ($asc($right($1-,1)) < 91))  || ($right($1-,1) isnum) {
     msg $active $+($upper($left($1-,1)),$right($1-,-1),.) 
     halt
   }
 }

X-Chat

This Python script for X-Chat replaces links like this: [[de:Linux]] with the corresponding URL, when either you or someone else types them. It requires X-Chat 2.0.6 or later.

To install it, copy the text below and save it into a file called wmlinksubber.py in your xchat directory and then start/restart X-Chat, or load it via the menu Window -> Plugins and Scripts....
X-Chat directories:

  1. Unix-based systems
    1. most likely ~/.xchat or ~/.xchat2
  2. Windows
    1. C:\Documents and Settings\username\Application Data\X-Chat 2\

If you don't want it to replace text which you type, then place a # at the beginning of the line EVENTS.append(("Your Message", 1)), and reload the plugin with the Window -> Plugins and Scripts... menu, if it is already loaded.
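The substitution described above can be sketched in plain Python as follows. This is not the author's plugin: it leaves out the X-Chat plugin interface entirely, and the names link_to_url and substitute_links, as well as the rule that a two- or three-letter lowercase prefix is a language code, are assumptions made for illustration.

```python
import re
from urllib.parse import quote

# Matches [[target]] or [[target|label]]; the target may not contain ] or |.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

# Assumption: a two- or three-letter lowercase prefix is a language code.
LANG_PREFIX = re.compile(r"^([a-z]{2,3}):(.+)$")


def link_to_url(target, default_lang="en"):
    """Build a Wikipedia URL for the text found between [[ and ]]."""
    lang, title = default_lang, target.strip()
    m = LANG_PREFIX.match(title)
    if m:
        lang, title = m.group(1), m.group(2)
    title = title.strip().replace(" ", "_")
    # Keep ':' and '/' readable in the path; percent-escape everything else.
    return "http://%s.wikipedia.org/wiki/%s" % (lang, quote(title, safe=":/_"))


def substitute_links(message, default_lang="en"):
    """Replace every [[wikilink]] in a chat message with its URL."""
    return WIKILINK.sub(lambda m: link_to_url(m.group(1), default_lang), message)


if __name__ == "__main__":
    print(substitute_links("Have a look at [[de:Linux]] and [[Main Page]]."))
```

A real plugin would also need to recognise project prefixes such as m: or commons:, which this sketch passes through as part of the English Wikipedia title.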

The script is available on the author's subpages:

Colloquy

A plugin to make wiki links clickable is available for Colloquy, an IRC client for Mac OS X. Although the source is included, it is too long to post here. You can download it from:

ircII

A brief ON trigger for ircII that works with the #Wiki-link filter below. If I was really elite this would all be in ircII script language, but that would be really too awful to contemplate: besides which, the filter can be useful for other purposes.

 set exec_protection off
 on public -
 on #^public 0 * if (match(*[[* [$2-])) { exec mwlink $shellfix($Z$1<$0> $2-) } { echo $Z$1<$0> $2- }

Note: setting exec_protection off could make you vulnerable to exploits by others unless you know what you're doing with your ON triggers.

Also, the $Z$1<$0> $2- is my preferred format for channel messages; yours may differ.

This is efficient in the sense that it only invokes the filter for lines containing a wiki link; it is inefficient because it starts a new filter process for every such line. I couldn't get persistent process communication working for ircII; if someone else can, that would be great.

EPIC4

This ircII-derived client can use almost the same script as ircII, above:

 set exec_protection off
 on public -
 on #^public 0 * if (match(*[[* [$2-])) { exec -direct mwlink $Z$1<$0> $2- } { echo $Z$1<$0> $2- } 

Note that the -direct option to /exec is used instead of the $shellfix() function, which EPIC4 does not provide.

ChatZilla

To install this script, copy it into a file called wikilinks.js somewhere convenient, and add it to Auto-load scripts on the Startup tab under Global Settings.

To configure the formatting applied to Wikilinks, use this command:

  • /wiki-links-class [{class}] to use the CSS class set by the munger rule, you can then style this class of link with your motif file, the default class is "wiki-link"

Contributed to the public domain by IceKarma 01:31, 2005 Apr 22 (UTC)

//

//***********************************************
// IceKarma's WikiLinks script for ChatZilla
// Version 1.2
//   1.2 by James Ross: fix the normal links by shunting the
//       word-hyphenator as well.
// Version 2.0
//   2.0 By Glen Mailer:
//        - Converted to new plugin API
//        - Ripped out a whole load of unused stuff
//        - Also Made to fit chatzilla coding pedantics
//   2.1 By Alphax:
//        - Added basic template linking functionality
//   2.2 By Alphax:
//        - subst: and pipes now handled correctly in templates
//   2.3 by Pathoschild:
//        - reverted to 2.2 (2.3 broke all links with non-interlanguage prefixes)
//        - fixed mailto: wikilink glitch (based on 2.3 code by Stigmj)
//   2.4 By Stigmj:
//        - added support for handling mirc-colors.
//   2.5 By GeorgeMoney:
//        - fixed for change in API
//   2.6 By Pathoschild:
//        - fix broken interwiki prefix ([[w:foo]]);
//        - add support for namespace template syntax;
//        - fix modifiers msg, msgnw, raw; correct int (points to "mediawiki:foo", not "template:int:foo");
//        - don't link parameters: {{[[template:foo|foo]]|bar}}.
//   2.7 By Pathoschild:
//        - added support for external link syntax.
// This file is hereby placed by the authors into the public domain.


plugin.id = "WikiLinks";

plugin.prefary = [
    ["class", "wiki-link", ""],
];

//
// Plugin management
//

plugin.init = 
function init(glob) {
    plugin.major = 2;
    plugin.minor = 7;
    plugin.version = plugin.major + "." + plugin.minor;
    plugin.description = "Munges wikiML links to be clickable in the output window";
    plugin.prefary = plugin.prefary.concat(plugin.prefary);
}

plugin.disable = 
function disable()
{
    client.munger.delRule("wiki-link");
    client.munger.delRule("wiki-template-link");
	client.munger.delRule("wiki-external-link");
    client.commandManager.removeCommands(plugin.commands);

    display( plugin.id + " v" + plugin.version + " disabled.");
    
    return true;
}

plugin.enable = 
function enable()
{
    client.munger.addRule("wiki-link", /(\[(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\[(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*[^\]]+(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\](?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\])/, insertWikiLink, 10, 10);
    client.munger.addRule("wiki-template-link", /(\{(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\{(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*[^\}]+(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\}(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\})/, insertWikiTemplateLink, 10, 10);
    client.munger.addRule("wiki-external-link", /(\[http:\/\/[^\s]+ [^\]]+\])/, insertWikiExtLink, 10, 10);
	
	
    var cmdary = [
        [ "wiki-links-class", cmdClass, CMD_CONSOLE, "[<className>]" ],
    ];

    plugin.commands = client.commandManager.defineCommands(cmdary);

    display( plugin.id + " v" + plugin.version + " enabled.");
    
    return true;

}

//
// Mungers
//

function insertWikiLink(matchText,containerTag, data, mungerEntry) {
    var wikiLink = matchText;
    var linkTitle;

    wikiLink  = matchText.replace(/^\[(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\[(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*/, "");
    wikiLink  = wikiLink.replace(/(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\](?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\]$/, "");
	linkTitle = wikiLink;
	
	// fix bad links (but leave linkTitle)
	wikiLink = wikiLink.replace(/^w:/, ""); 
	
    if (linkTitle.match(/\|/)) {
        var ary = linkTitle.match(/^(.*?)\|(.*)$/);
        wikiLink = ary[1];
        linkTitle = ary[2];
    }
    wikiLink = escape(wikiLink.replace(/ /g, "_"));

    var anchor = document.createElementNS( "http://www.w3.org/1999/xhtml",
                                           "html:a");
    anchor.setAttribute("href", "http://en.wikipedia.org/wiki/" + wikiLink);
    anchor.setAttribute("class", "chatzilla-link "+plugin.prefs["class"]);
    mungerEntry.enabled = false;
    data.inLink = true;
    client.munger.munge(linkTitle, anchor, data);
    mungerEntry.enabled = true;
    delete data.inLink;
    
    //insertHyphenatedWord(linkTitle, anchor, data);
    containerTag.appendChild(document.createTextNode("[["));
    containerTag.appendChild(anchor);
    containerTag.appendChild(document.createTextNode("]]"));
}

function insertWikiTemplateLink(matchText,containerTag, data, mungerEntry) {
    var wikiLink = matchText;
    var linkTitle;
    var linkParam = false; // declared locally so it does not leak into the global scope

    wikiLink  = matchText.replace(/^\{(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\{(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*/, "");
    wikiLink  = wikiLink.replace(/(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\}(?:[\x1f\x02\x0f\x03\x16]\d{1,2})*\}$/, "");
	linkTitle = wikiLink;

	// fix  parameters
	if(linkTitle.match(/^[^\|]+\|/)) {
		linkParam = linkTitle.replace(/^[^\|]+\|(.*)$/, "|$1");
		linkTitle = linkTitle.replace(/^([^\|]+)\|.*$/, "$1");
		wikiLink  = linkTitle;
	}
	else {
		linkParam = false;
	}
	
	// fix bad links (but leave linkTitle)
		wikiLink  = wikiLink.replace(/^(?:template|msgnw|raw|subst):/, ""); // modifiers

	// set namespace by syntax
	if(wikiLink.match(/^:[a-z\s]*:?/i)) {
		wikiLink = wikiLink.replace(/^:([a-z\s]+):/i, "$1:"); // most ns
		wikiLink = wikiLink.replace(/^:/, ""); // main
	}
	else if(wikiLink.match(/^int:/i)) {
		wikiLink = wikiLink.replace(/^int:/i, "MediaWiki:"); // fix modifier
	}
	else {
		wikiLink = wikiLink.replace(/^/, "Template:");
	}
	
	// construct link
    wikiLink = escape(wikiLink.replace(/ /g, "_"));
    var anchor = document.createElementNS( "http://www.w3.org/1999/xhtml",
                                           "html:a");
    anchor.setAttribute("href", "http://en.wikipedia.org/wiki/" + wikiLink);
    anchor.setAttribute("class", "chatzilla-link "+plugin.prefs["class"]);
    mungerEntry.enabled = false;
    data.inLink = true;
    client.munger.munge(linkTitle, anchor, data);
    mungerEntry.enabled = true;
    delete data.inLink;

    //insertHyphenatedWord(linkTitle, anchor, data);
    containerTag.appendChild(document.createTextNode("{{"));
    containerTag.appendChild(anchor);
	if(linkParam) {
		containerTag.appendChild(document.createTextNode(linkParam));
	}
    containerTag.appendChild(document.createTextNode("}}"));
}

function insertWikiExtLink(matchText,containerTag, data, mungerEntry) {
    var wikiExtLink = matchText;
    var linkTitle = matchText;

	// separate link and text
	wikiExtLink = wikiExtLink.replace(/^\[(http:\/\/[^\s]+)\s+.*$/, "$1");
	linkTitle   = linkTitle.replace(/^\[http:\/\/[^\s]+ ([^\]]+)\]$/, "$1");
	
	// create link
	var anchor = document.createElementNS( "http://www.w3.org/1999/xhtml",
                                           "html:a");
    anchor.setAttribute("href", wikiExtLink);
    anchor.setAttribute("class", "chatzilla-link "+plugin.prefs["class"]);
	anchor.setAttribute("style", "text-decoration:underline;");
    mungerEntry.enabled = false;
    data.inLink = true;
    client.munger.munge(linkTitle, anchor, data);
    mungerEntry.enabled = true;
    delete data.inLink;
    
    // show link syntax
    containerTag.appendChild(anchor);
	
	// add external link icon
	var img = document.createElementNS( "http://www.w3.org/1999/xhtml",
                                           "html:img");
	img.setAttribute("src", 'http://upload.wikimedia.org/wikipedia/commons/4/44/External.png');
	containerTag.appendChild(img);
}

//
// Commands
//

function cmdClass(e) {
    if ( null != e.linkclass )
        plugin.prefs["class"] = e.linkclass;
    display( "Current value: " + plugin.prefs["class"] );
}

// End of file

Gaim

Use of the linkify plugin plus a wikilink config file allows you to see all those [[links]] that everyone is typing as real links.

1. WinGaim users who have not installed ActivePerl should download ActivePerl 5.8 first, as per Perl plugin support, and then reinstall Gaim.

2. Download the linkify perl script from sourceforge. Copy it to your plugins directory (~/.gaim/plugins or C:\Program Files\Gaim\plugins) as described in How do I use perl scripts with Gaim?

3. Currently in wingaim you need to alter the path line

my $CfgFile = "$ENV{HOME}/.gaim/linkify.cfg";

to

my $CfgFile = "C:/Documents and Settings/YOURUSERNAME/Application Data/.gaim/linkify.cfg";

4. Download the linkify.cfg example file from the same page. Copy it to ~/.gaim (or UserName\Application Data\.gaim). Currently it is set to change "Bug ###" into a clickable link to bugzilla.

5. Add the lines:

# Wikipedia links
\[\[([^\]]+)\]\] http://en.wikipedia.org/wiki/$1

to the config file (and remove the bug linking if you care). The regexp may not be perfect, but you can all refine it! It's a wiki!
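One possible refinement, sketched here in Python so it can be tried outside Gaim: the pattern above captures everything between the brackets, so a piped link such as [[Perl|the language]] yields a URL containing the label. Stopping the capture at the pipe avoids that. Whether linkify's config parser accepts the refined pattern unchanged is an assumption that has not been checked; the helper name urls is hypothetical.

```python
import re

# The pattern from the config line above, unchanged.
ORIGINAL = re.compile(r"\[\[([^\]]+)\]\]")

# Refinement: stop the captured title at a pipe, so that
# [[target|label]] links to "target" instead of "target|label".
REFINED = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

BASE = "http://en.wikipedia.org/wiki/"


def urls(pattern, text):
    """Return the URL built for each link in text, using the capture as $1."""
    return [BASE + m.group(1) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = "see [[Perl|the language]] and [[Gaim]]"
    print(urls(ORIGINAL, sample))
    print(urls(REFINED, sample))
```

Neither pattern converts spaces to underscores; that step is left out because the config format shown above only substitutes $1.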

(Contributed by User:Omegatron)


Irssi

Use the following script, which will make any [[links]] appear as [[links]] <http://en.wikipedia.org/wiki/links>:

#!/usr/bin/perl
use strict;
use warnings;
use Irssi;

Irssi::settings_add_str ('wikify', 'wiki_lang', 'en');
Irssi::settings_add_str ('wikify', 'wiki_active_channels', 'freenode/#wikipedia freenode/#wikibooks');

sub wikilang {
        Irssi::settings_get_str ('wiki_lang');
}

sub urlencode {
        my $string = shift;
        $string =~ y/ /_/;
        $string =~ s/([^A-Za-z0-9_])/sprintf("%%%02X", ord($1))/seg;
        return $string;
}

sub wikilink {
        my $s = shift;
        my $u = urlencode $s;
        my $l = wikilang;
        "[[$s]] <http://$l.wikipedia.org/wiki/$u>";
}

sub wikitemplate {
        my $s = shift;
        my $u = urlencode $s;
        my $l = wikilang;
        "{{$s}} <http://$l.wikipedia.org/wiki/Template:$u>";
}

sub wikify {
        my $line = shift;
        $line =~ s/\[\[(.*?)\]\]/wikilink $1/eg;
        $line =~ s/\{\{(.+?)\}\}/wikitemplate $1/eg;
        return $line;
}

sub sig_message_public {
        my ($server, $msg, $nick, $address, $target) = @_;
        my $chatnet = $server->{chatnet};
        my $ok = 0;
        for my $t (split /\s+/, Irssi::settings_get_str ('wiki_active_channels')) {
                $ok = 1 if lc $t eq lc "$chatnet/$target";
        }
        return unless $ok;
        $msg = wikify $msg;
        Irssi::signal_continue ($server, $msg, $nick, $address, $target);
}

Irssi::signal_add_first ('message public', \&sig_message_public);

(Contributed by Ricky Clarkson, who was channelling 'met' from Freenode IRC, on #irssi)

ERC

ERC is an IRC client for emacs implemented in Emacs Lisp.

The following code can be added to your Emacs initialization file (I put it in ~/.emacs.d/mwlink.el). It depends on the #mwlink script below running in daemon mode (mwlink --daemon). Your browser will open a URL of the form http://localhost:4242/mwlink?page=<page> and get redirected to the appropriate Wikimedia page. This means the following code can be relatively simple, rather than having to figure out languages, namespaces and wikis on its own.

Another note: this depends on the emacs-wiki package for the `emacs-wiki-escape-url' function. I also don't know how well it plays with Custom (though customizing `erc-button-alist' isn't any nicer than doing this or editing it directly).


  (add-to-list 'erc-button-alist
   '("\\[\\[\\(.*?\\)\\]\\]" 0 t
     (lambda (page) (browse-url (concat
                     "http://localhost:4242/mwlink?page="
                     (emacs-wiki-escape-url page)))) 1))

Wikilink filter (mwlink)

This Ruby program has two modes. It can run as a daemon or text processor (daemon mode is preferred, since it's more efficient).

In text-scanning mode, it interprets its command line (or stdin if no command line given) as text possibly containing [[wikilinks]]. It preserves the original text and adds a text hyperlink (the http: address contained in <> braces).

In daemon mode, it receives HTTP requests like http://localhost:4242/mwlink?page=wiki-page-name and redirects to the appropriate Wikimedia page. It's convenient for scripts to just use that URL rather than constructing one themselves--all they have to do is URL-escape the text between [[ and ]].
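From the client side, using the daemon amounts to very little code. The following Python sketch finds each [[wikilink]] in a line of text and builds the corresponding daemon URL; the function name daemon_urls is hypothetical, and port 4242 is the default described on this page.

```python
import re
from urllib.parse import quote_plus

# Default address of a locally running "mwlink --daemon".
DAEMON = "http://localhost:4242/mwlink"


def daemon_urls(text, base=DAEMON):
    """Return one daemon URL per [[wikilink]] found in text."""
    return [base + "?page=" + quote_plus(m.group(1))
            for m in re.finditer(r"\[\[([^\]]+)\]\]", text)]


if __name__ == "__main__":
    for url in daemon_urls("[[Ashland (disambiguation)]] is a [[Wikipedia:Disambiguation]] page."):
        print(url)
```

The client does no namespace or language handling of its own; the daemon resolves those when it answers with the redirect.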

   #!/usr/bin/ruby

   # This script is dual-licensed under the GPL version 2 or any later
   # version, at your option. See http://www.gnu.org/licenses/gpl.txt for more
   # details.

   =begin

   = NAME

   mwlink - Linkify mediawiki-style wikilinks in plain text

   = SYNOPSIS

      mwlink [options] [text-to-wikilink]
         --daemon[=port]     Run as HTTP daemon
         --encoding          Default character set encoding (utf-8)
         --default-wiki      Default wiki (wikipedia)
         --default-language  Default language (en)

   = DESCRIPTION

   In text-scanning mode (without the --daemon argument), the mwlink program scans
   its arguments (or its standard input, in the event of no arguments) for
   wikilinks of the form [[link]]. It expands such links into URLs and inserts
   them into the original text after the [[link]] in sharp braces ((({<})) and
   (({>}))). Options are provided for specifying a default wiki (the wiki to link
   to if no qualifier is given in the link) and a default language (the language
   to assume if no qualifier is given) as well as the character set encoding in
   use. The built-in defaults are ((*wikipedia*)), ((*en*)) and ((*utf-8*)),
   respectively.

   In daemon mode (now preferred), it receives HTTP requests of the form
   "http://.../page=((*wikipedia page*))" (the ((*wikipedia page*)) name is what
   would appear within a [[wikilink]]). URL-escaping is required but no other
   processing, making it convenient to use from scripts.

   == Initialization File

   The names of namespaces vary between languages. For example, "User:" in
   English is "Benutzer:" in German. You can
   specify lists of namespaces to use for particular languages in an
   initialization file (({~/.mwlinkrc})). Each entry is simply a line with the
   language, a colon, and a space-separated list of namespaces in that
   language. When interpreting links for that language (either because
   ((*--default-language*)) was specified or there is a language qualifier in
   the link), mwlink will recognize each listed name as a namespace. All the
   namespaces must appear on one line--line continuation is not supported.

   Lines introduced with (({#})) (pound sign) are comments and
   are ignored, along with blank lines.

   Here is an example configuration containing (only) some namespaces from the
   German Wikipedia. ((*Note*)): To be kind to the wiki when this script is
   uploaded, I have broken the line, but it ((*must not be broken*)) in order
   to work with mwlink.

      de: Spezial Spezial_diskussion Diskussion Benutzer Benutzer_diskussion
      Bild Bild_diskussion Einordnung Einordnung_diskussion Wikipedia
      Wikipedia_talk WP Hilf Hilf_diskussion

   = WARNINGS

   * The program (like mediawiki) assumes links are not broken across line
     boundaries.
   * The mechanism for providing an alternate list of namespaces only works
     per-language; other wikis could have different namespaces, too.
   * The list of wikis and their abbreviations is doubtlessly incomplete.
   * The initialization file mechanism is not that useful for a shared daemon.
   * In command-line mode, it's very difficult to process ASCII em-dashes (--)
     correctly and still honor command-line options. mwlink gets it wrong, and
     that's one reason daemon mode is preferred.

   = AUTHOR

   Demi @ Wikipedia - http://en.wikipedia.org/wiki/User:Demi

   =end

   require 'cgi'
   require 'iconv'
   require 'getoptlong'
   require 'webrick'
   include WEBrick

   $opt = {
      'default-wiki' => 'wikipedia',
      'default-language' => 'en',
      'encoding' => 'utf-8'
   }

   class String

      def initcap()
         new = self.dup
         # Okay, I consider it dumb that a string subscripted produces an
         # integer --Demi
         new[0] = new[0].chr.upcase
         return new
      end

      def initcap!()
         self[0] = self[0].chr.upcase
         return self
      end

   end

   class Canon

      def initialize()
         @ns = { }
         @ns_array = %w(Media Special Talk User User_talk Project Project_talk
            Image Image_talk MediaWiki MediaWiki_talk Template Template_talk Help
            Help_talk Category Category_talk Wikipedia Wikipedia_talk WP)
         @ns['default'] = { }
         @ns_array.each { |nspc| @ns['default'][nspc] = nspc }

         if File::readable?(ENV['HOME'] + '/.mwlinkrc')
            IO::foreach(ENV['HOME'] + '/.mwlinkrc') { |line|
               next if line =~ /^\s*\#/
               next if line =~ /^\s*$/
               line.chomp!
               if m = line.match(/^(\w+)\:(.*)$/)
                  lang    = m[1]
                  nslist  = m[2].split
                  @ns[lang] = { }
                  nslist.each { |nspc| @ns[lang][nspc] = nspc }
               end
            }
         end

         @wiki = {
            'Wiktionary' => 'wiktionary',
            'Wikt' => 'wiktionary',
            'W' => 'wikipedia',
            'M' => 'meta',
            'N' => 'news',
            'Q' => 'quote',
            'B' => 'books',
            'Meta' => 'meta',
            'Wikibooks' => 'books',
            'Commons' => 'commons',
            'Wikisource' => 'source'
         }

         @wikispec = {
            'wikipedia' => { 'domain' => 'wikipedia.org', 'lang' => 1 },
            'wiktionary' => { 'domain' => 'wiktionary.org', 'lang' => 1 },
            'meta' => { 'domain' => 'meta.wikimedia.org', 'lang' => 0 },
            'books' => { 'domain' => 'wikibooks.org', 'lang' => 1 },
            'commons' => { 'domain' => 'commons.wikimedia.org', 'lang' => 0 },
            'source' => { 'domain' => 'sources.wikimedia.org', 'lang' => 0 },
            'news' => { 'domain' => 'wikinews.org', 'lang' => 1 },
         }

         @cs = Iconv.new("iso-8859-1", $opt['encoding'])

      end

      #TODO The % part of the # section of the URL should become a dot.

      def urlencode(s)
         CGI::escape(s).gsub(/%3[Aa]/, ':').gsub(/%2[Ff]/, '/').gsub(/%23/, '#')
      end

      def canonword(word)
         s = word.strip.squeeze(' ').tr(' ', '_').initcap

         begin
            @cs.iconv(s)
         rescue Iconv::IllegalSequence
            s
         end
      end

      def parselink(link)
         l = {
            'namespace' => '',
            'language' => $opt['default-language'],
            'wiki' => $opt['default-wiki'],
            'title' => ''
         }
         terms = link.split(':')
         l['title'] = canonword(terms.pop)
         terms.each { |term|
            next if term.nil? or term.empty?

            t = canonword(term)

            if @ns[l['language']]
            then
               ns = @ns[l['language']]
            else
               ns = @ns['default']
            end

            if ns.key?(t)
               l['namespace'] = ns[t]
            elsif @wiki.key?(t)
               l['wiki'] = @wiki[t]
            else
               l['language'] = t.downcase
            end
         }

         l
      end

      def canonicalize(link)
         linkdesc = parselink(link.sub(/\|.*$/, ''))

         if @wikispec.key?(linkdesc['wiki'])
            ws = @wikispec[linkdesc['wiki']]
            host = ws['domain']
            if ws['lang'] != 0
               host = linkdesc['language'] + '.' + host
            end
         else
            host = linkdesc['wiki'] + '.' + 'wikimedia.org'
         end

         uri =
            if linkdesc['namespace'].length > 0
               linkdesc['namespace'] + ':' + linkdesc['title']
            else
               linkdesc['title']
            end

         r = urlencode('http://' + host + '/wiki/' + uri)
         r
      end

      def to_s()
         "Namespace sets: " + @ns.keys.join(', ') +
         "; Wikis: " + @wiki.to_a.join(', ')
      end
   end

   def linkexpand(c, bracketlink)
      linktext =
         if m = /\[\[([^\]]+)\]\]/.match(bracketlink)
            m[1]
         else
            bracketlink
         end

      bracketlink +
         " <" + c.canonicalize(linktext) + ">"
   end

   c = Canon.new()
   re = /\[\[\s*[^\s\\][^\]]+\]\]/

   class MwlinkServlet < HTTPServlet::AbstractServlet

      def initialize(server, canonicalizer)
         super(server)
         @c = canonicalizer
      end

      def do_GET(rq, rs)
         p = CGI.parse(rq.query_string)
         # Just for testing
         l = @c.canonicalize(p['page'][0])
         rs.status = 302
         rs['Location'] = l
         rs.body = "<html><body>\n" +
            "<a href=\"#{l}\">#{p['page'][0]}</a>\n" +
                     "</body></html>\n"
      end
   end

   begin
      GetoptLong::new(
         ['--default-wiki',     GetoptLong::REQUIRED_ARGUMENT],
         ['--default-language', GetoptLong::REQUIRED_ARGUMENT],
         ['--encoding',         GetoptLong::REQUIRED_ARGUMENT],
         ['--daemon',           GetoptLong::OPTIONAL_ARGUMENT]
      ).each do |k, v|
         k = k.sub(/^--/,'')

         case k

         when 'default-wiki', 'default-language', 'encoding'
            $opt[k] = v

         when 'daemon'
            $opt['daemon'] = true
            if v.empty?
               $opt['port'] = 4242
            else
               $opt['port'] = v
            end
         end
      end
   rescue GetoptLong::InvalidOption
      true
   end

   if $opt['daemon']

      port = $opt['port'].to_i

      puts "Starting daemon on port #{port}"
      s = HTTPServer.new(:Port => port)
      s.mount("/mwlink", MwlinkServlet, c)

      trap('INT') { s.shutdown }

      s.start

   else

      # Note, there are various combinations of -- appearing in normal text that
      # will break this. --daemon is the recommended method.
      if ARGV.empty?
         STDIN.each_line { |line|
            puts line.chomp.gsub(re) { |expr| linkexpand(c, expr) }
         }
      else
         puts ARGV.join(' ').gsub(re) { |expr| linkexpand(c, expr) }
      end

   end

Example output:

 [[Ashland (disambiguation)]] is an example of a
 [[Wikipedia:Disambiguation]] page.
 [[Ashland (disambiguation)]] <http://en.wikipedia.org/wiki/Ashland_%28disambiguation%29> is an example of a
 [[Wikipedia:Disambiguation]] <http://en.wikipedia.org/wiki/Wikipedia:Disambiguation> page.
 GET http://localhost:4242/mwlink?page=Ashland+%28disambiguation%29
 GET http://localhost:4242/mwlink?page=Ashland+%28disambiguation%29 --> 302 Found
 GET http://en.wikipedia.org/wiki/Ashland_%28disambiguation%29 --> ...(page content)

The GET program is a utility distributed with libwww-perl (LWP). Note also that Wikimedia servers forbid scripts based on the LWP Perl module.
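As a rough illustration of what the example output shows, the sketch below builds the same kind of URL from a link target: spaces become underscores and the rest is percent-encoded. It is not a port of the Ruby script. The name `titleToUrl` and the default host are assumptions made for this example, and interwiki and language prefixes are ignored.

```javascript
// Hypothetical helper, not part of mwlink: builds an article URL from a
// link target such as "Ashland (disambiguation)".
function titleToUrl(title, host) {
  host = host || 'en.wikipedia.org';           // assumed default wiki
  var path = title.trim().replace(/ /g, '_');  // spaces become underscores
  // encodeURIComponent leaves ( ) ! ' * alone, so escape those by hand,
  // then restore ":" so namespace prefixes stay readable.
  var encoded = encodeURIComponent(path)
    .replace(/[()!'*]/g, function (ch) {
      return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
    })
    .replace(/%3A/g, ':');
  return 'http://' + host + '/wiki/' + encoded;
}

console.log(titleToUrl('Ashland (disambiguation)'));
console.log(titleToUrl('Wikipedia:Disambiguation'));
```

The two calls use the link targets from the example output, so the printed URLs can be compared with the ones shown there.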

Recent changes scripts

mIRC

Here's a short script to open up all contributions from anons in browser windows (works well with tabbed browsing). CryptoDerk 23:04, Feb 15, 2005 (UTC)

Place the following in Tools->Scripts Editor under the remote tab.

ON $50:TEXT:/(http\S+) \* \d\d?\d?\.\d\d?\d?\.\d\d?\d?\.\d\d?\d? /iS:#en.wikipedia: run $regml(1)
alias F9 auser 50 *127.0.0.1
alias F11 ruser 50 *127.0.0.1

Now just sit in #en.wikipedia and hit F9 to begin. Hit F11 to stop. You can change F9/F11 to whatever function keys you like.
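If you want to see what the pattern in the ON TEXT line picks up before trusting it with your browser, it can be tried outside mIRC. The sketch below applies the same regular expression in JavaScript. The two feed lines are made up for the example (the exact format of the live feed is an assumption), and `anonDiffUrl` is a hypothetical name.

```javascript
// Same pattern as the ON TEXT line: a URL, " * ", then an IPv4-looking
// user name followed by a space.
var anonEdit = /(http\S+) \* \d\d?\d?\.\d\d?\d?\.\d\d?\d?\.\d\d?\d? /i;

// Returns the URL if the line looks like an anonymous edit, else null.
// This is the value mIRC hands to "run" as $regml(1).
function anonDiffUrl(line) {
  var m = anonEdit.exec(line);
  return m ? m[1] : null;
}

// Made-up sample lines; the real feed format may differ.
var samples = [
  '[[Example]] http://en.wikipedia.org/w/index.php?title=Example&diff=2&oldid=1 * 192.0.2.17 * (+5) test',
  '[[Example]] http://en.wikipedia.org/w/index.php?title=Example&diff=3&oldid=2 * SomeUser * (+1) fix'
];
samples.forEach(function (line) {
  console.log(anonDiffUrl(line));
});
```

Only the first line has an IP address in the user position, so only that one yields a URL.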

Also, for a frontend to the IRC live feeds, check out CryptoDerk's Vandal Fighter.

Unicode numeric converter scripts

Perl

An HTML character entity converter written in Perl that uses the ord() function to convert a character to its corresponding number in the character set. It operates on standard input.


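 #!/usr/bin/perl
 # Code is in the public domain.
 use strict;
 use warnings;

 # Assumes the input is UTF-8; decoding it makes ord() return code points
 # rather than individual byte values.
 binmode(STDIN, ':encoding(UTF-8)');

 # Read every line, not just the first one.
 while ( my $line = <STDIN> ) {
         foreach my $char ( split //, $line ) {
                 if ( ord($char) > 127 ) {
                         # Non-ASCII: print a numeric character reference.
                         print '&#' . ord($char) . ';';
                 } else {
                         print $char;
                 }
         }
 }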

AppleScript

Author: Olof

Notes: I got fed up with looking up Unicode characters, so I wrote an AppleScript for my favorite styled text editor (Style) to write them for me. Now, I can just type Japanese into a text edit window like this

小 泉 純 一 郎

select it, choose my script from the scripts menu, and it turns into

小 泉 純 一 郎 &#23567; &#27849; &#32020; &#19968; &#37070;

Which is what you can paste into the Wikipedia edit window. I'm using a Mac, so this is a Mac-only solution, but I thought I'd share it for those of you who can use it.

Compile the script below in Script Editor and save it in the "Style Scripts" folder, in the same folder where the Style application lives. I have mine saved as "Append Unicoded HTML". Thereafter it will appear in Style's Scripts menu.

Style is available at merzwaren ($20 shareware). I have this running on Mac OS X 10.2.1 with Style version 1.9.2.

Enjoy! Here's the script:

set theHTML to "" 
tell application "Style" 
    set selText to selection of document 1 as Unicode text
    set selStart to get offset of selection of document 1
    set selEnd to get offset of end of selection of document 1
    set dataLen to (selEnd - selStart)
end tell
set tempName to "unicode temp"

tell application "Finder" 
    if alias tempName exists then
        move alias tempName to the trash
    end if 
end tell 

set fileRef to open for access tempName with write permission
write selText to fileRef
set myRawData to read fileRef from 0 for dataLen
close access fileRef

tell application "Finder" 
    move alias tempName to the trash 
end tell 

set numChars to dataLen div 2 
repeat with n from 1 to numChars 
    set theHTML to theHTML & "&#" 
    set a to get character (2 * n - 1) of myRawData 
    set b to get character (2 * n) of myRawData
    set lVal to ((ASCII number b) + (256 * (ASCII number a))) 
    set theHTML to (theHTML & lVal as string) & "; "	 
end repeat 

tell application "Style" 
    set selection of document 1 to ((selection of document 1) & " (  " & theHTML & " ) ") 
end tell

JavaScript

Authors:

Known to work on:


Notes:

You may not need a script for converting CJK characters if you have a Mac running Mac OS X 10.2 and have Mozilla as your browser. Just do the editing from within Mozilla. Mozilla automatically does the conversion. For example, in adding this edit, I type in the Japanese characters for "edit," which are 編集. Mozilla automatically converted these characters to the proper romanized Unicode format. Just look at the above lines in the editing box to see for yourself. -User: IppikiOokami 5 September 2003

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Unicode conversion utility</title>
  </head>
  <body>
   <form name="charform">
     <p>Type here, and all characters with a code greater than 127 will be converted to &amp;#1234; codes.</p>
     <p>Input:</p>
     <textarea name="input" cols="80" rows="25" onKeyUp="revtxt()">
       Sorry, this page is useless with JavaScript disabled.
     </textarea>
     <p>Output:</p>
     <textarea name="output" cols="80" rows="25">
       Sorry, this page is useless with JavaScript disabled.
     </textarea>
    </form>
    <script type="text/javascript">
    <!--
      document.charform.input.value="";
      document.charform.output.value="Don't type here.";
      function revtxt() {
        var s=document.charform.input.value;
        var o="", c, m;
        for (m=0; m<s.length; ++m) {
          c=s.charCodeAt(m);
          if (c==38) o+="&amp;";            // escape literal ampersands
          else if (c<128) o+=s.charAt(m);   // plain ASCII passes through
          else o+="&#"+c+";";               // everything else becomes &#NNNN;
        }
        document.charform.output.value=o;
      }
    // -->
    </script>
  </body>
</html>
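One thing to watch with the page above: charCodeAt reports UTF-16 code units, so a character outside the Basic Multilingual Plane would come out as two surrogate references instead of one. A minimal sketch of a variant that works on whole code points follows. `toNumericRefs` is a hypothetical name, and it assumes a JavaScript engine that has Array.from and codePointAt.

```javascript
// Sketch only: toNumericRefs is a made-up name, not used by the page above.
function toNumericRefs(text) {
  var out = '';
  // Array.from splits a string by code point, so a surrogate pair stays
  // together as one element.
  Array.from(text).forEach(function (ch) {
    var cp = ch.codePointAt(0);
    if (ch === '&') out += '&amp;';       // escape literal ampersands
    else if (cp < 128) out += ch;         // plain ASCII passes through
    else out += '&#' + cp + ';';          // one reference per code point
  });
  return out;
}

// The first two characters of the AppleScript example, then U+1D11E
// written as an escaped surrogate pair.
console.log(toNumericRefs('\u5C0F\u6CC9 & \uD834\uDD1E'));
```

The surrogate pair in the sample string produces a single reference rather than two.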

Watchlist and undeletion select-all script

This script checks every checkbox on a Wikipedia page, which is useful for clearing large watchlists or restoring pages with large histories. It works in Firefox and Internet Explorer.

In Firefox, create a bookmark, with the following code in the "location" field. In Internet Explorer, create a favourite, and once it's created, right click it, select "properties" and place the following code in the "URL" box (it will give you an invalid protocol warning, but you can ignore this, and it will work).

javascript:(function () { var i, j, f; for (i=0; i<document.forms.length; i++) { for (j=0; j<document.forms[i].elements.length; j++) { f= document.forms[i].elements[j]; if (f.type == 'checkbox') f.checked= true; } } })(); void 0
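For anyone who would rather not decode the one-liner, here is the same loop written out. `checkAll` is a name made up for this sketch. It takes the document as a parameter so that it can be tried without a browser, and the `fakeDoc` object is only a stand-in for a real page.

```javascript
// Hypothetical, spelled-out version of the bookmarklet's loop.
function checkAll(doc) {
  var count = 0;
  for (var i = 0; i < doc.forms.length; i++) {
    var elements = doc.forms[i].elements;
    for (var j = 0; j < elements.length; j++) {
      if (elements[j].type === 'checkbox') {
        elements[j].checked = true;
        count++;
      }
    }
  }
  return count; // number of checkboxes found (all are now ticked)
}

// Stand-in for a page with one form; in a browser, pass `document` instead.
var fakeDoc = { forms: [ { elements: [
  { type: 'checkbox', checked: false },
  { type: 'text', value: 'untouched' },
  { type: 'checkbox', checked: true }
] } ] };
console.log(checkAll(fakeDoc));
```

Like the bookmarklet, it walks every form on the page, so checkboxes outside any form are left alone.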

Scripting requests

Wanted: WikiLinks code for Visual IRC
Wanted: Wikilinks code for KVIrc

Database scripting

There are a number of pages designed to help with Wikipedia maintenance and editing which are generated by copying the entire Wikipedia database to a machine and running a script or program on it.

Here you can request that someone with a recent copy of the database dump and the appropriate skills update the Specialpages and other pages requiring scripting.

See also: Wikipedia:SQL query requests.

I'm working up scripts to generate the equivalent of many of these from an offline copy of the database. Some are a bit amateurish right now and should be considered beta versions at best, but please cast your eyes over User:Topbanana/Reports if you're eager to get going. These will find their way into the Wikipedia namespace eventually; in the meantime I'll be delighted to hear feedback.