
Wikipedia:AutoWikiBrowser: Difference between revisions

From Wikipedia, the free encyclopedia
→‎Set options: @"Skip non-existing" put apostrophe in "don't"
Bluemoose (talk | contribs)

Revision as of 19:43, 30 November 2006

AutoWikiBrowser
Developer(s): AutoWikiBrowser project
Stable release: 3.0.6.0 / 2006-11-30
Preview release: SVN
Repository:
Operating system: Windows
Type: Wikipedia editor
License: GPL
Website: sourceforge.net/.../autowikibrowser

The AutoWikiBrowser is a semi-automated Wikipedia editor for Microsoft Windows 2000/XP (or newer) designed to make tedious repetitive tasks quicker and easier. It is essentially a browser that automatically opens up a new page when the last is saved. When set to do so, it suggests some changes (typically formatting) that are generally meant to be incidental to the main change.

At present, AutoWikiBrowser can create a list of articles from single or multiple categories, "what links here", the wiki links on an article, a text file, a Google search, a user's watchlist, or a user's contributions.

AWB also comes with a program to scan the database, and a development version of IRCMonitor, a program that monitors the IRC recent changes channel.

The sources are available under the GPL license (see Getting the sources below). AWB is written in C# using Microsoft Visual C# Express Edition, which is downloadable for free. There is an AWB IRC channel at chat.freenode.net (#AutoWikiBrowser).

Examples of AWB-assisted work are noted on the projects page, which also lists projects that currently need AWB help.

  1. ^ AWB does have an automatic mode enabled for some bot accounts, so it can be used as a bot, but it normally just assists a human.

Rules of use

  • Check every edit before you save it.
  • Don't edit too fast; consider opening a bot account if you are regularly making more than a few edits a minute.
  • Don't do anything controversial with it.
  • Avoid making insignificant minor edits such as only adding or removing some white space, moving a stub tag, converting some HTML to Unicode, removing underscores from links (unless they are bad links), or something equally trivial. This is because it wastes resources and clogs up watch lists.
  • Abide by all Wikipedia guidelines, policies and common practices.
Repeated abuse of these rules will result, without warning, in the software being disabled.

Versions

Using this software

Screenshot

(1) Register

Add your name to the requests for registration if you would like to use the software. For security reasons, only registered users (see the list on the check page) are able to use AutoWikiBrowser on the en.wikipedia.

Anyone can be registered, but only if an admin approves your registration by placing your name on the check page. As a general rule users with more than 500 mainspace edits will be registered. You will probably not be contacted when your registration has been approved, so check the page periodically for your name.

(2) Download

After you are registered, or if you just want to examine the software, you may download AutoWikiBrowser from SourceForge. Note: This is a development version!

AutoWikiBrowser requires Microsoft Windows 2000/XP or newer (Unicode doesn't work properly on Windows 98/Me). It also requires Version 2 of the .NET framework (download .NET framework).

If the software doesn't work, it probably means that you're not registered or that you don't have the correct .NET framework installed.

(3) Get started

  1. Select "Make from Category" then enter a category name.
  2. Click "Make list" and let the list load up.
  3. Set any options, such as find and replace, edit summary, etc.
  4. Click "Start!". AWB will load up the article, automatically make any changes and then go to the diff.
  5. Change anything you want in the article using the textbox on the lower right (not the normal website textbox in the browser), then click "Save" or "Ignore". The next page will load up automatically.

Having problems?

  • Occasionally it stalls when loading; just click "start" again to give it a nudge.
    • This might well be because you have navigated to a different window: AWB needs to remain in the foreground while loading up a new page.
  • It uses the Internet Explorer core, so if you have problems, make sure your IE is working. Make sure you have logged into Wikipedia using IE. If you have altered any settings regarding scripts, first use Tools > Internet Options > 'Advanced' tab > Click on 'Restore Defaults' and then try AWB again.
    • If you have made changes to monobook.css (or whatever your theme is) that require CSS 2 or 3, they may not appear properly in IE and thus in AWB.
  • If you are having problems creating a list from "what links here" try clearing your Internet Explorer cache.
  • A buggy monobook.js can often cause IE to display blank pages or crash AWB. This JavaScript problem can be avoided by disabling Active Scripting in IE under Tools > Internet Options > 'Security' tab > 'Custom Level'.
  • "The application failed to initialize properly..." -> get .NET 2.0 (linked above).

AWB User manual

This section will explain what all the bits and pieces do. Please help!

The basic process is this:

  1. Make AWB log in as a user.
  2. Make a list of pages to modify: build a list of pages based on a category, on links to a template, or manually.
  3. Set options: specify what to do to each page (clean it up, unicodify it, add a category, etc.).
  4. Start the process: AWB takes you through each page, previewing the planned changes and letting you make further changes, such as adding a template.

Login

  • AWB uses the Internet Explorer engine, and therefore the IE login. If you have different accounts, you'll have to log out of Wikipedia in IE and log back in as your other account. If you want to run AWB and make manual edits at the same time using two different accounts, use IE for your bot and make your manual edits in another browser such as Firefox or Opera.

"Make list"

  • Make from:
    • Note: Make lists from multiple pages by separating them with a pipe symbol, e.g. from a category, entering "Cats|Dogs|Fish" will get all the entries in Category:Cats, Category:Dogs and Category:Fish.
    • Category Gets a list of subcategories and articles from the category.
    • What links here Gets a list from the "what links here" of an article. To get the "what links here" of the articles "Cat" and "Dog" in one go, type "Cat|Dog".
      • Option: Inclusion only Same as What links here, but only gets pages that link by inclusion (i.e., {{PAGENAME}}).
    • Links on page Gets all the wiki-links from the given page, in all namespaces.
    • Text file Gets a list from a text file; the articles in the text should be [[Wiki linked]].
    • Google search Gets a list from a Google search of the wiki.
    • User contribs Gets articles edited by a specific user.
    • Special page Enter "Lonelypages" to get a list from Special:Lonelypages; you can also enter "Lonelypages&limit=500&offset=500" to get more results or to start from an offset.
    • Image file links Gets a list of articles that use the given image.
    • Database dump Opens the Dump Scanner program to scan the database dump (which needs to be downloaded, ~1.8 GB). See Wikipedia:Database download.
    • Watchlist Imports your watchlist (using the account you are logged into Internet Explorer with).
    • Wiki search Gets a list of pages from the wiki's internal search engine. Typically, Google search results are better, but Google rescans Wikipedia only around once per month and cannot search for specific wikisyntax.
  • Make list Makes a list based upon the given options.
  • Add Adds the item in the box to the list.
  • Remove Removes the selected item from the list.
  • Clear Clears the entire list.
  • Filter Filters the list by a selected list of namespaces or by inclusion of selected words. Can also exclude items that exist in another list and remove duplicates.
  • Save... Saves the current list to a text file.
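
The two input conventions described above (pipe-separated entries and text files containing [[Wiki linked]] titles) can be illustrated with a short Python sketch. This is not AWB's own code: the function names `split_sources` and `titles_from_text` are hypothetical, and dropping duplicate titles is an assumption made for the example.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#section]]; group 1 is the target.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def split_sources(entry, prefix=""):
    """Split a pipe-separated entry such as "Cats|Dogs|Fish" into page names."""
    return [prefix + name.strip() for name in entry.split("|") if name.strip()]


def titles_from_text(text):
    """Return the targets of all [[wiki links]] in text, in order of first appearance."""
    seen = []
    for match in WIKILINK.finditer(text):
        title = match.group(1).strip()
        if title and title not in seen:  # assumption: duplicates are dropped
            seen.append(title)
    return seen
```

For example, `split_sources("Cats|Dogs|Fish", prefix="Category:")` yields the three category names from the note above.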

"Set options"

  • Apply general fixes
    • Fixes common mistakes in "see also" and "external links" sections, removes excess white space.
    • Sorts interwiki links alphabetically (individually selectable in menu), and puts them at the bottom of the page with stubs.
    • Unicodifies interwiki links.
    • Removes duplicate interwikis and categories.
    • Puts categories after article body, followed by interwiki links and stub templates. Recognises some comments as cat and interwiki headers.
    • Adds bullet points to external links after the ==External links==.
    • Replaces italic and bold html markup with wiki markup.
    • Repairs bad links.
    • Simplifies links like [[Dog|Dog]] to [[Dog]].
    • Simplifies links like [[Dog|Dogs]] to [[Dog]]s.
    • Adds bold text to the first occurrence of the title of the article (if there is no other bold text).
    • De-links self referencing wiki-links.
  • Auto tag Appends {{Wikify}}, {{Uncategorised}} and {{stub}} tags when appropriate. Removes stub tags from long articles. Adds the date parameter to the by-date sorted templates.
  • Unicodify whole Article
  • Find and replace Enables multiple find and replacement. Can specify case sensitivity and Perl-style regular expression patterns. The keyword %%title%% represents the article title. For information on setting options such as multi or single line see here. See here for substitution syntax.
  • Advanced find and replace (Replace special) See [1].
  • Categorisation Add/Remove/Replace categories (replace only available when making a list from a category), enter the new category name minus the Category: prefix. When entering a category for "Add new category" use the keyword %%key%% to insert the reversed human name key, e.g. entering "Economists|%%key%%" might insert "[[Category:Economists|Smith, Adam]]".
  • Skip Articles
    • Case Sensitive
    • Are Regexes
    • Skip if contains/doesn't contain Skips articles that do or do not contain the given string/regex.
    • Skip articles with no changes Skips articles that AWB doesn't automatically change (i.e. by a "general fix", find and replace, etc.).
    • Skip non-existing pages Causes AWB to automatically skip pages that don't exist.
  • Skip Articles - More
    • Skip if no unicodification
    • Skip if no tag changed
    • Skip if no header error fixed
    • Skip if no titled boldened
    • Skip if no external link bulleted
    • Skip if no bad link fixed
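
As a rough illustration of the %%title%% and %%key%% keywords mentioned under "Find and replace" and "Categorisation", the Python sketch below expands them for a given article title. The helper names are hypothetical, and the rule used to build the key (drop a trailing parenthetical, then move the last word to the front) is an assumption; AWB's actual key logic may differ.

```python
import re


def human_name_key(title):
    """Build a sort key such as "Smith, Adam" from "Adam Smith" (assumed rule)."""
    # Assumption: a trailing disambiguator like " (economist)" is not part of the name.
    name = re.sub(r"\s*\([^)]*\)$", "", title).strip()
    first, sep, last = name.rpartition(" ")
    return f"{last}, {first}" if sep else name


def expand_keywords(template, title):
    """Replace the %%title%% and %%key%% keywords in template."""
    return (template.replace("%%title%%", title)
                    .replace("%%key%%", human_name_key(title)))
```

With this sketch, `"[[Category:" + expand_keywords("Economists|%%key%%", "Adam Smith") + "]]"` gives the category link from the Categorisation example.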

"More options"

  • Append message Appends the given text to the bottom of the page (talk pages only).
  • Auto-mode will make saves automatically at given interval, only for accounts registered in the Bots section of the checkpage.
  • Delay the delay in seconds before saving the page after loading (loading normally takes about an extra 8 seconds, or 3 seconds with quick save enabled).
  • Quick save when using auto-mode, avoids loading diff to save time/bandwidth/server load.
  • Suppress "using AWB" stops addition of "using AWB" to the edit summary, as registered bots do not need this.
  • Skip article when no typo fixed will skip articles that have no typo to be fixed

"Start"

  • Summary - the edit summary, either select one from the drop down, or enter your own text.
  • Article statistics - various statistics such as number of characters and images.
  • Alerts - displays alerts when, for example, the article is uncategorized, is long but tagged as a stub, etc.
  • Start the process - starts the process when you have a list of articles. (Shortcut key - control + S)
  • False adds article to a list of false positives, in a file called "False positives.txt"
  • Stop - stops the editing process. (Shortcut key - escape)
  • Preview - changes the view to preview (and updates any extra changes you made).
  • Show changes - changes the view to the diff (and updates any extra changes you made).
  • Move
  • Delete
  • Ignore - moves on to the next page without saving anything. (Shortcut key - control + D)
  • Save - saves the page, including any extra changes you made, then moves on to next article. (Shortcut key - control + S)
  • File
    • Save settings Saves settings to specified path.
    • Save as default Saves the settings to the default.xml, these will then be loaded automatically when AWB is opened
    • Load settings Loads settings from specified path.
    • Reset settings
    • Recent
    • User project and preferences
    • Log in
    • Exit Quits program.
  • List
    • Filter out non main space Removes all non-main space articles.
    • Filter
    • Convert to talk pages
    • Convert from talk pages
    • Sort alphabetically Sorts list alphabetically.
    • Save list to text file Saves list to text file (which can be used later on to create new list, as described above).
    • Launch Database Scanner Launches the Database Scanner, see here
    • Launch List comparer
  • General
    • Enable the Toolbar
    • Bypass redirects Instead of editing pages that redirect to another page, AWB edits the page to which it redirects.
    • Do not automatically apply changes No changes are made, instead you can use the "re-parse" option selectively.
    • Preview instead of diff Previews each article after changes made.
    • Mark all as minor Marks all edits as minor.
    • Add all to watchlist Adds all edited pages to user watchlist.
    • Show timer Shows timer in the lower right corner of the window so user can monitor interval between edits.
    • Sort interwiki links Sorts the interwiki links in same order as pywiki bots (if "Apply general fixes" is selected).
    • Enable button to log false positives
  • Advanced
    • Make Module
  • Help
    • Help Links to this page.
    • About... Shows about box containing version number, etc.

Edit box context menu

The edit box context menu is the menu that appears when you right-click inside the edit box.

  • WordWrap Wraps the text in the edit box at bottom-right.
  • Undo negates the last action.
  • Cut copies and then deletes the selected text.
  • Copy copies the selected text to the clipboard.
  • Paste pastes text from the clipboard to the selected area.
  • Paste more enter text into the textboxes, then double click one to paste its contents.
  • Select all selects all the text in the edit box.
  • Go to line enter the line number and hit return.
  • Insert... can:
    • Guess birth/death cats guesses the birth and death years of the article's subject and inserts the appropriate categories. (For biographical articles only.)
    • Meta-data template inserts the persondata template. (For biographical articles only.)
    • Human name category key
  • Insert tag inserts the tag selected from the submenu to the selected area of the article. If {{stub}} is selected, the user can optionally change the type of stub by typing into the box.
  • Convert list to
    • * List (Bullet pointed list)
    • # List (Numbered list)
  • Unicodify selected converts any HTML entities or URL encoded characters in the selected text to unicode.
  • Bypass all redirects
  • Fix all excess whitespace
  • Re-parse re-applies all the functions (general fixes, re-categorisation...).
  • Open page in browser opens the article in the default browser.
  • Open page history in browser opens the article history in the default browser.
  • Replace text with last edit

List box context menu

The list box context menu is the menu that appears when you right-click inside the list box.

  • Filter out non main space Removes all non-main space articles.
  • Filter Opens the advanced filter options.
  • Convert to talk pages Transforms the list into talk pages, e.g. "Cat" => "Talk:Cat".
  • Convert from talk pages Transforms the list from talk pages, e.g. "Talk:Cat" => "Cat".
  • Sort alphabetically Sorts list alphabetically.
  • Save list to text file Saves list to text file (which can be used later on to create new list, as described above.)
  • Add selected from list... When an item is selected, the following can be added to the list:
    • From category Adds the contents of a category when a category is selected.
    • From whatlinkshere Adds the articles that link to the selected article.
    • From links on page Adds the articles linked in the selected article.
    • From image links Adds the articles linked to an image when an image is selected.
  • Remove Removes the selected article.
  • Clear Clears the list.
  • Open article in browser Opens the article in your default browser.
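
The "Convert to talk pages" and "Convert from talk pages" transformations can be sketched in Python as follows. This is an illustration only: the `SUBJECT_NAMESPACES` set is an assumed, incomplete list of namespace names, and AWB's own handling of namespaces may differ.

```python
# Assumption: a small, illustrative set of non-main subject namespaces.
SUBJECT_NAMESPACES = {"User", "Wikipedia", "Image", "MediaWiki",
                      "Template", "Help", "Category", "Portal"}


def _is_talk_namespace(ns):
    """True for "Talk" and for "<Namespace> talk" of a known namespace."""
    return ns == "Talk" or (ns.endswith(" talk") and ns[:-5] in SUBJECT_NAMESPACES)


def to_talk(title):
    """Map a page title to its talk page, e.g. "Cat" => "Talk:Cat"."""
    ns, sep, rest = title.partition(":")
    if sep and ns in SUBJECT_NAMESPACES:
        return f"{ns} talk:{rest}"
    if sep and _is_talk_namespace(ns):
        return title  # already a talk page
    return "Talk:" + title  # main-space title, possibly containing a colon


def from_talk(title):
    """Map a talk page title back to its subject page, e.g. "Talk:Cat" => "Cat"."""
    ns, sep, rest = title.partition(":")
    if sep and ns == "Talk":
        return rest
    if sep and _is_talk_namespace(ns):
        return f"{ns[:-5]}:{rest}"
    return title  # not a talk page
```

Checking the prefix against a known namespace list keeps main-space titles that merely contain a colon, such as "Star Trek: Voyager", from being treated as namespaced pages.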

Database Scanner

AWB includes a database scanner which can be used to create lists of articles to be checked without placing unnecessary extra load on the Wikimedia servers.

Database dumps are created frequently (more info here) and are available for free download. As the page states, the most useful dump is pages_articles.xml.bz2. The database dump progress site lets you view the status of the current dump and browse easily to its downloads.

After downloading, the archive needs to be uncompressed; this turns it from a ~1.8 GB bz2 archive into an XML database dump of around 7.5 GB.
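
To give an idea of what scanning a dump involves, here is a minimal Python sketch that streams through a compressed dump and yields the titles of pages whose text matches a regular expression. It is not the Database Scanner's implementation: the function name `scan_dump` and its parameters are hypothetical, it reads the .bz2 file directly rather than an uncompressed copy, and it assumes the usual `<page>`, `<title>` and `<text>` elements of a MediaWiki XML export.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def scan_dump(path, pattern, limit=None):
    """Yield titles of pages in a .xml.bz2 dump whose text matches pattern."""
    regex = re.compile(pattern)
    found = 0
    title = None
    with bz2.open(path, "rb") as stream:
        # iterparse reads the file incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]  # strip the XML namespace, if any
            if tag == "title":
                title = elem.text
            elif tag == "text":
                if regex.search(elem.text or ""):
                    yield title
                    found += 1
                    if limit is not None and found >= limit:
                        return
            elif tag == "page":
                elem.clear()  # drop the finished page's children


# Hypothetical usage, assuming a local file named pages_articles.xml.bz2:
# for title in scan_dump("pages_articles.xml.bz2", r"\b[Tt]hier\b", limit=100):
#     print(title)
```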

Database Scanner User Manual

Coming soon.

  • File
    • Open XML dump opens a dialog box for you to browse to the database dump which you want to search
    • Save results list saves the list of articles found from the database dump
    • Reset settings sets all settings back to their defaults
    • Exit quits the program
  • Options
    • Ignore Redirects
    • Ignore image namespace
    • Ignore Category namespace
    • Ignore Wikipedia namespace
    • Ignore Template namespace
    • Ignore main namespace
    • Ignore <!-- commented out text -->
  • Other
    • Thread Priority
  • Help
    • About Shows about box containing version number, etc

"Text Matches"

  • Are regexes
  • Case sensitive
  • Singleline
  • Multiline
  • Article does contain
  • Does not contain
  • Characters
  • No. of links
  • No. of words

"Title"

  • Are regexes
  • Case sensitive
  • Title does contain
  • Title does not contain

"AWB Specific"

  • None will just list all the articles in the database dump
  • Has title AWB will embolden
  • Has links AWB will simplify allows you to search a DB dump for links that can be simplified, e.g. [[Dog|Dog]] to [[Dog]].
  • Has bad links AWB will fix
  • Has HTML entries
  • Section error
  • Unbulleted links will search a database dump for any articles that have external links which are not bullet pointed
  • Typo allows you to search a database dump for spelling mistakes, in the same way that AWB can when RegexTypoFix is enabled
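
The link simplification that "Has links AWB will simplify" looks for (the same fix listed under "Apply general fixes") can be approximated with a regular expression. The Python sketch below is an illustration, not AWB's code; the name `simplify_links` is hypothetical, and it assumes that only a trailing run of lowercase letters may be moved outside the link.

```python
import re

# A piped link: group 1 is the target, group 2 the displayed label.
PIPED_LINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")


def simplify_links(text):
    """Rewrite [[Dog|Dog]] as [[Dog]] and [[Dog|Dogs]] as [[Dog]]s."""
    def replace(match):
        target, label = match.group(1), match.group(2)
        if label == target:
            return f"[[{target}]]"
        suffix = label[len(target):]
        # Assumption: only lowercase letters directly after the target can trail the link.
        if label.startswith(target) and re.fullmatch(r"[a-z]+", suffix):
            return f"[[{target}]]{suffix}"
        return match.group(0)  # leave other piped links untouched
    return PIPED_LINK.sub(replace, text)
```

An article "has links AWB will simplify" in this sense whenever `simplify_links(text) != text`. The sketch compares target and label exactly, so it does not handle differences in first-letter case.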

"Get Results"

  • Start searches the selected database dump based on the settings set in other option boxes
  • Limit no. of Results limits the number of results that will be found and displayed from the database dump
  • Start from article starts from an entered article name. This is very useful if you do not have time to finish a DB dump scan: you can carry on from where you had reached before without having to start again
  • abc... puts the list of articles from the DB Dump into alphabetical order
  • Filter allows you to filter the results found from the DB Dump. The options are the same as for the normal AWB list filter
  • Clear clears the list of articles from the DB Dump

"Make wikified list from results"

  • Add headings every
  • #
  • *
  • A B C... headings
  • Make
  • Copy to clipboard
  • Save
  • Clear
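
The options above are not described in detail, so the following Python sketch shows only one plausible reading of them: each result becomes a "#" or "*" list item containing a wiki link, with either a heading per initial letter ("A B C... headings") or a heading every N items. The function name `wikify`, its parameters, and the heading text used for the every-N case are all assumptions.

```python
def wikify(titles, bullet="#", heading_every=0, alpha_headings=False):
    """Turn a list of titles into wiki markup with optional section headings."""
    lines = []
    current_letter = None
    for index, title in enumerate(titles):
        if alpha_headings:
            letter = title[:1].upper()
            if letter != current_letter:  # start a new A/B/C... section
                current_letter = letter
                lines.append(f"=={letter}==")
        elif heading_every and index % heading_every == 0:
            # Assumption: the heading shows the number of the first item below it.
            lines.append(f"=={index + 1}==")
        lines.append(f"{bullet} [[{title}]]")
    return "\n".join(lines)
```

The input is assumed to be sorted already when `alpha_headings` is used, otherwise the same letter heading may appear more than once.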

API

  • AWB ships with WikiFunctions.dll, which can be referenced by other standalone projects. The DLL includes a wiki-ready web browser control, a simple page editor, a listmaker, and other tools and components.
  • User:Kingboyk has made available WikiFunctions2.dll which currently offers wiki-logging features for bots.

Plugins

AWB is able to load and use fully customised plugins. These plugins can process article text and extend the user interface, and are in the form of libraries (.dll files) which can be made in any .NET language such as C# or Visual Basic .NET. When AWB loads it automatically checks to see if there are any plugins in the folder it was executed from. Any plugins found are loaded and initialised without further intervention by the user.

Tips and tricks

  • To find and replace a word in both upper and lower case, do a regular expression find and replace; for example, find: "\b(T|t)hier\b" and replace with: "$1heir". The "(T|t)" matches upper or lower case "t", and the "$1" references whatever "(T|t)" matched. The "\b" means it is on a word boundary; this stops it matching words that correctly contain "thier".
  • To speed up a task, if you are correcting the above typo, set it to "Skip if doesn't contain" the typo(s) that is being corrected.
  • See this website for a breakdown of .NET regular expression syntax.
  • Turning off "Show pictures" in Internet Explorer options can speed up page loading times, especially when the Wikipedia servers are responding slowly. Also, editors who do not normally use Internet Explorer yet use a custom monobook.js JavaScript (godmode-light, popups, etc.) for other browsers may see better page load performance by disabling "Active Scripting" in Internet Explorer security settings. Note: Those who manually update Windows will need to enable Active Scripting when manually checking for updates. The Windows update page will mention this if it is disabled. You can create a custom security level such that Active Scripting is disabled for Wikipedia, but not for other websites.
  • See Wikipedia:AutoWikiBrowser/Settings for a list of useful settings you can use with AWB.
  • A newline is represented by \r\n when doing find and replace.
  • AWB has some keywords that can be used in the textboxes/find-and-replace dialog. %%title%% represents the title of the current article (e.g. "John Smith"), and %%key%% will give you the human name category key for the current article (e.g. "Smith, John"). Other keywords can be implemented on request.
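
The "thier" tip above can be tried outside AWB. The sketch below uses Python's `re` module, where the backreference in the replacement is written `\1` rather than the .NET-style `$1` that AWB expects; the function name `fix_thier` is hypothetical, and the early return mimics the "Skip if doesn't contain" option.

```python
import re

# Same pattern as in the tip: word boundary, "T" or "t", then "hier".
TYPO = re.compile(r"\b(T|t)hier\b")


def fix_thier(text):
    """Return (new_text, changed), skipping text that lacks the typo."""
    if not TYPO.search(text):  # comparable to "Skip if doesn't contain"
        return text, False
    # \1 re-inserts whichever case of "t" was matched.
    return TYPO.sub(r"\1heir", text), True
```

Because of the word boundaries, a name such as "Gauthier" is left alone, while "Thier" and "thier" become "Their" and "their".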

Getting the sources

AutoWikiBrowser is licensed under the GPL (see the license file).

.NET

AutoWikiBrowser is written in .NET. Version 3.5 or earlier is required to compile AWB. .NET 3.0 is included on Windows Vista. .NET 3.5 is included on Windows 7. .NET version 4.0 and greater will work, but will require a slight modification to AWB's code and the installation of Windows SDK for Windows 7 or Windows SDK for Windows 8.

Download source

To get the sources run the command  svn checkout https://svn.code.sf.net/p/autowikibrowser/code/AWB/ .  You'll need network access to SourceForge and its SVN server. If that doesn't work you probably need an SVN client:

  1. Download and install TortoiseSVN. It is the recommended SVN client program.
  2. Create a folder AWB (or whatever name you prefer) on your computer.
  3. Right click on the folder, and select "SVN Checkout...".
  4. In the dialog window that appears (titled "Checkout") enter  http://svn.code.sf.net/p/autowikibrowser/code/AWB/  for the field URL of Repository. (This is read only access, the read-write URLs are different)
  5. Click OK

This is a MB download with ~1,200 files and folders from the SourceForge SVN server at http://svn.code.sf.net/p/autowikibrowser/code/AWB/.

Please note that you can contribute features to AWB and fix bugs in AWB. Read access is anonymous, but if you register as a developer of AWB, SourceForge sends a URL with write access.

Compile source

You will now need to compile the code yourself. You will need a copy of a third-party C# IDE such as SharpDevelop (free), Microsoft Visual Studio Community 2019 or lower version (free) or a more complete version of the Visual Studio 2019 suite, such as Professional (cost). Alternatively, you can use newer versions of Visual Studio if you are willing to allow it to make modifications to your copy of the source.

SharpDevelop

  1. Download and install the latest version of SharpDevelop
  2. Click on File -> Open -> Project/Solution. Open the "AutoWikiBrowser no plugins" solution file.
  3. Press F8 to build AWB. The AWB executable will be placed in ...\AWB\AWB\bin\debug. Copy AutoWikiBrowser.exe, Newtonsoft.Json.dll, WikiFunctions.dll and Interop.mshtml.dll from the folder to where you run AWB from.

Visual Studio

If you already have Visual Studio 2019 or the latest release of 2022, with the ".NET Desktop Development" option installed, they are known to build AutoWikiBrowser correctly; go to step 3.

  1. Download and install the current version of Visual Studio Community.
  2. While configuring options during the installation, select at least ".NET Desktop Development".
  3. Browse to the top-level folder, and open the Visual Studio AutoWikiBrowser solution file.
  4. When the IDE has loaded, select release rather than debug (next to the green forward arrow). On the solution explorer on the right hand side, right click on the solution, and select build solution. Visual Studio will now turn the source files into the required files to run AWB.
  5. Back in the AWB folder, browse to AWB\bin\Release, and copy AutoWikiBrowser.exe and WikiFunctions.dll from the Release folder to where you run AWB from.

MonoDevelop (Linux)

  • Use the "AutoWikiBrowser no plugins" solution file.
  • perl is required for the pre-build event to replace SVN revision number and date.

.NET 4.0/4.5 error

If you get an error while compiling saying something similar to, "Error loading code-completion information for Microsoft.mshtml from Microsoft.mshtml: Could not find assembly file.", this means you do not have .NET 3.5 or earlier installed. AWB can still be compiled, but will require a slight code change. It is recommended you install .NET 3.5, but if you cannot, see the talk page for further help.

Changes to AWB code

You can view all SVN changes in one of two ways:

  1. Browse Commit
  2. Right click on the folder that contains the source code, then go to TortoiseSVN -> Show log.

Update code from SVN

When new sources become available, execute "SVN Update" from the context menu of your AWB folder, then recompile the source.

You are not automatically notified of new versions; if you use AWB on a regular basis, and want to use the source version, check this daily, and build a new release version if there are changes. That way you are up to date with all bug fixes and new features.

See also