Wikipedia talk:AutoWikiBrowser


This is the discussion page for the AWB project. It is also the place to discuss using the AWB program itself (if you need help, or have a question about AWB, etc.). Before asking questions, please read the Frequently asked questions below. The Before you post section below explains where to make specific types of reports or requests.

Before you post

Do you want to ... Please use
Report a bug in AWB? Wikipedia talk:AutoWikiBrowser/Bugs
Report an incorrectly fixed typo? Wikipedia talk:AutoWikiBrowser/Typos
Request a feature for a future version of AWB? Wikipedia talk:AutoWikiBrowser/Feature requests
Request approval to use AWB? Wikipedia talk:AutoWikiBrowser/CheckPage
Ask a question about AWB or ask for help? This page

Frequently asked questions

  • When I start it up I get one of the following errors:
    "The application failed to initialize properly (0xc0000135). Click on OK to terminate the application.", or
    "To run this application, you must first install one of the following versions of the .NET Framework..."
    This error means your computer does not have version 2 of the .NET Framework installed properly. You can choose from various versions for download here, or you can run Microsoft Update and select version 2 of the .NET Framework from the "Optional Updates" section, if you want the choice made for you.
  • Does AWB run on Linux or Mac?
    Linux, a qualified yes. Mac, not natively but via virtualisation.
    AWB runs reasonably under Wine. See details here
    AWB might also run under Mono.
    A native version, PyAutoWikiBrowser (screenshots here), based on Python, is being developed separately for Unix-like systems.
    On the Mac, AWB is not natively available, but one option is to use virtualisation with Parallels Desktop for Mac (subject to meeting the supported operating system requirements) and then run AWB inside a virtual Microsoft Windows installation, following the Windows instructions above. Note that this option is not free, as a license is required for both Parallels Desktop for Mac and Microsoft Windows. An alternative virtualisation method is to use the free VirtualBox; see also Comparison of platform virtual machines.
  • Does AWB work on other projects/languages?
    Many Wikimedia projects and languages are supported; see the "User and project preferences" option in the general menu. Other languages will be added on request, though at the moment the interface is always in English. You can also use AWB with third-party wikis: change the wiki under Options > Preferences > Site. The wiki must support the Bot API required by AWB. This means that it should run the latest HEAD version of MediaWiki or something close to that. The wmf-deployment branch is also recommended, as this is what is currently live on WMF sites.
  • What interwiki link order does AWB use?
    The software reads the interwiki sort order from Wikipedia:AutoWikiBrowser/IW, which is generally mirrored to reflect the order at m:Interwiki sorting order.
  • AWB puts stubs after categories, though categories are always rendered last by MediaWiki?
    According to WP:STUB#Categorizing stubs, "By convention they are placed at the end of the article, after the External links section, any navigation templates, and the category tags, so that the stub category will appear last." If your wiki uses another order, please let us know here.
  • I don't like or use Internet Explorer; please use Firefox instead.
    AWB does not use Internet Explorer per se. It does, however, use the same web browser control (MSHTML) as Internet Explorer; the equivalent Firefox component does not provide the needed functionality.
  • How do I open the page in another browser if I can't use the one in AWB?
    Right click on the edit box in the bottom right side of your screen. Select "Open page in browser".
  • How do I edit a page that doesn't exist?
    Uncheck "Ignore non existing pages" in the "Skip articles" box.
  • How do I skip certain articles?
    Use the "Skip if contains" and "Skip if doesn't contain" on the "Skip" tab
  • Can't you leave up a "stable" version, so I don't have to download new versions?
    It is important to keep people up to date with the latest versions, because their use of the software doesn't just affect them, but the whole of Wikipedia. Any bugs that remain should be trivial, so hopefully releases won't be too frequent.
  • How can I stop AWB clicking when it changes pages?
    This is a Windows sound theme setting. This page explains how to turn off the clicking sound.
    Alternatively, delete the following key from the Windows registry:
    HKEY_CURRENT_USER\AppEvents\Schemes\Apps\Explorer\Navigating\.Current
  • AWB randomly crashes upon page load on my system, and I always use a browser other than Internet Explorer when using Wikipedia.
    You may have installed custom scripts incompatible with IE. Wrap the contents of your monobook.js in a conditional:
               // Run the scripts only when the browser is not Internet Explorer
               if (navigator.appVersion.indexOf("MSIE") == -1)
               {
                   // Previous contents go here
                   // ....
               }
  • I get Just In Time Debugger Messages when loading AWB/loading pages.
    In Internet Explorer, go to Tools --> Options --> Advanced. Make sure 'Disable Script Debugging (Internet Explorer)' and 'Disable Script Debugging (Other)' are both checked. Press Apply and close.
  • Why does AWB run very, very slowly if I try to make changes in the edit window on larger pages, especially pages with long lists or tables?
    If running on Windows, exit the Speech Recognition software that is built into some versions of Windows; don't just turn it 'Off', you must 'Exit' the software if you have started up Speech Recognition.
  • When I do a clean install of AutoWikiBrowser the application seems to find old setting data somewhere. I'd like to do a really clean install. Any ideas?
    Clean up your registry and remove the folder "C:\Documents and Settings\user name\Local Settings\Application Data\AutoWikiBrowser" (Windows XP) or "C:\Users\user name\AppData\Local\AutoWikiBrowser\" (Windows 7). Note that the application data folder may be hidden.
  • AWB prompts that there is a newer version but won't update
    Check the version number of your AWBUpdater.exe. The current version is 2.0.2.0. If you have an older version, you have to download the latest AWB version and make a clean install.
  • Which .NET Framework version do I have?
    You can find your .NET Framework version in the Help --> About box.
  • Where are the default settings stored?
    • Windows XP: C:\Documents and Settings\[username]\Local Settings\Application Data\AutoWikiBrowser
    • Windows Vista and Windows 7: C:\Users\[username]\AppData\Local\AutoWikiBrowser\Default.xml
  • I cannot copy text from the diff window using the Control+C keyboard shortcut.
    AWB needs Microsoft.mshtml.dll to be available for this functionality to work. You can try downloading the file (there are a number of third party websites offering DLL file downloads) and putting it in the same folder as AutoWikiBrowser.exe. This is reported not to work for all users, presumably due to .NET Framework problems.
  • Is there any way to set AWB to not use https? (GFW blocks 443 port)
    In preferences, set project to "custom". Set the left box to http. In the webpage box, type en.wikipedia.org/w/ (English Wikipedia) or zh.wikipedia.org/w/ (Chinese Wikipedia). Note that leaving off the /w/ will result in a "root element missing" error.

Discussion

This talk page is automatically archived by MiszaBot I. Any sections older than 7 days are automatically archived to Wikipedia talk:AutoWikiBrowser/Archive 27. Sections without timestamps are not archived.

bug in template redirects

Resolved

Somehow {{refimprove}} is being changed to {{BLP sources}}. Reported to me because of this edit and confirmed with User:Bgwhite/Sandbox. I can't see what is causing this on the template redirect page. Bgwhite (talk) 21:33, 16 October 2014 (UTC)

I believe {{BLP sources}} is what goes on BLPs rather than {{refimprove}} (likely this is recommended somewhere and thus AWB replaces it). The bug appears to be a duplication and then putting the tags in {{Multiple issues}}. Stevie is the man! TalkWork 21:40, 16 October 2014 (UTC)
Yes, the problem is the duplication. Someone needs to check all these refimprove tags. There are many cases to be checked. I pin Rjwilmsi for this one. -- Magioladitis (talk) 22:08, 16 October 2014 (UTC)
This seems to be an undocumented general fix. Maybe done in Template conversions? Using AWB's List compare, I found 348 articles with both {{refimprove}} and {{BLP sources}}. There are potentially multiple ways that this could be resolved:
  1. Change AWB behavior to remove {{refimprove}} if {{BLP sources}} already exists.
  2. Change AWB behavior to remove duplicate maintenance templates within {{Multiple issues}} (after conversions but before checking if {{Multiple issues}} should be removed)
  3. Leave AWB behavior as is, and I could create a bot to do #1.
Thoughts? GoingBatty (talk) 01:06, 17 October 2014 (UTC)
It's documented at Conversions. In AWB we could remove {{refimprove}} if {{BLP sources}} already exists. Rjwilmsi 09:35, 17 October 2014 (UTC)
@Rjwilmsi: I'm sorry, but I don't see {{refimprove}} or {{BLP sources}} in Conversions. GoingBatty (talk) 04:03, 18 October 2014 (UTC)

GoingBatty what about now? Lol. -- Magioladitis (talk) 08:27, 18 October 2014 (UTC)

@Magioladitis: Yes, I see it now. Thanks for adding it - I didn't add it myself because I wasn't 100% sure it was part of the Conversions code. GoingBatty (talk) 20:34, 18 October 2014 (UTC)
@Magioladitis: How was this resolved to ensure that AWB doesn't leave duplicate {{BLP sources}} templates on an article? Thanks! GoingBatty (talk) 14:32, 19 October 2014 (UTC)

GoingBatty rev 10481 -- Magioladitis (talk) 13:25, 20 October 2014 (UTC)
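Option 1 from the list above (remove {{refimprove}} when {{BLP sources}} already exists) could look roughly like the Python sketch below. This is not AWB's code and says nothing about what rev 10481 actually changed; the function name and both patterns are hypothetical, and template redirects are not handled.

```python
import re

# Hypothetical patterns, not taken from AWB. Matching is case-insensitive as a
# simplification; MediaWiki itself only ignores case on the first letter.
BLP_SOURCES = re.compile(r"\{\{\s*BLP sources\s*(\|[^{}]*)?\}\}", re.IGNORECASE)
REFIMPROVE = re.compile(r"\{\{\s*refimprove\s*(\|[^{}]*)?\}\}\n?", re.IGNORECASE)


def drop_redundant_refimprove(wikitext: str) -> str:
    """Remove every {{refimprove}} tag, but only if {{BLP sources}} is present."""
    if BLP_SOURCES.search(wikitext):
        return REFIMPROVE.sub("", wikitext)
    return wikitext
```

An article carrying only {{refimprove}} is returned unchanged, so the sketch never strips the last sourcing tag from a page.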

Others using AWB are identifying typos my AWB installation doesn't find

I seem to keep running into situations where I have run AWB on an article, with no typos being fixed, then someone else drops in with AWB and fixes a typo. Are there additional typo lists available to some users that aren't available to those using the built-in RegEx typo lists? Stevie is the man! TalkWork 17:31, 17 October 2014 (UTC)

@Stevietheman: On the Options tab, check the "Find and replace" box, then click the "Normal Settings" box and add your own Find & replace rules. You can even copy rules from the Typo list into your normal "settings" rules, where you can choose to apply them (if you're good and careful) in sections of an article that are off-limits to the normal Typo fixing (for example, in image captions and the "Short-summary" parameters of a list of TV episodes – that's where you can find lots of errors that most AWB users skim right over). It is also possible to maintain your own full list of Typo rules to use instead of the ones at WP:AWB/T, though I have never tried that. If the other editor's edit summary does not contain "typo(s) fixed:", they are probably running their own F&R rules. Chris the speller yack 18:17, 17 October 2014 (UTC)
Their summaries contain "typo(s) fixed", so that's why I thought there were more typos I could check beyond the built-in list. I'm already aware of F&R, but I'd rather have the extra typos list these others are using, as I don't know in advance which typos to look for. Does anyone share such lists? Stevie is the man! TalkWork 18:22, 17 October 2014 (UTC)
@Stevietheman: Some of my typo-fixing settings files are linked from the top of the section User:John of Reading/Typo fixing with AutoWikiBrowser#Common misspellings. Disclaimers: some of these settings files are years old, some contain rules that have significant numbers of false positives, and so on. Use them with care! -- John of Reading (talk) 08:23, 18 October 2014 (UTC)
Thanks. I ran the B and C settings against 5,500+ articles in a project and two legitimate typos were found. This is better than zero, of course, but it made me realize that I'm probably not missing too much from just sticking to the conservative list of AWB typos. Stevie is the man! TalkWork 14:24, 19 October 2014 (UTC)

Avoiding removal of stub tags

I can use {{Not a typo}} around words to avoid them being corrected by AWB and other processes. Is there something similar that can be used around stubs so that they won't be removed? There are some articles which are stubs in terms of the amount/quality of prose in them but AWB decides they're not stubs because of the number of words (prose & non-prose) in them. Stevie is the man! TalkWork 18:30, 17 October 2014 (UTC)

I guess I could also write a module that prevents stub removal, but that only affects my use of AWB. Stevie is the man! TalkWork 18:32, 17 October 2014 (UTC)

Stevie is the man! Where is the page in question? WP:STUB applies to pages with very few words. Consider using {{expand section}}. -- Magioladitis (talk) 18:41, 17 October 2014 (UTC)
After considering your thinking/workaround, I'm not sure I have a good example at this point. I suppose I could reorient how I deal with the few articles my question applies to, by using tags other than stubs. But I'm still wondering: Are non-prose words (lists, tables) counted toward determining stub status? Stevie is the man! TalkWork 18:55, 17 October 2014 (UTC)
Stevie is the man! in fact 2 words in a list are counted as 1 prose word. I think this is fair. -- Magioladitis (talk) 19:02, 17 October 2014 (UTC)
Per our instructions, AWB "removes {{stub}} if article has more than 500 words (comments, categories and persondata are excluded from word count). Words in bulleted text are divided by 2 to avoid destubbing pages with big lists and little text." Any ideas for improvements are welcome. -- Magioladitis (talk) 19:04, 17 October 2014 (UTC)
(Responding to first comment) By what standard is that considered fair? I'm curious. On an article set as a "list article", I can see how counting words 1 for 1 makes sense, but on a standard article, I can see how text in lists/tables wouldn't be counted at all in favor of prose. I think this is fair because, after all, prose is the expectation, not lists. Stevie is the man! TalkWork 19:08, 17 October 2014 (UTC)
My ideas for improvement are 1) counting words 1-for-1 in list articles, and 2) not counting any words in lists/tables in non-list articles. Stevie is the man! TalkWork 19:14, 17 October 2014 (UTC)
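For anyone weighing these proposals, the rule quoted from the instructions (more than 500 words, with words in bulleted text divided by 2) can be sketched in Python as below. This is an illustration of the quoted rule only, not AWB's actual implementation: the function names are hypothetical, persondata is not stripped, and tables are counted as ordinary text.

```python
import re

STUB_WORD_LIMIT = 500  # threshold quoted from the AWB instructions above


def weighted_word_count(wikitext: str) -> float:
    """Count words, giving words on bulleted lines half weight."""
    # Comments and categories are excluded from the count, as the rule states.
    text = re.sub(r"<!--.*?-->", "", wikitext, flags=re.DOTALL)
    text = re.sub(r"\[\[Category:[^\]]*\]\]", "", text)
    total = 0.0
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("*"):
            # Drop the bullet markers themselves, then halve the word count.
            total += len(stripped.lstrip("* ").split()) / 2
        else:
            total += len(stripped.split())
    return total


def should_remove_stub_tag(wikitext: str) -> bool:
    """True when the weighted count exceeds the quoted 500-word limit."""
    return weighted_word_count(wikitext) > STUB_WORD_LIMIT
```

Under this sketch, the second proposal above (ignoring list words in non-list articles) would amount to changing the divisor branch so that bulleted lines add nothing.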

The new text search (CirrusSearch) – you may want to get ready before it gets here

Right now (on EN, ZH, FR and DE Wikipedias) AWB is able to use either the existing text search (LuceneSearch) or the new text search (CirrusSearch). Probably in the next month or so, it will only be able to use the new text search (Cirrus), which will be made the default text search method. Once this happens, for many months you will still be able to use the old text search from your browser, but not from AWB. In your browser, go to Preferences – Beta features and check the box for "New search" to use Cirrus. There is sometimes a lag of several minutes before AWB will notice that your choice has changed. You can use new features of Cirrus with AWB at that point. If you launch AWB and do a text search before logging on or loading a settings file, it will still use Lucene. Once you log on or load a settings file, it will use the search method that you specified in your Beta preferences.

There are some good features in Cirrus, and a few drawbacks.

  • The index for Cirrus is usually updated within seconds, unlike Lucene's daily update. If you fix all occurrences of "nucklehead", you can catch new misspellings a minute later.
  • You can find a phrase that has intervening words: "short-lived airline" will find just that phrase, but append "~" and a number to cast a wider net: "short-lived airline"~1 will also find "short-lived British airline", while "short-lived airline"~2 will find all of those plus "short-lived South African airline".
  • (bad and good) Cirrus does not pay any attention to hyphens, whereas Lucene does; Lucene can find "well-known for". With Cirrus you will have to do a normal search that has a RegEx-style "insource" search appended to it;   "well-known for" insource:/[Ww]ell-known for /   (the first half of that search provides a rough cut that includes "well-known for" and "well known for"; the second part fine-tunes those results with a RegEx-style search of the source itself). The real bad part: only one of these RegEx-style searches can run on Wikipedia at a time. A few can be queued up, but if too many are queued up, the search will be rejected. Try some of these from your browser to get a feel for them. If the queue is full, AWB will just return no results, with no explanation. How fast it comes back without results may give you some indication. Using a RegEx-style search this way also allows you to perform case-sensitive searches and searches for blanks where there should be hyphens.
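To see what the second half of that search adds, the same pattern can be tried locally. The Python sketch below is only an approximation: Cirrus uses its own regex dialect, not Python's `re`, but this particular pattern uses nothing beyond a character class and literal text. The sample sentences and the function name are made up for illustration.

```python
import re

# The pattern from the insource:/.../ example above, trailing space included.
HYPHENATED = re.compile(r"[Ww]ell-known for ")


def keep_hyphenated(snippets):
    """Keep only snippets containing the hyphenated, correctly cased phrase."""
    return [s for s in snippets if HYPHENATED.search(s)]


samples = [
    "He is well-known for chess.",    # kept: hyphenated
    "She is well known for art.",     # dropped: no hyphen
    "Well-known for speed, it won.",  # kept: capital W is allowed
    "WELL-KNOWN FOR nothing",         # dropped: the match is case-sensitive
]
hits = keep_hyphenated(samples)
```

The plain quoted search plays the role of `samples` here (a rough cut that ignores hyphens), and the regex does the fine filtering.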

To jump quickly from one search method to another in your browser, add

&srbackend=LuceneSearch

or

&srbackend=CirrusSearch

to the URL.

Once Cirrus is made the default method, adding &srbackend=LuceneSearch to the URL from your browser will be the only way to use that search method. AWB will be unable to use Lucene for searching text.
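If you switch back and forth often, the URL can be built rather than edited by hand. This is a minimal Python sketch; the `/w/index.php` path and the `search` parameter are assumptions about the usual MediaWiki search URL, and the function name is hypothetical. Only `srbackend` and its two values come from the text above.

```python
from urllib.parse import urlencode


def search_url(query: str, backend: str, host: str = "en.wikipedia.org") -> str:
    """Build a search URL that forces the given search backend."""
    # backend is expected to be "LuceneSearch" or "CirrusSearch".
    params = {"search": query, "srbackend": backend}
    return f"https://{host}/w/index.php?" + urlencode(params)


lucene = search_url("nucklehead", "LuceneSearch")
cirrus = search_url('"short-lived airline"~1', "CirrusSearch")
```

`urlencode` takes care of percent-encoding the quotation marks and spaces in phrase searches.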

I will now entertain questions. Chris the speller yack 21:20, 18 October 2014 (UTC)

Ref bot

I think a separate bot for duplicate references being combined would be a great and important bot for many users. Just a thought.--BabbaQ (talk) 15:13, 19 October 2014 (UTC)

@BabbaQ: You may want to post this on WP:Bot requests. How do you propose the bot could generate a list of articles to edit? GoingBatty (talk) 15:18, 19 October 2014 (UTC)
I have made a post about this on the Requests. Thanks.--BabbaQ (talk) 15:27, 19 October 2014 (UTC)

AWB and custom modules

Because settings files are xml files, AWB treats text inside <Code>...</Code> as just more xml. Is this a correct thing to be doing? File -> Open settings... chokes and dies if it finds what it thinks might be the start of an xml tag: <. I had thought to put a custom module in a settings file so that all I would need to do after loading the settings file would be to Make module instead of the copy/paste then Make module.

I had thought that I could do this:

    <Code>//<!--        public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
        {
            Skip = false;
            Summary = "test";
 
            ArticleText = "test \r\n\r\n" + ArticleText;
 
            return ArticleText;
        }//--></Code>

But that method gives this error message:

System.Xml.XmlException: Unexpected node type Comment. ReadElementString method can only be called on elements with simple or empty content.

Is there a way to get AWB to treat the content of <Code>...</Code> as literal text, handled in the same manner as the text that I paste into the Make module text area?

Even better, is there or can there be a filename field at Make module so that when AWB loads a settings file, it will automatically fetch (and build?) the custom module source file?

Trappist the monk (talk) 15:05, 23 October 2014 (UTC)
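One general XML approach to the problem described above is to entity-escape the module source before it goes into the element, so that the parser sees plain text rather than a comment or a tag. The Python sketch below shows only that mechanism; whether AWB's settings loader accepts escaped text inside <Code> is an untested assumption, and the sample module line and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical C# fragment containing characters that are special in XML.
module_source = 'if (ArticleText.Length < 10 && !Skip) { Summary = "short"; }'


def wrap_in_code_element(source: str) -> str:
    """Escape &, < and > so the source is valid text content of <Code>."""
    return "<Code>" + escape(source) + "</Code>"


def read_code_element(xml_fragment: str) -> str:
    """Parse the fragment and return the element text with entities resolved."""
    return ET.fromstring(xml_fragment).text or ""


fragment = wrap_in_code_element(module_source)
round_tripped = read_code_element(fragment)
```

A standard XML parser hands back the original source unchanged, which is the behaviour the question asks for; the comment wrapper in the attempt above fails because it turns the element content into a Comment node instead of text.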

Is there a way to protect a page or section from AWB?

The section Sacred Harp#Origins of the music contains a list with complex formatting (multiple paragraphs and images within a list item) that can't be accomplished with standard wiki markup, so it uses html list markup instead. An AWB edit recently attempted to convert this to wiki markup but failed to preserve the formatting and left a vestigial HTML tag. I have two questions:

  1. Was this mistake an AWB bug or a user error?
  2. Is there a magic word or some other tag that can be applied to this passage so that AWB excludes it and doesn't attempt to alter it in the future? Ibadibam (talk) 20:27, 23 October 2014 (UTC)

Thanks. Ibadibam (talk) 20:27, 23 October 2014 (UTC)

Ibadibam, I made the edit. It was a completely manual edit and not AWB. You should always contact the editor first instead of doing a run-around. Leave a message at the article's talk page and ping me. Bgwhite (talk) 20:49, 23 October 2014 (UTC)
Thanks, Bgwhite. I'm not an AWB user so I don't have any understanding of the process, or why a manual edit would include an AWB clause. Ibadibam (talk) 20:54, 23 October 2014 (UTC)
@Ibadibam: I think Bgwhite meant that he used AWB to manually edit the article, and that the problem was not one of AWB's general fixes. GoingBatty (talk) 21:28, 23 October 2014 (UTC)