I made this program to search the database dump (the "Articles, templates, image descriptions, and primary meta-pages." dump, normally about 1100 MB).
It can find plain-text or regex (regular expression) matches in articles. It can also find articles with more or fewer than a given number of characters or links.
It is good for finding common mistakes, such as typos. The lists it produces can be used with WP:AWB or a pywikibot, or pasted into Wikipedia.
Processing the entire database takes only a few minutes: a typical run takes between 1 and 6 minutes on an Athlon XP 2500 CPU.
If you use this software, please let me know what you think!
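As a rough illustration of the kind of scan described above (this is not the program's actual code), here is a minimal Python sketch that streams through a dump-like XML file and reports page titles whose text matches a regex and, optionally, has fewer than a given number of links. The function names `scan_dump` and `local_name`, the `max_links` parameter, and the tiny `SAMPLE` document are all hypothetical; a real dump has more elements and an XML namespace, which `local_name` strips.

```python
import io
import re
import xml.etree.ElementTree as ET

# Counts openings of internal links such as [[Target]] or [[Target|label]].
LINK_RE = re.compile(r"\[\[")


def local_name(tag):
    """Strip an XML namespace prefix, e.g. '{ns}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def scan_dump(source, pattern=None, max_links=None):
    """Yield titles of pages whose text matches `pattern` (a regex string)
    and, if `max_links` is given, contains fewer than `max_links` links."""
    regex = re.compile(pattern) if pattern else None
    title = None
    # iterparse reads the file incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
            ok = regex is None or regex.search(text) is not None
            if ok and max_links is not None:
                ok = len(LINK_RE.findall(text)) < max_links
            if ok:
                yield title
        elif name == "page":
            elem.clear()  # drop the finished page's contents to save memory


# Hypothetical two-page sample standing in for the real dump.
SAMPLE = b"""<mediawiki>
  <page><title>Alpha</title><revision><text>This is teh [[first]] page.</text></revision></page>
  <page><title>Beta</title><revision><text>See [[one]], [[two]] and [[three]].</text></revision></page>
</mediawiki>"""

typos = list(scan_dump(io.BytesIO(SAMPLE), pattern=r"\bteh\b"))
few_links = list(scan_dump(io.BytesIO(SAMPLE), max_links=2))
print(typos, few_links)
```

Passing a file path instead of `io.BytesIO(...)` should work the same way, since `iterparse` accepts either.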
- Search for common spelling/grammar errors.
- Search for people categories that do not have a sort key, e.g. search for [[Category:Economists]].
- 22.214.171.124, released 4 May 2006
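The sort-key search mentioned in the list above can be approximated with a regex. The following is only a sketch under my own assumptions: `missing_sort_key` is a hypothetical helper, and it only looks at the category link itself, so it treats `[[Category:Name]]` as missing a key and `[[Category:Name|Key]]` as having one.

```python
import re


def missing_sort_key(text, category):
    """Return True if `text` contains [[Category:<category>]] with no
    '|sort key' part. Assumption: only the link itself is inspected."""
    pattern = re.compile(
        r"\[\[\s*[Cc]ategory\s*:\s*" + re.escape(category) + r"\s*\]\]"
    )
    return pattern.search(text) is not None


print(missing_sort_key("Text [[Category:Economists]]", "Economists"))
print(missing_sort_key("Text [[Category:Economists|Smith, Adam]]", "Economists"))
```

The first call finds a category link without a sort key; the second does not, because the link carries `|Smith, Adam`.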