Help:WordToWiki
This page is a how-to guide detailing a practice or process on the English Wikipedia.
Microsoft Word
Microsoft released an add-in that lets Microsoft Office Word 2007 and later save documents directly as MediaWiki markup.
- Download the "Microsoft Office Word Add-in For MediaWiki" from Microsoft Download Center, and install it.
- Save the document as "MediaWiki (*.txt)" file type.
- Copy the text from the saved .txt file into your wiki page.
This Microsoft add-in does not handle images and throws an error; it is unclear whether this has since been fixed.
Two-stage conversion from Word to MediaWiki
Both of the following methods convert in two stages: Word -> HTML -> MediaWiki.
Quick and Dirty
- Open your document in Word, and "save as" an HTML file.
- Open the HTML file in a text editor and copy the HTML source code to the clipboard.
- Paste the HTML source into the large text box labeled "Raw HTML" on the html to wiki page.
- Click the "Convert HTML to wiki markup" button.
- Select the text in the "MediaWiki markup" text box and copy it to the clipboard.
- Paste the text into a Wikipedia article.
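To give a feel for what the HTML-to-wiki step produces, the following toy sketch maps a handful of HTML tags to their MediaWiki equivalents with sed. It is not the converter used by the page mentioned above; the function name html_to_wiki_toy and the sample input are made up for illustration, and real Word HTML needs a far more thorough converter.

```shell
#!/bin/bash
# html_to_wiki_toy: hypothetical helper, reads HTML on stdin and writes
# wiki markup on stdout. Handles only <b>/<strong>, <i>/<em>, <h2> and <p>.
html_to_wiki_toy() {
    sed -E \
        -e "s#</?(b|strong)>#'''#g" \
        -e "s#</?(i|em)>#''#g" \
        -e 's#<h2>(.*)</h2>#== \1 ==#g' \
        -e 's#</?p>##g'
}

# Made-up sample input
printf '<h2>Title</h2>\n<p>Some <b>bold</b> and <i>italic</i> text.</p>\n' | html_to_wiki_toy
# prints:
# == Title ==
# Some '''bold''' and ''italic'' text.
```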
Automated scripts
The conversion can also be done using a combination of two scripts and two software packages.
- The following two software packages must be installed:
  - wvHtml Word to HTML converter - part of the "wvWare" word viewing library. (Note: wvHtml is deprecated and the site recommends using AbiWord --to=html instead. AbiWord can be obtained at abisource.com.)
  - HTML::WikiConverter - a Perl module to convert HTML to wiki markup language.
- Create the bash script "doc2mw" and the Perl script "html2mw", both shown below, and make them executable.
- Call doc2mw, passing the Word document as its parameter, for example:
> doc2mw my_word.doc
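For more than one document, the call above can be wrapped in a loop. This is a minimal sketch, assuming doc2mw is on your PATH and writes the wiki markup to standard output (as the html2mw listing below suggests); the function name convert_all and the .wiki output extension are illustrative choices, not part of the original scripts.

```shell
#!/bin/bash
# convert_all: hypothetical helper that runs doc2mw on every .doc file
# in a directory and saves each result next to its source as NAME.wiki.
convert_all() {
    local dir=${1:-.} doc
    for doc in "$dir"/*.doc; do
        [ -e "$doc" ] || continue          # glob matched nothing
        doc2mw "$doc" > "${doc%.doc}.wiki" # assumes markup goes to stdout
    done
}
```

For example, `convert_all ~/reports` would leave a .wiki file beside each .doc file in that directory.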
doc2mw: a bash script taking a single parameter, which calls wvHtml followed by html2mw.
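#!/bin/bash
# doc2mw - Word to MediaWiki converter
# Usage: doc2mw my_word.doc
if [ $# -ne 1 ]; then
    echo "Usage: $0 file.doc" >&2
    exit 1
fi
FILE=$1
# Use only the base name so that a path in FILE cannot break the temp file name
TMP="$$-$(basename "${FILE}")"
# Prefer an html2mw in the current directory, otherwise rely on PATH
if [ -x "./html2mw" ]; then
    HTML2MW='./html2mw'
else
    HTML2MW='html2mw'
fi
# Convert Word to HTML; the output is written to /tmp/${TMP}
wvHtml --targetdir=/tmp "${FILE}" "${TMP}"
# but see also AbiWord: http://www.abisource.com/help/en-US/howto/howtoexporthtml.html
# Remove extra divs (opening <div ...> tags that carry attributes)
perl -pi -e 's/<div[^>]+>//gi;' "/tmp/${TMP}"
# Convert the cleaned HTML to MediaWiki markup on standard output
"${HTML2MW}" "/tmp/${TMP}"
rm "/tmp/${TMP}"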
html2mw: a Perl script called by doc2mw, which uses HTML::WikiConverter to convert HTML -> MediaWiki.
#!/usr/bin/perl
# html2mw - HTML to MediaWiki converter
use strict;
use warnings;
use HTML::WikiConverter;
# Read all input (files named on the command line, or standard input)
my $html = '';
while (<>) { $html .= $_; }
my $wc = HTML::WikiConverter->new( dialect => 'MediaWiki' );
my $wiki = $wc->html2wiki($html);
# Substitutions to get rid of nasty things we don't need
$wiki =~ s/<br \/>//g;
$wiki =~ s/&nbsp;//g;
print $wiki;
Disclaimer: These scripts are probably not the best way to do this, only one possible way. Please feel free to improve them.
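Since wvHtml is deprecated in favour of AbiWord, the Word-to-HTML step could be sketched as below. This is an untested assumption rather than part of the original scripts: the function name doc_to_html is made up, and the --to-name option is how AbiWord's command line names the output file as far as we know, so check `abiword --help` on your system before relying on it.

```shell
#!/bin/bash
# doc_to_html: hypothetical helper that converts a Word file to HTML,
# preferring AbiWord and falling back to wvHtml when AbiWord is missing.
# Usage: doc_to_html input.doc /path/to/output.html
doc_to_html() {
    local doc=$1 out=$2
    if command -v abiword >/dev/null 2>&1; then
        # Assumption: --to-name sets the output file for --to
        abiword --to=html --to-name="$out" "$doc"
    elif command -v wvHtml >/dev/null 2>&1; then
        wvHtml --targetdir="$(dirname "$out")" "$doc" "$(basename "$out")"
    else
        echo "doc_to_html: neither abiword nor wvHtml found" >&2
        return 1
    fi
}
```

In doc2mw, the wvHtml line could then be replaced by `doc_to_html "${FILE}" "/tmp/${TMP}"`.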
OpenOffice or LibreOffice
OpenOffice 3.3 and later, and derivatives such as LibreOffice, can send Word documents directly to a MediaWiki site. (At least in the German version of OpenOffice 3.3.0, you need to install the ‘Sun Wiki Publisher’ extension first.)
Once you have added the MediaWiki server of your choice, later submissions can reuse it.
- Open the Word document in OpenOffice or LibreOffice Writer.
- Go to File / Send-To / To MediaWiki
- Select your MediaWiki server (or click the "Add..." button to add a new site).
- Enter a title and summary for your article, and check the box if it is a minor revision.
- Click the send button.
Alternatively, you can export manually: choose File -> Export and select the ‘MediaWiki (.txt)’ format.
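The manual export can probably also be scripted with the office suite's headless mode. The sketch below is an assumption, not something described above: it presumes the soffice binary is on your PATH, that the MediaWiki export filter is installed, and that the filter is registered under the name "MediaWiki". The function name export_mediawiki is hypothetical.

```shell
#!/bin/bash
# export_mediawiki: hypothetical helper that asks a headless
# LibreOffice/OpenOffice to convert a document to MediaWiki text.
# Usage: export_mediawiki report.doc [output-directory]
export_mediawiki() {
    local doc=$1 outdir=${2:-.}
    # Assumption: the export filter name is "MediaWiki"
    soffice --headless --convert-to txt:MediaWiki --outdir "$outdir" "$doc"
}
```

If the filter name is right, `export_mediawiki report.doc out` should write out/report.txt containing the wiki markup.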