This is a package to build robots for MediaWiki wikis like Wikipedia. Some example robots are
included. 

=======================================================================
PLEASE DO NOT PLAY WITH THIS PACKAGE. These programs can actually
modify the live wiki on the net, and proper wiki-etiquette should
be followed before running it on any wiki.
=======================================================================

To get started on proper usage of the bot framework, please refer to:

    http://meta.wikimedia.org/wiki/Using_the_python_wikipediabot

The contents of the package are:

===  Library routines ===

LICENSE                : a reference to the MIT license
wikipedia.py           : The wikipedia library
wiktionary.py          : The wiktionary library
config.py              : Configuration module containing all defaults. Do not
                         change these! See below how to change values.
titletranslate.py      : rules and tricks to auto-translate wikipage titles
date.py                : Date formats in various languages
family.py              : Abstract superclass for wiki families. Subclassed by
                         the classes in the 'families' subdirectory.
catlib.py              : Library routines written especially to handle
                         category pages and recurse over category contents.
gui.py                 : Some GUI elements for solve_disambiguation.py
mediawiki_messages.py  : Access to the various translations of the MediaWiki
                         software interface.
pagegenerators.py      : Generator pages.
userlib.py             : Library to work with users, their pages and talk pages.
BeautifulSoup.py       : is a Python HTML/XML parser designed for quick turnaround
                         projects like screen-scraping. See more:
                         http://www.crummy.com/software/BeautifulSoup

=== Utilities ===

basic.py               : Is a template from which simple bots can be made.
checkusage.py          : Provides a way for users of the Wikimedia toolserver to check the 
                         use of images from Commons on other Wikimedia wikis.
extract_wikilinks.py   : Two bots to get all linked-to Wikipedia pages from an
                         HTML-file. They differ in their output: extract_names
                         gives bare names (can be used for solve_disambiguation.py,
                         table2wiki.py or windows-chars.py), extract_wikilinks
                         gives them in interwiki-link format (can be used for
                         interwiki.py)
followlive.py          : Periodically grab the list of new articles and analyze
                         them. If the article is too short, a menu will let you
                         easily add a template.
get.py                 : Script to gets a page and writes its contents to standard output.
login.py               : Log in to an account on your "home" wikipedia. 
splitwarning.py        : split an interwiki.log file into warning files for each
                         separate language. suggestion: Zip the created files up, 
                         put them somewhere on the internet, and send an 
                         announcement of the location on the robot mailinglist.
test.py                : Check whether you are logged in.
testfamily.py          : Check whether you are logged in all known languages in a family.
xmltest.py             : Read an XML file (e.g. the sax_parse_bug.txt sometimes
                         created by interwiki.py), and if it contains an error,
                         show a stacktrace with the location of the error.
editarticle.py         : Edit an article with your favourite editor. Run
                         the script with the "--help" option to get
                         detailed infortion on possiblities.
sqldump.py             : Extract information from local cur SQL dump
                         files, like the ones at http://download.wikimedia.org
rcsort.py              : A tool to see the recentchanges ordered by user instead of by date.
threadpool.py          : 
xmlreader.py           : 
watchlist.py           : Allows access to the bot account's watchlist.
wikicomserver.py       : This library allows the use of the pywikipediabot directly
                         from COM-aware applications.

=== Robots ===

capitalize_redirects.py: Script to create a redirect of capitalize articles.
casechecker.py         : Script to enumerate all pages in the wikipedia and find all titles
                         with mixed latin and cyrilic alphabets.
category.py            : add a category link to all pages mentioned on a page,
                         change or remove category tags
catall.py              : Add or change categories on a number of pages.
catmove.pl             : Need Perl programming language for takes a list of category
                         moves or removes to make and uses category.py.
clean_sandbox.py       : This bot makes the cleaned of the page of tests.
commons_link.py        : This robot include commons template to linking Commons and 
                         your wiki project.
copyright.py           : This robot check copyright text in Google, Yahoo! and Live Search.
cosmetic_changes.py    : Can do slight modifications to a wiki page source code
                         such that the code looks cleaner.
delete.py              : This script can be used to delete pages en masse.
disambredir.py         : Changing redirect names in disambiguation pages.
featured.py            : A robot to check feature articles.
fixes.py               : This is not a bot, perform one of the predefined replacements
                         tasks, used for "replace.py -fix:replacement".
image.py               : This script can be used to change one image to another or 
                         remove an image entirely.
imagetransfer.py       : Given a Wikipedia page, check the interwiki links
                         for images, and let the user choose among them for
                         images to upload
inline_images.py       : This bot looks for images that are linked inline (i.e., they 
                         are hosted from an external server and hotlinked).
interwiki.py           : A robot to check interwiki links on all pages (or
                         a range of pages) of a wikipedia.
interwiki_graph.py     : Possible create graph with interwiki.py.
imageharvest.py        : Bot for getting multiple images from an external site.
isbn.py                : Bot for converts all ISBN-10 codes to the ISBN-13 format.
makecat.py             : Given an existing or new category, find pages for that
                         category.
movepages.py           : Bot page moves to another title.
nowcommons.py          : This bot can be deleted images with NowCommons template.
pagefromfile.py        : This bot takes its input from a file that contains a number of
                         pages to be put on the wiki.
redirect.py            : Fix double redirects and broken redirects. Note:
                         solve_disambiguation also has functions which treat
                         redirects.
refcheck.py            : This script checks references to see if they are properly
                         formatted. 
replace.py             : Search articles for a text and replace it by another
                         text. Both text are set in two configurable
                         text files. The bot can either work on a set of given
                         pages or crawl an SQL dump.
saveHTML.py            : Downloads the HTML-pages of articles and images.
selflink.py            : This bot goes over multiple pages of the home wiki, 
                         searches for selflinks, and allows removing them.
solve_disambiguation.py: Interactive robot doing disambiguation.
speedy_delete.py       : This bot load a list of pages from the category of candidates
                         for speedy deletion and give the user an interactive prompt to decide
                         whether each should be deleted or not.
spellcheck.py          : This bot spellchecks Wikipedia pages.
standardize_interwiki.py:A robot that downloads a page, and reformats the
                         interwiki links in a standard way (i.e. move all
                         of them to the bottom or the top, with the same
                         separator, in the right order).
standardize_notes.py   : Converts external links and notes/references to
                       : Footnote3 ref/note format.  Rewrites References.
table2wiki.py          : Semi-automatic converting HTML-tables to wiki-tables.
templatecount.py       : Display the list of pages transcluding a given list
				 of templates.
template.py            : change one template (that is {{...}}) into another.
touch.py               : Bot goes over all pages of the home wiki, and edits
                         them without changing.
unlink.py              : This bot unlinks a page on every page that links to it.
unusedfiles.py         : Bot appends some text to all unused images and other
                         text to the respective uploaders.
upload.py              : upload an image to Wikipedia.
us-states.py           : A robot to add redirects to cities for US state 
                         abbreviations.
warnfile.py            : A robot that parses a warning file created by 
                         interwiki.py on another wikipedia language,
                         and implements the suggested changes without
                         verifying them.
weblinkchecker.py      : Check if external links are still working.
welcome.py             : Script to welcome new users.
windows_chars.py       : Change characters that are not part of Latin-1 into
                         something harmless. It is advisable to do this on
                         Latin-1 wikis before switching to UTF-8.

=== Directories ===

archive                : Contains old bots.
category               : 
copyright              : Contains information retrieved by copyright.py
deadlinks              : Contains information retrieved by weblinkchecker.py
disambiguations        : If you run solve_disambiguation.py with the -primary
                         argument, the bot will save information here
families               : Contains wiki-specific information like URLs,
                         languages, encodings etc.
featured               : Stored featured article in cache file.
interwiki_dump         : If the interwiki bot is interrupted, it will store
                         a dump file here. This file will be read when using 
                         the interwiki bot with -restore or -continue.
interwiki_graphs       : Contains graphs for interwiki_graph.py
logs                   : Contains logfiles.
mediawiki-messages     : Information retrieved by mediawiki_messages.py will
                         be stored here.
login-data             : login.py stores your cookies here (Your password won't
                         be stored as plaintext).
simplejson             : A simple, fast, extensible JSON encoder and decoder
                         used by query.py.
spelling               : Contains dictionaries for spellcheck.py.
userinterfaces         : Contains Tkinter, WxPython, terminal and transliteration 
                         interfaces user choose in user-config.py
watchlists             : Information retrieved by watchlist.py will be stored
                         here.
wiktionary             : Contains script to used for Wiktionary project.


=== Unit tests ===

wiktionarytest.py      : Unit tests for wiktionary.py



External software can be used with PyWikipediaBot:
  * Win32com library for use with wikicomserver.py
  * Pydot, Pyparsing and Graphviz for use with interwiki_graph.py
  * JSON for use with query.py
  * PyGoogle to access Google Web API and PySearch to access Yahoo! Search
    Web Services for use with copyright.py and pagegenerators.py
  * MySQLdb to access MySQL database for use with pagegenerators.py

PyWikipediaBot makes use of some modules that are part of python, but that
are not installed by default on some Linux distributions:
  * python-xml (required to parse XML via SaX2)
  * python-celementtree (recommended if you use XML dumps)
  * python-tkinter (optional, used by some experimental GUI stuff)


More precise information, and a list of the options that are available for
the various programs, can be retrieved by running the bot with the -help
parameter, e.g.

    python interwiki.py -help

You need to have at least python version 2.4 (http://www.python.org/download/) 
installed on your computer to be able to run any of the code in this package.
Although some of the code may work on python version 2.3, support for older
versions of python is not planned.

You do not need to "install" this package to be able to make use of
it. You can actually just run it from the directory where you unpacked
it or where you have your copy of the SVN sources.

Before you run any of the programs, you need to create a file named
user-config.py in your current directory. It needs at least two lines: 
The first line should set your real name; this will be used to identify you
when the robot is making changes, in case you are not logged in. The
second line sets the code of your home language. The file should look like:

===========
username='My name'
mylang='xx'
===========

There are other variables that can be set in the configuration file, please
check config.py for ideas.

After that, you are advised to create a username + password for the bot, and
run login.py. Anonymous editing is not possible.
