(Tools summary section) |
(12 intermediate revisions by 7 users not shown) | |||
Line 1: | Line 1: | ||
{{Translations}} | {{Translations}} | ||
The Kiwix tools are a set of scripts (mostly in Perl) intended to help create content usable by Kiwix. | ''These tools are deprecated. Although they may work under certain conditions, we do not provide support for them.'' | ||
The Kiwix tools are a set of scripts (mostly in Perl) intended to help create content usable by Kiwix. Current development code can be found at: | |||
<nowiki>svn co http://kiwix.svn.sourceforge.net/svnroot/kiwix/tools/ kiwix-tools</nowiki> | |||
Kiwix is primarily designed as a tool to publish copies of Wikipedia, but every effort is made to ensure it would also be useful for: | Kiwix is primarily designed as a tool to publish copies of Wikipedia, but every effort is made to ensure it would also be useful for: | ||
Line 36: | Line 39: | ||
* the storage space you have for the final result | * the storage space you have for the final result | ||
* how to make the selection if necessary. | * how to make the selection if necessary. | ||
==Prerequisites== | |||
These scripts require a number of Perl modules. The list below is what one tester ([[User:Ijon]]) had to install on top of a plain Perl 5.10 installation on Ubuntu Linux; your mileage may vary. Install them using CPAN (perl -MCPAN -e shell), CPANPLUS (cpanp(1)), or your distribution's Perl packages. | |||
* Array::PrintCols | |||
* Getargs::Long | |||
* HTML::Parser | |||
* HTML::Tagset | |||
* LWP | |||
* Log::Agent | |||
* Log::Log4perl | |||
* Term::Query | |||
* URI | |||
* XML::DOM | |||
* XML::NamespaceSupport | |||
* XML::Parser | |||
* XML::Parser::PerlSAX | |||
* XML::RegExp | |||
* XML::SAX | |||
* XML::SAX::Expat | |||
* XML::Simple | |||
I managed to install all of the above by installing only the following subset and allowing automatic installation of dependencies: | |||
* XML::Simple | |||
* XML::DOM | |||
* Term::Query | |||
* Array::PrintCols | |||
* Log::Log4perl | |||
* Getargs::Long | |||
=== Debian/Ubuntu dependencies === | |||
<pre>sudo apt-get install liblog-log4perl-perl libdata-dumper-simple-perl libxml-simple-perl \ | |||
libxml-libxml-perl libarray-printcols-perl libgetargs-long-perl \ | |||
liburi-perl libhtml-linkextractor-perl \ | |||
libhtml-parser-perl libdbd-pg-perl</pre> | |||
==Usage== | ==Usage== | ||
Here is a list of available scripts (many of them are specific to Mediawiki): | Here is a list of available scripts (many of them are specific to Mediawiki): | ||
===Mediawiki Maintenance=== | ===Mediawiki Maintenance=== | ||
* | * {{ScriptTool|backupMediawikiInstall.pl}} creates a tgz archive of a complete existing Mediawiki installation (code + resources + database). | ||
* | * {{ScriptTool|installMediawiki.pl}} brings up an instance of Mediawiki from source code without human intervention. This actually simulates the manual Mediawiki installation process. | ||
* | * {{ScriptTool|resetMediawikiDatabase.pl}} empties a local instance of Mediawiki of all pages. | ||
===Mirroring Tools=== | ===Mirroring Tools=== | ||
* | * {{ScriptTool|buildHistoryFile.pl}} given a list of articles and an online Mediawiki site, obtains complete histories of each page on the list. | ||
** | ** {{ScriptTool|extractContributorsFromHistoryFile.pl}} extracts a list of authors from the histories obtained by the buildHistoryFile.pl script. | ||
* | * {{ScriptTool|buildContributorsHtmlPages.pl}} given a template and a list of authors, builds a custom set of HTML pages containing all of the authors on the list. | ||
* | * {{ScriptTool|checkMediawikiPageCompleteness.pl}} checks whether the local copies of pages from an online Mediawiki site are complete, i.e. have no dependencies (template files, multimedia resources, etc.) missing. | ||
* | * {{ScriptTool|checkPageExistence.pl}} given a list of page titles and an online Mediawiki site, checks whether such pages exist in it. This can be handy, for example, to see what pages have been replicated. | ||
* | * {{ScriptTool|checkRedirects.pl}} checks that no pages redirect to non-existent pages (i.e. broken redirects). Eventually, it should also check for pages redirecting to each other. | ||
* | * {{ScriptTool|listAllImages.pl}} lists all images of an online Mediawiki site. | ||
* | * {{ScriptTool|listAllPages.pl}} lists all pages in an online Mediawiki site. | ||
* | * {{ScriptTool|listCategoryEntries.pl}} lists the pages belonging to a category, recursively. | ||
* | * {{ScriptTool|listRedirects.pl}} lists page redirects in an online Mediawiki site. | ||
* | * {{ScriptTool|mirrorMediawikiCode.pl}} downloads the exact same version used by an online MediaWiki site; this includes both Mediawiki code and Mediawiki extensions. | ||
* | * {{ScriptTool|mirrorMediawikiInterwikis.pl}} installs on a local Mediawiki site InterWikis (cross-language links) identical to those of an online Mediawiki site. | ||
* | * {{ScriptTool|mirrorMediawikiPages.pl}} copies a set of pages and their dependencies (template and multimedia resources) from an online Mediawiki site to a local Mediawiki site. | ||
* | * {{ScriptTool|modifyMediawikiEntry.pl}} removes, deletes, or replaces a list of pages from an online Mediawiki site. | ||
===Dumping Tools=== | ===Dumping Tools=== | ||
Line 68: | Line 107: | ||
===ZIM Generation=== | ===ZIM Generation=== | ||
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/buildZimFileFromDirectory.pl?view=log buildZimFileFromDirectory.pl] creates a ZIM file from a directory tree containing static HTML and other content files. | * [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/buildZimFileFromDirectory.pl?view=log buildZimFileFromDirectory.pl] creates a ZIM file from a directory tree containing static HTML and other content files. | ||
[[Category:Developer's Guide]] | |||
== Virtual machine == | |||
We have prepared a VM to help people make ZIM files from their HTML files. Download it [http://download.kiwix.org/dev/ZIMmakerVMv3.ova there]. The Unix login/password is root/kiwix, and the postgres login/password is postgres/kiwix. To build your ZIM file, go to root/dumping_tools/scripts and use buildZimFileFromDirectory.pl. | |||
== See also == | |||
* [[:File:Pediapress zim creation approach with mediawiki collection extension.jpg]] | |||
* [[:File:Kiwix zim creation approach.jpg]] |