Difference between revisions of "Tools/en"

{{Translations}}


''These tools are deprecated. Although they may work under certain conditions, we do not provide support for them.''
 
The Kiwix tools are a set of scripts (mostly in Perl) intended to help create content usable by Kiwix. The current development code can be checked out with:
  <nowiki>svn co http://kiwix.svn.sourceforge.net/svnroot/kiwix/tools/ kiwix-tools</nowiki>


Kiwix is primarily designed as a tool to publish copies of Wikipedia, but every effort is made to ensure it would also be useful for:
[...]
* the storage space you have for the final result
* how to make the selection if necessary.
==Prerequisites==
These scripts require a number of Perl modules. The list below is what one tester ([[User:Ijon]]) had to install on top of a plain Perl 5.10 installation on Ubuntu Linux; your mileage may vary. Install them using CPAN (perl -MCPAN -e shell), CPANPLUS (cpanp(1)), or your distribution's Perl packaging mechanism.
* Array::PrintCols
* Getargs::Long
* HTML::Parser
* HTML::Tagset
* LWP
* Log::Agent
* Log::Log4perl
* Term::Query
* URI
* XML::DOM
* XML::NamespaceSupport
* XML::Parser
* XML::Parser::PerlSAX
* XML::RegExp
* XML::SAX
* XML::SAX::Expat
* XML::Simple
The tester managed to install all of these by installing only the following subset and allowing automatic installation of dependencies:
* XML::Simple
* XML::DOM
* Term::Query
* Array::PrintCols
* Log::Log4perl
* Getargs::Long
=== Debian/Ubuntu dependencies ===
<pre>sudo apt-get install liblog-log4perl-perl libdata-dumper-simple-perl libxml-simple-perl \
  libxml-libxml-perl libarray-printcols-perl libgetargs-long-perl \
  liburi-perl libhtml-linkextractor-perl \
  libhtml-parser-perl libdbd-pg-perl</pre>
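To see which modules are already present before installing anything, a small shell function can try to load each one. This is a minimal sketch and not part of the Kiwix tools: the function name check_perl_modules is hypothetical, and it assumes only that perl is on the PATH and that loading a module with -M fails when the module is not installed.

```shell
# Hypothetical helper, not shipped with the Kiwix tools.
# Tries to load each named Perl module; prints one status line per module
# and returns non-zero if at least one module could not be loaded.
check_perl_modules() {
    missing=0
    for module in "$@"; do
        # -M<module> loads the module, -e 1 runs an empty program.
        if perl -M"$module" -e 1 2>/dev/null; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running check_perl_modules XML::Simple XML::DOM Term::Query Array::PrintCols Log::Log4perl Getargs::Long checks the subset of modules listed above.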
==Usage==
Here is a list of available scripts (many of them are specific to Mediawiki):
===Mediawiki Maintenance===
* {{ScriptTool|backupMediawikiInstall.pl}} creates a tgz archive of a complete existing Mediawiki installation (code + resources + database).
* {{ScriptTool|installMediawiki.pl}} brings up an instance of Mediawiki from source code without human intervention. This actually simulates the manual Mediawiki installation process.
* {{ScriptTool|resetMediawikiDatabase.pl}} empties a local instance of Mediawiki of all pages.
===Mirroring Tools===
* {{ScriptTool|buildHistoryFile.pl}} given a list of articles and an online Mediawiki site, obtains complete histories of each page on the list.
** {{ScriptTool|extractContributorsFromHistoryFile.pl}} extracts a list of authors from the histories obtained by the buildHistoryFile.pl script.
* {{ScriptTool|buildContributorsHtmlPages.pl}} given a template and a list of authors, builds a custom set of HTML pages containing all of the authors on the list.
* {{ScriptTool|checkMediawikiPageCompleteness.pl}} checks whether the local copies of pages from an online Mediawiki site are complete, i.e. have no dependencies (template files, multimedia resources, etc.) missing.
* {{ScriptTool|checkPageExistence.pl}} given a list of page titles and an online Mediawiki site, checks whether those pages exist on it. This can be handy, for example, to see which pages have been replicated.
* {{ScriptTool|checkRedirects.pl}} checks for pages redirecting to non-existent pages (i.e. broken redirects). Eventually, it should also check for pages redirecting to each other.
* {{ScriptTool|listAllImages.pl}} lists all images of an online Mediawiki site.
* {{ScriptTool|listAllPages.pl}} lists all pages in an online Mediawiki site.
* {{ScriptTool|listCategoryEntries.pl}} lists the pages belonging to a category, recursively.
* {{ScriptTool|listRedirects.pl}} lists page redirects in an online Mediawiki site.
* {{ScriptTool|mirrorMediawikiCode.pl}} downloads the exact same code version used by an online Mediawiki site; this includes both the Mediawiki code and the Mediawiki extensions.
* {{ScriptTool|mirrorMediawikiInterwikis.pl}} installs on a local Mediawiki site the same InterWikis (cross-language links) as those of an online Mediawiki site.
* {{ScriptTool|mirrorMediawikiPages.pl}} copies a set of pages and their dependencies (template and multimedia resources) from an online Mediawiki site to a local Mediawiki site.
* {{ScriptTool|modifyMediawikiEntry.pl}} removes, deletes, or replaces a list of pages from an online Mediawiki site.
===Dumping Tools===
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/checkEmptyFilesInHtmlDirectory.pl?view=log checkEmptyFilesInHtmlDirectory.pl] checks whether a directory and its subdirectories contain empty files.
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/dumpHtml.pl?view=log dumpHtml.pl] given a local Mediawiki site, makes all-static copies of pages, i.e. creates a directory with all needed HTML.
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/launchTntreader.pl?view=log launchTntreader.pl] easily launches the tntreader program.
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/optimizeContents.pl?view=log optimizeContents.pl] optimizes a directory with HTML pages and resources. This script calls the following external tools: [http://tidy.sourceforge.net/ HTML Tidy] for HTML files and the [http://sourceforge.net/projects/littleutils/ Little utils] for images.
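The kind of check performed by checkEmptyFilesInHtmlDirectory.pl can be approximated with standard tools. The sketch below is not the script's actual implementation: the function name list_empty_files is hypothetical, and it relies only on POSIX find.

```shell
# Hypothetical helper, not the Kiwix script itself.
# Prints every empty regular file under the given directory (default: .)
# and returns non-zero if any were found.
list_empty_files() {
    dir=${1:-.}
    # -size 0 matches files that occupy zero blocks, i.e. empty files.
    found=$(find "$dir" -type f -size 0)
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
        return 1
    fi
    return 0
}
```

Running it on an HTML dump directory before building a ZIM file gives a quick list of pages or resources that were written out with no content.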
===ZIM Generation===
* [http://kiwix.svn.sourceforge.net/viewvc/kiwix/dumping_tools/scripts/buildZimFileFromDirectory.pl?view=log buildZimFileFromDirectory.pl] creates a ZIM file from a directory tree containing static HTML and other content files.
[[Category:Developer's Guide]]
== Virtual machine ==
We have prepared a VM to help people make ZIM files from their HTML files. Download it [http://download.kiwix.org/dev/ZIMmakerVMv3.ova here]. The Unix login/password is root/kiwix, and the PostgreSQL login/password is postgres/kiwix. To build your ZIM file, go to root/dumping_tools/scripts and use buildZimFileFromDirectory.pl.
== See also ==
* [[:File:Pediapress zim creation approach with mediawiki collection extension.jpg]]
* [[:File:Kiwix zim creation approach.jpg]]
