|
|
Line 66: |
Line 66: |
| === Mediawiki DumpHTML extension work === | | === Mediawiki DumpHTML extension work === |
|
| |
|
| The [http://www.mediawiki.org/wiki/Extension:DumpHTML Mediawiki DumpHTML extension] exports dynamically generated HTML pages as a set of static HTML and media files. This extension is the best way to get a good set of HTML pages from a MediaWiki instance (compared with, for example, a website mirroring tool).
| | * [[Mediawiki DumpHTML extension work]] |
| | |
| Unfortunately, there are a few pain points:
| |
| * Not maintained, bugs are not fixed, new features are not implemented
| |
| * Only available to MediaWiki system administrators
| |
| * Does not generate ZIM files
| |
| | |
| ==== Workpackage1: Revamping and fixing bugs ====
| |
| | |
| The worst point is that the DumpHTML extension is not properly maintained and, over time, [https://bugzilla.wikimedia.org/buglist.cgi?query_format=advanced&list_id=2671&component=DumpHTML&resolution=---&product=MediaWiki%20extensions many issues have been discovered]. Currently, the extension is not really usable without fixing or tweaking the MediaWiki code.
| |
| | |
| The purpose of this work package is to fix the most critical bugs so that anyone running a MediaWiki can simply get an HTML dump of their content and then easily generate a ZIM file from it.
| |
| | |
| After the revamping, the HTML output should be flawless, similar to what dumpHTML.pl produces.
| |
| | |
| Deliverables:
| |
| * Revamping dumpHTML and fixing bugs (80-120 hours)
| |
| | |
| ==== Workpackage2: phpzim creation and integration in DumpHTML extension ====
| |
| | |
| phpzim would be a new PHP module for creating, writing and reading ZIM files directly in PHP. It would be a binding of the zimlib, like pyzim for Python. Once this library is done, we will be able to create ZIM files directly from DumpHTML.
| |
| | |
| To get a ZIM file, the user will have to call dumpHTML.php and specify that they want ZIM file output (not an HTML dump), along with some metadata such as title, creator, etc.
| |
| | |
| The user will therefore need system access to the machine where the MediaWiki instance runs, and will also need to install phpzim (which should be packaged).
| |
| | |
| Deliverables:
| |
| * phpzim (40 hours)
| |
| * updated dumpHTML (20 hours)
| |
| | |
| ==== Workpackage3: Integrating Collection and DumpHTML extensions and new features ====
| |
| | |
| By integrating the DumpHTML and Collection extensions, we want to give everyone the ability to easily create small ZIM files from the Wikipedia user interface, with the following advantages:
| |
| * exactly the same rendering as online
| |
| * no external dependency for the MediaWiki admin to install
| |
| * rendering done by MediaWiki (as fast as online browsing)
| |
| | |
| In addition, we want to implement a few more features (see the list of deliverables).
| |
| | |
| Deliverables:
| |
| * Book and DumpHTML integration (30 hours)
| |
| * DumpHTML parallel processing (15 hours)
| |
| * Build selection based on list of titles (20 hours)
| |
| * Create an offline skin for mobile devices to make mobile dumps (15 hours)
| |
| * Make an offline skin that omits pictures (5 hours)
| |
|
| |
|
| === Kiwix plug === | | === Kiwix plug === |