Difference between revisions of "Mediawiki DumpHTML extension improvement"

The '''DumpHTML extension improvement''' is an effort that needs to be funded as soon as possible, in order to provide an efficient and simple to use (and deploy) way for people to take their Mediawiki-based wiki offline.
The '''Mediawiki DumpHTML extension improvement''' is an effort that needs to be funded as soon as possible, in order to provide an efficient and simple to use (and deploy) way for people to take their Mediawiki-based wiki offline.


== Context ==
The ZIM format was chosen by the main actors around Mediawiki to provide an offline-usable version of their content. It was designed to deal efficiently with huge amounts of data. This format is complementary to [https://secure.wikimedia.org/wikipedia/en/wiki/EPUB EPUB], which is intended more for small content.


However, we still suffer from a lack of tools to build such files, and only a few people have the necessary know-how to do it.
However, we still suffer from a lack of tools to build such files, and only a few people have the necessary know-how and tools to do it:
* Kiwix, using a hacked version of the Mediawiki DumpHTML extension, is currently the only project generating (and more or less able to generate) big ZIM files from WMF projects. [[Template:ZIMdumps|Such ZIM files can already be downloaded from the Kiwix Web site]].
* The Mediawiki Collection extension, developed by Pediapress, is user friendly on Wikipedia but really complicated to install on a separate instance, slow, and not able at all to deal with huge amounts of data. In addition, its technical approach does not allow the content to be tuned efficiently for offline usage.


We have currently
Our year-long experience has shown us that the [http://www.mediawiki.org/wiki/Extension:DumpHTML Mediawiki DumpHTML extension] (or at least its approach) is the best solution for exporting the HTML pages dynamically generated by Mediawiki as a set of static HTML/Media files. This extension is better, and has more potential, for getting a good set of HTML pages out of a Mediawiki than, for example, a Web site mirroring tool or an external rendering solution.
 
The [http://www.mediawiki.org/wiki/Extension:DumpHTML Mediawiki DumpHTML extension] is the best solution for exporting the dynamically generated HTML pages as a set of static HTML/Media files. This extension is better, and has more potential, for getting a good set of HTML pages out of a Mediawiki than, for example, a Web site mirroring tool or an external rendering solution. This is especially true if you deal with a large amount of content, and it is in fact the solution used for the big ZIM files [[Template:ZIMdumps|that can already be downloaded from the Kiwix Web site]].


Unfortunately, there are a few pain points:
