Difference between revisions of "Talk:Wikipedia"

(Questions about ZIM file generation)
 
 
== Creating the files with images ==
How are the files which include images created? I can see making the smaller ones with the export tool, but the 40GB one with all the images seems harder. I found [http://permalink.gmane.org/gmane.org.wikimedia.offline/521 an old thread] listing some of the tools used, but the default way to use those tools seems to hit Wikipedia's servers fairly hard, at least if everything is working correctly and all the downloads work. Is there a way to do this from the [http://dumps.wikimedia.org/enwiki/20140304/ official dumps], which should be both faster and less likely to overload the Wikipedia servers? I think I can do this by hosting a MediaWiki instance with Parsoid myself and setting the script's <code>parsoidUrl</code> to "http://localhost:8000/localhost/" and <code>hostUrl</code> to "http://localhost/". Would this actually work? Would it have any subtle issues to be aware of? Is there a better solution at the moment? --[[User:DanielH|DanielH]] ([[User talk:DanielH|talk]]) 09:07, 8 March 2014 (CET)
:This email tells you how it is done; you can get more information [http://openzim.org/wiki/Build_your_ZIM_file available here]. There is no easy way to create ZIM files from the XML/wikicode, but what you propose will certainly work. The idea is not that everybody creates ZIM files of Wikipedia with millions of entries, but that we create them and others use them. That said, we really want to provide every Wikipedia ZIM file with at least one new version per month. If you need a ZIM file which is not already available, open a feature request. [[User:Kelson|Kelson]] ([[User talk:Kelson|talk]]) 10:15, 8 March 2014 (CET)
:: The email says what tools were used, but not how. As you now know, I've already read the page you linked, which doesn't even mention mwoffliner (this should probably be fixed by somebody who understands mwoffliner). The documentation for mwoffliner helps, but it seems designed to be used with the online servers instead of dumps, which is wasteful and slow; that's why I'm trying to figure out how to combine it with the dumps. I don't actually want to do Wikipedia myself; I want several other English Wikimedia projects ''and the corresponding pictures'' (if I didn't want the images I'd just use the dumps, a local MediaWiki install, Wiki2html and zimwriterfs as mentioned on the page you linked). Is it still OK to open a feature request for English non-Wikipedia Wikimedia projects? And even if you did provide all these files, I'd still want to know ''how'' you generated them. --[[User:DanielH|DanielH]] ([[User talk:DanielH|talk]]) 05:27, 9 March 2014 (CET)
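For anyone trying the local setup described in this thread, here is a minimal sketch of a sanity check on the two values quoted in the question. It is not part of mwoffliner: the dictionary and the helper names are hypothetical, and only the two URLs and the names <code>parsoidUrl</code> and <code>hostUrl</code> come from the question. The idea is just to confirm, before a long run, that neither setting still points at the public servers.

```python
from urllib.parse import urlparse

# Values quoted in the question above. The keys mirror the script's
# parsoidUrl and hostUrl variables; this dict is only illustrative.
LOCAL_SETTINGS = {
    "parsoidUrl": "http://localhost:8000/localhost/",
    "hostUrl": "http://localhost/",
}

# Host names treated as "this machine" (an assumption of this sketch).
LOCAL_HOSTS = ("localhost", "127.0.0.1", "::1")


def points_at_local_machine(url):
    """Return True if the URL's host is the local machine."""
    return urlparse(url).hostname in LOCAL_HOSTS


def check_settings(settings):
    """Map each setting name to whether it avoids remote servers."""
    return {name: points_at_local_machine(url) for name, url in settings.items()}


if __name__ == "__main__":
    for name, is_local in check_settings(LOCAL_SETTINGS).items():
        print(name, "local" if is_local else "REMOTE")
```

This only inspects the URLs; it does not verify that MediaWiki or Parsoid is actually running at those addresses.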
== Recommendation of non-free software ==
Right now "torrent software" links to the µTorrent web site, which is closed-source software. It would be best to recommend a free and open-source one (like [https://en.wikipedia.org/wiki/Transmission_(BitTorrent_client) Transmission]) or point to the [https://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients comparison of BitTorrent clients in Wikipedia]. Looking at the [https://en.wikipedia.org/wiki/Comparison_of_BitTorrent_clients#Operating_system_support operating system support] I see that [https://en.wikipedia.org/wiki/QBittorrent qBittorrent] is the only one which can run on Windows, Mac OS X, Linux, BSD, iOS and Android. If you want to recommend only one client, it seems like the universal one.
:You are right. The point is that, even though it is closed-source software, µTorrent is a really good BitTorrent client. I use Transmission myself, but Transmission is buggy with web seeds, so we cannot recommend it. I don't know qBittorrent, which is a problem. If I had to change to open-source software, I would propose [http://deluge-torrent.org/ Deluge]. You can change it to Deluge if you want, I agree. [[User:Kelson|Kelson]] ([[User talk:Kelson|talk]]) 22:36, 23 January 2015 (CET)
::I also use Transmission (in Debian stable). I just tried Deluge on Windows 7 and the GUI freezes from time to time. On the same machine, with the same torrents, µTorrent worked correctly. I'm sure there is no single client which is open source, bug free, multiplatform, with the prettiest GUI and recently updated. That's why I would prefer a more neutral approach and link to the comparison of clients.

Latest revision as of 15:03, 25 January 2015
