Difference between revisions of "Athens 2023"

→‎Achievements: Added achievement of integrating standard Zimit reading into Kiwix JS
 
(16 intermediate revisions by 5 users not shown)
Line 18: Line 18:
We need to (does not have to be in this order):
We need to (does not have to be in this order):
* Assess current situation
* Assess current situation
** Present Webrecorder/Kiwix current activities and projects
** <s>Present Webrecorder/Kiwix current activities and projects</s> Ilya not available
** Present current software stack and how its components interact
** Present current software stack and how its components interact
** List and identify the weaknesses (at least those not already clearly identified) in the current architecture/software
** List and identify the weaknesses (at least those not already clearly identified) in the current architecture/software
Line 24: Line 24:
**Go over the crawler's CLI params to understand how/when to use them (<code>docker run --rm -it ghcr.io/openzim/zimit:dev crawl --help</code>)
**Go over the crawler's CLI params to understand how/when to use them (<code>docker run --rm -it ghcr.io/openzim/zimit:dev crawl --help</code>)
**<s>Status of <bdi>[https://github.com/webrecorder/browsertrix-crawler/issues/207 Success status code on failure]</bdi></s>
**<s>Status of <bdi>[https://github.com/webrecorder/browsertrix-crawler/issues/207 Success status code on failure]</bdi></s>
**Status of [https://github.com/webrecorder/browsertrix-crawler/issues/246 Disable browser updates]
**<s>Status of [https://github.com/webrecorder/browsertrix-crawler/issues/246 Disable browser updates]</s> Fixed in Zimit, but not yet upstream in Browsertrix
**Status of [https://github.com/webrecorder/browsertrix-crawler/issues/159 SSLError]
**<s>Status of [https://github.com/webrecorder/browsertrix-crawler/issues/159 SSLError]</s>
**[https://github.com/openzim/warc2zim/issues/109 First access to warc2zim file doesn't correctly catch external links]
**[https://github.com/openzim/warc2zim/issues/109 First access to warc2zim file doesn't correctly catch external links]


Line 32: Line 32:
** [https://github.com/openzim/warc2zim/issues/65 How communicate to a user the boundaries of a ZIM?]
** [https://github.com/openzim/warc2zim/issues/65 How communicate to a user the boundaries of a ZIM?]
** [https://github.com/openzim/zimit/issues/126 Should we still use Service workers?]
** [https://github.com/openzim/zimit/issues/126 Should we still use Service workers?]
** [https://github.com/openzim/warc2zim/issues/72 What kind of size optimisation should we run?]
** <s>[https://github.com/openzim/warc2zim/issues/72 What kind of size optimisation should we run?]</s> WONTFIX
** [https://github.com/openzim/warc2zim/issues/104 Assess pseudo namespaces]
** [https://github.com/openzim/warc2zim/issues/104 Assess pseudo namespaces]
**<bdi>[https://github.com/openzim/zimit/issues/166 Should we accept invalid HTTPs?]</bdi>
**<bdi>[https://github.com/openzim/zimit/issues/166 Should we accept invalid HTTPs?]</bdi>
Line 40: Line 40:
** [https://github.com/openzim/zimit/issues/138 Out of scope homepage redirect]
** [https://github.com/openzim/zimit/issues/138 Out of scope homepage redirect]
** See how to simplify/improve Wabac ZIM related part
** See how to simplify/improve Wabac ZIM related part
**<bdi>[https://github.com/openzim/zimit-frontend/issues/35 Can't clear options]</bdi>


* Implement new features
* Implement new features
Line 50: Line 51:


== Achievements ==
== Achievements ==
'''Jaifroid:'''
* Greatly increased my understanding of the warc2zim implementation and the underlying Replay software, thanks to help from MGautier and Kelson and discussions with the others
* Began work on integrating a standard implementation (based on the current Zimit and warc2zim versions) into Kiwix JS using wombat.js and wabac.js (Service Worker)
** This is not yet functional: the landing page loads in Kiwix JS, but static links are not yet transformed via the Service Worker
** I successfully integrated the Kiwix JS Service Worker and wabac.js into a single Service Worker, with fetch events routed first to wabac.js and handed off to the Kiwix JS SW when assets need to be extracted from the ZIM
** I successfully managed to load wombat.js into the iframe document, but the configuration is not yet correct
** Work so far is in https://github.com/kiwix/kiwix-js/pull/1010
**EDIT 4/12/2023 '''This goal is now achieved and is in preview release in the Browser Extension offline-first PWA v3.11.5+'''
* Worked on ironing out several issues with my non-SW-based implementation in KJSWL, using knowledge gleaned at the Hackathon
** Greatly increased fidelity of rendering of Zimit-based archives, including a lot of dynamic content
** As a good test, ''mesquartierschinois'' now loads flawlessly in the KJSWL implementation fully offline, with dynamic loading of the entries as the user scrolls. YouTube videos work and stream offline, but Vimeo is not implemented (it is a separate fuzzy transformation)
** Many other dynamic ZIMs are now working very well
** A severe issue with the app attempting to load assets as main pages has been resolved
** Loading is fairly fast, at least on a desktop PC, and it also runs acceptably on iOS (only in Safari). Android is slow but usable, especially once a site's assets are cached via the Cache API. N.B. On Android, Firefox cannot be used, because Firefox for Android has a bug that causes it to attempt to load the ZIM archive into memory or internal storage, which fails for large archives. Chrome, Edge and Samsung Internet work fine (Samsung Internet is the fastest, due to its optimized file reading speed).
** An implementation with the many changes can be tested at https://kiwix.github.io/kiwix-js-windows/dist/
'''Matthieu:'''
Succeeded in creating a POC of warc2zim that creates ZIM files with static rewriting, so that no Service Worker is needed.
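
To illustrate the idea of static rewriting, here is a minimal sketch in Python. It is not the POC's code: the function names (<code>zim_path</code>, <code>rewrite_link</code>, <code>rewrite_html</code>) and the "host + path" entry layout are assumptions made for the example, query strings are ignored, and a real implementation would use a proper HTML parser and also rewrite CSS and JavaScript.

```python
import posixpath
import re
from urllib.parse import urljoin, urlsplit


def zim_path(url: str) -> str:
    """Map an absolute URL to a hypothetical entry path: host + path (query ignored)."""
    parts = urlsplit(url)
    return parts.netloc + parts.path


def rewrite_link(link: str, page_url: str) -> str:
    """Rewrite one link so that it points, relatively, to another stored entry."""
    # Leave pure fragments and non-HTTP schemes (mailto:, data:, ...) untouched.
    if link.startswith("#") or urlsplit(link).scheme not in ("", "http", "https"):
        return link
    absolute = urljoin(page_url, link)  # resolve against the page's original URL
    page_dir = posixpath.dirname(zim_path(page_url))
    relative = posixpath.relpath(zim_path(absolute), start=page_dir)
    fragment = urlsplit(absolute).fragment
    return f"{relative}#{fragment}" if fragment else relative


# Simplification: only quoted href/src attributes are handled.
ATTR_RE = re.compile(r"""\b(href|src)=(["'])(.*?)\2""")


def rewrite_html(html: str, page_url: str) -> str:
    """Rewrite every href/src attribute of a page at conversion time."""
    def replace(match: re.Match) -> str:
        name, quote, value = match.groups()
        return f"{name}={quote}{rewrite_link(value, page_url)}{quote}"

    return ATTR_RE.sub(replace, html)


if __name__ == "__main__":
    page = "https://example.org/blog/post.html"
    print(rewrite_html('<link href="/css/site.css"><img src="img/logo.png">', page))
```

Because every link is resolved once, when the ZIM is created, a reader can follow the rewritten relative paths directly without intercepting requests at runtime.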


== Agenda ==
== Agenda ==
Line 68: Line 90:


[[Category:Hackathon]]
[[Category:Hackathon]]
==Budget==
*Hosting: CHF 1'874.05
*F&B: CHF 737.24
*Travel: <!--- 276.5+389.15+280+75+448.4 --->CHF 1'469.05