TED

TED (Technology, Entertainment, Design) is a global set of conferences held under the slogan "ideas worth spreading". The talks address a wide range of topics within the research and practice of science and culture, often through storytelling. Speakers are given a maximum of 18 minutes to present their ideas in the most innovative and engaging ways they can. The website is www.ted.com. The purpose of this project is to build a sustainable way of creating ZIM files that provide the TED and TEDx videos in a manner similar to ted.com.

Goals

  • A Python script that can easily create ZIM files of the TED and TEDx videos, with the option to filter by language, conference, or topic
    • A list of TED talks can be found here
  • The data should be scraped from ted.com.
  • Videos should be playable in HTML5, and subtitles need to be supported
  • The ZIM should provide a simple filtering/search solution to find content (by author, language, title, conference, topic, etc.)
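
The filtering goal above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Talk` record, its field names, the `filter_talks` helper and the sample entries are all hypothetical and do not reflect the actual ted.com data model.

```python
from dataclasses import dataclass, field


@dataclass
class Talk:
    # Hypothetical metadata record; the field names are illustrative only.
    title: str
    speaker: str
    conference: str
    languages: set = field(default_factory=set)
    topics: set = field(default_factory=set)


def filter_talks(talks, language=None, conference=None, topic=None):
    """Return the talks matching every criterion that is not None."""
    selected = []
    for talk in talks:
        if language is not None and language not in talk.languages:
            continue
        if conference is not None and talk.conference != conference:
            continue
        if topic is not None and topic not in talk.topics:
            continue
        selected.append(talk)
    return selected


# Made-up sample records, not real TED data.
SAMPLE_TALKS = [
    Talk("Example talk A", "Speaker One", "TED2010", {"en", "fr"}, {"science"}),
    Talk("Example talk B", "Speaker Two", "TEDxExample", {"en"}, {"design"}),
    Talk("Example talk C", "Speaker Three", "TED2010", {"en", "de"}, {"design"}),
]
```

Calling `filter_talks(SAMPLE_TALKS, language="fr")` would keep only the talks that have French subtitles. Criteria left as `None` are ignored, so the same helper could drive both a full build and a language- or topic-specific ZIM file.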

One way to achieve it

  1. Retrieve the list of TED(x) presentations with their metadata into a local database
    1. A complete list of the TED talks is available here (official) or here (unofficial)
    2. TEDx talks by language are available here.
  2. Download videos and re-encode them if necessary
  3. Retrieve the video subtitle files
    1. Subtitles make less sense for TEDx
    2. TED has a translation program here
  4. Create the necessary templates of the index web pages (for the search/filter feature, a client-side JavaScript solution should be tried)
    1. This is worth reading to get an idea of how to store a database on the client side.
  5. Fill the HTML templates with the data from the XML/RDF and write the index pages in a target directory
  6. Run zimwriterfs to create the corresponding ZIM file of your target directory
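
Steps 1, 4 and 5 above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes an SQLite database as the local store, a hypothetical `talks` table with `slug`, `title` and `speaker` columns, and the standard library's `string.Template` as the templating mechanism. Scraping, video download and subtitle retrieval are left out.

```python
import html
import sqlite3
import tempfile
from pathlib import Path
from string import Template

# Hypothetical page and list-item templates for the index page.
PAGE = Template(
    '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
    "<title>$title</title></head>\n"
    "<body><h1>$title</h1>\n<ul>\n$items\n</ul></body></html>\n"
)
ITEM = Template('<li><a href="$slug.html">$title</a> ($speaker)</li>')


def store_talks(conn, talks):
    """Store (slug, title, speaker) tuples in the local database (step 1)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS talks "
        "(slug TEXT PRIMARY KEY, title TEXT, speaker TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO talks VALUES (?, ?, ?)", talks)
    conn.commit()


def render_index(conn, title="TED talks"):
    """Fill the HTML templates with the stored metadata (steps 4 and 5)."""
    rows = conn.execute("SELECT slug, title, speaker FROM talks ORDER BY title")
    items = "\n".join(
        ITEM.substitute(
            slug=html.escape(slug),
            title=html.escape(talk_title),
            speaker=html.escape(speaker),
        )
        for slug, talk_title, speaker in rows
    )
    return PAGE.substitute(title=html.escape(title), items=items)


def write_index(conn, target_dir):
    """Write index.html into the target directory and return its path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    index_path = target / "index.html"
    index_path.write_text(render_index(conn), encoding="utf-8")
    return index_path


if __name__ == "__main__":
    # Made-up sample rows, not real TED data.
    connection = sqlite3.connect(":memory:")
    store_talks(connection, [("talk-a", "Example talk A", "Speaker One")])
    with tempfile.TemporaryDirectory() as directory:
        print(write_index(connection, directory).read_text(encoding="utf-8"))
```

The directory written by `write_index` is the kind of target directory that step 6 would hand to zimwriterfs. Escaping every value with `html.escape` keeps titles containing `&` or `<` from breaking the generated page.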