Installing and configuring all the software needed for this course on your own machine can be tedious. We have therefore prepared cloud-based virtual machines (VMs), one per student, which can be accessed remotely and used during the current semester. Please find here a comprehensive guide on how to connect to the cloud-based virtual machines provided by the Laboratory for Internet Computing (LInC).

A (not very) short intro to Python can be found here.

Week | Description | Material
1 | No Lab |
2 | NLTK and Apache OpenNLP | LAB01.pdf
3 | NLTK in Python - Exercises | LAB02.pdf
4 | Public Holiday | No Lab
5 | Apache Lucene | LAB03.pdf, Lucene Example 1
6 | Apache Hadoop 1 | Background information for MapReduce; Introduction to Hadoop (please read the message about the virtual machine above); Hadoop 1 Source Code; Hadoop 1 Solution
7 | Apache Hadoop 2 - Exercises | LAB05.pdf, Hadoop 2 Source Code
8 | Apache Hadoop 3 - Exercises | LAB06.pdf, Hadoop 3 Source Code
9 | ElasticSearch | LAB07.pdf
10 | ElasticSearch - Exercises | LAB08.pdf
11 | Apache Spark | LAB09.pdf
12 | Apache Spark - Exercises | LAB10.pdf
13 | No Lab (project presentations week) |