Installing and configuring all the software needed for this course on your own machine can be tedious. We have therefore prepared cloud-based virtual machines (VMs), one per student, which can be accessed remotely and used throughout the current semester. Please find here a comprehensive guide on how to connect to the cloud-based virtual machines provided by the Laboratory for Internet Computing (LInC).

Week | Description | Material
2 | Inverted Index and the Boolean Model using NLTK and Apache OpenNLP | LAB01.pdf
3 | Apache Lucene | LAB02.pdf, Lucene 1 Solution, Lucene 2 Solution
4 | Apache Solr | LAB03.pdf
5 | ElasticSearch | LAB04.pdf
6 | Apache Hadoop 1 | LAB05.pdf, Hadoop 1 Source Code, Hadoop 1 Solution
7 | Apache Hadoop 2 | LAB06.pdf, Hadoop 2 Source Code, Hadoop 2 Solution
8 | Apache Hadoop 3 | LAB07.pdf, Hadoop 3 Source Code, Hadoop 3 Solution
9 | Apache Nutch | LAB08.pdf
10 | Apache Tika | LAB09.pdf
11 | Text Clustering and Classification in Python | LAB10
12 | Apache Spark | LAB11.pdf
13 | Assignment 2 Demonstration | All students are kindly requested to be present.
14 | Project Presentations