Labs

Installing and configuring all the software needed for this course on your own machine can be tedious. We have therefore prepared cloud-based virtual machines (VMs), one per student, which can be accessed remotely during the current semester. Please find here a comprehensive guide on how to connect to the cloud-based virtual machines provided by the Laboratory for Internet Computing (LInC).

| Week | Description | Material |
| --- | --- | --- |
| 2 | Inverted Index and the Boolean Model using NLTK and Apache OpenNLP | LAB01.pdf, lab1.py, OpenNLP.zip |
| 3 | Apache Lucene | LAB02.pdf, dataset.zip, Lucene 1 Solution, Lucene 2 Solution |
| 4 | Apache Solr | LAB03.pdf |
| 5 | ElasticSearch | LAB04.pdf, lab4.zip, elasticJava.zip |
| 6 | Apache Hadoop 1 | LAB05.pdf, Hadoop 1 Source Code, Dataset, Hadoop 1 Solution |
| 7 | Apache Hadoop 2 | LAB06.pdf, Hadoop 2 Source Code -- WordCount.java, Hadoop 2 Solution |
| 8 | Apache Hadoop 3 | LAB07.pdf, Hadoop 3 Source Code, Dataset, Hadoop 3 Solution |
| 9 | Apache Nutch | LAB08.pdf |
| 10 | Apache Tika | LAB09.pdf, LAB09.zip |
| 11 | Text Clustering and Classification in Python | LAB10, Lab10-description.pdf, labeledTrainData.tsv |
| 12 | Apache Spark | LAB11.pdf |
| 13 | Assignment 2 Demonstration | All students are kindly requested to be present. |
| 14 | Project Presentations | |