University of Cyprus
Dept. of Computer Science

EPL451: Data Mining on the Web

Schedule

Spring 2013

 


Lectures

Topics

Readings

Introduction

22/1

What is data mining on the Web? / Introductory lecture

[slides]

Chapter 1, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

25/1

What is data mining on the Web? / Introductory lecture

[slides]

Chapter 1, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

29/1

Map-Reduce Framework

[slides]

Chapter 2, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

1/2

Map-Reduce Framework

[slides]

Chapter 2, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

 

1/2

Hadoop

[slides]

[VM image]

[WordCount.java]

Data Mining Concepts

5/2

Frequent itemsets and Association rules

[slides]

Chapter 6, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

8/2

Frequent itemsets and Association rules

[slides]

Chapter 6, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

8/2

Hadoop

[slides] [Dataset]

Programming with Hadoop

12/2

Finding Similar Items

[slides]

Chapter 3, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

15/2

Finding Similar Items

[slides]

Chapter 3, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

15/2

Hadoop

[Lab 3 Files & Instructions]

Programming with Hadoop

 

19/2

Finding Similar Items / Clustering

[slides]

Chapter 3, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

Chapter 7, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

22/2

Clustering

[slides]

 

Chapter 7, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

Optional Reading:

  • Jain, A. K., Murty, M. N., and Flynn, P. J. 1999. Data clustering: a review. ACM Comput. Surv. 31, 3 (Sep. 1999), 264-323.

22/2

Mahout

[Lab 4 Presentation]

26/2

Clustering

[slides]

Chapter 7, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

Applications on the Web

1/3

Recommendation Systems

[slides]

Chapter 9, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

1/3

Mahout

[Lab 5]

  • Continuation of the previous lab (Freq. Pattern Matching)
  • Mahout k-means Clustering (Lab 5)

5/3

Recommendation Systems

[slides]

Chapter 9, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

8/3

Mahout

[Lab 6]

  • Mahout k-means Clustering

12/3

Dimensionality Reduction

[slides]

Chapter 9, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

15/3

Link Analysis and Web search

[slides]

Chapter 5, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

15/3

Mahout

[Lab 7]

 

  • Introduction to Mahout Recommender
  • Example of User-Based Recommendations

19/3

Midterm

[sample]

Chapters: 1, 2, 3, 6, 7

Grades

22/3

Link Analysis and Web search

[slides]

Chapter 5, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

22/3

Mahout

[Lab 8]

  • Dimensionality Reduction

26/3

Link Analysis and Web search

[slides]

Chapter 5, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

29/3

Mining Data Streams

[slides]

Chapter 4, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

29/3

Mahout

[Lab 9]

  • Example of PageRank with Mahout

2/4

Mining Data Streams

[slides]

Chapter 4, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

5/4

Advertising on the Web

[slides]

Chapter 8, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

5/4

[Lab 10]

[Snap Graph Library homepage]

  • Introduction to the Snap Graph Library
  • Finding graph communities using the Snap Graph Library

9/4

Advertising on the Web

[slides]

Chapter 8, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2011

12/4

The Role of Twitter in YouTube Videos Diffusion

[slides]

G. Christodoulou, C. Georgiou, G. Pallis. The Role of Twitter in YouTube Videos Diffusion. In Proceedings of the 13th International Conference on Web Information Systems Engineering (WISE 2012), Springer LNCS, Paphos, Cyprus, November 28th-30th, 2012.

12/4

Search Engine Optimization

Room 201, Pure & Applied Sciences Bldg. I

Invited talk by Pantelis Vladimirou (Webarts)

 

16/4

Project Presentations [Presentation schedule]

19/4

Project Presentations

19/4

Project Review

Final Exams

George Pallis, © 2013