Schedule

Date

 

Description

Bibliography

Slides

22/01/2024

 

What is data mining on the Web / Introductory lecture

Chapter 1, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 1

25/01/2024

 

Map-Reduce Framework

Chapter 2, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 2

29/01/2024

 

Frequent itemsets and Association rules

Chapter 6, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 3

01/02/2024

 

Frequent itemsets and Association rules

Chapter 6, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 4

05/02/2024

 

Finding Similar Items

Chapter 3, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 5

08/02/2024

 

Finding Similar Items

Chapter 3, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 6

15/02/2024

 

Clustering

Chapter 7, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 7

19/02/2024

 

Clustering

Chapter 7, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

  • Jain, A. K., Murty, M. N., and Flynn, P. J. 1999. Data clustering: a review. ACM Comput. Surv. 31, 3 (Sep. 1999), 264-323.

Lecture 8

22/02/2024

 

Recommendation Systems

Chapter 9, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 9

26/02/2024

 

Dimensionality Reduction

Chapter 9, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 10

29/02/2024

 

Mining Data Streams

Chapter 4, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 11

04/03/2024

 

Mining Data Streams

Chapter 4, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 12

07/03/2024

 

Midterm

Chapters: 1, 2, 3, 6, 7, 9

Midterm sample

11/03/2024

 

Link Analysis and Web search

Chapter 5, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Optional Reading:

Lecture 13

14/03/2024

 

Link Analysis and Web search

Chapter 5, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 14

21/03/2024

 

Advertising on the Web

Chapter 8, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 15

28/03/2024

 

Learning through Experimentation 

Chapter 8, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Multi-armed bandit problem (Wikipedia)

A Contextual-Bandit Approach to Personalized News Article Recommendation, by Li, Chu, Langford and Schapire. WWW 2010.

 

Lecture 16

04/04/2024

 

Large-Scale Machine Learning

Chapter 12, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 17

08/04/2024

 

Mining Social-Network Graphs

Chapter 10, Mining Massive Datasets, by Jure Leskovec, Anand Rajaraman and Jeff Ullman, Cambridge University Press, 2014

Lecture 18

11/04/2024

 

Use case I: Analysing, Detecting and Categorizing Polarizing Topics in News Media

Use case II: StreamSight: A Query-Driven Framework for Streaming Analytics in Edge Computing

Vosoughi, S., Roy, D. & Aral, S. The spread of true and false news online. Science 359, 1146–1151 (2018).

Don't blame bots, fake news is spread by humans | Sinan Aral

StreamSight: A Query-Driven Framework for Streaming Analytics in Edge Computing. Zacharias Georgiou, Moysis Symeonides, Demetris Trihinas, George Pallis, Marios D. Dikaiakos. In UCC, 2018.

Edge computing is the emerging architectural paradigm extending cloud technologies to the logical extremes of the network for on-demand and delay-sensitive services. However, once service placement on edge-enabling resources has been dealt with, a new challenge arises: how to process enormous volumes of streaming data to provide query-driven analytics while still satisfying the delay-critical servicing requirements. To overcome this challenge we introduce StreamSight, a framework for edge-enabled IoT services which provides a rich and declarative query model abstraction for expressing complex analytics on monitoring data streams and then dynamically compiling these queries into stream processing jobs for continuous execution on distributed processing engines.

Lecture 19

15/04/2024

 

Project Presentations

18/04/2024

 

Project Presentations