The Department of Computer Science at the University of Cyprus cordially invites you to the PhD Defense entitled:

Cache Content Duplication

Speaker: Marios Kleanthous
Affiliation: University of Cyprus, Cyprus
Category: PhD Defense
Location: Room 148, Faculty of Pure and Applied Sciences (FST-01), 1 University Avenue, 2109 Nicosia, Cyprus
Date: Friday, April 6, 2012
Time: 10:30-11:30 EET
Host: Yanos Sazeides (yanos AT cs.ucy.ac.cy)
URL: https://www.cs.ucy.ac.cy/colloquium/presentations.php#cs.ucy.pres.2012.kleanthous

Abstract:
The importance of caches and the memory hierarchy has increased over time due to the growing gap between processor and memory performance, and has grown further with Simultaneous Multithreading processors and Chip-multiprocessors. To bridge this gap, caches have been the subject of numerous studies aiming to improve their performance as well as their power and area efficiency. This thesis identifies a new phenomenon in caches that has the potential to improve cache performance and efficiency: Cache Content Duplication (CCD). CCD occurs when there is a miss for a block in a cache and the entire content of the missed block is already in the cache in a block with a different tag. Caches aware of content duplication can have a lower miss penalty by fetching, on a miss to a duplicate block, directly from the cache instead of accessing lower levels of the memory hierarchy, and can have lower miss rates by allowing only blocks with unique content to enter the cache. The usefulness of CCD is examined at all levels of the memory hierarchy. First, we show that CCD is a frequent phenomenon for instruction caches and that an idealized duplication-detection mechanism for instruction caches has the potential to increase the performance of an out-of-order processor, with a 16KB, 8-way instruction cache holding 8 instructions per block, often by more than 10% and up to 36%. We also propose CATCH, a hardware mechanism for dynamically detecting CCD in instruction caches. Experimental results for an out-of-order processor show that a duplication-detection mechanism with a 1.38KB cost captures on average 58% of CCD's idealized potential. Second, we examine another case of CCD which we call Text Cloning. Text Cloning can occur when running multiple copies of the same binary (Extrinsic Text Cloning), or when running multiple instances of the same application in a Virtually Indexed Virtually Tagged cache (Intrinsic Text Cloning).
Results show that both Intrinsic Text Cloning and Extrinsic Text Cloning can reduce an application's performance. Specifically, Extrinsic Text Cloning causes up to 11% slowdown on existing platforms. Furthermore, we show that CATCH can benefit performance by eliminating the duplication due to Intrinsic Text Cloning and Extrinsic Text Cloning. Third, we investigate the potential of CCD for L1 data caches. The results indicate that these caches contain a large number of dirty blocks, which makes detecting CCD and maintaining stable correlations between different blocks very difficult: if a block is written, all duplicate relations to that block need to be invalidated. Our analysis also shows that zero runs are very frequent in L1 data caches and, therefore, previously proposed zero-detection mechanisms can provide good solutions. Finally, this thesis considers the CCD phenomenon for Last Level Caches (LLCs). LLCs are written less frequently (the L1 data cache acts as a filter) and have fewer zero runs because they mostly store evicted cache blocks that have already been written with non-zero values. Results indicate that CCD is very frequent for various block granularities, from 4 bytes up to 64 bytes, and has the potential to improve processor performance or save energy. A new cache design, the Content Duplication Aware Cache, is proposed to detect and eliminate CCD in LLCs. The results indicate that the Content Duplication Aware Cache can improve performance moderately but can reduce the energy-delay product considerably, by up to 15% and by 10% on average, for multiprogram workloads.
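
To make the definition of CCD concrete, here is a minimal sketch in Python that counts how many misses in a toy cache are misses to a duplicate block, that is, misses whose content is already resident under a different tag. It is an illustration only and is not the CATCH mechanism or the Content Duplication Aware Cache described above. The function name count_duplicate_misses, the fully associative LRU organization, and the trace and memory inputs are all assumptions made for this example.

```python
from collections import OrderedDict


def count_duplicate_misses(trace, memory, capacity):
    """Count misses whose block content is already cached under another tag.

    trace    -- iterable of block addresses (used directly as tags)
    memory   -- dict mapping block address -> block content (bytes)
    capacity -- number of blocks in a hypothetical fully associative LRU cache
    """
    cache = OrderedDict()  # tag -> content, ordered from LRU to MRU
    misses = 0
    duplicate_misses = 0

    for tag in trace:
        if tag in cache:
            cache.move_to_end(tag)  # hit: refresh the LRU position
            continue

        misses += 1
        content = memory[tag]

        # CCD event: the missed block's entire content is already in the
        # cache, held by a block with a different tag.
        if any(resident == content for resident in cache.values()):
            duplicate_misses += 1

        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used block
        cache[tag] = content

    return misses, duplicate_misses


if __name__ == "__main__":
    # Blocks 0 and 2 hold identical content under different addresses.
    memory = {0: b"AAAA", 1: b"BBBB", 2: b"AAAA", 3: b"CCCC"}
    print(count_duplicate_misses([0, 1, 2, 0, 3, 2], memory, capacity=4))
```

In this sketch the duplicate block is still inserted, so it only measures how often CCD occurs; a duplication-aware cache would instead serve such a miss from the resident copy or avoid storing the same content twice.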

Short Bio:
Marios Kleanthous is a PhD candidate at the Department of Computer Science, University of Cyprus. He received his BSc in Informatics and Telecommunications from the National and Kapodistrian University of Athens in 2004 and his MSc in Computer Science from the University of Cyprus in 2006. In September 2006 he worked at ARM Ltd in Cambridge for three months during a HiPEAC-funded internship. His research interests include memory hierarchy optimizations, especially cache compression techniques.

  Other Presentations Web: https://www.cs.ucy.ac.cy/colloquium/presentations.php
  Colloquia Web: https://www.cs.ucy.ac.cy/colloquium/
  Calendar: https://www.cs.ucy.ac.cy/colloquium/schedule/cs.ucy.pres.2012.Kleanthous.ics