
University of Cyprus
Dept. of Computer Science

EPL 605: Advanced Computer Architecture

Spring 2017



Lab Topics

Introduction to UNIX and other tools

Book: Computer Architecture: A Quantitative Approach, 5th Edition

History of the UNIX/GNU/Linux operating systems

Bash and Perl scripts.

File permissions.

Unix commands: time, grep, sort, head, uniq, chmod, make, gcc flags, gdb.

Viewers and editors: cat, vi, etc.

For more on Linux commands and useful links:


Lab 1



PIN Tool

Introduction to the PIN tool and experimentation with basic Pintools

Lab 2




SPEC CPU 2006 and gprof

Running and Profiling SPEC CPU 2006 using PIN

SPEC CPU 2006: Compiling and Running

PIN SPEC CPU 2006 Benchmarks

Profiling SPEC CPU 2006 benchmarks using gprof


SPEC CPU2006 command lines.pdf




Hardware Counters and the perf Linux Tool

Using the perf tool (Linux profiling with performance counters)

Examples using four different versions of the matrix multiplication algorithm. The effect of gcc optimization levels (gcc -O0 to -O4) on performance.
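
As a rough illustration of what different versions of matrix multiplication can look like, the sketch below shows two loop orders that compute the same product. The function names are hypothetical and this is not the lab's code; the lab versions are presumably written in C, where the loop order changes the memory access pattern and the gcc flags change the generated code.

```python
def matmul_ijk(a, b):
    """Textbook i-j-k order: the inner loop walks down a column of b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_ikj(a, b):
    """i-k-j order: the inner loop walks along a row of b and a row of c."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            aik = a[i][k]
            for j in range(p):
                c[i][j] += aik * b[k][j]
    return c
```

Both functions return the same matrix; only the order in which the elements of b and c are touched differs, which is the kind of difference perf counters can expose in a compiled version.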



Lab 4

GCC optimization flags


Labs CPU:





SimpleScalar Architecture Simulator

Introduction to SimpleScalar:

Run Examples with different cache configuration and comparing the results

Lab 5




Holiday (Clean Monday)





Cache Characterization

Characterization of cache memory using a variation of Sattolo's algorithm to incrementally generate a random cyclic permutation that grows in size each time.
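
A minimal sketch of the basic Sattolo's algorithm is given below, assuming a fixed size n; the incremental variant used in the lab is not reproduced here. The helper cycle_length is a hypothetical name added only to show that following the permutation visits every element once before returning to the start, which is what makes it useful for pointer-chasing measurements.

```python
import random


def sattolo_cycle(n, rng=random):
    """Return a permutation of range(n) that forms one cycle of length n."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, so an element never stays in place
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def cycle_length(perm, start=0):
    """Follow perm from start and count the steps needed to come back."""
    steps, pos = 0, start
    while True:
        pos = perm[pos]
        steps += 1
        if pos == start:
            return steps
```

Because the permutation is a single cycle, a loop of the form pos = perm[pos] touches all n entries in an order that is hard for a hardware prefetcher to guess.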


cat /proc/meminfo

cat /proc/vm

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/*

ls -l /sys/devices/system/cpu/cpu[01]/cache/index[0123]/*



taskset -c 0 ./a.out 500000

(CPU affinity is a scheduler property that binds a process to a given set of CPUs on the system.)
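
The same binding can be requested from inside a program. The sketch below uses Python's os.sched_setaffinity, which is available on Linux only; the function name pin_to_cpu is hypothetical and the fallback to None on other systems is an assumption made for portability.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to one CPU and return the new affinity set.

    Returns None where the Linux-only affinity calls are unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return os.sched_getaffinity(0)
```

Calling pin_to_cpu(0) at the start of a measurement script has roughly the effect of launching it with taskset -c 0.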


RDTSC and RDTSCP assembly instructions.

Intel CPUs have a timestamp counter to keep track of every cycle that occurs on the CPU. Starting with the Intel Pentium® processor, the devices have included a per-core timestamp register that stores the value of the timestamp counter and that can be accessed by the RDTSC and RDTSCP assembly instructions.
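
Two timestamp readings taken around a measured loop can be turned into time or into cycles per access with simple arithmetic. The sketch below assumes a constant-rate timestamp counter whose frequency is known; the function names and the frequency value in the usage note are hypothetical.

```python
def cycles_to_ns(start_cycles, end_cycles, tsc_hz):
    """Convert a timestamp-counter difference to nanoseconds."""
    return (end_cycles - start_cycles) / tsc_hz * 1e9


def cycles_per_access(start_cycles, end_cycles, accesses):
    """Average number of cycles spent per memory access in the measured loop."""
    return (end_cycles - start_cycles) / accesses
```

For example, with an assumed 3 GHz counter, a difference of 3000 cycles corresponds to about 1000 ns.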

Cache Characterization. (Lorena Ndreu)






Micro-ARchitectural and System Simulator for x86-based Systems

MARSSx86: Downloading, compiling, running and simulating SPEC CPU 2006 on CentOS Linux release 7.3.1611 (Core) without being root.

For checkpoint creation for SPEC2006, see and modify:


Lab 7




Micro-ARchitectural and System Simulator for x86-based Systems

MARSSx86 Configurations

Lab 8





Princeton Application Repository for Shared-Memory Computers

The Princeton Application Repository for Shared-Memory Computers (PARSEC) is a benchmark suite composed of multithreaded programs.


Introduction to ARM Architecture and SoC

Lab 9




Paper presentations by the students of the course

EPL605: How-To Guide


Remote connect to Linux Machines


(Do not always use the same machine, e.g. 103ws1.)

How to get the space used by each folder

Use du -hs * for the space used by each folder, or du -hsc * to also get the total used space for your account. 3rd year students have 600 MB of space.

Another way to see how much space you have and how much of it is used is the department's portal.

To free space, delete the .cache folder.
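
For a scripted view of the same information, the sketch below sums file sizes per top-level entry of a folder. The function names are hypothetical, and the numbers are apparent file sizes in bytes, so they can differ from what du reports (du counts allocated disk blocks).

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def usage_by_entry(path="."):
    """Map each file or folder directly inside path to its size in bytes."""
    usage = {}
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            usage[entry.name] = dir_size_bytes(entry.path)
        elif entry.is_file(follow_symlinks=False):
            usage[entry.name] = entry.stat().st_size
    return usage
```

Sorting the result of usage_by_entry by value shows which folders are worth cleaning first.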

How to Tunnel X Windows Securely over SSH

ssh -X

How to check Linux Distribution version

cat /etc/centos-release

CentOS Linux release 7.3.1611 (Core)

uname -a

Linux b103ws32 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Screen: a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).


Where can I find the Benchmark Software and Input files


Export gnuplot graph into a file

set term png

set output "output.png"

plot sin(x) with linespoints pointtype 3
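
When the values to plot come from an experiment rather than a built-in function, they are usually written to a two-column text file first. The sketch below writes such a file; the function name and the default range are hypothetical, and sin is used only to mirror the example above.

```python
import math


def write_gnuplot_data(path, func=math.sin, start=-10.0, stop=10.0, points=101):
    """Write 'x y' pairs, one per line, in a format gnuplot can read."""
    step = (stop - start) / (points - 1)
    with open(path, "w") as out:
        for i in range(points):
            x = start + i * step
            out.write(f"{x:.6f} {func(x):.6f}\n")
    return points
```

The resulting file can then be plotted with a gnuplot command such as plot "data.dat" using 1:2 with linespoints.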

Machine Identification Commands

cat /proc/cpuinfo

cat /proc/meminfo






lspci (better try lspci | grep NVIDIA)

Cache Identification Commands



cat /proc/meminfo

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/size

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/level                    

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/type

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/ways_of_associativity

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/number_of_sets

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/physical_line_partition 

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/coherency_line_size     

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/shared_cpu_list         

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/shared_cpu_map
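
The fields listed above are related: the cache capacity is the product of the associativity, the number of sets, the line size and the number of physical line partitions. The sketch below checks that relation for one index directory; the function names are hypothetical, and the directory path is passed in rather than assumed.

```python
import os


def cache_size_bytes(ways, sets, line_size, partitions=1):
    """Capacity in bytes = ways x sets x line size x partitions."""
    return ways * sets * line_size * partitions


def read_cache_index(index_dir):
    """Read the integer sysfs fields of one cache index directory."""
    fields = {}
    for name in ("ways_of_associativity", "number_of_sets",
                 "coherency_line_size", "physical_line_partition"):
        with open(os.path.join(index_dir, name)) as f:
            fields[name] = int(f.read().strip())
    return fields


def size_from_sysfs(index_dir):
    """Compute the capacity of the cache described by index_dir."""
    f = read_cache_index(index_dir)
    return cache_size_bytes(f["ways_of_associativity"], f["number_of_sets"],
                            f["coherency_line_size"], f["physical_line_partition"])
```

For instance, 8 ways, 64 sets and 64-byte lines give 32768 bytes, which should agree with the value in the size file of the same directory.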

Using lstopo to print the Machine Topology

Download from


For the Labs Machines

vi configure +5316 (comment out some lines of the configure script, since you are not root)

#HWLOC_VERSION="`$srcdir/config/ $srcdir/VERSION`"

#if test "$?" != "0"; then

#    as_fn_error $? "Cannot continue" "$LINENO" 5

#fi

#HWLOC_RELEASE_DATE="`$srcdir/config/ $srcdir/VERSION --release-date`"


chmod +x configure




/hwloc-1.9.1/hwloc-1.9.1>./utils/lstopo HWTopology.pdf


For cs9472

No modifications are needed to the configure script other than chmod +x configure



lstopo --of txt cs6478lstopo.output.txt


Useful Commands on Linux OS


Run a process in the background

./a.out &

Kill a process that runs in the foreground

Ctrl+C

Suspend a running process and force it to continue in the background

Ctrl+Z, then bg

Bring the last process from the background to the foreground

fg

See the running processes

ps

Task-manager-like view

top

Kill a process that runs in the background

kill -9 [process id]
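
The same start-in-background and force-kill sequence can be scripted. The sketch below is an illustration only: the child process is a stand-in sleeper rather than ./a.out, and both function names are hypothetical.

```python
import os
import signal
import subprocess
import sys


def start_background_sleeper(seconds=60):
    """Start a child process that just sleeps (a stand-in for ./a.out &)."""
    code = f"import time; time.sleep({seconds})"
    return subprocess.Popen([sys.executable, "-c", code])


def force_kill(proc):
    """Send SIGKILL (signal 9) to the child and return its exit status."""
    os.kill(proc.pid, signal.SIGKILL)
    return proc.wait()
```

On POSIX systems the returned status is the negated signal number, so a child stopped this way reports -9.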

SimpleScalar: get stats at any instruction interval (e.g. every 100 instructions)

sim.h line 79 add


counter_t last_sim_num_insn;



sim-outorder.c line 4602 add


if ((last_sim_num_insn != sim_num_insn) && (sim_num_insn % 100) == 0) {

            /* remember the last reported count so each interval is printed once */
            last_sim_num_insn = sim_num_insn;

            printf("sim_num_insn=%u\n", (unsigned int)sim_num_insn);

}

Execute with

./sim-outorder tests-alpha/bin/test-math 2> output.txt

And get your statistics with

grep bpred_bimod.misses  output.txt
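
If the output file contains the same statistic many times (once per interval), the values can be collected into a series and differenced. The sketch below assumes each statistic line has the form name, value, then an optional comment starting with #; the function names are hypothetical.

```python
import re

# name, whitespace, numeric value, optional "# description" comment
STAT_LINE = re.compile(
    r"^(?P<name>[A-Za-z_][\w.:]*)\s+(?P<value>-?\d+(?:\.\d+)?)\s*(?:#.*)?$"
)


def stat_series(lines, stat_name):
    """Return every value reported for stat_name, in file order."""
    values = []
    for line in lines:
        match = STAT_LINE.match(line.strip())
        if match and match.group("name") == stat_name:
            values.append(float(match.group("value")))
    return values


def deltas(values):
    """Differences between consecutive samples, e.g. misses per interval."""
    return [b - a for a, b in zip(values, values[1:])]
```

With the file above, deltas(stat_series(open("output.txt"), "bpred_bimod.misses")) would give the number of new misses in each interval.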


Compile, run and pin a benchmark for EPL605


This is the method that should be used to compile, run and pin a benchmark for EPL605 (tested on all benchmarks except 481):

Let's assume that you have the code in the folder $SPEC (e.g. SPEC = ~/SPEC2006INTEL64/).

cd /home/students/cs/SPEC2006/SPEC2006DVD


cd ~/SPEC2006INTEL64/

./ -d .

(For the above you MUST be in the SPEC2006INTEL64 folder.)

Enter the architecture you are using:



set SPEC = ~/SPEC2006INTEL64

. ./shrc

cd config

cp linux64-amd64-gcc42.cfg EPL605-configuration.cfg

vi EPL605-configuration.cfg (Edit the compilers)

CC           = /usr/bin/gcc

CXX          = /usr/bin/g++

FC           = /usr/bin/gfortran


runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

This will compile benchmark 401 and put the executable in exe/

(NOTE 1: If you want to change the compilation flags, e.g. from -O2 to -O3, or add more compilation flags, you have to modify EPL605-configuration.cfg.

It is better to create a new copy before modifying it.

NOTE 2: The name of the executable includes amd64. This is just another parameter in EPL605-configuration.cfg that can be changed to intel64 if you like.)

To just run bzip2:

./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source


To run your SPEC benchmark under PIN, go to the PIN directory and execute the following command:

./pin -t source/tools/ManualExamples/obj-intel64/ -- $SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn $SPEC/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

cat inscount.out

Count 92080051447
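
The count can be picked up by a script for further calculations. The sketch below assumes the file contains a line of the form Count followed by a number, as shown above; the function names are hypothetical, and the run time needed for the throughput figure must be measured separately (for example with time).

```python
def parse_inscount(text):
    """Return the integer after the word 'Count' in inscount.out-style text."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "Count":
            return int(parts[1])
    raise ValueError("no 'Count <n>' line found")


def millions_per_second(instructions, seconds):
    """Instruction throughput in millions of instructions per second."""
    return instructions / seconds / 1e6
```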


Profiling SPEC

vi ~/SPEC2006INTEL64/config/EPL605-configuration.cfg

COPTIMIZE     = -O0 -pg


FOPTIMIZE    = -O0 -pg
runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

$SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn $SPEC/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

gprof $SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn > gprogResults.txt

less gprogResults.txt

>gprof benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn | python | dot -Tjpg -o output.jpg


Downloading, compiling, running and simulating SPEC CPU 2006 on CentOS release 6.5 without being root.

All the files are installed in the following folder

cd /home/students/cs/SPEC2006/simulators/

Download SCons and untar it in a clean folder:

cd /home/students/cs/SPEC2006/simulators/images


tar -xvf scons-2.5.1.tar.gz

cd scons-2.5.1

python install --prefix=.


Download Marss x86 using git clone from: 

cd /home/students/cs/SPEC2006/simulators/marss_x86

git clone git://


Compile using the following (use c=n, where n is the number of cores; only from the B103 machines for now):

cd /home/students/cs/SPEC2006/simulators/marss_x86

../scons/bin/scons -Q config=config/default.conf


To run QEMU with simulation, telnet and network capabilities:

qemu/qemu-system-x86_64 -curses -monitor telnet:,server,nowait -m 2048 -hda ../images/ubuntu-natty-SPEC2006-STD.qcow2 -net nic,model=ne2k_pci -net user -simconfig simconfig

or if you run it from your folder

qemu/qemu-system-x86_64 -curses -monitor telnet:,server,nowait -m 2048 -hda /home/students/cs/SPEC2006/simulators/marss_x86/images/ubuntu-natty-SPEC2006-STD.qcow2 -net nic,model=ne2k_pci -simconfig simconfig

where simconfig is:

# Sample Marss simconfig file

 -machine single_core


 # Logging options

 -logfile results/test.log

 -loglevel 10

 # Start logging after 10 million cycles

 # -startlog 10m


 # Stats file

 -stats results/test.stats.yml

 -stopinsns 2000000000
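
When many runs with different settings are needed, the simconfig file can be generated instead of edited by hand. The sketch below is an assumption about how one might do that; it only emits the options that appear in the sample above, and the function name is hypothetical.

```python
def render_simconfig(machine, logfile, loglevel, stats, stopinsns):
    """Return the text of a simconfig file using the options shown above."""
    lines = [
        "# Generated Marss simconfig file",
        f"-machine {machine}",
        f"-logfile {logfile}",
        f"-loglevel {loglevel}",
        f"-stats {stats}",
        f"-stopinsns {stopinsns}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file named simconfig reproduces the sample, and changing the stats argument per run keeps the results of different experiments apart.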


Log in to the machine with username root and password root.

Terminate the machine from inside: shutdown 0 -P

On a new console, telnet to QEMU's monitor console with:

telnet 1234

Once in the guest (emulated) OS, you will have to fix apt-get in order to install new software such as gcc and wget:

vi /etc/apt/sources.list

and replace natty with trusty.

Then install the gcc compiler, which is needed to compile the SPEC benchmarks:

apt-get install gcc 

To copy SPEC2006 into the image:

mkdir SPEC2006

cd SPEC2006

scp -r [your_username] .

You can follow the standard SPEC procedure for the installation with:

(You don't have to do this if you are using ubuntu-natty-SPEC2006-STD.qcow2.)

./install -d ~/SPEC2006_MARSSx86 

set SPEC = ~/SPEC2006_MARSSx86
. ./shrc
cd config
cp linux64-amd64-gcc42.cfg EPL605-configuration.cfg
vi EPL605-configuration.cfg (Edit the compilers)
CC           = /usr/bin/gcc
CXX          = /usr/bin/g++
FC           = /usr/bin/gfortran

To compile 401.bzip2

runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

To run 401.bzip2

./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source

To simulate 401.bzip2

~/start_sim; ./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source ; ~/kill_sim;

Typical output file (test.stats.yml)





Executing from within the Pin Directory.


./pin -t ./source/tools/Tests/obj-intel64/ -- /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc41-nn /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/401.bzip2/data/ref/input/input.source


cp /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/data/all/input/foreman_qcif.yuv .

./pin -t ./source/tools/Tests/obj-intel64/ -- /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/exe/h264ref_base.amd64-m64-gcc41-nn -d /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/data/ref/input/foreman_ref_encoder_baseline.cfg

Specification of the Branch Predictor Type

-bpred <type>

-bpred:bimod <size>

-bpred:2lev <l1size> <l2size> <hist_size>
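
To make the meaning of the bimodal table size concrete, here is a simplified sketch of a bimodal predictor: a table of 2-bit saturating counters indexed by the branch address. The class name, the indexing by (pc >> 2) and the initial counter value are assumptions for illustration; SimpleScalar's own hashing and initialization may differ.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters; values 2 and 3 predict taken."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.table = [1] * size  # assumed start state: weakly not-taken
        self.lookups = 0
        self.misses = 0

    def predict_and_update(self, pc, taken):
        """Predict the branch at pc, then train the counter with the outcome."""
        index = (pc >> 2) & self.mask
        counter = self.table[index]
        predicted_taken = counter >= 2
        self.lookups += 1
        if predicted_taken != taken:
            self.misses += 1
        if taken:
            self.table[index] = min(3, counter + 1)
        else:
            self.table[index] = max(0, counter - 1)
        return predicted_taken
```

A smaller table makes more branches share a counter, which is one reason the miss count changes when the size argument is varied.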








Petros Panayi, © 2017