
University of Cyprus
Dept. of Computer Science

EPL 605: Advanced Computer Architecture

Spring 2017




 

Lab Topics

 

Week

Date

Description

Assignments / Readings

Slides

1

16/01/2017

 

 

 

2

23/01/2017

Introduction to UNIX and other tools

Book: Computer Architecture: A Quantitative Approach, 5th Edition

History of the UNIX/GNU/Linux operating systems.

Bash and Perl scripts.

File permissions.

Unix commands: time, grep, sort, head, uniq, chmod, make, gcc flags, gdb.

Viewing and editing files: cat, vi, etc.

More on the Linux commands, and other useful links:

http://www.oreillynet.com/linux/cmd/

https://www.gnu.org/

https://gcc.gnu.org/onlinedocs/

https://sourceware.org/gdb/download/onlinedocs/gdb/index.html

https://www.gnu.org/software/make/manual/

https://www.gnu.org/software/bash/manual/bash.pdf

http://www.tldp.org/LDP/abs/abs-guide.pdf

http://www.tldp.org/HOWTO/pdf/Bash-Prog-Intro-HOWTO.pdf

https://www.gnu.org/software/grep/manual/

http://www.gnuplot.info/

https://godbolt.org/

 

Lab 1

3

30/01/2017

PIN Tool

Introduction to the Pin tool and experimentation with basic Pintools

Lab 2

https://software.intel.com/en-us/articles/pintool

https://software.intel.com/sites/landingpage/pintool/docs/67254/Pin/html/

 

4

06/02/2017

SPEC CPU 2006 and gprof

Running and profiling SPEC CPU 2006 using Pin

SPEC CPU 2006: compiling and running

Running the SPEC CPU 2006 benchmarks under Pin

Profiling a SPEC CPU 2006 benchmark using gprof

EPL605Lab4SPEC2006_gprof.pdf

gprof2dot.py

SPEC CPU2006 command lines.pdf

http://www.spec.org/cpu2006/

https://www.spec.org/cpu2006/Docs/

https://www.spec.org/cpu2006/publications/CPU2006benchmarks.pdf

https://www.cs.utah.edu/dept/old/texinfo/as/gprof.html

 

5

13/02/2017

Hardware Counters and the perf Linux Tool

Using the perf tool (Linux profiling with performance counters)

Examples using four different versions of the matrix multiplication algorithm, and the effect of the gcc optimization levels (gcc -O0 to -O3; GCC treats higher levels such as -O4 as -O3) on performance.
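The sweep over optimization levels can be scripted. The following is a minimal sketch, not the lab's own script: it only builds the gcc and perf stat command lines for each level. The source file name matmul.c, the binary names and the chosen event list are assumptions made for illustration.

```python
import shlex

# Hypothetical names: the lab's real sources are not reproduced on this page.
SOURCE = "matmul.c"
EVENTS = "cycles,instructions,cache-misses"  # generic perf event names


def sweep_commands(source=SOURCE, levels=(0, 1, 2, 3), events=EVENTS):
    """Return (compile_cmd, profile_cmd) argv pairs, one per -O level."""
    stem = source.rsplit(".", 1)[0]
    pairs = []
    for level in levels:
        binary = f"./{stem}_O{level}"
        compile_cmd = ["gcc", f"-O{level}", "-o", binary, source]
        profile_cmd = ["perf", "stat", "-e", events, binary]
        pairs.append((compile_cmd, profile_cmd))
    return pairs


if __name__ == "__main__":
    # Print the commands; paste them into a shell or hand them to subprocess.run.
    for compile_cmd, profile_cmd in sweep_commands():
        print(shlex.join(compile_cmd))
        print(shlex.join(profile_cmd))
```

Keeping one binary per level makes it easy to rerun perf stat on the same executable and to inspect each one with objdump afterwards.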

 

 gnuplot

Lab 4

Lab3Files.zip

https://perf.wiki.kernel.org/index.php/Tutorial

GCC optimization Flags (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html)

https://sourceware.org/binutils/docs/binutils/objdump.html

https://sourceware.org/gdb/current/onlinedocs/gdb/index.html#SEC_Contents

http://www.gnuplot.info/docs_4.6/gnuplot.pdf

http://www.brendangregg.com/Slides/Velocity2015_LinuxPerfTools.pdf

 

Labs CPU:

http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz

https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

 

 

6

20/02/2017

SimpleScalar Architecture Simulator

Introduction to SimpleScalar:

Running examples with different cache configurations and comparing the results

Lab 5

http://www.simplescalar.com

http://www.simplescalar.com/docs/simple_tutorial_v2.pdf

 

7

27/02/2017

Holiday (Clean Monday)

 

 

8

06/03/2017

Cache Characterization

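Characterization of the cache memory using a variation on Sattolo's algorithm that incrementally generates a random cyclic permutation, increasing its size each time. Chasing pointers through such a permutation visits every element exactly once per pass, in an order that is hard for the hardware prefetcher to predict, so the time per access reflects the latency of the cache level that the working set fits in.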
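As an illustration of the permutation itself (not of the lab's lat.cc, which is not reproduced here), the sketch below implements the standard Sattolo shuffle and one possible way to grow the cycle by one element at a time. The function names and the growth step are assumptions. Python timings say nothing about cache latency, so this only demonstrates the data structure that a C or C++ pointer-chasing loop would walk.

```python
import random


def sattolo_cycle(n, rng):
    """Return a list p where i -> p[i] is one cycle through all n indices."""
    p = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i: never swap an element with itself
        p[i], p[j] = p[j], p[i]
    return p


def grow_cycle(p, rng):
    """Splice one new index into the existing cycle at a random position."""
    new = len(p)
    j = rng.randrange(new)  # existing element that will point at the new one
    p.append(p[j])          # the new element inherits j's old successor
    p[j] = new


def cycle_length(p, start=0):
    """Chase pointers from start and count the steps until we return to it."""
    steps, i = 0, start
    while True:
        i = p[i]
        steps += 1
        if i == start:
            return steps


if __name__ == "__main__":
    rng = random.Random(605)
    chain = sattolo_cycle(8, rng)
    for _ in range(8):
        grow_cycle(chain, rng)
    print(len(chain), cycle_length(chain))
```

Because every growth step keeps a single cycle, the walk from index 0 always returns after exactly len(chain) steps, which is what guarantees that a timed pass touches the whole working set.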

 

cat /proc/meminfo

cat /proc/vm

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/*

ls -l /sys/devices/system/cpu/cpu[01]/cache/index[0123]/*

 

g++ lat.cc

taskset -c 0 ./a.out 500000

http://linuxcommand.org/man_pages/taskset1.html

(CPU affinity is a scheduler property that binds a process to a given set of CPUs on the system.)

 

RDTSC and RDTSCP assembly instructions.

Intel CPUs have a timestamp counter (TSC) that counts clock ticks since the processor was reset. Starting with the Intel Pentium® processor, each core has included a timestamp register that holds this count and that can be read with the RDTSC and RDTSCP assembly instructions. On recent processors the counter advances at a constant rate regardless of the core's current frequency, so it measures elapsed reference ticks rather than core clock cycles.
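Two TSC readings taken around a loop of memory accesses turn into a latency figure with simple arithmetic. The sketch below shows that conversion only; the function names are hypothetical, and the tick counts and the 3.3 GHz TSC rate in the usage note are made-up values, not measurements.

```python
def ticks_per_access(tsc_start, tsc_end, accesses):
    """Average number of TSC ticks spent on each access."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return (tsc_end - tsc_start) / accesses


def ticks_to_ns(ticks, tsc_hz):
    """Convert TSC ticks to nanoseconds for a TSC running at tsc_hz."""
    return ticks * 1e9 / tsc_hz
```

For example, 2,000,000 ticks spread over 500,000 accesses give 4 ticks per access, and 4 ticks on an assumed 3.3 GHz counter correspond to roughly 1.21 ns.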

Cache Characterization. (Lorena Ndreu)

 

http://blog.stuffedcow.net/2013/01/ivb-cache-replacement/

 

http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz

https://www.spec.org/cpu2006/results/res2014q3/cpu2006-20140725-30611.html

https://www.spec.org/cpu2006/results/res2014q3/cpu2006-20140725-30613.html

http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/4th-gen-core-family-desktop-vol-1-datasheet.pdf

 

https://en.wikipedia.org/wiki/Time_Stamp_Counter

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf

 

 

9

13/03/2017

MARSSx86

Micro-ARchitectural and System Simulator for x86-based Systems

MARSSx86: downloading, compiling, running and simulating SPEC CPU 2006 on CentOS Linux release 7.3.1611 (Core) without being root (see the guide further down this page).

For checkpoint creation for SPEC2006, see and modify:

https://github.com/avadhpatel/marss/blob/master/util/create_checkpoints.py

 

Lab 7

http://marss86.org/~marss86/index.php/Home

http://wiki.qemu.org/Main_Page

http://marss86.org/~marss86/index.php/Run-time_Configuration

http://bertha.cs.binghamton.edu/downloads/Marss_MICRO_2012_tutorial.pdf

https://github.com/downloads/avadhpatel/marss/Marss_ISCA_2012_tutorial.pdf

http://man7.org/linux/man-pages/man1/screen.1.html

10

20/03/2017

MARSSx86

Micro-ARchitectural and System Simulator for x86-based Systems

MARSSx86 Configurations

Lab 8

matrix_serial_std.c

11

27/03/2017

PARSEC 3.0

Princeton Application Repository for Shared-Memory Computers

The Princeton Application Repository for Shared-Memory Computers (PARSEC) is a benchmark suite composed of multithreaded programs.

 

(Introduction to ARM Architecture and SoC https://www.arm.com/)

Lab 9

http://parsec.cs.princeton.edu/

http://parsec.cs.princeton.edu/download/tutorial/3.0/parsec-tutorial.pdf

http://parsec.cs.princeton.edu/doc/parsec-report.pdf

http://marss86.org/~marss86/index.php/Disk_Images

 

12

02/04/2017

Paper presentations by the students of the course

 

 

13

24/04/2017

 

 

 

 

 

 

 

 

 

 

 

 

 

EPL605: How-To Guide

 

Connect remotely to the Linux machines

ssh username@103ws1.in.ucy.ac.cy

(Do not always use the same machine, e.g. 103ws1.)

How to get the space used by each folder

Use du -hs * to list the space used by each folder, or du -hsc * to also get the total used space for your account. 3rd year students have 600 MB of space.

http://its.cs.ucy.ac.cy/en/faqs

Another way to see how much space you have and how much of it is used is the department's portal:

https://portal.cs.ucy.ac.cy/login.php

To free space, delete the .cache folder.

How to tunnel X Windows securely over SSH

ssh -X username@103ws1.in.ucy.ac.cy

How to check Linux Distribution version

cat /etc/centos-release

CentOS Linux release 7.3.1611 (Core)

uname -a

Linux b103ws32 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

screen: a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).

http://man7.org/linux/man-pages/man1/screen.1.html

 

Where can I find the Benchmark Software and Input files

/home/students/cs/SPEC2006/

Export gnuplot graph into a file

set term png

set output "output.png"

plot sin(x) with linespoints pointtype 3

Machine Identification Commands

cat /proc/cpuinfo

cat /proc/meminfo

cat /sys/devices/system/cpu/cpu0/cache/index0/size

cat /sys/devices/system/cpu/cpu0/cache/index0/type

cat /sys/devices/system/cpu/cpu0/cache/index1/size

cat /sys/devices/system/cpu/cpu0/cache/index1/type

lscpu

lspci (better try lspci | grep NVIDIA)

Cache Identification Commands

lscpu

cat /proc/cpuinfo

cat /proc/meminfo

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/size

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/level                    

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/type

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/ways_of_associativity

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/number_of_sets

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/physical_line_partition 

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/coherency_line_size     

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/shared_cpu_list         

cat /sys/devices/system/cpu/cpu[01]/cache/index[0123]/shared_cpu_map
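The geometry fields listed above determine each cache's capacity: ways of associativity × number of sets × line size × line partitions. As a rough sketch (the function names are hypothetical, and it assumes every field file is present and numeric, which may not hold on every machine or inside a virtual machine), the values can be read and cross-checked against the reported size like this:

```python
import glob
import os

SYSFS_CPU = "/sys/devices/system/cpu"  # standard Linux sysfs location


def cache_size_bytes(ways, sets, line_size, partitions=1):
    """Capacity implied by the geometry fields that sysfs exposes."""
    return ways * sets * line_size * partitions


def read_caches(cpu=0, root=SYSFS_CPU):
    """Return one dict per cache index of the given CPU (empty if absent)."""
    caches = []
    pattern = os.path.join(root, f"cpu{cpu}", "cache", "index*")
    for index_dir in sorted(glob.glob(pattern)):
        def field(name):
            with open(os.path.join(index_dir, name)) as fh:
                return fh.read().strip()
        ways = int(field("ways_of_associativity"))
        sets = int(field("number_of_sets"))
        line = int(field("coherency_line_size"))
        parts = int(field("physical_line_partition"))
        caches.append({
            "level": int(field("level")),
            "type": field("type"),
            "reported_size": field("size"),
            "computed_bytes": cache_size_bytes(ways, sets, line, parts),
        })
    return caches


if __name__ == "__main__":
    for c in read_caches():
        print(f"L{c['level']} {c['type']:<12} size={c['reported_size']:<8} "
              f"ways*sets*line={c['computed_bytes']} bytes")
```

For instance, 8 ways × 64 sets × 64-byte lines gives 32768 bytes, i.e. a 32 KiB cache.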

Using lstopo to print the Machine Topology

Download from http://www.open-mpi.org/software/hwloc/v1.9/downloads/hwloc-1.9.1.tar.gz

============================================================================

For the Labs Machines

vi configure +5316 (comment out the following lines of the configure script, since you are not root)

#HWLOC_VERSION="`$srcdir/config/hwloc_get_version.sh $srcdir/VERSION`"

#if test "$?" != "0"; then

#    as_fn_error $? "Cannot continue" "$LINENO" 5

#fi

#HWLOC_RELEASE_DATE="`$srcdir/config/hwloc_get_version.sh $srcdir/VERSION --release-date`"

 

chmod +x configure

./configure

make

./utils/hwloc-info

./utils/lstopo HWTopology.pdf

============================================================================

For cs9472

No modifications to the configure script are needed other than chmod +x configure

./configure

make

lstopo --of txt cs6478lstopo.output.txt

 

Useful Commands on Linux OS

 

Run a process in the background

./a.out &

Kill a process running in the foreground

Ctrl-c

Suspend a running process and resume it in the background

Ctrl-z

bg

Bring the most recent background process to the foreground

fg

List your running processes

ps

Task-manager-like view

top

Kill a process running in the background

kill -9 [process id]

SimpleScalar: get statistics at a fixed instruction interval (e.g. every 100 instructions)

In sim.h, at line 79, add:

 

counter_t last_sim_num_insn;

 

 

In sim-outorder.c, at line 4602, add:

 

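/* Dump the statistics once every 100 committed instructions. */
if ((last_sim_num_insn != sim_num_insn) && (sim_num_insn % 100) == 0) {
    last_sim_num_insn = sim_num_insn;
    /* counter_t can be wider than int, so print it as unsigned long long;
       write to stderr so the marker lands next to the stats it labels. */
    fprintf(stderr, "sim_num_insn=%llu\n", (unsigned long long)sim_num_insn);
    sim_print_stats(stderr);
}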

Execute with

./sim-outorder tests-alpha/bin/test-math 2> output.txt

And extract your statistic with

grep bpred_bimod.misses output.txt
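Each dump reports cumulative counts, so the misses that occurred within one interval are the difference between consecutive dumps. The sketch below is one way to compute that, assuming each statistic is printed as "name value # description"; the function names are hypothetical.

```python
def stat_series(lines, name="bpred_bimod.misses"):
    """Collect the cumulative value of one statistic from each stats dump."""
    values = []
    for line in lines:
        fields = line.split()
        # Assumed line layout: "<name> <value> # <description>".
        if len(fields) >= 2 and fields[0] == name:
            values.append(float(fields[1]))
    return values


def per_interval(values):
    """Turn cumulative counts into the increase seen in each interval."""
    return [b - a for a, b in zip([0.0] + values[:-1], values)]


if __name__ == "__main__":
    import os
    if os.path.exists("output.txt"):
        with open("output.txt") as fh:
            misses = stat_series(fh)
        for i, delta in enumerate(per_interval(misses), start=1):
            print(i, delta)  # two columns that gnuplot can plot directly
```

Redirecting the printed columns to a file gives a data set that can be plotted with gnuplot to see how the miss count evolves over the run.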

 

Compile, run and pin a benchmark for EPL605

 

This is the method that should be used to compile, run and pin a benchmark for EPL605 (tested on all benchmarks except 481):

Let's assume that you have the code in the folder $SPEC (e.g. SPEC=~/SPEC2006INTEL64/).

cd /home/students/cs/SPEC2006/SPEC2006DVD

bash

cd ~/SPEC2006INTEL64/

./install.sh -d .

(For the above you MUST be in the SPEC2006INTEL64 directory.

When prompted, enter the architecture you are using:

linux-redhat62-ia32

)

export SPEC=~/SPEC2006INTEL64

. ./shrc

cd config

cp linux64-amd64-gcc42.cfg EPL605-configuration.cfg

vi EPL605-configuration.cfg (Edit the compilers)

CC           = /usr/bin/gcc

CXX          = /usr/bin/g++

FC           = /usr/bin/gfortran

 

runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

This will compile benchmark 401 and put the executable in exe/

(NOTE 1: If you want to change the compilation flags, e.g. from -O2 to -O3, or to add more flags, you have to modify EPL605-configuration.cfg.

It is better to create a new copy before modifying it.

NOTE 2: The name of the executable includes amd64. This is just another parameter in EPL605-configuration.cfg, which can be changed to intel64 if you like.)

To run just bzip2:

$SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn $SPEC/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

 

To run your SPEC benchmark under Pin, go to the Pin directory and execute the following command:

./pin -t source/tools/ManualExamples/obj-intel64/inscount0.so -- $SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn $SPEC/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

cat inscount.out

Count 92080051447

 

Profiling the SPEC benchmarks

vi $SPEC/config/EPL605-configuration.cfg

COPTIMIZE     = -O0 -pg

CXXOPTIMIZE  = -O0 -pg

FOPTIMIZE    = -O0 -pg
runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

$SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn $SPEC/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

gprof $SPEC/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn > gprofResults.txt

less gprofResults.txt

gprof benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn | python gprof2dot.py | dot -Tjpg -o output.jpg

=========================================================================================================================================

Downloading, compiling, running and simulating SPEC CPU 2006 on CentOS release 6.5 without being root.

All the files are installed in the following folder

cd /home/students/cs/SPEC2006/simulators/

Download SCons and untar it in a clean folder:

cd /home/students/cs/SPEC2006/simulators/images

wget http://prdownloads.sourceforge.net/scons/scons-2.3.4.tar.gz

tar -xvf scons-2.3.4.tar.gz

cd scons-2.3.4

python setup.py install --prefix=.

 

Download Marss x86 using git clone from: 

cd /home/students/cs/SPEC2006/simulators/marss_x86

git clone git://github.com/avadhpatel/marss.git

 

Compile with the following (use c=n, where n is the number of cores; only from the B103 machines for now):

cd /home/students/cs/SPEC2006/simulators/marss_x86

../scons/bin/scons -Q config=config/default.conf

 

To run QEMU with simulation, telnet and network capabilities:

qemu/qemu-system-x86_64 -curses -monitor telnet:127.0.0.1:1234,server,nowait -m 2048 -hda ../images/ubuntu-natty-SPEC2006-STD.qcow2 -net nic,model=ne2k_pci -net user -simconfig simconfig

or, if you run it from your own folder:

qemu/qemu-system-x86_64 -curses -monitor telnet:127.0.0.1:1234,server,nowait -m 2048 -hda /home/students/cs/SPEC2006/simulators/marss_x86/images/ubuntu-natty-SPEC2006-STD.qcow2 -net nic,model=ne2k_pci -simconfig simconfig

where simconfig contains:

# Sample Marss simconfig file

 -machine single_core

 

 # Logging options

 -logfile results/test.log

 -loglevel 10

 # Start logging after 10million cycles

 # -startlog 10m

 

 # Stats file

 -stats results/test.stats.yml

 -stopinsns 2000000000
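When several benchmarks are simulated, each run needs its own log and stats file names. A minimal sketch that generates one simconfig per run, using only the options that appear in the sample file; the function name and the run tags are hypothetical:

```python
def make_simconfig(tag, machine="single_core", stop_insns=2_000_000_000,
                   loglevel=10, outdir="results"):
    """Return the text of a simconfig built from the sample file's options."""
    lines = [
        f"-machine {machine}",
        f"-logfile {outdir}/{tag}.log",
        f"-loglevel {loglevel}",
        f"-stats {outdir}/{tag}.stats.yml",
        f"-stopinsns {stop_insns}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical run tags; redirect each block into its own simconfig file.
    for tag in ("bzip2", "h264ref"):
        print(f"# simconfig for {tag}")
        print(make_simconfig(tag), end="")
```

Giving every run a distinct tag keeps one run's statistics from overwriting another's in the results folder.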

 

Access to the machine: username root, password root

Terminate the machine from inside: shutdown 0 -P

On a new console, telnet to QEMU's monitor console with:

telnet 127.0.0.1 1234

Once in the guest (emulated) OS, you will have to fix the apt-get sources in order to install new software such as gcc and wget:

vi /etc/apt/sources.list

and replace natty with trusty,

then install the gcc compiler, which is needed to compile the SPEC benchmarks:

apt-get install gcc 

To copy SPEC2006 into the image:

mkdir SPEC2006

cd SPEC2006

scp -r [your_username]@103ws30.in.cs.ucy.ac.cy:/home/students/cs/SPEC2006/SPEC2006DVD/ .

You can follow the standard SPEC procedure for the installation with the commands below.

(You don't have to do this if you are using ubuntu-natty-SPEC2006-STD.qcow2.)

./install.sh -d ~/SPEC2006_MARSSx86

export SPEC=~/SPEC2006_MARSSx86
. ./shrc
cd config
cp linux64-amd64-gcc42.cfg EPL605-configuration.cfg
vi EPL605-configuration.cfg (Edit the compilers)
CC           = /usr/bin/gcc
CXX          = /usr/bin/g++
FC           = /usr/bin/gfortran

To compile 401.bzip2

runspec --config=EPL605-configuration.cfg --action=build --tune=base 401

To run 401.bzip2

./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source

To simulate 401.bzip2

~/start_sim; ./benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc42-nn ./benchspec/CPU2006/401.bzip2/data/ref/input/input.source ; ~/kill_sim;

Typical output file: test.stats.yml

=========================================================================================================================================

CPU2006

Executing from within the Pin Directory.

401.bzip2

./pin -t ./source/tools/Tests/obj-intel64/icount1.so -- /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/401.bzip2/exe/bzip2_base.amd64-m64-gcc41-nn /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/401.bzip2/data/ref/input/input.source

464.h264ref

cp /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/data/all/input/foreman_qcif.yuv .

./pin -t ./source/tools/Tests/obj-intel64/icount1.so -- /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/exe/h264ref_base.amd64-m64-gcc41-nn -d /home/students/cs/SPEC2006/SPEC2006INTEL64/benchspec/CPU2006/464.h264ref/data/ref/input/foreman_ref_encoder_baseline.cfg

SimpleScalar: specification of the branch predictor type

-bpred <type>

-bpred:bimod <size>

-bpred:2lev <l1size> <l2size> <hist_size> <xor>
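A common experiment is to sweep the bimodal table size and compare the miss counts. The sketch below only builds the sim-outorder command lines. It assumes that -bpred selects the predictor type and -bpred:bimod sets its table size, and the function name and chosen sizes (powers of two) are illustrative.

```python
def bimod_sweep(program, sizes=(512, 1024, 2048, 4096),
                simulator="./sim-outorder"):
    """Return one sim-outorder argv per bimodal table size."""
    commands = []
    for size in sizes:
        commands.append([
            simulator,
            "-bpred", "bimod",          # select the bimodal predictor
            "-bpred:bimod", str(size),  # number of entries in its table
            program,
        ])
    return commands


if __name__ == "__main__":
    for argv in bimod_sweep("tests-alpha/bin/test-math"):
        print(" ".join(argv))
```

Each printed line can be run as is, with 2> redirected to a separate file per size so that the bpred_bimod.misses values can be compared afterwards.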

   


 



 

 

 

 

Petros Panayi, © 2017