- Cache Competition (Basic)
- DM Cache Arithmetic (Basic)
- 3-Way Set Associative Cache Arithmetic (Intermediate)
50.002 Computation Structures
Information Systems Technology and Design
Singapore University of Technology and Design
Cache Issues
Each topic’s questions are grouped into three categories: basic, intermediate, and challenging. You are recommended to do all basic problem set before advancing further.
Cache Competition (Basic)
The following questions ask you to evaluate alternative cache designs using patterns of memory references taken from running programs. Each of the caches under consideration has a total capacity of 8 (4-byte) words, with one word stored in each cache line. The cache designs under consideration are:
- DM: a direct-mapped cache.
- S2: a 2-way set-associative cache with a least-recently-used replacement policy. Note that since we have a total capacity of 8 words, it means we have 4 sets for S2.
- FA: a fully-associative cache with a least-recently-used replacement policy.
The questions below present a sequence of addresses for memory reads. You should assume the sequences repeat from the start whenever you see ”…”
Keep in mind that byte addressing is used; addresses of consecutive words in memory differ by 4.
Each question asks which cache(s) give the best hit rate for the sequence. Answer by considering the steady-state hit rate, i.e., the percentage of memory references that hit in the cache after the sequence has been repeated many times.
- Which cache(s) have the best hit rate for the sequence 0, 16, 4, 36, …? The “…” means that we repeat the sequences: 0, 16, 4, 36, 0, 16, 4, 36, … and so on. This applies to all other questions below.
Show Answer
DM: locations 4 and 36 collide, so each iteration has 2 hits, 2 misses.
S2: 100% hit rate. 0 and 16 map to the same set, as do 4 and 36, but since the cache is 2-way associative they don’t collide.
FA: 100% hit rate. The cache is only half filled by this loop. - Which cache(s) have the best hit rate for the sequence 0, 4, 8, 12, 16, 20, 24, 28, 32, …?
Show Answer
DM: locations 0 and 32 collide, so each iteration has 7 hits, 2 misses.
S2: locations 0, 16 and 32 all map to the same set. The LRU replacement strategy replaces 0 when accessing 32, 16 when accesing 0, 32 when accessing 16, etc., so each iteration has 6 hits, 3 misses.
FA: has 0% hit rate in the steady state since the LRU replacement strategy throws out each location just before it’s accessed by the loop - Which cache(s) have the best hit rate for the sequence 0, 4, 8, 12, 16, 20, 24, 28, 32, 28, 24, 20, 16, 12, 8, 4, …?
Show Answer
**All** caches perform the same. Locations 0 and 32 trade places in the caches, so each iteration has 14 hits and 2 misses.
- Which cache(s) have the best hit rate for the sequence 0, 4, 8, 12, 32, 36, 40, 44, 16, …?
Show Answer
DM: 32 collides with 0, 36 with 4, 40 with 8, 44 with 12, so each iteration has only 1 hit and 8 misses.
S2: locations 0, 16 and 32 trade places in the cache, so each iteration has 6 hits and 3 misses.
FA: 0 hits since LRU throws out each location just before it’s accessed by the loop.
-
Assume that a cache access takes 1 cycle and a memory access takes 4 cycles. If a memory access is initiated only after the cache has missed, what is the maximum miss rate we can tolerate before use of the cache actually slows down accesses?
Show AnswerIf accesses always go to Memory, it takes 4 cycles per access. With the cache, the average number of cycles per access is 1 + (miss rate) * 4. Hence if the miss rate is larger than 75% then the average number of cycles per access is more than 4. Our maximum miss rate is therefore 75%.
DM Cache Arithmetic (Basic)
The diagram below illustrates a blocked, direct-mapped cache for a computer that uses 32-bit data words and 32-bit byte addresses. Assume that memory location 0x0012347B
was present in the cache.
- Using the row and column labels from the figure, in what cache location could we find the data from that memory location? You may assume that Row
15
means index0xF
, Row14
means index0xE
, and so on.Show AnswerExpanding the last 8-bit of the memory location we have
0111 1011
. There are two bits for block indexing and four bits for row indexing. Since last two bits are disregarded as indicted in the image, the block index is `10` and the row index is `0111`. This corresponds to `DATA 10` column and `Row 7` row. - Can data from locations
0x0012347B
and0x00123470
be present in the cache at the same time?Show AnswerExpanding the last 8-bit of the memory location we have
0111 1011
and0111 0000
respectively. So yes, they can be both present in the cache at the same time.0x0012347B
stays at `DATA 10` column and0x00123470
stays at `DATA 00` column, and both resides in `Row 7`.
3-Way Set Associative Cache Arithmetic (Intermediate)
Take a look at the following 3-way set associative cache:
The shaded region indicates that the tag-data cells are filled with some data. For the LRU bit, the higher value means that the item is more recently used. Assume that word addressing is used unless stated otherwise, and each instruction is 32-bit long. The label A, B, C, and 0 to 7 drawn in the picture is for you to identify which corresponding set and DM cache a particular data resides.
According to the diagram, we have 8 sets, indexed from 0 to 7, and three DM caches (A to C) in total.
For each of the question below, assume higher bits of the address is zero and therefore isn’t written.
- The computer now requests for
Mem[0x1304]
, and results in cache miss. It accesses the memory and write it onto the cache above. Where canMem[0x1304]
reside at?Show AnswerIt can reside in either `A4`, `B4`, or `C4`.
- The computer now requests for
Mem[0x1305]
, and results in another cache miss. It accesses the memory and write it onto the cache above. Where canMem[0x1305]
reside at?Show AnswerIt can reside in `B5` only.
- The computer now requests for
Mem[0xB317]
, and results in yet another cache miss. It accesses the memory and write it onto the cache above. Where canMem[0xB317]
reside at?Show AnswerIt replaces `A7` and resides at `A7`.
- How many bits of data can be contained in this 3-way set associative cache in total?
Show Answer
768. This is obtained from 24 (cache lines) times 32 bits of data each.
- What is the value of
t
(i.e: the number of bits stored in the ‘Tag’ column) if this 3- way set associative cache is used with a Beta CPU?Show Answer27, since Beta uses byte addressing.
Another engineer would like to expand this 3-way set associative cache such that its block size is two.
- What is the value of
k
now?Show AnswerStill 3, doesn't change since the number of sets remain the same.
- What is the value of
t
now? i.e: How many bits are stored in the ‘Tag’ column of the cache?Show Answer28, since now we need 1 bit to index the block, but `k` remains at 3 bits.
- How many bits of data can be stored in this 3-way set associative cache in total if it is used with a Beta CPU but we don’t ignore the last two bits as usual? (see the diagram above, we don’t offset the last two bits of the address JUST FOR THIS QUESTION, for thought experiment sake).
Show Answer
It is natural to think that the capacity is doubled, to be 1536 bits. However, the Beta CPU uses **byte** addressing, so only 25% of the cache is useful. That leaves us with 384 bits.