Cache Notes
- Cache
- Cache is a small amount of fast memory combined with a large
amount (the non-cache) of slower memory.
- Memory access is generally much slower than the CPU, so a cache is a
scheme to help speed up the effective memory access time.
- Locality principle - the observation that the memory references
made in any short time interval tend to use only a small fraction
of the total memory.
- This forms the basis of caching systems.
- If a given memory reference is to address A, it is likely
that the next memory reference will be in the general vicinity of A.
- The general idea of the locality principle is that when a
word is referenced, it is brought from the large slow memory
into the cache, so that the next time it is used, it can be accessed quickly.
- If a word is read or written k times in a short interval, the computer
will need 1 reference to slow memory and k-1 references to fast memory (cache).
- The first reference to the word is made to main (slow) memory because
at this point the word is not in the cache.
- After the first reference to the word, the word is placed in the cache
- A reference to a word first checks the cache for the word.
If the word is not in the cache, then main memory is checked.
- The larger k is, the better the overall performance because all
of the references to the word (except the first one) are satisfied out of the cache
- h, the hit ratio, is the fraction of all references that can be
satisfied from the cache.
- h = (k-1)/k
- For example, if a word is read/written 10 times, main memory
is accessed once, and the cache is accessed 9 times (the word is placed
in the cache after the first reference, which misses).
- The hit ratio here is 9/10 or 90%. If k is 1000,
h = 999/1000 = 99.9%.
- The miss ratio is 1 - h.
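The hit/miss-ratio arithmetic above can be sketched in a few lines of Python (the k values are the examples from these notes; the function names are mine):

```python
# Hit ratio for a word referenced k times in a short interval:
# the first reference misses (goes to main memory), the other k-1 hit.
def hit_ratio(k):
    return (k - 1) / k

def miss_ratio(k):
    return 1 - hit_ratio(k)

for k in (10, 1000):
    print(f"k={k}: hit ratio={hit_ratio(k):.3f}, miss ratio={miss_ratio(k):.3f}")
```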
- mean access time = c + (1-h)m
where c is the cache access time, and m is the main memory access time
- As h approaches 1, most accesses are satisfied in the cache, and the
mean access time approaches c, the cache access time.
- As h approaches 0, the mean access time approaches c + m, first
a time c to check the cache (unsuccessfully) and then a time m to
do the memory reference.
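A minimal sketch of the mean-access-time formula; the 1 ns / 100 ns timings are made-up illustrative values, not from the notes:

```python
def mean_access_time(c, m, h):
    """c = cache access time, m = main memory access time, h = hit ratio."""
    return c + (1 - h) * m

# Hypothetical timings: c = 1 ns cache, m = 100 ns main memory.
print(mean_access_time(1, 100, 0.0))    # h -> 0: approaches c + m
print(mean_access_time(1, 100, 0.999))  # h -> 1: approaches c
```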
- Associative cache
- In associative cache:
Number of blocks that are needed = size of memory/block size
- In associative cache, each slot contains:
slot size (in bits) = 1 bit (valid/invalid) +
# bits needed for the block number + # bits needed to store the actual data
- The #bits needed to store the actual data is computed:
(block size in bytes) * 8 bits/byte
- Associative cache - consists of a number of slots or lines,
each containing: valid bit, block number, value
- When Valid = 1, the slot is in use in the cache
- In Fig 28, there is a cache with 1024 slots.
- In associative cache, the order of the entries is random. When
the computer is reset, all the Valid bits are set to 0, no cache entries
are valid.
- Fig 28 - the memory is 2^24 bytes divided into 2^22 4 byte blocks
- The cache needs 22 bits to store a block number, since there are 2^22 blocks.
- The block size is 4 bytes per block, which is 32 bits per block.
- The cache needs 1 bit (valid/invalid) + 22 bits (for the block number) +
32 bits (for the block value) = 55 bits per line X 1024 lines
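The Fig 28 sizing can be checked with a short Python sketch (the variable names are mine; the numbers are the ones above):

```python
# Associative-cache sizing for the Fig 28 example:
# 2^24-byte memory, 4-byte blocks, 1024 slots.
mem_bytes   = 2**24
block_bytes = 4
num_slots   = 1024

num_blocks     = mem_bytes // block_bytes        # 2^22 blocks
block_num_bits = (num_blocks - 1).bit_length()   # 22 bits to name a block
data_bits      = block_bytes * 8                 # 32 bits of data per slot
slot_bits      = 1 + block_num_bits + data_bits  # valid + block number + data

print(slot_bits)              # bits per slot
print(slot_bits * num_slots)  # total cache storage in bits
```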
- Direct-mapped cache
- In direct-mapped cache:
Each slot contains:
1 bit (valid/invalid) + # bits for tag + #bits for actual data
- To compute the number of bits for the tag:
#bits for address - log base 2 of the block size - log base 2 of number of slots
- Subtracting log base 2 of the block size is like dividing by
the block size.
- Subtracting log base 2 of the number of slots is like performing
the operation: block number mod the number of slots
- Fig 29: 2^24 addresses, 4-byte (2^2) blocks, 1024 (2^10) slots in the cache
- Slot number = (address / 2^2) mod 2^10, i.e., the block number mod
the number of slots
- This is the same as using the lower 10 bits of the block number.
- The upper 12 bits are the tag
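The Fig 29 address split can be sketched as bit operations (the function and constant names are mine):

```python
# Direct-mapped lookup for the Fig 29 cache: 24-bit byte addresses,
# 4-byte blocks (2 offset bits), 1024 slots (10 slot bits), 12 tag bits.
OFFSET_BITS = 2
SLOT_BITS   = 10

def split_address(addr):
    block_number = addr >> OFFSET_BITS            # divide by the block size
    slot = block_number & ((1 << SLOT_BITS) - 1)  # lower 10 bits of block number
    tag  = block_number >> SLOT_BITS              # upper 12 bits
    return tag, slot

# Addresses 0x000000 and 0x001000 are 1024 blocks apart, so they map
# to the same slot with different tags (a conflict in this cache).
print(split_address(0x000000))  # -> (0, 0)
print(split_address(0x001000))  # -> (1, 0)
```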