- Introduced to bridge the speed gap between the CPU clock and the much slower DRAM (dynamic RAM) used for main memory
- Based on the locality principle
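- It states that, because programs have a cyclic structure (loops) and related data is packed into linear arrays, addresses close to the ones most recently used have a high probability of being used in the near future
- To exploit this, a smaller and faster memory built from static RAM (SRAM) is placed on chip to hold the most recently used code and data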
- The cache is subdivided into subsets of lines. There are several ways of mapping a line of main memory to a line of the cache:
- Direct mapped: a line in main memory is always stored at the exact same location in the cache
- Fully associative: a line in main memory can be stored at any location in the cache
- N-way set associative: a line in main memory can be stored in any one of N lines of the cache
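- The CD (cache disable) flag of the cr0 processor register enables or disables the cache circuitry: caching is enabled when CD is 0 and disabled when CD is 1
- The NW (not write-through) flag of cr0 selects the write strategy: write-through or write-back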
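- The Pentium cache lets the operating system give each page frame its own cache management policy
- For this purpose every Page Directory and Page Table entry includes two flags:
- PCD (Page Cache Disable): set to disable caching for the page frame
- PWT (Page Write-Through): set to use write-through instead of write-back for the page frame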
- Linux clears the PCD and PWT flags in all Page Directory and Page Table entries, so caching is enabled for every page frame and write-back is always used
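The flags above can be decoded with a few bit masks. This is a minimal sketch, not kernel code: the bit positions are the ones documented for x86 (CR0.NW is bit 29, CR0.CD is bit 30; PWT is bit 3 and PCD is bit 4 of a paging entry), the function names and the sample entry value are hypothetical, and the PAT mechanism of later processors is ignored.

```python
# Documented x86 bit positions; everything else here is illustrative.
CR0_NW = 1 << 29   # not write-through
CR0_CD = 1 << 30   # cache disable
PTE_PWT = 1 << 3   # page write-through
PTE_PCD = 1 << 4   # page cache disable


def cr0_cache_enabled(cr0: int) -> bool:
    """Caching is enabled only while the CD flag is clear."""
    return (cr0 & CR0_CD) == 0


def page_cache_policy(entry: int) -> dict:
    """Decode the per-page cache policy from a paging entry (PAT ignored)."""
    return {
        "cacheable": (entry & PTE_PCD) == 0,
        "write_policy": "write-through" if entry & PTE_PWT else "write-back",
    }


# Hypothetical entry with PCD = PWT = 0, which is how Linux sets them up.
linux_entry = 0x00001067
policy = page_cache_policy(linux_entry)
# policy == {"cacheable": True, "write_policy": "write-back"}
```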
Cache Unit: Two Parts
- Cache Controller: Essentially the Page Directory of the cache. It stores one entry per cache line; each entry contains a few status flags and a tag, which is a group of bits that identifies the memory location currently mapped by the line.
- A physical address is split into 3 parts, from most to least significant bits: tag, cache controller subset index, offset within the line
- SRAM (hardware cache memory): Stores the actual lines of data
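The three-part split of a physical address can be sketched as follows. The function name `split_address` and the cache geometry in the example (32-byte lines, 128 lines) are assumptions chosen for illustration. The number of ways decides how many subsets exist: one way gives a direct-mapped cache, and as many ways as lines gives a fully associative cache with a single subset.

```python
def split_address(addr: int, line_size: int, num_lines: int, ways: int):
    """Split a physical address into (tag, subset index, offset in line)."""
    num_sets = num_lines // ways        # subsets of `ways` lines each
    offset = addr % line_size           # least significant bits
    line_number = addr // line_size     # which memory line the address is in
    index = line_number % num_sets      # middle bits select the subset
    tag = line_number // num_sets       # most significant bits
    return tag, index, offset


# Same address, three mapping schemes (illustrative geometry).
split_address(0x12345, 32, 128, 2)    # 2-way:            (36, 26, 5)
split_address(0x12345, 32, 128, 1)    # direct mapped:    (18, 26, 5)
split_address(0x12345, 32, 128, 128)  # fully associative: (2330, 0, 5)
```

With fewer subsets, more of the address ends up in the tag, which is why a fully associative cache has to compare the tag against every line.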
Cache Process
- When the CPU accesses a RAM cell, it extracts the subset index from the physical address
- It then compares the tags of all lines in that subset with the high-order bits of the physical address
- If a line with the same tag is found, the CPU has a cache hit; otherwise it has a cache miss
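The lookup steps above can be simulated with a toy set-associative cache that only tracks tags. The class name, the default geometry, and the FIFO replacement rule are assumptions made for this sketch; the notes do not say which line gets replaced on a miss.

```python
class SetAssociativeCache:
    """Toy cache that tracks tags only (no data), to show hit/miss logic."""

    def __init__(self, line_size=32, num_sets=4, ways=2):
        self.line_size = line_size
        self.num_sets = num_sets
        self.ways = ways
        # One list of tags per subset, oldest entry first.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, addr: int) -> str:
        line_number = addr // self.line_size
        index = line_number % self.num_sets   # subset index
        tag = line_number // self.num_sets    # high-order bits
        tags = self.sets[index]
        if tag in tags:                       # compare against every line in the subset
            return "hit"
        if len(tags) == self.ways:
            tags.pop(0)                       # assumed FIFO replacement
        tags.append(tag)
        return "miss"


cache = SetAssociativeCache()
first = cache.access(0)    # "miss": the cache starts empty
second = cache.access(4)   # "hit": same 32-byte line as address 0
```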
Cache Hits and Misses
- Cache Hit: When searching for a portion of RAM the CPU finds it in the cache.
- For reads, the data is taken from the SRAM line and put in a CPU register; DRAM is not accessed
- For writes, the controller uses one of two strategies: write-through or write-back
- Write-through: The controller always writes into both RAM and the cache line, effectively switching off the cache for write operations
- Write-back: More immediately efficient, since only the cache line is updated and the contents of RAM are left unchanged. The line is written back to RAM only when the CPU executes an instruction that requires a flush of cache entries or when a FLUSH hardware signal occurs (usually after a cache miss)
- Cache Miss: When the CPU does not find the requested portion of RAM in the cache
- The cache line being replaced is written to RAM if necessary, and the correct line is fetched from main memory into the cache entry
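The difference between the two write strategies shows up when counting RAM writes. The sketch below is a toy model under stated assumptions: a single cache line, one value per address, and a Python dict standing in for RAM. The class and attribute names are hypothetical.

```python
class OneLineCache:
    """Single cache line in front of a dict that plays the role of RAM."""

    def __init__(self, ram: dict, policy: str):
        assert policy in ("write-through", "write-back")
        self.ram = ram
        self.policy = policy
        self.addr = None       # address currently held by the line
        self.value = None
        self.dirty = False     # line differs from RAM (write-back only)
        self.ram_writes = 0

    def _write_ram(self, addr, value):
        self.ram[addr] = value
        self.ram_writes += 1

    def flush(self):
        """Write the line back to RAM if it was modified."""
        if self.dirty:
            self._write_ram(self.addr, self.value)
            self.dirty = False

    def _load(self, addr):
        if self.addr != addr:              # cache miss
            self.flush()                   # write old line back if necessary
            self.addr = addr
            self.value = self.ram.get(addr, 0)

    def read(self, addr):
        self._load(addr)
        return self.value

    def write(self, addr, value):
        self._load(addr)
        self.value = value
        if self.policy == "write-through":
            self._write_ram(addr, value)   # RAM updated on every write
        else:
            self.dirty = True              # RAM updated later, on flush


wb = OneLineCache({}, "write-back")
for v in range(3):
    wb.write(0x10, v)      # three writes, no RAM traffic yet
wb.read(0x20)              # miss on another address forces the write-back
# wb.ram_writes == 1, whereas write-through would have written RAM 3 times
```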
Multiprocessor System Cache
- Each CPU has its own cache
- If two or more CPUs hold the same memory in their caches, a CPU that modifies its copy must notify every other CPU holding that memory so their copies get the new value
- This is done in hardware using cache snooping
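Cache snooping can be pictured as every cache watching writes made by the others. The sketch below is a toy write-update model under several assumptions: a shared `Bus` object, one value per address, and write-through to RAM to keep it short. All class and method names are hypothetical, and real hardware often invalidates stale copies rather than updating them.

```python
class Bus:
    """Shared bus: holds RAM and lets caches observe each other's writes."""

    def __init__(self):
        self.ram = {}
        self.caches = []

    def broadcast_write(self, source, addr, value):
        for cache in self.caches:
            if cache is not source:
                cache.snoop(addr, value)


class SnoopingCache:
    def __init__(self, bus: Bus):
        self.bus = bus
        self.lines = {}            # addr -> cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:                     # miss: fetch from RAM
            self.lines[addr] = self.bus.ram.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.ram[addr] = value                     # write-through for brevity
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value):
        """Update our copy only if we already hold this address."""
        if addr in self.lines:
            self.lines[addr] = value


bus = Bus()
cpu0, cpu1 = SnoopingCache(bus), SnoopingCache(bus)
bus.ram[0x40] = 1
cpu0.read(0x40)
cpu1.read(0x40)        # both caches now hold address 0x40
cpu0.write(0x40, 7)    # cpu1's copy is updated through snooping
```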
Levels of Cache
- L1 is the fastest cache and usually the smallest; L2, L3 and onwards are progressively slower and usually larger than L1
- Linux ignores these hardware details and assumes there is a single cache