Hardware Cache (sRAM)

  • Introduced due to the wide speed gap between DRAM (dynamic RAM, the main memory) and the CPU clock
  • Based on the locality principle
    • It states that, due to the cyclic structure of programs and the packing of data into linear arrays, addresses close to recently used ones have a high probability of being used in the near future
  • Thus the concept of static RAM (sRAM) implemented as on-chip memory
  • The cache is subdivided into subsets of lines. There are several ways of mapping:
    • Direct mapped. Each line in main memory can be stored at exactly one location in the cache, determined by its address
    • Fully associative. A line in main memory can be stored in any location in the cache
    • N-way set associative. A line in main memory can be stored in any of N places in the cache
  • The CD (Cache Disable) flag of the cr0 processor register determines whether caching is disabled (1) or enabled (0)
  • The NW (Not Write-through) flag of cr0 selects between write-through and write-back
  • Pentium cache allows each page frame to have its own cache management policy
    • Therefore each page directory and page table entry has two extra flags
      • PCD: Page cache disable
      • PWT: Page write-through
    • Linux always enables page caching and always uses write-back

 

Cache Unit: Two Parts

  1. Cache Controller: essentially the page directory of the cache. It contains an array of entries, one per cache line; each entry holds a few flags and a tag, which identifies the memory location currently cached in that line.
    • A physical address is split into three fields, from high-order to low-order bits: the tag, the cache controller subset index, and the offset within the line
  2. sRAM (hardware cache): The actual data stored

Cache Process

  1. The CPU attempts to access RAM and extracts the subset index from the physical address
  2. The CPU compares the tags of all lines in that subset with the high-order bits of the physical address
  3. If a line with the same tag is found, the CPU has a cache hit (otherwise a cache miss)

Cache Hits and Misses

  • Cache Hit: When searching for a portion of RAM the CPU finds it in the cache.
    • For reads, the data is extracted from the sRAM and placed in a register; DRAM is not accessed
    • For writes, the CPU uses one of two strategies: write-through or write-back
      • Write-through: the controller always writes into both RAM and the cache line, effectively switching off the cache for write operations
      • Write-back: more immediately efficient; only the cache line is updated and the contents of RAM are left unchanged. RAM is written only when the CPU executes a FLUSH, usually after a cache miss
  • Cache Miss: when the CPU fails to find the data in the cache
    • The cache line is written back into RAM if necessary, and the correct line is fetched from main memory

Multiprocessor System Cache

  • Each CPU has its own cache
  • If two or more CPUs cache the same memory, an update to that shared memory must be propagated to every CPU caching it
    • This is done using cache snooping

Levels of Cache

  • L1 is the fastest and often the smallest cache; L2, L3, and onwards are slower and often larger than L1
  • Linux assumes only one cache
