Chapter 2 Memory Hierarchy Design
Memory Technology and Optimizations
Performance metrics
- Latency
- Access time
- Time between read request and when desired word arrives
- Cycle time
- Minimum time between unrelated requests to memory
- Bandwidth
- Number of bytes transferred per unit time
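A small worked example (with hypothetical interface numbers, not from any particular part) may help separate the two metrics: latency is a property of a single access, while peak bandwidth is a property of the interface.

```python
# Illustrative numbers only: latency vs. bandwidth for a memory interface.

access_time_ns = 50       # time from read request until the word arrives
cycle_time_ns = 60        # minimum spacing between unrelated requests
bus_width_bytes = 8       # bytes transferred per beat
transfer_rate_mbeats = 1600  # beats per second, in millions

# Peak bandwidth depends on the interface width and rate, not on access latency.
peak_bw_gbs = bus_width_bytes * transfer_rate_mbeats * 1e6 / 1e9
print(f"Peak bandwidth: {peak_bw_gbs:.1f} GB/s")  # 12.8 GB/s

# Cycle time, not access time, bounds how often one bank can accept requests.
max_requests_per_us = 1000 / cycle_time_ns
```

Note that cycle time can exceed access time, which is why the two are listed separately above.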
High Bandwidth Memory (HBM)
- A packaging innovation rather than a circuit innovation
- Provides much higher bandwidth through a wide interface, and reduces access latency by shortening the interconnect between the DRAM and the processor
- Interposer stacking (2.5D) is available; vertical stacking (3D) is still under development due to heat constraints
Six Basic Cache Optimizations
Average memory access time:
- Average memory access time = Hit time + Miss rate × Miss penalty
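The formula above can be checked with a quick worked example (the cycle counts below are hypothetical, chosen only for illustration):

```python
# Worked example of average memory access time (AMAT); numbers are illustrative.

hit_time = 1.0        # cycles to hit in L1
miss_rate = 0.05      # fraction of L1 accesses that miss
miss_penalty = 100.0  # cycles to fetch from memory on an L1 miss

amat = hit_time + miss_rate * miss_penalty
print(amat)  # 6.0 cycles

# Adding an L2 cache replaces the flat miss penalty with L2's own AMAT:
l2_hit_time = 10.0
l2_local_miss_rate = 0.4
l2_miss_penalty = 100.0

amat_two_level = hit_time + miss_rate * (l2_hit_time + l2_local_miss_rate * l2_miss_penalty)
print(amat_two_level)  # 3.5 cycles
```

The two-level variant shows why an L2 cache reduces the effective L1 miss penalty, as discussed under the miss-penalty optimizations below.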
Six basic cache optimizations:
- Reduce the miss rate
- Larger block size (reduces compulsory misses)
- Increases conflict misses and increases miss penalty
- Larger total cache capacity (reduces capacity misses)
- Increases hit time, cost, and power
- Higher associativity (reduces conflict misses)
- Increases hit time
- More tag comparisons
- Increases power
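The conflict-miss effect of associativity can be seen in a toy cache model (illustrative only, not modeling any real machine): two blocks that collide in a direct-mapped cache ping-pong each other out, while a 2-way cache of the same capacity holds both.

```python
# Toy set-associative cache with LRU replacement; counts misses for a trace.

def misses(addresses, num_sets, ways):
    """Count misses in a cache of num_sets sets, each holding `ways` blocks (LRU)."""
    sets = [[] for _ in range(num_sets)]
    miss_count = 0
    for addr in addresses:
        s = sets[addr % num_sets]
        if addr in s:
            s.remove(addr)   # hit: move block to most-recently-used position
            s.append(addr)
        else:
            miss_count += 1
            if len(s) == ways:
                s.pop(0)     # evict the least-recently-used block
            s.append(addr)
    return miss_count

# Block addresses 0 and 8 map to the same set in an 8-set direct-mapped cache.
pattern = [0, 8] * 10

dm = misses(pattern, num_sets=8, ways=1)   # 20: every access is a conflict miss
sa = misses(pattern, num_sets=4, ways=2)   # 2: only the compulsory misses remain
print(dm, sa)
```

Both configurations hold eight blocks in total, so the difference is purely conflict misses, not capacity.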
- Reduce the miss penalty
- Higher number of cache levels
- L1 miss penalty can be reduced by introducing an L2 cache
- Balances fast hits (small, fast L1) against few misses (large L2)
- Giving priority to read misses over writes
- "read-after-write" data hazard through memory
- Let the read wait until the write buffer is empty
- Or check the contents of the write buffer on a read miss
- conflict -> wait
- no conflict -> handle the read miss first
- Prioritizing reads reduces unnecessary stalling of read misses
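The write-buffer check above can be sketched in a few lines (the buffer layout and function names here are hypothetical, for illustration only): on a read miss, the buffer is searched for a matching address before going to memory.

```python
# Sketch of checking the write buffer on a read miss instead of draining it.

write_buffer = [(0x100, 42), (0x200, 7)]  # pending (address, value) writes

def fetch_from_memory(addr):
    return 0  # placeholder for the actual memory access

def read_miss(addr, buffer):
    # Search newest-first: if the missed address conflicts with a buffered
    # write, the buffered value is the correct (most recent) one.
    for a, v in reversed(buffer):
        if a == addr:
            return v  # forward buffered data: no need to stall the read
    return fetch_from_memory(addr)  # no conflict: service the read miss first

hit = read_miss(0x200, write_buffer)   # 7: forwarded from the write buffer
mem = read_miss(0x300, write_buffer)   # no conflict: goes to memory
print(hit, mem)
```

This resolves the read-after-write hazard without forcing every read miss to wait for the buffer to drain.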
- Reduce the time to hit in the cache
- Avoiding address translation in cache indexing
- A virtual cache avoids address translation
- virtual address -> cached block
- uses the virtual address in both set indexing & tag comparison
- Obstacles to realizing a virtual cache
- To ensure protection
- copy the protection bits from the TLB into an added cache field on a miss
- To support process switching
- add a PID (process-identifier) tag to the cache
- To handle multiple virtual addresses aliased to one physical address (cache consistency)
- antialiasing: compare the physical addresses of existing blocks with that of the fetched block
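A common compromise that sidesteps these obstacles is the virtually indexed, physically tagged (VIPT) cache: indexing uses only page-offset bits, which need no translation, while tags remain physical. This works only if each cache way fits within a page, which a quick arithmetic check (illustrative) makes concrete:

```python
# Illustrative check for VIPT indexing: the set index + block offset must fit
# inside the page offset, i.e. cache_size / associativity <= page_size.

def index_fits_in_page_offset(cache_bytes, associativity, page_bytes=4096):
    """True if the cache can be indexed with untranslated (page-offset) bits."""
    return cache_bytes // associativity <= page_bytes

ok = index_fits_in_page_offset(32 * 1024, 8)       # True: 4 KiB per way
too_big = index_fits_in_page_offset(32 * 1024, 4)  # False: 8 KiB per way
print(ok, too_big)
```

This is one reason L1 caches are often highly associative: associativity shrinks the per-way size so indexing can start in parallel with the TLB lookup.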
Ten Advanced Cache Optimizations
- Reducing the hit time
- Reducing the miss penalty
- Reducing the miss rate
- Increasing cache bandwidth
- Reducing the miss penalty or miss rate via parallelism