Chapter 2 Memory Hierarchy Design
Memory Technology and Optimizations
Performance metrics
- Latency
  - Access time: time between a read request and when the desired word arrives
  - Cycle time: minimum time between unrelated requests to memory
- Bandwidth
![Pasted image 20241220174926](https://i-blog.csdnimg.cn/direct/3293c45fbdaa4c3da08cec36caa2b7dd.png)
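The two metrics combine when a full cache block is fetched from memory: the first word pays the access time, and the rest of the block streams at the available bandwidth. A minimal C sketch of that accounting, with purely illustrative numbers (none come from the lecture):

```c
#include <stdio.h>

int main(void) {
    /* Illustrative values only, not taken from the slides. */
    double access_time_ns = 50.0; /* latency until the first word arrives */
    double bandwidth_gbps = 16.0; /* sustained transfer rate in GB/s      */
    int block_bytes = 64;         /* cache block size                     */

    /* 1 GB/s moves 1 byte per ns, so bytes / (GB/s) yields ns. */
    double transfer_ns = block_bytes / bandwidth_gbps;
    double block_fetch_ns = access_time_ns + transfer_ns;

    printf("block fetch time: %.1f ns\n", block_fetch_ns); /* 54.0 ns */
    return 0;
}
```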
High Bandwidth Memory (HBM)
- A packaging innovation rather than a circuit innovation
- Reduces access latency by shortening the delay between the DRAM and the processor
- Interposer stacking (2.5D) is available; vertical stacking (3D) is still under development due to heat constraints
 
![Pasted image 20241220175322](https://i-blog.csdnimg.cn/direct/93c12004bb284f72b4dd6f25e6e40e63.png)
Six Basic Cache Optimizations
Average memory access time:
- Average memory access time = Hit time + Miss rate × Miss penalty
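A quick worked instance of the formula, extended recursively to a second cache level (which previews the multilevel-cache optimization below). All parameter values here are hypothetical:

```c
#include <stdio.h>

int main(void) {
    /* Hypothetical cache parameters, for illustration only. */
    double l1_hit = 1.0;   /* L1 hit time, cycles         */
    double l1_mr  = 0.05;  /* L1 miss rate                */
    double l2_hit = 10.0;  /* L2 hit time, cycles         */
    double l2_mr  = 0.20;  /* L2 local miss rate          */
    double mem    = 100.0; /* main-memory penalty, cycles */

    /* AMAT = hit time + miss rate * miss penalty, applied per level:
     * the L1 miss penalty is itself the AMAT of the L2/memory pair. */
    double l1_penalty = l2_hit + l2_mr * mem;
    double amat = l1_hit + l1_mr * l1_penalty;

    printf("L1 miss penalty = %.1f cycles\n", l1_penalty); /* 30.0 */
    printf("AMAT            = %.1f cycles\n", amat);       /* 2.5  */
    return 0;
}
```

With these numbers, adding the L2 cuts the effective L1 miss penalty from 100 cycles (straight to memory) to 30.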
 
Six basic cache optimizations:
- Reduce the miss rate
  - Larger block size (reduces compulsory misses)
    - Increases conflict misses and increases the miss penalty
  - Larger total cache capacity (reduces capacity misses)
    - Increases hit time, cost, and power
  - Higher associativity (reduces conflict misses)
    - Increases hit time (more tag comparisons) and increases power
- Reduce the miss penalty
  - Higher number of cache levels
    - The L1 miss penalty can be reduced by introducing an L2 cache
    - Balances fast hits with few misses
  - Giving priority to read misses over writes
    - Addresses the "read-after-write" data hazard through the write buffer
    - Either let the read wait until the write buffer is empty, or check the contents of the write buffer on a read miss: on a conflict, wait; otherwise, handle the read miss first (see the sketch after this list)
    - Prioritizing reads reduces unnecessary stalling of read misses
- Reduce the time to hit in the cache
  - Avoiding address translation in cache indexing
    - A virtual cache avoids address translation: a virtual address maps to a cached block, and the virtual address is used in both set indexing and tag comparison (a small indexing sketch follows the figures below)
    - Obstacles to realizing a virtual cache:
      - To ensure protection: copy the protection bits of the TLB into an added cache field on a cache miss
      - To support process switching: adopt a PID marker in the cache
      - To ensure consistency when multiple virtual (logical) addresses alias one physical address: antialiasing compares the physical addresses of existing blocks with that of the fetched block
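Here is the sketch referenced in the read-priority item above: checking the write buffer on a read miss instead of always draining it first. The entry layout and names are my own, not from the lecture:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical write-buffer entry: a pending store to memory. */
struct wb_entry {
    uint64_t addr;  /* block-aligned address of the pending write */
    bool     valid;
};

#define WB_SLOTS 8
static struct wb_entry write_buffer[WB_SLOTS];

/* On a read miss, scan the write buffer rather than waiting for it
 * to empty. If no pending write targets the missing block, the read
 * miss is serviced first; on a conflict, the read must wait. */
static bool read_miss_may_bypass(uint64_t miss_addr) {
    for (size_t i = 0; i < WB_SLOTS; i++) {
        if (write_buffer[i].valid && write_buffer[i].addr == miss_addr) {
            return false; /* conflict: wait for the write to drain */
        }
    }
    return true;          /* no conflict: handle the read miss first */
}

int main(void) {
    write_buffer[0] = (struct wb_entry){ .addr = 0x1000, .valid = true };
    /* 0x2000 does not conflict with the pending write at 0x1000,
     * so the read miss may be handled first. */
    return read_miss_may_bypass(0x2000) ? 0 : 1;
}
```

The check is a single associative search over a handful of entries, which is far cheaper than stalling every read miss until the buffer empties.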
 
![Pasted image 20241220180510](https://i-blog.csdnimg.cn/direct/1f000bec3b39461c84bef42a00b9483c.png)
![Pasted image 20241220175802](https://i-blog.csdnimg.cn/direct/2365306cd1904e48a721c72aeae40abf.png)
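And the indexing sketch referenced under virtual caches: it contrasts set selection in a physically indexed cache (translation on the critical path) with a virtually indexed one. The cache geometry and the translate() stand-in are assumptions for illustration only:

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical geometry: 64-byte blocks and 256 sets.
 * Both values are illustrative, not from the slides. */
#define BLOCK_BITS 6
#define SET_BITS   8

/* Stand-in for the TLB lookup, so the sketch is self-contained:
 * a fake fixed mapping, not a real page-table walk. */
static uint64_t translate(uint64_t vaddr) {
    return vaddr + 0x100000;
}

/* Physically indexed: translation sits on the critical path,
 * because the set cannot be chosen until the TLB answers. */
static uint64_t phys_index(uint64_t vaddr) {
    uint64_t paddr = translate(vaddr);
    return (paddr >> BLOCK_BITS) & ((1u << SET_BITS) - 1);
}

/* Virtually indexed: the set is chosen straight from the virtual
 * address, so indexing proceeds before (or without) translation. */
static uint64_t virt_index(uint64_t vaddr) {
    return (vaddr >> BLOCK_BITS) & ((1u << SET_BITS) - 1);
}

int main(void) {
    uint64_t va = 0x7f001234;
    printf("virtual index:  %llu\n", (unsigned long long)virt_index(va));
    printf("physical index: %llu\n", (unsigned long long)phys_index(va));
    return 0;
}
```

Assuming 4 KiB pages, the page offset is 12 bits while BLOCK_BITS + SET_BITS is 14 here, so the index uses bits that translation can change; this is exactly the situation in which the aliasing problem above appears.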
Ten Advanced Cache Optimizations
Categories of advanced cache optimizations:
- Reducing the hit time
- Reducing the miss penalty
- Reducing the miss rate
- Increasing cache bandwidth
- Reducing the miss penalty or miss rate via parallelism
