Limitations of Memory System Performance
- Latency: The time it takes for a block of data to be retrieved from memory after a request has been made.
- Bandwidth: The rate at which data can be transferred from memory to the processor.
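The two quantities combine in a simple first-order model, sketched here under stated assumptions: the time to fetch a block is roughly the latency plus the block size divided by the bandwidth. The function name and all numbers below are illustrative and do not come from these notes.

```python
def fetch_time_ns(block_bytes, latency_ns, bandwidth_bytes_per_ns):
    """First-order model: wait out the latency, then stream the block."""
    return latency_ns + block_bytes / bandwidth_bytes_per_ns

# Hypothetical memory system: 100 ns latency, 6.4 bytes/ns (6.4 GB/s) bandwidth.
one_word = fetch_time_ns(4, latency_ns=100, bandwidth_bytes_per_ns=6.4)
one_line = fetch_time_ns(64, latency_ns=100, bandwidth_bytes_per_ns=6.4)
print(one_word, one_line)
```

In this model a 64-byte block costs under 10% more time than a single 4-byte word, because latency dominates small transfers.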
1: Improving Effective Memory Latency Using Caches
- Cache: A small, low-latency, high-bandwidth store that sits between the processor and DRAM.
- Once data has been brought into the cache, subsequent references to it are served from the cache instead of DRAM.
- Hit ratio: The fraction of data references satisfied by the cache.
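- Memory bound: A computation is memory bound when its effective computation rate is limited by the rate at which memory can supply data, not by the speed of the processor.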
- Temporal Locality of Reference: Repeated reference to a data item in a small time window.
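The link between temporal locality, hit ratio, and effective latency can be sketched with a toy model. Everything here is an assumption made for illustration: a fully associative cache with LRU replacement, a 1 ns cache, a 100 ns DRAM, and the simplified formula effective latency = h * cache latency + (1 - h) * DRAM latency.

```python
from collections import OrderedDict

def hit_ratio(addresses, capacity):
    """Fraction of references served by a fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(addresses)

def effective_latency_ns(h, cache_ns, dram_ns):
    """Simplified model: hits cost cache_ns, misses cost dram_ns."""
    return h * cache_ns + (1 - h) * dram_ns

# Temporal locality: 8 distinct words, each referenced 10 times in a short window.
reused = list(range(8)) * 10
# No reuse: 80 distinct words, each referenced once.
streamed = list(range(80))

h_reused = hit_ratio(reused, capacity=8)
h_streamed = hit_ratio(streamed, capacity=8)
print(h_reused, effective_latency_ns(h_reused, cache_ns=1, dram_ns=100))
print(h_streamed, effective_latency_ns(h_streamed, cache_ns=1, dram_ns=100))
```

With reuse, only the first 8 of 80 references miss, so the hit ratio is 0.9 and the effective latency drops to roughly 10.9 ns. Without reuse every reference misses and the cache provides no benefit.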
2: Impact of Memory Bandwidth
- Memory bandwidth is determined by the bandwidth of the memory bus as well as the bandwidth of the memory units.
- One common technique to increase memory bandwidth is to increase the size of the memory blocks.
- Cache line: A memory block made up of n words.
- Wide buses can transfer a multi-word memory block in a single step, but they are typically more expensive to construct. Alternatively, the words of a block can be transferred one after another over a narrower bus.
- Spatial locality: Consecutive data words in memory are used by successive instructions.
- Tiling: Breaking the iteration space into blocks small enough for their data to fit in the cache, and computing the result one block at a time.
- Important concepts to improve the memory bound on performance:
  - Exploit temporal and spatial locality.
  - The ratio of the number of operations to the number of memory accesses is a good indicator of how well a computation tolerates limited memory bandwidth.
  - Memory layout and code organization affect spatial and temporal locality.
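Tiling can be sketched with matrix multiplication. This is a minimal illustration, not a tuned kernel: the function names, the tile size, and the test matrices are all assumptions, and in pure Python the cache benefit is not observable, so the code only shows how the iteration space is restructured.

```python
def matmul_naive(a, b, n):
    """Reference triple loop over the full iteration space."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c

def matmul_tiled(a, b, n, tile):
    """Same computation, visiting the iteration space one tile at a time."""
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Work on one tile; min() handles n not divisible by tile.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c

n = 6
a = [[i + j for j in range(n)] for i in range(n)]
b = [[i * j + 1 for j in range(n)] for i in range(n)]
same = matmul_tiled(a, b, n, tile=4) == matmul_naive(a, b, n)
print(same)
```

Each tile step performs about 2 * tile^3 operations on about 3 * tile^2 words, so a larger tile that still fits in the cache raises the ratio of operations to memory accesses.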
3: Alternative Approaches for Hiding Memory Latency
- Prefetching: Anticipating memory accesses and issuing the requests ahead of the computation that needs the data, so that the latency overlaps with other work.
- Multithreading: Running multiple threads of execution so that, while one thread waits on a memory access, another thread can use the processor, hiding the latency.
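How overlapping requests hides latency can be sketched with a small timeline model of prefetching. The model is an assumption made for illustration: one load can be issued per cycle, every load takes a fixed latency, memory bandwidth is unlimited, and each data item needs a fixed number of compute cycles.

```python
def cycles_without_prefetch(n, latency, compute):
    """Each load is issued only when its data is needed, so every item
    pays the full latency before its computation can start."""
    return n * (latency + compute)

def cycles_with_prefetch(n, latency, compute, issue_interval=1):
    """Loads are issued back to back ahead of the computation."""
    finish = 0
    for i in range(n):
        arrival = i * issue_interval + latency  # when item i reaches the processor
        start = max(arrival, finish)            # wait for data and for the previous item
        finish = start + compute
    return finish

# Hypothetical numbers: 100 items, 100-cycle latency, 1 compute cycle per item.
serial = cycles_without_prefetch(100, latency=100, compute=1)
overlapped = cycles_with_prefetch(100, latency=100, compute=1)
print(serial, overlapped)
```

In this model only the first load exposes its latency; the rest arrive while earlier items are being processed. Multithreading achieves a similar overlap by having many threads each keep a request outstanding.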
4: Tradeoffs of Multithreading and Prefetching
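- In the example, the multithreaded program is bandwidth bound rather than latency bound: multithreading and prefetching hide latency, but they increase the demand on memory bandwidth.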
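The shift from latency bound to bandwidth bound can be sketched with a back-of-the-envelope calculation. The numbers are hypothetical and are not necessarily those of the example referred to above; the assumption is that threads share one cache, so each thread gets a smaller share and the hit ratio drops.

```python
def dram_bandwidth_bytes_per_s(hit_ratio, accesses_per_s, word_bytes):
    """Traffic that misses the cache and must be served by DRAM."""
    return (1.0 - hit_ratio) * accesses_per_s * word_bytes

# Hypothetical processor: 1e9 word accesses per second, 4-byte words.
single = dram_bandwidth_bytes_per_s(0.90, 1e9, 4)  # one thread using the whole cache
multi = dram_bandwidth_bytes_per_s(0.25, 1e9, 4)   # many threads sharing the cache
print(single / 1e6, "MB/s vs", multi / 1e6, "MB/s")
```

Under these assumptions the DRAM traffic grows from about 400 MB/s to about 3 GB/s, so a memory system that was adequate for one thread can become the bottleneck once latency is hidden.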