Computer Organization and Design: The Hardware and Software Interface
Notes
Preface
Chapter 1: Computer Abstractions and Technology
1.4. Under the Covers
- The five classic components of a computer are input, output, memory, datapath, and control, with the last two sometimes combined and called the processor.
- Access times for DRAM are in the 50 nanosecond range.
- Access times for flash are in the 5 to 50 microsecond range.
- Access times for hard disks are in the 5 to 20 millisecond range.
- Flash memory bits wear out after 100,000 to 1,000,000 writes.
1.5. Technologies for Building Processors and Memory
- A transistor is an on/off switch controlled by electricity.
1.6. Performance
- wall clock time, response time, elapsed time all refer to the total time to complete a task
- CPU time is only the time the CPU spends computing and does not include time spent waiting for I/O.
- Hertz measures cycles per second. If a complete clock cycle takes 250 picoseconds, the clock rate is 1 / (250 x 10^-12 s) = 4 x 10^9 Hz, or 4 GHz.
- The average number of clock cycles each instruction takes to execute is called clock cycles per instruction, or CPI.
- Instructions per clock cycle (IPC) is the inverse of CPI
- Clock rate is the inverse of clock cycle time
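The relationships in this section can be checked with a few lines of arithmetic. This is a minimal Python sketch, not something from the book: the function names and the sample instruction count and CPI are made up for illustration, and the CPU time formula used is the standard instruction count x CPI x clock cycle time.

```python
def clock_rate_hz(cycle_time_s: float) -> float:
    """Clock rate is the inverse of clock cycle time."""
    return 1.0 / cycle_time_s


def ipc(cpi: float) -> float:
    """Instructions per clock cycle is the inverse of CPI."""
    return 1.0 / cpi


def cpu_time_s(instruction_count: int, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


cycle_time = 250e-12              # 250 picoseconds, as in the notes
rate = clock_rate_hz(cycle_time)  # about 4e9 Hz, i.e. 4 GHz

# Hypothetical program: one billion instructions at an average CPI of 2.0.
example_time = cpu_time_s(10**9, 2.0, cycle_time)  # about 0.5 seconds

print(f"clock rate: {rate / 1e9:.1f} GHz")
print(f"IPC at CPI 2.0: {ipc(2.0)}")
print(f"CPU time: {example_time:.2f} s")
```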
1.7. The Power Wall
- CMOS stands for complementary metal oxide semiconductor
- The current problem with improving microprocessors is that lowering the voltage further makes the transistors too leaky, like water faucets that cannot be completely shut off. About 40% of power consumption in server chips is due to leakage.
1.8. The Sea Change: The Switch from Uniprocessors to Multiprocessors
1.11. Historical Perspective and Further Reading
Chapter 2: Instructions
2.1. Introduction
- The stored-program concept is the idea that instructions and data of many types can be stored in memory as numbers and are thus easy to change.
2.3. Operands of the Computer Hardware
- word is the natural unit of access in a computer, usually a group of 32 bits.
- data transfer instruction is a command that moves data between memory and registers.
- load is the data transfer instruction that copies data from memory to a register. This is called load word (lw) in RISC-V.
- The register added to form the address is called the base register, and the constant is called the offset.
- store copies data from a register to memory. This is called store word (sw) in RISC-V.
- alignment restriction is when words must start at addresses that are multiples of 4 (remember, a word is 4 bytes, or 32 bits). RISC-V and Intel x86 do not have alignment restrictions.
- The process of putting less frequently used variables into memory is called spilling registers.
- add immediate is a quick add instruction that takes one constant operand. This avoids having to load the constant from memory.
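To make the base register plus offset idea concrete, here is a small Python sketch of load word, store word, and add immediate operating on a toy memory. The memory size, the register numbers, and the function names are all assumptions chosen for illustration. It models 32-bit little-endian words and is not a faithful RISC-V simulator.

```python
import struct

memory = bytearray(64)   # tiny byte-addressed memory (hypothetical size)
registers = [0] * 32     # 32 general-purpose registers


def effective_address(base_reg: int, offset: int) -> int:
    """Address = contents of the base register + constant offset."""
    return registers[base_reg] + offset


def is_word_aligned(address: int) -> bool:
    """A word-aligned address is a multiple of 4 (one word = 4 bytes)."""
    return address % 4 == 0


def store_word(src_reg: int, base_reg: int, offset: int) -> None:
    """Copy a 32-bit word from a register to memory."""
    address = effective_address(base_reg, offset)
    struct.pack_into("<I", memory, address, registers[src_reg] & 0xFFFFFFFF)


def load_word(dst_reg: int, base_reg: int, offset: int) -> None:
    """Copy a 32-bit word from memory to a register."""
    address = effective_address(base_reg, offset)
    (registers[dst_reg],) = struct.unpack_from("<I", memory, address)


def add_immediate(dst_reg: int, src_reg: int, constant: int) -> None:
    """Add a constant to a register without loading it from memory."""
    registers[dst_reg] = (registers[src_reg] + constant) & 0xFFFFFFFF


registers[22] = 16        # base register points at byte 16
add_immediate(9, 0, 42)   # register 9 = register 0 + 42
store_word(9, 22, 8)      # memory[16 + 8] = register 9
load_word(10, 22, 8)      # register 10 = memory[16 + 8]
print(registers[10])      # prints 42
```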
2.5. Representing Instructions in the Computer
- RISC-V fields:
- opcode: Basic operation of the instruction (7 bits)
- rd: The register destination operand. It gets the result of the operation (5 bits)
- funct3: An additional opcode field. (3 bits)
- rs1: The first register source operand. (5 bits)
- rs2: The second register source operand. (5 bits)
- funct7: An additional opcode field. (7 bits)
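The six fields pack into one 32-bit instruction word (7 + 5 + 5 + 3 + 5 + 7 bits). As a rough sketch, the Python function below assembles an R-type instruction from its fields, using add x9, x20, x21 as the example. The function name is my own, and the layout assumed here is the standard R-type order from most significant to least significant bit: funct7, rs2, rs1, funct3, rd, opcode.

```python
def encode_r_type(funct7: int, rs2: int, rs1: int,
                  funct3: int, rd: int, opcode: int) -> int:
    """Pack the R-type fields into a 32-bit instruction word."""
    fields = [(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        # Shift what we have so far left and append the next field.
        word = (word << width) | value
    return word


# add x9, x20, x21: rd = 9, rs1 = 20, rs2 = 21, opcode 0110011,
# with funct3 and funct7 both zero.
add_word = encode_r_type(funct7=0, rs2=21, rs1=20, funct3=0, rd=9,
                         opcode=0b0110011)
print(f"{add_word:#010x}")  # prints 0x015a04b3
```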
2.8. Supporting Procedures in Computer Hardware
- program counter register holds the address of the current instruction being executed.
- The stack “grows” from higher addresses to lower addresses. This means that you push values onto the stack by subtracting from the stack pointer (sp), and adding to sp shrinks the stack, thereby popping values off it.
- frame pointer is a value denoting the location of the saved registers and local variables for a given procedure.
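The downward-growing stack is easy to get backwards, so here is a minimal Python sketch of push and pop expressed as stack pointer arithmetic. The class name, the starting address 0x1000, and the 4-byte slot size are assumptions picked for the example.

```python
class DownwardStack:
    """Toy stack that grows from higher addresses toward lower ones."""

    WORD_BYTES = 4  # assumed slot size: one 32-bit word

    def __init__(self, top_address: int = 0x1000):
        self.sp = top_address   # stack pointer starts at the highest address
        self.memory = {}        # address -> value, stands in for RAM

    def push(self, value: int) -> None:
        self.sp -= self.WORD_BYTES   # subtracting from sp grows the stack
        self.memory[self.sp] = value

    def pop(self) -> int:
        value = self.memory[self.sp]
        self.sp += self.WORD_BYTES   # adding to sp shrinks the stack
        return value


stack = DownwardStack()
stack.push(11)
stack.push(22)
print(hex(stack.sp))   # prints 0xff8
print(stack.pop())     # prints 22
print(hex(stack.sp))   # prints 0xffc
```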
2.12. Translating and Starting a Program
- Dynamically linked libraries (DLLs) pay a good deal of overhead the first time a routine is called, but only a single indirect branch thereafter.
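The first-call cost can be pictured with a loose Python analogy. This is only a sketch of the lazy-binding idea, not how a real dynamic linker is implemented. The class and attribute names are hypothetical, and a dictionary stands in for the shared library and for the table of resolved addresses.

```python
class LazyLinker:
    """Resolve a routine on its first call, then reuse the cached target."""

    def __init__(self, library: dict):
        self.library = library   # name -> function, stands in for the DLL
        self.table = {}          # cache of already-resolved routines
        self.resolutions = 0     # how many slow first-call lookups happened

    def call(self, name: str, *args):
        target = self.table.get(name)
        if target is None:
            # First call: pay the overhead of finding the routine.
            target = self.library[name]
            self.table[name] = target
            self.resolutions += 1
        # Later calls: one lookup through the table, like an indirect branch.
        return target(*args)


linker = LazyLinker({"square": lambda x: x * x})
print(linker.call("square", 3))   # prints 9 (resolved on this call)
print(linker.call("square", 4))   # prints 16 (reuses the cached target)
print(linker.resolutions)         # prints 1
```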
2.17. Real Stuff: ARMv7 (32-bit) Instructions
Chapter 4: The Processor
4.3. Building a Datapath
- register file is a structure that consists of a set of registers that can be read and written by supplying the number of the register to be accessed.
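As a software analogy, a register file behaves like an array indexed by register number. The Python sketch below is one way to picture it. The class name and the 32-bit register width are assumptions, and treating register 0 as always zero follows the RISC-V convention for x0.

```python
class RegisterFile:
    """Set of registers read and written by register number."""

    def __init__(self, count: int = 32):
        self._regs = [0] * count

    def read(self, number: int) -> int:
        return self._regs[number]

    def write(self, number: int, value: int) -> None:
        if number == 0:
            return  # x0 always reads as zero in RISC-V, so ignore the write
        self._regs[number] = value & 0xFFFFFFFF  # keep 32 bits (assumed width)


regs = RegisterFile()
regs.write(5, 123)
regs.write(0, 999)    # has no effect
print(regs.read(5))   # prints 123
print(regs.read(0))   # prints 0
```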
Chapter 5: Large and Fast: Exploiting Memory Hierarchy
5.2. Memory Technologies
5.7. Virtual Memory
- page fault is an event that occurs when an accessed page is not present in main memory.
- The page table, program counter, and registers are what specify the state of a virtual machine.
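A page fault is easiest to see in a toy address translation. The Python sketch below uses a dictionary as the page table and raises an exception when the page is not present. The 4 KiB page size, the names, and the sample mappings are assumptions for illustration. A real system would have the operating system handle the fault by bringing the page into memory.

```python
PAGE_SIZE = 4096  # assumed page size of 4 KiB


class PageFault(Exception):
    """Raised when the accessed page is not present in main memory."""


def translate(page_table: dict, virtual_address: int) -> int:
    """Map a virtual address to a physical address through the page table."""
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(virtual_page)
    if frame is None:
        raise PageFault(f"virtual page {virtual_page} is not in memory")
    return frame * PAGE_SIZE + offset


# Hypothetical mappings: virtual page -> physical frame.
page_table = {0: 5, 1: 2}

print(translate(page_table, 4100))   # prints 8196 (page 1, offset 4, frame 2)

try:
    translate(page_table, 3 * PAGE_SIZE)
except PageFault as fault:
    print("page fault:", fault)
```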
Chapter 6: Parallel Processors from Client to Cloud
6.4. Hardware Multithreading
- thread includes the program counter, register state, and the stack. Threads commonly share a single address space whereas processes don’t.
- process includes one or more threads, the address space, and the operating system state. A process switch usually invokes the OS but a thread switch does not.
- There is fine-grained, coarse-grained, and simultaneous multithreading. The difference is when the processor decides to switch between threads. Fine-grained switches round robin on each instruction. Coarse-grained switches only on long stalls. Simultaneous multithreading executes instructions from different threads at the same time.
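The difference between fine-grained and coarse-grained switching shows up in the order instructions are issued. The Python sketch below is a simplified model with made-up instruction labels and function names. It ignores timing entirely and does not model simultaneous multithreading, which would need several issue slots per cycle.

```python
def fine_grained(threads):
    """Issue one instruction per thread in round-robin order."""
    queues = [list(t) for t in threads]
    order = []
    while any(queues):
        for thread_id, queue in enumerate(queues):
            if queue:
                order.append((thread_id, queue.pop(0)))
    return order


def coarse_grained(threads, is_long_stall):
    """Stay on one thread until it hits a long stall or finishes."""
    queues = [list(t) for t in threads]
    order = []
    thread_id = 0
    while any(queues):
        queue = queues[thread_id]
        while queue:
            instruction = queue.pop(0)
            order.append((thread_id, instruction))
            if is_long_stall(instruction):
                break  # a costly stall triggers the switch
        thread_id = (thread_id + 1) % len(queues)
    return order


# Hypothetical instruction streams; "miss" marks a long stall.
threads = [["a1", "a2-miss", "a3"], ["b1", "b2"]]
stalls = lambda instruction: instruction.endswith("miss")

print(fine_grained(threads))
# [(0, 'a1'), (1, 'b1'), (0, 'a2-miss'), (1, 'b2'), (0, 'a3')]
print(coarse_grained(threads, stalls))
# [(0, 'a1'), (0, 'a2-miss'), (1, 'b1'), (1, 'b2'), (0, 'a3')]
```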
6.15. Concluding Remarks
Review
This book was a “fun” read for me. This means I don’t worry about doing all the practice problems or force myself to take notes throughout the book. Please keep this in mind when reading my review.
After reading three chapters of Computer Architecture: A Quantitative Approach, I felt the material was a bit too advanced for me. Based on the recommendations in that book, I switched to Computer Organization and Design. I was pleasantly surprised by how readable this textbook was from cover to cover. I know many university textbooks may not be intended to be read this way, but this one was great.
I really enjoyed the structure of the book and felt it was an excellent introductory textbook on computer architecture. The “Fallacies and Pitfalls” and “Historical Perspective” sections at the end of each chapter were some of my favorite parts. This book definitely gets my recommendation for anyone looking to learn more about how computers work.