ALU, Datapath and Control Unit | Part 2
This lecture covers micro-operations in CPU architecture, explaining how elementary CPU operations are described using Register Transfer Language (RTL). The instructor discusses clock cycles required for instruction fetch, different ALU-based architectures (accumulator-based, register-based, etc.), and analyzes a GATE exam question about clock cycles needed for a PUSH instruction.
Summary
The lecture begins with a recap of previous content about CPU components including the accumulator, data registers, ALU, flag register, instruction register, and the instruction fetch cycle. The instructor reviews the four micro-operations involved in instruction fetch: transferring PC content to MAR, fetching data from memory to MDR, moving MDR content to IR, and incrementing the PC.
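The four fetch micro-operations recapped above can be sketched as a small simulation. This is only an illustration: the `fetch` function, the dictionary-based register file, the sample memory contents, and the PC increment of 1 (one instruction per address) are assumptions, not details from the lecture.

```python
# Hypothetical toy machine: register names follow the summary,
# everything else (memory contents, PC step of 1) is assumed.
def fetch(state, memory):
    """Apply the four fetch micro-operations and return them in RTL form."""
    trace = []
    state["MAR"] = state["PC"]            # address of next instruction
    trace.append("MAR <- PC")
    state["MDR"] = memory[state["MAR"]]   # memory read into the data register
    trace.append("MDR <- M[MAR]")
    state["IR"] = state["MDR"]            # instruction is now ready to decode
    trace.append("IR <- MDR")
    state["PC"] += 1                      # point at the following instruction
    trace.append("PC <- PC + 1")
    return trace


memory = {0: "LOAD A", 1: "ADD B"}
state = {"PC": 0, "MAR": None, "MDR": None, "IR": None}
trace = fetch(state, memory)
```

After the call, `state["IR"]` holds the instruction stored at address 0 and `trace` lists the four RTL statements in the order they were applied.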
The instructor then formally defines micro-operations as the most elementary, fundamental, and atomic operations performed by a CPU — operations that cannot be further subdivided. Using the instruction fetch example, four distinct micro-operations are identified. A key point emphasized is that most micro-operations complete in one clock cycle, though exceptions exist (e.g., memory access may take multiple clock cycles due to memory speed constraints).
The concept of clock cycles is explained with practical examples: a 1 GHz processor completes one clock cycle in 1 nanosecond, while a 1 MHz processor takes 1 microsecond (1000 nanoseconds) per cycle, so the 1 GHz clock runs 1000x faster. The instructor clarifies that micro-operations can run in parallel only when they do not share the internal bus, using the analogy of two people sharing one scooter to explain bus contention.
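The clock-period arithmetic can be checked with a short calculation. The helper name `clock_period_ns` is hypothetical; the only relation used is period = 1 / frequency.

```python
def clock_period_ns(frequency_hz):
    """Return the duration of one clock cycle in nanoseconds."""
    return 1e9 / frequency_hz


period_1ghz = clock_period_ns(1e9)   # 1 GHz clock
period_1mhz = clock_period_ns(1e6)   # 1 MHz clock
ratio = period_1mhz / period_1ghz    # how many times longer the 1 MHz cycle is
```

Here `period_1ghz` is 1 ns, `period_1mhz` is 1000 ns, and `ratio` is 1000, which matches the figures quoted in the summary.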
Five types of ALU architectures are discussed: (1) Accumulator-based — one input from accumulator, other from register/memory, output back to accumulator; (2) Register-based — both inputs from GPRs, output to accumulator; (3) Register-memory based — first input from GPR, second from GPR or memory; (4) Complex/memory-memory based — both inputs from GPR or memory; (5) Stack-based — uses top-of-stack and next-of-stack registers. Simple A+B examples are shown for each architecture.
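As a rough illustration of how the operand sources differ, the sketch below computes C = A + B on a toy accumulator machine and a toy stack machine. The mnemonics (`LOAD`, `ADD`, `STORE`, `PUSH`, `POP`) and both interpreter functions are assumptions made for this example; the lecture's own A+B examples may use different notation.

```python
def run_accumulator(program, memory):
    """Accumulator style: one ALU input is always the accumulator."""
    acc = 0
    for op, operand in program:
        if op == "LOAD":
            acc = memory[operand]
        elif op == "ADD":
            acc = acc + memory[operand]   # result goes back to the accumulator
        elif op == "STORE":
            memory[operand] = acc
    return memory


def run_stack(program, memory):
    """Stack style: ALU inputs are top-of-stack and next-of-stack."""
    stack = []
    for op, operand in program:
        if op == "PUSH":
            stack.append(memory[operand])
        elif op == "ADD":
            top = stack.pop()
            nxt = stack.pop()
            stack.append(nxt + top)       # result replaces both operands
        elif op == "POP":
            memory[operand] = stack.pop()
    return memory


acc_result = run_accumulator(
    [("LOAD", "A"), ("ADD", "B"), ("STORE", "C")], {"A": 2, "B": 3}
)
stack_result = run_stack(
    [("PUSH", "A"), ("PUSH", "B"), ("ADD", None), ("POP", "C")], {"A": 2, "B": 3}
)
```

Both runs leave 5 in `C`; the difference is that the accumulator version names one memory operand per instruction, while the stack version's `ADD` names none.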
The lecture concludes with a detailed analysis of a GATE 2001 question about a CPU in which registers A, B, A1, A2, and MDR, the internal bus, and the ALU are 8 bits wide, while SP (stack pointer) and MAR are 16 bits wide. The instructor explains that moving a 16-bit value over the 8-bit bus relies on multiplexers (MUX) and demultiplexers (DEMUX) and carries 8 bits per cycle, so the SP-to-MAR transfer takes two clock cycles. For the PUSH R instruction, the total is calculated as: 2 cycles (SP→MAR, 16 bits over the 8-bit bus) + 1 cycle (R→MDR) + 2 cycles (memory write via MDR) = 5 clock cycles. The SP decrement happens in parallel at no extra cost because SP has local decrement circuitry.
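The cycle count for PUSH R can be tallied as below. The step list mirrors the breakdown in the summary; the `PUSH_STEPS` and `total_cycles` names are hypothetical, and the local SP decrement is modelled as costing 0 cycles because it overlaps with the other steps.

```python
# Each entry: (RTL-style description, clock cycles charged to the instruction).
PUSH_STEPS = [
    ("MAR <- SP (16 bits sent as two 8-bit halves)", 2),
    ("MDR <- R", 1),
    ("M[MAR] <- MDR (memory write)", 2),
    ("SP <- SP - 1 (local circuitry, overlapped)", 0),
]


def total_cycles(steps):
    """Sum the cycles of all steps that occupy the shared bus or memory."""
    return sum(cycles for _, cycles in steps)


push_cycles = total_cycles(PUSH_STEPS)
```

With these per-step costs, `push_cycles` comes to 5, the answer derived in the lecture.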
Key Insights
- The instructor explains that parallel micro-operations are only possible when they do not share the internal bus — operations requiring the same shared bus must execute sequentially, each consuming one clock cycle, similar to two people sharing a single scooter.
- The instructor clarifies that a memory access via MAR and MDR requires two clock cycles because main memory is slower than the registers; a cache access, by contrast, could potentially complete in one clock cycle.
- The instructor identifies that transferring a 16-bit stack pointer value over an 8-bit internal bus requires 8 multiplexers (MUX) arranged to send the first 8 bits in one clock cycle and the remaining 8 bits in the next, necessarily consuming two clock cycles.
- For the PUSH R instruction, a common student mistake is writing M[SP] ← R directly, but the correct sequence requires R's content to first go to MDR before being written to memory — skipping MDR is architecturally impossible given how the data bus is connected.
- The instructor explains that stack pointer decrement (SP-1) costs zero additional clock cycles in this architecture because the SP has internal local circuitry for decrement, allowing it to execute in parallel with other operations without consuming extra clock cycles.
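The two-cycle SP-to-MAR transfer can be pictured as splitting the 16-bit value into two bytes, one per clock cycle. This is a sketch under assumptions: the function names are hypothetical, and sending the low byte first is an arbitrary choice that the summary does not specify.

```python
def transfer_16_over_8(value):
    """Return the bytes placed on the 8-bit bus, one per clock cycle."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    low = value & 0xFF           # cycle 1 (assumed order: low byte first)
    high = (value >> 8) & 0xFF   # cycle 2
    return [low, high]


def reassemble(halves):
    """Rebuild the 16-bit value on the receiving side (the MAR)."""
    low, high = halves
    return (high << 8) | low


sp = 0x12FE
halves = transfer_16_over_8(sp)
mar = reassemble(halves)
```

The list `halves` has two entries, one per bus cycle, and `mar` ends up equal to `sp`.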