Pipelining is the process of carrying out multiple instructions concurrently
Each instruction will be at a different stage of the fetch-decode-execute cycle
One instruction can be fetched while the previous one is being decoded and the one before that is being executed
When a branch is taken, the instructions already fetched after it may be the wrong ones, so the pipeline is flushed
Pipeline Flushing clears all in-flight instructions from the pipeline
This is inefficient because it discards work already done, forcing the processor to restart the pipeline and slowing execution
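The cost of a flush can be sketched with a toy model (illustrative Python; the stage layout and `flush_cost` helper are my own hypothetical names, not a real API):

```python
# Toy sketch of a pipeline flush: when a branch in Execute is taken,
# the instructions already in Fetch and Decode are wrong-path work
# and must be discarded, wasting the cycles spent on them.
def flush_cost(pipeline):
    """pipeline: stage contents as [Fetch, Decode, Execute]."""
    # Everything behind the branch (Fetch and Decode here) is squashed
    wrong_path = [inst for inst in pipeline[:-1] if inst is not None]
    flushed = [None] * (len(pipeline) - 1) + [pipeline[-1]]
    return flushed, len(wrong_path)   # bubbles introduced by the flush

state, wasted = flush_cost(["ADD", "SUB", "BEQ"])  # BEQ taken in Execute
# state == [None, None, "BEQ"], wasted == 2
```

The two discarded instructions must be re-fetched from the branch target, which is exactly the wasted work described above.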
This table shows which stage each instruction is at during each step:
| Step   | Fetch         | Decode        | Execute       |
|--------|---------------|---------------|---------------|
| Step 1 | Instruction A |               |               |
| Step 2 | Instruction B | Instruction A |               |
| Step 3 | Instruction C | Instruction B | Instruction A |
| Step 4 | Instruction D | Instruction C | Instruction B |
While one instruction is being executed, the next instruction will be decoded and the following instruction will be fetched
How does pipelining improve processor performance?
Increases instruction throughput by allowing multiple instructions to be at different stages of completion simultaneously
Maximises hardware utilisation as different functional units of the CPU work on different instructions at once rather than sitting idle
Reduces the clock cycle time, as breaking the execution process into smaller stages allows the processor to run at a higher frequency
Overlaps the Fetch-Decode-Execute cycle, ensuring that while one instruction is being executed, the subsequent ones are already being prepared
Disadvantages of pipelining
Pipeline hazards occur when the next instruction cannot execute in the following clock cycle, causing performance-degrading stalls (or bubbles)
Data Hazards: Occur when an instruction depends on the result of a previous instruction that has not yet completed its execution
Control Hazards: Caused by branches or jumps that alter the program flow, often requiring the pipeline to be flushed (emptying already-fetched instructions)
Structural Hazards: Arise when two or more instructions require the same hardware resource, such as memory or the ALU, at the same time
Increased design complexity: Requires more sophisticated control units and logic to handle synchronisation and hazard detection, which is particularly challenging in CISC architectures where execution time is not uniform
Diminishing returns: Performance gains are not always linear, as the overhead of managing the pipeline and the frequency of dependencies can limit the overall speedup