Slides and text of this presentation:

Slide 1
Slide content: Computer Structure Pipeline. Lecturer: Aharon Kupershtok

Slide 2
Slide content: A Basic Processor

Slide 3
Slide content: Pipelined Car Assembly

Slide 4
Slide content: (no text on this slide)

Slide 5
Slide content: Pipelining
- Pipelining does not reduce the latency of a single task; it increases the throughput of the entire workload
- Potential speedup = number of pipe stages
- Pipeline rate is limited by the slowest pipeline stage
  - Partition the pipe into many pipe stages
  - Make the longest pipe stage as short as possible
  - Balance the work across the pipe stages
- Pipeline adds overhead (e.g., latches)
- Time to "fill" the pipeline and time to "drain" it reduce the speedup
- Stalls for dependencies
- Too many pipe stages start to lose performance
- IPC of an ideal pipelined machine is 1: every clock, one instruction finishes
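
A minimal Python sketch (not from the slides) of the speedup argument, assuming equal stage delays and no stalls: with k stages, n instructions need k + n - 1 cycles instead of n * k, so the speedup only approaches the stage count once the fill/drain cycles are amortized.

def pipeline_speedup(n_instructions: int, n_stages: int) -> float:
    # unpipelined: every instruction occupies the whole machine for n_stages cycles
    unpipelined_cycles = n_instructions * n_stages
    # pipelined: n_stages cycles to fill, then one instruction finishes per clock
    pipelined_cycles = n_stages + n_instructions - 1
    return unpipelined_cycles / pipelined_cycles

print(pipeline_speedup(10, 5))      # ~3.57: fill/drain overhead still visible
print(pipeline_speedup(10_000, 5))  # ~5.00: approaches the number of pipe stages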

Slide 6
Slide content: Pipelined CPU

Slide 7
Slide content: Structural Hazard
- Different instructions using the same resource at the same time
- Register file: accessed in 2 stages
  - Read during stage 2 (ID)
  - Write during stage 5 (WB)
  - Solution: 2 read ports, 1 write port
- Memory: accessed in 2 stages
  - Instruction fetch during stage 1 (IF)
  - Data read/write during stage 4 (MEM)
  - Solution: separate instruction cache and data cache
- Each functional unit can only be used once per instruction
- Each functional unit must be used at the same stage for all instructions

Slide 8
Slide content: Pipeline Example: cycle 1

Slide 9
Slide content: Pipeline Example: cycle 2

Slide 10
Slide content: Pipeline Example: cycle 3

Slide 11
Slide content: Pipeline Example: cycle 4

Slide 12
Slide content: Pipeline Example: cycle 5

Slide 13
Slide content: RAW Dependency

Slide 14
Slide content: Using Bypass to Solve RAW Dependency

Slide 15
Slide content: RAW Dependency

Slide 16
Slide content: Forwarding Hardware

Slide 17
Slide content: Forwarding Control
- Forwarding from EXE (L3):
  if (L3.RegWrite and (L3.dst == L2.src1)) ALUSelA = 1
  if (L3.RegWrite and (L3.dst == L2.src2)) ALUSelB = 1
- Forwarding from MEM (L4):
  if (L4.RegWrite and ((not L3.RegWrite) or (L3.dst ≠ L2.src1)) and (L4.dst == L2.src1)) ALUSelA = 2
  if (L4.RegWrite and ((not L3.RegWrite) or (L3.dst ≠ L2.src2)) and (L4.dst == L2.src2)) ALUSelB = 2
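
A runnable Python sketch of the same forwarding control. The L2/L3/L4 latch names, src/dst fields, and the ALUSelA/ALUSelB encodings follow the slide's pseudocode; the dictionary layout is only an assumption for the example. The rule: forward from EXE when it writes the needed register, and from MEM only when EXE is not already supplying it.

def forward_selects(l2, l3, l4):
    alu_sel_a, alu_sel_b = 0, 0  # 0 = take the operand from the register file

    # Forwarding from EXE (L3)
    if l3["RegWrite"] and l3["dst"] == l2["src1"]:
        alu_sel_a = 1
    if l3["RegWrite"] and l3["dst"] == l2["src2"]:
        alu_sel_b = 1

    # Forwarding from MEM (L4), only if EXE is not already forwarding that source
    if l4["RegWrite"] and (not l3["RegWrite"] or l3["dst"] != l2["src1"]) \
            and l4["dst"] == l2["src1"]:
        alu_sel_a = 2
    if l4["RegWrite"] and (not l3["RegWrite"] or l3["dst"] != l2["src2"]) \
            and l4["dst"] == l2["src2"]:
        alu_sel_b = 2
    return alu_sel_a, alu_sel_b

# add r3, r1, r2 immediately followed by add r5, r3, r4:
# the EXE-stage result for r3 is forwarded to ALU operand A.
print(forward_selects({"src1": 3, "src2": 4},
                      {"RegWrite": True, "dst": 3},
                      {"RegWrite": False, "dst": 0}))  # (1, 0)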

Slide 18
Slide content: Register File Split
- The register file is written during the first half of the cycle
- The register file is read during the second half of the cycle
- The register file is written before it is read, so reads return the correct (just-written) data

Slide 19
Slide content: Can't Always Forward

Slide 20
Slide content: Stall If Cannot Forward

Slide 21
Slide content: Software Scheduling to Avoid Load Hazards
Fast code:
  LW  Rb,b
  LW  Rc,c
  LW  Re,e
  ADD Ra,Rb,Rc
  LW  Rf,f
  SW  a,Ra
  SUB Rd,Re,Rf
  SW  d,Rd

Slide 22
Slide content: Control Hazards

Slide 23
Slide content: Control Hazard on Branches

Slide 24
Slide content: Control Hazard on Branches

Slide 25
Slide content: Control Hazard on Branches

Slide 26
Slide content: Control Hazard on Branches

Slide 27
Slide content: Control Hazard on Branches

Slide 28
Slide content: Control Hazard on Branches

Slide 29
Slide content: Control Hazard: Stall
- Stall the pipe when a branch is encountered, until it is resolved
- Stall impact, assuming:
  - CPI = 1
  - 20% of instructions are branches
  - a stall of 3 cycles on every branch
  then CPI_new = 1 + 0.2 × 3 = 1.6
  (CPI_new = CPI_ideal + average stall cycles per instruction)
- We lose 60% of the performance

Slide 30
Slide content: Control Hazard: Predict Not Taken
- Execute instructions from the fall-through (not-taken) path, as if there were no branch
- If the branch is not taken (~50%), no penalty is paid
- If the branch is actually taken:
  - Flush the fall-through-path instructions before they change the machine state (memory / registers)
  - Fetch the instructions from the correct (taken) path
- Assuming ~50% of branches are not taken on average: CPI_new = 1 + (0.2 × 0.5) × 3 = 1.3

Slide 31
Slide content: Dynamic Branch Prediction

Slide 32
Slide content: BTB Allocation
- Allocation: allocate instructions identified as branches (after decode)
  - Both conditional and unconditional branches are allocated
  - Not-taken branches need not be allocated: a BTB miss implicitly predicts not-taken
- Prediction: the BTB lookup is done in parallel with the IC lookup; the BTB provides
  - an indication that the instruction is a branch (BTB hit)
  - the branch's predicted target
  - the branch's predicted direction
  - the branch's predicted type (e.g., conditional, unconditional)
- Update (when the branch outcome is known):
  - branch target
  - branch history (taken / not-taken)
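
A minimal Python sketch of such a BTB (the dictionary-per-PC layout and the field names are assumptions for illustration, not the lecture's design): a miss implicitly predicts not-taken, taken branches get allocated after decode, and existing entries are updated once the real outcome and target are known.

class BTB:
    def __init__(self):
        self.entries = {}  # branch PC -> {"target", "taken", "type"}

    def predict(self, pc):
        entry = self.entries.get(pc)
        if entry is None:  # miss: not (known to be) a branch, implicit not-taken
            return {"hit": False, "taken": False, "target": pc + 4}
        return {"hit": True, "taken": entry["taken"], "target": entry["target"]}

    def update(self, pc, taken, target, br_type="conditional"):
        # not-taken branches need not be allocated; existing entries are refreshed
        if taken or pc in self.entries:
            self.entries[pc] = {"target": target, "taken": taken, "type": br_type}

btb = BTB()
btb.update(0x100, taken=True, target=0x180)
print(btb.predict(0x100))  # hit, predicted taken to 0x180
print(btb.predict(0x104))  # miss, implicit not-taken prediction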

Slide 33
Slide content: BTB (cont.)
- Wrong prediction:
  - predicted not-taken, actually taken
  - predicted taken, actually not-taken, or actually taken but with the wrong target
- In case of a wrong prediction, flush the pipeline:
  - Reset the latches (same as turning all the instructions into NOPs)
  - Select the PC source to be from the correct path
    (need to carry the fall-through address along with the branch)
  - Start fetching instructions from the correct path
- Assuming a correct prediction rate P: CPI_new = 1 + (0.2 × (1 - P)) × 3
  - For example, if P = 0.7: CPI_new = 1 + (0.2 × 0.3) × 3 = 1.18
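
A short Python sketch tying together the CPI numbers on slides 29, 30 and 33, which all share the same model: CPI_new = CPI_ideal + branch fraction × misprediction fraction × penalty.

def branch_cpi(mispredict_frac, branch_frac=0.2, penalty=3, cpi_ideal=1.0):
    # CPI_new = CPI_ideal + average stall cycles per instruction
    return cpi_ideal + branch_frac * mispredict_frac * penalty

print(branch_cpi(1.0))      # ~1.60: stall on every branch (slide 29)
print(branch_cpi(0.5))      # ~1.30: predict not-taken, ~50% of branches taken (slide 30)
print(branch_cpi(1 - 0.7))  # ~1.18: BTB with P = 0.7 correct predictions (slide 33)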

Slide 34
Slide content: Adding a BTB to the Pipeline

Slide 35
Slide content: Adding a BTB to the Pipeline

Slide 36
Slide content: Adding a BTB to the Pipeline

Slide 37
Slide content: Using The BTB

Slide 38
Slide content: Using The BTB (cont.)

Slide 39
Slide content: Backup

Slide 40
Slide content: MIPS Instruction Formats

Slide 41
Slide content: The Memory Space
- Each memory location is 8 bits = 1 byte wide and has an address
- We assume a 32-bit address, i.e., an address space of 2^32 bytes
- Memory stores both instructions and data
- Each instruction is 32 bits wide, so it is stored in 4 consecutive bytes in memory
- Various data types have different widths

Slide 42
Slide content: Register File
- The register file holds 32 registers
  - Each register is 32 bits wide
- The RF supports reading any two registers and writing any one register in parallel
- Inputs:
  - Read reg 1/2: number of the register whose value will be output on Read data 1/2
  - RegWrite: write enable
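
A small Python sketch of such a register file (the interface is an illustration, not the lecture's datapath): two read ports, one write port, and, following the register-file split on slide 18, the write is applied before the reads within the same cycle. Hard-wiring register 0 to zero follows the usual MIPS convention and is an assumption here.

class RegisterFile:
    def __init__(self):
        self.regs = [0] * 32  # 32 registers, 32 bits each

    def cycle(self, read_reg1, read_reg2, reg_write=False, write_reg=0, write_data=0):
        # Write during the "first half" of the cycle...
        if reg_write and write_reg != 0:  # register 0 stays 0 (MIPS convention, assumed)
            self.regs[write_reg] = write_data & 0xFFFFFFFF
        # ...read during the "second half", so a same-cycle write is visible
        return self.regs[read_reg1], self.regs[read_reg2]

rf = RegisterFile()
# Write r5 and read it back in the same cycle: the reader sees the new value.
print(rf.cycle(read_reg1=5, read_reg2=0, reg_write=True, write_reg=5, write_data=42))  # (42, 0)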

Slide 43
Slide content: Memory Components
- Inputs:
  - Address: address of the memory location we wish to access
  - Read: read data from the location
  - Write: write data into the location
  - Write data (relevant when Write = 1): data to be written into the specified location
- Outputs:
  - Read data (relevant when Read = 1): data read from the specified location

Slide 44
Slide content: The Program Counter (PC)
- Holds the address (in memory) of the next instruction to be executed
- After each instruction, it is advanced to point to the next instruction
- If the current instruction is not a taken branch, the next instruction resides right after the current one: PC ← PC + 4
- If the current instruction is a taken branch, the next instruction resides at the branch target:
  - PC ← target (absolute jump)
  - PC ← PC + 4 + offset × 4 (relative jump)
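
A tiny Python sketch of these PC update rules (the function name and argument names are just for illustration):

def next_pc(pc, taken_branch=False, absolute=False, target=0, offset=0):
    if not taken_branch:
        return pc + 4             # fall through to the next sequential instruction
    if absolute:
        return target             # PC <- target (absolute jump)
    return pc + 4 + offset * 4    # PC <- PC + 4 + offset*4 (relative jump)

print(hex(next_pc(0x100)))                                                   # 0x104
print(hex(next_pc(0x100, taken_branch=True, offset=6)))                     # 0x11c
print(hex(next_pc(0x100, taken_branch=True, absolute=True, target=0x200)))  # 0x200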

Slide 45
Slide content: Instruction Execution Stages
- Fetch
  - Fetch the instruction pointed to by the PC from the I-cache
- Decode
  - Decode the instruction (generate control signals)
  - Fetch operands from the register file
- Execute
  - For a memory access: calculate the effective address
  - For an ALU operation: execute the operation in the ALU
  - For a branch: calculate the condition and target
- Memory Access
  - For a load: read data from memory
  - For a store: write data into memory
- Write Back
  - Write the result back to the register file
  - Update the program counter

Slide 46
Slide content: The MIPS CPU

Slide 47
Slide content: Executing an Add Instruction

Slide 48
Slide content: Executing a Load Instruction

Slide 49
Slide content: Executing a Store Instruction

Slide 50
Slide content: Executing a BEQ Instruction

Slide 51
Slide content: Control Signals

Slide 52
Slide content: Pipelined CPU: Load (cycle 1 – Fetch)

Slide 53
Slide content: Pipelined CPU: Load (cycle 2 – Dec)

Slide 54
Slide content: Pipelined CPU: Load (cycle 3 – Exe)

Slide 55
Slide content: Pipelined CPU: Load (cycle 4 – Mem)

Slide 56
Slide content: Pipelined CPU: Load (cycle 5 – WB)

Slide 57
Slide content: Datapath with Control

Slide 58
Slide content: Multi-Cycle Control

Slide 59
Slide content: Five Execution Steps
- Instruction Fetch
  - Use the PC to get the instruction and put it in the Instruction Register
  - Increment the PC by 4 and put the result back in the PC
    IR = Memory[PC];
    PC = PC + 4;
- Instruction Decode and Register Fetch
  - Read registers rs and rt
  - Compute the branch address
    A = Reg[IR[25-21]];
    B = Reg[IR[20-16]];
    ALUOut = PC + (sign-extend(IR[15-0]) << 2);
  - We aren't setting any control lines based on the instruction type (we are busy "decoding" it in our control logic)

Slide 60
Slide content: Five Execution Steps (cont.)
- Execution
  - The ALU performs one of three functions, based on the instruction type:
    - Memory reference (effective address calculation): ALUOut = A + sign-extend(IR[15-0]);
    - R-type: ALUOut = A op B;
    - Branch: if (A == B) PC = ALUOut;
- Memory Access or R-type instruction completion
- Write-back step

Slide 61
Slide content: The Store Instruction

Slide 62
Slide content: RAW Hazard: SW Solution

Slide 63
Slide content: Delayed Branch
- Define the branch to take effect AFTER the n following instructions
- HW executes the n instructions following the branch regardless of whether the branch is taken or not
- SW puts into the n slots following the branch instructions that need to be executed regardless of branch resolution:
  - instructions from before the branch instruction, or
  - instructions from the converged path after the branch
- If it cannot find independent instructions, it puts NOPs

Slide 64
Slide content: Delayed Branch Performance
- Filling 1 delay slot is easy, 2 is hard, 3 is harder
- Assuming we can effectively fill a fraction d of the delay slots: CPI_new = 1 + 0.2 × (3 × (1 - d))
  - For example, for d = 0.5 we get CPI_new = 1.3
- Mixes architecture with micro-architecture:
  - new generations require more delay slots
  - causes compatibility issues between generations
