FPGA-Fabric_Design_and_Architecture

This document is a report of the 5-day intensive workshop on FPGAs, organized by VLSI System Design.
In brief, a 4-bit counter and the RVMyth processor are taken as the units under test. They are run through various platforms (Vivado, VTR, and SOFA), and the results obtained on these platforms are compared in terms of timing, power, and area. A day-by-day summary of the workshop is given below:

  • On day-1, a 4-bit counter is designed and analysed on Vivado, and the theory of Virtual Input/Output (VIO) is discussed briefly.
  • On day-2, the Verilog To Routing (VTR) flow is explored with the 4-bit counter example, and the results (timing, power, and area) of VTR and Vivado are compared.
  • On day-3, the processor code for RVMyth is studied and is run up to bitstream generation on Vivado.
  • On day-4, the timing, power, and area results of the 4-bit counter are studied on the Skywater OpenSource FPGA (SOFA).
  • On day-5, the RVMyth processor is studied on SOFA and the timing, power, and area results are noted.

The rest of this document elaborates on these steps.

Day-wise contents of the workshop

  • Day1
    • Introduction to FPGA
      • What is an FPGA?
      • LUTs and ways for programming FPGAs
      • The Basys FPGA boards and Vivado
    • Vivado_Counter
      • Verilog Simulation
      • Elaboration
      • Map pins
      • Slack
      • Synthesis
      • Bitstream constraints
      • Bitstream generation view on Basys3
      • Timing
      • Power_Area
    • VIO_Counter
      • Introduction
  • Day2
    • Introduction to OpenFPGA
      • Part-1
      • Part-2
      • VTR flow
    • VPR
      • xml blif
      • tseng GUI
      • Timing report
    • VTR
      • VTR flow with VPR GUI
      • Post synthesis simulation
      • Timing_Area
      • Power Analysis
    • Earch and Basys3 result comparison
  • Day3
    • RISC_V core programming using Vivado
      • RVMyth Vivado RTL to Synthesis
      • RVMyth Vivado Synthesis to bitstream
  • Day4
    • Introduction to SOFA FPGA Fabric IP
      • Counter Area
      • Counter Timing
      • Counter post impl
      • Counter Power
  • Day5
    • RISC_V core on custom SOFA fabric
      • SOFA-RVMyth run
      • SOFA-RVMyth timing and area
      • RVMyth post impl netlist
      • SOFA-RVMyth Vivado simulation

Introduction to FPGA

After the complete design flow, ASICs are sent to the foundry for fabrication. Once fabrication is done, no change can be made to the IC, especially at the design level. This puts huge pressure on the reliability of the design. It is therefore very useful (not just in monetary terms, but from a time-to-market point of view as well) to have a reprogrammable device on which multiple design codes can be tested. Research towards this goal yielded devices such as Programmable Logic Arrays (PLAs), Complex Programmable Logic Devices (CPLDs), and Field Programmable Gate Arrays (FPGAs). They essentially realise customizable hardware, which can then be used to study the timing, power, and area parameters of the design. This course uses the Basys3 board from Digilent, which carries a Xilinx Artix-7 FPGA.

Just like we get the layout as the end product of the whole design procedure of an ASIC, we get a BITSTREAM in case of an FPGA. The architecture of an FPGA is explained using the below figure:

image

Since there is a Flip-Flop bank in the Configurable Logic Block (CLB), it is capable of storing small amounts of data.

The figure below shows one of the many ways an expression Z = ~(X.Y) could be realised on an FPGA, by using a Look-Up Table (LUT) from one of the CLBs:

image
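
As a rough illustration of the idea in the figure, a 2-input LUT can be thought of as a tiny memory whose address lines are the logic inputs. The sketch below is a behavioural model only; the module name `lut2_nand` and the `INIT` parameter are illustrative assumptions, not a vendor primitive used in the workshop.

```verilog
// Behavioural sketch of a 2-input LUT (hypothetical module, not a vendor primitive).
// INIT holds the truth table: bit i is the output when {Y, X} == i.
module lut2_nand #(
    parameter [3:0] INIT = 4'b0111   // NAND: output is 0 only when X = Y = 1
) (
    input  wire X,
    input  wire Y,
    output wire Z
);
    // Copy the table into a vector so it can be indexed by the inputs.
    wire [3:0] truth_table = INIT;

    // The inputs act as an address into the stored truth table.
    assign Z = truth_table[{Y, X}];
endmodule
```

Under the same assumptions, overriding `INIT` with `4'b1110` would turn the same structure into an OR gate, which is the sense in which the LUT is "programmed" rather than rewired.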

The following figure shows a zoomed-out view of how the above NAND gate is actually realised. Note that each I/O port can have multiple wires: the X, Y, and Z signals are assigned to wires, and the connections between these wires (also called interconnects) are made or broken depending on the requirement.

image

Interconnections can also be made between different CLBs. Say, for example, we'd like to implement the expression Z = ~(X.Y) + P. A single CLB may be used to realise this logic, but alternatively two CLBs might be used. This is shown in the figure below:

image

CLB_1 implements the NAND logic, while CLB_2 implements the OR logic. Observing the connections closely, the block diagram implemented is:

flowchart LR;
  X --> id1([CLB_1]);
  Y --> id1([CLB_1]);
  id1([CLB_1]) --> id2([CLB_2]);
  P --> id2([CLB_2]);
  id2([CLB_2]) --> Z;
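
The same partitioning can be written structurally in Verilog. This is only a sketch: the module names `clb_nand`, `clb_or`, and `two_clb_example`, and the internal wire `n1`, are hypothetical, and a real tool decides on its own how logic is packed into CLBs.

```verilog
// Hypothetical stand-in for the logic mapped to CLB_1: Z1 = ~(X & Y)
module clb_nand (input wire X, input wire Y, output wire Z1);
    assign Z1 = ~(X & Y);
endmodule

// Hypothetical stand-in for the logic mapped to CLB_2: Z = A | P
module clb_or (input wire A, input wire P, output wire Z);
    assign Z = A | P;
endmodule

// Top level: the wire n1 plays the role of the interconnect between the two CLBs.
module two_clb_example (input wire X, input wire Y, input wire P, output wire Z);
    wire n1;
    clb_nand CLB_1 (.X(X), .Y(Y), .Z1(n1));
    clb_or   CLB_2 (.A(n1), .P(P), .Z(Z));
endmodule
```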

The FPGA design flow is summarised in the following flow-chart:

flowchart TD;
  Architecture_Description --> RTL_Design_and_TestBench;
  RTL_Design_and_TestBench --> Behavioural_Simulation;
  Behavioural_Simulation --> Synthesis_and_Timing_Analysis;
  Timing_Constraints -->  Synthesis_and_Timing_Analysis;
  Pin_Assignments -->  Synthesis_and_Timing_Analysis;
  Synthesis_and_Timing_Analysis --> Implementation_Place_and_Route;
  Implementation_Place_and_Route --> Bitstream_Generation

The generated bitstream is then used (either in software or on hardware) to obtain the timing, power, and area reports.

The following pointers are to be considered while writing RTL for FPGAs:

  • Delays should be implemented with counters that count for the required number of clock cycles. #time delays are not synthesizable.
  • Do not depend on initial blocks to initialise design variables; bring them to a known state with an explicit reset instead. The initial block is best kept for testbenches.
  • User Defined Primitives (UDPs) are non-synthesizable (obviously!).
  • Indeterminate sizes should be given a fixed size to get synthesized.
  • If the design has loops, it should be made sure that these loops are terminating.
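
The first two pointers can be illustrated with a small sketch. Instead of a `#` delay, a counter waits a fixed number of clock cycles, and the starting state comes from an explicit reset rather than an `initial` block. The module name `delay_pulse`, its ports, and the `CYCLES` parameter are assumptions made for illustration only.

```verilog
// Hypothetical example: pulse `done` for one cycle, CYCLES clock cycles
// after `start` is sampled high. Assumes CYCLES >= 1.
module delay_pulse #(
    parameter CYCLES = 100            // required delay in clock cycles (assumed value)
) (
    input  wire clk,
    input  wire rst,                  // synchronous, active-high reset
    input  wire start,
    output reg  done
);
    reg [31:0] count;                 // fixed width, so the size is never indeterminate
    reg        busy;

    always @(posedge clk) begin
        if (rst) begin                // the reset takes the place of an initial block
            count <= 32'd0;
            busy  <= 1'b0;
            done  <= 1'b0;
        end else begin
            done <= 1'b0;             // default: done is only ever a one-cycle pulse
            if (!busy && start) begin
                busy  <= 1'b1;        // start counting
                count <= 32'd1;
            end else if (busy) begin
                if (count == CYCLES) begin
                    busy <= 1'b0;     // delay elapsed
                    done <= 1'b1;
                end else begin
                    count <= count + 32'd1;
                end
            end
        end
    end
endmodule
```

The counter in the design source used later in this document follows the same idea, with the count limit hard-coded instead of passed as a parameter.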

Some information about Basys3:
The following figure (taken from here) highlights the important components on the Basys3 board:

image

Ironically, the most important component, the FPGA itself, is not highlighted. According to the reference manual,

"The Basys3 board is a complete, ready-to-use digital circuit development platform based on the latest Artix-7™ Field Programmable Gate Array (FPGA) from Xilinx. With its high-capacity FPGA XC7A35T-1CPG236C, low overall cost, and collection of USB, VGA, and other ports, the Basys3 can host designs ranging from introductory combinational circuits to complex sequential circuits like embedded processors and controllers. It includes enough switches, LEDs and other I/O devices to allow a large number of designs to be completed without the need for any additional hardware, and enough uncommitted FPGA I/O pins to allow designs to be expanded using Digilent Pmods or other custom boards and circuits."

There are different ways of programming the board:

flowchart TD;
  Programming --> On_site_using_the_actual_board;
  Programming --> Remote;
  Remote --> Send_input_through_VIO_observe_output_on_actual_board;
  Remote --> Input_through_VIO_output_observed_on_ILA;

In the remote way of programming, if the IP address of the board is known, we may send the input to the board through Virtual Input/Output (VIO) and observe the output on the board. In cases where the board is not available (as in this workshop), the inputs are applied through VIO and the output is observed on an Integrated Logic Analyser (ILA).


Vivado Counter

Vivado is an Integrated Design Environment (IDE) from Xilinx (now AMD). The tool offers an intuitive GUI and all of its options are written in native Tool Command Language (TCL). Vivado can be used for analysis and constraint assignment at any stage of the design (such as synthesis, PnR).

Once opened (by using the vivado command), we may either create a new project or open an existing one. Upon choosing to create a new project, the tool prompts us to choose the board we wish to work on. If the board is not in the drop-down list, we need to update the tool's board library (there's a button for that) and then choose the board. The following code is then loaded as a design source:

`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Description: 4 bit counter with source clock (100MHz) division.
//////////////////////////////////////////////////////////////////////////////////
module counter_clk_div(clk,rst,counter_out);
input clk,rst;
reg div_clk;
reg [25:0] delay_count;
output reg [3:0] counter_out;

//////////clock division block////////////////////
always @(posedge clk)
begin

  if(rst)
    begin
      delay_count <= 26'd0;
      div_clk <= 1'b0; //initialise div_clk
    end

  else
    if(delay_count==26'd212)
      begin
        delay_count <= 26'd0; //reset upon reaching the max value
        div_clk <= ~div_clk;  //generating a slow clock
      end

    else
      begin
        delay_count <= delay_count+1;
      end
end
/////////////4 bit counter block///////////////////
// counter_out is assigned only in this block, so it has a single driver.
// The reset is asynchronous here because div_clk has no rising edge
// while rst is held high.
always @(posedge div_clk or posedge rst)
begin

  if(rst)
    begin
      counter_out <= 4'b0000;
    end
  else
    begin
      counter_out <= counter_out+1;
    end
end

endmodule 

As commented in the code, it is a 4-bit counter. The following snapshot from the Basys3 reference manual shows that the on-board clock runs at 100 MHz. In order to observe and analyse the output comfortably, this clock is divided down with a delay counter: div_clk toggles once every 213 source-clock cycles (about 470 thousand toggles per second), which corresponds to a div_clk frequency of roughly 235 kHz.

image
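
The numbers follow from the compare value in the design source. With N = 212 as the compare value, delay_count runs from 0 to 212, so div_clk toggles once every N + 1 = 213 cycles of the 100 MHz clock, and one full div_clk period needs two toggles. The symbols below are introduced here only for this worked calculation:

```latex
f_{\text{toggle}} = \frac{f_{\text{clk}}}{N + 1}
                  = \frac{100\ \text{MHz}}{213} \approx 469.5\ \text{kHz}
\qquad
f_{\text{div\_clk}} = \frac{f_{\text{clk}}}{2\,(N + 1)}
                    = \frac{100\ \text{MHz}}{426} \approx 234.7\ \text{kHz}
```

The figure of about 470 kHz is therefore the rate at which div_clk toggles, and the 4-bit count advances at about 234.7 kHz, once per rising edge of div_clk.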

The following testbench is then loaded as a simulation source:

`timescale 1ns / 1ps

module test_counter();
reg clk, reset;
wire [3:0] out;

//create an instance of the design
counter_clk_div dut(clk, reset, out);  

initial begin
//these statements are sequential and are executed one after the other 
clk=0;  //at time=0
reset=1;//at time=0
#20; //delay 20 units
reset=0; //after 20 units of time, reset becomes 0
end
always 
#5 clk=~clk;  // toggle or negate the clk input every 5 units of time

endmodule

The following result is observed upon simulation:

image

It may be observed that the count increases on each div_clk posedge.
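
To check this behaviour without reading the waveform, the testbench can be made self-checking. The sketch below assumes the `counter_clk_div` module shown earlier is compiled alongside it; the module name `test_counter_check`, the `expected` register, and the choice of 20 observed edges are illustrative and not part of the workshop material.

```verilog
`timescale 1ns / 1ps

// Hypothetical self-checking variant of the testbench above.
module test_counter_check();
reg clk, reset;
wire [3:0] out;
reg [3:0] expected;   // reference count kept by the testbench
integer i;

// same design under test as in the original testbench
counter_clk_div dut(clk, reset, out);

always
#5 clk = ~clk;        // 10 ns period, i.e. a 100 MHz source clock

initial begin
  clk = 0;
  reset = 1;
  #20;
  reset = 0;
  expected = 4'd0;
  for (i = 0; i < 20; i = i + 1) begin
    @(posedge dut.div_clk);    // hierarchical reference to the divided clock
    #1;                        // let the non-blocking update of out settle
    expected = expected + 1;   // 4 bits wide, so it wraps from 15 to 0 like the counter
    if (out !== expected)
      $display("MISMATCH at %0t: out=%0d expected=%0d", $time, out, expected);
  end
  $display("Checked 20 div_clk edges");
  $finish;
end

endmodule
```

Each div_clk period spans 426 source-clock cycles (4.26 us with the 10 ns clock used here), so the simulation has to run for roughly 85 us before all 20 edges have been seen.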

The following snapshot from the Vivado GUI shows the steps that are carried out one after the other:

image

Since simulation is done, the next step is RTL analysis. This is done by opening an elaborated design. See the snapshot below from the Xilinx user guide:

image

The report of the elaborated design looks as below. We have the choice to analyse the layout, I/O planning, and floorplanning. The figure below shows I/O planning and the different pins available:

image

The next step is to map the signals from the design onto these I/O ports:

image

Before that, however, we must choose which pins these signals should be mapped to. For this, we may either check the pin numbers on the board itself or go through the schematics. Since there's no board available, the pin configuration is read from the schematics:

image

After deciding on the pins, they are mapped with the signals as follows:

image

The constraints file (.xdc) is then saved:

image

The contents of this file are as follows:

image

After running the synthesis, the design is opened and the timing report is checked. The report is as follows:

image

The frequency was then given as a constraint: image

and the new timing report: image

The synthesized netlist is shown below: image

After synthesis, the next step is implementation: image

Post-implementation timing report: image

Utilization summary: image

Power summary: image

The next step after implementation is bitstream generation: image
