RaveNoC - configurable Network-on-Chip

Quickstart regression

To run the regression tests for the NoC, please follow the sequence below:

$ cd ravenoc_project_folder
$ docker run --rm --name ravenoc_run -v $(pwd):/ravenoc -w /ravenoc aignacio/ravenoc tox

To run a specific test, use the Makefile, setting the SPEC_TEST variable to the name/flavor of the test you want, then run:

make all

Once it is running, it creates the run_dir folder with all the logs and waveforms (in .fst format) for each of the runs. For more details on the tests, please read the Tb readme.

Introduction

RaveNoC is a configurable HDL implementation of a mesh-topology NoC that lets the user change parameters and set up new configurations. In summary, the features of RaveNoC are:

  1. Mesh topology (2D-XY)
  2. Valid/ready flow control
  3. Switching: Pipelined wormhole
  4. Virtual channel flow control
  5. Slave I/F AMBA AXI4
  6. Different IRQs that can be muxed/masked individually
  7. Support multiple clock configurations (CDC)
  8. Configurable parameters:
    • Flit/AXI data width
    • Number of buffers in the input module
    • Number of virtual channels
    • Order of priority in the VCs
    • Dimensions of the NoC (Rows_X_Cols)
    • Routing algorithm
    • Maximum size of packets

Integration

The RTL top file exports arrays of inputs/outputs of an AXI4 slave interface, with length equal to the number of routers in the NoC, i.e. Rows x Cols. The ravenoc module also takes the AXI_CDC_REQ array as an input parameter, which specifies whether each router needs the CDC async gp FIFO because it crosses clock domains.

There is a single clock/async. reset for the NoC and an array of clocks/async. resets for the AXI interfaces, because every router can be in a different clock domain. An additional input called bypass_cdc is used in the testbench, but it is not recommended for use during integration: if CDC is not required, the user should instead change the AXI_CDC_REQ parameter at the corresponding array index.

For every router, a set of CSRs (Control and Status Registers) is available, individually programmable per unit. The available CSRs are:

| CSR | Address | Description | Default | Permissions |
|-----|---------|-------------|---------|-------------|
| RAVENOC_VERSION | `AXI_CSR_BASE_ADDR+'h0 | RaveNoC HW version | 1.0 | Read-Only |
| ROUTER_ROW_X_ID | `AXI_CSR_BASE_ADDR+'h4 | Row / X - ID of the Router | 0 | Read-Only |
| ROUTER_COL_Y_ID | `AXI_CSR_BASE_ADDR+'h8 | Column / Y - ID of the Router | 0 | Read-Only |
| IRQ_RD_STATUS | `AXI_CSR_BASE_ADDR+'hC | Returns the IRQ value per VC | -- | Read-Only |
| IRQ_RD_MUX | `AXI_CSR_BASE_ADDR+'h10 | Controls the input mux of IRQs | DEFAULT | R/W |
| IRQ_RD_MASK | `AXI_CSR_BASE_ADDR+'h14 | Controls the input mask of the IRQs | 'hFFFF | R/W |
| WR_BUFFER_FULL | `AXI_CSR_BASE_ADDR+'h18 | Indicates if the wr. buffer is full | 0 | Read-Only |
| IRQ_PULSE_ACK | `AXI_CSR_BASE_ADDR+'h1C | When IRQ=PULSE_HEAD, ack the inter. | 0 | Write-Only |

See the SV structs to understand the possible values for the IRQ_RD_MUX.

Additional CSRs

Some additional CSRs are generated based on the number of virtual channels the NoC is configured with. Each CSR is connected to the bits of the read pointer FIFO element that indicate the size of the packet in each individual VC read FIFO. They are read-only CSRs and their start address is right after the default CSR table above. For instance, in a NoC with 4xVCs the CSRs are the ones listed below:

| CSR | Address | Description | Default | Permissions |
|-----|---------|-------------|---------|-------------|
| RD_SIZE_VC_PKT_0 | `AXI_CSR_BASE_ADDR+'h20 | Size of the packet in VC0 | 0 | Read-Only |
| RD_SIZE_VC_PKT_1 | `AXI_CSR_BASE_ADDR+'h24 | Size of the packet in VC1 | 0 | Read-Only |
| RD_SIZE_VC_PKT_2 | `AXI_CSR_BASE_ADDR+'h28 | Size of the packet in VC2 | 0 | Read-Only |
| RD_SIZE_VC_PKT_3 | `AXI_CSR_BASE_ADDR+'h2C | Size of the packet in VC3 | 0 | Read-Only |

Considering the example above, to get the size of the packet in virtual channel 3, the user must read the address `AXI_CSR_BASE_ADDR+'h2C.
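The address layout of these per-VC CSRs can be sketched in a few lines of Python. This is only an illustration: the base value 0x3000 is a placeholder assumption (the real `AXI_CSR_BASE_ADDR comes from the defines file), and the helper name rd_size_vc_pkt_addr is hypothetical. The 'h20 start offset and the 4-byte spacing are taken from the table above.

```python
# Placeholder base address, assumed for illustration only;
# the real value is set by `AXI_CSR_BASE_ADDR in the defines file.
AXI_CSR_BASE_ADDR = 0x3000

# First per-VC CSR sits right after the fixed CSR table (last fixed CSR is at +0x1C).
RD_SIZE_FIRST_OFFSET = 0x20
CSR_STRIDE = 0x4  # one 32-bit word per CSR, as in the table


def rd_size_vc_pkt_addr(vc_id, n_virt_chn=4, base=AXI_CSR_BASE_ADDR):
    """Return the address of RD_SIZE_VC_PKT_<vc_id> (hypothetical helper)."""
    if not 0 <= vc_id < n_virt_chn:
        raise ValueError(f"vc_id {vc_id} outside 0..{n_virt_chn - 1}")
    return base + RD_SIZE_FIRST_OFFSET + CSR_STRIDE * vc_id


# Print the offsets relative to the base for a 4xVC NoC.
for vc in range(4):
    print(f"RD_SIZE_VC_PKT_{vc}: base + {rd_size_vc_pkt_addr(vc, base=0):#x}")
```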

IRQs

At the top level there is an array of IRQs (Interrupt Request Signals), a struct that is connected to every router / AXI module of the NoC. All the IRQs are related to the AXI read VC buffers of the router. Two of the CSRs mentioned previously configure the IRQ behavior in each router. IRQ_RD_MUX selects the input source for the IRQs, which can be either the empty or full flags of the read AXI buffers or a comparison with the number of flits available to be read in the read buffer. IRQ_RD_MASK is an input mask that is ANDed with every bit of the output of IRQ_RD_MUX; if the mux is set to comparison, the mask instead represents the reference value. When the MUX is set to IRQ_PULSE_HEAD_FLIT, the IRQ needs to be acknowledged by writing 0x00 into the IRQ_PULSE_ACK CSR. The image below illustrates this. (Figure: IRQs RaveNoC)

Configurable parameters

The following parameters are configurable and can be passed at compilation time as SystemVerilog macros. Note that not all parameters are meant to be changed unless you intend to look inside the design, understand how they are used, and build something custom for a specific application. To check the default values of all the parameters, see the main defines file.

| SV Macro | Description | Default Value | Range |
|----------|-------------|---------------|-------|
| FLIT_DATA_WIDTH | Flit data width in bits, AXI data width will be equal | 32 | (32,64) - 128 not tested |
| FLIT_BUFF | Number of flits buffered in each virtual channel input fifo | 2 | (1,2,4,8...) - Must be a power of 2 |
| N_VIRT_CHN | Number of virtual channels | 3 | (1,2,3,4...) - Up to 32 |
| H_PRIORITY | Priority order on the virtual channels | ZeroHighPrior | ZeroHighPrior or ZeroLowPrior |
| NOC_CFG_SZ_ROWS | Number of rows in the NoC - X | 2 | 1 (if cols > 1),2,3,4... - Any int. value |
| NOC_CFG_SZ_COLS | Number of cols in the NoC - Y | 2 | 1 (if rows > 1),2,3,4... - Any int. value |
| ROUTING_ALG | Routing algorithm of the input module | "XYAlg" | "XYAlg" or "YXAlg" |
| MAX_SZ_PKT | Max number of flits per packet | 256 | Min. val == 1 |
| AUTO_ADD_PKT_SZ | If set, NoC will auto append pkt size on the header flit | 0 | 0 - user sets the pkt size or 1 |
| RD_AXI_BFF(x) | Math macro to gen the num. of buffers per RD VCs on AXI4 slave | x<=2?(1<<x):4 | -- |
| CDC_TAPS | Number of FIFO slots in the async gp fifo used for CDC | 2 | >=2 - Must be a power of 2 |
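As a rough aid for picking values, the ranges in the table can be checked before compilation. The Python sketch below is an assumption-laden illustration, not part of the project: the function names (rd_axi_bff, check_config) are hypothetical, and only the constraints stated in the table are encoded. rd_axi_bff mirrors the default macro expression x<=2?(1<<x):4.

```python
def rd_axi_bff(x):
    """Python equivalent of the default RD_AXI_BFF(x) expression: x<=2 ? (1<<x) : 4."""
    return (1 << x) if x <= 2 else 4


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def check_config(flit_buff=2, n_virt_chn=3, rows=2, cols=2,
                 max_sz_pkt=256, cdc_taps=2):
    """Return a list of violations of the ranges listed in the table (hypothetical helper)."""
    errors = []
    if not is_power_of_two(flit_buff):
        errors.append("FLIT_BUFF must be a power of 2")
    if not 1 <= n_virt_chn <= 32:
        errors.append("N_VIRT_CHN must be between 1 and 32")
    if rows < 1 or cols < 1 or (rows == 1 and cols == 1):
        errors.append("rows/cols may be 1 only if the other dimension is > 1")
    if max_sz_pkt < 1:
        errors.append("MAX_SZ_PKT must be at least 1")
    if cdc_taps < 2 or not is_power_of_two(cdc_taps):
        errors.append("CDC_TAPS must be a power of 2 and >= 2")
    return errors


# Read-buffer depth per VC with the default macro, for a 4xVC NoC.
print([rd_axi_bff(vc) for vc in range(4)])
# The default values from the table should produce no violations.
print(check_config())
```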

RTL micro architecture

Router

The NoC has been built so that most of the modules are replicated through SV generate constructs; the behavior is therefore generic, and it was designed so that the user can reuse as much as possible across different hierarchies. Each router is composed of input modules, output modules, and an NI (network interface) with one CSR (control and status register) bank and an AXI4 slave interface. The diagram below shows the modules mentioned and how many instances are used in each router. (Figure: Router)

Each router has 5x output modules and 5x input modules, each connected to the modules of the other directions (i.e. the other 4x ports). One router is capable of routing a packet composed of one (single head flit) or more flits through its ports (west, east, north, south or local). The only requirement for the packet payload is the header flit, which in RaveNoC follows the encoding below:

Max_width+2b---------------------------------------------0
| FLIT_TYPE (2b) | X_DEST | Y_DEST | PKT_WIDTH | MESSAGE |
+--------------------------------------------------------+
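To make the field order concrete, here is a minimal Python sketch that packs the data part of a header flit (everything except FLIT_TYPE, which the text says is appended later by pkt_proc). The field widths are an assumption: each field is sized as the number of bits needed to index rows, cols and MAX_SZ_PKT, with MESSAGE taking the remaining low bits. The names field_width and pack_header are hypothetical; check the SV structs for the real widths.

```python
def field_width(n_items):
    """Bits needed to index n_items values, at least 1 (assumed sizing rule)."""
    return max(1, (n_items - 1).bit_length())


def pack_header(x_dest, y_dest, pkt_width, message=0, *,
                rows=2, cols=2, max_sz_pkt=256, flit_data_width=32):
    """Pack X_DEST | Y_DEST | PKT_WIDTH | MESSAGE, MSB first, into one data word."""
    xw = field_width(rows)
    yw = field_width(cols)
    pw = field_width(max_sz_pkt)
    mw = flit_data_width - xw - yw - pw  # remaining low bits carry MESSAGE
    if mw < 0:
        raise ValueError("header fields do not fit in the flit data width")
    fields = (("x_dest", x_dest, xw), ("y_dest", y_dest, yw),
              ("pkt_width", pkt_width, pw), ("message", message, mw))
    word = 0
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value  # append the field below the previous ones
    return word


# Header for router (1,1) in a 2x2 NoC, PKT_WIDTH = 255, empty MESSAGE.
print(hex(pack_header(1, 1, 255)))
```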

Each port always selects a route for the current flit in the corresponding virtual channel, but never back in the same direction, i.e. a flit never returns to where it came from. This also applies to the local port (the one on the diagonal in the previous diagram). So if a flit is pushed through the router, it is because it has a valid destination and the flit should move between the internal FIFOs of the input modules. For instance, in a 2x2 NoC, considering all the connections between routers, we have something like this: (Figure: Router connections example)

The RaveNoC routers also contain some additional components that are used for assembling the packet and converting clock domains between the NoC and the AXI interface. The image below shows how these pieces are connected internally and what the datapath of the packet is. (Figure: Router detailed) In pkt_proc, the flit type is appended to the packet, which is then sent to cdc_pkt. The cdc_pkt module is optional: it is not instantiated in a given router if AXI_CDC_REQ[router] is set to 0.

Input module

One router has exactly 5x input modules, and each input module can have one or more virtual channels. Each virtual channel has a configurable FIFO inside that is responsible for storing the flits that arrive on its input interface with the matching virtual channel id.

The west, east, north and south connections come from neighboring routers, while the local port connects to the network interface, which consumes/produces flits all the time. Every time a head flit arrives at the input module, the input router inside this module decodes its destination by comparing the current node address of the router with the target address in the header flit. (Figure: Input router) Depending on the chosen routing algorithm, it selects one of the 4x possible output modules. It is also important to highlight that each virtual channel has its own independent FIFO: when a higher-priority virtual channel message arrives, the lower-priority flits are preempted inside their FIFOs, allowing only the higher-priority flit to pass through.
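The route decision described above can be sketched as plain dimension-order routing. The Python below is a generic illustration under stated assumptions, not the RTL: it assumes X is the row index growing toward "south" and Y is the column index growing toward "east" (the actual port naming in the design may differ), and the function name route is hypothetical. "XYAlg" resolves the X dimension first and "YXAlg" resolves Y first.

```python
def route(cur, dst, alg="XYAlg"):
    """Pick the next output port for a head flit (generic dimension-order routing).

    cur and dst are (x, y) router coordinates. Direction names are assumed:
    larger x means "south", larger y means "east".
    """
    if alg not in ("XYAlg", "YXAlg"):
        raise ValueError(f"unknown routing algorithm {alg!r}")
    (cur_x, cur_y), (dst_x, dst_y) = cur, dst
    # Move needed in each dimension, or None if already aligned.
    x_move = None if dst_x == cur_x else ("south" if dst_x > cur_x else "north")
    y_move = None if dst_y == cur_y else ("east" if dst_y > cur_y else "west")
    first, second = (x_move, y_move) if alg == "XYAlg" else (y_move, x_move)
    # Resolve the first dimension completely before touching the second one.
    return first or second or "local"


# Same source/destination pair, both algorithms, in a 2x2 NoC.
print(route((0, 0), (1, 1), "XYAlg"), route((0, 0), (1, 1), "YXAlg"))
```

Because one dimension is always finished before the other starts, a packet takes at most one turn on its way to the destination.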

If a FIFO is full, it drives the corresponding ready interface signal to zero to generate back pressure on the incoming connection. If there is space available and the interface has valid asserted, a flit goes through the input module, and this module will (in the next clock cycle) forward its route to the output.
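The valid/ready handshake with back pressure can be modelled behaviourally in a few lines. This Python sketch is a software analogy only: the class name VcFifo and its methods are hypothetical, the default depth of 2 follows the FLIT_BUFF default, and cycle-level timing of the RTL is not represented.

```python
from collections import deque


class VcFifo:
    """Behavioural model of one virtual-channel FIFO with valid/ready flow control."""

    def __init__(self, depth=2):
        self.depth = depth
        self.slots = deque()

    @property
    def ready(self):
        # ready drops to False when the FIFO is full, which back-pressures the sender.
        return len(self.slots) < self.depth

    def push(self, valid, flit=None):
        """One transfer attempt: the flit is stored only if valid and ready are both set."""
        if valid and self.ready:
            self.slots.append(flit)
            return True
        return False

    def pop(self):
        """Forward the oldest flit (or None if empty), freeing one slot."""
        return self.slots.popleft() if self.slots else None


fifo = VcFifo(depth=2)
for flit in ("head", "body", "tail"):
    accepted = fifo.push(True, flit)
    print(flit, "accepted" if accepted else "stalled", "ready =", fifo.ready)
```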

Output module

The output module has no sequential elements like FFs to store the flits, so every time a route has been established, it connects the corresponding input module FIFO to the next router in the NoC. (Figure: Output router) Each output module has a round-robin arbiter per virtual channel, so over time it maintains fairness between the different input modules of the same virtual channel. The same "preemptive virtual channels" concept applied in the input module is used in the output module, where a flit coming from a higher-priority virtual channel takes precedence over lower-priority ones.

Thus the input module routing is responsible for locking the current route on its own, for each independent virtual channel, since the route must be restored after the high-priority flit has been transferred.
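The fairness property of a round-robin arbiter, as mentioned for the output module, can be illustrated with a generic software model. This is a textbook round-robin sketch in Python, not a transcription of the RaveNoC arbiter; the class name RoundRobinArbiter is hypothetical, and the four requesters stand for the input modules that can target one output.

```python
class RoundRobinArbiter:
    """Generic round-robin arbiter: the search starts after the last granted requester."""

    def __init__(self, n_req):
        self.n_req = n_req
        self.last = n_req - 1  # so that requester 0 is checked first

    def grant(self, requests):
        """requests is a list of booleans; returns the granted index or None."""
        for offset in range(1, self.n_req + 1):
            idx = (self.last + offset) % self.n_req
            if requests[idx]:
                self.last = idx  # the winner gets the lowest priority next time
                return idx
        return None


# Inputs 0, 1 and 3 keep requesting; each is served in turn.
arb = RoundRobinArbiter(4)
print([arb.grant([True, True, False, True]) for _ in range(6)])
```

Since the most recent winner is always checked last, no requester that keeps asserting its request can be starved by the others.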

Network Interface

Each router also has an NI (Network Interface), which is the entry point for flits going in/out of the NoC. The NI has a single AXI4 slave I/F, which is used by the crossbar it is connected to in order to transmit/receive packets from the network. It also has a dedicated I/F to exchange data with the local input module, used to send the packets, and it exports the IRQs (Interrupt Requests) of the router. (Figure: ni) Inside each NI, we have 2x FIFOs (WR/RD) for the outstanding AXI transactions (by default up to 2 are supported, but this can be configured as well) and N FIFOs, one for each virtual channel. As the priority of the VCs is different, the user can tweak the RD_AXI_BFF(x) macro to define the size of each FIFO per VC, so not all these FIFOs will have the same depth.

Write packets

Every time a new AXI write txn arrives at the slave I/F, it is decoded to see whether the address is inside the CSR address space or whether the user wants to send a packet. To send a packet, the user must write to the wr buffer VCs address space, which by default is located at 'h1000+(VC_ID*'h8), where VC_ID is the numeric value of the VC. For instance, if the NoC has 4xVCs and the user wants to send a packet with priority equal to 2, it should be written to the address 'h1010. In the first beat of the AXI burst, the user must follow the encoding mentioned before, with the correct X/Y DEST and PKT_WIDTH; the NoC has no mechanism in place to check these parameters, so they need to be checked before being sent to the AXI I/F. The user can write to the NI AXI I/F through multiple bursts as long as they match the total pkt size (PKT_WIDTH). PKT_WIDTH refers to the number of flits minus 1 that is intended to be sent over the NoC, as in the example below:

User wants to send 1024 bytes over a 32-bit NoC
PKT_WIDTH = 1024 bytes --- /4 ---> 256 flits (each flit has 4 bytes in a 32bit NoC) - 1 (overhead header flit) = 255
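The two calculations above (the per-VC write address and the PKT_WIDTH value) can be sketched in Python. The 'h1000 base and 'h8 stride are the defaults quoted in the text; the helper names wr_buffer_addr and pkt_width_field are hypothetical, and the byte count is assumed to include the header flit, as in the example.

```python
WR_BUFFER_BASE = 0x1000  # default write-buffer base quoted in the text
VC_STRIDE = 0x8          # address step between consecutive VCs


def wr_buffer_addr(vc_id):
    """Address to write to in order to send a packet on the given VC."""
    return WR_BUFFER_BASE + vc_id * VC_STRIDE


def pkt_width_field(total_bytes, flit_data_width=32):
    """PKT_WIDTH for a packet of total_bytes (header flit included): number of flits - 1."""
    flit_bytes = flit_data_width // 8
    if total_bytes <= 0 or total_bytes % flit_bytes:
        raise ValueError("packet size must be a positive multiple of the flit size")
    return total_bytes // flit_bytes - 1


# VC 2 write address and PKT_WIDTH for the 1024-byte example on a 32-bit NoC.
print(hex(wr_buffer_addr(2)), pkt_width_field(1024))
```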

Read packets

Conversely, an AXI read is decoded to see whether the address is inside the CSR address space or whether the user wants to read a packet from the read VC buffers. For reads, all operations must use the address space that by default starts at 'h2000+(VC_ID*'h8), where VC_ID is the numeric value of the VC. Similar to the write example, if the NoC has 4xVCs and the user wants to read a packet with priority equal to 2, it should be read from the address 'h2010. The user can read from the NI AXI I/F through multiple bursts as long as they match the total pkt size.
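Putting the write and read address spaces side by side, the mapping from an address back to a buffer type and VC can be sketched as follows. This Python snippet only recognises the exact per-VC addresses given in the text ('h1000 and 'h2000 bases with a 'h8 step); how the RTL treats other offsets, and where the CSR space sits, is not modelled. The function name classify is hypothetical.

```python
WR_BUFFER_BASE = 0x1000  # default write-buffer base quoted in the text
RD_BUFFER_BASE = 0x2000  # default read-buffer base quoted in the text
VC_STRIDE = 0x8


def classify(addr, n_virt_chn=4):
    """Map an AXI address to ("wr" | "rd", vc_id), or ("other", None) if it matches neither."""
    for kind, base in (("wr", WR_BUFFER_BASE), ("rd", RD_BUFFER_BASE)):
        offset = addr - base
        in_range = 0 <= offset < n_virt_chn * VC_STRIDE
        if in_range and offset % VC_STRIDE == 0:
            return kind, offset // VC_STRIDE
    return "other", None  # e.g. CSR space or an unmapped address


for addr in (0x1010, 0x2010, 0x2020):
    print(hex(addr), classify(addr))
```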

Other info

This project uses CI to run regression tests, lint the RTL and check the editorconfig configuration. RTL linting uses verible (running verible-verilog-lint), and the editorconfig check uses editorconfig-checker. FuseSoC is also supported through a core description file in CAPI2 format.

License

RaveNoC is licensed under the permissive MIT license. Please refer to the LICENSE file for details.

ravenoc's Issues

router top file:Confused about ns_con

Sir, I'm confused by this "con". I guess it's an abbreviation for connection, but what about the ns in front of it? Why only ns and sn, not nw and ne? Can you answer my doubts when you have time? Thank you very much!

router_if ns_con [(NoCCfgSzRows+1)*NoCCfgSzCols] ();
router_if sn_con [(NoCCfgSzRows+1)*NoCCfgSzCols] ();
router_if we_con [NoCCfgSzRows*(NoCCfgSzCols+1)] ();
router_if ew_con [NoCCfgSzRows*(NoCCfgSzCols+1)] ();

Why the project only has axi slave interface

Why does the network interface only need the AXI slave interface? Is it because you only consider connecting to the CPU? If the CPUs not only need to actively initiate writes but also need to be read, should I consider adding an AXI master interface?

Why is IRQ needed?

Hello sir, I am studying the ravenoc project. Thank you for selflessly sharing your work, it has been a valuable learning resource for me. I don't understand what IRQ does, can you please explain a little more? It appears in the axi csr file and in the axi slave interface file and it confuses me.
Why is IRQ needed? Is it a signal to send an interrupt request to the CPU?
What is the meaning of IRQ_Mask and IRQ_Mux?

How to use this repo

Hello,

I'm looking for guidance on which files are essential when using your repository for NOC analysis. While I've successfully generated Docker images for CPU, network I/O, and other components, I'm eager to understand how to utilize this repository more effectively for contributing to the open source community. Your insights would be greatly appreciated.

Thank you, and I'm eagerly awaiting your response.

what does *N-*FIFO mean here?

Hello👋,
I would like to ask: what does *N-*FIFO mean here? Does it mean that each VC has a corresponding buffer in the NI? For example, if there are 4 VCs, are there 4 buffers in the NI?

"Inside each NI, we have 2x FIFOs (WR/RD) for the outstanding AXI transactions (by default it is support up to 2 but can be configured as well) and *N-*FIFOs, one for each virtual channel".

regarding the configurable files

hi
I need to know which files I have to run when using your repo to analyse a NoC. I was also able to build the docker images for the CPU, network I/O and other things but

and also how to use this repo in a better way so that we can contribute to the Open source

thank you
really looking forward for your response

Unable to run regression tests

Hi @aignacio I tried following the steps you mentioned in the section on "Quickstart regression" however I am facing certain issues and warnings after running the tox command for regression. The issue that I am currently facing is that even though the "run_dir" folder is being generated it is empty. I have tried to figure out the issue but I have been unsuccessful this far. For your reference I am attaching the logfile.log file that I generated after running the tox command. I would be very grateful if you could kindly help me resolve this issue.
logfile.log

Design of flit fifo of RaveNoC Router using Block RAM for area optimization in FPGA platform

Hi Anderson,

I was exploring the RaveNoC Router for the implementation in FPGA domain, as I thought that RaveNoC router is more suited for ASIC design purpose. I have attempted to optimize LUT counts through design of Block RAM for fifo. This design will help for Xilinx FPGA implementation. I have made both the design configurable using generate statement.
I am giving the link for the code and the snapshot of synthesis for both the design. Further, I am attaching the simulation waveform for block RAM fifo. Here the Block RAM has 2 clock cycle read latency.
Could you please review the code and the results and let me know your feedback.

Thanks,
Madhumita Mukherjee.

code link: https://github.com/madhumita-mukherjee/RaveNoC-router.git

device_utilization_BRAM_fifo
device_utilization_of_original_Rave_fifo
simulation_of_BRAM_fifo
