yosyshq / picorv32

PicoRV32 - A Size-Optimized RISC-V CPU

License: ISC License

Makefile 5.93% C 19.79% Assembly 21.19% Verilog 45.52% Python 2.38% Tcl 0.65% Shell 2.85% C++ 0.69% Nix 1.01%

picorv32's Issues

jal instruction execution doesn't align with expectation

Hi, Clifford

When I simulate picorv32 in my system, the jal instruction doesn't jump to the expected address.

As you can see from the waveform, when the current PC is 0x512 the CPU fetches a jal instruction; its encoding is 0xbfdd in the objdump:

512: bfdd j 508 <main+0x8>

Since main starts at 0x500, I expect the PC to jump to 0x508. Instead it jumps to 0x1508: the variable decoded_imm_uj is 0xff6, and adding it to 0x512 gives 0x1508. Should we use just the lower 12 bits to generate reg_next_pc?

Could you please have a look at this? Thank you very much!

log.txt is the simulation log with the DEBUG macro defined for picorv32
mem.txt is the memory initialization file

log.txt

mem.txt
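
For reference, the numbers in this report can be reproduced by hand. The following is a minimal Python sketch (not part of the original report) that decodes the offset of a compressed jump using the standard RVC CJ-format bit layout. The helper names `sign_extend` and `cj_offset` are made up for illustration. It shows that 0xff6 is the 12-bit two's-complement encoding of -10, so the expected target is only reached when the immediate is sign-extended before the addition. It does not say where in the RTL or the surrounding system the sign bits were lost.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def cj_offset(insn):
    """Decode the jump offset of a 16-bit C.J / C.JAL instruction (CJ format)."""
    def bit(n):
        return (insn >> n) & 1

    imm = (
        (bit(12) << 11)                # imm[11]
        | (bit(8) << 10)               # imm[10]
        | (((insn >> 9) & 0x3) << 8)   # imm[9:8]
        | (bit(6) << 7)                # imm[7]
        | (bit(7) << 6)                # imm[6]
        | (bit(2) << 5)                # imm[5]
        | (bit(11) << 4)               # imm[4]
        | (((insn >> 3) & 0x7) << 1)   # imm[3:1]
    )
    return sign_extend(imm, 12)


pc = 0x512
insn = 0xBFDD
offset = cj_offset(insn)

print("raw 12-bit field     :", hex(offset & 0xFFF))
print("signed offset        :", offset)
print("target, sign-extended:", hex(pc + offset))
print("target, zero-extended:", hex(pc + (offset & 0xFFF)))
```

The last two lines correspond to the expected address and the address observed in the simulation, respectively.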

riscv32imac

Hi!

I am targeting Alpine Linux.

It has been decided that the UNIX/Linux platform ABIs will mandate the A extension. The latest Linux port has removed kernel support for atomic emulation (the cmpxchg syscall), so A is now mandatory for the musl-riscv port.

Can you see picorv32 supporting the A extension?

Thanks.

decoded_imm_uj may cause confusion

When I first came across decoded_imm_uj, I assumed it was used for both the U-immediate and the J-immediate, but it is only used as the J-immediate:

// Here extract to form J-immediate and signed-extend it
{ decoded_imm_uj[31:20], decoded_imm_uj[10:1], decoded_imm_uj[11], decoded_imm_uj[19:12], decoded_imm_uj[0] } <= $signed({mem_rdata_latched[31:12], 1'b0});

decoded_imm_uj is only used for JAL; the U-immediate is extracted directly from the instruction:

case (1'b1)
  instr_jal:
    decoded_imm <= decoded_imm_uj;
  |{instr_lui, instr_auipc}:
    decoded_imm <= mem_rdata_q[31:12] << 12;

From the above, I suggest that decoded_imm_uj should be decoded_imm_j.
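
To make the difference between the two immediates concrete, here is a small Python sketch (an illustration, not code from the repository) that follows the standard RISC-V J-type and U-type layouts. The function names `j_immediate` and `u_immediate` are hypothetical. The two jump encodings used as test values are taken from the `_fflush_r` disassembly quoted in the LH issue further down this page; the `lui` encoding is a hand-assembled example.

```python
def j_immediate(insn):
    """Sign-extended J-type offset: imm[20|10:1|11|19:12] sits in insn[31:12]."""
    imm = (
        (((insn >> 31) & 0x1) << 20)     # imm[20]
        | (((insn >> 12) & 0xFF) << 12)  # imm[19:12]
        | (((insn >> 20) & 0x1) << 11)   # imm[11]
        | (((insn >> 21) & 0x3FF) << 1)  # imm[10:1]
    )
    return imm - (1 << 21) if imm & (1 << 20) else imm


def u_immediate(insn):
    """U-type immediate: insn[31:12] placed in bits 31:12, low 12 bits zero."""
    return insn & 0xFFFFF000


# 66d0: 188000ef  jal ra,6858 <__sinit>
print(hex(0x66D0 + j_immediate(0x188000EF)))
# 66f0: db9ff06f  j 64a8 <__sflush_r>
print(hex(0x66F0 + j_immediate(0xDB9FF06F)))
# lui t0, 0x10 (hand-assembled example, not from the thread)
print(hex(u_immediate(0x000102B7)))
```

The J-type immediate needs bit shuffling and sign extension, while the U-type immediate is a plain mask, which matches the observation that only JAL goes through decoded_imm_uj.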

can it work on spartan7 xc7s50?

Xilinx ISE 14.7 synthesis

A few changes are needed for successful synthesis by the legacy Xilinx ISE toolchain:

xst incorrectly implements the register file with ENABLE_REGS_DUALPORT = 1, but ENABLE_REGS_DUALPORT = 0 works fine
xst doesn't like memory in an always @* combinational block
xst has some problems with parameterized macros

I did look at issues #2 and #25, and I'm a little puzzled as to how synthesis even completed, unless they made the same changes to the RTL as I did. I haven't seen problems with the PC register once the design is in runnable shape.

Attached is an example project targeting the Spartan 3E Starter board. It contains a picorv32 core with the plain memory interface, some block RAM, a UART, and some test software in C.

I'm just putting this out there in the hope it will make picorv32 more useful for older Spartan-6 or Spartan-3E designs; the main repository probably doesn't want to deal with irksome `ifdef OLD_XILINX stuff. I have some Spartan-6 hardware for which I'd like to use picorv32 (currently using picoblaze).

picorv32-Xilinx-ISE.tar.gz

io constraints

Hi Clifford,
I am trying to implement this block and run STA on it using the open-source tools qflow and OpenTimer. Could you tell me which IO constraints need to be used? Any pointer or PDF would help. I am using a target clock frequency of 400 MHz to start with.
Thanks

dhrystone wrong

Hi,
I used the provided dhrystone benchmark. Because I don't have iverilog, I switched to the VCS tool. But after running it, when I check the resulting waveform, it simply stops reading data at 18382 cycles, and the result computed as 100*10^6/(1757*18382) is too large. Do you know how to get the correct Dhrystone result?

asm call needs volatile

https://github.com/cliffordwolf/picorv32/blob/31588b871e0258f28a6daa5a34a6ad53ab11f267/dhrystone/stdlib.c#L29

Probably not an issue in your use case, but I had this piece of code compiled away because of the missing volatile:

unsigned int start_time = time();
unsigned int elapsed_time = start_time;
while (elapsed_time < MS_TO_CYCLES(100)){
    elapsed_time = time() - start_time;
}

volatile is also missing here:
https://github.com/cliffordwolf/picorv32/blob/31588b871e0258f28a6daa5a34a6ad53ab11f267/firmware/stats.c#L31

placement fails on iCeCube2

When I run picorv32 through iCEcube2 (Windows), it fails in the placer with the error below. I suspect it's a multiple-assignment issue on decoder_trigger and decoder_pseudo_trigger, or Synplify optimizing that bit away due to the
decoder_pseudo_trigger <= 0;
assignment. I'll try to get a simulation going to debug this further.

Design Statistics after Packing
Number of LUTs : 1545
Number of DFFs : 608
Number of DFFs packed to IO : 0
Number of Carrys : 233
Device Utilization Summary after Packing
Sequential LogicCells
LUT and DFF : 502
LUT, DFF and CARRY : 106
Combinational LogicCells
Only LUT : 872
CARRY Only : 62
LUT with CARRY : 65
LogicCells : 1607/7680
PLBs : 211/960
BRAMs : 8/32
IOs and GBIOs : 9/206
PLLs : 0/2
I2088: Phase 3, elapsed time : 2.3 (sec)
Phase 4
I2712: Tool unable to find location for GB cpu.decoder_trigger_RNIV293
Error during global Buffer placement

Make fails

Hi, when I try to run make test or any other make command, it produces the following errors:
picorv32.v:1878: error: invalid module item.
picorv32.v:1879: syntax error
picorv32.v:1879: error: Invalid module instantiation
picorv32.v:1880: error: Invalid module instantiation
picorv32.v:1881: error: Invalid module instantiation
picorv32.v:1885: error: invalid module item.
picorv32.v:1887: syntax error
Can you give guidance on how to correct the above issue?

removed

Tagged release of picorv32

I'm considering updating the picorv32 core for the FuseSoC standard library, but it would be great to have a tagged release that I can use

very large clocked process makes it impossible to bring out non-clocked signals

In the current code, there are some huge clocked always clauses, like the one that starts on line 1168.

There is nothing fundamentally wrong with this as long as you don't need to bring out anything combinationally.

The problem is that this is what I'd like to do. :-)

The immediate issue is that, for one reason or the other, cpuregs is not detected by Quartus as a memory block, so it consumes tons of regular registers. (From your documentation, it seems that Vivado is detecting this just fine.)

I'd like to experiment by replacing constructs like this:
cpuregs[latched_rd] <= reg_pc + (latched_compr ? 2 : 4);

to
cpuregs_wr = 1'b1;
cpuregs_addr = latched_rd;
cpuregs_wdata = reg_pc + (latched_compr ? 2 : 4);

and then have 1 place with a single cpuregs[cpuregs_addr] ... clause.

If that still doesn't do the trick, I'd even instantiate an Altera memory macro to force a memory block.

With the current clocked always block, such an experiment requires a full-out rewrite of the whole always block instead of a small surgical patch.

Would you be open to converting this clock process into 2 processes, a pure combinational one and a clock one that just contains the register? I'm willing to do the work if you just tell me the naming convention. My standard convention is:

always @(*)
<var_name>_nxt = ....

always @(posedge clk)
<var_name> <= <var_name>_nxt;

But I'm willing to use whatever way you prefer.

(I'm NOT asking to make the change for the cpuregs isolation itself; that could be a forked branch on my side if you think it makes the code look too ugly.)

In addition to cpuregs experiments, such a change would also make it possible to later insert logic to scan out register contents via jtag etc.

synthesis mismatch with the latest rtl on xilinx spartan6 fpga

Hi Clifford

I tried targeting picorv32 to a Spartan-6 (xc6slx25) board. The RTL simulation is OK, but the FPGA didn't work. I then ran an experiment with the firmware you provide, and it seems that the post-synthesis netlist has a problem with the PC register: the PC advances by 0xC rather than 0x4 at a point where I believe there is no jump instruction.

Could you help to have a look at this, Thank you very much!

In the waveform bad_wave_netlist_sim, the 62nd mem_axi_araddr (counting from 1) is 0x780 while the previous one is 0x774; in the good waveform it is 0x778. This is the mismatch point.

The picorv32_netlist.txt is the post synthesis netlist.

The testbench, firmware.hex and RTL are the latest, at commit ef86b30.

picorv32_netlist.txt

scall instruction causes trap instead of interrupt

I'm working on porting an RTOS to picorv32. scall doesn't seem to be implemented even though it's part of the user-level ISA. Is there a reason for this? I think this would fix the problem:

instr_sbreak <= !CATCH_ILLINSN && ((mem_rdata_q[6:0] == 7'b1110011 && (mem_rdata_q[31:7] == 'b0000000000010000000000000) || mem_rdata_q[31:7] == 'b0000000000000000000000000)) ||
                    (COMPRESSED_ISA && mem_rdata_q[15:0] == 16'h9002));
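
As a cross-check of the two bit patterns in the proposed expression, here is a short Python sketch (not from the original issue) that classifies a 32-bit word the same way: SYSTEM opcode in bits 6:0, then either all zeros in bits 31:7 or a single 1 at instruction bit 20. In current versions of the RISC-V specification these two encodings are named ecall and ebreak (formerly scall and sbreak). The function name `classify_system_insn` is hypothetical.

```python
SYSTEM_OPCODE = 0b1110011


def classify_system_insn(insn):
    """Return 'ecall', 'ebreak', or None for a 32-bit instruction word."""
    if insn & 0x7F != SYSTEM_OPCODE:
        return None
    upper = insn >> 7          # corresponds to mem_rdata_q[31:7]
    if upper == 0:
        return "ecall"         # 0x00000073
    if upper == 1 << 13:       # instruction bit 20 set, nothing else
        return "ebreak"        # 0x00100073
    return None


for word in (0x00000073, 0x00100073, 0x00C59783):
    print(hex(word), classify_system_insn(word))
```

With the expression as proposed, both encodings would set instr_sbreak, so software would still have to tell the two apart itself, for example by reading the instruction word at the trapping PC.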

Question about default 'starting' location for instruction memory requests after reset

The privileged spec states that the reset vector is address 0x200, and when doing a bare-metal compile that's also where the first instruction is located. Currently after your processor goes out of reset it starts querying for instructions at address 0. Is this intentional?

IceStorm compatibility?

Heya!

Wanted to reach out and see if this would be compatible with a BlackIce board? I'd love to use the IceStorm tools to get this running!

Thank you!

PicoRV32 Documents

Hi, I am interested in learning PicoRV32. Could you add documentation on its architecture and pipeline, and a datasheet?

How to implement register file to block RAM manually?

My toolchain doesn't support implementing the register file as dedicated memory. How can I implement it manually to reduce the core size?

Will there be a PicoRV64?

I just think it would be cool if an RV64G design (Rocket?) could fit onto one of the ICE40 FPGAs so there was a relatively easy and open source way to hack on these. I think softfp would actually be fine for porting most software.

I just wanted to talk about this, really. It's not an issue per se.

Incorrect handling of SBREAK/ILLNSN with PCPI

I'm trying to put a minimal example of this together (or a patch if I get there quicker), but won't get a chance until tomorrow. I leave this here for now as a test of my own sanity.

If PCPI is enabled and an illegal instruction or sbreak occurs, then upon retirq the signal pcpi_valid remains asserted. The eventual effect is that when the next valid PCPI instruction comes along, it causes another SBREAK IRQ, presumably because the PCPI timeout never got reset.

I actually want to handle ecall, and I guess I could do it with another PCPI extension, but that seems like a less graceful solution.

Potential power savings

There are several cases in the code with constructs like this:

cpu_state_ld_rs1: begin 
    ...        
    reg_op2 <= 'bx;

There are similar constructs for reg_out and div.pcpi_rd.

I assume that this construct exists to avoid the area of the feedback mux?

I'm wondering if the synthesis tools are smart enough to insert a clock gate for those registers instead of clocking out random data at each clock cycle.

IMO, instead of hoping that this would be the case, the more conservative way would be to remove the x assignment altogether. Feedback muxes don't really exist anymore and have been replaced by clock gates anyway, so the only extra logic cost would be the one to control the clock gate.

Also, a clock gate on alu_out_q would be nice as well.

MMU and privilege support?

Is there any interest and/or timeline for adding a privileged mode and an MMU?

Interpreting mem_wstrb

Hi,

I'm trying to understand the PicoRV32 native memory interface. How should mem_wstrb be interpreted in the != 0 case?
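
One way to read it, based on how the testbench quoted later on this page handles the signal: when mem_wstrb is zero the transfer is a read, and otherwise each set bit enables a write of the corresponding byte lane of mem_wdata. The Python sketch below models that behaviour for a single 32-bit memory word. It is an illustration of that testbench logic, not a statement of the official interface definition, and `apply_wstrb` is a made-up name.

```python
def apply_wstrb(old_word, wdata, wstrb):
    """Return the memory word after a write with per-byte strobes.

    Bit i of wstrb enables byte lane i (bits 8*i+7 .. 8*i) of wdata.
    wstrb == 0 leaves the word unchanged (the transfer is a read).
    """
    result = old_word
    for lane in range(4):
        if (wstrb >> lane) & 1:
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (wdata & mask)
    return result & 0xFFFFFFFF


old = 0x11223344
new = 0xAABBCCDD
for strobe in (0b0000, 0b0001, 0b1100, 0b1111):
    print(format(strobe, "04b"), hex(apply_wstrb(old, new, strobe)))
```

In this model a byte store would use one strobe bit, a halfword store two adjacent bits, and a word store all four.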

Is there no JTAG module? Can we only load the programs into spi-flash to run them?

Default picosoc register file doesn't support q regs

Hi.

Today I discovered that the default PicoSoc configuration doesn't seem to support the q registers, which are used for saving registers in IRQ handlers.

The test case that I used was a basic IRQ handler in ASM. In my test case, this worked fine:

irq_vec:
    picorv32_retirq_insn()

.. but this didn’t:

irq_vec:
    // backup x10/x11 in q2/3
    picorv32_setq_insn(q2, x10)
    picorv32_setq_insn(q3, x11)

    // modify X10/X11
    addi x10, zero, 0
    addi x11, zero, 0

    // restore x10 and x11 from Q registers
    picorv32_getq_insn(x10, q2)
    picorv32_getq_insn(x11, q3)

    // return from IRQ
    picorv32_retirq_insn()

It turned out that the issue was in the default register bank provided by PicoSoc, and the fact that picorv32 uses registers r32-r35 for the Q registers. The default register bank truncates addresses to 5 bits, so the q registers don't work.

module picosoc_regs (
    input clk, wen,
    input [5:0] waddr,
    input [5:0] raddr1,    
    input [5:0] raddr2,
    input [31:0] wdata,
    output [31:0] rdata1,
    output [31:0] rdata2
);

reg [31:0] regs [0:31];
always @(posedge clk)
    if (wen) regs[waddr[4:0]] <= wdata;  // <---- address truncated

    assign rdata1 = regs[raddr1[4:0]];   // <---- address truncated
    assign rdata2 = regs[raddr2[4:0]];   // <---- address truncated
endmodule

I've got a pull request ready to go - I'll send it through for review shortly.
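
The aliasing can be illustrated with a small Python model of the register bank (a sketch for this thread, not the pull request itself). It assumes q0..q3 map to register indices 32..35, as described above; the class name `RegBank` and its `index_bits` parameter are hypothetical.

```python
Q_BASE = 32  # assumption from the issue text: q0..q3 are r32..r35


class RegBank:
    """Register bank that keeps only the low `index_bits` address bits."""

    def __init__(self, index_bits):
        self.mask = (1 << index_bits) - 1
        self.regs = {}

    def write(self, addr, data):
        self.regs[addr & self.mask] = data

    def read(self, addr):
        return self.regs.get(addr & self.mask, 0)


for bits in (5, 6):
    bank = RegBank(bits)
    bank.write(2, 0x1000)          # x2 holds some live value
    bank.write(10, 0xCAFE)         # x10 holds the value to back up
    bank.write(Q_BASE + 2, bank.read(10))  # setq q2, x10
    print(bits, "index bits: x2 =", hex(bank.read(2)))
```

With 5 index bits the write to q2 lands on x2 and destroys it; with 6 index bits x2 keeps its value.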

Unable to build testbench waveform in ubuntu

I tried building testbench.vcd on Ubuntu 12.04 with Vivado and got the error below.

prashantravi@ubuntu:/picorv32/picorv32$ make testbench.vcd
iverilog -o testbench.exe testbench.v picorv32.v
picorv32.v:41: syntax error
I give up.
make: *** [testbench.exe] Error 2
prashantravi@ubuntu:/picorv32/picorv32$

Please help

Quartus: Verilog HDL error at picorv32.v(2299): constant value overflow

Both Quartus II Version 13.1.4 and Quartus Prime Version 17.0.0 give me the warning

Warning (10259): Verilog HDL error at picorv32.v(2299): constant value overflow

for picorv32 version c9de800 ('Remove some trailing whitespace').

Reset coding style

I just got started building my own test bench for picorv32.v and noticed that there are 'bx values on the output ports of picorv32 after the reset. Would it be good coding style to add initial values for the FF output ports?

LH instruction causes MISALIGNED HALFWORD error

I am trying to run the picorv32 core with somewhat more serious software and hit the issue below:

DECODE: 0x000066d8 0x00c59783 lh
LD_RS1: 11 0x00000113
MISALIGNED HALFWORD: 0x0000011f
MISALIGNED HALFWORD: 0x0000011f
MISALIGNED HALFWORD: 0x0000011f
MISALIGNED HALFWORD: 0x0000011f
TRAP after 1545 clock cycles
ERROR!

The code at address 0x66d8 is from the riscv-newlib fflush() function (toolchain compiled as rv32i).
000066b0 <_fflush_r>:
66b0: fe010113 addi sp,sp,-32
66b4: 00812c23 sw s0,24(sp)
66b8: 00112e23 sw ra,28(sp)
66bc: 00050413 mv s0,a0
66c0: 00050c63 beqz a0,66d8 <_fflush_r+0x28>
66c4: 03852783 lw a5,56(a0)
66c8: 00079863 bnez a5,66d8 <_fflush_r+0x28>
66cc: 00b12623 sw a1,12(sp)
66d0: 188000ef jal ra,6858 <__sinit>
66d4: 00c12583 lw a1,12(sp)
66d8: 00c59783 lh a5,12(a1)
66dc: 00078c63 beqz a5,66f4 <_fflush_r+0x44>
66e0: 00040513 mv a0,s0
66e4: 01812403 lw s0,24(sp)
66e8: 01c12083 lw ra,28(sp)
66ec: 02010113 addi sp,sp,32
66f0: db9ff06f j 64a8 <__sflush_r>
66f4: 01c12083 lw ra,28(sp)
66f8: 01812403 lw s0,24(sp)
66fc: 00000513 li a0,0
6700: 02010113 addi sp,sp,32
6704: 00008067 ret

Any idea?

Thanks,
Yanghao Hua
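
The address in the log can be reproduced from the instruction word and the LD_RS1 line. The Python sketch below (an illustration added for this thread, with a hypothetical helper name `decode_i_type`) decodes the I-type fields of 0x00c59783 and adds the immediate to the logged value of register 11. It only shows why the core flags the access; it does not explain why a1 holds an odd pointer at that point.

```python
def decode_i_type(insn):
    """Split an I-type instruction into (rd, rs1, funct3, signed imm)."""
    rd = (insn >> 7) & 0x1F
    funct3 = (insn >> 12) & 0x7
    rs1 = (insn >> 15) & 0x1F
    imm = (insn >> 20) & 0xFFF
    if imm & 0x800:            # sign-extend the 12-bit immediate
        imm -= 0x1000
    return rd, rs1, funct3, imm


rd, rs1, funct3, imm = decode_i_type(0x00C59783)
rs1_value = 0x00000113         # from the log line "LD_RS1: 11 0x00000113"
address = (rs1_value + imm) & 0xFFFFFFFF

print("rd = x%d, rs1 = x%d, funct3 = %d, imm = %d" % (rd, rs1, funct3, imm))
print("effective address:", hex(address))
print("halfword aligned :", address % 2 == 0)
```

The effective address matches the 0x0000011f in the MISALIGNED HALFWORD messages, and since it is odd, a halfword load from it is not naturally aligned.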

spiflash_tb fails

Running make spiflash_tb in picosoc gives me

iverilog -s testbench -o spiflash_tb.vvp spiflash.v spiflash_tb.v
riscv32-unknown-elf-gcc -march=rv32imc -Wl,-Bstatic,-T,sections.lds,--strip-debug -ffreestanding -nostdlib -o firmware.elf start.s firmware.c
riscv32-unknown-elf-objcopy -O verilog firmware.elf /dev/stdout | sed -e '1 s/@00000000/@00100000/; 2,65537 d;' > firmware.hex
vvp -N spiflash_tb.vvp
VCD info: dumpfile spiflash_tb.vcd opened for output.

Reset (FFh)
-- BEGIN
-- SPI SDR ff 00
-- END

Power Up (ABh)
-- BEGIN
-- SPI SDR ab 00
-- END

Read Data (03h)
-- BEGIN
-- SPI SDR 03 00
-- SPI SDR 10 03
-- SPI SDR 00 10
-- SPI SDR 00 00
-- SPI SDR 00 93
-- SPI SDR 00 00
ERROR: Got 00 (00000000) but expected 02 (00000010).
-- SPI SDR 00 00
ERROR: Got 00 (00000000) but expected 30 (00110000).
-- SPI SDR 00 00
ERROR: Got 00 (00000000) but expected 01 (00000001).
-- SPI SDR 00 93
ERROR: Got 93 (10010011) but expected 23 (00100011).
-- SPI SDR 00 01
ERROR: Got 01 (00000001) but expected 22 (00100010).
-- SPI SDR 00 00
ERROR: Got 00 (00000000) but expected 50 (01010000).
-- SPI SDR 00 00
-- END

Quad I/O Read (EBh)
-- BEGIN
-- SPI SDR eb 00
-- QSPI SDR 10 --
-- QSPI SDR 00 --
-- QSPI SDR 00 --
-- QSPI SDR a5 --
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 93 (10010011).
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 02 (00000010).
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 30 (00110000).
-- QSPI SDR -- z9
ERROR: Got z9 (zzzz1001) but expected 01 (00000001).
-- QSPI SDR -- 30
ERROR: Got 30 (00110000) but expected 23 (00100011).
-- QSPI SDR -- 00
ERROR: Got 00 (00000000) but expected 22 (00100010).
-- QSPI SDR -- 00
ERROR: Got 00 (00000000) but expected 50 (01010000).
-- QSPI SDR -- 09
ERROR: Got 09 (00001001) but expected 00 (00000000).
-- END

Continous Quad I/O Read
-- BEGIN
-- QSPI SDR 10 --
-- QSPI SDR 00 --
-- QSPI SDR 00 --
-- QSPI SDR ff --
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 93 (10010011).
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 02 (00000010).
-- QSPI SDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 30 (00110000).
-- QSPI SDR -- z9
ERROR: Got z9 (zzzz1001) but expected 01 (00000001).
-- QSPI SDR -- 30
ERROR: Got 30 (00110000) but expected 23 (00100011).
-- QSPI SDR -- 00
ERROR: Got 00 (00000000) but expected 22 (00100010).
-- QSPI SDR -- 00
ERROR: Got 00 (00000000) but expected 50 (01010000).
-- QSPI SDR -- 09
ERROR: Got 09 (00001001) but expected 00 (00000000).
-- END

DDR Quad I/O Read (EDh)
-- BEGIN
-- SPI SDR ed 00
-- QSPI DDR 10 --
-- QSPI DDR 00 --
-- QSPI DDR 00 --
-- QSPI DDR a5 --
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 93 (10010011).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 02 (00000010).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 30 (00110000).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 01 (00000001).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 23 (00100011).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 22 (00100010).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 50 (01010000).
-- QSPI DDR -- 93
ERROR: Got 93 (10010011) but expected 00 (00000000).
-- END

Continous DDR Quad I/O Read
-- BEGIN
-- QSPI DDR 10 --
-- QSPI DDR 00 --
-- QSPI DDR 00 --
-- QSPI DDR ff --
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 93 (10010011).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 02 (00000010).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 30 (00110000).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 01 (00000001).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 23 (00100011).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 22 (00100010).
-- QSPI DDR -- zz
ERROR: Got zz (zzzzzzzz) but expected 50 (01010000).
-- QSPI DDR -- 93
ERROR: Got 93 (10010011) but expected 00 (00000000).
-- END

FAIL
make: *** [Makefile:50: spiflash_tb] Error 1

problems about the docs

hi,
While studying picorv32 I ran into some points that the documentation doesn't cover. Where can I find descriptions of these, or ask for help?

About the interrupt vector table: from the disassembled code of the example, it seems that RAM addresses 0-F are all used for the reset handler. How should I arrange the vectors, or is it the same as in the original RISC-V?
Is there a debug unit (I mean support for breakpoints, single-stepping, ...)? It would be important for FPGA debugging, but I cannot find one in the README.
Is there a detailed PCPI description? I cannot find the modules for the PCPI functions in the README, and the PCPI interface keeps toggling even when the PCPI parameter is disabled.

thanks.

Possible waitirq stall

I've been debugging some IRQ-related stuff I'm doing, and although this issue isn't related to that, staring at this stuff did make me wonder.

The waitirq instruction will sit and wait for an unmasked interrupt to take place, and then store the pending interrupt list into rd, so you can see what interrupts were serviced upon progressing.

However, waitirq is dependent on irq_pending, which has the IRQ mask applied to it (obviously).

But what happens if:

IRQ is raised on a masked input.
I execute maskirq to unmask it.
My ISR is run.
I then execute waitirq.

Perhaps the interrupt I was expecting got serviced between maskirq and waitirq, which could happen if it was raised any time before waitirq gets executed, right?

Is there a defect in my reasoning here, or would a stall be possible? If this sounds sane, I can try to write a simulation example that causes it.

Issue while Building the RV32I Toolchain

Hi,
I was trying to build the RV32I toolchain and got an error at this step ../configure --with-arch=RV32I --prefix=/opt/riscv32i:

checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for grep that handles long lines and -e... /bin/grep
checking for fgrep... /bin/grep -F
checking for grep that handles long lines and -e... (cached) /bin/grep
checking for bash... /bin/bash
checking for __gmpz_init in -lgmp... yes
checking for mpfr_init in -lmpfr... yes
checking for mpc_init2 in -lmpc... yes
checking for curl... /usr/share/centrifydc/bin/curl
checking for wget... /usr/bin/wget
checking for ftp... /usr/bin/ftp
configure: error: Unknown arch

Did anyone else also face this problem?

Synthesis simulation mismatch, when targeted for xc6slx9-2tqg144 using XILINX ISE 14.7

Hi Clifford

I tried targeting picorv32 to a Spartan-6 board (Papilio Pro). Unfortunately the port didn't run on the target. On further investigation, I noticed that while behavioral simulation worked, none of the post-translate, post-map, or post-place-and-route simulations worked either. They seem to generate a trap after a few cycles! The result seems to be the same for both FAST_MEMRY=0 and 1!

Any pointers to debug this?

Is there a GDB stub or something similar that can be used to debug the firmware, for example over a serial port?

csmith Fatal error

Hi, when I try to run csmith and torture from the scripts folder, I get the following fatal errors. Can you suggest how to resolve these issues?
csmith error

---------------- 1 (1) ----------------
rm -f test.hex test.elf test.c test_ref test.ld output_ref.txt output_sim.txt
make spike test.hex
make[1]: Entering directory `/home/krradhak/picorv32/picorv32/scripts/csmith'
echo "integer size = 4" > platform.info
echo "pointer size = 4" >> platform.info
csmith --no-packed-struct -o test.c
gawk '/Seed:/ {print$2,$3;}' test.c
Seed: 1483867854
gcc -m32 -o test_ref -w -Os -I /home/krradhak/tools/csmith/src/csmith-AbsExtension.o test.c
test.c:10:20: fatal error: csmith.h: No such file or directory
 #include "csmith.h"
                    ^
compilation terminated.
make[1]: *** [test_ref] Error 1
make[1]: Leaving directory `/home/krradhak/picorv32/picorv32/scripts/csmith'
SKIP

Torture error

bash test.sh
+ test -f config.vh
+ test -f test.S
++ sed '/march=/ ! d; s,^// ,-,;' config.vh
+ riscv32-unknown-elf-gcc -m32 -march=RV32IMC -ffreestanding -nostdlib -Wl,-Bstatic,-T,sections.lds -o test.elf test.S
test.sh: line 22: riscv32-unknown-elf-gcc: command not found
make: *** [test] Error 127

Question: Runtime code modification: Is it possible?

It's a crazy idea (to which the answer is probably 'no'), but I wanted to ask if it's possible to emit machine code at runtime (either in flash or SRAM)?

Sorry for the stupid question. I'm not looking for a complete example, I just want to know if it's possible. I know that picorv32 does not implement the RISC-V privileged mode and that's why I'm asking.

riscv-gcc commit does not exist?

Hi Clifford -

Tried running your makefile / build scripts for the RISC-V GCC toolchain and got the following error on two computers:

Cloning into 'riscv-gcc'...
Checking connectivity... done.
fatal: reference is not a tree: 4fb4d8f9e9ac8a28d6ea5117688eadbcd0f7978e
Unable to checkout '4fb4d8f9e9ac8a28d6ea5117688eadbcd0f7978e' in submodule path 'riscv-gcc'
make[1]: *** [build-riscv32im-tools-bh] Error 1
make[1]: Leaving directory `/home/drichmond/Research/repositories/git/picorv32'
make: *** [build-riscv32im-tools] Error 2

Perhaps the commit tree changed recently? I was able to compile this code using your instructions several days ago.

custom0 opcode not defined in riscv-gnu-toolchain

How is the custom0 opcode defined for RV32 toolchain?

I am trying to work with your IRQ examples in the firmware directory.

When I compile with the riscv32im toolchain I am getting unrecognized opcode custom0 errors:

start.S: Assembler messages:
start.S:22: Error: unrecognized opcode `custom0 2,x1,0,1'
start.S:23: Error: unrecognized opcode `custom0 3,x2,0,1'
start.S:28: Error: unrecognized opcode `custom0 x2,0,0,0'
start.S:31: Error: unrecognized opcode `custom0 x2,2,0,0'
start.S:34: Error: unrecognized opcode `custom0 x2,3,0,0'
start.S:77: Error: unrecognized opcode `custom0 a1,1,0,0'
start.S:88: Error: unrecognized opcode `custom0 0,x2,0,1'
start.S:91: Error: unrecognized opcode `custom0 1,x2,0,1'
start.S:94: Error: unrecognized opcode `custom0 2,x2,0,1'
start.S:126: Error: unrecognized opcode `custom0 x1,1,0,0'
start.S:127: Error: unrecognized opcode `custom0 x2,2,0,0'
start.S:129: Error: unrecognized opcode `custom0 0,0,0,2'

Does the custom0 opcode need to be patched into the binutils riscv-opc.c file?
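
One workaround that does not depend on assembler support is to emit the instruction words directly with `.word`. The Python sketch below (not from the original issue) packs a standard RISC-V R-type word using the custom-0 major opcode 0001011. Two things in it are assumptions: that the four operands of `custom0 2,x1,0,1` are rd, rs1, rs2 and funct7 in that order, and that funct3 can be left at zero. Check both against the PicoRV32 README before relying on the output. The function name `encode_r_type` is hypothetical.

```python
CUSTOM0_OPCODE = 0b0001011  # RISC-V custom-0 major opcode


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM0_OPCODE):
    """Pack R-type fields into a 32-bit instruction word."""
    fields = [(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value %d does not fit in %d bits" % (value, width))
        word = (word << width) | value
    return word


# Assumed reading of "custom0 2,x1,0,1": rd=2, rs1=x1, rs2=0, funct7=1
word = encode_r_type(funct7=1, rs2=0, rs1=1, funct3=0, rd=2)
print(".word 0x%08x" % word)
```

The printed directive can be pasted into an assembly file or wrapped in a preprocessor macro, so the source assembles with an unpatched toolchain.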

Write Address and Data undefined with simple hello world program

I'm attempting to write a simple hello world function to run on picorv32 in sim. I'm running in the scripts cxxdemo folder. I can run make test just fine, but when I try to write a custom Hello World example I get an undefined write address and data: WR: ADDR=xxxxxxxX DATA=xxxxxxxx MASK=1111

hello.cc

 #include <stdio.h>
 #include <iostream>
 #include <vector>
 #include <algorithm>

 int main() {

	printf("Hello World!\n");

	return 0;
 }

Makefile (modified from the cxxdemo Makefile)

RISCV_TOOLS_PREFIX = /opt/riscv32ic/bin/riscv32-unknown-elf-
CXX = $(RISCV_TOOLS_PREFIX)g++
CC = $(RISCV_TOOLS_PREFIX)gcc
AS = $(RISCV_TOOLS_PREFIX)gcc
CXXFLAGS = -MD -Os -Wall -std=c++11
CCFLAGS = -MD -Os -Wall -std=c++11
LDFLAGS = -Wl,--gc-sections
LDLIBS = -lstdc++

test: testbench.vvp firmware32.hex
	vvp -N testbench.vvp

hello: testbench_hello.vvp hello32.hex
	vvp -N testbench_hello.vvp

testbench.vvp: testbench.v ../../picorv32.v
	iverilog -o testbench.vvp testbench.v ../../picorv32.v
	chmod -x testbench.vvp

testbench_hello.vvp: testbench_hello.v ../../picorv32.v
	iverilog -o testbench_hello.vvp testbench_hello.v ../../picorv32.v
	chmod -x testbench_hello.vvp

firmware32.hex: firmware.elf start.elf hex8tohex32.py
	$(RISCV_TOOLS_PREFIX)objcopy -O verilog start.elf start.tmp
	$(RISCV_TOOLS_PREFIX)objcopy -O verilog firmware.elf firmware.tmp
	cat start.tmp firmware.tmp > firmware.hex
	python3 hex8tohex32.py firmware.hex > firmware32.hex
	rm -f start.tmp firmware.tmp

hello32.hex: hello.elf
	$(RISCV_TOOLS_PREFIX)objcopy -O verilog hello.elf hello.tmp
	cat hello.tmp > hello.hex
	python3 hex8tohex32.py hello.hex > hello32.hex
	rm -f hello.tmp

firmware.elf: firmware.o syscalls.o
	$(CC) $(LDFLAGS) -o $@ $^ -T ../../firmware/riscv.ld $(LDLIBS)
	chmod -x firmware.elf

hello.elf: hello.o
	$(CC) $(LDFLAGS) -o $@ $^ -T ../../firmware/riscv.ld $(LDLIBS)
	chmod -x hello.elf

start.elf: start.S start.ld
	$(CC) -nostdlib -o start.elf start.S -T start.ld $(LDLIBS)
	chmod -x start.elf

clean:
	rm -f *.o *.d *.tmp start.elf
	rm -f firmware.elf firmware.hex firmware32.hex
	rm -f hello.elf hello.hex hello32.hex
	rm -f testbench.vvp testbench.vcd
	rm -f testbench_hello.vvp

-include *.d
.PHONY: test clean

testbench_hello.v (modified from testbench.v)

`timescale 1 ns / 1 ps
//`undef VERBOSE_MEM
`define VERBOSE_MEM
//`undef WRITE_VCD
`define WRITE_VCD
`undef MEM8BIT

module testbench_hello;
	reg clk = 1;
	reg resetn = 0;
	wire trap;

	always #5 clk = ~clk;

	initial begin
		repeat (100) @(posedge clk);
		resetn <= 1;
	end

	wire mem_valid;
	wire mem_instr;
	reg mem_ready;
	wire [31:0] mem_addr;
	wire [31:0] mem_wdata;
	wire [3:0] mem_wstrb;
	reg  [31:0] mem_rdata;

	picorv32 #(
		.COMPRESSED_ISA(1)
	) uut (
		.clk         (clk        ),
		.resetn      (resetn     ),
		.trap        (trap       ),
		.mem_valid   (mem_valid  ),
		.mem_instr   (mem_instr  ),
		.mem_ready   (mem_ready  ),
		.mem_addr    (mem_addr   ),
		.mem_wdata   (mem_wdata  ),
		.mem_wstrb   (mem_wstrb  ),
		.mem_rdata   (mem_rdata  )
	);

	localparam MEM_SIZE = 4*1024*1024;
`ifdef MEM8BIT
	reg [7:0] memory [0:MEM_SIZE-1];
	initial $readmemh("hello.hex", memory);
`else
	reg [31:0] memory [0:MEM_SIZE/4-1];
	initial $readmemh("hello32.hex", memory);
`endif

	always @(posedge clk) begin
		mem_ready <= 0;
		if (mem_valid && !mem_ready) begin
			mem_ready <= 1;
			mem_rdata <= 'bx;
			case (1)
				mem_addr < MEM_SIZE: begin
`ifdef MEM8BIT
					if (|mem_wstrb) begin
						if (mem_wstrb[0]) memory[mem_addr + 0] <= mem_wdata[ 7: 0];
						if (mem_wstrb[1]) memory[mem_addr + 1] <= mem_wdata[15: 8];
						if (mem_wstrb[2]) memory[mem_addr + 2] <= mem_wdata[23:16];
						if (mem_wstrb[3]) memory[mem_addr + 3] <= mem_wdata[31:24];
					end else begin
						mem_rdata <= {memory[mem_addr+3], memory[mem_addr+2], memory[mem_addr+1], memory[mem_addr]};
					end
`else
					if (|mem_wstrb) begin
						if (mem_wstrb[0]) memory[mem_addr >> 2][ 7: 0] <= mem_wdata[ 7: 0];
						if (mem_wstrb[1]) memory[mem_addr >> 2][15: 8] <= mem_wdata[15: 8];
						if (mem_wstrb[2]) memory[mem_addr >> 2][23:16] <= mem_wdata[23:16];
						if (mem_wstrb[3]) memory[mem_addr >> 2][31:24] <= mem_wdata[31:24];
					end else begin
						mem_rdata <= memory[mem_addr >> 2];
					end
`endif
				end
				mem_addr == 32'h 1000_0000: begin
					$write("%c", mem_wdata[7:0]);
				end
			endcase
		end
		if (mem_valid && mem_ready) begin
`ifdef VERBOSE_MEM
			if (|mem_wstrb)
				$display("WR: ADDR=%x DATA=%x MASK=%b", mem_addr, mem_wdata, mem_wstrb);
			else
				$display("RD: ADDR=%x DATA=%x%s", mem_addr, mem_rdata, mem_instr ? " INSN" : "");
`endif
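			// Use case equality (===) throughout: a plain == against 1'bx evaluates to x, so the check never fires.
			if (^mem_addr === 1'bx ||
					(mem_wstrb[0] && ^mem_wdata[ 7: 0] === 1'bx) ||
					(mem_wstrb[1] && ^mem_wdata[15: 8] === 1'bx) ||
					(mem_wstrb[2] && ^mem_wdata[23:16] === 1'bx) ||
					(mem_wstrb[3] && ^mem_wdata[31:24] === 1'bx)) begin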
				$display("CRITICAL UNDEF MEM TRANSACTION");
				$finish;
			end
		end
	end

`ifdef WRITE_VCD
	initial begin
		$dumpfile("testbench.vcd");
		$dumpvars(0, testbench_hello);
	end
`endif

	always @(posedge clk) begin
		if (resetn && trap) begin
			repeat (10) @(posedge clk);
			$display("TRAP");
			$finish;
		end
	end
endmodule

I seem to get similar results when running in the firmware folder, but I don't get very far, whereas this VCD appears to run for many cycles. Is it possible that some kind of clean-up code needs to be executed in order to exit correctly?

Errors while compiling testbench.v and picorv32.v in ModelSim

Hi Clifford,

I'm trying to implement your PicoRV32 design on an FPGA, so I need to compile both files in ModelSim for simulation. However, I get errors in both files.

For picorv32.v:
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(555): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(556): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(557): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(558): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(581): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(582): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(589): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.
** Error: C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\picorv32.v(590): A begin/end block was found with an empty body. This is permitted in SystemVerilog, but not permitted in Verilog. Please look for any stray semicolons.

As for testbench.v:
** Error (suppressible): C:\Users\Cy\Desktop\za\ma\picorv32-master\picorv32-master\testbench.v(79): (vlog-2388) 'trap' already declared in this scope (picorv32_wrapper).

This is the relevant part of the code:
module picorv32_wrapper #(
parameter AXI_TEST = 0,
parameter VERBOSE = 0
) (
input clk,
input resetn,
output trap,
output trace_valid,
output [35:0] trace_data
);

wire trap;
wire tests_passed;
reg [31:0] irq;

As far as I know, Verilog does not allow declaring an internal wire or reg with the same name as a port that is already declared in the module header. Could you give me some thoughts on this?

Thanks a lot!

Best regards,
Cy

Opcode after jump executed on error

To do some basic tests with your PicoRV32, I wrote a testbench, connected a ROM with a simple
bare-metal program in it, and let it run.

I started with a tiny C program that I compiled with the riscv-gnu-toolchain:

#include <stdio.h>
void main(void)
{ 
  register int cpureg15 asm ("a5") = 0; 
  //a5 = ABI-name of R15 

  while (1)
  {
    cpureg15++;
  }
}

It compiled to :

  -- Test 1
00000000 <main>:
   0:   ff010113            addi    sp,sp,-16
   4:   00812623            sw  s0,12(sp)
   8:   01010413            addi    s0,sp,16
   c:   00000793            li  a5,0
  10:   00178793            addi    a5,a5,1
  14:   ffdff06f            j   10 <main+0x10>

The first thing I noticed was an undefined value (xxxxxxxx) on mem_addr[31:0] at about 150ns.
I NOP-ed the first three words, since they were not required, and the xxxxxxxx was gone.

  -- Test 2
00000000 <main>:
   0:   00000013            nop
   4:   00000013            nop
   8:   00000013            nop
   c:   00000793            li  a5,0
  10:   00178793            addi    a5,a5,1
  14:   ffdff06f            j   10 <main+0x10>

I decided to go after this effect later on.

The next things I saw were a TRAP at 275ns, mem_addr=0x00000018 at 205ns and
mem_rdata=0x00000000 at 215ns.

As the riscv-spec points out on page 6 (Instruction length encoding), an opcode
of 0x00000000 shall lead to a trap. Since I had indeed zeroed the unused portion
of the ROM, I added a NOP at the (technically unused) address 0x18.
Now the TRAP was gone, but register 15 was stuck at zero and was obviously not incremented.
No wonder, since the two instructions that executed were at 0x14 and 0x18 (JUMP and NOP).

  -- Test 3
00000000 <main>:
   0:   00000013            nop
   4:   00000013            nop
   8:   00000013            nop
   c:   00000793            li  a5,0
  10:   00178793            addi    a5,a5,1
  14:   ffdff06f            j   10 <main+0x10>  
  18:   00000013            nop

To ensure that the NOP at 0x18 was not only prefetched but also executed, I swapped the opcodes
of 0x10 and 0x18. Now R15 began incrementing.

  -- Test 4
00000000 <main>:
   0:   00000013            nop
   4:   00000013            nop
   8:   00000013            nop
   c:   00000793            li  a5,0
  10:   00000013            nop
  14:   ffdff06f            j   10 <main+0x10>  
  18:   00178793            addi    a5,a5,1

I swapped the opcodes back and calculated a jump further back, aiming at 0x0C.
In fact 0x10, 0x14 and 0x18 were executed and R15 was incrementing.

  -- Test 5
00000000 <main>:
   0:   00000013            nop
   4:   00000013            nop
   8:   00000013            nop
   c:   00000793            li  a5,0
  10:   00178793            addi    a5,a5,1
  14:   ff9ff06f            j   0c <main+0x0c>  
  18:   00000013            nop

I replaced the NOP at 0x18 with a second increment and in fact R15 got incremented twice per loop.

  -- Test 6
00000000 <main>:
   0:   00000013            nop
   4:   00000013            nop
   8:   00000013            nop
   c:   00000793            li  a5,0
  10:   00178793            addi    a5,a5,1
  14:   ff9ff06f            j   0c <main+0x0c>  
  18:   00178793            addi    a5,a5,1

My guess is that the offset of the jump instruction is added correctly to the program counter,
but only after the opcode following the jump instruction has been executed by mistake.

My question is: can you reproduce this effect, or is something wrong in my setup?
Sadly I'm no Verilog guy, so there is no use in my trying to debug your code on my own.
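
The two jump encodings used in the tests can be cross-checked by decoding the J-type immediate by hand. The following is a minimal sketch in Python; the helper names `decode_jal_offset` and `jal_target` are made up for this illustration, and the bit layout is the J-type format from the RISC-V base ISA. It only checks that the opcodes encode the targets shown in the disassembly and says nothing about what the core does with them.

```python
def decode_jal_offset(insn: int) -> int:
    """Return the sign-extended byte offset encoded in a 32-bit JAL instruction."""
    if insn & 0x7F != 0x6F:
        raise ValueError("not a JAL opcode")
    imm = (
        (((insn >> 31) & 0x1) << 20)      # imm[20]    <- insn[31]
        | (((insn >> 12) & 0xFF) << 12)   # imm[19:12] <- insn[19:12]
        | (((insn >> 20) & 0x1) << 11)    # imm[11]    <- insn[20]
        | (((insn >> 21) & 0x3FF) << 1)   # imm[10:1]  <- insn[30:21]
    )
    # Sign-extend from 21 bits (imm[20] is the sign bit, imm[0] is always 0).
    if imm & (1 << 20):
        imm -= 1 << 21
    return imm


def jal_target(pc: int, insn: int) -> int:
    """Return the 32-bit jump target for a JAL located at address pc."""
    return (pc + decode_jal_offset(insn)) & 0xFFFFFFFF


if __name__ == "__main__":
    for insn in (0xFFDFF06F, 0xFF9FF06F):
        print(f"{insn:08x} at 0x14 -> offset {decode_jal_offset(insn)}, "
              f"target 0x{jal_target(0x14, insn):02x}")
```

Under these assumptions, ffdff06f at 0x14 decodes to an offset of -4 (target 0x10) and ff9ff06f to an offset of -8 (target 0x0c), which matches the disassembly in Tests 1 to 6.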

Vivado 2016.1, ARTY-board (Artix-7 XC7A35T), riscv32-unknown-elf-gcc (GCC) 5.3.0

Project:
picorv32test.zip

VCS compilation failure with mem_rdata_latched wire

When attempting to use VCS 2011.12-SP1 I noticed that the mem_rdata_latched wire was being used before it was defined. Moving it above its usage in the mem_done assignment fixed the compilation error.

Simulation fails with VCS; VCS does not sign-extend using $signed()

I saw this referenced in #33, which has been closed. I am using VCS N-2017.12-SP2-1 and traced simulation errors back to sign extension using $signed, which appears to be the same issue.

After the following code changes I see expected results using VCS.
picorv32.patch.txt

Example cxxdemo needs regs 16-31

The example code for cxxdemo is compiled using the rv32ic compiler, which I believe uses the full 32 registers of the instruction set. I had to enable the high regs in the testbench, otherwise it simply gave a TRAP exit.

With the patch attached below, the test case passes.
testbench.v.patch.txt

Little error on line testbench.v:190

Greetings.

I'd like to report a possible syntax error on line testbench.v:190

output            mem_axi_rvalid = 0,

should be

output reg            mem_axi_rvalid = 0,

This was tested using Simulation in Vivado.

Cannot find module SB_IO

I was trying to adapt the hx8kdemo for the Zedboard, but when I try to build the project in Vivado, the module flash_io_buf doesn't appear.

Verilator generated executable didn't run far...

Using the following Verilator commands:
verilator -Wno-lint -Wno-MULTIDRIVEN -trace --top-module picorv32_wrapper --cc testbench.v picorv32.v --exe testbench.cc
cd obj_dir/
make -j -f Vpicorv32_wrapper.mk
I got this error message:
TRAP after 8428 clock cycles
ERROR!
%Error: testbench.v:268: Verilog $stop
Aborting...
Aborted (core dumped)

Is there a way to simulate the CPU + DRAM memory latencies with this repository?

It would be interesting to do that to get more realistic performance metrics, even if it is only a simplistic 100-cycle delay for each access to main memory.
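
One simple way to approximate this would be to have the testbench memory model hold off `mem_ready` for a fixed number of cycles after `mem_valid` is asserted. The sketch below models that handshake at cycle level in Python rather than Verilog, purely to illustrate the counting. The class and function names are hypothetical and not part of this repository, and the 100-cycle figure is just the number from the question, not a measured DRAM latency.

```python
class FixedLatencyMemory:
    """Cycle-level model of a memory that completes a transfer after
    `latency` consecutive cycles with valid asserted."""

    def __init__(self, latency: int):
        if latency < 1:
            raise ValueError("latency must be at least 1 cycle")
        self.latency = latency
        self.wait = 0  # cycles the current request has been pending

    def tick(self, valid: bool) -> bool:
        """Advance one clock; return True on the cycle the transfer completes."""
        if not valid:
            self.wait = 0
            return False
        self.wait += 1
        if self.wait >= self.latency:
            self.wait = 0  # ready for the next request
            return True
        return False


def cycles_for(transfers: int, latency: int) -> int:
    """Count the clock cycles needed for back-to-back transfers."""
    mem = FixedLatencyMemory(latency)
    cycles = 0
    done = 0
    while done < transfers:
        cycles += 1
        if mem.tick(valid=True):
            done += 1
    return cycles


if __name__ == "__main__":
    print(cycles_for(transfers=3, latency=100))
```

In a Verilog testbench the same idea would be a counter that delays the assertion of `mem_ready`. A real DRAM model would also need row hits and misses, refresh, and burst behavior, which this sketch ignores.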

torture fails

Hi, when I try to run the torture test, it shows the following error:
bash test.sh

test -f config.vh
test -f test.S
++ sed '/march=/ ! d; s,^// ,-,;' config.vh
riscv32-unknown-elf-gcc -m32 -march=RV32IMC -ffreestanding -nostdlib -Wl,-Bstatic,-T,sections.lds -o test.elf test.S
riscv32-unknown-elf-gcc: error: unrecognized command line option '-m32'
make: *** [test] Error 1

Please suggest how to resolve this issue.