nvdla / hw
RTL, Cmodel, and testbench for NVDLA
License: Other
When I try to synthesize NVDLA in wlm mode, I get errors like the following:
"
Info: Reading input SDC xxx/NV_NVDLA_partition_a.sdc using "read_sdc"
Reading SDC version 2.0
Error: Value for 'object_list' must have 1 elements (CMD-036)
Error: Value for '-from' must have 1 elements (CMD-036)
"
What should I do?
Error-[SFCOR] Source file cannot be opened
Source file
"/home/eda/workdir/nvdla_hw_master/hw-master/verif/../vmod/dw_components/DW02_tree.v"
cannot be opened for reading due to 'No such file or directory'.
Please fix above issue and compile again.
"/home/eda/workdir/nvdla_hw_master/hw-master/verif/synth_tb/slave2mem_wr.v",
189
vmod/nvdla/cmac/NV_NVDLA_CMAC_CORE_MAC_mul.v uses it as follows:
.INPUT (pp_in_l0n0[119:0]) //|< r
…
`else
DW02_tree #(5, 24) u_tree_l0n1 (
.INPUT (pp_in_l0n1[119:0]) //|< r
,.OUT0 (pp_out_l0n1_0[23:0]) //|> w
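For anyone who hits this without access to the DesignWare library, it may help to recall what the missing component computes. DW02_tree is a carry-save (Wallace tree) compressor: it reduces several equal-width operands to two output words whose sum equals the sum of the inputs. The Python sketch below models only that contract for the #(5, 24) instance in the excerpt (5 operands of 24 bits packed into the 120-bit INPUT bus). The function names are illustrative, the packing order is an assumption, and the individual OUT0/OUT1 bit patterns will generally differ from the Synopsys implementation; only their sum is meaningful.

```python
def carry_save_add(a, b, c, width):
    """3:2 compressor: a + b + c == sum_word + carry_word (mod 2**width)."""
    mask = (1 << width) - 1
    sum_word = (a ^ b ^ c) & mask
    carry_word = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return sum_word, carry_word


def tree_compress(packed_input, num_inputs=5, width=24):
    """Reduce num_inputs operands packed into one bus to two words (OUT0, OUT1)."""
    mask = (1 << width) - 1
    # Assumption: operand i occupies bits [(i+1)*width-1 : i*width] of the bus.
    terms = [(packed_input >> (i * width)) & mask for i in range(num_inputs)]
    while len(terms) > 2:
        s, c = carry_save_add(terms[0], terms[1], terms[2], width)
        terms = terms[3:] + [s, c]
    while len(terms) < 2:
        terms.append(0)
    return terms[0], terms[1]


def pack(operands, width=24):
    """Pack a list of operands into a single integer bus value."""
    packed = 0
    for i, value in enumerate(operands):
        packed |= (value & ((1 << width) - 1)) << (i * width)
    return packed


operands = [0x000123, 0xFFFFFF, 0x00ABCD, 0x800000, 0x7FFFFF]
out0, out1 = tree_compress(pack(operands))
# The only guaranteed property: the two outputs add up to the input sum.
assert (out0 + out1) & 0xFFFFFF == sum(operands) & 0xFFFFFF
```

A model like this is only useful for checking expectations; simulation still needs a Verilog implementation of the component in vmod/dw_components.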
Could you enlighten me on how ROI pooling is supported?
Could you please explain what "make verifcom DUMP=1 DUMPER=VERDI" stands for in the simulation makefile? I cannot find any reference to "verifcom".
Also, if I run make run and then make verdi (as stated in the example), Verdi cannot find the "simvwork" library or debussy.fsdb. Is this a makefile-related error, or am I missing a step?
SDP has 2 functions.
If the SDP works in a 'single cycle, single job' fashion and the kernel is small enough, then the running time of the SDP for both functions could double. Is this guess correct?
On edge devices, DDR is limited and processing latency is critical.
If we do batch calculation:
1. The DDR cost increases with the batch size. In a CNN the amount of data is already large, and the hidden-layer data is even larger because the channel count increases.
2. Collecting 16 frames from a sensor at 60 fps takes nearly 300 ms. In some applications that delay is too long.
So I think the common batch size on edge devices is 1 or 2. In that case, NVDLA's network MAC utilization looks low. Are you considering optimizing performance for this low-batch case?
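The latency figure in point 2 can be checked with a few lines of arithmetic. This is a minimal sketch that counts one frame period per collected frame and ignores any compute or transfer time; the function name is illustrative.

```python
def batch_fill_latency_ms(batch_size, sensor_fps):
    """Time spent waiting for `batch_size` frames from a sensor at `sensor_fps`."""
    return batch_size / sensor_fps * 1000.0


for batch in (1, 2, 16):
    print(f"batch={batch:2d}: {batch_fill_latency_ms(batch, 60):.1f} ms")
# batch= 1: 16.7 ms
# batch= 2: 33.3 ms
# batch=16: 266.7 ms
```

At 60 fps a batch of 16 therefore adds roughly 267 ms before processing can even start, while a batch of 1 or 2 stays within a few tens of milliseconds.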
In Partition O, paths from a capture register using the falcon clock to a register using the core clock are violated even in a low-frequency run (e.g. 500 MHz). The arrival time for the falcon clock and the core clock is quite late (e.g. 35 ns), even though I have already set an ideal network for them. What are the constraints for CDC paths, and how should CDC paths be handled?
PDP has a dedicated memory interface to fetch the input map from memory and write the output map directly to memory.
Does this "memory" mean memory outside the DLA? If it does, is it the CPU that reshapes the data?
Why not fetch the input map from the conv buffer? What would it cost to implement that design?
The SDP uses a lookup table to implement non-linear functions such as sigmoid, tanh, and PReLU. What is the highest precision it can support? Can it also support single/half precision?
What should I do if I have no PDP RAMs?
Can I replace the PDP RAMs with DP RAMs?
Would that affect functionality?
Before synthesis, which files do I need to replace (or which are recommended to replace), other than the p_SSYNC_xxx files and the RAMs mentioned in the 'integrator's manual'?
Do I need to replace all the files in path/vmod/vlibs?
File: verif/synth_tb/tb_top.v
Line: 675
It has a call to $vcdpluson, a VCS-specific system function. This causes problems with other tools. Please consider adding:
`ifdef VCS
$vcdpluson;
`endif
Thanks
Srini
File: synth_tb/csb_master_seq.v
Lines: 23 (approx):
input reg [31:0] mcsb2mseq_rdata;
input reg dut2mseq_intr0;
Is that right? Somehow other tools are fine, but Icarus Verilog chokes on this.
Regards
Srini
Hello,
First of all great initiative! Thanks!
I am trying to compile and run this code on Mentor's Questa. I found one small issue (as reported by the tool).
File: vmod/vlibs/RANDFUNC.vlib
There is a hash generator function. It uses an assignment like the one below:
modName = $sformatf ("%m");
With modName declared as "reg", Questa rejects this, because assigning a string to a packed type requires a cast. I applied a simple fix:
modName = int'($sformatf ("%m"));
I am not sure that is enough, as modName could be larger than 32 bits. But it makes the compiler happy and it proceeds to running the test (I am still fighting other issues there, so no results yet).
I believe Questa is right in this case, and you may be able to run the code with the above change in VCS as well. If not, we could add an `ifdef VCS guard.
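The 32-bit concern is easy to quantify: each character of a string occupies 8 bits when packed, so an int holds only 4 characters. A small sketch of that arithmetic follows; the hierarchical name is hypothetical and the function names are illustrative. Which characters survive the truncation is not modeled here.

```python
def bits_needed(name):
    """Bits required to hold `name` in a packed vector at 8 bits per character."""
    return 8 * len(name.encode("ascii"))


def chars_that_fit(bit_width):
    """Whole characters that fit into a packed vector of `bit_width` bits."""
    return bit_width // 8


# Hypothetical %m result; real hierarchical names depend on the testbench.
example = "tb_top.nvdla_top.u_partition_a"
print(bits_needed(example), "bits needed,", chars_that_fit(32), "characters fit in an int")
```

If the hash is meant to differ per instance, a 32-bit cast keeps only a small part of the name, so a wider packed type may be the safer choice.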
Thanks for listening.
Regards
Srini
What does MAX_BUSY_CYCLE (mentioned in NVDLA_OpenSource_Performance.xls) mean? Does it include zero calculations or not?
checktest : FAILED : . : all transactions completed with no errors but dump_mem mismatched: Found mismatches between ./0.chiplib_dump.raw2 and ./0.chiplib_replay.raw2.
NVINFO : ===============================================================
NVINFO : To summarize the results again, run : make check TESTDIR=../traces/traceplayer/conv_8x8_fc_int16
NVINFO : ===============================================================
Hello,
I am trying to upgrade to the latest version of the NVDLA source code, as there have been quite a few updates since I tried it in its early days. One thing I noticed is that a "build" process has been introduced. Good or bad, I can learn to like that for a large project. I followed http://nvdla.org/integration_guide.html and tried:
./tools/bin/tmake -build vmod
My first attempt failed because "tmake" did not have 755-style permissions (I am on TCSH if that matters; I guess not). It is minor and I changed it manually, but it may be worth fixing in the GitHub version as well.
I am now stuck on a "can't locate Perl IO/Tee.pm" issue. I will fight CPAN later.
Thanks
Srini
If I use a second memory, can you supply a simple SRAM slave controller?
Can I use "slave_mem_wrap" for synthesis?
When I run the NVDLA test case, I can't fully understand input.txn: I can't find the registers in the "Address space layout". For example, for "write reg 0xfff1401 0x0", I can't find a description of 0x1401 other than the comment.
File: synth_tb/slave_mem_wrap.v
Lines: 89-112
Current declaration looks like:
output reg axi_slave0_bready; // Not exact signal name
And then further down the code, this signal is connected to the output of a module instantiation.
Strictly speaking the above should be "output axi_slave0_bready" (i.e. default wire).
Two tools seem to work OK with the original code, but XLM (Cadence) throws an error. Removing the "reg" lets it go through.
Thanks
Srini
Is there a data path from PDP to CDP?
The Primer indicates a data path from PDP to CDP (according to “i.e., the convolution core can pass data to the Single Data Point Processor, which can pass data to the Planar Data Processor, and in turn to the Cross-channel Data Processor” and the architecture diagram at http://nvdla.org/primer.html#hardware-architecture), but there is actually no data path between PDP and CDP in the RTL source released on GitHub.
There may be a mistake in the Primer document.
Please double-check it.
In ~/hw_mast/verif/sim/
make build DUMP=1 DUMPER=VERDI
make run DUMP=1 DUMPER=VERDI TESTDIR=../traces/first_release
make verdi DUMP=1 DUMPER=VERDI TESTDIR=../traces/first_release
There is a file named "input_feature_map.dat" in the directory "hw_mast/verif/sim/conv_8x8_fc_int16" that seems to be used as a memory input during simulation.
The question is, how should I use it?
Hi,
I tried to run a simulation in ModelSim, following 'sim/Makefile', but encountered many PSL assertion errors and some mismatches between the expected data and mine. Is there a simulation script for ModelSim, so that I can run the tests more easily?
Thanks
As we know, the format of the feature and weight data is very important for understanding the NVDLA architecture. Could you please give a brief description of the format? Thanks in advance.
How to use the sheet: The "Configuration Input" sheet exposes the hardware parameters for the configuration of interest. Fill in your configuration settings and view the resulting performance in the "Performance Sumary" sheet. This sheet supports several very common CNN Networks. Additional sheets can be added in a similar fashon to estimate performance on other convolutional networks.
For example, I changed "Clock Frequency (MHz)" from 1000 MHz to 2000 MHz. In theory this should halve the runtime or double the FPS, but nothing actually changed.
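The expectation can be written down explicitly. This is a minimal sketch assuming a purely compute-bound network, where the frame rate is the clock frequency divided by the cycles needed per frame; the cycle count is hypothetical, and the function is not taken from the spreadsheet.

```python
def frames_per_second(clock_mhz, cycles_per_frame):
    """Frame rate of a compute-bound network: clock cycles per second / cycles per frame."""
    return clock_mhz * 1e6 / cycles_per_frame


cycles = 2_000_000  # hypothetical cycle count for one frame
fps_1ghz = frames_per_second(1000, cycles)
fps_2ghz = frames_per_second(2000, cycles)
print(fps_1ghz, fps_2ghz)  # 500.0 1000.0
```

Under this assumption the frame rate doubles with the clock. If a layer is limited by memory bandwidth rather than MAC cycles, the scaling would be weaker, but a result that does not change at all suggests the input cell is not feeding the summary.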
Have you considered coding the RTL in SystemVerilog, which offers better readability and higher coding efficiency and is supported by almost all EDA tools?
Hi,
I'm new to NVDLA. I just cloned the repo and followed the integration guide http://nvdla.org/integration_guide.html#env-setup for the initial setup, but it failed at './tools/bin/tmake -build vmod'. I'm working on macOS. Could you please let me know what might be missing here? Thanks.
==============================================
files are generated under /Users/pighashub/myWork/nvdla/outdir/nv_large/spec/manual
==============================================
/usr/local/bin/g++-5 -E -undef -nostdinc -P -C nv_large.spec -o /Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.def
g++-5: warning: nv_large.spec: linker input file unused because linking not done
../../tools/bin/defgen -i /Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.def -o /Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.h -b c
ERROR: defgen: file not found: /Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.def
at ../../tools/bin/defgen line 112.
main::Error('file not found: /Users/pighashub/myWork/nvdla/outdir/nv_large/spe...') called at ../../tools/bin/defgen line 78
main::gen_define('/Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.def', '/Users/pighashub/myWork/nvdla/outdir/nv_large/spec/defs/project.h', 'c') called at ../../tools/bin/defgen line 69
make: *** [project.h] Error 2
../../tools/make/vmod_common.make:16: *** ../../outdir/nv_large/spec/defs/project.h does not exists, please make hw/spec/defs first, exiting.... Stop.
../../../tools/make/vmod_common.make:16: *** ../../../outdir/nv_large/spec/defs/project.h does not exists, please make hw/spec/defs first, exiting.... Stop.
logfile: outdir/build.log
==============================================
Filehandle GEN1 opened only for input at /Library/Perl/5.18/IO/Tee.pm line 132.
==================BUILD PASS==================
Filehandle GEN1 opened only for input at /Library/Perl/5.18/IO/Tee.pm line 132.
==============================================
Filehandle GEN1 opened only for input at /Library/Perl/5.18/IO/Tee.pm line 132.
Could you enlighten me on how dilated convolution is supported?
Is this NMS feature supported?
In input.txn they are 88322 and 33322, but they don't match the .dat file.
There are many comments like the following. It looks like some of the code is generated by a tool called "ness".
// DO NOT EDIT, generated by ness version 2.0, backend=verilog
//
// Command: /home/ip/shared/inf/ness/2.0/38823533/bin/run_ispec_backend verilog nvdla_all.nessdb defs.touch-verilog -backend_opt '--nogenerate_io_capture' -backend_opt '--generate_ports'
Does that mean that configuring or reworking this code has to be done with this tool?
Thank you.
Hi,
when I look into ./hw/outdir/nv_small/vmod/include/hw/outdir/nv_small/vmod/include/NV_HWACC_NVDLA_tick_defines.vh
I see the following line
`include "NV_HWACC_common_tick_defines.vh"
But I can't seem to find NV_HWACC_common_tick_defines.vh
Is the NV_HWACC_NVDLA_tick_defines.vh even needed?
Linus
For a convolution layer, if the MAC number is 1024 and C/K are multiples of 16, then according to the MAC cycle equation:
CEILING(E11*L11*M11, 16)*CEILING(IF(K11, F11, F11-I11+1)/L11, 1)*CEILING(IF(K11, G11, G11-J11+1)/M11,1)*CEILING(H11*IF(D11="fc", $C$9, 1), 16)*I11*J11/L11/M11/$C$4/IF(I11=3, IF(J11=3, 2.25, 1), 1)/IF(D11="fc", $C$9, 1)
does this mean that the MAC utilization in this layer is 100%?
When I run "make" in the verif/sim directory, I get the following error:
"xxxx/xxxx/DW02_tree.v"
cannot be opened for reading due to "No such file or directory"
#! /usr/bin/env perl
would be more portable than
#!/home/utils/perl-5.8.8/bin/perl
in:
hw/verif/synth_tb/sim_scripts/checktest_synthtb.pl
hw/verif/synth_tb/sim_scripts/raw_mem_to_synth_mem.pl
...
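Until the scripts are changed upstream, the shebang can be patched locally. Below is a small sketch of the text transformation; the function name is illustrative. It deliberately leaves lines with interpreter arguments untouched, because `env` on some systems passes everything after its own name as a single argument.

```python
import re

# Matches an absolute-path perl shebang with no interpreter arguments.
HARDCODED_PERL = re.compile(r"^#!\s*\S*/perl\s*$")


def portable_shebang(script_text):
    """Rewrite a hard-coded perl shebang on the first line to use /usr/bin/env."""
    lines = script_text.split("\n")
    if lines and HARDCODED_PERL.match(lines[0]):
        lines[0] = "#!/usr/bin/env perl"
    return "\n".join(lines)


original = "#!/home/utils/perl-5.8.8/bin/perl\nprint \"ok\\n\";\n"
print(portable_shebang(original).split("\n")[0])  # #!/usr/bin/env perl
```

Reading each listed script, passing its contents through this function, and writing the result back is enough to make the scripts run with whichever perl is first on PATH.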
The file NV_NVDLA_partition_o.v has been mistakenly committed using (for example) #ifdef rather than `ifdef.
Don't you run design checks before committing RTL updates?
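A check of this kind is cheap to automate. The sketch below is one possible lint, not a tool from the repository: it flags lines in Verilog source that start with a C-preprocessor directive, while ignoring legitimate uses of `#` such as delays and parameter overrides. The directive list and function name are assumptions.

```python
import re

# C-preprocessor directives that should be written with a backtick in Verilog.
C_STYLE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|else|elif|endif|define|undef|include)\b")


def find_c_style_directives(verilog_text):
    """Return (line_number, directive) for each C-style directive found."""
    hits = []
    for lineno, line in enumerate(verilog_text.splitlines(), start=1):
        match = C_STYLE.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


sample = "module m;\n#ifdef FOO\nwire a;\n`endif\nsub #(5) u0();\n#10 a = 1;\nendmodule\n"
print(find_c_style_directives(sample))  # [(2, 'ifdef')]
```

Running it over every .v file before a commit would catch the pattern reported here without needing a simulator license.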
I can't find a division operation in NVDLA.
Hi,
I am seeing the following error while running syn/scripts/syn_launch.sh script.
######################
#####################
setVar RTL_DEPS ""
Info: Setting RTL_DEPS from env, value = osdla_syn_20171004_1853/scripts/NV_NVDLA_partition_a.files.vc
set vcsOpt "{-f $RTL_DEPS}"
{-f osdla_syn_20171004_1853/scripts/NV_NVDLA_partition_a.files.vc}
catch {eval {analyze -format sverilog -work WORK} -vcs $vcsOpt} analyzeStatus
Running PRESTO HDLC
Searching for ./NV_NVDLA_partition_a.v
Searching for osdla_syn_20171004_1853/src/NV_NVDLA_partition_a.v
Error: Unable to open file `NV_NVDLA_partition_a.v': in search_path {. osdla_syn_20171004_1853/src}. (VER-41)
*** Presto compilation terminated with 1 errors. ***
0
if { $analyzeStatus != 1 } {
puts "${synMsgErr} Analyze failed! Aborting..."
exit 1
}
Error: Analyze failed! Aborting...
I checked the osdla_syn_20171004_1853 folder. The files do exist. Can someone please help me?
This means the convolution weights are local and not shared.
Could you enlighten me on how this could be supported?
It is well known that low power is one of the most important features of embedded deep learning accelerators. What about NVDLA? What is the power consumption of the 'small NVDLA' (AlexNet, 1000 fps @ 28 nm)?
When I run the conv_8x8_fc_int16 test case, the result is a failure. I get the following message when I check it with the command "make check TESTDIR=../traces/traceplayer/conv_8x8_fc_int16":
" checktest : FAILED : conv_8x8_fc_int16 : all transactions completed with no errors but dump_mem mismatched: Found mismatches between conv_8x8_fc_int16/0.chiplib_dump.raw2 and conv_8x8_fc_int16/0.chiplib_replay.raw2 "
The contents of 0.chiplib_dump.raw2 are all zeros, whereas the contents of 0.chiplib_replay.raw2 match the file output_feature_map.dat, as mentioned in issue #34.
While going through the log I found a warning saying:
"Warning-[STASKW_RMTMDWIFAL] Too many data words in file
/nvdla/hw-master/verif/synth_tb/csb_master_seq.v, 419
Too many data words in file 0.raw2 at line 1026 while executing $readmem.
Please ensure that the file has proper entries."
When I checked the 0.raw2 file, it had the following contents at the respective lines:
1025: 1745c349
1026: 056d9f98
1027: faf4fc31
Can someone please help me fix this issue?
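One way to narrow this down is to count the data words in the file and compare the total against the depth of the memory that $readmem loads. The sketch below assumes a plain $readmemh-style file (whitespace-separated hex words, optional `@address` markers, `//` comments) and takes the depth as a parameter, since the size of the array declared in csb_master_seq.v is not stated in this report.

```python
def count_readmem_words(text):
    """Count data words, skipping // comments and @address markers."""
    count = 0
    for line in text.splitlines():
        line = line.split("//", 1)[0]
        for token in line.split():
            if not token.startswith("@"):
                count += 1
    return count


def words_over_depth(text, depth):
    """Number of data words that do not fit into a memory of `depth` entries."""
    return max(0, count_readmem_words(text) - depth)


# Tiny illustrative input built from the three words quoted above.
sample = "1745c349\n056d9f98 // comment\n@10\nfaf4fc31\n"
print(count_readmem_words(sample), words_over_depth(sample, 2))  # 3 1
```

If the real 0.raw2 holds more words than the array depth, the warning is expected and the extra words are dropped. The sketch ignores the effect of address markers on the load position, so treat the result as a first check only.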
I see that the regs "input_mask[127:0]" and "input_dat[1023:0]" are used to decode weights in the module NV_NVDLA_CSC_WL_dec.
When the wire "is_int8" is high, "vec_data_000" equals data_d1[7:0]; if input_mask[1]==0, "vec_data_001" is the same as "vec_data_000". Is that right?
But when "is_int8" and "is_fp16" are both low, the mode is "int16" and input_mask[odd] equals input_mask[even]. This part of the code confuses me.
For example, with input_mask=128'h3, the result I read from the code is output_data0=input_data[7:0], output_data1=input_data[15:8], and output_data2=output_data3=...=output_data127=input_data[15:8]. Is that right? In my opinion, it should be output_data2=output_data4=...=output_data126=input_data[7:0] and output_data3=output_data5=...=output_data127=input_data[15:8].
Can you supply a test case for the weight decoder that illustrates the decoding process?
PS: The compression process is like run-length encoding, is that right?
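The behaviour described in this question can be reproduced with a small model. The sketch below is an assumption built only from the observations above, not from the RTL: each lane selects the packed byte indexed by the number of mask bits set so far, and a second step (assumed here) forces lanes with a clear mask bit to zero. Function names and the 8-lane width are illustrative.

```python
def select_stage(mask, packed, lanes):
    """Lane i picks packed[popcount(mask[i:0]) - 1]; lanes before the first set bit get 0."""
    selected, count = [], 0
    for lane in range(lanes):
        count += (mask >> lane) & 1
        selected.append(packed[count - 1] if count else 0)
    return selected


def decode(mask, packed, lanes):
    """Assumed final result: lanes whose mask bit is 0 are forced to zero."""
    selected = select_stage(mask, packed, lanes)
    return [value if (mask >> lane) & 1 else 0 for lane, value in enumerate(selected)]


mask = 0x3
packed = [0x11, 0x22]
print(select_stage(mask, packed, 8))  # [17, 34, 34, 34, 34, 34, 34, 34]
print(decode(mask, packed, 8))        # [17, 34, 0, 0, 0, 0, 0, 0]
```

With mask 0x3 the selection stage repeats the last consumed byte in every remaining lane, which matches the reading of the code given above. Under this model the repeated values are harmless if a later stage zeroes the unmasked lanes, and the scheme is a bit-mask sparse format rather than run-length encoding. Both points would need confirmation from the maintainers.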
Where can I find the definition of the HW_CONFIG registers (mentioned at http://nvdla.org/hwarch.html#glb)?
I read the RTL source and found nothing about the HW_CONFIG registers.
When I run "make" in the verif/sim directory, I get an error like:
5NrIB_d(.data+0x498) undefined reference to GetArgVal.
I tried adding '-m64' to the g++ command line, but it did not help.
INFO: [Synth 8-5365] Flop u_partition_p/u_NV_NVDLA_sdp/u_core/u_ew/u_idx/NV_NVDLA_SDP_CORE_Y_idx_core_inst/FpFloatToIntFrac_8U_23U_8U_9U_35U_1_m_int_sva_2_reg[108] is being inverted and renamed to u_partition_p/u_NV_NVDLA_sdp/u_core/u_ew/u_idx/NV_NVDLA_SDP_CORE_Y_idx_core_inst/FpFloatToIntFrac_8U_23U_8U_9U_35U_1_m_int_sva_2_reg[108]_inv.
INFO: [Common 17-14] Message 'Synth 8-5365' appears 100 times and further instances of the messages will be disabled. Use the Tcl command set_msg_config to change the current settings.
Abnormal program termination (EXCEPTION_ACCESS_VIOLATION)
From reading the RTL source I understand the mechanism of the CMAC, and I realized that MAC utilization will be very low in a scenario such as dw convolution. Is there a solution for this?
Would it require significant modifications? Any ideas?
Hello:
With the hw-master released on 10.19, after modifying the software environment variables and the DesignWare libs, I could run the script and pass the sanity tests.
With the hw-master updated on 11.02 (again with the software environment variables and DesignWare libs modified), running the case fails. The reason is that there is no "../../outdir/nv_Large/vmod/.." path. Even if I create the path "outdir/nv_large" and put vmod into it, it fails again, because there is no "../vmod/nvdla/bdma/.v"; in fact there is no code for it.
So is the code incompatible with the Makefile, or have I done something wrong?