Comments (6)

vmayoral commented on July 21, 2024

Hello @syed-ahmed,

> I was able to run the KRS examples on KV260 and am currently working on accelerating ORB-SLAM2 ROS node using KRS.

Great to hear that!

> I looked into the vivado platform shipped by KRS and it looks like it's a bare minimum acceleration platform. I was wondering if there were any instructions on how that platform was created? I looked into the artifacts of this repo but it seems like it only ships with the exported hardware platform (whereas I'm interested in the tcl scripts that created the hardware project and the petalinux meta recipes). I want to build a pipeline like this using KRS and so was wondering if the platform in this repo needs to be updated, such that it supports MIPI/VCU/audio pipelines.

That's correct. KRS alpha only ships a minimalistic Vitis platform, which the Vitis compiler then uses as a base to add whatever accelerators you have in your ROS 2 workspace. KRS alpha is only meant for basic (single Node) accelerators. Support for multiple accelerators and multiple Nodes is coming in KRS beta, along with tools that simplify replacing the Vitis platform.

Right now, in alpha, the process is quite cumbersome and I don't recommend it, but it's definitely doable if you know what you're doing. The source code of the platform files and the scripts to automate it are available in:

A few notes:

  • Many of these things are lacking documentation. You're working on the cutting edge (which is cool but, like me, expect a bumpy road).
  • Changes to the Vitis platform will also force you to review and update the device tree blobs as appropriate if you want the extensions to the ROS 2 build system (ament_acceleration) and build tools (colcon-acceleration) to produce valid kernels. Otherwise, your kernels will synthesize and place-and-route just fine, but the device tree inconsistencies won't allow them to interact with the hardware successfully.

My plan is to release at least two Vitis platforms with KRS beta, with tools to switch between them easily, and to document the process for contributing your own platform. It'd be great to get your platform landing in here as a third one.

As a side note, I see ORB-SLAM just turned 3! (UZ-SLAMLab/ORB_SLAM3)

from acceleration_firmware_kv260.

syed-ahmed commented on July 21, 2024

Hi @vmayoral. Apologies for the late reply. I was transitioning from academia and had to stop working on this.

I was able to make a custom platform. The process was simpler than I thought. I was able to skip petalinux and reuse the artifacts that were already in the kv260 firmware release of KRS. The only thing changed here is the hardware platform. However, a full set of instructions on generating the kv260 artifacts would of course help in the future (e.g. patching with the PREEMPT_RT kernel, Xilinx BSP modifications if any, added packages etc.). I don't really have a formal writeup, but here's an attempt at documenting the process:

  1. Clone the platform repo
git clone https://github.com/syed-ahmed/xilinx-k26-som-2021.2 \
  && cd xilinx-k26-som-2021.2 \
  && git submodule update --init --recursive
  2. Make the platform.
cd kv260-vitis \
  && make platform PFM=kv260_ispMipiRx_vcu_DP
  3. Replace the krs kv260 platform with the generated platform from the previous step:
mv ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform_bk \
  && cp -r platforms/xilinx_kv260_ispMipiRx_vcu_DP_202110_1 ~/krs_ws/src/acceleration/acceleration_firmware_kv260/acceleration_firmware_kv260/firmware/platform
  4. Generate the device tree from the .xsa produced in step 2 by following the directions here.
  5. Replace the .dtsi in one of the acceleration examples to test. Example: replace the vadd_faster.dtsi in krs_ws/src/acceleration/acceleration_examples/nodes/faster_doublevadd_publisher/src with the .dtsi generated in step 4. Mine looks as follows. Note that firmware-name reflects that of vadd_faster.
/*
 * CAUTION: This file is automatically generated by Xilinx.
 * Version: XSCT 2021.2
 * Today is: Tue Mar 22 00:42:13 2022
 */


/dts-v1/;
/plugin/;
/ {
	fragment@0 {
		target = <&fpga_full>;
		overlay0: __overlay__ {
			#address-cells = <2>;
			#size-cells = <2>;
			firmware-name = "vadd_faster.bit.bin";
			resets = <&zynqmp_reset 116>, <&zynqmp_reset 117>, <&zynqmp_reset 118>, <&zynqmp_reset 119>;
		};
	};
	fragment@1 {
		target = <&amba>;
		overlay1: __overlay__ {
			afi0: afi0 {
				compatible = "xlnx,afi-fpga";
				config-afi = < 0 0>, <1 0>, <2 0>, <3 0>, <4 0>, <5 0>, <6 0>, <7 0>, <8 0>, <9 0>, <10 0>, <11 0>, <12 0>, <13 0>, <14 0x0>, <15 0x000>;
			};
			clocking0: clocking0 {
				#clock-cells = <0>;
				assigned-clock-rates = <99999001>;
				assigned-clocks = <&zynqmp_clk 71>;
				clock-output-names = "fabric_clk";
				clocks = <&zynqmp_clk 71>;
				compatible = "xlnx,fclk";
			};
			clocking1: clocking1 {
				#clock-cells = <0>;
				assigned-clock-rates = <99999001>;
				assigned-clocks = <&zynqmp_clk 72>;
				clock-output-names = "fabric_clk";
				clocks = <&zynqmp_clk 72>;
				compatible = "xlnx,fclk";
			};
		};
	};
	fragment@2 {
		target = <&amba>;
		overlay2: __overlay__ {
			#address-cells = <2>;
			#size-cells = <2>;
			audio_ss_0_audio_formatter_0: audio_formatter@80040000 {
				clock-names = "s_axi_lite_aclk", "m_axis_mm2s_aclk", "aud_mclk", "s_axis_s2mm_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_1>, <&misc_clk_0>;
				compatible = "xlnx,audio-formatter-1.0", "xlnx,audio-formatter-1.0";
				interrupt-names = "irq_mm2s", "irq_s2mm";
				interrupt-parent = <&gic>;
				interrupts = <0 111 4 0 110 4>;
				reg = <0x0 0x80040000 0x0 0x10000>;
				xlnx,include-mm2s = <0x1>;
				xlnx,include-s2mm = <0x1>;
				xlnx,max-num-channels-mm2s = <0x2>;
				xlnx,max-num-channels-s2mm = <0x2>;
				xlnx,mm2s-addr-width = <0x40>;
				xlnx,mm2s-async-clock = <0x1>;
				xlnx,mm2s-dataformat = <0x3>;
				xlnx,packing-mode-mm2s = <0x0>;
				xlnx,packing-mode-s2mm = <0x0>;
				xlnx,rx = <&audio_ss_0_i2s_receiver_0>;
				xlnx,s2mm-addr-width = <0x40>;
				xlnx,s2mm-async-clock = <0x1>;
				xlnx,s2mm-dataformat = <0x1>;
				xlnx,tx = <&audio_ss_0_i2s_transmitter_0>;
			};
			misc_clk_0: misc_clk_0 {
				#clock-cells = <0>;
				clock-frequency = <99999000>;
				compatible = "fixed-clock";
			};
			misc_clk_1: misc_clk_1 {
				#clock-cells = <0>;
				clock-frequency = <18432995>;
				compatible = "fixed-clock";
			};
			audio_ss_0_i2s_receiver_0: i2s_receiver@80060000 {
				aud_mclk = <18432995>;
				clock-names = "s_axi_ctrl_aclk", "aud_mclk", "m_axis_aud_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_0>;
				compatible = "xlnx,i2s-receiver-1.0", "xlnx,i2s-receiver-1.0";
				interrupt-names = "irq";
				interrupt-parent = <&gic>;
				interrupts = <0 108 4>;
				reg = <0x0 0x80060000 0x0 0x10000>;
				xlnx,depth = <0x80>;
				xlnx,dwidth = <0x18>;
				xlnx,num-channels = <0x1>;
				xlnx,snd-pcm = <&audio_ss_0_audio_formatter_0>;
			};
			audio_ss_0_i2s_transmitter_0: i2s_transmitter@80070000 {
				aud_mclk = <18432995>;
				clock-names = "s_axi_ctrl_aclk", "aud_mclk", "s_axis_aud_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_1>, <&misc_clk_1>;
				compatible = "xlnx,i2s-transmitter-1.0", "xlnx,i2s-transmitter-1.0";
				interrupt-names = "irq";
				interrupt-parent = <&gic>;
				interrupts = <0 109 4>;
				reg = <0x0 0x80070000 0x0 0x10000>;
				xlnx,depth = <0x80>;
				xlnx,dwidth = <0x18>;
				xlnx,num-channels = <0x1>;
				xlnx,snd-pcm = <&audio_ss_0_audio_formatter_0>;
			};
			axi_iic_0: i2c@80030000 {
				#address-cells = <1>;
				#size-cells = <0>;
				clock-names = "s_axi_aclk";
				clocks = <&misc_clk_0>;
				compatible = "xlnx,axi-iic-2.1", "xlnx,xps-iic-2.00.a";
				interrupt-names = "iic2intc_irpt";
				interrupt-parent = <&gic>;
				interrupts = <0 107 4>;
				reg = <0x0 0x80030000 0x0 0x10000>;
			};
			axi_vip_0: axi_vip@a0000000 {
				/* This is a place holder node for a custom IP, user may need to update the entries */
				clock-names = "aclk";
				clocks = <&misc_clk_2>;
				compatible = "xlnx,axi-vip-1.1";
				reg = <0x0 0xa0000000 0x0 0x10000>;
				xlnx,axi-addr-width = <0x20>;
				xlnx,axi-aruser-width = <0x10>;
				xlnx,axi-awuser-width = <0x10>;
				xlnx,axi-buser-width = <0x0>;
				xlnx,axi-has-aresetn = <0x1>;
				xlnx,axi-has-bresp = <0x1>;
				xlnx,axi-has-burst = <0x1>;
				xlnx,axi-has-cache = <0x1>;
				xlnx,axi-has-lock = <0x1>;
				xlnx,axi-has-prot = <0x1>;
				xlnx,axi-has-qos = <0x1>;
				xlnx,axi-has-region = <0x0>;
				xlnx,axi-has-rresp = <0x1>;
				xlnx,axi-has-wstrb = <0x1>;
				xlnx,axi-interface-mode = <0x2>;
				xlnx,axi-protocol = <0x0>;
				xlnx,axi-rdata-width = <0x20>;
				xlnx,axi-rid-width = <0x10>;
				xlnx,axi-ruser-width = <0x0>;
				xlnx,axi-supports-narrow = <0x1>;
				xlnx,axi-wdata-width = <0x20>;
				xlnx,axi-wid-width = <0x10>;
				xlnx,axi-wuser-width = <0x0>;
			};
			misc_clk_2: misc_clk_2 {
				#clock-cells = <0>;
				clock-frequency = <299997000>;
				compatible = "fixed-clock";
			};
			capture_pipeline_mipi_csi2_rx_subsyst_0: mipi_csi2_rx_subsystem@80000000 {
				clock-names = "lite_aclk", "dphy_clk_200M", "video_aclk";
				clocks = <&misc_clk_0>, <&misc_clk_3>, <&misc_clk_2>;
				compatible = "xlnx,mipi-csi2-rx-subsystem-5.1", "xlnx,mipi-csi2-rx-subsystem-5.0";
				interrupt-names = "csirxss_csi_irq";
				interrupt-parent = <&gic>;
				interrupts = <0 104 4>;
				reg = <0x0 0x80000000 0x0 0x2000>;
				xlnx,axis-tdata-width = <32>;
				xlnx,max-lanes = <4>;
				xlnx,ppc = <2>;
				xlnx,vfb ;
				mipi_csi_portscapture_pipeline_mipi_csi2_rx_subsyst_0: ports {
					#address-cells = <1>;
					#size-cells = <0>;
					mipi_csi_port1capture_pipeline_mipi_csi2_rx_subsyst_0: port@1 {
						/* Fill cfa-pattern=rggb for raw data types, other fields video-format and video-width user needs to fill */
						reg = <1>;
						xlnx,cfa-pattern = "rggb";
						xlnx,video-format = <12>;
						xlnx,video-width = <8>;
						mipi_csirx_outcapture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							remote-endpoint = <&capture_pipeline_v_frmbuf_wr_0capture_pipeline_mipi_csi2_rx_subsyst_0>;
						};
					};
					mipi_csi_port0capture_pipeline_mipi_csi2_rx_subsyst_0: port@0 {
						/* Fill cfa-pattern=rggb for raw data types, other fields video-format,video-width user needs to fill */
						/* User need to add something like remote-endpoint=<&out> under the node csiss_in:endpoint */
						reg = <0>;
						xlnx,cfa-pattern = "rggb";
						xlnx,video-format = <12>;
						xlnx,video-width = <8>;
						mipi_csi_incapture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							data-lanes = <1 2 3 4>;
						};
					};
				};
			};
			misc_clk_3: misc_clk_3 {
				#clock-cells = <0>;
				clock-frequency = <199998000>;
				compatible = "fixed-clock";
			};
			capture_pipeline_v_frmbuf_wr_0: v_frmbuf_wr@b0010000 {
				#dma-cells = <1>;
				clock-names = "ap_clk";
				clocks = <&misc_clk_2>;
				compatible = "xlnx,v-frmbuf-wr-2.3", "xlnx,axi-frmbuf-wr-v2.2";
				interrupt-names = "interrupt";
				interrupt-parent = <&gic>;
				interrupts = <0 105 4>;
				reg = <0x0 0xb0010000 0x0 0x10000>;
				reset-gpios = <&gpio 78 1>;
				xlnx,dma-addr-width = <32>;
				xlnx,dma-align = <16>;
				xlnx,max-height = <2160>;
				xlnx,max-width = <3840>;
				xlnx,pixels-per-clock = <2>;
				xlnx,s-axi-ctrl-addr-width = <0x7>;
				xlnx,s-axi-ctrl-data-width = <0x20>;
				xlnx,vid-formats = "nv12";
				xlnx,video-width = <8>;
			};
			vcu_vcu_0: vcu@80100000 {
				#address-cells = <2>;
				#clock-cells = <1>;
				#size-cells = <2>;
				clock-names = "pll_ref", "aclk", "vcu_core_enc", "vcu_core_dec", "vcu_mcu_enc", "vcu_mcu_dec";
				clocks = <&misc_clk_4>, <&misc_clk_0>, <&vcu_vcu_0 1>, <&vcu_vcu_0 2>, <&vcu_vcu_0 3>, <&vcu_vcu_0 4>;
				compatible = "xlnx,vcu-1.2", "xlnx,vcu";
				interrupt-names = "vcu_host_interrupt";
				interrupt-parent = <&gic>;
				interrupts = <0 106 4>;
				ranges ;
				reg = <0x0 0x80140000 0x0 0x1000>, <0x0 0x80141000 0x0 0x1000>;
				reg-names = "vcu_slcr", "logicore";
				reset-gpios = <&gpio 80 0>;
				encoder: al5e@80100000 {
					compatible = "al,al5e-1.2", "al,al5e";
					interrupt-parent = <&gic>;
					interrupts = <0 106 4>;
					reg = <0x0 0x80100000 0x0 0x10000>;
				};
				decoder: al5d@80120000 {
					compatible = "al,al5d-1.2", "al,al5d";
					interrupt-parent = <&gic>;
					interrupts = <0 106 4>;
					reg = <0x0 0x80120000 0x0 0x10000>;
				};
			};
			misc_clk_4: misc_clk_4 {
				#clock-cells = <0>;
				clock-frequency = <49999500>;
				compatible = "fixed-clock";
			};
			zyxclmm_drm {
				compatible = "xlnx,zocl";
			};
			vcap_capture_pipeline_mipi_csi2_rx_subsyst_0 {
				compatible = "xlnx,video";
				dma-names = "port0";
				dmas = <&capture_pipeline_v_frmbuf_wr_0 0>;
				vcap_portscapture_pipeline_mipi_csi2_rx_subsyst_0: ports {
					#address-cells = <1>;
					#size-cells = <0>;
					vcap_portcapture_pipeline_mipi_csi2_rx_subsyst_0: port@0 {
						direction = "input";
						reg = <0>;
						capture_pipeline_v_frmbuf_wr_0capture_pipeline_mipi_csi2_rx_subsyst_0: endpoint {
							remote-endpoint = <&mipi_csirx_outcapture_pipeline_mipi_csi2_rx_subsyst_0>;
						};
					};
				};
			};
		};
	};
};
  6. Make changes to kv260.cfg to reflect new platform requirements (platform name, clks, vivado strategy etc.). Mine looks like this:
platform=kv260_ispMipiRx_vcu_DP
save-temps=1
debug=1

# Enable profiling of data ports
[profile]
data=all:all:all

[vivado]
prop=run.impl_1.strategy=Performance_ExploreWithRemap
  7. Compile acceleration examples as in KRS docs.
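
The firmware-name note in the .dtsi step above can be scripted rather than edited by hand. A minimal sketch, assuming a sed that supports -E (GNU or BSD); the helper name set_firmware_name and the input file name pl.dtsi used below are hypothetical and not part of KRS:

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of KRS): print a copy of a device tree overlay
# with its firmware-name property pointing at <kernel>.bit.bin.
#   $1: path to the generated .dtsi
#   $2: kernel name, e.g. vadd_faster
set_firmware_name() {
  # Replace only the quoted value; indentation and all other lines are kept.
  sed -E "s|(firmware-name[[:space:]]*=[[:space:]]*\")[^\"]*(\";)|\1${2}.bit.bin\2|" "$1"
}
```

For instance, `set_firmware_name pl.dtsi vadd_faster > vadd_faster.dtsi` would write the patched overlay to a new file and leave the generated one untouched.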

That's my progress so far. My next step would have been to:

Unfortunately I have to stop here since I don't have the bandwidth to work on this anymore :(. I hope somebody else will pick this up. Maybe I'll find some time in the future.

syed-ahmed commented on July 21, 2024

Thanks @vmayoral! That explains a lot! Let's keep this issue open and I can document the process as I work on this.

vmayoral commented on July 21, 2024

Hey @syed-ahmed!

Do you have any updates to share with us on your research? Let us know how we can help.

vmayoral commented on July 21, 2024

Thanks for the update @syed-ahmed, progress looks great to me. I'll keep this open. Keep us posted on your next steps; this should be helpful to others following your path.

vmayoral commented on July 21, 2024

I'm closing this for now @syed-ahmed, feel free to re-open or ping me if anything else is needed.
