
videocoreiv's Introduction

Disclaimer:

This is an independent documentation project based on a combination of static analysis
and trial and error on real hardware.  This work is 100% independent from and not
sanctioned by or connected with Broadcom or its agents.

No Broadcom documents or materials were used beyond those publicly available
(see Referenced Materials).

This work was undertaken and the information provided for non-commercial use, in the
expectation that hobbyists of all ages will find the details useful for understanding
and working with their Raspberry Pi hardware.

The hope is that Broadcom will be flattered by the interest in the device and
understand the benefits of opening up understanding to a larger audience of 
potential customers and developers.

Broadcom should be commended for making their SoC available for a project as
exciting as the Raspberry Pi.

The intent is that no copyrighted materials are contained in this repository.  

Introduction

Purpose of this repo: Documentation and samples on the VideoCore IV instruction set as used in the BCM SoC used in the Raspberry Pi. As of early 2016, Broadcom has yet to release public information on the VPU, so it is hoped you find this repo useful.

The BCM2835 SoC (System on a Chip) in the original Raspberry Pi has the following significant computation units:

  • (ARM) ARM1176JZF-S 700 MHz processor which acts as the "main" processor and typically runs Linux.
  • (VPU) Dual-core VideoCore IV CPU @ 250 MHz with SIMD Parallel Pixel Units (PPU), which runs scalar (integer and float) and vector (integer only) programs. It runs the ThreadX OS and generally coordinates all functional blocks such as video codecs, power management and video out.
  • (ISP) Image Sensor Pipeline providing lens shading, statistics and distortion correction.
  • (QPU) QPU units which provide 24 GFLOPS compute performance for coordinate, vertex and pixel shaders. Whilst originally not documented, Broadcom released documentation and source code for the QPU in 2014.

Newer Raspberry Pi models mix things up with faster and more modern ARM cores, but the VPU information here is still relevant.

For more information on the Raspberry Pi, see the foundation's site at http://raspberrypi.org, or the embedded linux wiki at http://elinux.org/R-Pi_Hub.

Active discussions take place on IRC (freenode) on #raspberrypi-internals, #raspberrypi-osdev, #raspberrypi-dev, and #raspberrypi.

There is a raspberrypi-internals mailing list; you can subscribe on the mailing list page at freelists.org.

We are in a very early stage of understanding of the device. At this stage we only have Serial IO and GPIO for flashing things like the status LED. You will need to attach a terminal to the Mini UART on the GPIO connector. For more details see "Getting started" below.

It is now possible to use VideoCore Kernels from Userland / Linux, see https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Kernels-under-Linux. Our understanding of the VideoCore Processor is nearing completion, and it is an excellent target for integer SIMD and DSP kernels. Essentially, it can be used for 16-way SIMD processing of 8, 16 and 32 bit integer values.
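To make the 16-way SIMD idea concrete, here is a minimal Python model of a lane-wise integer add. It is only a sketch for building intuition, not VPU code: the function name `simd_add` is hypothetical, and wrap-around on overflow is an assumption made for the illustration (real vector instructions may behave differently, for example by saturating).

```python
LANES = 16  # the text above describes 16-way SIMD


def simd_add(a, b, bits=8):
    """Add two 16-element vectors lane by lane.

    Assumption for this sketch: results wrap around at the element
    width (8, 16 or 32 bits) instead of saturating.
    """
    if len(a) != LANES or len(b) != LANES:
        raise ValueError("each operand must have exactly 16 lanes")
    if bits not in (8, 16, 32):
        raise ValueError("element width must be 8, 16 or 32 bits")
    mask = (1 << bits) - 1
    # Every lane is computed independently of its neighbours.
    return [(x + y) & mask for x, y in zip(a, b)]


# Brighten 16 pixel values in one "instruction".
pixels = list(range(240, 256))
brightened = simd_add(pixels, [10] * LANES, bits=8)
print(brightened)
```

The point of the model is that one operation touches all 16 elements at once, which is why the processor suits pixel and DSP workloads where the same arithmetic is applied across a row of values.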

Videocore IV Community and Resources:

I recommend starting with Julian's GNU toolchain, at https://github.com/itszor/vc4-toolchain

Documentation:

  1. Getting started: https://github.com/hermanhermitage/videocoreiv/wiki/Getting-Started
  2. Instruction set: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Programmers-Manual
  3. Hardware regs:
  4. Kernels from Linux: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Kernels-under-Linux
  5. Performance Issues: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Performance-Considerations
  6. 3d Pipeline Overview: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-3d-Graphics-Pipeline
  7. QPU Shader Processors (24 GFLOPS): https://github.com/hermanhermitage/videocoreiv-qpu

Methodology:

All information here has been obtained solely by a combination of:

  1. Static analysis.
  2. Experimentation on a Raspberry Pi.
  3. Discussions on #raspberrypi-osdev and #raspberrypi-internals.

All activities were undertaken on a Raspberry Pi running Debian.

Those interested in the legal issues involved with reverse engineering activities, please review:

  1. https://www.eff.org/issues/coders/reverse-engineering-faq
  2. http://www.chillingeffects.org/reverse/faq.cgi
  3. http://en.wikipedia.org/wiki/Reverse_engineering

We do not accept materials nor publish materials relating to DRM or its circumvention.

Referenced Materials

Software and Binaries

Official RasPi firmware and blobs

Available at https://github.com/raspberrypi/firmware/tree/master/boot. Releases after May the 10th 2012 are accompanied by a LICENSE.broadcom readme file containing copyright notice, a disclaimer and guidelines for use. Prior to this date the readme was not present.

Debian "Squeeze" Distribution

The distribution debian6-19-04-2012.zip from http://www.raspberrypi.org/downloads was used as a development platform for the majority of the work you find here.

Data Sheets

  1. BCM2835 ARM Peripherals data sheet at http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
  2. VideoCore® IV 3D Architecture Reference Guide at https://docs.broadcom.com/docs/12358545

Patents and Patent Applications

The original Alphamosaic patents and patent applications provide a wealth of information for understanding the structure of the VideoCore instruction set and architecture. Whilst the instruction encodings are different, and only a limited range of instructions are indicated, they prove an invaluable reference for understanding the design space the engineers were exploring.

The newer Broadcom SoC patents and applications provide detailed information on how the VideoCore has been integrated into a broader platform setting. They are invaluable for gaining a deeper insight into the additional function units present in the BCM2835 and how they fit together.

Patent Applications on Broadcom SoC Method and Systems

  • US20060184987 Intelligent Dma in a Mobile Multimedia Processor Supporting Multiple Display Formats
  • US20080291208 Method and System for Processing Data Via a 3d Pipeline Coupled to a Generic Video Processing Unit
  • US20080292216 Method and System for Processing Images using Variable Sized Tiles
  • US20080292219 Method and System for an Image Sensor Pipeline on a Mobile Imaging Device
  • US20090232347 Method and System for Inserting Software Processing In a Hardware Image Sensor Pipeline
  • US20110148901 Method and System for Tile Mode Renderer With Coordinate Shader
  • US20110154307 Method and System for Utilizing Data Flow Graphs to Compile Shaders
  • US20110154377 Method and System for Reducing Communication During Video Processing Utilizing Merge Buffering
  • US20110216069 Method and System for Compressing Tile Lists Used for 3d Rendering
  • US20110221743 Method and System for Controlling a 3d Processor Using a Control List in Memory
  • US20110227920 Method and System for a Shader Processor With Closely Couple Peripherals
  • US20110242113 Method and System for Processing Pixels Utilizing Scoreboarding
  • US20110242344 Method and System for Determining How to Handle Processing of an Image Based Motion
  • US20110242427 Method and System for Providing 1080P Video with 32 Bit Mobile DDR Memory
  • US20110249744 Method and System for Video Processing Utilizing Scalar Cores and a Single Vector Core
  • US20110254995 Method and System for Mitigating Seesawing Effect During Autofocus
  • US20110261059 Method and System for Decomposing Complex Shapes Into Curvy RHTS For Rasterization
  • US20110261061 Method and System for Processing Image Data on a Per Tile Basis in an Image Sensor Pipeline
  • US20110264902 Method and System For Suspending Video Processor and Saving Processor State in SDRAM Utilizing a Core Processor
  • US20110279702 Method and System for Providing A Programmable and Flexible Image Sensor Pipeline For Multiple Input Patterns

Patents on the baseline Alphamosaic processor

  • US7028143 Narrow/Wide Cache
  • US7036001 Vector Processing System
  • US7457941 Vector Processing System
  • US7043618 System for Memory Access in a Data Processor
  • US7107429 Data Access in a Processor
  • US7069417 Vector Processing System
  • US7818540 Vector Processing System
  • US7080216 Data Access in a Processor
  • US7130985 Parallel Processor Executing an Instruction Specifying Any Location First Operand Register and Group Configuration in Two Dimensional Register File
  • US7167972 Vector/Scalar System With Vector Unit Producing Scalar Result from Vector Results According to Modifier in Vector Instruction
  • US7350057 Scalar Result Producing Method in Vector/Scalar System by Vector Unit from Vector Results According to Modifier in Vector Instruction
  • US7200724 Two Dimensional Access in a Data Processor
  • US7203800 Narrow/Wide Cache

Patent Applications on the baseline Alphamosaic processor:

Third Party Documents and Links

Some snippets of information appear in third party documents.

videocoreiv's People

Contributors

christinaa, hermanhermitage, jriwanek, mgottschlag, parlane, petemoore, phire, shacharr, tnorman42

videocoreiv's Issues

hello video core iv for galaxy y ...

hello man ..
galaxy y use video core iv bcm 2763
but no have sources ...
i need sources for compile cyanogenmod 7 and 9 ...

u can help me ?

thanks !!

Building kernels to run under linux

Could you provide a bit more information into how to experiment with running kernels from within linux?

Eg: How can one start with source code (videocore tinyasm) and end up with a proper kernel to run with the mailbox tool?

Pi4 firmware undocumented instructions

It seems that the Raspberry Pi 4 has a revision of the Videocore "GPU" that has instructions that either have encodings that are broken according to the docs (0xCEC004BC starts off like a control register access, but where it should be 1100 1100 for the first byte, it's 1100 1110 -- this is 22 bytes into the text-section of "start4x.elf"...)

There are also some others - such as 0x0010 at 8 bytes into the same file and section, which falls into the gap from the 0x0005 of RTI and the 0x01C0 start of SWI with register...

Any suggestions on how to figure out what these actually do - or if there is some kind of setup or encryption being done on the instruction stream from elsewhere in the binary ?
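The bit-level difference described in this question can be checked with a few lines of Python. This is only a sketch of the arithmetic: it treats 0xCEC004BC as a 32-bit value written most significant byte first, as in the question, and the helper name `top_byte` is hypothetical. It says nothing about what the instruction does.

```python
def top_byte(word):
    """Return the most significant byte of a 32-bit value."""
    return (word >> 24) & 0xFF


observed = top_byte(0xCEC004BC)  # leading byte seen in start4x.elf
expected = 0xCC                  # 1100 1100, the pattern the docs lead one to expect

# Print both patterns and the bits in which they differ.
print(format(observed, "08b"))
print(format(expected, "08b"))
print(format(observed ^ expected, "08b"))
```

The XOR isolates the single bit that differs between the two leading bytes, which narrows down which field of the encoding would have to be reinterpreted.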

Not an issue, more of a question...

You have the btest opcode as being Rd & bit(Ra) == 0 - if "bit test" is looking to see if the bit is set, then Rd & bit(Ra) != 0 would be the result - 2 & 2 == 2 - 3 & 2 == 2 (3 & 1 == 1) - in no spot does the bit actually being set result in 0. So... is "bit test" (btest) testing for a bit being set or the bit being not set ? If not-set, then I can understand the logic, but...
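The two readings being contrasted in this question can be written out as a small Python sketch. The function names are hypothetical, `bit(n)` is assumed to mean `1 << n`, and nothing here settles which reading the hardware actually implements.

```python
def bit(n):
    """Mask with only bit n set (assumed meaning of bit(Ra))."""
    return 1 << n


def btest_is_set(rd, ra):
    """Reading 1: true when the selected bit of rd is set."""
    return (rd & bit(ra)) != 0


def btest_is_clear(rd, ra):
    """Reading 2: true when the selected bit of rd is clear,
    i.e. the documented form Rd & bit(Ra) == 0 taken literally."""
    return (rd & bit(ra)) == 0


# The values from the question: 2 & 2 == 2 and 3 & 1 == 1.
print(btest_is_set(2, 1), btest_is_clear(2, 1))
print(btest_is_set(3, 0), btest_is_clear(3, 0))
```

For any input the two readings give opposite answers, so a single experiment on real hardware with a known bit pattern would be enough to tell them apart.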

Another source of VC code examples...

I don't know if you found this one in your searching (the site is in chinese though):

[ link redacted ]

It has some VC assembly and VC headers from Metaware's High C for Videocore.

rts opcode?

On the "VideoCore IV Kernels under Linux" page, looking through the assembly for alpha.bin, the last instruction is "rti".

What is that?

Looking at the actual bits of that instruction, it would appear to be "b rd" where "rd" is "lr" or "r26"...

Am I correct?

Potential clue for 'pixv' peripheral's actual name

Apparently the SMI (Secondary Memory Interface) docs mention a "Pixel Valve" which seems to fit with the pixv MMIO region listed in the register map. (no details other than "something called a pixel valve exists in the core" unfortunately, but more info than previously...)

dual core VPU?

Hello. Are there actually two VPU cores? There are several hints about it in different places but nothing definite. If yes do you know how to start the second one or how it behaves at boot time? Are they equivalent and can do same stuff?

Multiple framebuffers

Hi,

I'm desperately looking for a way to allocate and use at least 2 framebuffers on a PI0 (bare-metal) in order to switch them every 60th of a second and provide smooth and fast animations.

I found a way to allocate some memory from the GPU but I don't know how to say "Please render this other buffer on the next vsync".

PS : I'd like to use two buffers since copying all the pixels using ARM's ldmia/stmia is quite slow...

Interrupts support and user mode

Is it possible to enable interrupts and the clock? Is there an MMU somewhere as there is user mode?
What's the clock interrupt number in the table?

Hello World

Hello 😄

Does anyone know how to / if it's possible to have a simple hello-world bootloader for the raspberry pi in ASM?

I was shocked enough to find that the rPi bootcode.bin is run on the videocore chip
