
Comments (3)

GoogleCodeExporter commented on August 13, 2024
This tutorial covers this particular function
https://community.arm.com/groups/processors/blog/2010/09/01/coding-for-neon--part-4-shifting-left-and-right

This is a port of the row_neon64.cc version of this function.
So the differences between 32 and 64 bit are minor.

This is the existing 64 bit code:
#define ARGBTORGB565                                             \
    "shll       v0.8h,  v22.8b, #8             \n"  /* R   */    \
    "shll       v21.8h, v21.8b, #8             \n"  /* G   */    \
    "shll       v20.8h, v20.8b, #8             \n"  /* B   */    \
    "sri        v0.8h,  v21.8h, #5             \n"  /* RG  */    \
    "sri        v0.8h,  v20.8h, #11            \n"  /* RGB */

This is the 32 bit port:
#define ARGBTORGB565                                             \
    "vshll.u8    q0, d22, #8                   \n"  /* R   */    \
    "vshll.u8    q8, d21, #8                   \n"  /* G   */    \
    "vshll.u8    q9, d20, #8                   \n"  /* B   */    \
    "vsri.16     q0, q8, #5                    \n"  /* RG  */    \
    "vsri.16     q0, q9, #11                   \n"  /* RGB */


vsri shifts the source register right by an immediate and inserts the result
into the destination, keeping the top bits of the destination.
e.g. for one 16 bit lane
q0 rrrr_rrrr_0000_0000
q8 gggg_gggg_0000_0000

vsri.16     q0, q8, #5
shifts q8 (g) right by 5
q8 >> 5   0000_0ggg_gggg_g000
then inserts it into q0, keeping the top 5 bits of q0
q0 rrrr_rggg_gggg_g000

vsri.16     q0, q9, #11
then takes B
q9 bbbb_bbbb_0000_0000
shifts it right by 11
q9 >> 11  0000_0000_000b_bbbb
and inserts it into q0, keeping the top 11 bits of q0
q0 rrrr_rggg_gggb_bbbb
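
For checking the bit patterns above without a NEON device, here is a minimal scalar sketch in C of what one 16 bit lane goes through. The helper names sri16 and argb_to_rgb565 are made up for this illustration and are not libyuv functions; the model assumes shift counts in the range 1 to 15, which covers the two used here.

```c
#include <stdint.h>

/* Scalar model of one 16 bit lane of "vsri.16 dst, src, #shift":
 * the top `shift` bits of dst are kept, the remaining low bits are
 * replaced by src >> shift. Hypothetical helper, not part of libyuv. */
static uint16_t sri16(uint16_t dst, uint16_t src, unsigned shift) {
  uint16_t keep = (uint16_t)~(0xFFFFu >> shift); /* mask of the top bits */
  return (uint16_t)((dst & keep) | (src >> shift));
}

/* Mirrors the macro: three widening shifts (vshll.u8 #8), then two
 * shift-right-and-insert steps (vsri.16 #5 and #11). */
static uint16_t argb_to_rgb565(uint8_t r, uint8_t g, uint8_t b) {
  uint16_t r16 = (uint16_t)(r << 8); /* rrrr_rrrr_0000_0000 */
  uint16_t g16 = (uint16_t)(g << 8); /* gggg_gggg_0000_0000 */
  uint16_t b16 = (uint16_t)(b << 8); /* bbbb_bbbb_0000_0000 */
  uint16_t out = sri16(r16, g16, 5); /* rrrr_rggg_gggg_g000 */
  out = sri16(out, b16, 11);         /* rrrr_rggg_gggb_bbbb */
  return out;
}
```

The result matches the usual (r >> 3) << 11 | (g >> 2) << 5 | (b >> 3) packing, with the masking done implicitly by the insert.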

If 4444 were done the same way, it would be 7 instructions, same as it is now.

Now
#define ARGBTOARGB4444                                           \
    "vshr.u8    d20, d20, #4                   \n"  /* B    */   \
    "vbic.32    d21, d21, d4                   \n"  /* G    */   \
    "vshr.u8    d22, d22, #4                   \n"  /* R    */   \
    "vbic.32    d23, d23, d4                   \n"  /* A    */   \
    "vorr       d0, d20, d21                   \n"  /* BG   */   \
    "vorr       d1, d22, d23                   \n"  /* RA   */   \
    "vzip.u8    d0, d1                         \n"  /* BGRA */
if done with vsri
#define ARGBTOARGB4444                                           \
    "vshll.u8    q0, d23, #8                   \n"  /* A    */   \
    "vshll.u8    q8, d22, #8                   \n"  /* R    */   \
    "vshll.u8    q9, d21, #8                   \n"  /* G    */   \
    "vshll.u8    q10, d20, #8                  \n"  /* B    */   \
    "vsri.16     q0, q8, #4                    \n"  /* AR   */   \
    "vsri.16     q0, q9, #8                    \n"  /* ARG  */   \
    "vsri.16     q0, q10, #12                  \n"  /* ARGB */

but it could be done directly on the 8 bit values, in 5 instructions
#define ARGBTOARGB4444                                           \
    "vsri.8      d23, d22, #4                  \n"  /* AR   */   \
    "vsri.8      d21, d20, #4                  \n"  /* GB   */   \
    "vzip.u8     d21, d23                      \n"  /* ARGB */   \
    "vmov        d0, d21                       \n"               \
    "vmov        d1, d23                       \n"
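
As a rough check of the 8 bit variant, here is a scalar sketch in C of what one pixel goes through. The names sri8 and argb_to_argb4444 are invented for this illustration, not libyuv functions, and the model assumes the 16 bit result is read as little-endian, so the GB byte lands in the low half, which is the order vzip.u8 d21, d23 produces in memory.

```c
#include <stdint.h>

/* Scalar model of one 8 bit lane of "vsri.8 dst, src, #shift":
 * keeps the top `shift` bits of dst and fills the rest with src >> shift.
 * Hypothetical helper, not part of libyuv. */
static uint8_t sri8(uint8_t dst, uint8_t src, unsigned shift) {
  uint8_t keep = (uint8_t)~(0xFFu >> shift); /* mask of the top bits */
  return (uint8_t)((dst & keep) | (src >> shift));
}

/* Two inserts build the AR and GB bytes; the zip step is modeled by
 * placing GB in the low byte and AR in the high byte of the result. */
static uint16_t argb_to_argb4444(uint8_t a, uint8_t r, uint8_t g, uint8_t b) {
  uint8_t ar = sri8(a, r, 4); /* aaaa_rrrr */
  uint8_t gb = sri8(g, b, 4); /* gggg_bbbb */
  return (uint16_t)((ar << 8) | gb);
}
```

Each channel keeps its top 4 bits, the same result the vshr/vbic/vorr sequence gives, without needing the mask register.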


Original comment by [email protected] on 25 Feb 2016 at 1:25

from libyuv.

GoogleCodeExporter commented on August 13, 2024
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/ee99b85126aeafe64ba3da8f28aafcac80a595ac

commit ee99b85126aeafe64ba3da8f28aafcac80a595ac
Author: Frank Barchard <[email protected]>
Date: Mon Feb 29 20:22:25 2016

Port ARGBToRGB565 from aarch64 neon to 32 bit

Port the 64 bit version of ARGBToRGB565 to 32 bit. The 64 bit version uses sri,
which shifts and inserts, saving some masking. The instruction is available for
neon 32 bit as well.

[email protected], [email protected]
BUG=libyuv:571

Review URL: https://codereview.chromium.org/1724393002 .

[modify] https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/README.chromium
[modify] https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/include/libyuv/version.h
[modify] https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/source/row_neon.cc
[modify] https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/source/row_neon64.cc

Original comment by [email protected] on 29 Feb 2016 at 8:22


GoogleCodeExporter commented on August 13, 2024

Original comment by [email protected] on 29 Feb 2016 at 8:31

  • Changed state: Fixed

