Comments (3)
This tutorial covers the shift-and-insert technique used by this function:
https://community.arm.com/groups/processors/blog/2010/09/01/coding-for-neon--part-4-shifting-left-and-right
This is a port of the row_neon64.cc version of this function.
So the differences between 32 and 64 bit are minor.
This is the existing 64 bit code:
#define ARGBTORGB565 \
  "shll v0.8h, v22.8b, #8 \n" /* R */ \
  "shll v21.8h, v21.8b, #8 \n" /* G */ \
  "shll v20.8h, v20.8b, #8 \n" /* B */ \
  "sri v0.8h, v21.8h, #5 \n" /* RG */ \
  "sri v0.8h, v20.8h, #11 \n" /* RGB */
This is the 32 bit port:
#define ARGBTORGB565 \
  "vshll.u8 q0, d22, #8 \n" /* R */ \
  "vshll.u8 q8, d21, #8 \n" /* G */ \
  "vshll.u8 q9, d20, #8 \n" /* B */ \
  "vsri.16 q0, q8, #5 \n" /* RG */ \
  "vsri.16 q0, q9, #11 \n" /* RGB */
vsri shifts the source register right by an immediate and inserts the result
into the destination, leaving the top bits of the destination unchanged.
e.g.
q0 rrrr_rrrr_0000_0000
q8 gggg_gggg_0000_0000
vsri.16 q0, q8, #5
shifts q8 (G) right by 5
q8 0000_0ggg_gggg_g000
then inserts it into q0, keeping the top 5 bits of q0
q0 rrrr_rggg_gggg_g000
vsri.16 q0, q9, #11
then takes B
q9 bbbb_bbbb_0000_0000
shifts it right by 11
q9 0000_0000_000b_bbbb
and inserts it into q0, keeping the top 11 bits of q0
q0 rrrr_rggg_gggb_bbbb
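The two vsri steps above can be checked with a scalar model. This is a minimal C sketch, not libyuv code: sri16 and argb_to_rgb565 are hypothetical helper names, and each models a single 16-bit lane rather than a full 8-pixel NEON register.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of "vsri.16 dst, src, #n" (1 <= n <= 16).
 * src is shifted right by n and written into the low 16 - n bits of dst;
 * the top n bits of dst are left unchanged.
 * Hypothetical helper for illustration only. */
uint16_t sri16(uint16_t dst, uint16_t src, unsigned n) {
  uint16_t keep = (uint16_t)(0xFFFFu << (16 - n)); /* top n bits of dst */
  return (uint16_t)((dst & keep) | (src >> n));
}

/* One pixel of the 32 bit ARGBTORGB565 sequence, lane by lane.
 * Hypothetical helper; the register names in comments follow the macro. */
uint16_t argb_to_rgb565(uint8_t b, uint8_t g, uint8_t r) {
  uint16_t q0 = (uint16_t)(r << 8); /* vshll.u8 q0, d22, #8 : R */
  uint16_t q8 = (uint16_t)(g << 8); /* vshll.u8 q8, d21, #8 : G */
  uint16_t q9 = (uint16_t)(b << 8); /* vshll.u8 q9, d20, #8 : B */
  q0 = sri16(q0, q8, 5);            /* vsri.16 q0, q8, #5   : RG */
  q0 = sri16(q0, q9, 11);           /* vsri.16 q0, q9, #11  : RGB */
  return q0;
}
```

For example, argb_to_rgb565(0x56, 0x34, 0x12) gives 0x11AA, which matches ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3).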
If 4444 were done the same way, it would be 7 instructions, same as it is now.
The current code:
#define ARGBTOARGB4444 \
  "vshr.u8 d20, d20, #4 \n" /* B */ \
  "vbic.32 d21, d21, d4 \n" /* G */ \
  "vshr.u8 d22, d22, #4 \n" /* R */ \
  "vbic.32 d23, d23, d4 \n" /* A */ \
  "vorr d0, d20, d21 \n" /* BG */ \
  "vorr d1, d22, d23 \n" /* RA */ \
  "vzip.u8 d0, d1 \n" /* BGRA */
If done with vsri on 16 bit lanes, the same way as 565:
#define ARGBTOARGB4444 \
  "vshll.u8 q0, d23, #8 \n" /* A */ \
  "vshll.u8 q8, d22, #8 \n" /* R */ \
  "vshll.u8 q9, d21, #8 \n" /* G */ \
  "vshll.u8 q10, d20, #8 \n" /* B */ \
  "vsri.16 q0, q8, #4 \n" /* AR */ \
  "vsri.16 q0, q9, #8 \n" /* ARG */ \
  "vsri.16 q0, q10, #12 \n" /* ARGB */
But it could be done on 8 bit values:
#define ARGBTOARGB4444 \
  "vsri.8 d23, d22, #4 \n" /* AR */ \
  "vsri.8 d21, d20, #4 \n" /* GB */ \
  "vzip.u8 d21, d23 \n" /* ARGB */ \
  "vmov d0, d21 \n" \
  "vmov d1, d23 \n"
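The 8 bit variant can be modeled per pixel in the same way. This is a rough C sketch under stated assumptions, not libyuv code: sri8 and argb_to_argb4444 are hypothetical names, and the final OR assumes a little-endian target, where vzip.u8 d21, d23 leaves the GB byte in the low half of each 16-bit pixel and the AR byte in the high half.

```c
#include <stdint.h>

/* Scalar model of one 8-bit lane of "vsri.8 dst, src, #n" (1 <= n <= 8).
 * The top n bits of dst are kept; the rest come from src >> n.
 * Hypothetical helper for illustration only. */
uint8_t sri8(uint8_t dst, uint8_t src, unsigned n) {
  uint8_t keep = (uint8_t)(0xFFu << (8 - n)); /* top n bits of dst */
  return (uint8_t)((dst & keep) | (src >> n));
}

/* One pixel of the proposed 8 bit ARGBTOARGB4444 sequence.
 * Hypothetical helper; the register names in comments follow the macro. */
uint16_t argb_to_argb4444(uint8_t b, uint8_t g, uint8_t r, uint8_t a) {
  uint8_t ar = sri8(a, r, 4); /* vsri.8 d23, d22, #4 : aaaa_rrrr */
  uint8_t gb = sri8(g, b, 4); /* vsri.8 d21, d20, #4 : gggg_bbbb */
  /* vzip.u8 d21, d23 interleaves GB and AR bytes; read as a
   * little-endian 16-bit value, AR is the high byte. */
  return (uint16_t)((ar << 8) | gb);
}
```

For example, argb_to_argb4444(0x56, 0x34, 0x12, 0xFF) gives 0xF135: the top nibble of each channel, packed as A, R, G, B.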
Original comment by [email protected]
on 25 Feb 2016 at 1:25
from libyuv.
The following revision refers to this bug:
https://chromium.googlesource.com/libyuv/libyuv.git/+/ee99b85126aeafe64ba3da8f28aafcac80a595ac
commit ee99b85126aeafe64ba3da8f28aafcac80a595ac
Author: Frank Barchard <[email protected]>
Date: Mon Feb 29 20:22:25 2016
Port ARGBToRGB565 from aarch64 neon to 32 bit
The 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts
and inserts, saving some masking. The instruction is available for neon 32 bit
as well.
[email protected], [email protected]
BUG=libyuv:571
Review URL: https://codereview.chromium.org/1724393002 .
[modify]
https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/README.chromium
[modify]
https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/include/libyuv/version.h
[modify]
https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/source/row_neon.cc
[modify]
https://crrev.com/ee99b85126aeafe64ba3da8f28aafcac80a595ac/source/row_neon64.cc
Original comment by [email protected]
on 29 Feb 2016 at 8:22
Original comment by [email protected]
on 29 Feb 2016 at 8:31
- Changed state: Fixed