arm-software / tool-solutions
Tutorials & examples for Arm software development tools.
License: Apache License 2.0
While building the Docker image, the build fails with a permission-denied error at:
# Build numpy from source, using OpenBLAS for BLAS calls
COPY scripts/build-numpy.sh $PACKAGE_DIR/.
COPY patches/site.cfg $PACKAGE_DIR/site.cfg
RUN $PACKAGE_DIR/build-numpy.sh
Solved by adding execute permission to build-numpy.sh after copying it:
# Build numpy from source, using OpenBLAS for BLAS calls
COPY scripts/build-numpy.sh $PACKAGE_DIR/.
COPY patches/site.cfg $PACKAGE_DIR/site.cfg
RUN sudo chmod +x $PACKAGE_DIR/build-numpy.sh
RUN $PACKAGE_DIR/build-numpy.sh
The same fix is needed for scipy.
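An analogous change for the scipy stage might look like the fragment below (a sketch only: the script and path names follow the numpy example above rather than the actual Dockerfile):

```dockerfile
# Build scipy from source, using OpenBLAS for BLAS calls
COPY scripts/build-scipy.sh $PACKAGE_DIR/.
RUN chmod +x $PACKAGE_DIR/build-scipy.sh && $PACKAGE_DIR/build-scipy.sh
```

Alternatively, `RUN bash $PACKAGE_DIR/build-scipy.sh` sidesteps the execute bit entirely, since the script is then read by the interpreter rather than executed directly.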
While running build.sh:
Reading state information... Done
Package android-tools-fsutils is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
android-sdk-libsparse-utils android-sdk-ext4-utils
E: Package 'android-tools-fsutils' has no installation candidate
Hi,
We are trying to install Ubuntu on the Hikey960 using the instructions at the link below.
https://github.com/ARM-software/Tool-Solutions/tree/master/ml-tool-examples/hikey960-ubuntu
After recovery completes successfully, fastboot.sh is executed and we see the error below. Can you please check this issue?
% ./fastboot.sh
target reported max download size of 134217728 bytes
sending 'ptable' (196 KB)...
OKAY [ 0.102s]
writing 'ptable'...
OKAY [ 0.016s]
finished. total time: 0.119s
target reported max download size of 134217728 bytes
sending 'xloader' (151 KB)...
OKAY [ 0.042s]
writing 'xloader'...
OKAY [ 0.281s]
finished. total time: 0.323s
target reported max download size of 134217728 bytes
sending 'fastboot' (3346 KB)...
OKAY [ 0.129s]
writing 'fastboot'...
FAILED (remote: No such partition.)
finished. total time: 0.130s
Hi,
scaling_max_freq on the Hikey960 shows 903 MHz for the big cores and 533 MHz for the little cores. How can scaling_max_freq be set to the other available frequencies on the Hikey?
I have tried: echo 1421000 > /sys/devices/system/cpu/cpu7/cpufreq/scaling_max_freq
and: cpufreq-set -c 7 -u 1421000
but neither worked.
I also tried setting the governor to performance mode, but the CPUs still run at 903 MHz/533 MHz.
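Before writing to scaling_max_freq, it can help to check which frequencies and governors the kernel actually exposes; if 1421000 is not listed, writes of that value are typically clamped by the driver. A small sketch using the standard cpufreq sysfs layout (it degrades gracefully if a file is absent):

```shell
# Inspect what cpufreq exposes for one CPU before trying to raise scaling_max_freq.
CPU=7
for f in cpuinfo_max_freq scaling_available_frequencies scaling_max_freq scaling_available_governors; do
  p="/sys/devices/system/cpu/cpu$CPU/cpufreq/$f"
  if [ -r "$p" ]; then
    printf '%s: %s\n' "$f" "$(cat "$p")"
  else
    printf '%s: not exposed on this kernel\n' "$f"
  fi
done
```

If cpuinfo_max_freq itself reads 903000/533000, the limit is coming from below cpufreq (firmware, thermal, or an OPP table), and no governor setting will raise it.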
Driven by curiosity, I am trying to build arm64 TensorFlow with the Dockerfile on Apple M1 silicon.
The error occurs when building scipy,
with the command:
sudo ./build.sh --build-type tensorflow --tf_version 1 --jobs 4
build-tfv1.log
#39 257.8 from scipy/sparse/sparsetools/sparsetools.cxx:34:
#39 257.8 /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#39 257.8 17 | #warning "Using deprecated NumPy API, disable it with " \
#39 257.8 | ^~~~~~~
#39 258.2 g++: scipy/sparse/sparsetools/other.cxx
#39 258.2 In file included from /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/core/include/numpy/ndarraytypes.h:1830,
#39 258.2 from /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/core/include/numpy/ndarrayobject.h:12,
#39 258.2 from scipy/sparse/sparsetools/sparsetools.h:5,
#39 258.2 from scipy/sparse/sparsetools/other.cxx:4:
#39 258.2 /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#39 258.2 17 | #warning "Using deprecated NumPy API, disable it with " \
#39 258.2 | ^~~~~~~
#39 849.7 g++: fatal error: Killed signal terminated program cc1plus
#39 849.7 compilation terminated.
#39 885.2 Running from scipy source directory.
#39 885.2 /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/distutils/system_info.py:782: UserWarning: Specified path /home/ubuntu/python3-venv/include/python3.7m is invalid.
#39 885.2 return self.get_paths(self.section, key)
#39 885.2 /home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/distutils/system_info.py:782: UserWarning: Specified path /usr/include/python3.7m is invalid.
#39 885.2 return self.get_paths(self.section, key)
#39 885.3 error: Command "g++ -DNDEBUG -g -fwrapv -O3 -Wall -mcpu=native -moutline-atomics -O3 -fPIC -D__STDC_FORMAT_MACROS=1 -Iscipy/sparse/sparsetools -I/home/ubuntu/python3-venv/lib/python3.7/site-packages/numpy-1.17.1-py3.7-linux-aarch64.egg/numpy/core/include -I/home/ubuntu/python3-venv/include -I/usr/local/include/python3.7m -c scipy/sparse/sparsetools/bsr.cxx -o build/temp.linux-aarch64-3.7/scipy/sparse/sparsetools/bsr.o -MMD -MF build/temp.linux-aarch64-3.7/scipy/sparse/sparsetools/bsr.o.d" failed with exit status 1
#39 ERROR: executor failed running [/bin/sh -c $PACKAGE_DIR/build-scipy.sh]: exit code: 1
------
> [tensorflow-tools 12/19] RUN /home/ubuntu//packages/build-scipy.sh:
------
executor failed running [/bin/sh -c $PACKAGE_DIR/build-scipy.sh]: exit code: 1
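"g++: fatal error: Killed signal terminated program cc1plus" in the log above is the classic signature of the compiler being killed by the OOM reaper; on an M1 host the Docker VM often has a small default memory limit. A hedged first check and retry (the --jobs flag is the same one used in the build command above):

```shell
# How much memory does the build VM actually have? cc1plus getting "Killed"
# usually means this number is too small for the requested parallelism.
grep MemTotal /proc/meminfo
# Retry with fewer parallel compile jobs (and/or raise the Docker memory limit):
#   sudo ./build.sh --build-type tensorflow --tf_version 1 --jobs 2
```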
When I am running inference with libtorch from armswdev/pytorch-arm-neoverse:r23.11-torch-2.1.0-onednn-acl, the process crashes with illegal hardware instruction (core dumped).
armswdev/pytorch-arm-neoverse:r23.11-torch-2.1.0-openblas works, and a while ago armswdev/pytorch-arm-neoverse:r23.06-torch-2.0.0-onednn-acl also worked. I haven't tested it against other images.
lscpu
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: ARM
Model name: Neoverse-N1
Model: 1
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Stepping: r3p1
BogoMIPS: 50.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; __user pointer sanitization
Spectre v2: Mitigation; CSV2, BHB
Srbds: Not affected
Tsx async abort: Not affected
I pulled the repo and built the PyTorch container on Graviton3/Ubuntu (command as below), but got the following error. What's the problem?
#19 1.707 /usr/bin/ld: /usr/lib/gcc/aarch64-linux-gnu/7/../../../aarch64-linux-gnu/libc.a(dl-sym.o): in function `do_sym':
#19 1.707 (.text+0x288): undefined reference to `__aarch64_swp4_rel'
> [pytorch-libs 4/10] RUN packages/build-arm_opt_routines.sh:
executor failed running [/bin/sh -c $PACKAGE_DIR/build-arm_opt_routines.sh]: exit code: 2
I did a native build of Arm NN with the script, but the UnitTests fail. The platform is an Ultra96 (Armv8) with Ubuntu 18.04.
xilinx@pynq:~/armnn-devenv/armnn/build$ ./UnitTests
Warning gator_func(/home/xilinx/armnn-devenv/gator/annotate/streamline_annotate.c:496): Not connected to gatord, the application will run normally but Streamline will not collect annotations. To collect annotations, please verify you are running gatord 5.24 or later and that SELinux is disabled.
Running 758 test cases...
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "NeonLayerSupport/IsLayerSupportedFloat16Neon": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(471): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "NeonLayerSupport/IsLayerSupportedFloat32Neon": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(471): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "NeonLayerSupport/IsLayerSupportedUint8Neon": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(471): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "RefLayerSupported/IsLayerSupportedLayerTypeMatches": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(544): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "RefLayerSupported/IsLayerSupportedFloat16Reference": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(471): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "RefLayerSupported/IsLayerSupportedFloat32Reference": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(443): last checkpoint
UnitTests: /home/xilinx/armnn-devenv/armnn/src/armnn/InternalTypes.cpp:49: const char* armnn::GetLayerTypeAsCString(armnn::LayerType): Assertion `(false)&&("Unknown layer type")' failed.
unknown location(0): fatal error: in "RefLayerSupported/IsLayerSupportedUint8Reference": signal: SIGABRT (application abort requested)
/home/xilinx/armnn-devenv/armnn/src/backends/backendsCommon/test/IsLayerSupportedTestImpl.hpp(471): last checkpoint
*** 7 failures are detected in the test module "UnitTests"
Hi,
I am trying to see if there are instructions to install the latest version of Ubuntu on the Hikey960 board. I have also noticed that the instructions to install Ubuntu on the Hikey960 are no longer available at the link - https://github.com/ARM-software/Tool-Solutions/tree/master/ml-tool-examples/hikey960-ubuntu
We were able to install Ubuntu 16.04 earlier with these instructions. I am trying to check whether we can install Ubuntu 20.04 or higher on the Hikey960 board.
Thanks,
Srihari
Simple typo in tensorflow-aarch64/build.sh - onddnn should read onednn:
openblas )
onddnn="openblas"
shift
;;
The default cache configuration of the Eigen library does not match the N1. To get the best performance out of Eigen, the cache-size configuration in "eigen/Eigen/src/Core/products/GeneralBlockPanelKernel.h" must match the hardware.
By default L1 is 16 KB (const std::ptrdiff_t defaultL1CacheSize); it should be 64 KB for N1.
This also applies to L2 and L3; they must be set per the hardware configuration.
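Since those defaults sit behind #ifndef guards in GeneralBlockPanelKernel.h, one way to try N1-sized values without patching the header is to override the macros on the compile line. A sketch (the macro names come from the Eigen source; my_kernel.cpp is a placeholder for your translation unit, and the sizes should be adjusted to the actual part):

```shell
# Override Eigen's default blocking cache sizes at compile time
# (values here assume Neoverse N1: 64 KiB L1d, 1 MiB L2).
g++ -O3 -mcpu=native \
    -DEIGEN_DEFAULT_L1_CACHE_SIZE=$((64*1024)) \
    -DEIGEN_DEFAULT_L2_CACHE_SIZE=$((1024*1024)) \
    my_kernel.cpp -o my_kernel
```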
Hi,
Docker Tags:
r23.09-torch-2.0.0-onednn-acl
r23.05-torch-2.0.0-onednn-acl
I am unable to get ACL calls in Docker versions higher than 23.05 for PyTorch Hugging Face models.
Attaching oneDNN verbose calls for BERT model here
23.05_Bert_Verbose.txt
23.09_Bert_Verbose.txt
The code to reproduce this is attached as below:
PyT_Bert_Training.txt --> Use this for the first run to generate necessary inference checkpoints and files.
PyT_Bert_Inf.txt --> For subsequent runs to generate the oneDNN logs
Also, as a result, the later oneDNN verbose log shows gemm:jit calls for matmuls, which results in poor inference performance compared to gemm:acl calls.
Thanks
EIGEN_DEFAULT_L?_CACHE_SIZE are not set if TensorFlow is built with oneDNN.
But the Eigen gebp kernel can still be called even when oneDNN is enabled, e.g. when TensorFlow is started with TF_ENABLE_ONEDNN_OPTS=0.
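Forcing the Eigen path for an A/B comparison is just the environment toggle mentioned above, set before launching TensorFlow:

```shell
# Route matmuls through Eigen's gebp kernels rather than oneDNN for comparison.
export TF_ENABLE_ONEDNN_OPTS=0
```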
The SciPy version specified in the Dockerfile for TensorFlow aarch64 is more than 1.5 years old as of this writing. Even scipy-pandas requires at least 1.5.1. I successfully compiled SciPy 1.6.3 from source without issue.
I think it's time to bump its version number in the Dockerfile.
I am trying to build the TensorFlow Docker image on an aarch64 device, and it is not able to find ONEDNN_VERSION (it appears to be blank):
Step 36/101 : RUN $PACKAGE_DIR/build-onednn.sh
---> Running in 6b6eb5dd3985
oneDNN VERSION
Cloning into 'mkl-dnn'...
fatal: 'v' is not a commit and a branch 'v' cannot be created from it
The command '/bin/sh -c $PACKAGE_DIR/build-onednn.sh' returned a non-zero code: 128
ERROR: Job failed: command terminated with exit code 128
The command that is failing is in scripts/build-onednn.sh:
cd $PACKAGE_DIR
readonly package=onednn
readonly version=$ONEDNN_VERSION
readonly tf_id=$TF_VERSION_ID
readonly src_host=https://github.com/intel
readonly src_repo=mkl-dnn
# Clone oneDNN
echo "oneDNN VERSION" $version
git clone ${src_host}/${src_repo}.git
cd ${src_repo}
git checkout v$version -b v$version
any advice?
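A defensive guard in build-onednn.sh would turn the blank-version case into an immediate, readable failure instead of the confusing git error. A sketch (2.7.1 is only a stand-in for whatever value the Dockerfile should be passing through its ARG/ENV plumbing):

```shell
# Simulate the value that should arrive via the Dockerfile's ARG/ENV plumbing.
ONEDNN_VERSION=2.7.1
# Abort early with a clear message if the variable is empty or unset.
: "${ONEDNN_VERSION:?ONEDNN_VERSION is empty - check the ARG/ENV in the Dockerfile}"
version=$ONEDNN_VERSION
echo "oneDNN VERSION $version"   # would then: git checkout "v$version" -b "v$version"
```

With the guard in place, the build stops at the echo with an explicit message rather than failing later inside git with "'v' is not a commit".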
I get this error when running torch on Arm in the Docker image python:3.11-slim:
OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option.
When running the same code on x86 and the same Docker image it works without errors:
This is the relevant code:
import torch
from diffusers import DiffusionPipeline, KDPM2DiscreteScheduler
...
pipe = DiffusionPipeline.from_pretrained(model, custom_pipeline = "lpw_stable_diffusion", torch_dtype = torch.float32, safety_checker = dummy_checker)
pipe.scheduler = KDPM2DiscreteScheduler.from_config(pipe.scheduler.config, use_karras_sigmas = True, algorithm_type="sde-dpmsolver++")
...
image = pipe.text2img(prompt, negative_prompt = negative_prompt, width = IMAGE_WIDTH, height = IMAGE_HEIGHT, max_embeddings_multiples = 3, num_inference_steps = INFERENCE_STEPS).images[0]
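One commonly suggested mitigation for that warning, short of rebuilding OpenBLAS with USE_OPENMP=1, is to keep OpenBLAS single-threaded so its own thread pool cannot collide with the OpenMP regions torch runs. A sketch (the thread counts are placeholders to tune, and the script name is hypothetical):

```shell
# Let torch's own OpenMP pool do the threading; stop OpenBLAS spawning its own.
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=4        # tune to the core count of the target machine
# then run the pipeline as before, e.g.:
#   python run_pipeline.py      # hypothetical entry point
```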
This example doesn't seem to work any more. The following command ./build_tflite_micro_app.sh -t m55+u55 -u img_class -i cat
yields the following error:
Any ideas how to fix this?
Best regards,
Name: Carlo Grisafi
Email: [email protected]
With QNAP Docker Station and this image https://hub.docker.com/r/linaro/tensorflow-arm-neoverse-n1 I get:
unauthorized: authentication required
I'm unable to install software using apt with the image armswdev/tensorflow-arm-neoverse. I tried multiple solutions I found online to remove the lock, but there is no root access on the image. How can I install additional packages?
Using tensorflow/tensorflow as the base image creates the image successfully with no workarounds.
Hi,
I have installed Ubuntu on the Hikey960 using the instructions at this link and I'm able to get my board up and working fine.
But I'm facing performance issues when using OpenMP in my application, specifically on the Hikey960 board. I have written a basic for loop using OpenMP constructs as below:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(){
    int size = 20000000;
    double milliseconds;
    struct timeval ts1, ts2;
    /* Inputs are left uninitialized on purpose; this is a throughput test. */
    int* ip1 = (int*)malloc(sizeof(int)*size);
    int* ip2 = (int*)malloc(sizeof(int)*size);
    int* op  = (int*)malloc(sizeof(int)*size);
    gettimeofday(&ts1, NULL);
    #pragma omp parallel for
    for(int i = 0; i < size; i++){
        op[i] = ip1[i]*ip2[i];
    }
    gettimeofday(&ts2, NULL);
    milliseconds  = (ts2.tv_sec - ts1.tv_sec) * 1000.0;
    milliseconds += (ts2.tv_usec - ts1.tv_usec) / 1000.0;
    printf("Time taken: %f\n", milliseconds);
    free(ip1); free(ip2); free(op);
    return 0;
}
I have observed that the application using OpenMP performs slower than without OpenMP on the Hikey960. The same OpenMP application runs faster on other Arm platforms such as the Raspberry Pi.
Could you please let us know why OpenMP applications run slower on the Hikey960, or is there anything I need to configure on this board?
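One thing worth ruling out on the Hikey960 specifically is thread placement: it is a big.LITTLE part (4x Cortex-A73 + 4x Cortex-A53), and OpenMP threads migrating across clusters can easily erase the parallel speedup on a memory-bound loop like this. A sketch of pinning the run (the "cpu4-7 = big cluster" numbering is an assumption; check it on the board):

```shell
# Keep the OpenMP threads where they start, on one cluster.
export OMP_NUM_THREADS=4
export OMP_PROC_BIND=close
export OMP_PLACES=cores
# then run pinned to the big cluster, e.g.:
#   taskset -c 4-7 ./omp_loop   # hypothetical binary built with gcc -O2 -fopenmp
```

Comparing the pinned big-cluster, pinned little-cluster, and unpinned runs should show quickly whether migration is the cause.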
Any help would be highly appreciated.
Thanks,
Hari
Hi
is it possible for the governor or userspace to "request" a frequency higher than the value of scaling_max_freq? And if yes, what would happen then?
Thank you for all the help!
I tried building the container image for tensorflow-aarch64 using the instructions provided in the repo; however, the build is failing with the error below:
#52 63.36 Analyzing: 2 targets (2 packages loaded, 0 targets configured)
#52 63.47 ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/bazel_tools/platforms/BUILD:34:6: in alias rule @bazel_tools//platforms:aarch64: Constraints from @bazel_tools//platforms have been removed. Please use constraints from @platforms repository embedded in Bazel, or preferably declare dependency on https://github.com/bazelbuild/platforms. See https://github.com/bazelbuild/bazel/issues/8622 for details.
#52 63.47 ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/bazel_tools/platforms/BUILD:84:6: in alias rule @bazel_tools//platforms:linux: Constraints from @bazel_tools//platforms have been removed. Please use constraints from @platforms repository embedded in Bazel, or preferably declare dependency on https://github.com/bazelbuild/platforms. See https://github.com/bazelbuild/bazel/issues/8622 for details.
#52 63.47 ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/bazel_tools/platforms/BUILD:34:6: Analysis of target '@bazel_tools//platforms:aarch64' failed
#52 63.47 ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/bazel_tools/platforms/BUILD:84:6: Analysis of target '@bazel_tools//platforms:linux' failed
#52 63.47 ERROR: /home/ubuntu/packages/serving/tensorflow_serving/model_servers/BUILD:402:10: While resolving toolchains for target //tensorflow_serving/model_servers:tensorflow_model_server: Target @local_execution_config_platform//:platform was referenced as a platform, but does not provide PlatformInfo
#52 63.47 ERROR: /home/ubuntu/packages/serving/tensorflow_serving/tools/pip_package/BUILD:5:10: While resolving toolchains for target //tensorflow_serving/tools/pip_package:build_pip_package: Target @local_execution_config_platform//:platform was referenced as a platform, but does not provide PlatformInfo
#52 63.48 WARNING: errors encountered while analyzing target '//tensorflow_serving/model_servers:tensorflow_model_server': it will not be built
#52 63.48 WARNING: errors encountered while analyzing target '//tensorflow_serving/tools/pip_package:build_pip_package': it will not be built
#52 63.49 INFO: Analyzed 2 targets (8 packages loaded, 8 targets configured).
#52 63.49 INFO: Found 0 targets...
#52 63.63 [0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
#52 63.66 ERROR: command succeeded, but not all targets were analyzed
#52 63.69 INFO: Elapsed time: 60.264s, Critical Path: 0.03s
#52 63.69 INFO: 1 process: 1 internal.
#52 63.69 FAILED: Build did NOT complete successfully
#52 ERROR: process "/bin/sh -c $PACKAGE_DIR/build-tensorflow-serving.sh" did not complete successfully: exit code: 1
------
> [tensorflow-serving 9/10] RUN /home/ubuntu/packages/build-tensorflow-serving.sh:
#52 63.47 ERROR: /home/ubuntu/packages/serving/tensorflow_serving/tools/pip_package/BUILD:5:10: While resolving toolchains for target //tensorflow_serving/tools/pip_package:build_pip_package: Target @local_execution_config_platform//:platform was referenced as a platform, but does not provide PlatformInfo
#52 63.48 WARNING: errors encountered while analyzing target '//tensorflow_serving/model_servers:tensorflow_model_server': it will not be built
#52 63.48 WARNING: errors encountered while analyzing target '//tensorflow_serving/tools/pip_package:build_pip_package': it will not be built
#52 63.49 INFO: Analyzed 2 targets (8 packages loaded, 8 targets configured).
#52 63.49 INFO: Found 0 targets...
#52 63.63 [0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
#52 63.66 ERROR: command succeeded, but not all targets were analyzed
#52 63.69 INFO: Elapsed time: 60.264s, Critical Path: 0.03s
#52 63.69 INFO: 1 process: 1 internal.
#52 63.69 FAILED: Build did NOT complete successfully
While searching for this error online, I found this issue, and as per the recommendation provided there, I tried building the image with lower versions of Bazel (5.3.2, 5.3.1, 5.1.1, 4.2.0, 6.0.0, etc.), again with no luck, though it threw a different error with Bazel 5.x/4.x:
#52 64.15 Error: 'apple_common' value has no field or method 'objc_proto_aspect'
#52 65.09 ERROR: /home/ubuntu/packages/serving/WORKSPACE:44:10: error loading package '@com_github_grpc_grpc//': at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/com_github_grpc_grpc/bazel/grpc_build_system.bzl:28:6: at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/build_bazel_rules_apple/apple/ios.bzl:22:5: initialization of module 'apple/internal/testing/ios_rules.bzl' failed and referenced by '//external:grpc_lib'
#52 65.10 ERROR: /home/ubuntu/packages/serving/tensorflow_serving/model_servers/BUILD:242:11: error loading package '@com_github_grpc_grpc//': at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/com_github_grpc_grpc/bazel/grpc_build_system.bzl:28:6: at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/build_bazel_rules_apple/apple/ios.bzl:22:5: initialization of module 'apple/internal/testing/ios_rules.bzl' failed and referenced by '//tensorflow_serving/model_servers:grpc_status_util'
#52 65.10 ERROR: /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/org_tensorflow/tensorflow/BUILD:856:11: error loading package '@com_github_grpc_grpc//': at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/com_github_grpc_grpc/bazel/grpc_build_system.bzl:28:6: at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/build_bazel_rules_apple/apple/ios.bzl:22:5: initialization of module 'apple/internal/testing/ios_rules.bzl' failed and referenced by '@org_tensorflow//tensorflow:grpc++'
#52 65.10 ERROR: /home/ubuntu/packages/serving/WORKSPACE:44:10: error loading package '@com_github_grpc_grpc//src/compiler': at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/com_github_grpc_grpc/bazel/grpc_build_system.bzl:28:6: at /home/ubuntu/.cache/bazel/_bazel_ubuntu/f62f7389585d813e586b859fde16f5e8/external/build_bazel_rules_apple/apple/ios.bzl:22:5: initialization of module 'apple/internal/testing/ios_rules.bzl' failed and referenced by '//external:grpc_cpp_plugin'
#52 79.51 ERROR: command succeeded, but not all targets were analyzed
#52 79.54 INFO: Elapsed time: 75.033s, Critical Path: 8.21s
#52 79.54 INFO: 282 processes: 7 internal, 275 local.
#52 79.54 FAILED: Build did NOT complete successfully
#52 79.55 FAILED: Build did NOT complete successfully
#52 ERROR: process "/bin/sh -c $PACKAGE_DIR/build-tensorflow-serving.sh" did not complete successfully: exit code: 1
------
> [tensorflow-serving 9/10] RUN /home/ubuntu/packages/build-tensorflow-serving.sh:
#52 79.47 bazel-out/host/bin/external/com_google_protobuf/protoc '--python_out=bazel-out/aarch64-opt/bin/external/com_google_protobuf/python' -Iexternal/com_google_protobuf/python -Ibazel-out/aarch64-opt/bin/external/com_google_protobuf/python bazel-out/aarch64-opt/bin/external/com_google_protobuf/python/google/protobuf/any.proto)
Here are the commands that I executed on an Ubuntu 22.04 EC2 instance, which can also be used to reproduce this issue:
Any suggestion to get this fixed would be helpful.
Hi there,
Currently, the build armswdev/pytorch-arm-neoverse:r22.02-torch-1.10.0-openblas installs:
torch==1.10.0a0+git36449ea
torchvision==0.9.1
However, according to PyPI (https://pypi.org/project/torchvision/), these torch and torchvision versions are not compatible.
The v8-M architecture defines the address range 0x4000_0000–0x5FFF_FFFF as the Peripheral region in the system address map. Peripherals should normally be placed in this address range so they have the correct memory type (Device, nGnRE) and attributes (XN, non-cacheable). 0xA000_0000–0xDFFF_FFFF are also Device regions, but the Peripheral region is preferred.
Issue Description
The TensorFlow Docker images built from the r22.09 tag with onednn/acl, as well as those available on Docker Hub (https://hub.docker.com/r/armswdev/tensorflow-arm-neoverse:r22.09-tf-2.10.0-onednn-acl or latest), are producing incorrect results for the MLPerf ResNet50 model.
The last working tag was: tensorflow-pytorch-aarch64--r22.08
The TF 2.10 official wheel works fine, so the issue is with one of the staging patches maintained on top of TF 2.10:
https://github.com/ARM-software/Tool-Solutions/tree/main/docker/tensorflow-aarch64/patches
How to reproduce
docker pull armswdev/tensorflow-arm-neoverse
Follow this section to run MLPerf ResNet50 inference with the "--accuracy" option:
https://github.com/ARM-software/Tool-Solutions/blob/main/docker/tensorflow-aarch64/examples/README.md#mlcommons-tm-benchmarks
Could someone give me a hint on how to upgrade the Arm Compute Library version to 24.05? I want to test the performance on a Neoverse V2 CPU.
Thanks in advance.
Hi, I am trying to build Arm NN for an ONNX environment using your script. I modified your bash script to include the ONNX libraries, but I'm running into some issues. Do you suggest cutting out the TensorFlow part? I don't need TF for my purposes.
How would you modify your script to include the ONNX parser?
Thank you.
From the above code, we can see that if CBLAS is not enabled, an unoptimized matmul will be called by TensorFlow on aarch64, which causes performance degradation. So I think a fully optimized ACL matmul should be added to dnnl_sgemm to make full use of the aarch64 ISA and improve the performance of mkl_matmul (a TF op) on aarch64.
Hi! I am interested in using TensorFlow Serving to make predictions with a saved model.
I see that the build script in this repository supports building a TensorFlow Serving image. I was wondering if y’all could add a prebuilt TensorFlow Serving image to Docker Hub so that people don’t need to build their own images, similar to the full TensorFlow image.
Docker Container Version/Tag : r23.07-tf-2.12.0-onednn-acl
ARM System : Graviton 3 (c7g.8xlarge)
Architecture: aarch64
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: ARM
Model: 1
Thread(s) per core: 1
Core(s) per socket: 32
Caches (sum of all):
L1d: 2 MiB (32 instances)
L1i: 2 MiB (32 instances)
L2: 32 MiB (32 instances)
L3: 32 MiB (1 instance)
Intel System: Icelake (c6i.8xlarge)
Architecture: x86_64
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 6
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 768 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 20 MiB (16 instances)
L3: 54 MiB (1 instance)
As per this blog, ACL inference should be faster than on Intel systems for transformer models.
We ran Tensorflow Hugging Face BERT model for Inference (Python Code Attached as txt file here)
TF_bert_inf - Copy.txt
Below are results for inference speeds in seconds :
Env Variables | Graviton | Icelake |
---|---|---|
No Opts | 0.2294 | 0.145099 |
TF_ENABLE_ONEDNN_OPTS=1 | 0.2191 | 0.144636 |
ONEDNN_DEFAULT_FPMATH_MODE=BF16 | 1.49034 | 0.145511 |
From the results above, we see that the performance is almost 1.8x worse for ARM cores compared to Intel ones.
The code is run on 2 cores for both the Intel and ARM systems.
Another issue is enabling FPMATH mode to BF16 degrades the performance.
From the oneDNN logs, we see that when BF16 is enabled, there are overheads while executing reorder for ARM cores
Env Variables | Reorder Time (msecs) |
---|---|
TF_ENABLE_ONEDNN_OPTS=1 | 0.582031 |
ONEDNN_DEFAULT_FPMATH_MODE=BF16 | 11.1628 |
This is observed only for larger sized Matmul operations. Here the size was 768x768 and the reorder uses "simple:any" implementation instead of "jit:uni" in oneDNN.
Attaching oneDNN Verbose for both scenarios
Bert_TF12_issue_verbose_BF16.txt
Bert_TF12_issue_verbose_OPTS.txt
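For completeness, the two verbose captures above correspond to runs along these lines; only the environment differs between them, and the script name is a placeholder:

```shell
# Fast case: oneDNN opts on, default fp32 math mode.
export TF_ENABLE_ONEDNN_OPTS=1
export ONEDNN_VERBOSE=1
#   python tf_bert_inf.py > Bert_TF12_issue_verbose_OPTS.txt
# Slow case: additionally downconvert eligible fp32 matmuls to bf16.
export ONEDNN_DEFAULT_FPMATH_MODE=BF16
#   python tf_bert_inf.py > Bert_TF12_issue_verbose_BF16.txt
```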
Request your views and comments on whether we need any other settings to improve the performance
Hi, thanks for the great build-armnn.sh script for cross-compiling Arm NN on an x86 system or natively compiling it on an Arm Linux system.
I wonder if it is possible to cross-compile from an x86 host to get the Android version of the libraries? At first I thought it might be easy to modify build-armnn.sh (change the cross-compile toolchain to the Android one). However, I found that the gcc-arm tools (g++-aarch64-linux-gnu) are used for cross-compilation, while the NDK (the cross-compile toolchain for Android) currently supports clang only.
I'm not sure how to adapt the script to the NDK (r17, r18b, r19, or another version). It affects not only the Arm NN part but also the protobuf and Boost libraries. So, might I know whether it is possible to add a script for cross-compiling Arm NN targeting Android?
Thanks in advance!
I came across this repo from this blog post (which linked back to this git repository). I'm not sure whether this repo has any affiliation with the jasonrandrews Docker Hub account, so if not, I apologize.
When I try to import open CV on an ARM64 machine, I get an illegal instruction error:
$ sudo docker run --rm -it -v `pwd`:/app -w / jasonrandrews/tensorflow-v2:latest /bin/bash
ubuntu@f3bdbb4d4682:/$ $VENV_DIR/bin/python -c 'import cv2'
Illegal instruction (core dumped)
I'm running this on a publicly available AMI on EC2: ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-arm64-server-20190814 (ami-0fb0129cd568fe35f). Interestingly, if I run the exact same sequence of commands on my MacBook, I get no such error.
I followed the instructions at the link below to install Ubuntu on the Hikey960.
https://github.com/ARM-software/Tool-Solutions/tree/master/ml-tool-examples/hikey960-ubuntu
I am trying to connect to the new Hikey board with ssh. After every reboot, I have to manually create the /var/run/sshd folder and then start the ssh service on the Hikey960 board. Without these commands, ssh to the Hikey board does not work.
Are there any "ready packages" for the object detection API on Arm64? I installed all the needed dependencies and performed the steps in your Docker image, but I am getting the following errors:
2022-08-04 16:58:51.293759: I tensorflow/core/util/util.cc:175] Experimental oneDNN custom operations are on. If you experience issues, please turn them off by setting the environment variable TF_ENABLE_ONEDNN_OPTS=0.
/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/init.py:98: UserWarning: unable to load libtensorflow_io_plugins.so: unable to open file: libtensorflow_io_plugins.so, from paths: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so']
caused by: ["[Errno 2] The file to load file system plugin from does not exist.: '/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so'"]
warnings.warn(f"unable to load libtensorflow_io_plugins.so: {e}")
/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/init.py:104: UserWarning: file system plugins are not loaded: unable to open file: libtensorflow_io.so, from paths: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io.so']
caused by: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io.so: cannot open shared object file: No such file or directory']
warnings.warn(f"file system plugins are not loaded: {e}")
Traceback (most recent call last):
File "/home/ubuntu/tensorflow/models/research/object_detection/models/ssd_efficientnet_bifpn_feature_extractor.py", line 35, in <module>
from official.legacy.image_classification.efficientnet import efficientnet_model
ModuleNotFoundError: No module named 'official.legacy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "object_detection/builders/model_builder_test.py", line 20, in <module>
from object_detection.builders import model_builder
File "/home/ubuntu/tensorflow/models/research/object_detection/builders/model_builder.py", line 70, in <module>
from object_detection.models import ssd_efficientnet_bifpn_feature_extractor as ssd_efficientnet_bifpn
File "/home/ubuntu/tensorflow/models/research/object_detection/models/ssd_efficientnet_bifpn_feature_extractor.py", line 37, in <module>
from official.vision.image_classification.efficientnet import efficientnet_model
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/efficientnet/efficientnet_model.py", line 37, in <module>
from official.vision.image_classification import preprocessing
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/preprocessing.py", line 25, in <module>
from official.vision.image_classification import augment
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/augment.py", line 31, in <module>
from tensorflow.python.keras.layers.preprocessing import image_preprocessing as image_ops
ModuleNotFoundError: No module named 'tensorflow.python.keras.layers.preprocessing'
(python3-venv) ubuntu@fb9da9f9bef0:~/tensorflow/models/research$ python object_detection/builders/model_builder_test.py
2022-08-04 17:41:25.737320: I tensorflow/core/util/util.cc:175] Experimental oneDNN custom operations are on. If you experience issues, please turn them off by setting the environment variable TF_ENABLE_ONEDNN_OPTS=0.
/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/__init__.py:98: UserWarning: unable to load libtensorflow_io_plugins.so: unable to open file: libtensorflow_io_plugins.so, from paths: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so']
caused by: ["[Errno 2] The file to load file system plugin from does not exist.: '/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io_plugins.so'"]
warnings.warn(f"unable to load libtensorflow_io_plugins.so: {e}")
/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/__init__.py:104: UserWarning: file system plugins are not loaded: unable to open file: libtensorflow_io.so, from paths: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io.so']
caused by: ['/home/ubuntu/python3-venv/lib/python3.8/site-packages/tensorflow_io/python/ops/libtensorflow_io.so: cannot open shared object file: No such file or directory']
warnings.warn(f"file system plugins are not loaded: {e}")
Traceback (most recent call last):
File "/home/ubuntu/tensorflow/models/research/object_detection/models/ssd_efficientnet_bifpn_feature_extractor.py", line 35, in <module>
from official.legacy.image_classification.efficientnet import efficientnet_model
ModuleNotFoundError: No module named 'official.legacy'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "object_detection/builders/model_builder_test.py", line 20, in <module>
from object_detection.builders import model_builder
File "/home/ubuntu/tensorflow/models/research/object_detection/builders/model_builder.py", line 70, in <module>
from object_detection.models import ssd_efficientnet_bifpn_feature_extractor as ssd_efficientnet_bifpn
File "/home/ubuntu/tensorflow/models/research/object_detection/models/ssd_efficientnet_bifpn_feature_extractor.py", line 37, in <module>
from official.vision.image_classification.efficientnet import efficientnet_model
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/efficientnet/efficientnet_model.py", line 37, in <module>
from official.vision.image_classification import preprocessing
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/preprocessing.py", line 25, in <module>
from official.vision.image_classification import augment
File "/home/ubuntu/python3-venv/lib/python3.8/site-packages/official/vision/image_classification/augment.py", line 31, in <module>
from tensorflow.python.keras.layers.preprocessing import image_preprocessing as image_ops
ModuleNotFoundError: No module named 'tensorflow.python.keras.layers.preprocessing'
I get similar error messages on a MacBook Air M1 as well.
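The first failure, `No module named 'official.legacy'`, suggests a version mismatch between the checked-out models repo and the installed tf-models-official pip package, since the `official.legacy` layout only exists in newer releases of that package. A hedged sketch of how one might check and upgrade it inside the venv (the upgrade line is an assumption, not a step from this repo's instructions):

```shell
# 1) See which version of the Model Garden package the venv has:
python3 -m pip show tf-models-official | grep -i '^Version' || true

# 2) If it predates the official.legacy layout, upgrading may resolve
#    the first import error (assumes network access and a compatible
#    TensorFlow version; pin a release if needed, e.g. ==2.9.2):
# python3 -m pip install --upgrade tf-models-official
```

The second error, the missing `tensorflow.python.keras.layers.preprocessing`, points the other way: that private path was removed in recent TensorFlow releases, so the installed `official` package may instead be too new for the TensorFlow in the image. Matching the tf-models-official version to the TensorFlow version is likely the safer fix.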