awesome-oneapi

An Awesome list of oneAPI projects

A curated list of awesome oneAPI and SYCL projects for solutions across industry and community. Inspired by awesome-machine-learning.

What is oneAPI?

oneAPI is an open, cross-industry, standards-based, unified, multiarchitecture, multi-vendor programming model that delivers a common developer experience across accelerator architectures – for faster application performance, more productivity, and greater innovation. See https://oneapi.io/ for more information.

AI - Computer Vision
AI - Data Science
AI - Machine Learning
AI - Natural Language Processing
AI - Frameworks and Toolkits
Autonomous Systems
Data Visualization and Rendering
Energy
Gaming
Manufacturing
Mathematics and Science
Tools and Development
Tutorials

AI - Computer Vision

BMW-IntelOpenVINO-Detection-Inference-API - This is a repository for an object detection inference API using OpenVINO, supporting both Windows and Linux operating systems.
Certiface Anti-Spoofing - Certiface AntiSpoofing uses oneAPI for fast video decoding to perform liveness detection with inference. The system is capable of spotting fake faces and performing anti-spoofing in face recognition systems.
diffusers - Pyke Diffusers is a modular Rust library for pretrained diffusion model inference to generate images using ONNX Runtime as a backend for accelerated generation on both CPUs and GPUs, including features like low memory usage and quantization. It offers an interactive stable diffusion demo and instructions on how to install and use the tool.
Fast_Human_Pose_Estimation_Pytorch - This is an unofficial implementation of the paper "Fast Human Pose Estimation". The code mainly comes from the PyTorch implementation of the Stacked Hourglass Network.
gocv - The gocv package is a set of Go bindings for the OpenCV 4 computer vision library that supports the latest releases of Go and OpenCV v4.7.0 on Linux, macOS, and Windows.
RapidOCR - RapidOCR provides OCR tools and models for detecting text in images.
smart-retail-analytics - The retail analytics application uses video or camera resources to monitor activity and keep track of inventory.
Stable Diffusion - This repository contains Stable Diffusion models trained from scratch and will be continuously updated with new checkpoints.
stable_diffusion_arc - The project guide provides instructions on how to set up and run the stable diffusion inference model on Intel Arc GPUs.
stable-diffusion-webui-arc-directml - A web UI for Stable Diffusion on Intel Arc with DirectML.
stable_diffusion.openvino - This GitHub project provides an implementation of text-to-image generation using stable diffusion on Intel CPU or GPU. It requires Python 3.9.0 and is compatible with OpenVINO.
yolov5_export_cpu - The project provides documentation on exporting YOLOv5 models for fast CPU inference using Intel's OpenVINO framework.

AI - Data Science

Boosting epistasis detection on Intel CPU+GPU systems - This work focuses on exploring the architecture of Intel CPUs and Integrated Graphics and their heterogeneous computing potential to boost performance and energy-efficiency of epistasis detection. This will be achieved using the OpenCL, Data Parallel C++, and OpenMP programming models.
Drift Detection for Edge IoT Applications - This concept drift project is run on video and image datasets such that we can calculate an overall precision and standard error. The concept drift detection technique finds True positives and False negatives using real and virtual drift detection.
HIAS TassAI Facial Recognition Agent - Security is an important issue for hospitals and medical centers to consider. Today's Facial Recognition can provide ways of automating security in the medical industry reducing staffing costs and making medical facilities safer for both patients and staff.

AI - Machine Learning

DQRM - Deep Quantized Recommendation Model (DQRM) is a recommendation framework that is small, powerful in inference, and efficient to train.
ort - ort is an (unofficial) ONNX Runtime 1.15 wrapper for Rust based on the now inactive onnxruntime-rs. ONNX Runtime accelerates ML inference on both CPU & GPU.
Performance and Portability Evaluation of the K-Means Algorithm on SYCL with CPU-GPU architectures - This work uses the k-means algorithm to assess the performance portability of one of the most advanced implementations in the literature, He-Vialle, across different programming models (DPC++, CUDA, OpenMP) and multi-vendor CPU-GPU architectures.
dpcpp-svm - A DPC++ version of ThunderSVM. The mission of ThunderSVM is to help users easily and efficiently apply SVMs to solve problems. ThunderSVM exploits GPU and multi-core CPUs to achieve high efficiency.

AI - Natural Language Processing

Census - (Python based) Uses Intel® Distribution of Modin to ingest and process U.S. census data from 1970 to 2010 in order to build a ridge regression based model to find the relation between education and the total income earned in the US.
ChatGPTCLIBot - The chatgpt cli bot allows the user to run GPT models such as GPT 3.5 and GPT 4, and switch between them using the config.json file.
CTranslate2 - CTranslate2 is a C++ and Python library that optimizes inference with transformer models, supporting models trained in various frameworks. It implements performance optimization techniques such as weight quantization, layer fusion, and batch reordering for transformer models on CPU and GPU.
fastRAG - Build and explore efficient retrieval-augmented generative models and applications. Its main goal is to make retrieval-augmented generation as efficient as possible through the use of state-of-the-art and efficient retrieval and generative models.
Gavin AI - Gavin AI is a project created by Scot_Survivor (Joshua Shiells) and ShmarvDogg that aims to hold human-like conversations in English through the use of AI and ML. Gavin works on the Transformer architecture; however, Performer and FNet architectures are being investigated for better scaling.
hachi - Hachi is a locally hosted web app that enables natural language search for videos and images, using an AI-based machine learning model powered by OpenAI CLIP.
Language Identification - (Python based) Trains a model to perform language identification using the Hugging Face SpeechBrain library and CommonVoice dataset, optimized with IPEX and INC.
whisper-ctranslate2 - Whisper ctranslate2 is a command-line client based on ctranslate2, compatible with the original OpenAI client.

AI - Frameworks and Toolkits

AI Personal Identifiable Information Data Protection - Provides anonymization functions, which include methods for masking, hashing, and encrypting/decrypting the PII data in large datasets. Can be used to protect the privacy and security of individuals in a dataset.
AI Structured Data Generation - Generate structured synthetic data for training and inferencing.
AI based transcribing - A reference solution showing how to use speech to text conversion to convert audio session tapes into digital notes in a psychologist's office.
BMW-Anonymization-API - The BMW Anonymization API is a privacy tool designed to obfuscate sensitive information in images and videos to preserve individual anonymity. Its features include agnostic localization techniques, modular sensitive information training, scalable anonymization techniques, and compatibility with deep learning models.
Credit Card Fraud Detection - Uses Intel AI Analytics Toolkit and scikit-learn to train an AI algorithm to detect credit card fraud.
Customer Chatbot - A PyTorch-based conversational AI chatbot for customer care.
Customer Churn Prediction - Using historical customer churn data along with service details, a machine learning model is built to predict whether a customer is going to churn. Reducing churn is key in the telecommunications industry to attract new customers and avoid contract terminations.
Customer Segmentation for Online Retailers - Demonstrates how machine learning can aid in building a deeper understanding of a business's clientele by segmenting customers into clusters that can be used to implement personalized and targeted campaigns.
Data Streaming Anomaly Detection - Uses TensorFlow and oneAPI to build a deep learning model that detects anomalies in data collected from an IoT device, in order to monitor equipment condition and prevent issues from cascading.
deeplearning4j - The Eclipse DeepLearning4J ecosystem supports all the needs for JVM-based deep learning applications with various libraries.
deeplearning4j-examples - The Eclipse Deeplearning4j (DL4J) ecosystem is a set of projects that supports all the needs of a JVM-based deep learning application.
DeepRec - DeepRec is a recommendation deep learning framework based on TensorFlow, which has been developed since 2016 and supports core businesses such as Taobao search recommendation and advertising.
Demand Forecasting - Uses deep learning to train and utilize a CNN-LSTM time series model that predicts the next day's demand for every item based on 130 days' worth of sales data.
Digital Twin for Design Exploration - A model that can be used to test digital replicas of real world products or devices for faults.
Disease Prediction - Demonstrates using a deep learning based NLP pipeline to train a document classifier that takes in notes on a patient's symptoms and predicts the diagnosis among a set of known diseases.
dlstreamer - The Intel Deep Learning Streamer is an open source streaming media analytics framework based on the GStreamer multimedia framework. It is optimized for performance and functional interoperability between GStreamer plugins built on various backend libraries, with support for over 70 pre-trained models for various use cases.
Documentation Automation - Based on the TensorFlow BERT transfer learning NER model, builds a deep learning model to predict the named entity tags for a given sentence.
Drone Navigation Inspection - Finds a safe drone landing zone without damaging property or injuring people using oneAPI and TensorFlow.
Engineering Design Optimizations - Train a model to create new bicycle designs with unique frames and handles, and generalize rare novelties to a broad set of designs, completely automatically and without requiring human intervention.
flashlight - Flashlight is a machine learning library written in C++ and created by Facebook AI Research. It features internal APIs for tensor computation, high performance defaults using just-in-time kernel compilation, and scalability.
Historical Assets Document Processing (OCR) - Allows you to process large amounts of structured, semi-structured and unstructured content in documents. Through the use of image processing, analysis, text region detection and text extraction using OCR - the results can then be stored and can be put into a database.
Image Data Generation - An AI-enabled image generator that aids in generating accurate image and image segmentation datasets where availability of such datasets is limited.
intel-extension-for-tensorflow - Intel Extension for TensorFlow is a plugin based on TensorFlow PluggableDevice, which aims to bring devices such as Intel XPU, GPU, and CPU into TensorFlow.
intel-extension-for-transformers - Intel Extension for Transformers is a toolkit designed to efficiently accelerate transformer-based models on Intel platforms, optimized for 4th gen Intel Xeon Scalable Processor (codename Sapphire Rapids).
intel-extension-for-pytorch - Intel Extension for PyTorch provides feature optimizations for an extra performance boost on Intel hardware, including CPUs and discrete GPUs, and offers easy GPU acceleration for Intel discrete GPUs with PyTorch.
Invoice To Cash Automation - AI toolkit to extract information from claim documents to categorize the claims. Helps develop models to accelerate the resolution of accounts receivable claims for trade promotion deductions.
Intelligent Indexing - A reference kit to build an AI-based Natural Language Processing solution for classifying documents.
KernelAbstractions.jl - KernelAbstractions (KA) is a package that enables you to write GPU-like kernels targeting different execution backends.
Loan Default Risk Prediction - Train and utilize an AI model using XGBoost to predict the probability of a loan default from client characteristics and the type of loan obligation.
Medical Imaging Diagnostics - Using machine learning and deep learning, train an AI algorithm that identifies images that warrant further attention to classify abnormalities.
models - The ONNX Model Zoo is a collection of pre-trained, state-of-the-art machine learning models in the ONNX format. These models are contributed by community members and accompanied by Jupyter notebooks for model training and running inference with the trained model.
Network Intrusion Detection - A pattern based network intrusion system using oneAPI and machine learning.
neural-compressor - Intel Neural Compressor is an open-source Python library for applying popular model compression techniques, such as pruning, quantization, sparsity, and distillation, on all mainstream deep learning frameworks and Intel extensions.
nnfusion - A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
optimum - Optimum is an extension of Transformers and Diffusers that provides optimization tools for efficiency to train and run machine learning models on targeted hardware, while also being easy to use.
optimum-intel - Optimum Intel is an interface between the Transformers and Diffusers libraries and Intel's different tools and libraries that help accelerate end-to-end pipelines on Intel architectures.
Order to Delivery Time Forecasting - A machine learning based predictive model that provides delivery time forecasting for e-commerce platforms.
pipeline-server - Intel Deep Learning Streamer (DL Streamer) Pipeline Server is a Python package and microservice that supports the deployment of optimized media analytics pipelines. It includes customizable media analytics containers, APIs to monitor pipelines, no-code pipeline definitions, and deep learning model integration with OpenVINO.
Power Line Fault Detection - Process and analyze signals from a 3-phase power supply system used in power lines to predict whether or not a signal has a partial discharge using SciPy and NumPy calculations.
PPLNN - PPLNN, which is short for "PPLNN is a Primitive Library for Neural Network", is a high-performance deep-learning inference engine for efficient AI inferencing. It can run various ONNX models and has better support for OpenMMLab.
Predictive Asset Maintenance - Shows how to use the oneAPI AI Analytics Toolkit in place of the stock versions of packages such as XGBoost.
Product Recommendation - A reference kit that demonstrates one way AI can be used to build a recommendation system for an e-commerce business using scikit-learn and oneAPI.
Purchase Prediction - A oneAPI-based reference AI model that uses machine learning to predict customer purchases.
pycaret - PyCaret is an open-source, low-code machine learning library in Python that automates the machine learning workflow. It is an end-to-end machine learning and model management tool that replaces hundreds of lines of code with a few lines to make experiments exponentially fast and efficient.
pynufft - The pynufft library is a Python package for non-uniform fast Fourier transform, based on a min-max interpolator, with experimental support for CuPy, PyTorch, and TensorFlow Eager mode.
scikit-learn-intelex - Intel® Extension for Scikit-learn is a free AI accelerator that speeds up existing scikit-learn code without requiring changes to it. It patches the stock scikit-learn algorithms, replacing them with the optimized versions provided by the extension, which results in 10-100x acceleration across a variety of applications.
shumai - The Shumai project is a differentiable tensor library for TypeScript and JavaScript built with Bun and Flashlight. It provides standard array utilities, gradients, and supported operators.
Structural Damage Assessment - A PyTorch-based AI model that works on satellite-captured images to assess the severity of damage in the aftermath of a natural disaster.
Synthetic Voice/Audio Generation - Generate synthetic voices and speeches - can be used in chatbots, virtual assistants, and is applicable in a host of applications. Voice synthesis technology is increasingly used to create more natural sounding virtual assistants.
Text Data Generation - Creates synthetic text data. This reference kit uses a pre-trained GPT2 model provided by Hugging Face to generate synthetic data applicable to product testing and training machine learning algorithms without running into privacy issues.
Traffic Camera Object Detection - A reference kit demonstrating how to improve traffic using a number of different technologies and oneAPI.
Vertical Search Engine - Demonstrates a possible reference implementation of a deep learning based NLP pipeline for semantic search of an organization's documents using a pre-trained model.
Visual Process Discovery - A reference kit implementing visual process discovery. VPDs can be used to enhance customer experience by providing personalized solutions knowing their needs as they navigate through a company's website.
Visual Quality Inspection - Build a computer vision based model for building quality visual inspection based on a dataset from the pharma industry.
webnn-native - WebNN Native is an implementation of the Web Neural Network API, providing building blocks, headers, and backends for ML platforms including DirectML, OpenVINO, and XNNPACK.
ZenDNN - The Zen Deep Neural Network Library (ZenDNN) is a library for deep learning inference applications on AMD CPUs. It includes APIs for basic neural network building blocks and is optimized for AMD CPUs.

Autonomous Systems

Alice - We are writing a tutorial for an open source project on how we build an AI to work on the open source project as if she were a remote developer. Bit of a self fulfilling prophecy but who doesn't love an infinite loop now and again.
FastChat - FastChat is an open platform for training, serving, and evaluating large language model based chatbots.

Data Visualization and Rendering

Atrc - The ATRC offline rendering lab includes various features such as path tracing, photon mapping, and many material models. It has an optional integrated OIDN and Embree library and an interactive scene editor.
Blender - Blender is the free and open source 3D creation suite. It supports the entirety of the 3D pipeline: modeling, rigging, animation, simulation, rendering, compositing, motion tracking, and video editing.
Brayns - Brayns is a large scientific visualization platform based on CPU ray tracing, using an extension plugin architecture. It comes with several pre-made plugins, such as CircuitExplorer and MoleculeExplorer, and requires several dependencies to build.
ChameleonRT - ChameleonRT is an example path tracer that runs on multiple ray tracing backends including Embree, SYCL, DXR, Optix, Vulkan, Metal, and Ospray.
embree - Embree is a high performance ray tracing library developed by Intel that targets graphics application developers to improve the performance of photo-realistic rendering applications. It includes various primitive types such as triangles, quads, grids, and curve primitives, and supports dynamic scenes. Embree also offers support for both CPUs and GPUs, while maintaining one code base to improve productivity and eliminate inconsistencies between the two versions of the renderer.
fresnel - Fresnel is a Python library for path tracing that can be used to generate high quality images in real time.
f3d - F3D is a fast and minimalist 3D viewer that supports multiple file formats and can show animations, supporting thumbnails and many rendering and texturing options including real-time physically based rendering and raytracing.
hdospray - Ospray for Hydra is an open-source plugin for Pixar's USD that extends the Hydra rendering framework with Intel Ospray. It is highly optimized for Intel CPU architectures ranging from laptops to large-scale distributed HPC systems.
LightWave Explorer - Lightwave explorer is an open source nonlinear optics simulator, intended to be fast, visual, and flexible for students and researchers to play with ultrashort laser pulses and nonlinear optics without having to buy a laser first.
ml-hypersim - The HyperSim dataset is a photorealistic synthetic dataset for indoor scene understanding that includes dense per-pixel semantic instance segmentations and complete camera information for every image.
oidn - Intel Open Image Denoise is an open-source library for image denoising in ray tracing rendering applications with high quality and performance, thanks to efficient deep learning-based filters that can be trained using the included toolkit and user-provided image datasets.
openpgl - The Intel Open Path Guiding Library (Open PGL) implements path guiding into a renderer, offering implementations of current state-of-the-art path guiding methods which increase the sampling quality and renderer efficiency.
ospray - Ospray is an open source, scalable and portable ray tracing engine designed for high fidelity visualization on Intel architecture CPUs. It allows users to easily build interactive applications using ray-tracing based rendering for both surface and volume-based visualizations.
ospray_studio - Ospray Studio is an open-source, interactive visualization and ray tracing application that utilizes Intel Ospray as its core rendering engine. Users can create scene graphs to render complex scenes with high-fidelity or very large scenes requiring supercomputing resources.
point-cloud-utils - Point Cloud Utils (PCU) is an easy-to-use Python library for processing and manipulating 3D point clouds and meshes. It provides several algorithms for generating point samples on meshes, downsampling point clouds, and computing distances between point clouds.
redner - Redner is a differentiable renderer that can compute correct rendering gradients stochastically without approximation. It can simulate photons and produce realistic lighting phenomena, and handle the derivatives of these features correctly.
SORT - Sort is a cross platform physically based renderer that can be used as a standalone ray tracing program or as a renderer plugin for Blender.
Substrate - A toolset to help developers create and deploy cloud-based VaaS services (Visualization as a Service). Deployment targets include any platforms capable of running Docker Swarm, such as Amazon AWS, institutional clusters and even personal servers. Native for Python environment (pip installable).
tracer - Tracer is a renderer that uses Embree and USD to produce photorealistic images using path tracing on the CPU, with features like subpixel jitter antialiasing, depth of field, and a variety of integrators.
vistle - Vistle is a modular data-parallel visualization system. It requires a C++14 compatible compiler that supports ISO/IEC 14882:2014, alongside compiling requirements of Boost, CMake and MPI. Additionally, it supports Covise, OpenCover, OpenSceneGraph and Qt 5 libraries, and also provides support code, rendering libraries, controlling code for Vistle session and visualization algorithm modules.
volppm - Volppm is a volumetric progressive photon mapping project that features homogeneous mediums for chromatic absorption and scattering coefficients.
yocto-gl - Yocto GL is a collection of small C++17 libraries for building physically based graphics algorithms. Each library is split into smaller ones, making code navigation easier.

Energy

A DPC++ Backend for the OCCA Portability Framework - OCCA—an open source portable and vendor neutral framework for parallel programming on heterogeneous platforms—is used by mission critical computational science and engineering applications of public and private sector organizations including the U.S. Department of Energy and Shell.

Gaming

NovelRT - NovelRT is a cross-platform game engine for visual novels and 2D games. It is still in the early alpha stage, but currently supports graphics and audio.

Manufacturing

S3_DeformFDM - The S3 Slicer is a framework for achieving support-free strength reinforcement and surface quality in multi-axis 3D printing by computing the rotation-driven deformation for the input model.

Misc

MuSYCL - muSYCL, the SYCL musical! This is a small music synthesizer to experiment with C++23 programming, design patterns and acceleration on hardware accelerators like GPU, FPGA or CGRA with the SYCL 2020 standard.

Mathematics and Science

1D Heat Transfer Simulation - (C++ based, from Intel) This 1D-Heat-Transfer sample is an application that simulates the heat propagation on a one-dimensional isotropic and homogeneous medium. The code sample includes both parallel and serial calculations of heat propagation.
3D Wave Simulation - (C++ based, from Intel) The ISO3DFD sample refers to Three-Dimensional Finite-Difference Wave Propagation in Isotropic Media; it is a three-dimensional stencil to simulate a wave propagating in a 3D isotropic medium. Starts with a simple serial implementation and shows how to use SYCL to offload to the GPU. Then shows how to optimize.
ACTS GPU Ramp - Demonstrator tracking chain on accelerators.
arpack-ng - Arpack ng is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems and is a community project maintained by volunteers.
Amber - Amber is a high-performance molecular dynamics (MD) code used by thousands of scientists in academia, national labs, and industry for computational drug discovery and related research.
ATLAS Charged Particle Seed Finding with DPC++ - The ATLAS Experiment is one of the general-purpose particle physics experiments built at the Large Hadron Collider (LHC) at CERN in Geneva. Its goal is to study the behavior of elementary particles at the highest energies ever produced in a laboratory to help us better understand the universe.
dedekind-MKL - Selected BLAS and LAPACK Java bindings for Intel's oneAPI Math Kernel Library (oneMKL) on Windows and Linux.
Discrete Cosine Transform Image Compression - (C++ based, from Intel) The Discrete Cosine Transform (DCT) sample demonstrates how DCT and Quantizing stages can be implemented to run faster using SYCL* by offloading image processing work to a GPU or other device.
Direction Field Visualization with Python - This project demonstrates the visualization of a direction field with Python using the differential equation of a falling object as a case study. The effectiveness of heterogeneous computing is also shown by exploring the optimized libraries and added functionality in Intel® Distribution for Python.
GinkgoOneAPI - In this project we want to explore the potential of having an Intel oneAPI backend for the Ginkgo software package: https://ginkgo-project.github.io/
GROMACS - A free and open-source software suite for high-performance molecular dynamics and output analysis.
GeometricTools - The Geometric Tools Engine (GTE) is a collection of source code for high-performance computing in mathematics, geometry, graphics, image analysis, and physics, using CPU multithreading and GPU programming.
gptoolbox - This is a toolbox of useful MATLAB functions for geometry processing, constrained optimization and image processing. It contains several features such as mesh deformation, mesh parameterization, and discrete differential geometry operators for triangle and tetrahedral meshes.
gtensor - gtensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. It was inspired by xtensor, and designed to support the GPU port of the GENE fusion code.
Homogeneous and Heterogeneous Implementations of a tridiagonal solver on Intel® Xeon® E-2176G with oneMKL getrs - Homogeneous and Heterogeneous implementations of a tridiagonal solver with oneMKL getrs
Jacobi Iterative Solver for Multi-GPU - (C++ based, from Intel) Illustrates how to use the Jacobi Iterative method to solve linear equations. This sample starts with a CPU-oriented application and shows how to use SYCL to offload regions of the code to a GPU. The sample walks through developing an optimization strategy by iteratively optimizing the code and ultimately targeting multiple GPUs if available.
LAMMPS - LAMMPS is a classical molecular dynamics simulation code designed to run efficiently on parallel computers. It was developed at Sandia National Laboratories, a US Department of Energy facility, with funding from the DOE. It is an open-source code, distributed freely under the terms of the GNU Public License (GPL) version 2.
mapmap_cpu - MapMap CPU is a massively parallel generic MRF map solver with minimal input assumptions, capable of solving a large class of MRF problems.
MF-LBM - This is a lattice Boltzmann code designed for direct numerical simulation of flow in porous media. It is written in Fortran 90 and optimized for vectorization and parallel programming. code to SYCL.
Monte Carlo Based Financial Simulation for Multi-GPU - (C++ based, from Intel) Evaluates fair call price for a given set of European options using the Monte Carlo approach. Monte Carlo simulation is one of the most important algorithms in quantitative finance. This sample uses a single CPU thread to control multiple GPUs. Shows how to migrate CUDA based code to SYCL.
mt-kahypar - MT-KaHyPar is a multi-threaded algorithm for partitioning graphs and hypergraphs. It aims to minimize an objective function defined on the hyperedges while balancing block sizes and optimizing connectivity. It can partition extremely large graphs and hypergraphs with comparable solution quality to the best sequential graph partitioners while being more than an order of magnitude faster with only ten threads.
NAMD - NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
NWGraph - The Northwest Graph Library (NWGraph) is a high-performance header-only generic C++ graph library based on C++20 concepts and ranges. It includes multiple graph algorithms for well-known graph kernels and supporting data structures.
octotiger - Octo-Tiger is an astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive Octrees. It was implemented using high-level C++ libraries, specifically HPX and Vc, which allows its use on different hardware platforms.
Odd Even Merge and Sorting - (C++ based, from Intel) Demonstrates how to use the odd-even mergesort algorithm (also known as "Batcher's odd–even mergesort"), which may be beneficial when working with batches of short-sized to mid-sized (key, value) array pairs. Shows how to migrate CUDA based code to SYCL.
Optical Flow Method - (C++ based, from Intel) The HSOpticalFlow sample is a computation of per-pixel motion estimation between two consecutive image frames caused by movement of object or camera. Shows how to migrate CUDA based code to SYCL.
PyPardisoProject - PyPardiso is a Python package for solving large sparse linear systems of equations using the Intel oneAPI Math Kernel Library Pardiso solver. It provides the same functionality as SciPy's spsolve but is faster in many cases.
repulsive-surfaces - A numerical framework for optimization of surface geometry while avoiding (self-)collision.
SPHinXsys - SPHinXsys provides C++ APIs for physically accurate simulation and optimization. It aims to handle coupled industrial dynamic systems including fluid, solid, multi-body dynamics and beyond. The multi-physics library is based on a unique and unified computational framework by which strong couplings have been achieved for all involved physics.
suanPan - suanPan is a finite element method (FEM) simulation platform for applications in fields such as solid mechanics and civil/structural/seismic engineering. The name suanPan (in some places, such as suffixes, it is also abbreviated as suPan) comes from the term Suan Pan (算盤), which is the Chinese abacus.

Tools and Development

ArrayFire - oneAPI Backend - ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. This project is to develop a oneAPI backend to the library, which currently supports CUDA, OpenCL, and x86.
ArrayFire - Rust Bindings - Rust bindings for ArrayFire, a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. This project is to develop a oneAPI backend to the library, which currently supports CUDA, OpenCL, and x86.
amrex-sycl - A SYCL plug-in to run AMReX apps on AMD/Nvidia GPUs. The plug-in consists of a build script and code patches which extend AMReX's SYCL capability beyond Intel GPUs.
chip-spv - The chip-spv project makes HIP and CUDA applications portable to platforms that support SPIR-V. Currently, it offers support for OpenCL and Level Zero as low-level runtime alternatives.
dedekind-MKL - Selected BLAS and LAPACK Java bindings for Intel's oneAPI Math Kernel Library (oneMKL) on Windows and Linux.
formulog - Formulog is a logic programming language that supports Datalog, SMT queries, and first-order functional programming. It requires JRE 11 and a supported SMT solver, such as Z3, Boolector, CVC4, or Yices.
gtensor - gtensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. It was inspired by xtensor, and designed to support the GPU port of the GENE fusion code.
HeCBench - The HeCBench repository contains a collection of benchmarks for studying performance portability and productivity with various heterogeneous computing languages. The benchmarks are divided into categories like computer vision, bioinformatics, and finance.
HPCToolKit - HPCToolkit is an open-source performance tool that is in some respects similar to VTune, though it also works on Power and ARM architectures. It also works on NVIDIA and AMD GPUs. Our aim is to also use it for performance analysis of Intel GPUs with Intel’s OpenCL to our targets as a prelude to A0
kharma - Kokkos-based High-Accuracy Relativistic Magnetohydrodynamics with AMR. KHARMA is an implementation of the HARM scheme for general relativistic magnetohydrodynamics (GRMHD) in C++. It is based on the Parthenon AMR infrastructure, using Kokkos for parallelism and GPU support.
Kokkos - Kokkos Core implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms. For that purpose it provides abstractions for both parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. It currently can use CUDA, HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models with several other backends in development.
levelzero-jni - Intel Level Zero JNI library for TornadoVM. This project is a Java Native Interface (JNI) binding for Intel's Level Zero. The library is designed to stay as close as possible to the Level Zero API for C++.
libxsmm - LIBXSMM is a library for specialized dense and sparse matrix operations as well as for deep learning primitives such as small convolutions.
numba-dpex - Numba dpex is an extension for the Numba Python JIT compiler that provides a kernel programming API and an offload feature. It supports devices including Intel CPUs, integrated GPUs, and discrete GPUs.
oneapi-containers - The Intel oneAPI Containers simplify programming by delivering the tools to deploy applications and solutions on various architectures. These containers allow developers to set up and distribute environments for profiling and executing applications built with oneAPI toolkits.
oneAPI.jl - The oneAPI.jl project provides support for working with the oneAPI unified programming model and offers low-level wrappers for the Level Zero library, kernel programming, and high-level array programming capabilities.
Open-source Scientific Applications and Benchmarks - This repository contains a collection of data-parallel programs for evaluating oneAPI direct programming. Each program is written with CUDA, SYCL, and OpenMP target offloading. Intel® DPC++ Compatibility Tool (DPCT) can convert a CUDA program to a SYCL program.
OpenVisualCloud Dockerfiles - This repository contains Docker build files for software stacks and services designed for media delivery, media analytics, cloud gaming and graphics, and immersive media.
p2rng - A modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI.
PySYCL - SYCL functionalities within Python for GPU-targeted development.
QSVEnc - QSVEnc is software developed to investigate the performance and image quality of Intel's QSV hardware encoder. It is available as a standalone command-line version and as a plugin for AviUtl.
RayBNN_Raytrace - Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
RcppParallel - The RcppParallel project provides high-level functions for parallel programming with Rcpp and supports using Intel TBB for performance on Windows, macOS, and Linux systems.
R-oneMKL - The oneMKL package establishes the connection between the R environment and Intel oneAPI Math Kernel Library (oneMKL), a prerequisite for using the oneMKL.MatrixCal package. Specifically, oneMKL provides necessary header files and dynamic library files to R, and imports files from the packages mkl, mkl-include, and intel-openmp from Anaconda.
Scal - ScalSALE is a new scalable physical benchmark based on the well-known SALE scheme. ScalSALE's main goal is to provide a gold-standard benchmark application that can incorporate multi-physical schemes while maintaining scalable and efficient execution times.
Spyker - High-performance Spiking Neural Networks Library Written From Scratch with C++ and Python Interfaces.
SYCLomatic - The SYCLomatic project helps developers migrate code to the SYCL heterogeneous programming model. Daily builds are available, but they are not rigorously tested for production quality.
SYnergy - Energy Measurement and Frequency Scaling for SYCL applications.
SYCLops - A SYCL-specific LLVM-to-MLIR converter.
syclreduce - This is a tiny package implementing what is a giant unmet need in SYCL 2020 - proper reductions. Want to sum a vector coming from every thread in a kernel launch? Want to accumulate a couple different kinds of diagnostic output from a kernel? Too bad. SYCL doesn't have full documentation on how span<> works, and you'll easily get lost writing your own undefined type reducer.
TAU Performance System - The TAU Performance System® supports profiling and tracing of programs written using Intel oneAPI. Intel oneAPI provides two interfaces for programming - OpenCL and DPC++/SYCL for CPUs and GPUs. TAU supports both - the OpenCL profiling interface and the Intel Level Zero API to observe performance.
TornadoVM - TornadoVM is an open-source software technology that automatically accelerates Java programs on multi-core CPUs, GPUs, and FPGAs.
toyBrot - toyBrot is a raymarching fractal generator that is used both as a simple benchmarking tool and a study tool for parallelisation. The code is implemented with over 10 different technologies including Intel TBB, ISPC, and SYCL (with support for oneAPI).
ZFP - zfp is a compressed format for representing multidimensional floating-point and integer arrays. zfp provides compressed-array classes that support high throughput read and write random access to individual array elements. zfp also supports serial and parallel compression of whole arrays for applications that read and write large data sets to and from disk.

Tutorials

50YearsOfRayTracing - This GitHub project is focused on ray tracing and covers several techniques and models developed from 1968 to 1997, with a focus on physically based rendering.
data-parallel-CPP - The Data Parallel C++ Book Source Samples repository contains code that accompanies the book Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL.
efficient-dl-systems - This repository contains materials for the Efficient Deep Learning Systems course taught at the HSE University and Yandex School of Data Analysis.
Jurassic - Hunting Dinosaur bones using AI
syclacademy - SYCL Academy, a set of learning materials for SYCL heterogeneous programming

Related Communities

Explore the curated collection of top AI projects leveraging OpenVINO across diverse domains on Awesome OpenVINO.
