
cpuid2cpuflags's Introduction

cpuid2cpuflags -- CPU_FLAGS_* generator
(c) 2017-2024 Michał Górny
SPDX-License-Identifier: GPL-2.0-or-later


Usage
~~~~~

The program attempts to obtain the identification and capabilities
of the currently used CPU, and print the matching set of CPU_FLAGS_*
flags for Gentoo. To use it, just run it:

  $ cpuid2cpuflags
  CPU_FLAGS_X86: 3dnow 3dnowext mmx mmxext sse sse2 sse3

There are no command-line options. Please note that the program
identifies the apparent CPU capabilities using the available CPU
instructions or system interfaces, *not* the capabilities indicated
by compiler flags.

The flag definitions match the flags described in Gentoo profiles/desc
at the time of program release. If additional flags are introduced
in the future, they will be added in a future program release.

The output format is compatible both with Portage (package.use)
and Paludis (use.conf/options.conf). If you find it useful to
generate/update it automatically, you can use a dedicated file:

  $ mkdir /etc/portage/package.use   # if not used yet
  $ echo "*/* $(cpuid2cpuflags)" > /etc/portage/package.use/00cpuflags


Building
~~~~~~~~

These are the steps necessary to build the ./cpuid2cpuflags program:

  $ autoreconf -vi
  $ ./configure
  $ make


Implementation details
~~~~~~~~~~~~~~~~~~~~~~

X86 (incl. x86-64)
------------------

On x86 platforms, cpuid2cpuflags issues the CPUID instruction to obtain
processor capabilities. This should work reliably across different
systems and kernels, unless the system somehow blocks this instruction.
If this is the case, please report a bug.
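
As an illustration of the approach described above, here is a minimal
sketch (not the program's actual source) of querying CPUID leaf 1 and
decoding a few feature bits. It assumes GCC or Clang, whose <cpuid.h>
header provides __get_cpuid(). The table and the function names
format_leaf1_flags() and query_leaf1() are hypothetical; the bit
positions are the ones documented for CPUID leaf 1.

```c
#include <stddef.h>
#include <stdio.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper header, assumed available */
#endif

/* One CPUID leaf 1 feature bit and the flag name it maps to. */
struct leaf1_bit {
    char reg;          /* 'c' for ECX, 'd' for EDX */
    unsigned bit;
    const char *name;
};

/* Illustrative subset only, kept in alphabetical order. */
static const struct leaf1_bit leaf1_bits[] = {
    { 'd', 23, "mmx" },
    { 'c', 23, "popcnt" },
    { 'd', 25, "sse" },
    { 'd', 26, "sse2" },
    { 'c',  0, "sse3" },
    { 'c', 19, "sse4_1" },
    { 'c', 20, "sse4_2" },
    { 'c',  9, "ssse3" },
};

/* Write the space-separated names of the set bits into out.
 * Returns 0 on success, -1 if out is too small. */
int format_leaf1_flags(unsigned ecx, unsigned edx, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof(leaf1_bits) / sizeof(leaf1_bits[0]); i++) {
        unsigned reg = leaf1_bits[i].reg == 'c' ? ecx : edx;
        if (!(reg & (1u << leaf1_bits[i].bit)))
            continue;
        int n = snprintf(out + used, outsz - used, "%s%s",
                         used ? " " : "", leaf1_bits[i].name);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}

/* Run CPUID leaf 1. Returns 1 on success, 0 if CPUID is unavailable
 * (which includes non-x86 builds of this sketch). */
int query_leaf1(unsigned *ecx, unsigned *edx)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx;
    return __get_cpuid(1, &eax, &ebx, ecx, edx);
#else
    (void)ecx;
    (void)edx;
    return 0;
#endif
}
```

A caller would invoke query_leaf1() and, on success, pass both registers
to format_leaf1_flags(). Flags such as avx additionally require checking
that the operating system saves the extended register state (XGETBV),
which this sketch leaves out.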


ARM and AArch64
---------------

On ARM platforms, userspace processes are not allowed to obtain
processor information directly. Instead, the program relies
on the kernel's identification of the CPU, provided via system interfaces.
Currently, only Linux is supported.

On Linux, two interfaces are used: uname() to identify the CPU family,
and getauxval(AT_HWCAP*...) to obtain detailed feature flags.

The textual value obtained from uname (armv* or aarch64) is used
to enable appropriate ARM version flags and some feature flags. It is
also used to determine whether the kernel is 64- or 32-bit since that
affects the interpretation of AT_HWCAP* flags.

Afterwards, the remaining feature flags are enabled either based on the
flag bits provided by AT_HWCAP*, or implicitly based on the subarchitecture
(i.e. currently a number of features are always set on AArch64).
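
The flow described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the program's code: the
names classify_machine(), has_advanced_simd() and detect_advanced_simd()
are made up for the example, and the two bit constants are copied from
the Linux kernel's hwcap headers (HWCAP_NEON on 32-bit ARM, HWCAP_ASIMD
on AArch64) so that the sketch compiles on any Linux host. It shows why
the 32/64-bit distinction matters: the same AT_HWCAP bit number has
a different meaning on each of the two architectures.

```c
#include <string.h>
#include <sys/auxv.h>     /* getauxval(), AT_HWCAP (Linux) */
#include <sys/utsname.h>  /* uname() */

/* Values as defined by the Linux kernel headers; redefined locally
 * (with an EX_ prefix) so the sketch also builds on non-ARM hosts. */
#define EX_ARM32_HWCAP_NEON   (1UL << 12)
#define EX_ARM64_HWCAP_ASIMD  (1UL << 1)

enum arm_kind { NOT_ARM = -1, ARM_32 = 0, ARM_64 = 1 };

/* Classify the "machine" string reported by uname(). */
enum arm_kind classify_machine(const char *machine)
{
    if (strcmp(machine, "aarch64") == 0)
        return ARM_64;
    if (strncmp(machine, "armv", 4) == 0)
        return ARM_32;
    return NOT_ARM;
}

/* Interpret the AT_HWCAP word according to the kernel bitness. */
int has_advanced_simd(enum arm_kind kind, unsigned long hwcap)
{
    switch (kind) {
    case ARM_64:
        return (hwcap & EX_ARM64_HWCAP_ASIMD) != 0;
    case ARM_32:
        return (hwcap & EX_ARM32_HWCAP_NEON) != 0;
    default:
        return 0;
    }
}

/* Returns 1 or 0 on ARM, -1 if uname() fails or this is not ARM. */
int detect_advanced_simd(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    enum arm_kind kind = classify_machine(u.machine);
    if (kind == NOT_ARM)
        return -1;
    return has_advanced_simd(kind, getauxval(AT_HWCAP));
}
```

Keeping the classification and the bit test as separate pure functions
makes it possible to exercise both code paths on a machine that is not
ARM at all.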

Note that the program strongly depends on correct identification
of the CPU in the kernel. If you find the results incorrect, please
report a bug, but I can't promise I'll be able to find a good
workaround.

cpuid2cpuflags's People

Contributors

angryloki, danfe, floppym, gyakovlev, klondi, mgorny, nuno-silva


cpuid2cpuflags's Issues

Output is not perfectly ready for Portage

I don't know about Paludis, but Portage config files use the key=value format in make.conf and the category/package flags format in per-package configuration files, whereas the tool currently prints in the var:values format.

Easy enough to edit by hand, but easy enough to fix too, either in the README or in the program itself. Maybe add an option to make it more script-friendly, with which it would print only the flags.

Missing flag for AMD ZEN 4?

I have a recent AMD Zen 4 CPU (AMD Ryzen 7 PRO 7840U) and it seems that, e.g., the avx512_vnni flag is not reported. I have not cross-referenced the list below, so there might be more missing.

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
    CPU family:          25
    Model:               116
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            1
    CPU(s) scaling MHz:  50%
    CPU max MHz:         6076,0000
    CPU min MHz:         400,0000
    BogoMIPS:            6590,60
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall
                         nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid
                         aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx
                         f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit
                         wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba
                         perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a
                         avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec
                         xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf
                         xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
                         pausefilter pfthreshold v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke
                         avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor
                         smca flush_l1d
Virtualization features: 
  Virtualization:        AMD-V
Caches (sum of all):     
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    8 MiB (8 instances)
  L3:                    16 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Vulnerable: Safe RET, no microcode
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

ppc support

Hi,

I checked this issue out and apparently the way grepwood implemented it is the easiest.

The only thing analogous to x86's CPUID instruction on PowerPC is reading the value of the Processor Version Register (PVR). This special register is read-only and restricted to supervisor mode. Some kernels, like Linux and FreeBSD, do the smart thing: they capture its value in memory and expose it to userspace applications. That makes reading /proc/self/auxv a fair enough method. The PVR is a 32-bit register: its upper half contains the version and its lower half the revision. More info: http://www.cebix.net/downloads/bebox/pem32b.pdf pages 78-79

Example: the PVR on the Cell Broadband Engine is 0x00700501.
From arch/powerpc/kernel/cputable.c:

        {       /* Cell Broadband Engine */
                .pvr_mask               = 0xffff0000,
                .pvr_value              = 0x00700000,
                .cpu_name               = "Cell Broadband Engine",
                .cpu_features           = CPU_FTRS_CELL,
                .cpu_user_features      = COMMON_USER_PPC64 |
                        PPC_FEATURE_CELL | PPC_FEATURE_HAS_ALTIVEC_COMP |
                        PPC_FEATURE_SMT,
                .mmu_features           = MMU_FTRS_CELL,
                .icache_bsize           = 128,
                .dcache_bsize           = 128,
                .num_pmcs               = 4,
                .pmc_type               = PPC_PMC_IBM,
                .oprofile_cpu_type      = "ppc64/cell-be",
                .oprofile_type          = PPC_OPROFILE_CELL,
                .platform               = "ppc-cell-be",

We can see that the version clearly matches 0x0070. Checking for PPC_FEATURE_HAS_ALTIVEC is legit, because of $(grep PPC_FEATURE_HAS_ALTIVEC arch/powerpc) in the kernel source tree. These two macros are always equal.
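
The field split and the mask comparison used by the table entry above can be sketched in a few lines. This is only an illustration: pvr_version(), pvr_revision() and pvr_matches() are hypothetical helper names, not kernel functions, and how the PVR value is obtained is left out.

```c
#include <stdint.h>

/* Upper 16 bits of the PVR: processor version. */
uint16_t pvr_version(uint32_t pvr)
{
    return (uint16_t)(pvr >> 16);
}

/* Lower 16 bits of the PVR: processor revision. */
uint16_t pvr_revision(uint32_t pvr)
{
    return (uint16_t)(pvr & 0xffffu);
}

/* Same comparison as implied by a pvr_mask / pvr_value pair. */
int pvr_matches(uint32_t pvr, uint32_t mask, uint32_t value)
{
    return (pvr & mask) == value;
}
```

With the Cell value 0x00700501, the version is 0x0070, the revision is 0x0501, and the entry with mask 0xffff0000 and value 0x00700000 matches.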

Add support for FreeBSD

I've played with it on ARM today and it seems to work. Two changes were required:

  • Replace getauxval(3), a non-standard glibc extension, with our native elf_aux_info(3) call
  • Return features supported by the specific machine processor architecture rather than the hardware platform, because the latter is too vague on FreeBSD: it always returns just arm without any version (generation)
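
The first change can be pictured as a small portability wrapper. This is a hedged sketch and not the linked patch: read_hwcap() is a hypothetical name, and the sketch assumes FreeBSD's elf_aux_info(3) and the Linux getauxval(3), both declared in <sys/auxv.h>.

```c
#if defined(__FreeBSD__) || defined(__linux__)
#include <sys/auxv.h>
#endif

/* Return the AT_HWCAP word, or 0 if it cannot be read. */
unsigned long read_hwcap(void)
{
#if defined(__FreeBSD__)
    unsigned long value = 0;

    /* elf_aux_info() returns 0 on success and fills the buffer. */
    if (elf_aux_info(AT_HWCAP, &value, sizeof(value)) != 0)
        return 0;
    return value;
#elif defined(__linux__)
    /* getauxval() returns 0 when the entry is not present. */
    return getauxval(AT_HWCAP);
#else
    return 0;
#endif
}
```

Hiding the platform difference behind one function keeps the feature-bit decoding identical on both systems.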

The actual patch is here.

cmake conversion.. via autocmake.py

CMakeLists.txt

#https://github.com/fritzone/autocmake

https://cmake.org/cmake/help/v3.7/module/CPackDeb.html or CPackrpm
Toys like binary-gentoo on PyPI (say, for a cheap web binhost) or renting Graviton2 arm64 Docker hosts to run Gentoo builds might be of some small use where one is stuck with rpm or deb VPS servers as the de facto choice,
or with GitHub Actions etc. to build Gentoo toys >>> github/lfs (@spreequalle/gentoo-binhost).
(resolve march native is Python-based; binary-gentoo needs cpuid2cpuflags as well.)

Hopefully CMake's multi-arch abilities help add more arches, who knows.
It's a start anyway.

output format does not match make.conf syntax

# cpuid2cpuflags
CPU_FLAGS_X86: aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3

# cpuid2cpuflags >> /etc/portage/make.conf

This requires editing make.conf afterwards. IMHO it would be better to have something like:

# cpuid2cpuflags
CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3"
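
The requested conversion is small enough to sketch. The following is a minimal illustration in C, assuming the "NAME: flags" line format shown above; to_make_conf() is a hypothetical helper and not part of cpuid2cpuflags.

```c
#include <stdio.h>
#include <string.h>

/* Convert "NAME: a b c" into NAME="a b c" (make.conf syntax).
 * Returns 0 on success, -1 if the line has no ": " separator
 * or the output buffer is too small. */
int to_make_conf(const char *line, char *out, size_t outsz)
{
    const char *sep = strstr(line, ": ");

    if (sep == NULL)
        return -1;

    const char *values = sep + 2;
    size_t vlen = strcspn(values, "\r\n");  /* drop a trailing newline */
    int n = snprintf(out, outsz, "%.*s=\"%.*s\"",
                     (int)(sep - line), line, (int)vlen, values);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Fed the line printed by the tool, the helper produces a string that can be appended to make.conf as is.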

Add a tool to convert CFLAGS into CPU_FLAGS

It has been suggested that we could have a tool that would take an -march= option and output the corresponding CPU_FLAGS_*. I'm thinking the cleanest approach would actually be to defer to GCC to expand CFLAGS into a set of -m options, and map them onto CPU_FLAGS_*. However, we first need to check whether all of the flags have corresponding -m options.
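
To make the idea concrete, here is a rough sketch of the mapping step only, assuming the -m options have already been obtained from GCC. The table is a small hand-picked subset and m_option_to_cpu_flag() is a hypothetical name; whether every CPU_FLAGS_X86 value has a corresponding -m option remains the open question stated above.

```c
#include <stddef.h>
#include <string.h>

struct option_map {
    const char *m_option;  /* GCC target option */
    const char *cpu_flag;  /* CPU_FLAGS_X86 value */
};

/* Illustrative subset; note that some names differ (fma, rdrnd, sse4.x). */
static const struct option_map option_table[] = {
    { "-maes",    "aes"    },
    { "-mavx",    "avx"    },
    { "-mavx2",   "avx2"   },
    { "-mf16c",   "f16c"   },
    { "-mfma",    "fma3"   },
    { "-mmmx",    "mmx"    },
    { "-mpclmul", "pclmul" },
    { "-mpopcnt", "popcnt" },
    { "-mrdrnd",  "rdrand" },
    { "-msse",    "sse"    },
    { "-msse2",   "sse2"   },
    { "-msse3",   "sse3"   },
    { "-msse4.1", "sse4_1" },
    { "-msse4.2", "sse4_2" },
    { "-mssse3",  "ssse3"  },
};

/* Return the CPU_FLAGS_X86 value for a GCC -m option,
 * or NULL if the option is not in the table. */
const char *m_option_to_cpu_flag(const char *opt)
{
    for (size_t i = 0; i < sizeof(option_table) / sizeof(option_table[0]); i++) {
        if (strcmp(opt, option_table[i].m_option) == 0)
            return option_table[i].cpu_flag;
    }
    return NULL;
}
```

An explicit table is used instead of stripping the -m prefix because several option names do not match the flag names one to one.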

enable simple copy and paste

I suggest adjusting the format a little to enable simple copy and paste.

Output now:
CPU_FLAGS_X86: aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3

What I suggest:
CPU_FLAGS_X86="aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"

cpuid2cpuflags omits ssse3 for skylake intel cpu

On my VPS, cpuid2cpuflags does not list ssse3 even though the CPU should support it according to Wikipedia.
lscpu output:

CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       40 bits physical, 48 bits virtual
CPU(s):              1
On-line CPU(s) list: 0
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel Xeon Processor (Skylake, IBRS)
Stepping:            4
CPU MHz:             2100.000
BogoMIPS:            4200.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            4096K
L3 cache:            16384K
NUMA node0 CPU(s):   0
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat
