jeffhammond / stream
STREAM benchmark
Home Page: http://www.cs.virginia.edu/stream/ref.html
License: Other
===============================================

STREAM is the de facto industry standard benchmark
for measuring sustained memory bandwidth.

Documentation for STREAM is on the web at:
   http://www.cs.virginia.edu/stream/ref.html

===============================================
NEWS
===============================================
UPDATE: October 28 2014:

"stream_mpi.c" released in the Versions directory.

Based on Version 5.10 of stream.c, stream_mpi.c
brings the following new features:
* MPI implementation that *distributes* the arrays
  across all MPI ranks. (The older Fortran version
  of STREAM in MPI *replicates* the arrays across
  all MPI ranks.)
* Data is allocated using "posix_memalign"
  rather than using static arrays. Different
  compiler flags may be needed for both portability
  and optimization.
  See the READ.ME file in the Versions directory
  for more details.
* Error checking and timing done by all ranks and
  gathered by rank 0 for processing and output.
* Timing code uses barriers to ensure correct
  operation even when multiple MPI ranks run on
  shared memory systems.

NOTE: MPI is not a preferred implementation for
STREAM, which is intended to measure memory
bandwidth in shared-memory systems. In stream_mpi,
the MPI calls are only used to properly synchronize
the timers (using MPI_Barrier) and to gather
timing and error data, so the performance should
scale linearly with the size of the cluster.
But it may be useful, and was an interesting
exercise to develop and debug.

===============================================
UPDATE: January 17 2013:

Version 5.10 of stream.c is finally available!

There are no changes to what is being measured, but
a number of long-awaited improvements have been made:

* Updated validation code does not suffer from
  accumulated roundoff error for large arrays.
* Defining the preprocessor variable "VERBOSE"
  when compiling will (1) cause the code to print the
  measured average relative absolute error (rather than
  simply printing "Solution Validates"), and (2) print
  the first 10 array entries with relative error exceeding
  the error tolerance.
* Array index variables have been upgraded from
  "int" to "ssize_t" to allow arrays with more
  than 2 billion elements on 64-bit systems.
* Substantial improvements to the comments in
  the source on how to configure/compile/run the
  benchmark.
* The preprocessor variable controlling the array
  size has been changed from "N" to "STREAM_ARRAY_SIZE".
* A new preprocessor variable "STREAM_TYPE" can be
  used to override the data type from the default
  "double" to "float".
  This mechanism could also be used to change to
  non-floating-point types, but several "printf"
  statements would need to have their formats changed
  to accommodate the modified data type.
* Some small changes in output, including printing
  array sizes in GiB as well as MiB.
* Change to the default output format to print fewer
  decimals for the bandwidth and more decimals for
  the min/max/avg execution times.

===============================================
UPDATE: February 19 2009:

The most recent "official" versions have been renamed
"stream.f" and "stream.c" -- all other versions have
been moved to the "Versions" subdirectory and should be
considered obsolete.

The "official" timer (was "second_wall.c") has been
renamed "mysecond.c". This is embedded in the C version
("stream.c"), but still needs to be externally linked to
the FORTRAN version ("stream.f"). The new version defines
entry points both with and without trailing underscores,
so it *should* link automagically with any Fortran compiler.

===============================================

STREAM is a project of "Dr. Bandwidth":
   John D. McCalpin, Ph.D.
   [email protected]

===============================================

The STREAM web and ftp sites are currently hosted at
the Department of Computer Science at the University of
Virginia under the generous sponsorship of Professor Bill
Wulf and Professor Alan Batson.

===============================================
I noticed that commit 5d77d6d explicitly specifies GCC 4.9 in the makefile. I'm sure you have a good reason, but I don't see it. Could you explain your choice? Is there a reason to install GCC 4.9 instead of using whatever version was included in my operating system?
Can you provide a quick guide on how to install on macOS with Apple Silicon?
$ brew install gcc cmake
$ git clone https://github.com/jeffhammond/STREAM/ && cd STREAM
$ make
gcc -O2 -fopenmp -c -o mysecond.o mysecond.c
clang: error: unsupported option '-fopenmp'
make: *** [mysecond.o] Error 1
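When n is huge, the total byte count must not be computed in INTEGER precision, because default INTEGER is 32 bits and the product overflows. The fix is trivial: divide n by 1.0D6 first, so the whole expression is evaluated in double precision.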
- WRITE (*,FMT=9050) label(j),n*bytes(j)*nbpw/mintime(j)/1.0D6,
+ WRITE (*,FMT=9050) label(j),n/1.0D6*bytes(j)*nbpw/mintime(j),
[root@openeuler STREAM-master]# gcc -fopenmp -DSTREAM_ARRAY_SIZE=800000000 -DNTIMES=20 -mcmodel=large stream.c -o stream_c.exe
/usr/local/gcc-10.3.0/lib/gcc/aarch64-linux/10.3.0/libgcc.a(lse-init.o): in function `init_have_lse_atomics':
/home/perfagent-plus/install_gcc/gcc-10.3.0/aarch64-linux/libgcc/../.././libgcc/config/aarch64/lse-init.c:45:(.text.startup+0x14): relocation truncated to fit: R_AARCH64_ADR_PREL_PG_HI21 against `.bss'
/usr/local/gcc-10.3.0/lib/gcc/aarch64-linux/10.3.0/libgcc.a(ldadd_4_1.o): in function `__aarch64_ldadd4_relax':
/home/perfagent-plus/install_gcc/gcc-10.3.0/aarch64-linux/libgcc/../.././libgcc/config/aarch64/lse.S:266:(.text+0x4): relocation truncated to fit: R_AARCH64_ADR_PREL_PG_HI21 against symbol `__aarch64_have_lse_atomics' defined in .bss section in /usr/local/gcc-10.3.0/lib/gcc/aarch64-linux/10.3.0/libgcc.a(lse-init.o)
If I reduce the array size to less than 100000000, it compiles successfully, but the results can't be trusted.
How can I solve this problem?