
Comments (5)

mdberryh commented on August 24, 2024

I have the AMD Radeon 6600 Pro, and I got much worse results. I wondered if that was from using ROCm, but I'm not using it. I also see that the benchmarks you're talking about are using ROCm and PlaidML. When I was trying to get ROCm working, it sounded like it didn't support consumer GPUs, but I'm now reading there is some support... I dunno.

@-desktop:~$ plaidbench --examples 2048 --batch-size 16 keras --no-fp16 --no-train mobilenet
Running 2048 examples with mobilenet, batch size 16, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1032.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 8.574s (compile), 10.313s (execution)

-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS          
-----------------------------------------------------------------------------------------
mobilenet            5.04 ms                   3.96 ms / 252.74 fps
Correctness: untested. Could not find golden data to compare against.
@desktop:~$ plaidbench --examples 2048 --batch-size 16 keras --no-fp16 --no-train resnet50
Running 2048 examples with resnet50, batch size 16, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1032.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5
102858752/102853048 [==============================] - 59s 1us/step
Compiling network... Warming up... Running...
Example finished, elapsed: 8.874s (compile), 43.838s (execution)

-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS          
-----------------------------------------------------------------------------------------
resnet50             21.41 ms                  19.53 ms / 51.21 fps
Correctness: untested. Could not find golden data to compare against.
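
Side note on the logs above: the INFO:plaidml:Opening device "opencl_amd_gfx1032.0" line means PlaidML is going through its OpenCL backend here rather than ROCm. If more than one device is available, the bundled plaidml-setup tool can list them and let you pick one; the output below is only a rough sketch of what that typically looks like, not taken from this machine:

$ plaidml-setup
# interactively lists every device PlaidML detects, for example
#   1 : opencl_amd_gfx1032.0
#   2 : llvm_cpu.0
# and saves the chosen device to PlaidML's user settings file (typically ~/.plaidml)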


mdberryh commented on August 24, 2024

I've actually checked another site and noticed they didn't mention batch sizes, but with a batch size of 1 the performance increased a lot, so they might have left it at the defaults: https://openbenchmarking.org/test/pts/plaidml&eval=31492f06de09eca1672491c6b1484ffae4f2df19#metrics

@-desktop:~$ plaidbench  keras --no-fp16 --no-train mobilenet
Running 1024 examples with mobilenet, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1032.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 9.348s (compile), 17.516s (execution)

-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS          
-----------------------------------------------------------------------------------------
mobilenet            17.11 ms                  2.81 ms / 356.46 fps
Correctness: PASS, max_error: 7.314303729799576e-06, max_abs_error: 6.407499313354492e-07, fail_ratio: 0.0
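
If it helps, a quick way to see where the slowdown actually kicks in (rather than only comparing batch size 1 against 16) is to sweep the batch size; this is just a sketch reusing the same flags as above, not something that was run for this thread:

for bs in 1 2 4 8 16; do
    plaidbench --examples 2048 --batch-size "$bs" keras --no-fp16 --no-train mobilenet
done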


tedliosu commented on August 24, 2024

I have the AMD Radeon 6600 Pro, and I got much worse results. I wondered if that was from using ROCm, but I'm not using it. I also see that the benchmarks you're talking about are using ROCm and PlaidML. When I was trying to get ROCm working, it sounded like it didn't support consumer GPUs, but I'm now reading there is some support... I dunno.

Here's the thing @mdberryh - since (I think) around mid-December of last year (2021), the non-ROCm pro drivers (which have been posted to AMD's support website every quarter or so) have been replaced by, or completely merged with, the rocm-dkms drivers from AMD's ROCm stack. So unless my observations and conclusions here are wrong, it's no longer technically possible to install a set of Radeon pro drivers onto a Linux distro without also pulling in at least one or two packages from AMD's ROCm repositories. Hopefully that makes sense.
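
If you want to confirm that on a given install, a rough check on a Debian/Ubuntu-based distro is just to list what the driver install pulled in (the package-name patterns here are an assumption; adjust for other distros and package managers):

$ dpkg -l | grep -i -E 'rocm|hip|hsa'
$ dpkg -l | grep -i amdgpu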

Also, before the apparent merge I mentioned above, I had been using a Vega 56 and was running ROCm and the other pro drivers just fine on that card, as ROCm has supported at least some consumer GPUs for quite a while now (see this documentation for more details). What you might have been referring to is that AMD ROCm doesn't officially support the iGPUs inside their APUs, and that AMD has been excruciatingly slow in getting Navi 1X/Navi 2X support implemented in ROCm (in fact, for some reason I still don't see Navi 1X support officially listed in the documentation I've linked above).
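
For checking what the ROCm and OpenCL stacks actually see on a given card, something like the following should work - rocminfo ships with the ROCm stack and clinfo with the usual OpenCL tooling; the grep patterns are only guesses at the relevant output lines:

$ rocminfo | grep -i gfx           # ROCm agents by gfx target, e.g. the gfx1032 seen in the logs above
$ clinfo | grep -i "device name"   # OpenCL devices, which is the path PlaidML is using here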


tedliosu commented on August 24, 2024

I've actually checked another site and noticed they didn't mention batch sizes, but with a batch size of 1 the performance increased a lot, so they might have left it at the defaults: https://openbenchmarking.org/test/pts/plaidml&eval=31492f06de09eca1672491c6b1484ffae4f2df19#metrics

Actually @mdberryh, if you go to that same website you linked and click on "View Source" for any of the listed test definitions (for example this latest definition), you'll see that Phoronix's benchmarks do in fact all use a batch size of 16 (scroll down to the "--examples 2048 --batch-size..." line under "test-definition.xml" to see what I mean). So unless you're talking about a different benchmarking-results site other than Phoronix's openbenchmarking.org, or unless the vast majority of people running the plaidml portion of the Phoronix Test Suite manually edited their test-definition files (which I highly doubt), my original complaint about the performance issues we're facing when the batch size is set to 16 in these plaidml benchmarks is still valid.

Also, have you tried running batch size = 1 for all of the other neural network benchmarks in plaidml (e.g. plaidbench keras --no-fp16 --no-train resnet50)? After doing that on my machine, I noticed that mobilenet was the only network where decreasing the batch size increased performance. So you might want to double-check that decreasing the batch size also increases performance for the other networks on your machine, to make sure there isn't something wonky going on in the underlying plaidml software stack - which, judging from the benchmarking results I got, I strongly suspect there is.
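
For what it's worth, here is a sketch of that comparison restricted to the two networks that appear in this thread, with each run's output tee'd to a log file so the batch sizes can be compared side by side (the filenames are arbitrary):

for net in mobilenet resnet50; do
    for bs in 1 16; do
        plaidbench --examples 2048 --batch-size "$bs" keras --no-fp16 --no-train "$net" | tee "${net}_bs${bs}.log"
    done
done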


tedliosu commented on August 24, 2024

Since I broke the system containing my RX 6800 while attempting to upgrade its memory, and I no longer have the time or energy to maintain my own desktop system, I just sold my RX 6800 (my only AMD GPU). Since I therefore won't be able to reproduce any potential fix for this issue anymore, I'm closing it for the time being. I'll be more than willing to reopen it if anyone else runs into the same issue I did.
