occa2's Introduction

  • 👋 Hi, I’m Tim Warburton (AKA @tcew)
  • 👀 I develop and implement GPU accelerated, high-order finite element based PDE solvers
  • 🌱 I’m always learning more numerical analysis, PDEs, algorithms, linear algebra, preconditioners, ...

occa2's People

Contributors

dmed256, lcw, reidatcheson, tcew

occa2's Issues

Reentrant OCCA

How does OCCA do in the presence of multiple host threads?

One obvious thing I spotted is the stateful kernel launch interface (setWorkDimSize or something like it).
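To make the concern concrete, here is a minimal sketch in C of the two interface styles. All names (sketch_kernel, sketch_set_dims, sketch_run_stateful, sketch_run_with_dims) are hypothetical and are not the OCCA API. It only illustrates why launch geometry stored on a shared handle can be overwritten by another host thread between the "set" and the "run" call, while geometry passed with the call cannot.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel handle; NOT the real OCCA type. */
typedef struct {
    size_t outer;  /* launch geometry stored on the (shared) handle */
    size_t inner;
} sketch_kernel;

/* Stateful style: the geometry is written to the handle first... */
void sketch_set_dims(sketch_kernel *k, size_t outer, size_t inner) {
    k->outer = outer;
    k->inner = inner;
}

/* ...and read back at launch time. Returns the number of work items
 * the launch would cover, using whatever was set last. */
size_t sketch_run_stateful(const sketch_kernel *k) {
    return k->outer * k->inner;
}

/* Reentrant style: the geometry travels with the call itself. */
typedef struct {
    size_t outer;
    size_t inner;
} sketch_dims;

size_t sketch_run_with_dims(const sketch_kernel *k, sketch_dims d) {
    (void)k;  /* the handle is not modified by the launch */
    return d.outer * d.inner;
}
```

If thread A calls sketch_set_dims(&k, 4, 16) and thread B calls sketch_set_dims(&k, 2, 8) before A launches, A's sketch_run_stateful(&k) sees B's geometry. With sketch_run_with_dims each caller keeps its own values.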

Can't get addVectors example to parse/compile addVectors.ofl

I am looking at the addVectors example in Fortran. I've changed the occaBuildKernel line to

  addVectors = occaBuildKernelFromSource(device, "addVectors.ofl", "addVectors")

to try to engage the OFL kernel. The kernel compilation fails at run-time. When I inspect the code OCCA is trying to compile, I see that the contents of addVectors.ofl have been included verbatim:

...
#endif
#undef OCCA_USING_OPENMP
#define OCCA_USING_OPENMP 1
#undef OCCA_USING_CPU
#define OCCA_USING_CPU 1
kernel subroutine addVectors(entries, a, b, ab)
  implicit none

  integer(4), intent(in)  :: entries
  real(4)   , intent(in)  :: a(:), b(:)
  real(4)   , intent(out) :: ab(:)

  integer(4), shared    :: sharedVar(16,30)
  integer(4), exclusive :: exclusiveVar

  integer :: group, item, N

  do group = 1, entries, 16, outer0
     do item = 1, 16, inner0
        N = (item + (16 * (group - 1)))

        if (N < entries) then
           ab(i) = a(i) + b(i)
     end do
  end do

end subroutine addVectors
^@

FYI

maxhutch@edoras:~/src/OCCA2/examples/addVectors$ git log |head -3
commit 16ef4d4d8dd26c62d8a952faa9e98a5549133665
Author: Tim Warburton <[email protected]>
Date:   Mon Dec 8 15:26:35 2014 -0600

Memory leak in the C interface

In the C interface occaKernelRun there is a memory leak when "casted" occa types are used (such as occaInt).

For instance the following call

occaKernelRun(some_kernel, occaInt(a), d_ptr);

will have a leak because 'occaInt' has a new without an associated delete.

From other places in the library, such as occaKernelInfoAddDefine, the delete is issued inside the call, so I assume it makes sense for the delete to be issued at the end of occaKernelRun (as opposed to in a separate call by the user).
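A minimal sketch in C of the ownership rule proposed above, where the run call frees the temporary arguments it was handed. The names sketch_arg, sketch_int, sketch_kernel_run and sketch_live_args are hypothetical and only mimic the shape of the C interface. The "kernel" here just sums its arguments so the example stays self-contained.

```c
#include <stdlib.h>

/* Hypothetical boxed argument; NOT the real OCCA type. */
typedef struct {
    int value;
    int owned_by_call;  /* 1 if the run call is expected to free it */
} sketch_arg;

/* Counts boxed arguments that have been allocated but not yet freed. */
int sketch_live_args = 0;

/* Analogous in spirit to a "casted" argument: allocates a temporary. */
sketch_arg *sketch_int(int v) {
    sketch_arg *a = malloc(sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    a->value = v;
    a->owned_by_call = 1;
    sketch_live_args++;
    return a;
}

/* Uses every argument, then releases the temporaries it owns, so the
 * caller does not need a separate cleanup call. */
int sketch_kernel_run(sketch_arg **args, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (args[i] == NULL) {
            continue;
        }
        sum += args[i]->value;
        if (args[i]->owned_by_call) {
            free(args[i]);
            args[i] = NULL;
            sketch_live_args--;
        }
    }
    return sum;
}
```

The trade-off of this design is that a boxed argument cannot be reused across two run calls, which matches how a one-shot temporary like occaInt(a) is used in the call above.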

Cannot build on Windows

Hi,
I tried building OCCA on Windows using MinGW, but the build fails because it cannot find sys/sysctl.h.
I had to hack around a bit to get this far: I changed HOME/scripts/makefile at around line 110 to use bash instead of /bin/bash. I then got stuck at the error above, since sys/sysctl.h is apparently not available on Windows. Is there an alternative?

No locking on cache directory

Subject says it. Imagine two codes running on a node and trampling on each other's cached binaries. (Or is that what the "salt" parameter is about?)
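For illustration, one common way to serialize access to a shared directory on POSIX systems is a lock file created with O_CREAT | O_EXCL, which fails if the file already exists. The sketch below assumes a POSIX host; the function names and the idea of a per-cache lock path are hypothetical and say nothing about how OCCA (or its "salt" parameter) actually works.

```c
#include <fcntl.h>
#include <unistd.h>

/* Try to take the lock. Returns a file descriptor >= 0 on success, or
 * -1 if another process already holds the lock (or open() failed for
 * another reason). O_EXCL makes "check and create" a single step. */
int sketch_cache_lock(const char *lock_path) {
    return open(lock_path, O_CREAT | O_EXCL | O_WRONLY, 0644);
}

/* Release the lock by closing the descriptor and removing the file. */
void sketch_cache_unlock(const char *lock_path, int fd) {
    close(fd);
    unlink(lock_path);
}
```

A process would take the lock before writing a binary into the cache and release it afterwards. Note the weakness of this simple scheme: a process that crashes while holding the lock leaves a stale lock file behind.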

Fortran occaKernelInfoAddDefine with string argument

I need an occaKernelInfoAddDefine that accepts a string (character(len=*)) as its third argument, e.g. for something like occaKernelInfoAddDefine(info,"datafloat","double"). The current Fortran interface only accepts a "character" as its third argument, so the above call results in "datafloat" being defined as just "d".
Thanks

Don't ignore floopy failures

Here's a log of running gNUMA with a (f)loopy kernel:

Found cached binary of [occa/clusterVerticalStrongVolumeKernel.occa] in [/exthome/andreas/._occa/f51421d852343e72]
floopy --lang=floopy --target=cl:0,0  --occa-defines=/exthome/andreas/._occa/loopy1_7d27af49ef7c8cc7.defs  --occa-add-dummy-arg occa/verticalStrongVolumeKernel.floopy /exthome/andreas/._occa/loopy2_7d27af49ef7c8cc7.ocl
sh: 1: floopy: not found
Found cached binary of [/exthome/andreas/._occa/loopy2_7d27af49ef7c8cc7.ocl] in [/exthome/andreas/._occa/9dc467c8d2d13cc6]
...

I'd argue it should stop when it fails to execute floopy.
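A minimal sketch of what "stop on failure" could look like, assuming the external tool is started through the C library's system() on a POSIX host (the "sh: 1: floopy: not found" line suggests a shell is involved, but that is an assumption). The helper name sketch_run_checked is hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Runs a shell command and reports whether it succeeded.
 * Returns 0 only if the command ran and exited with status 0.
 * Returns -1 if the child could not be created or did not exit
 * normally; otherwise returns the command's exit status. By shell
 * convention, 127 means the command was not found. */
int sketch_run_checked(const char *command) {
    int status = system(command);
    if (status == -1) {
        return -1;  /* no child process could be created */
    }
    if (!WIFEXITED(status)) {
        return -1;  /* e.g. terminated by a signal */
    }
    return WEXITSTATUS(status);
}
```

A caller would abort the kernel build when the return value is nonzero, instead of going on to look for a cached binary of an output file that was never regenerated.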

floopy is now called loopy

The floopy transform executable is now just called loopy, not floopy. It also no longer takes a --target flag.

Fortran occaKernelInfoAddDefine for double precision

If I define a double precision kernel constant as:

gamma = 1.40023693379791_r8
call occaKernelInfoAddDefine(info, "p_Gamma", gamma)

Inside an occa kernel the constant is truncated to
gamma = 1.400240000000

I see that versions for both val = real(4) and real(8) are defined in occaF.f90, so there may be an issue in how the constant is stored for the latter.

Memory leak in occaOpenCL.cpp

I think the free template should have a delete command for the data:
delete (OpenCLKernelData_t*) this->data;

I am also getting some memory leaks in occaParser.cpp (from the new on line 3522) and occaParserNodes.cpp (from the new on line 216), but I am not sure where in my code these functions are getting called so I haven't been able to track down the leak fully.

Bug with C-style comments in OKL files.

I will give screenshots of the affected lines.

With these lines code compiles and my tests all pass.
[screenshot: 2015-01-31, 2:47:30 pm]

However if I add a C-style comment like so:

[screenshot: 2015-01-31, 2:49:20 pm]

I get the following errors

[screenshot: 2015-01-31, 2:49:50 pm]

Looking at the relevant cached OCCA file I find:

[screenshot: 2015-01-31, 2:50:53 pm]
