zchee / cuda-sample
CUDA official sample codes
When NLM_WINDOW_RADIUS is increased, the idx counter, which indexes the
fWeights array, goes out of the array's range.
See the comments marked with //******
__shared__ float fWeights[BLOCKDIM_X * BLOCKDIM_Y]; //****** default fWeights size is 64
const int ix = blockDim.x * blockIdx.x + threadIdx.x;
const int iy = blockDim.y * blockIdx.y + threadIdx.y;
//Add half of a texel to always address exact texel centers
const float x = (float)ix + 0.5f;
const float y = (float)iy + 0.5f;
const float cx = blockDim.x * blockIdx.x + NLM_WINDOW_RADIUS + 0.5f;
const float cy = blockDim.x * blockIdx.y + NLM_WINDOW_RADIUS + 0.5f;
if (ix < imageW && iy < imageH)
{
    //Find color distance from current texel to the center of NLM window
    float weight = 0;
    for (float n = -NLM_BLOCK_RADIUS; n <= NLM_BLOCK_RADIUS; n++)
        for (float m = -NLM_BLOCK_RADIUS; m <= NLM_BLOCK_RADIUS; m++)
            weight += vecLen(
                tex2D(texImage, cx + m, cy + n),
                tex2D(texImage, x + m, y + n)
            );
    //Geometric distance from current texel to the center of NLM window
    float dist =
        (threadIdx.x - NLM_WINDOW_RADIUS) * (threadIdx.x - NLM_WINDOW_RADIUS) +
        (threadIdx.y - NLM_WINDOW_RADIUS) * (threadIdx.y - NLM_WINDOW_RADIUS);
    //Derive final weight from color and geometric distance
    weight = __expf(-(weight * Noise + dist * INV_NLM_WINDOW_AREA));
    //Write the result to shared memory
    fWeights[threadIdx.y * BLOCKDIM_X + threadIdx.x] = weight;
    //Wait until all the weights are ready
    __syncthreads();
    //Normalized counter for the NLM weight threshold
    float fCount = 0;
    //Total sum of pixel weights
    float sumWeights = 0;
    //Result accumulator
    float3 clr = {0, 0, 0};
    int idx = 0;
    //Cycle through NLM window, surrounding (x, y) texel
    for (float i = -NLM_WINDOW_RADIUS; i <= NLM_WINDOW_RADIUS + 1; i++)
        for (float j = -NLM_WINDOW_RADIUS; j <= NLM_WINDOW_RADIUS + 1; j++)
        {
            //Load precomputed weight
            float weightIJ = fWeights[idx++]; //****** this line reads past the end of the array
            //****** if NLM_WINDOW_RADIUS is larger than 3; just increasing the size of
            //****** the fWeights array does not solve the problem
            //Accumulate (x + j, y + i) texel color with computed weight
            float4 clrIJ = tex2D(texImage, x + j, y + i);
            clr.x += clrIJ.x * weightIJ;
            clr.y += clrIJ.y * weightIJ;
            clr.z += clrIJ.z * weightIJ;
            //Sum of weights for color normalization to [0..1] range
            sumWeights += weightIJ;
            //Update weight counter, if NLM weight for current window texel
            //exceeds the weight threshold
            fCount += (weightIJ > NLM_WEIGHT_THRESHOLD) ? INV_NLM_WINDOW_AREA : 0;
        }
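A possible way to see why enlarging fWeights alone does not help: each thread writes exactly one weight, so only BLOCKDIM_X * BLOCKDIM_Y entries are ever filled, while the accumulation loop reads (2 * NLM_WINDOW_RADIUS + 2) squared entries. The sketch below only checks that arithmetic on the host. It assumes an 8 x 8 block (inferred from the "default fWeights size is 64" comment), and the helper names weightsRead, weightsWritten and windowFitsBlock are hypothetical, not part of the sample.

```cpp
// Host-side sketch (plain C++, also valid in a .cu file). All names here are
// hypothetical helpers, not part of the CUDA sample.

// Number of fWeights entries the accumulation loop reads: both loops run from
// -radius to radius + 1 inclusive, i.e. 2 * radius + 2 iterations each.
constexpr int weightsRead(int radius)
{
    return (2 * radius + 2) * (2 * radius + 2);
}

// Number of fWeights entries the block writes: one per thread.
constexpr int weightsWritten(int blockDimX, int blockDimY)
{
    return blockDimX * blockDimY;
}

// True when the loop reads no more weights than the block's threads wrote.
constexpr bool windowFitsBlock(int radius, int blockDimX, int blockDimY)
{
    return weightsRead(radius) <= weightsWritten(blockDimX, blockDimY);
}

// Assumed 8 x 8 block, matching the "default fWeights size is 64" comment.
static_assert(windowFitsBlock(3, 8, 8), "radius 3: 64 reads, 64 writes");
static_assert(!windowFitsBlock(4, 8, 8), "radius 4: 100 reads, only 64 writes");
static_assert(windowFitsBlock(4, 10, 10), "radius 4: 100 reads, 100 writes");
```

This is only a count check. The kernel also stores weights at threadIdx.y * BLOCKDIM_X + threadIdx.x and reads them back row by row with idx++, so the layout appears to match only when the block is square with side 2 * NLM_WINDOW_RADIUS + 2. Under that reading, changing NLM_WINDOW_RADIUS would require changing BLOCKDIM_X and BLOCKDIM_Y together with it, not just the array size.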
Hello,
I installed CUDA 8 with Visual Studio 2015 Community (on Windows 10). I compiled the deviceQuery.cpp file; it completed without errors, but "deviceQuery.exe" was not created. I also tried Visual Studio 2013 Community, but had no luck.
I would appreciate it if anyone could help me.
Hosna