
Comments (12)

joseph-montanez commented on May 26, 2024

I'm actually seeing the same issue on macOS with the Metal backend enabled: after about 4k draw calls it crashes, so some limit is definitely being hit. However, this happens ONLY in debug mode; in release mode there is no problem:

_platform_memmove 0x00000001978966e8
bx::memCopy(void *, const void *, unsigned long) bx.cpp:52
bgfx::mtl::RendererContextMtl::setShaderUniform(unsigned char, unsigned int, const void *, unsigned int) renderer_mtl.mm:1547
bgfx::mtl::RendererContextMtl::setShaderUniform4x4f(unsigned char, unsigned int, const void *, unsigned int) renderer_mtl.mm:1557
bgfx::ViewState::setPredefined<…>(bgfx::mtl::RendererContextMtl *, unsigned short, const bgfx::mtl::PipelineStateMtl &, const bgfx::Frame *, const bgfx::RenderDraw &) renderer.h:194
bgfx::mtl::RendererContextMtl::submit(bgfx::Frame *, bgfx::ClearQuad &, bgfx::TextVideoMemBlitter &) renderer_mtl.mm:4728
bgfx::Context::renderFrame(int) bgfx.cpp:2470
bgfx::renderFrame(int) bgfx.cpp:1491
bgfx::Context::renderThread(bx::Thread *, void *) bgfx_p.h:3150
bx::Thread::entry() thread.cpp:328
bx::ThreadInternal::threadFunc(void *) thread.cpp:95
_pthread_start 0x0000000197867fa8

------------ BGFX Stats ------------
CPU Frame Time: 9193
CPU Begin Time: 1692529311994101
CPU End Time: 1692529312003243
CPU Timer Frequency: 1000000
GPU Begin Time: 1692529311976854
GPU End Time: 1692529311977518
GPU Timer Frequency: 1000000
Wait Render: 2096
Wait Submit: 22
Draw Calls: 4720
Compute Calls: 0
Blit Calls: 0
Max GPU Latency: 0
GPU Frame Number: 0
Texture Memory Used: 53248
Render Target Memory Used: 0
Transient VB Used: 0
GPU Memory Max: -9223372036854775807
GPU Memory Used: -9223372036854775807
Width: 450
Height: 800
Text Width: 100
Text Height: 28
Number of view stats: 0
Number of encoders used during frame: 1
Primitives Rendered [0]: 9440
Primitives Rendered [1]: 0
Primitives Rendered [2]: 0
Primitives Rendered [3]: 0
Primitives Rendered [4]: 0
------------ End of BGFX Stats ------------

from bgfx.

bkaradzic commented on May 26, 2024

Make debug build and see debug output.


magester1 commented on May 26, 2024

I already have a debug build, that's how I was able to do the analysis in the issue description.
Do you mean to share the log for the debug build? If so then here it is: log.txt

I'm actually getting slightly different behaviour now: it crashes pretty much immediately. Not sure why; I haven't really used Vulkan since I originally created the ticket. But the out-of-bounds access error, stack trace and everything else are the same, so it's still the same issue.


bkaradzic commented on May 26, 2024

Update your drivers.


joseph-montanez commented on May 26, 2024

I've isolated my problem and fixed it:

Change setShaderUniform (src/renderer_mtl.mm:1556):

		void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
		{
			uint32_t offset = 0 != (_flags&kUniformFragmentBit)
				? m_uniformBufferFragmentOffset
				: m_uniformBufferVertexOffset
				;
			uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
			bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
		}

so that it checks against UNIFORM_BUFFER_SIZE before copying the memory:

		void setShaderUniform(uint8_t _flags, uint32_t _loc, const void* _val, uint32_t _numRegs)
		{
			uint32_t offset = 0 != (_flags&kUniformFragmentBit)
				? m_uniformBufferFragmentOffset
				: m_uniformBufferVertexOffset
				;
			uint8_t* dst = (uint8_t*)m_uniformBuffer.contents();
			// Skip the copy if the end of the write (start + length) would fall outside the buffer.
			if (offset + _loc + _numRegs*16 > UNIFORM_BUFFER_SIZE) {
				return;
			}
			bx::memCopy(&dst[offset + _loc], _val, _numRegs*16);
		}

Alternatively, I can just increase the buffer size at src/renderer_mtl.mm:19 from:
#define UNIFORM_BUFFER_SIZE (8*1024*1024)
to:
#define UNIFORM_BUFFER_SIZE (24*1024*1024)
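
As a rough illustration of why a bigger buffer only moves the ceiling rather than removing it, here is a small sketch. The per-draw footprint of 2048 bytes is a made-up number for the sake of the arithmetic (it has not been measured), and drawsThatFit is a hypothetical helper, not something in bgfx.

```cpp
#include <cstdint>

// Hypothetical helper: how many draws fit in a uniform buffer if every draw
// consumes the same number of bytes.
constexpr uint64_t drawsThatFit(uint64_t bufferBytes, uint64_t bytesPerDraw)
{
	return bytesPerDraw == 0 ? 0 : bufferBytes / bytesPerDraw;
}

// The two UNIFORM_BUFFER_SIZE values quoted above.
constexpr uint64_t kOldBufferBytes = 8ull * 1024 * 1024;
constexpr uint64_t kNewBufferBytes = 24ull * 1024 * 1024;

// ASSUMPTION: illustrative per-draw uniform footprint, not a measured value.
constexpr uint64_t kAssumedBytesPerDraw = 2048;

static_assert(drawsThatFit(kOldBufferBytes, kAssumedBytesPerDraw) == 4096, "8 MiB case");
static_assert(drawsThatFit(kNewBufferBytes, kAssumedBytesPerDraw) == 12288, "24 MiB case");
```

Under that assumption, tripling the buffer triples the number of draws before the overflow, so the bounds check is still needed either way.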


magester1 commented on May 26, 2024

@joseph-montanez I'm not entirely sure we are seeing the same issue. I'm not testing this with my own code; this is happening with example 17.
The source line where the crash happens is in the issue description: it's trying to write more into the vk scratch memory than is available, and these values are the same regardless of release/debug. Plus there's the incorrect assert in ScratchBufferVK::write, which only checks the start address and not the start address plus the length of the copy, although this would still result in a crash via the assert anyway, so it doesn't really matter.
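
To make the point about the assert concrete, here is a minimal sketch of a range check that covers the end of the write as well as the start. rangeFits and checkedWrite are hypothetical helpers written for this comment; they are not the actual ScratchBufferVK code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of bgfx: true when [offset, offset + size)
// lies entirely inside a buffer of `capacity` bytes.
// Written as two comparisons so that offset + size cannot wrap around.
inline bool rangeFits(uint32_t offset, uint32_t size, uint32_t capacity)
{
	return offset <= capacity && size <= capacity - offset;
}

// Hypothetical checked copy: writes only when the whole range fits and
// reports whether the write happened.
inline bool checkedWrite(uint8_t* dst, uint32_t capacity, uint32_t offset, const void* src, uint32_t size)
{
	assert(dst != nullptr && src != nullptr);
	if (!rangeFits(offset, size, capacity))
	{
		return false;
	}
	std::memcpy(dst + offset, src, size);
	return true;
}
```

A check on the start alone would accept a 16-byte write at offset 24 of a 32-byte buffer; the check above rejects it.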

@bkaradzic If you mean my NVIDIA drivers, then they are up to date. Is there any other Vulkan-specific driver that I should have and am not aware of?


joseph-montanez commented on May 26, 2024

@magester1 That limit of 3971 is EXACTLY the number of quads I could draw on screen: if I went to 3972, nothing else would render, and going beyond 4000 would crash. That means there is a limit somewhere causing this, and we are both hitting the exact same limit before crashing. Since I am using Metal, my fix will do nothing to help you, but it should help narrow the problem down to the data that is being passed to the shader. For me it was the unified memory. The VK implementation doesn't have this, and there are several places that could tell you exactly what's wrong, but you need to debug the application to get a stack trace with line numbers. The stack trace you originally provided doesn't have line numbers, so you most likely don't have BGFX compiled/linked as a debug build, which is what provides the line information needed to track the issue down further.


magester1 commented on May 26, 2024

But that's what I mean, this is happening because of vk's scratch memory, which I believe has nothing to do with Metal (please correct me if that's wrong). The number being the same seems like a happy coincidence to me, or maybe because bgfx is using this magic "128" for both of them?

Oh, I feel like an idiot, I forgot to add the line numbers to the stack trace!! Thank you for pointing that out. Just to clarify, I do have this running in debug mode, and I know exactly which lines are causing the issue (linked in the original description). But I don't know enough about Vulkan to understand the design decision behind the size of the scratch memory, which is why I created this ticket.

Here's the trace with the line numbers. Sorry about that, I didn't realize they were missing:

example-17-drawstress.exe!bx::memCopy(void * _dst, const void * _src, unsigned __int64 _numBytes) (...\bgfx\bx\src\bx.cpp:44)
example-17-drawstress.exe!bgfx::vk::ScratchBufferVK::write(const void * _data, unsigned int _size) (...\bgfx\bgfx\src\renderer_vk.cpp:4644)
example-17-drawstress.exe!bgfx::vk::RendererContextVK::submit(bgfx::Frame * _render, bgfx::ClearQuad & _clearQuad, bgfx::TextVideoMemBlitter & _textVideoMemBlitter) (...\bgfx\bgfx\src\renderer_vk.cpp:8680)
example-17-drawstress.exe!bgfx::Context::renderFrame(int _msecs) (...\bgfx\bgfx\src\bgfx.cpp:2455)
example-17-drawstress.exe!bgfx::renderFrame(int _msecs) (...\bgfx\bgfx\src\bgfx.cpp:1489)
example-17-drawstress.exe!entry::Context::run(int _argc, const char * const * _argv) (...\bgfx\bgfx\examples\common\entry\entry_windows.cpp:521)
example-17-drawstress.exe!main(int _argc, const char * const * _argv) (...\bgfx\bgfx\examples\common\entry\entry_windows.cpp:1185)
example-17-drawstress.exe!invoke_main()
example-17-drawstress.exe!__scrt_common_main_seh()
example-17-drawstress.exe!__scrt_common_main()
example-17-drawstress.exe!mainCRTStartup(void * __formal)


joseph-montanez commented on May 26, 2024

So here is the issue:

		uint8_t m_fsScratch[64<<10];
		uint8_t m_vsScratch[64<<10];

Take anything that increments in steps of 16 and you get the 3971 limit. BTW, it's also used for...

		void setShaderUniform(uint8_t _flags, uint32_t _regIndex, const void* _val, uint32_t _numRegs)
		{
			if (_flags & kUniformFragmentBit)
			{
				bx::memCopy(&m_fsScratch[_regIndex], _val, _numRegs*16);
			}
			else
			{
				bx::memCopy(&m_vsScratch[_regIndex], _val, _numRegs*16);
			}
		}

Why the limit... no idea. In my case, macOS running on Arm64 doesn't have VRAM since it's all shared memory. I am not sure why this needs to be limited to 64KB for Vulkan.


magester1 commented on May 26, 2024

In my case the main culprit was the m_scratchBuffer scratch buffer which is created here.
What you highlighted looks like an issue as well, though, and it's a bit odd that the size is hardcoded instead of using the BGFX_CONFIG_MAX_DRAW_CALLS macro. I'm not sure what the relationship between m_scratchBuffer and the m_vsScratch/m_fsScratch buffers is.

But yeah, like you, I don't know why this limit exists or how it was determined. Especially considering that what goes here depends on the shader size (is that the size in number of uniforms?), since with the original example shader it works fine up to the max draw calls.


bkaradzic commented on May 26, 2024

64k / 16 is 4096. If you're running out of fs/vsScratch that means you're setting over 4k uniforms.
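
Spelled out as code, the arithmetic looks roughly like this. The constant names are made up for illustration; only the 64<<10 size and the 16 bytes per register come from the snippets quoted earlier in the thread.

```cpp
#include <cstdint>

// Size of each scratch array quoted above (m_fsScratch / m_vsScratch).
constexpr uint32_t kScratchBytes = 64 << 10;

// setShaderUniform copies _numRegs*16 bytes, i.e. 16 bytes per register.
constexpr uint32_t kBytesPerRegister = 16;

// A 4x4 float matrix is 64 bytes, so it occupies four such registers.
constexpr uint32_t kRegistersPerMat4 = 4;

constexpr uint32_t kMaxRegisters = kScratchBytes / kBytesPerRegister;
constexpr uint32_t kMaxMat4 = kMaxRegisters / kRegistersPerMat4;

static_assert(kScratchBytes == 65536, "64<<10 is 64 KiB");
static_assert(kMaxRegisters == 4096, "64 KiB holds 4096 16-byte registers");
static_assert(kMaxMat4 == 1024, "or 1024 4x4 float matrices");
```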


magester1 commented on May 26, 2024

I don't think example-17 is setting any uniforms besides the default ones (you know, view transformations, etc.), so I don't think that's the issue.
