gpuinstance's Introduction

GPUInstance

Instancing & Animation library for Unity3D.

This library can be used to quickly and efficiently render thousands to hundreds of thousands of complex models in Unity3D. At a high level, this library uses compute shaders to implement an entity hierarchy system akin to the GameObject-Transform hierarchy Unity3D uses.

Features

A scene with many cube instances flying out in all directions.

  • Fully dynamic skinned mesh rendering (multi-skinned meshes supported)
  • GPU based LOD, Culling, & skeleton LOD
  • Attaching/detaching GPU instances to each other to form complex entity hierarchies, with instances composed of different meshes and materials
  • Pathing System for moving entities around on the GPU
  • Retrieving position & rotation of instances & bone instances without needing GPU readbacks
  • Slow down, speed up, and pause time for any instance
  • Billboarding, 2D texture animations supported as well.

Paths

  • An example of the instancing pathing system. The cubes below are instanced and follow paths that are evaluated in a compute shader.
  • The paths being drawn below are just using Debug.DrawLine to show what the paths look like. The spheres are the actual path points. The paths are automatically blended by the library to be smooth.
  • This isn't a pathfinding solution. The paths (e.g. around 10 points) are sent to the GPU and the instance will follow them for some period of time. This is useful because the instance won't have to be processed every frame by the CPU, and the CPU won't have to send new matrices to the graphics card every frame.

Billboards

  • This library is a bit overkill for billboards, but it can do them if you want. Additionally, this library can do 2D sprite sheet animations for your instances (just use a quad mesh).
  • Below is a scene with a bunch of exploding tile sprites.

Performant Instancing

  • Below is a scene with a half million dynamic asteroids.

Instance Transform Readbacks

  • This library uses a discretized tick system to time all of the instance movement and animations.
  • Because of this, all of the paths, bones, positions, rotations, and scales of any instance can be lazily calculated at any time without doing GPU readbacks.
  • Below is a scene of a skinned mesh where all the bone transforms (white cubes) are being retrieved without requesting data from the GPU.
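
The idea behind tick-based, readback-free transforms can be sketched in plain C#. This is a minimal illustration, not the library's actual code: the `TickPath` class, its fields, and the linear (unsmoothed) interpolation are all assumptions made for the example. The point is only that when motion is a pure function of a tick count, the CPU can recompute a position for any tick without asking the GPU.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch, not the library's API: position as a pure function of a tick count.
public sealed class TickPath
{
    private readonly Vector3[] points;
    private readonly float[] t;          // per-point parameter in [0, 1], non-decreasing, t[0] == 0
    private readonly long startTick;     // tick at which the instance started following the path
    private readonly long durationTicks; // ticks needed to traverse the whole path
    private readonly bool loop;

    public TickPath(Vector3[] points, float[] t, long startTick, long durationTicks, bool loop)
    {
        if (points == null || t == null || points.Length < 2 || points.Length != t.Length)
            throw new ArgumentException("Need at least two points and one T value per point.");
        if (durationTicks <= 0)
            throw new ArgumentOutOfRangeException(nameof(durationTicks));
        this.points = points;
        this.t = t;
        this.startTick = startTick;
        this.durationTicks = durationTicks;
        this.loop = loop;
    }

    // Evaluate lazily for any tick; no per-frame state is stored or read back.
    public Vector3 PositionAtTick(long tick)
    {
        long elapsed = Math.Max(0L, tick - startTick);
        elapsed = loop ? elapsed % durationTicks : Math.Min(elapsed, durationTicks);
        float u = (float)elapsed / durationTicks; // normalized progress along the path

        // Find the segment whose T range contains u, then interpolate within it.
        for (int i = 0; i < points.Length - 1; i++)
        {
            if (u <= t[i + 1])
            {
                float span = t[i + 1] - t[i];
                float local = span > 0f ? Math.Max(0f, (u - t[i]) / span) : 0f;
                return Vector3.Lerp(points[i], points[i + 1], local);
            }
        }
        return points[points.Length - 1];
    }
}
```

Because the GPU and CPU would both derive the transform from the same tick and the same path data, the two stay in agreement as long as they use the same formula; the real library additionally handles smoothing, rotation, bones, and hierarchy, which this sketch omits.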

Can Slow down, Speed up, & pause instance times.

  • Just seemed useful for pausing the game.

Guide

This guide will be very basic- you will be expected to look at the demo scenes & demo models to learn how things work. You are expected to already know how to rig your models, create LODs (if you are using them), set up animations, etc. Additionally, this library requires that you know how to code (in C#).

Preparing a Skinned Mesh

  • To instance a Skinned Mesh, you need to use the window under Window->EditorGPUInstanceAnimComposer. Just drag the model prefab onto this window & press the compose button. If the window does nothing and prints a warning, adjust your model accordingly.
  • The Window->EditorGPUInstanceAnimComposer window will create a lot of Assets that the instancing library needs. You will be directed to select a directory for these assets.
  • The final output will be labeled with prefab_gpu.
  • See the example scenes in the Assets/Models folder for examples of how a SkinnedMesh should look before using this editor. The only thing I really did to these models was create and add an animation controller after importing.
  • Additionally, you can add an optional 'GPUSkeletonLODComponent' to set up skeleton LOD.
    • Skeleton LOD will stop computing animations for bones after they are a certain distance from the camera.
    • Drag bone GameObjects onto the desired maximum detail LOD level you want them to animate at.

Instancing Stuff

  • I highly recommend that you look through the demo scene scripts. crowddemo is a good one to look at. It will show you how to spawn many skinned meshes and give them simple paths to follow.
  • The codebase is highly commented. If you don't understand how something works- then just open it up. Chances are there will be an explanation waiting for you.
  • I've added some additional explanation on the crowd demo scene script below.
// Initialize character mesh list
int hierarchy_depth, skeleton_bone_count;
// This function will initialize the GPUAnimationControllers and combine any that have duplicate skeletons (to save space)
var controllers = GPUSkinnedMeshComponent.PrepareControllers(characters, out hierarchy_depth, out skeleton_bone_count);

// Initialize GPU Instancer
this.m = new MeshInstancer(); // The MeshInstancer is the main object you will be using to create/modify/destroy instances
this.m.Initialize(max_parent_depth: hierarchy_depth + 2, num_skeleton_bones: skeleton_bone_count, pathCount: 2);
this.p = new PathArrayHelper(this.m); // PathArrayHelper can be used to manage & spawn paths for instances to follow

// Add all animations to GPU buffer
this.m.SetAllAnimations(controllers);

// Add all character mesh types to GPU Instancer- this must be done for each different skinned mesh prefab you have
foreach (var character in this.characters)
    this.m.AddGPUSkinnedMeshType(character);

// Everything is initialized and ready to go.. So go ahead and create instances
for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
    {
        var mesh = characters[Random.Range(0, characters.Count)]; // pick a random character from our character list
        var anim = mesh.anim.namedAnimations["walk"]; // pick the walk animation from the character
        instances[i, j] = new SkinnedMesh(mesh, this.m); // The SkinnedMesh struct is used to specify GPU skinned mesh instances- it will create and manage instances for every skinned mesh & skeleton bone.
        instances[i, j].mesh.position = new Vector3(i, 0, j); // set whatever position you want for the instance
        instances[i, j].SetRadius(1.75f); // The radius is used for culling & LOD. This library uses radius aware LOD & culling. Objects with larger radius will change LOD and be culled at greater distances from the camera.
        instances[i, j].Initialize(); // Each instance must be initialized before it can be rendered. Really this just allocates some IDs for the instance.

        instances[i, j].SetAnimation(anim, speed: 1.4f, start_time: Random.Range(0.0f, 1.0f)); // set the walk animation from before

        var path = GetNewPath(); // create new path
        instances[i, j].mesh.SetPath(path, this.m); // You have to make the instance aware of any paths it should be following.
        paths[i, j] = path;

        instances[i, j].UpdateAll(); // Finally, invoke UpdateAll(). This function will append the instance you created above to a buffer which will be sent to the GPU.
    }


// Get New Path Function. This will create a simple 2-point path.
private Path GetNewPath()
{
    // Get 2 random points which will make a path
    var p1 = RandomPointOnFloor();
    var p2 = RandomPointOnFloor();
    while ((p1 - p2).magnitude < 10) // ensure the path is at least 10 meters long
        p2 = RandomPointOnFloor();

    // The Path Struct will specify various parameters about how you want an instance to behave whilst following a path. See Pathing.cs for more details.
    Path p = new Path(path_length: 2, this.m, loop: true, path_time: (p2 - p1).magnitude, yaw_only: true, avg_path: false, smoothing: false);
    
    // Initialize path- this allocates path arrays & reserves a path gpu id
    this.p.InitializePath(ref p);
    
    // Copy path into buffers
    var start_index = this.p.StartIndexOfPath(p); // what's happening here is we're just copying the 2 random points into an array
    this.p.path[start_index] = p1;
    this.p.path[start_index + 1] = p2;
    this.p.AutoCalcPathUpAndT(p); // Auto calculate the 'up' and 'T' values for the path.
    
    // Each path you create requires that you specify an 'up direction' for each point on the path
    // This is necessary for knowing how to orient the instance whilst it follows the path
    
    // Additionally you need to specify a 'T' value. This 'T' value can be thought of as an interpolation parameter along the path.
    // Setting path[4]=0.5 will mean that the instance will be half way done traversing the path at the 4th point on the path
    // The 'T' value is used to specify how fast/slow the instance will traverse each segment in the path.
    
    // send path to GPU
    this.p.UpdatePath(ref p); // Finally, this function will append the path you created to a buffer which will send it to the GPU!

    return p;
}
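
To make the 'T' value described in the comments above more concrete, here is a small standalone sketch of one plausible way to assign T values: proportional to cumulative distance along the path, so the instance moves at constant speed. This is an assumption for illustration only; it is not taken from the library's source and may not match what AutoCalcPathUpAndT actually does. The `PathT` class and `ConstantSpeedT` method are hypothetical names.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of the library.
public static class PathT
{
    // Returns one T value per point: 0 at the first point, 1 at the last,
    // and in between the fraction of total path length covered so far.
    public static float[] ConstantSpeedT(Vector3[] points)
    {
        if (points == null || points.Length < 2)
            throw new ArgumentException("A path needs at least two points.");

        var t = new float[points.Length];
        float total = 0f;
        for (int i = 1; i < points.Length; i++)
        {
            total += Vector3.Distance(points[i - 1], points[i]);
            t[i] = total; // cumulative length up to point i
        }
        if (total <= 0f)
            throw new ArgumentException("Path has zero length.");

        for (int i = 1; i < points.Length; i++)
            t[i] /= total; // normalize to [0, 1]
        t[points.Length - 1] = 1f; // guard against rounding error at the end
        return t;
    }
}
```

Assigning T values by hand instead lets an instance linger on some segments and rush through others: a short segment covering a large T range is traversed slowly, and a long segment covering a small T range is traversed quickly.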

Some performance considerations

  • Try and reduce the number of multi-skinned meshes you have. Each additional mesh causes an additional DrawInstancedIndirect call- which results in more draw calls.
  • You can change the animation blend quality of your models at each LOD by just changing it on the SkinnedMeshRenderer of your model prefab. I recommend 1-2 bones for lower LODs- it is pointless to have more.
  • You can (and should) use different materials at different LODs. E.g. use a fancy shader for LOD0 and basic diffuse for LOD4. Again, just specify this on your Skinned Mesh.
  • You can toggle shadows on/off for different LODs- again on your skinned mesh.
  • Don't spawn more than ~50000 of the same (mesh,material) type. Instead break it up into batches by instantiating a new identical material.
    • You will have much higher FPS instancing 20 types of 50000 objects each, using instantiated materials, than all one million as the same type. This has to do with contention on the GPU.
  • If you aren't using LODs for your skinned mesh then use them.
    • On a GTX 1060 6GB- All of the demos (using 10-15000 skinned mesh) will run at above 150FPS.
    • Without LOD- Maybe 20-30FPS. There are simply too many animated vertices.
  • This library has very little CPU overhead. You will only really get CPU overhead from populating the buffers which send data to the GPU.
  • Changing the depth of entities with many children can be expensive. If you need to reparent entities with many children, try keeping them at the same hierarchy depth before and after reparenting.
  • You can create, modify, and destroy instances on threads other than the Unity main update thread.
  • That being said, thread safety for this library is implemented via simple mutual exclusion locks.
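
The batching advice above (no more than ~50000 instances per (mesh,material) type) comes down to simple index arithmetic. The sketch below is a hypothetical helper, not part of the library: `BatchPlanner`, `BatchCount`, and `BatchOf` are names invented here. In a Unity project, each batch would presumably get its own material copy (for example via Unity's `new Material(source)` copy constructor) registered as a separate type, but that wiring is left out so the example stays self-contained.

```csharp
using System;

// Hypothetical helper, not part of the library: splits instances into batches
// so no single (mesh, material) type exceeds a chosen instance limit.
public static class BatchPlanner
{
    public const int MaxPerType = 50000; // the rough limit suggested in the notes above

    // Number of batches (i.e. material copies) needed for instanceCount instances.
    public static int BatchCount(int instanceCount, int maxPerBatch = MaxPerType)
    {
        if (instanceCount < 0) throw new ArgumentOutOfRangeException(nameof(instanceCount));
        if (maxPerBatch <= 0) throw new ArgumentOutOfRangeException(nameof(maxPerBatch));
        return (instanceCount + maxPerBatch - 1) / maxPerBatch; // ceiling division
    }

    // Which batch a given instance index belongs to.
    public static int BatchOf(int instanceIndex, int maxPerBatch = MaxPerType)
    {
        if (instanceIndex < 0) throw new ArgumentOutOfRangeException(nameof(instanceIndex));
        if (maxPerBatch <= 0) throw new ArgumentOutOfRangeException(nameof(maxPerBatch));
        return instanceIndex / maxPerBatch;
    }
}
```

With these defaults, one million instances map to 20 batches, which matches the "20 types rather than one" example in the list above.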

Other Notes

  • Some of the animations look Jank ASF because I am not an artist- I used Mixamo rigger with all my LODS at once which results in Jank
  • Root animations not supported.
  • Very simple animation state- No animation blending is implemented.
  • If you want though, you can enable/disable animation for select bones and manually pose them yourself. Eg- Have a ragdoll control it or something.
  • Tile textures are supported. You can specify the tiling & offset for the instance to use.
  • There is also an optional per-instance color that you can set.
  • If you want to modify the compute shader and add your own stuff- you can overwrite some of the fields in the property struct safely.
    • You can overwrite the offset/tiling if you don't need them. You can overwrite the color if you don't need it. You can overwrite the pathInstanceTicks if not using a path. You can overwrite the instanceTicks if not using an animation. The pad2 field is completely unused- you can use it for whatever without any worries.
  • What version of Unity is supported? Unity 2023.2 is what this project was most recently built with- but it should work for most versions. See branches for versions with explicit support.

gpuinstance's People

Contributors

mkrebser

gpuinstance's Issues

How do textured instances work? Is there a better way to reparent instances in batches?

Hello again, I've been hard at work optimizing my voxel engine and it seems that the new thing causing major lag spikes is re-parenting voxels. When I disconnect a section of voxels from an object, I need them to do just that, disconnect, so I re-parent them to a new instance with the positioning of a rigidbody, and they fall away. The issue is obviously that, the larger the group, the slower this re-parenting is. That's why I was wondering if it would be possible to group the voxels into large cuboids, and just change the texture of the instance based on the color of the voxels along its edge. This would mean, however, that I would need to use textures created at runtime and use a different one for each instance, as well as change them as I go. So, my question is, would these different textures differentiate the instances on the GPU and defeat the purpose of using this library, and if so, is there some way to re-texture them without doing that? Finally, if none of that's possible, how should I re-parent these instances more efficiently?

Thanks for your continued help,
-Max

This Project is the Bomb! Thank you!

This isn't really an issue, but I wanted to leave you a positive review/comment. Not sure where else to leave this.

Compliment!: Comparing to the other competing solutions (where they bake vertex positions from each animation into a texture), this one appears to be superior, in that you bake the Bone Keyframes to the GPU and do animation normally using the bones. That's a LOT less data to be processed and sent to the GPU, and seems to me as a superior solution.

Question: How does your solution stack up against the "GPU Instancer" on the Asset Store (and its add-on "Crowd Animations")?

BUG: I did find one bug -- the "Camera Culling" was only showing a "sliver" of content, instead of true "near to far distance of frustum" (was instead like distance 10 to 12, not 0.1 to 100 as I'd expect). So I nulled this out so that I could see more content.... and with 10,000 non-culled walking animated characters, I'm still running at 30 FPS.

Packaging for iOS to draw a real-time point cloud reports an error. How to solve this problem?

Packaging for iOS with Xcode to draw a real-time point cloud reports this error:
ArgumentException: The t parameter is a generic type.
Parameter name: t
at UnityEngine.Rendering.CommandBuffer.SetBufferData (UnityEngine.ComputeBuffer buffer, System.Array data, System.Int32 managedBufferStartIndex, System.Int32 graphicsBufferStartIndex, System.Int32 count) [0x00000] in <00000000000000000000000000000000>:0
at GPUInstance.InstanceMeshDeltaBuffer`1+InstanceMeshIndirectIDBuffer[T].UpdateComputeBuffer (UnityEngine.ComputeBuffer buffer, UnityEngine.Rendering.CommandBuffer cmd) [0x00000] in <00000000000000000000000000000000>:0
at GPUInstance.instancemesh.computeshader_UpdateDataTask (System.Boolean& force_dispatch) [0x00000] in <00000000000000000000000000000000>:0
at GPUInstance.instancemesh.Update (System.Single dt) [0x00000] in <00000000000000000000000000000000>:0
at GPUInstance.MeshInstancer.Update (System.Single deltaTime) [0x00000] in <00000000000000000000000000000000>:0
at PointCloudSubscriber.Update () [0x00000] in <00000000000000000000000000000000>:0

Camera Controls Wonky (always rotating with mouse)

I created a "MainCamera" prefab, with the following adjusted settings on "CameraFreeFly" script:

Move Speed: 4
Sensitivity 4
Rotate When Down: Mouse1 (same as for Unity editor)
Sprint Multiplier: 7

The biggest need here is for "Mouse1" rotation (right mouse button) -- so that the camera isn't constantly rotating.
And the slower default speed with "7" Sprint, allows for fine-tuned motion, and fast motion (both needed for exploring the demo)

Then use this MainCamera Prefab in ALL demo scenes, and so if a user wants to adjust the Camera settings, they need only change the values on the prefab.

So this issue is just a suggestion to update to using a Camera prefab, with better settings (like those above).

Help with positioning

Hi,
I want to use it as a way of introducing animations in DOTS, as the official animation package is still not ready.
So I want to place the instances myself in code.
But setting the position gives an error, as that is only available for billboards.
Is there a way to set the position of instances in code?

Appending Position is 1 frame late?

Hello again! I've noticed that when an instance's position is appended, it only visually applies a frame late, at least when compared with a transform. Is there a way to circumvent this? I typically wouldn't mind an issue this minor, but I'm attempting to use this instance as a player view model, making the delay quite noticeable to players when moving.

Thanks in advance,
Max

Android Doesn't Work At All

Tested With

  • GPU Adreno 509
  • OPEN GLES 3
  • ES 3.1, 3.2
  • VULKAN

Due to some API changes on the Unity side, I first fixed all the resulting errors. After that, the Windows build and editor play mode worked as expected. I don't have any experience with shaders (or with their types and requirements).

On Android, there are two problems:

  • Nothing rendering on the phone
  • The unity debugger throws these
  1. GLSL link error: Error: BufferBlock BufferBlock location or component exceeds max allowed. (default and skinned mesh)
  2. Autoconnected Player Can't add component because class 'BoxCollider' doesn't exist!
  3. Autoconnected Player ArgumentException: The t parameter is a generic type.
    Parameter name: t

Is this project suitable for mobile? If so, what are the things I need to do to solve the current bugs ?

"Tried to parent a child to an unitialized parent" error on a parent that's already been initialized multiple frames ago.

I know for a fact that the instance was initialized because I literally see it, and when trying to initialize it again, I get the "already initialized" error. I also know the parentID is 1 on every instance parented at startup, and that I'm inputting 1 to the new instances, so it's not that. Is there something I'm doing wrong? How do I parent new instances to an old instance? Sorry if this is another case of me missing it in the code guide or something, but I'm pretty lost.

Thank you once again,
Max

Changing render distance of instances

I scoured the codebase for info on this but the closest thing I could find was UniformCullingDistance and FrustumCullingType both of which seem to have no effect on the instance's render distance. Perhaps I'm assigning it incorrectly or something, but I can't figure it out so my question is, how do I change the culling distance of the instances the right way?

Not an issue - Amazing Project! Would love to have a URP version, even paid!

Hello,
this is not an issue. I just try to reach out to you this way. :)
First of all, I want to say this project is amazing! Compared to other gpu instancing solutions this one is by far the best!
I would love to have URP support. Is there any way you could help me in this regards? I tried to convert the shaders myself but without any luck. I would definitely be willing to pay for a URP version.

Edit2: Getting instance's position in shadowmap for uniform shading shader

Hello! I've been working on a voxel engine in Unity using your instancing library for the rendering of the voxels, and after stress testing, concluded that I desperately needed LOD models for distant voxels in the case of larger-scale maps. After setting up the LOD settings I got much better performance, but because of the shape of my LOD models, had very obvious shading differences in the distant lower detail models than I did with the full-resolution voxels. This led me to try to make a shader that would either make the whole voxel model shaded or unshaded (uniform shading). I'm very new to shaders but was able to throw together an unlit one after some trial and error, only to realize it wasn't what I was looking for. I began trying to use a typical v2f shader with the instancing to have more control over the shadows and shading but quickly found that the voxels would now no longer instance correctly, instead instancing as white with their position all as 0,0,0. This happened a few times previously, during the creation of my unlit shader when I was missing any of the instancing-specific keywords or something, but in this case, I can't use the typical instancing keywords because I'm no longer using the standard surface shader. So I guess my question is;

(TLDR;) What parts are required to get the shader working right on instancing or do you have a workaround for uniform shading? Thank you so much for reading my ramblings in advance, and I hope this isn't too much to ask!

-Max

My current shader attempt:

	Shader "Instanced/instancemesh_distantVoxel"
	{
		Properties
		{
			_MainTex("Albedo (RGB)", 2D) = "white" {}
			//_LightingAdjustment("LightingAdjustment", Range(0, 3)) = 1.5
		}
		
		SubShader
		{
			Tags{ "RenderType" = "Opaque" }
			LOD 200

			CGPROGRAM
			// Upgrade NOTE: excluded shader from OpenGL ES 2.0 because it uses non-square matrices
			#pragma exclude_renderers gles
			// Physically based Standard lighting model
			#pragma surface surf Standard noforwardadd vertex:vert finalcolor:colorOverride
			#pragma instancing_options procedural:setup
			#pragma target 5.0
			#pragma multi_compile_instancing
			#include "AutoLight.cginc"
			#include "Lighting.cginc"

			sampler2D _MainTex;
			half4 colorSample;
			int sampledColor = 0;
			float shadowAtten = 0;
			//float _LightingAdjustment;

			#include "gpuinstance_includes.cginc"

			struct Input
			{
				float2 uv_MainTex;
				fixed4 color;
			};

			void setup()
			{
				do_instance_setup();
			}

			void vert(inout appdata_full v, out Input o)
			{
				UNITY_INITIALIZE_OUTPUT(Input, o);
				if (sampledColor < 1)
				{
					shadowAtten = SHADOW_ATTENUATION(v);
				}
				int id = get_instance_id();
				o.color = get_instance_color(id);
			}

			void surf(Input IN, inout SurfaceOutputStandard o)
			{
				if (sampledColor < 1)
				{
					fixed4 c = tex2D(_MainTex, IN.uv_MainTex) * IN.color;
					o.Albedo = c.rgb;
					o.Alpha = c.a;
					//o.Normal = 1;
					o.Occlusion = 0;
					o.Metallic = 0;
					o.Smoothness = 0;
					//o.Emission = c.rgb / (fixed4(1, 0.9568627f, 0.8392157f, 1) * 0.8f);
					colorSample = half4(c.r, c.g, c.b, 1);
					sampledColor++;
				}
				o.Albedo = colorSample;
				o.Alpha = 1;
				//o.Normal = 1;
				o.Occlusion = 0;
				o.Metallic = 0;
				o.Smoothness = 0;
			}

			void colorOverride(Input IN, SurfaceOutputStandard o, inout fixed4 color)
			{
				color = colorSample / (1 + shadowAtten);
			}

			ENDCG
		}
		FallBack "Diffuse"
	}

Different Shader

Hi, the demos work very well. I had no problems with spawning to the points I wanted. At this point I want to run these codes for different shader types. For example the URP Lit shader. Is there an extra function for shader graph?

Enabling/Disabling Animations For Certain Bones

Hi There

Really Powerful Tool You Have Created Here I'm Very Happy With How Stable It Is

My Question Is How Can I Enable/Disable Some Bones Per Animation To Achieve Some Sort Of Animation Blending

Kind Regards & Thanks For Taking Time To Answer Me

High GPU Usage for Culling?

I'm running the "instanceMeshAnimationDemo" with all of the block-people (225 x 225 = 50,625 instances!).

I set "Application.targetFrameRate = 60", so that it doesn't keep pegging my GPU Usage to 100% (and causing fan to turn on to cool it).... So for this demo, running at 60 FPS, it uses 41% of my GPU looking at the whole horde of them (all in frustum).

If I turn away, so that NONE are showing, the GPU is still 15%, even when Camera is still.

I'm not for sure, but this seems to indicate that maybe the "culling" logic could be optimized considerably.

One suggestion comes to mind:

  1. Only do "culling" type logic periodically (2 Hz), or for significant camera Rotation.
  2. Widen Frustum by 5 degrees (each direction) - and then you can trigger culling to re-run if Camera moves by more than 5 degrees Left/Right.

So the culling would run at "{A} Hz" guaranteed (to account for animated movement coming in/out of focus), OR if the camera is turned Left/Right by "{B} Degrees". These should be user configurable.

I tried simply not calling "SelectRenderInstances" as much -- but apparently, in your pipeline, this is needed EVERY FRAME, else nothing shows up (thus my lame/easy attempt to optimize failed).

Changing the Layer of Instances?

Hey, it's me again! I was wondering whether or not it would be possible to change the layer of each instance, as I want my camera to cull a specific instance and ignore the rest. I was messing with it and found that the instances use the default layer, so could change this on an instance basis?

Thanks as always,
-Max
