What does "Pass platform caps keyword defines to compute shaders" mean?
In the [2017.1.0 release notes][1] it says "Shaders: Pass platform caps keyword defines to compute shaders." What does that mean?
[1]: https://unity3d.com/unity/whats-new/unity-2017.1.0
↧
Compute Shader calculation time
Hi, how can I know for sure that from the moment I call:
shader.Dispatch(kernelHandle, x, y, z);
the compute shader is done with its calculations by the very next line? I mean, let's say I want to calculate some data in a compute buffer: how can I know for sure that right after I call Dispatch, the data in the compute buffer is ready so I can read it back? I've seen a lot of examples, and all of them show something like:
shader.SetBuffer(kernelHandle, "TestBuffer", buffer);
shader.Dispatch(kernelHandle, 8, 8, 1);
Test[] testBack = new Test[2];
buffer.GetData(testBack);
I mean, how does the application know that after Dispatch you can already get the data? Or is the calculation so fast that it finishes in less time than it takes to go from one line of code to the next?
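For what it's worth: as far as I understand it, `ComputeBuffer.GetData()` is a blocking call, so the CPU stalls inside `GetData` until the GPU has finished the dispatches that write to that buffer. The pattern above is therefore correct by construction rather than by luck, at the cost of a pipeline stall. A separate pitfall worth checking is that the dispatch actually covers the whole buffer; the group-count arithmetic can be sketched like this (class and method names are my own, not Unity API):

```csharp
using System;

class DispatchMath
{
    // Ceil-division: the number of thread groups needed so that
    // groups * threadsPerGroup covers every buffer element at least once.
    public static int GroupCount(int elements, int threadsPerGroup)
    {
        return (elements + threadsPerGroup - 1) / threadsPerGroup;
    }

    static void Main()
    {
        // e.g. a kernel declared [numthreads(64,1,1)] over 1000 elements:
        // shader.Dispatch(kernelHandle, GroupCount(1000, 64), 1, 1);
        Console.WriteLine(GroupCount(1000, 64)); // prints 16
    }
}
```

Dispatching fewer groups than this leaves the tail of the buffer unwritten, which can look like a synchronization bug even though it isn't one.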
↧
How would you index a texture in a compute shader to match UV coordinates
If you have a compute shader with 16 by 16 by 1 threads per group, dispatched 32 by 32 by 1 times, over a texture of size 512 * 512, how would you index it in a way that matches the UV coordinates of the texture?
This would be the indexed texel of the texture:
e.g. float2 texelID = dispatchThreadID.xy
uv coordinates would come from a vertex buffer created for a compute shader:
float2 uv = inputBuffer[index].UV
Put the texelID into the UV coordinate space (scale at least)
texelID /= 512
Question is: are there any coordinate inversions I need to do to ensure I am mapping the texel into the UV coordinates correctly?
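A hedged sketch of one common answer: `texelID / 512` maps texel (0,0) to UV (0,0), i.e. the texel's corner, while the stored UVs usually refer to texel centres, so a half-texel offset is added before dividing. Whether you additionally need to flip V depends on the graphics API's origin convention, so it is worth testing with an asymmetric texture. The arithmetic (names are mine):

```csharp
using System;

class TexelToUv
{
    // Map an integer texel coordinate (from SV_DispatchThreadID.xy) to the
    // UV-space centre of that texel. The +0.5 half-texel offset makes texel
    // (0,0) map to the centre of the first texel rather than its corner.
    public static (float u, float v) TexelCenterToUv(int x, int y, int size)
    {
        return ((x + 0.5f) / size, (y + 0.5f) / size);
    }

    static void Main()
    {
        Console.WriteLine(TexelCenterToUv(0, 0, 512));     // (0.0009765625, 0.0009765625)
        Console.WriteLine(TexelCenterToUv(511, 511, 512)); // (0.998046875, 0.998046875)
    }
}
```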
↧
How would I use compute shader with physics?
Hello,
I've been trying to make a simulation game using particles, but my main problem is how to make them collide with each other. Almost everywhere I've looked, people talk about using compute shaders to do this. I have no knowledge of compute shaders at all and have never even used a shader. Could someone please link me to a tutorial that goes over how to use compute shaders, and not just what they can do, like the ones I've seen so far?
Thank you!
↧
How do I use compute shaders with particles?
Hello,
I'm trying to make a 2D mobile simulation game that is entirely based on particles. It's meant to have different elements you can spawn that act in different ways. At the moment I've made water, but I need to make the particles collide with other particles in the same system and with other particle systems. I know the basics of compute shaders, but I don't know how to do physics with them. I've seen this in a game called Jelly In The Sky, which uses compute shaders and particles for self-collision.
So what I'm asking is: how would I write a compute shader to calculate the collisions for the particles and make them collide?
Thanks.
↧
Accessing and writing depth on a render texture in compute shader
Hey, this might seem like a stupid question, but after a few days of searching I have found only people who don't know how to do this or people who assume you already know how, but no actual explanation.
I am writing a compute shader to which I am passing a RenderTexture. So far, changing the color of the RenderTexture works fine: I declare it as a RWTexture2D and set the color of each pixel as a float4.
My problem is that I also need to write to the depth buffer of my RenderTexture. How do I do this? Is there a way to read and write the depth part of the RenderTexture I am passing to my compute shader?
↧
ComputeBuffer low-level access in Direct3D 11 in Unity 5.5.0B3
Hi all,
I just discovered **ComputeBuffer.GetNativeBufferPtr** in Unity 5.5.0B3. Very very awesome!!
I'm trying to use the ID3D11Buffer with CUDA interop but it keeps crashing.
When looking at the buffer description, Direct3D reveals this no matter what size or type I specify:
- ByteWidth: 256
- Usage: 256
- BindFlags: 1
- CPUAccessFlags: 1
- MiscFlags: 2
- StructureByteStride: 1
I'm wondering whether this feature is ready to use yet, or whether I'm doing something wrong.
Thank you so much!
David
↧
Geometry Shader is drawing triangles at wrong vertex positions
Hi,
After learning about shaders in past couple of weeks, I wanted to try something on my own.
So I tried to draw a face for a cube using Compute Shader and Geometry Shader.
The compute shader is generating 16 points (2x2 groups of 2x2 threads) which are equally spaced 1 unit apart, with the bottom-left vertex at the origin. Below is the result without the geometry shader.
![alt text][1]
When I use the geometry shader to generate triangles at each point, it shifts the vertices slightly along the x and y axes, as you can see below. There is a gap between the rows of triangles.
![alt text][2]
Here is my compute and geometry shader code.
Shader "Custom/CubeShader"
{
SubShader
{
Pass
{
CGPROGRAM
#pragma target 5.0
#pragma vertex vert
#pragma geometry GS_Main
#pragma fragment frag
#include "UnityCG.cginc"
StructuredBuffer<float3> square_points;
struct ps_input {
float4 pos : POSITION;
};
struct gs_input {
float4 pos : POSITION;
};
ps_input vert(uint id : SV_VertexID)
{
ps_input o;
float3 worldPos = square_points[id];
o.pos = mul(UNITY_MATRIX_MVP, float4(worldPos,1.0f));
return o;
}
[maxvertexcount(3)]
void GS_Main(point gs_input p[1], inout TriangleStream<gs_input> triStream)
{
float4 v[3];
v[0] = float4(p[0].pos.x, p[0].pos.y, p[0].pos.z, 1);
v[1] = float4(p[0].pos.x + 1.0f, p[0].pos.y + 0.0f, p[0].pos.z + 0.0f, 1);
v[2] = float4(p[0].pos.x + 1.0f, p[0].pos.y + 1.0f, p[0].pos.z + 0.0f, 1);
float4x4 vp = mul(UNITY_MATRIX_VP, _World2Object);
gs_input pIn;
pIn.pos = mul(vp, v[2]);
triStream.Append(pIn);
pIn.pos = mul(vp, v[1]);
triStream.Append(pIn);
pIn.pos = mul(vp, v[0]);
triStream.Append(pIn);
}
float4 frag(ps_input i) : COLOR
{
return float4(1,0,0,1);
}
ENDCG
}
}
Fallback Off
}
#pragma kernel CSMain
#define thread_group_size_x 2
#define thread_group_size_y 2
#define thread_group_size_z 1
#define group_size_x 2
#define group_size_y 2
#define group_size_z 1
struct PositionStruct
{
float3 pos;
};
RWStructuredBuffer<PositionStruct> output;
// I know this function is unnecessary. But only for now :) I would like to add more things here.
float3 GetPosition(float3 p, int idx)
{
return p;
}
[numthreads(thread_group_size_x, thread_group_size_y, thread_group_size_z)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
int idx = id.x + (id.y * thread_group_size_x * group_size_x) +
(id.z * thread_group_size_x * group_size_y * group_size_z);
float3 pos = float3(id.x, id.y, id.z);
pos = GetPosition(pos, idx);
output[idx].pos = pos;
}
Please help me debug it. Thanks in advance :)
[1]: /storage/temp/78258-pnts-1.png
[2]: /storage/temp/78259-trngs-1.png
↧
UVs are not mapped properly in my unity surface shader
Hi,
I am trying to use compute shaders together with surface shaders and DX11 tessellation shaders (using Unity's distance-based tessellation example from the Unity manual). I copied the generated code from the surface shader and used it as a separate shader.
I am generating vertices for a plane in a compute shader. The triangles are generated properly, but I seem to have a problem with mapping the UVs. Whatever texture I use, all I get is a black plane.
Here is my surface shader's vertex shader:
InternalTessInterp_appdata tessvert_surf(appdata v, uint id:SV_VertexID) {
InternalTessInterp_appdata o;
o.vertex = float4(vertexBuffer[id],1);
o.tangent = v.tangent;
o.normal = v.normal;
o.texcoord = float2(o.vertex.x/16,o.vertex.y/16);
return o;
}
Here is my compute shader (the value of thread group size x, group size y, and group size z is 4):
void GenerateMesh(uint3 id : SV_DispatchThreadID)
{
int idx = id.x + (id.y * thread_group_size_x * group_size_x) +
(id.z * thread_group_size_x * group_size_y * group_size_z);
float3 pos = float3(id.x, id.y, id.z);
output[6 * idx + 0].pos = pos + float3(1, 1, 0);
output[6 * idx + 1].pos = pos + float3(1, 0, 0);
output[6 * idx + 2].pos = pos;
output[6 * idx + 3].pos = pos;
output[6 * idx + 4].pos = pos + float3(0, 1, 0);
output[6 * idx + 5].pos = pos + float3(1, 1, 0);
}
Here is my CSharp code
void DrawMesh()
{
planetComputeShader.Dispatch(kernel, 4, 4, 1);
planetTesselationMaterial.SetPass(0);
planetTesselationMaterial.SetBuffer("vertexBuffer",vertexBuffer);
Graphics.DrawProcedural(MeshTopology.Triangles,6 * 289);
}
My final aim is to create a procedural planet generated on the GPU. I know I am a long way from that, but I am taking small steps :-) Please help me debug this. Thanks in advance.
↧
Indexing an array thousands of times per frame
I have an array that is 60 by 60 by 60, filled with Vector3 wind data. I am trying to visualize how the wind flows with arrows that move around the scene. Right now there is too much processing going on and the game freezes.
Currently I am working with a smaller array so that the game can load. I am wondering what I can do to optimize this process. I was told to try a compute shader, because every object is doing the exact same task, but the documentation on compute shaders seems pretty slim. I also don't know whether the GPU can access an array of data to get the wind information. Please share any tips that might help, and note how much difference you think each would make. I don't have much experience with Unity or shaders at all, but I have Google!
Thank you!
What the game does:
First, 216,000 (60*60*60) objects are created and their positions are set to the index values of our wind array, e.g. (1,1,1), (1,1,2), ... (60,60,60).
Every frame, their positions are transformed by adding the value from the wind data array to their current position. Let's say a particle is at (5.3, 2.0, 6.9); the new position will be (5.3, 2.0, 6.9) + windData[5,2,6].
The rotations of the objects are also set from the windData to make sure the arrows point the correct way.
If an object leaves the 60 by 60 by 60 cube, its position is reset to its starting location.
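On the "can the GPU access the array" part: yes. The usual approach is to upload the 60×60×60 Vector3 data once into a ComputeBuffer with SetData and index it from the kernel. A StructuredBuffer is one-dimensional, so the 3D cell coordinate has to be flattened the same way on the C# and HLSL sides; a sketch of that arithmetic (names are placeholders, not anything from the question's project):

```csharp
using System;

class WindIndex
{
    const int Size = 60;

    // Flatten a 3D cell coordinate into the 1D index a StructuredBuffer
    // (filled from a flat Vector3[] via ComputeBuffer.SetData) would use.
    public static int Flatten(int x, int y, int z)
    {
        return x + y * Size + z * Size * Size;
    }

    static void Main()
    {
        Console.WriteLine(Flatten(0, 0, 0));    // 0
        Console.WriteLine(Flatten(5, 2, 6));    // 5 + 120 + 21600 = 21725
        Console.WriteLine(Flatten(59, 59, 59)); // 215999, the last of 216000 cells
    }
}
```

The same formula, written in HLSL inside the kernel, lets each thread look up windData for its own cell without any CPU involvement per frame.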
↧
confusing compute shader results
I have a compute shader that generates only positions, which my geometry shader uses to build cubes. The compute shader is dispatched with 4 by 4 by 4 thread groups and a group size of 1 by 1 by 1, so 4*4*4 positions are 'calculated'. But even if I set the position of every buffer element to (1, 0, 0), I see more than one cube on the screen (one rendered at each position), meaning they somehow end up at different positions?! If I hardcode a fixed position in the geometry (or vertex) shader, the problem is gone, so it has to be in the buffer/compute shader.
I simplified all of the code but can't find what is causing this odd behaviour.
Has anybody seen the same behaviour, or can anyone tell me what's happening?
↧
Strange behaviour on compute shader
I've written a small compute shader that spits out positions for my geometry shader. The thread group size is 4 by 4 by 4 and the group count is just 1, so I get 4*4*4 = 64 positions at which my geometry shader has to generate a simple cube. But I get some strange behaviour... Even if I set all the buffer elements (my positions) to (1.0, 0.0, 0.0), I see more than one cube on the screen :( If I set the position directly in the geometry (or vertex) shader to this value, I get only my single cube, so it has to be something in the compute shader, right?!
Is there someone with a similar problem, or someone who can give me a hint about what causes this strange behaviour?
↧
RWTexture2D in Compute Shader on Android?
**I am trying to use [this asset][1] to test out a compute shader on Android** that allows texture modification. I'm new to shaders, so I may have misunderstood something or be unaware of some limitation on mobile. I had assumed that since it worked in the editor (with the Android platform selected) it would just work on Android. I have tested this in the editor, on the Android platform, with OpenGL ES 3.1 as the minimum graphics API and graphics emulation disabled, and the asset works as expected. Player settings look like this:
![alt text][2]
When testing it on Android, the asset caused a force close on Unity 5.4, and on 5.5 it complains about "*Kernel at index 0 is invalid*" when I run the same code as in the editor. If I modify the shader code by commenting out a section where it assigns to a RWTexture2D, the error goes away, but obviously nothing is drawn to the screen.
This is a snippet of code to describe the problem area (I cannot share the complete code; it's an asset from the Asset Store):
#pragma kernel pixelCalc
RWTexture2D<float4> textureOut;
[numthreads(32,32,1)]
void pixelCalc (uint3 id : SV_DispatchThreadID){
// some stuff happens here...
while (itn < 255 && d < 4){
// calculations and stuff...
itn++;
}
if (itn == 256) // if this line and 3 below it are commented out, runs on android!
textureOut[id.xy] = float4(0, 0, 0, 1);
else
textureOut[id.xy] = colors[itn];
}
**I hope somebody can confirm whether RWTexture2D in a compute shader should or should not work on Android.** Has anybody had this error and found a way to make it work on Android?
[1]: https://www.assetstore.unity3d.com/en/#!/content/72649
[2]: /storage/temp/83658-b8b256d68784a6cdb751453836cd2dee.png
↧
Compute shader array output
Hi Everyone,
Is it possible to output, for example, 8 vertices per element in the output buffer while using an ordinary array on the C# side, without resorting to unsafe code or a fixed struct?
For example, look at this code:
struct outData
{
public float sdf;
float v0, v1, v2, v3, v4, v5, v6;
float v7, v8, v9, v10, v11, v12, v13;
float v14, v15, v16, v17, v18, v19, v20;
float v21, v22, v23;
};
This is not nice code; is it the only solution?
Thanks, bye
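One possible alternative, sketched under the assumption that the GPU side keeps a 24-float struct: a ComputeBuffer only cares about the element stride, so the C# side can create it as `new ComputeBuffer(count, 24 * sizeof(float))` and pass a plain flat `float[]` of `count * 24` elements instead of mirroring the struct field by field. The flattening itself is just index arithmetic (class, method, and layout names here are mine):

```csharp
using System;

class FlatBufferData
{
    // Pack an array of 3-component vertices into one flat float array with
    // the same memory layout the GPU-side struct expects (x,y,z per vertex,
    // vertices laid out consecutively).
    public static float[] Flatten(float[][] verts)
    {
        var data = new float[verts.Length * 3];
        for (int i = 0; i < verts.Length; i++)
        {
            data[i * 3 + 0] = verts[i][0];
            data[i * 3 + 1] = verts[i][1];
            data[i * 3 + 2] = verts[i][2];
        }
        return data;
    }

    static void Main()
    {
        var v = new float[][] { new float[] { 1, 2, 3 }, new float[] { 4, 5, 6 } };
        Console.WriteLine(string.Join(",", Flatten(v))); // 1,2,3,4,5,6
    }
}
```

The HLSL struct can stay as it is; only the C# mirror type goes away.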
↧
Use of compute shader in coroutine freezes the rendering thread
I'm currently working on a learning AI based on the Double DQN algorithm.
As the code is huge and I couldn't find which part of it is buggy, I'll first describe the system. I'm working on a smaller example project to share, but I'd prefer to present my problem first.
I developed a compute shader to run a neural network on the GPU. The learning algorithm runs on the CPU side, in a coroutine.
The main loop is used to manage the interface, with some buttons and sliders for visual feedback.
Everything is fine when I try the framework on a small test project.
My problem appears when I work on my real project. The coroutine seems to run perfectly; I can follow the full loop in the debugger. But despite the yield statement, the interface loop never resumes, and the editor freezes.
So, while I prepare an example project: is there a known reason for the main loop not to resume after a "yield return null;" statement in a coroutine?
[PS: I'm not fluent in English, sorry for any mistakes.]
↧
Use compute shader to go RGBA to RGB
Is it possible to write a compute shader that converts an RGBA image to an RGB image? I am new to HLSL and I cannot find a byte or uchar type to make a structured buffer like
RWStructuredBuffer< uchar > WouldntThisBeNice;
Many functions in OpenCV work on RGB images but not RGBA. I have already succeeded in flipping my RGBA images vertically before sending them to OpenCV, and figured it would be nice to also avoid an OpenCV call that converts the image on the CPU.
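HLSL indeed has no 8-bit buffer element type; a common workaround on the GPU side is an RWStructuredBuffer of uint, where each uint packs four bytes of the output stream using bit shifts. If a CPU-side conversion is acceptable, though, dropping alpha is a simple byte shuffle. A sketch of the CPU-side version (class and method names are mine):

```csharp
using System;

class RgbaToRgb
{
    // Strip the alpha channel from interleaved RGBA bytes, producing the
    // tightly packed RGB byte stream that RGB-only OpenCV paths expect.
    public static byte[] DropAlpha(byte[] rgba)
    {
        int pixels = rgba.Length / 4;
        var rgb = new byte[pixels * 3];
        for (int i = 0; i < pixels; i++)
        {
            rgb[i * 3 + 0] = rgba[i * 4 + 0]; // R
            rgb[i * 3 + 1] = rgba[i * 4 + 1]; // G
            rgb[i * 3 + 2] = rgba[i * 4 + 2]; // B
        }                                     // alpha byte is skipped
        return rgb;
    }

    static void Main()
    {
        var rgba = new byte[] { 10, 20, 30, 255, 40, 50, 60, 255 };
        Console.WriteLine(string.Join(",", DropAlpha(rgba))); // 10,20,30,40,50,60
    }
}
```

The same per-pixel indexing would carry over to a uint-packing compute kernel, with the added complication that three output bytes per pixel do not align to 4-byte uints, so adjacent threads share output words.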
↧
Compute Shader compile error from Texture3D.Sample()
I'm trying to sample a 3D noise texture that I generate elsewhere and pass to my compute shader. However, when compiling the following, all I get is "Shader error in 'DensityGenerator.compute': cannot map expression to cs_5_0 instruction set at DensityGenerator.compute(14) (on d3d11)".
I really haven't got a clue where to start with this. It [looks right][1], and the only way to get it to compile is to comment out line 14 (the noiseVol.Sample() call, specifically).
What obvious bug have I missed?
#pragma kernel Density
Texture3D noiseVol;
SamplerState samplerNoiseVol;
RWStructuredBuffer<float> voxels;
[numthreads(32,32,1)]
void Density (uint3 threadId : SV_DispatchThreadID, uint3 groupId : SV_GroupID)
{
int size = 32;
int3 voxPos = threadId; // Just rename this for the sake of clarity
float density = 0;//-voxPos.y;
density += noiseVol.Sample(samplerNoiseVol, voxPos, 0).z;
voxels[voxPos.x + voxPos.y*size + voxPos.z*size*size] = density;
}
[1]: https://msdn.microsoft.com/en-us/library/windows/desktop/bb509695(v=vs.85).aspx
↧
Compute shader FindKernel() fails with unity exception
So the shader code is here: http://pastebin.com/Tbgu7UvH
And the script is here: http://pastebin.com/r19BV1Rh
If I try to find the only kernel that I have, I get this:
Kernel 'CSMain' not found
UnityEngine.ComputeShader:FindKernel(String)
Controller:Start() (at Assets/Controller.cs:10)
and
UnityException: FindKernel failed
Controller.Start () (at Assets/Controller.cs:10)
If I try to Dispatch directly by index such as this:
shader.Dispatch(0, 1, 1, 1);
Then I receive this error:
Kernel index (0) out of range
UnityEngine.ComputeShader:Dispatch(Int32, Int32, Int32, Int32)
↧
How (Or Why) to use Consume Buffers safely on arbitrary data lengths
Hey guys, I wanted to use compute shaders to write object positions into a texture, then manipulate the texture and sometimes read a value back from it.
To avoid too much stalling, I wanted to write positions into the texture every 100 ms (for example).
So every Update I write the position into a list, then send that list as a consume buffer on that 100 ms tick.
But I don't know how many samples I will be sending each time, which is why I thought consume buffers were the solution; now, however, I've learned they are rather "dumb".
If I consume more elements than the buffer contains, I start getting into trouble.
So how do I do this? Do I have to use a 1x1x1 thread group size and dispatch only enough groups for the list length? Or is there a smarter way?
Or maybe I should use a different approach altogether?
↧
Calling Graphics.DrawProcedural multiple times for chunked procedural terrain
In my project, I'm creating chunks of 3D procedural terrain (voxels) using a series of compute shaders and then passing the vertex data of each chunk from a ComputeBuffer to the Graphics.DrawProcedural method to be rendered in a surface shader.
void OnPostRender()
{
foreach(GPUMeshData gMeshData in gpuMeshData) {
m_drawBuffer.SetBuffer("_Buffer", gMeshData._Vert);
m_drawBuffer.SetBuffer("_ColorBuffer", gMeshData._Color);
m_drawBuffer.SetPass(0);
Graphics.DrawProcedural(MeshTopology.Triangles, SIZE);
}
}
...
...
public struct GPUMeshData {
public int meshNum;
public ComputeBuffer _Vert;
public ComputeBuffer _Color;
public GPUMeshData(int meshNumber, ComputeBuffer vert, ComputeBuffer color) {
meshNum = meshNumber;
_Vert = vert;
_Color = color;
}
}
It works OK, but the buffer data seems to get jumbled up intermittently. Newer vertex data is somehow getting merged with older vertex data from previous frames that should no longer be present in my GPUMeshData list. As a result, old meshes at different LODs overlap my newly rendered chunks. It starts to get ugly quickly.
![alt text][1]
Through debugging, I know for certain that I'm not making calls to re-render the "bad" chunks/buffers after I remove them, yet somehow the old data gets mixed into one of my new buffers. When I remove objects from the GPUMeshData list, I also call Release() on the ComputeBuffers:
public void removeChunkGPU(int meshNum) {
if(gpuMeshData.Exists(x => x.meshNum == meshNum)) {
GPUMeshData gMeshData = gpuMeshData.Find(x => x.meshNum == meshNum);
if(gMeshData._Vert != null) gMeshData._Vert.Release();
if(gMeshData._Color != null) gMeshData._Color.Release();
gpuMeshData.RemoveAll(x => x.meshNum == meshNum);
}
}
I'm just trying to find out: am I committing a big "no-no" here by making multiple DrawProcedural calls for different buffers per frame? I can't understand how the older data is getting "stuck" in the graphics pipeline. I also found a very similar question asked here:
https://forum.unity3d.com/threads/compute-shaders-and-drawprocedural-drawproceduralindirect.413196/
In my case, though, I only need to render a maximum of ~350 chunks in the worst case. But as that poster mentioned, merging all chunks into a single buffer seems counter-intuitive to me.
Any thoughts are appreciated!
**EDIT:** So I discovered something that seems to fix the issue, but I'm not sure exactly why. Essentially, if I pre-initialize all the values in my mesh ComputeBuffers with a **SetData()** call before generating data into them, the problem no longer occurs.
public void generateChunkGPU(OctreeNode node) {
...
...
gpuMeshData.Add(new GPUMeshData(meshNum,
new ComputeBuffer(SIZE, sizeof(float)*7),
new ComputeBuffer(SIZE, sizeof(float)*4)));
GPUMeshData gMeshData = gpuMeshData[gpuMeshData.Count-1];
// Initialize all verts to -1
float[] val = new float[SIZE*7];
for(int k = 0; k < SIZE*7; k++)
val[k] = -1.0f;
gMeshData._Vert.SetData(val);
...
perlinNoise.SetFloat("_ChunkWidth", chunkWidth);
...
perlinNoise.SetBuffer(0, "_Result", noiseBuffer);
perlinNoise.Dispatch(0, 4, 4, 4);
marchingCubes.SetFloat("_ChunkWidth", chunkWidth);
...
marchingCubes.SetBuffer(0, "_Voxels", noiseBuffer);
marchingCubes.SetBuffer(0, "_Buffer", gMeshData._Vert);
marchingCubes.Dispatch(0, 4, 4, 4);
}
![alt text][2]
Obviously, though, **SetData()** is very expensive and stalls the CPU, so I'd like to avoid using it. But this seems to suggest that whenever I create a new ComputeBuffer, there is some left-over data sitting in the memory that is allocated for the new buffer.
I also tried writing another ComputeShader to just "clear" the buffer in my remove function:
public void removeChunkGPU(int meshNum) {
if(gpuMeshData.Exists(x => x.meshNum == meshNum)) {
GPUMeshData gMeshData = gpuMeshData.Find(x => x.meshNum == meshNum);
// Clear buffer before releasing
initializeMeshBuffer.SetInt("SIZE", SIZE);
initializeMeshBuffer.SetBuffer(0, "_Buffer", gMeshData._Vert);
initializeMeshBuffer.Dispatch(0, 0, 0, 1);
if(gMeshData._Vert != null) gMeshData._Vert.Release();
if(gMeshData._Color != null) gMeshData._Color.Release();
gpuMeshData.RemoveAll(x => x.meshNum == meshNum);
}
}
But that didn't seem to help at all. Here's the shader code for it anyway:
#pragma kernel CSMain
struct Vert
{
float4 position;
float3 normal;
};
int SIZE;
RWStructuredBuffer<Vert> _Buffer;
[numthreads(1,1,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
Vert vert;
vert.position = float4(-1.0, -1.0, -1.0, -1.0);
vert.normal = float3(-1.0, -1.0, -1.0);
for(int i = 0; i < SIZE; i++)
_Buffer[i] = vert;
}
Does anybody have any thoughts on how I can avoid using **SetData()** and still get a truly clear ComputeBuffer each time I create a new one?
[1]: /storage/temp/93386-overlapping-meshes-issue.jpg
[2]: /storage/temp/93958-overlapping-meshes-no-issue.jpg
↧