App Hub
Page 1 of 1 (15 posts)

DynamicBuffers vs DrawUserIndexedPrimitives

Last post 11/6/2010 4:29 PM by Nuclex Games. 14 replies.
  • 11/5/2010 8:01 PM

    DynamicBuffers vs DrawUserIndexedPrimitives

    Mr Hargreaves and SpriteBatch seem to agree that DynamicVertexBuffer is the right way to go, especially now that the Xbox supports .Discard and buffer renaming.

    However, as I read around the forums and interwebz, I keep finding comments saying that if you are building and throwing away a whole vertex buffer, especially on the Xbox, then you should be using DrawUserIndexedPrimitives. That certainly doesn't seem like the right thing to do on Windows, though.

    Is this old advice from before Dynamic buffers were fully supported on the 360?
  • 11/5/2010 8:09 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    To answer my own question a little from jwatte:

    Based on that, if the data is created and used only once, there is little to choose between a dynamic buffer and DrawUser... 

    In my case I'm doing particles, so the buffer is large and my index buffer doesn't change between calls. That gives DynamicVertexBuffer an advantage: it won't have to copy the index buffer multiple times, and it takes pressure off the command buffer.

    Sound right?
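
    To put rough numbers on the index-copy argument, here is a small Python sketch. The 24-byte vertex (position, color, texture coordinate), the 16-bit indices, and the quad count are assumptions for illustration, not figures from the thread, and the function names are hypothetical.

```python
VERTEX_BYTES = 24   # assumed: 12 position + 4 color + 8 texture coordinate
INDEX_BYTES = 2     # assumed: 16-bit indices

def quad_indices(quad_count):
    """Index list for quad_count quads, two triangles each (never changes)."""
    indices = []
    for q in range(quad_count):
        base = q * 4
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return indices

def bytes_per_frame(quad_count, resend_indices):
    """Bytes submitted each frame, with or without re-sending the indices."""
    vertex_bytes = quad_count * 4 * VERTEX_BYTES
    index_bytes = len(quad_indices(quad_count)) * INDEX_BYTES
    return vertex_bytes + (index_bytes if resend_indices else 0)

quads = 1000  # assumed particle count
user_primitives = bytes_per_frame(quads, resend_indices=True)
dynamic_buffer = bytes_per_frame(quads, resend_indices=False)
print(user_primitives, dynamic_buffer)
```

    Under these assumptions the index data is a modest fraction of the total, but it is a fraction that never needs to be sent again once it lives in its own index buffer.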
  • 11/5/2010 8:43 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    The ZMan:
    Is this old advice from before Dynamic buffers were fully supported on the 360?


    I suspect so.


    The ZMan:

    In my case I'm doing particles, so the buffer is large and my index buffer doesn't change between calls. That gives DynamicVertexBuffer an advantage: it won't have to copy the index buffer multiple times, and it takes pressure off the command buffer.

    Sound right?


    Right.

    Does all your vertex data change every frame? It can often be useful to separate vertices into multiple streams, with changing values in one buffer and static portions in another.
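
    A minimal sketch of the stream split, in Python rather than XNA code. The byte layouts are assumed (the thread does not give the actual vertex declaration), and `pack_stream` and `build_dynamic_stream` are hypothetical helpers.

```python
import struct

# Assumed layouts for illustration only.
DYNAMIC_FORMAT = "<3f4B"   # position (3 floats) + color (4 bytes), rewritten every frame
STATIC_FORMAT = "<2f"      # texture coordinate (2 floats), written once

def pack_stream(fmt, rows):
    """Pack a list of per-vertex tuples into one contiguous byte string."""
    return b"".join(struct.pack(fmt, *row) for row in rows)

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
static_stream = pack_stream(STATIC_FORMAT, corners)   # uploaded once, then reused

def build_dynamic_stream(center, half_size, color):
    """Rebuild only the per-frame data (position and color) for one quad."""
    cx, cy, cz = center
    rows = [
        (cx + (u * 2 - 1) * half_size, cy + (v * 2 - 1) * half_size, cz, *color)
        for u, v in corners
    ]
    return pack_stream(DYNAMIC_FORMAT, rows)

dynamic_stream = build_dynamic_stream((0.0, 0.0, 0.0), 0.5, (255, 255, 255, 255))
interleaved_size = struct.calcsize("<3f4B2f") * 4   # one quad, single stream
print(len(dynamic_stream), len(static_stream), interleaved_size)
```

    With these assumed formats, the per-frame upload for a quad shrinks from the full interleaved size to just the dynamic stream. Whether that wins in practice depends on the fetch cost of the second stream.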
  • 11/5/2010 9:09 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    It's a CPU-updated particle, so position and color change every frame - texture coordinates do not. In fact they are, like the index buffer, effectively constant. Would it be worth splitting them just for that?

    Right now it uses BasicEffect because we want a shader-free solution for phone - but eventually there will be a custom shader version that does some of the stuff on the GPU. Like your GPU particle sample, that will probably drop the texture coordinates in favor of a corner indicator - for that reason it might not be worth the hassle.
  • 11/5/2010 10:12 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    You've probably seen it already, but just in case: http://www.nuclex.org/blog/gamedev/30-efficiently-rendering-dynamic-vertices

    Some time ago, I ran benchmarks with DrawUserPrimitives() versus DynamicVertexBuffer both on my Xbox and Windows (with a GeForce 8800). With good batching, the Xbox could actually achieve up to 1/4th of the performance of the GeForce card. That was with XNA 3.1, so I didn't do any DynamicVertexBuffer tests on the Xbox.

    Maybe I should go hunting for the source code of that benchmark and redo it with XNA 4.0, this time including DynamicVertexBuffers on the Xbox :)

    It's sad that Windows Phone doesn't support custom vertex shaders, otherwise the particle billboards could be oriented towards the camera purely on the GPU.
  • 11/5/2010 10:38 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    The ZMan:
    texture coordinates do not. In fact they are, like the index buffer, effectively constant. Would it be worth splitting them just for that?


    Probably. Although if you compress the texture coordinate data (Byte4 should be fine for this, right?) that might make them small enough that the slight extra cache complexity of fetching from two streams outweighs the bandwidth saving of not having to copy so much data.

    If you really care about perf, my guess is two streams will be fastest, but you'd have to try both ways to find out for sure.
  • 11/5/2010 11:27 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    Compressing the data means a custom shader to decompress it though, which won't work on Phone.

    I'll give it a shot and report back. Phone supports multiple vertex streams, right?
  • 11/6/2010 12:03 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    Cygon:
    You've probably seen it already, but just in case: http://www.nuclex.org/blog/gamedev/30-efficiently-rendering-dynamic-vertices

    Some time ago, I ran benchmarks with DrawUserPrimitives() versus DynamicVertexBuffer both on my Xbox and Windows (with a GeForce 8800). With good batching, the Xbox could actually achieve up to 1/4th of the performance of the GeForce card. That was with XNA 3.1, so I didn't do any DynamicVertexBuffer tests on the Xbox.

    Maybe I should go hunting for the source code of that benchmark and redo it with XNA 4.0, this time including DynamicVertexBuffers on the Xbox :)


    I did see it but had forgotten - thanks. If nothing else, it's convinced me I still have a problem somewhere. I improved the DynamicBuffer version by about 10% with some unsafe code and ref functions, but I'm still only getting 6000 quads (in 4 batches) at 70 fps on Windows (GTX 260) and 2500 quads at 30 fps on the 360.

    Was your benchmark rebuilding the whole buffer every single frame? I have particle update and billboarding math in there but even commenting both of those out I don't see the numbers you do. So I'd be interested in seeing the benchmark code.
  • 11/6/2010 12:16 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    As long as you're just compressing by changing vertex format you don't need a custom shader.  Just store the data in Color or NormalizedShort2 format as opposed to Vector2, instant 2x space and bandwidth saving...
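
    To illustrate the size difference, here is a hedged Python sketch of packing a texture coordinate as two normalized 16-bit values instead of two floats. The scale factor and rounding are assumptions about how a NormalizedShort2-style format behaves, not a copy of XNA's implementation, and the function names are hypothetical.

```python
import struct

def pack_vector2(u, v):
    """Two 32-bit floats: 8 bytes per texture coordinate."""
    return struct.pack("<2f", u, v)

def pack_normalized_short2(u, v):
    """Two signed 16-bit values covering [-1, 1]: 4 bytes per texture coordinate."""
    def to_short(x):
        x = max(-1.0, min(1.0, x))       # clamp to the representable range
        return int(round(x * 32767.0))   # assumed scale and rounding
    return struct.pack("<2h", to_short(u), to_short(v))

def unpack_normalized_short2(data):
    """Roughly what the vertex fetch stage would hand to the shader."""
    a, b = struct.unpack("<2h", data)
    return a / 32767.0, b / 32767.0

packed = pack_normalized_short2(0.0, 1.0)
print(len(pack_vector2(0.0, 1.0)), len(packed), unpack_normalized_short2(packed))
```

    Quad corners at 0 and 1 survive the round trip exactly; other values lose a little precision, which is usually acceptable for texture coordinates.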
  • 11/6/2010 1:03 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    This is one confusing thread! I'm not a DirectX expert, and I might end up with a nasty surprise once I get my phone, but here is what I did:

    1) I created two sets of vertices and one index list, as I don't change that.
    2) In the update, I update the vertices for the next frame.
    3) I start sending the next frame down to the GPU with SetData().
    4) In the draw, I render the current frame using DrawIndexedPrimitives().
    5) I switch which frame is current and which is next. (I'm double buffering the models.)

    This has been allowing me to update all the vertices between frames if I choose. I realize I have a problem if I go over 65000 vertices, but I'll probably have other problems long before that on the phone. This was simple to implement, and as soon as I did, my frame rate on the emulator went way up. Hopefully it will work on the phone too.

    I was thinking the advantage of this approach is that I could have multiple sets of vertices that I want to work with. After this thread I'm just confused.
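
    The five steps above can be sketched roughly as follows. This is plain Python with lists standing in for vertex buffers; `DoubleBufferedVertices`, `update` and `draw` are hypothetical names, and the real SetData() and draw calls are only indicated in comments.

```python
class DoubleBufferedVertices:
    """Two vertex sets: one being drawn, one being filled for the next frame."""

    def __init__(self, vertex_count):
        self._sets = [[(0.0, 0.0, 0.0)] * vertex_count for _ in range(2)]
        self._current = 0

    @property
    def current(self):
        return self._sets[self._current]

    @property
    def next(self):
        return self._sets[1 - self._current]

    def swap(self):
        """Step 5: the set that was just filled becomes the one that is drawn."""
        self._current = 1 - self._current

buffers = DoubleBufferedVertices(4)

def update(frame):
    """Steps 2 and 3: write next frame's vertices (SetData() would go here)."""
    target = buffers.next
    for i in range(len(target)):
        target[i] = (float(i), float(frame), 0.0)

def draw():
    """Step 4: stand-in for drawing the current set."""
    return list(buffers.current)

update(1)
before_swap = draw()   # still the old data
buffers.swap()
after_swap = draw()    # now the data written by update(1)
```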
  • 11/6/2010 2:28 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    It seems that I've just implemented DrawUserIndexedPrimitives() ... except that I can pre-buffer mine. This stuff isn't easy!
  • 11/6/2010 3:22 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    If you use DrawUserIndexedPrimitives, there should be no need to double buffer. When you call Draw, a copy of the data is taken, so you are safe to update.
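
    A toy Python model of that copy-on-draw behavior, as a sketch only. `CopyingDevice` is a hypothetical stand-in, not the XNA GraphicsDevice; it just shows why the caller's array can be reused immediately after the call returns.

```python
class CopyingDevice:
    """Hypothetical device that snapshots the caller's array when draw is called."""

    def __init__(self):
        self.queued = []

    def draw_user_primitives(self, vertices):
        # The copy is taken here, at call time, so later edits cannot affect it.
        self.queued.append(tuple(vertices))

device = CopyingDevice()
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

device.draw_user_primitives(vertices)
vertices[0] = (5.0, 5.0)               # safe: the first draw kept the old data
device.draw_user_primitives(vertices)

print(device.queued[0][0], device.queued[1][0])
```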
  • 11/6/2010 3:28 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    Shawn Hargreaves:
    As long as you're just compressing by changing vertex format you don't need a custom shader.  Just store the data in Color or NormalizedShort2 format as opposed to Vector2, instant 2x space and bandwidth saving...


    Fascinating... so as long as I tag it as TEXCOORD0, the built-in shaders are smart enough to work out how to unpack the data?
  • 11/6/2010 4:17 AM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    The ZMan:
    Fascinating... so as long as I tag it as TEXCOORD0, the built-in shaders are smart enough to work out how to unpack the data?


    This happens in the GPU memory fetch hardware before your vertex shader code begins execution.
  • 11/6/2010 4:29 PM In reply to

    Re: DynamicBuffers vs DrawUserIndexedPrimitives

    The ZMan:
    Cygon:
    You've probably seen it already, but just in case: http://www.nuclex.org/blog/gamedev/30-efficiently-rendering-dynamic-vertices
    [...]

    I did see it but had forgotten - thanks. If nothing else, it's convinced me I still have a problem somewhere. I improved the DynamicBuffer version by about 10% with some unsafe code and ref functions, but I'm still only getting 6000 quads (in 4 batches) at 70 fps on Windows (GTX 260) and 2500 quads at 30 fps on the 360.

    Was your benchmark rebuilding the whole buffer every single frame? I have particle update and billboarding math in there but even commenting both of those out I don't see the numbers you do. So I'd be interested in seeing the benchmark code.


    Yes, I'm rebuilding the whole buffer each frame. It goes like this:
    for each quad {
      construct quad's vertices in local array;
      copy array into batching buffer;
      if batching buffer >= 8192 vertices {
        draw using DrawUserIndexedPrimitives() or DynamicVertexBuffer+DynamicIndexBuffer
      }
    }
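
    A runnable Python sketch of the same batching loop, for illustration. The `submit` callback stands in for whichever draw path is used. Clearing the batch after each draw and flushing the remainder at the end are assumptions, since the pseudocode above does not show them, and `make_quad` is a hypothetical helper.

```python
BATCH_LIMIT = 8192   # vertices per batch, as in the pseudocode above

def draw_quads(quads, submit, batch_limit=BATCH_LIMIT):
    """Accumulate quad vertices and hand them to submit() in large batches."""
    batch = []
    for quad in quads:
        batch.extend(quad)              # copy the quad into the batching buffer
        if len(batch) >= batch_limit:
            submit(batch)               # one draw call for the whole batch
            batch = []                  # assumed: start a fresh batch
    if batch:
        submit(batch)                   # assumed: flush whatever is left over

def make_quad(x, y, size=1.0):
    """Four corner vertices of an axis-aligned quad."""
    return [(x, y), (x + size, y), (x + size, y + size), (x, y + size)]

submitted = []
draw_quads((make_quad(float(i), 0.0) for i in range(5000)),
           lambda batch: submitted.append(len(batch)))
print(submitted)
```

    With 5000 quads (20000 vertices), this produces two full batches and one partial batch, so the draw call count stays small regardless of how many quads are built per frame.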

    I haven't kept my original benchmark code (which didn't use indexed vertices), so I recreated the benchmark using my current library routines. Performance is a bit lower, probably because I now fill both an index buffer and a vertex buffer.

    Here are my current results: http://www.nuclex.org/blog/gamedev/112-dynamicvertexbuffer-versus-drawuserprimitives-round-2

    The article also includes the source code to my benchmark application this time. All the interesting stuff is in Nuclex.Graphics\Source\Batching. If you want to switch between DrawUserIndexedPrimitives() and DynamicVertexBuffers, change the PrimitiveBatch.GetDefaultBatcher() method so it always returns either a DynamicBufferBatchDrawer or a UserPrimitiveBatchDrawer.