Winning my own little battle against Cuda

I’ve won a nice little battle in my personal struggle with Cuda recently by hacking in dynamic texture memory updates. Unfortunately for me and my little Cuda graphics prototypes, Cuda was designed for general (non-graphics) computation. Texture memory access was put in, but as something of an afterthought. Textures in Cuda are read…

A little idea about compressing Virtual Textures

I’ve spent a good deal of time working on virtual textures, but took the approach of procedural generation, using the quadtree management system to get a large (10-30x) speedup through frame coherence vs having to generate the entire surface every frame, which would be very expensive. However, I’ve also always been interested in compressing and…

Rasterization vs Tracing and the theoretical worst case scene

Rasterizer engines don’t have to worry about the thread-pixel scheduling problem as it’s handled behind the scenes by the fixed function rasterizer hardware. With rasterization, GPU threads are mapped to the object data first (vertex vectors), and then scanned into pixel vector work queues, whose many-to-one mapping to output pixels is synchronized by…

Understanding the Efficiency of Ray Traversal on GPUs

I just found this nice little paper by Timo Aila linked on Atom, Timothy Farrar’s blog. Timothy incidentally found and plugged my blog recently (how nice). The authors have a great analysis of several variations of traversal methods using a standard BVH/triangle ray intersector code, along with simulator results for some potential new instructions that could…

Voxel Cone Tracing

I’m willing to bet at this point that the ideal rendering architecture for the next hardware generation is going to be some variation of voxel cone tracing. Oh, there are many very valid competing architectures, and with the performance we are looking at for the next hardware cycle all kinds of techniques will look great,…

Deferred Rendering w/ MSAA – MSAA Z Prepass Idea

Deferred rendering presents something of a challenge to combine with MSAA on current console hardware because of the memory/bandwidth overhead of storing multiple render targets with multiple samples. The extra memory hurts both PS3 and 360 equally, but the bandwidth effect differs on the two platforms. On PS3, there is a straightforward additional bandwidth cost…