Sunday, April 8, 2012

Independent Study: Week 5

Almost half-way! I'm making steady progress daily. Today I'm tackling photon mapping (hopefully updating this by the end of the day with an image!) but I thought I should record a major decision that I made today.

Defining The Photon Map

To define the photon map in the C++ header, I need statically defined sizes for all of its dimensions. I've decided to forgo acceleration structures for the time being in favor of a simpler multi-dimensional array organized by object type and various indices. The problem is that I have to fix the limits of these dimensions at compile time. Skipping a lot of unimportant detail, the end result is that having two separate structures for Spheres and Planes is kind of annoying in this process, not to mention unnecessary. Look at these two structure definitions:

Sphere:

typedef struct _SpherePrimitive {
    cl_float4 center;
    cl_float radius;
    cl_uint materialIndex;
} SpherePrimitive;

Plane:

typedef struct _PlanePrimitive {
    cl_float4 normal;
    cl_float originOffset;
    cl_uint materialIndex;
} PlanePrimitive;

I don't see a single reason these two technically have to be separate, which led me to this hybrid:

Object:

typedef struct _ObjectPrimitive {
    cl_float4 positionOrNormal;
    cl_float radiusOrOffset;
    cl_uint materialIndex;
} ObjectPrimitive;

I haven't integrated this yet, but when I do I'll update with the results. I predict that nothing much will change performance-wise but defining my maps and various future structures will be that much easier.
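As a rough check of the idea, here is a minimal sketch showing that the three structs share one memory layout, so the hybrid can stand in for either. `Float4` is a stand-in for `cl_float4` (assumed to be 16 bytes with 16-byte alignment) so the sketch compiles without the OpenCL headers, and `fromSphere` / `fromPlane` are hypothetical helpers, not part of the renderer. How the kernel tells a sphere from a plane after the merge is left open here, since the post doesn't say.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for cl_float4 (assumption: 16 bytes, 16-byte aligned).
struct alignas(16) Float4 { float s[4]; };

struct SpherePrimitive { Float4 center;           float radius;         std::uint32_t materialIndex; };
struct PlanePrimitive  { Float4 normal;           float originOffset;   std::uint32_t materialIndex; };
struct ObjectPrimitive { Float4 positionOrNormal; float radiusOrOffset; std::uint32_t materialIndex; };

// Same size and same field offsets: the hybrid wastes no space over either original.
static_assert(sizeof(SpherePrimitive) == sizeof(ObjectPrimitive), "sphere layout differs");
static_assert(sizeof(PlanePrimitive) == sizeof(ObjectPrimitive), "plane layout differs");
static_assert(offsetof(SpherePrimitive, radius) == offsetof(ObjectPrimitive, radiusOrOffset),
              "scalar field offset differs");
static_assert(offsetof(PlanePrimitive, materialIndex) == offsetof(ObjectPrimitive, materialIndex),
              "material field offset differs");

// Hypothetical conversion helpers for migrating existing scene data.
ObjectPrimitive fromSphere(const SpherePrimitive& s) {
    assert(s.radius >= 0.0f); // a negative radius would be a scene-setup bug
    return ObjectPrimitive{ s.center, s.radius, s.materialIndex };
}

ObjectPrimitive fromPlane(const PlanePrimitive& p) {
    return ObjectPrimitive{ p.normal, p.originOffset, p.materialIndex };
}
```

Because the layouts match, any array bound that used to be declared once per primitive type only needs to be declared once for `ObjectPrimitive`.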

Update: 4/8/2012
I switched everything over to generic objects and saw a ~5% decrease in performance overall. I don't understand why exactly, but for the time being it's worth it for how much cleaner everything is.

I've also got most of the photon mapping done. Unfortunately, the part I don't have completed is the storage format itself. I'm trying to find a middle ground between ease of implementation and effectiveness for my initial implementation so that I'm not coding for hours on end just to find out that I've made a fundamental mistake and have to start over. I tried a 2-dimensional array indexed by object, but without a lot of work it's just not going to pan out (initial tests not only didn't work, but took about 30 seconds per frame!!!). I think I might implement a 1-dimensional array instead and use a radix sort to keep everything in order by object index. Something like this:

struct Photon photons[SCENE_MAX_PHOTONS]; //holds all photons
uint* objectPhotonIndices; //holds the start index into "photons" for the given object

This approach will also make progressively adding photons to the map much easier than using a 2D array would have, as long as the total count remains below SCENE_MAX_PHOTONS.
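To make the flat layout concrete, here is a minimal host-side sketch of grouping photons by object and building the start-index table. It uses a stable counting sort, which is effectively a single radix pass and is enough when the key is a small object index. The `Photon` fields and the `sortByObject` name are assumptions for illustration; the real struct isn't shown in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical photon record; only objectIndex matters for the sort.
struct Photon {
    float position[3];
    float power[3];
    std::uint32_t objectIndex; // object the photon landed on
};

// Reorders `photons` so they are grouped by objectIndex (stable), and returns
// a table of objectCount + 1 offsets. Photons for object i occupy the
// half-open range [start[i], start[i + 1]).
std::vector<std::uint32_t> sortByObject(std::vector<Photon>& photons,
                                        std::uint32_t objectCount) {
    std::vector<std::uint32_t> start(objectCount + 1, 0);

    // Pass 1: histogram, shifted by one slot so the prefix sum yields offsets.
    for (const Photon& p : photons) {
        assert(p.objectIndex < objectCount);
        ++start[p.objectIndex + 1];
    }

    // Prefix sum: start[i] becomes the first slot belonging to object i.
    for (std::uint32_t i = 0; i < objectCount; ++i) {
        start[i + 1] += start[i];
    }

    // Pass 2: scatter each photon into the next free slot of its bucket.
    std::vector<Photon> sorted(photons.size());
    std::vector<std::uint32_t> cursor(start.begin(), start.end() - 1);
    for (const Photon& p : photons) {
        sorted[cursor[p.objectIndex]++] = p;
    }

    photons.swap(sorted);
    return start;
}
```

With the extra trailing entry in the table, a lookup for object `i` never needs a special case for the last object: the count is always `start[i + 1] - start[i]`.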

Update: 4/9/2012
Well, decent progress this afternoon. I made the switch I mentioned earlier and it's looking promising!



The cool thing about this is that the two spheres in the middle are white, so any color you see on them is indirect light bounced off the colored surfaces around them.

I'm casting 10,000 photons into the scene (at the moment) uniformly from the point light located between the spheres and the blue wall. I don't have a sorting algorithm in place yet, so I'm arbitrarily using only the 500th through the 550th photons from the list, which is why this looks relatively bad :).

Still, with only 50 photons from the list you can see some indirect light on the (100% red) ceiling, and same with the aqua floor. The spheres show the most though by far. I plan to implement a sorting algorithm for the photons next so that I can see what all 10K look like in real-time!

I'm a little worried about the performance. Two days ago I was getting 200 FPS at 1280x720 resolution. After implementing the photon mapping for 50 photons, that has dropped to ~25 FPS at 640x480. I hope to come back and find something to optimize later on!

Note: As far as optimization goes, I don't know how successful I'll be. One of the most unfortunate things about my video card is that, being a laptop card, it lacks much dedicated memory. Unlike the smaller structures, my photon map was far too large to pass to my OpenCL kernels in __constant memory, and so I had to use the larger (and MUCH slower) __global address space qualifier instead. I don't know what I can do about this...
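For a sense of scale, here is a back-of-the-envelope sketch of why a 10,000-photon map overflows constant memory. Two things are assumptions: the `Photon` layout (three float4-sized fields, 48 bytes) is hypothetical, and the 64 KB limit is only the minimum the OpenCL specification requires of full-profile devices for `CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE`. The real limit for a given card should be queried with `clGetDeviceInfo`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for cl_float4 (assumption: 16 bytes, 16-byte aligned).
struct alignas(16) Float4 { float s[4]; };

// Hypothetical photon layout: position, incoming direction, power.
struct Photon { Float4 position; Float4 direction; Float4 power; };

// Assumed limit: the spec minimum for full-profile devices. A device may report more.
constexpr std::size_t kAssumedConstantLimit = 64 * 1024;

// Total bytes needed to store `count` photons.
std::size_t photonMapBytes(std::size_t count) {
    assert(count <= SIZE_MAX / sizeof(Photon)); // guard against overflow
    return count * sizeof(Photon);
}

// True if a buffer of `bytes` fits within a constant-memory limit of `limit` bytes.
bool fitsInConstant(std::size_t bytes, std::size_t limit) {
    return bytes <= limit;
}

// Largest photon count that still fits within `limit` bytes.
std::size_t maxPhotonsInConstant(std::size_t limit) {
    return limit / sizeof(Photon);
}
```

Under these assumptions the full map needs 480,000 bytes against a 65,536-byte budget, and only about 1,365 photons would fit, so falling back to __global for the map itself is hard to avoid.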

I'll be updating soon when I get a sorting algorithm implemented. I've got my eye on a particular radix sort implementation that made its debut in the NVIDIA SDK a few months back.

Update: 4/9/2012

Debug Screenshot!



Update: 4/10/2012

How it looks when it's functioning correctly (so slow!!!)


Update: 4/12/2012


I modified the scene to be similar to the traditional Cornell Box. While playing with some debug settings and only rendering the indirect lighting I discovered how artistic these images can look. Here's one that looks almost hand-painted:




Update: 4/14/2012

I implemented really basic sorting and now I've doubled my speed!




Update: 4/14/2012

Added a random function for the photon emission, changed storage to allow for all bounces of photons to be stored, and finally was able to replace all the hard-coded hacks with REAL, physically-based lighting. This is all true inverse-square law attenuation both from light to scene and scene to camera:



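For reference, the two ingredients mentioned in this update can be sketched on the host side as follows. This is not the kernel code from the renderer: `Vec3`, `randomUnitDirection`, `photonPower`, and `inverseSquare` are hypothetical names, the sampling method (uniform height plus uniform azimuth) is one common choice, and splitting the light's power evenly across photons is a standard convention assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

// Uniformly distributed direction on the unit sphere: pick z uniformly in
// [-1, 1] and the azimuth uniformly in [0, 2*pi). This avoids the clustering
// at the poles that sampling two angles uniformly would produce.
Vec3 randomUnitDirection(std::mt19937& rng) {
    std::uniform_real_distribution<float> uniform01(0.0f, 1.0f);
    const float z = 1.0f - 2.0f * uniform01(rng);
    const float phi = 6.2831853f * uniform01(rng);
    const float r = std::sqrt(std::max(0.0f, 1.0f - z * z));
    return Vec3{ r * std::cos(phi), r * std::sin(phi), z };
}

// Each photon carries an equal share of the light's total power (assumption).
float photonPower(float lightPower, unsigned photonCount) {
    assert(photonCount > 0);
    return lightPower / static_cast<float>(photonCount);
}

// Inverse-square falloff: doubling the distance quarters the received light.
float inverseSquare(float intensity, float distance) {
    assert(distance > 0.0f);
    return intensity / (distance * distance);
}
```

The same `inverseSquare` falloff applies on both legs described above: once from the light to the surface and once from the surface to the camera.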