Chris's Awesome CS184 AS6

This is my implementation of AS6 for CS184: a raytracer with reflections and acceleration structures. It also features improved soft shadows over my AS5 version. Previously only point lights created soft shadows (due to time constraints in both rendering and programming); now all light sources except ambient light have soft shadows.

The above image is scene2.scd, rendered at 4 RPP and 25 shadow rays per spot in shadow. It is a slight but definite improvement over the reference rendering of the same image. Most noticeably, the soft shadows significantly reduce aliasing.

The above image is thousandspheres.scd, rendered with the same settings as the previous image. It is not much of an improvement because the image contains very few shadows.
It has been tested on:
- Windows 7 (64-bit--this is the only one guaranteed to work!)
- Mac OS X (Intel, 32-bit)
Timing information
In previous tests, I determined that multithreading my application (completed for AS5) results in about a 2-3 times speedup of just the rendering when 4 threads are used. Part of the reason the speedup is not closer to 4 times is that some parts of the image are much less complicated than others; for example, for thousand spheres, about half the image is empty space, so half the image completes rendering very quickly and the other half takes a very long time to render. This can be ameliorated with careful use of OpenMP but I didn't do that. The following times are all measured with multithreading enabled.
- 4 RPP, no fancy shadows, no fast intersect when using HAABB tree
  - Thousand spheres: 71s --> 3.8s at depth 4, 33s --> 1.7s at depth 16
  - Helix: 14s --> 10s at depth 4, 27s --> 40s at depth 16 (gets worse)
- 4 RPP, fancy shadows, fast intersect
  - Thousand spheres: 647s --> 27s
  - Helix: 143s --> 190s (gets worse)
As you can see, the HAABB tree made the helix image render more slowly rather than faster, both at depth 16 and with fancy shadows enabled. The reason for this is not known. One guess is that the bounding boxes commonly encountered by at least one (but fewer than 4) of the threads are so inefficient that the multiprocessing speedup isn't as good as it was without the tree. Memory usage in all situations remained low, as expected. I thought perhaps the fast intersect code could have been the problem, but it actually saves 10 seconds when the HAABB tree is used. In particular, due to the nature of the image, most of the spheres have to be checked by any one thread's shadow rays anyway, so it is difficult to construct an HAABB tree that beats just checking all the spheres.
In contrast, for thousand spheres, the HAABB tree sped up rendering quite a bit. The non-tree version suffers from the fact that each thread continuously checks all the spheres for an intersection, so that the thread rendering the bottom 1/4th of the image when using 4 threads will take very long and dominate the rendering time.
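The load imbalance described above (one thread stuck with the expensive quarter of the image) is the kind of thing OpenMP's dynamic scheduling is meant for. The following is only a minimal sketch of that idea, not code from this project: renderImage and shadePixel are hypothetical names, and shadePixel is a placeholder for tracing all the samples of one pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for tracing every sample of one pixel.
static float shadePixel(int x, int y) {
    return static_cast<float>((x * 31 + y * 17) % 256) / 255.0f;
}

// Renders a width x height image, handing out rows to threads one at a time.
// With schedule(dynamic, 1), a thread that finishes a cheap row immediately
// claims the next unrendered row, so cheap and expensive rows are spread
// across threads. Without OpenMP enabled (e.g. no -fopenmp), the pragma is
// ignored and the loop runs serially with the same result.
std::vector<float> renderImage(int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<float> image(static_cast<std::size_t>(width) * height);
    #pragma omp parallel for schedule(dynamic, 1)
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            image[static_cast<std::size_t>(y) * width + x] = shadePixel(x, y);
        }
    }
    return image;
}
```

Each row is written by exactly one thread, so no locking is needed around the image buffer.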
Errata
You must specify (ao 0) (i.e., ambient occlusion off) to the camera in order for most of the images to work. Ambient occlusion was added last-minute, and an uninitialized value for the number of ambient occlusion rays can cause images to render with ambient occlusion on (and at a very high ray count) when it should default to off.
Extra credit
TOC
- Improved soft shadows
- Improved multiprocessing
- Improved World::fastIntersect
- Triangle mesh support
- Smooth shading
- Depth of field
- Optics
- Ambient occlusion
Improved soft shadows
In my AS5 submission, my shadows were only softened for point lights, by mistake. Now they are softened for point and directional lights. Because they interfere with triangle mesh support, they are disabled by default and must be reenabled by #define-ing FANCY_SHADOWS in main.h.
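One common way to soften a shadow is to replace the single shadow ray with several rays aimed at points spread around the light, and to use the fraction that get through as the light's contribution. The sketch below illustrates that idea only; it is an assumption, not necessarily how this renderer does it. The square light, the single sphere occluder, and the names lightVisibility and segmentHitsSphere are all hypothetical. With n = 5 it casts 25 shadow rays per shaded point.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True if the segment from p to q passes through the sphere.
static bool segmentHitsSphere(Vec3 p, Vec3 q, Vec3 center, double radius) {
    Vec3 d = sub(q, p);
    Vec3 m = sub(p, center);
    double a = dot(d, d);
    double b = 2.0 * dot(m, d);
    double c = dot(m, m) - radius * radius;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return false;          // the infinite line misses the sphere
    double s = std::sqrt(disc);
    double t0 = (-b - s) / (2.0 * a);
    double t1 = (-b + s) / (2.0 * a);
    return t0 <= 1.0 && t1 >= 0.0;         // [t0, t1] overlaps the segment [0, 1]
}

// Fraction of an n x n grid of sample points on a square light (side `size`,
// lying in the plane y = light.y) that is visible from `point`.
// Returns 1.0 for fully lit, 0.0 for fully shadowed.
double lightVisibility(Vec3 point, Vec3 light, double size, int n,
                       Vec3 occCenter, double occRadius) {
    assert(n > 0);
    int visible = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double u = ((i + 0.5) / n - 0.5) * size;
            double v = ((j + 0.5) / n - 0.5) * size;
            Vec3 sample = {light.x + u, light.y, light.z + v};
            if (!segmentHitsSphere(point, sample, occCenter, occRadius)) ++visible;
        }
    }
    return static_cast<double>(visible) / (n * n);
}
```

Points in the penumbra get a value strictly between 0 and 1, which is what produces the soft edge.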

Improved multiprocessing
In my AS5 submission, things were processed in multiple threads but there was never any evidence of progress. Now the current image is displayed to the user. You can see the supersampling in action because already-drawn portions of the image will often be "updated" again in subsequent lines. Additionally, by judicious calls to Viewport::resetLimits, the program will automatically repurpose threads that have completed their part of the image.

Above you can see evidence of both multithreading and the ability to display intermediate stages.

The image does not need to be of approximately uniform complexity for the renderer to keep all your processors busy; the above image shows that threads which have finished their part of the image get repurposed.
Improved World::fastIntersect
World::fastIntersect now uses the same speedups as World::intersect does, with the HAABB.
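The basic operation any bounding-box hierarchy relies on is a cheap ray-versus-box test, so that whole subtrees can be skipped when the ray misses their box. A minimal sketch of the standard slab test follows. The types Vec3, Ray and AABB and the function hitsBox are hypothetical and are not taken from this project's World class.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin, dir; };
struct AABB { Vec3 lo, hi; };

// Slab test: returns true if the ray enters the box for some parameter t in
// [tMin, tMax]. For a shadow ray, tMax can be set to the distance to the
// light so that boxes beyond the light are rejected as well.
bool hitsBox(const Ray& r, const AABB& b, double tMin, double tMax) {
    assert(tMin <= tMax);
    const double o[3]  = {r.origin.x, r.origin.y, r.origin.z};
    const double d[3]  = {r.dir.x, r.dir.y, r.dir.z};
    const double lo[3] = {b.lo.x, b.lo.y, b.lo.z};
    const double hi[3] = {b.hi.x, b.hi.y, b.hi.z};
    for (int a = 0; a < 3; ++a) {
        if (d[a] == 0.0) {
            // Ray is parallel to this pair of planes: it must start between them.
            if (o[a] < lo[a] || o[a] > hi[a]) return false;
            continue;
        }
        double t0 = (lo[a] - o[a]) / d[a];
        double t1 = (hi[a] - o[a]) / d[a];
        if (t0 > t1) std::swap(t0, t1);
        tMin = std::max(tMin, t0);   // latest entry across the slabs so far
        tMax = std::min(tMax, t1);   // earliest exit across the slabs so far
        if (tMin > tMax) return false;
    }
    return true;
}
```

A tree traversal would call this on a node's box first and only descend into the children (or test the contained primitives) when it returns true.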
Triangle mesh support

Supports arbitrary triangle meshes that can be exported as .obj files from something like Blender. OBJ file available here.
Smooth shading

Linear interpolation of the surface normal using the vertex normals gives a much smoother image than before. It's still not perfect though. In particular, it seems to be rather incompatible with the smooth shadows implementation. It also works best with true triangle meshes (as opposed to quad meshes in which the quads have simply been split in two). Fortunately smooth shading can be specified in the scd file with the command "smooth". It only works on the current instance! So you have to apply it on the leaf node when you instantiate a copy of the mesh. You can't apply it to a group and have it apply to everything in that group.
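As a rough illustration of the interpolation described above, the sketch below blends three vertex normals using the barycentric weights of the hit point and renormalizes the result. The names are hypothetical and the code is a sketch under those assumptions, not the project's implementation.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// n0, n1, n2 are the vertex normals of the triangle; w0, w1, w2 are the
// barycentric weights of the hit point (non-negative, summing to 1).
// The weighted sum is generally shorter than unit length, so it is
// renormalized before being used for shading.
Vec3 interpolateNormal(Vec3 n0, Vec3 n1, Vec3 n2,
                       double w0, double w1, double w2) {
    Vec3 n = {w0 * n0.x + w1 * n1.x + w2 * n2.x,
              w0 * n0.y + w1 * n1.y + w2 * n2.y,
              w0 * n0.z + w1 * n1.z + w2 * n2.z};
    return normalize(n);
}
```

At a vertex (weights 1, 0, 0) this returns that vertex's normal exactly, so adjacent triangles that share vertex normals shade continuously across their common edge.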
Depth of field

The above image shows the Utah teapot in its original aspect ratio, rendered by a camera with aperture 0.1 and focal plane at z = -52. It was rendered at 32 RPP and took perhaps 15-30 minutes to render at 256 RPP.
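Depth of field of this kind is usually produced with a thin-lens camera: each sample starts from a different point on the lens but is aimed at the point where the ordinary pinhole ray meets the focal plane. The sketch below assumes a camera at the origin looking down -z; thinLensRay and its parameters are hypothetical names. Sampling the lens offset (for example uniformly within the aperture) is left to the caller, and whether the 0.1 quoted above is a radius or a diameter is not something this sketch assumes.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin, dir; };

// pinholeDir: direction of the ordinary pinhole ray through the pixel (z < 0).
// focalZ:     z coordinate of the focal plane (negative, in front of the camera).
// lensU/V:    offset of this sample on the lens, in the plane z = 0.
// The returned direction is left unnormalized so that origin + dir is exactly
// the focus point; normalize it before tracing if the tracer expects that.
Ray thinLensRay(Vec3 pinholeDir, double focalZ, double lensU, double lensV) {
    assert(pinholeDir.z < 0.0 && focalZ < 0.0);
    double t = focalZ / pinholeDir.z;                 // where the pinhole ray meets the focal plane
    Vec3 focus = {pinholeDir.x * t, pinholeDir.y * t, focalZ};
    Vec3 origin = {lensU, lensV, 0.0};
    Vec3 dir = {focus.x - origin.x, focus.y - origin.y, focus.z - origin.z};
    return Ray{origin, dir};
}
```

Because every lens sample for a pixel converges on the same focus point, objects on the focal plane stay sharp while objects in front of or behind it are averaged over a disk and blur.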
Optics
Refraction is supported, so arbitrary, non-intersecting meshes or spheres can be used as lenses to focus light. The three images below are based on this scene file, modified as specified. The camera used in rendering these images has an aperture of about 0.1 units (the exact units are of undefined type).

In the above image, the three teapots are at z = -18 and the camera's focal plane is at z = -18. This took about 30 seconds to render at 32 RPP.

In the above image, the teapots are blurred because an incorrect focal plane is specified to the renderer, z = -10.5 instead of z = -18. This also took about 30 seconds to render at 32 RPP.

In the above image, a corrective optical element (BCX lens with n = 1.5, r1 = 7, r2 = 7, d ~= 2 --> f = 7.2) is placed at z = -5 to correct for the incorrect focal plane specified to the camera. Aberrations are present due to the smooth shading applied to both the teapots and the lens. Note that the teapot is also magnified. This took just over 4 minutes to render at 32 RPP. Chromatic aberration and things that we could get by photon mapping (like caustics) are not supported.
The lens OBJ file can be found here. It was made in Blender by creating two ico spheres (subdivision 5) of radius 7 and placing them 13 units apart. After intersecting the two spheres, the result was intersected with a cylinder of length 2 and radius 1 to obtain a lens. A higher subdivision number seemed to just crash Blender instead of producing a better lens. UV spheres (rings and divisions = 96) produced horrific aberrations that could not be corrected with smooth shading.

This image is the same as the previous one but rendered at 512 RPP and without the white background.

In the image above, smooth shading has been turned off and the camera focused on the plane in which the lens resides, enabling us to see the true multifaceted nature of the lens. Rendered at 4 RPP in 40s.
In this image, the lens remains in place while the camera is focused onto the background, with smooth shading disabled. Rendering time same as above.

In this image, a glass contains 2 cubes and is surrounded by 2 more cubes. All of them are represented by triangle meshes, without smoothing. The shadows caused by transparent objects are much lighter than they were in previous images. Rendered at max recursion depth 16 and 4 RPP in 5700s. Scene file available here. Cube 1 (lower cube in glass) here, cube 2 (upper cube in glass) here, other cubes are transformed versions of this cube, and the glass is available here. The cubes have 49152 triangles each and the cup has 1452 triangles.
As a final note, the Schlick approximation is used to determine reflection magnitudes. Additionally, Kt is used to determine how bright/dark the shadows are. The translucent objects can also be of arbitrary color; 1 1 1 is totally clear and 0 0 0 is totally opaque.
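For reference, the Schlick approximation estimates the reflected fraction as R = R0 + (1 - R0)(1 - cos theta)^5, where R0 = ((n1 - n2) / (n1 + n2))^2 is the reflectance at normal incidence. A minimal sketch follows; the function name and parameter order are hypothetical, and total internal reflection is assumed to be handled separately by the caller.

```cpp
#include <cassert>
#include <cmath>

// cosTheta: cosine of the angle between the incoming ray and the surface
//           normal, in [0, 1].
// n1, n2:   refractive indices on the incident and transmitted sides.
// Returns the fraction of light reflected; 1 minus this is transmitted.
double schlickReflectance(double cosTheta, double n1, double n2) {
    assert(cosTheta >= 0.0 && cosTheta <= 1.0);
    double r0 = (n1 - n2) / (n1 + n2);
    r0 *= r0;                                  // reflectance at normal incidence
    double m = 1.0 - cosTheta;
    return r0 + (1.0 - r0) * m * m * m * m * m;
}
```

For air to glass (n1 = 1, n2 = 1.5) this gives 4% reflection head-on and rises to 100% at grazing incidence.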
Ambient occlusion

The above image has no ambient occlusion.

The above image has ambient occlusion with 32 shadow rays per regular light ray. Both dragons are equally terrifying but one clearly looks better than the other. Ambient occlusion provides subtle depth cues to the viewer by selectively blocking ambient light as though it could be shadowed as well. Of course the effect is more pronounced in scenes with a lot of ambient light, unlike this one, which has very little. The non-occluded image (first) took about 20 seconds to render; the occluded image took about 140 seconds. OBJ file here and SCD file here. Specify (ao 32) to the camera to get the occluded image seen here.
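The idea of shadowing the ambient term can be sketched as follows: from each shaded point, cast a number of rays into the hemisphere around the surface normal and scale the ambient light by the fraction that escape. This is a generic sketch under stated assumptions, not the project's code. ambientVisibility is a hypothetical name, the occluded callback stands in for the scene's intersection test, and the fixed seed is only there to make the example repeatable.

```cpp
#include <cassert>
#include <functional>
#include <random>

struct Vec3 { double x, y, z; };

// Returns the fraction of numRays hemisphere rays around `normal` that are
// not blocked. `occluded(origin, dir)` should return true when the ray hits
// something; `dir` is passed unnormalized.
double ambientVisibility(Vec3 point, Vec3 normal, int numRays,
                         const std::function<bool(Vec3, Vec3)>& occluded,
                         unsigned seed = 1) {
    assert(numRays > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uni(-1.0, 1.0);
    int open = 0;
    for (int i = 0; i < numRays; ++i) {
        // Rejection-sample a direction inside the unit ball.
        Vec3 d;
        double len2;
        do {
            d = Vec3{uni(rng), uni(rng), uni(rng)};
            len2 = d.x * d.x + d.y * d.y + d.z * d.z;
        } while (len2 > 1.0 || len2 == 0.0);
        // Flip directions that point into the surface.
        if (d.x * normal.x + d.y * normal.y + d.z * normal.z < 0.0) {
            d = Vec3{-d.x, -d.y, -d.z};
        }
        if (!occluded(point, d)) ++open;
    }
    return static_cast<double>(open) / numRays;
}
```

With 32 rays, as in the image above, the ambient contribution at a point would be multiplied by the returned value, so creases and crevices that see little open sky come out darker.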