Trillion Frames-per-Second Camera Explained

MIT has recently been in the news because Ramesh Raskar and his team have demonstrated a camera that can visualize the motion of light through a scene. The New York Times has a writeup here. More details can be found on the group’s project page.

One thing commentators have been confused about is what it means to show light in flight through a medium. In fact, the group’s video makes it more confusing for the layman when Raskar says “We have built a virtual slow-motion camera where we can see photons or light particles moving through space.” This gives the impression that we can see single photons and somehow watch them moving along a path. Neither of these things is happening.

Instead, what they have is a laser that can produce extremely short pulses of light (each containing many photons) and a synchronized camera that can open its electronic shutter to capture frames with picosecond durations. The effect is similar to using a stroboscope. The laser illuminates the objects of interest with a short pulse of light that repeats every 13 nanoseconds. This pulse travels through the scene, bouncing around and scattering. Many of these photons find their way into the camera. However, the camera is set to ignore anything that happens outside a narrow time window. By adjusting the relative timing of the laser pulse and the moment of camera capture, the scene can be visualized at varying delays after the start of the laser pulse.
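
To make the gating idea concrete, here is a minimal Python sketch, not the actual MIT pipeline. Apart from the 13 nanosecond repetition period quoted above, every number in it (gate width, photon counts, arrival-time distribution) is an illustrative assumption: it simulates photons arriving at the sensor and counts only those that fall inside a picosecond-wide gate whose delay relative to the laser pulse is swept.

```python
# Hypothetical sketch of time-gated ("stroboscopic") capture.
# Only photons arriving inside the gate [delay, delay + width) are counted;
# sweeping the delay recovers the scene's time profile one slice at a time.
import random

REP_PERIOD_PS = 13_000      # laser pulse repeats every 13 ns (13,000 ps)
GATE_WIDTH_PS = 2           # assumed ~picosecond exposure window
PHOTONS_PER_PULSE = 10_000  # assumed photon budget per pulse (illustrative)

def simulate_arrivals(n):
    """Fake arrival-time distribution: two scattering 'events' near 50 and 120 ps."""
    return [random.gauss(50, 5) if random.random() < 0.6 else random.gauss(120, 8)
            for _ in range(n)]

def gated_count(arrivals, gate_delay_ps):
    """Count only photons arriving inside the gate window."""
    return sum(gate_delay_ps <= t < gate_delay_ps + GATE_WIDTH_PS for t in arrivals)

# Sweep the laser-to-camera delay; each delay yields one "frame" of intensity.
profile = []
for delay in range(0, 200, GATE_WIDTH_PS):
    counts = gated_count(simulate_arrivals(PHOTONS_PER_PULSE), delay)
    profile.append((delay, counts))

for delay, counts in profile[:10]:
    print(f"delay {delay:3d} ps -> {counts} photons in gate")
print(f"swept {profile[-1][0] + GATE_WIDTH_PS} ps of the {REP_PERIOD_PS} ps pulse period")
```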

[Diagram: Laser Scattering]
The diagram should make things a little clearer. The laser pulse starts from a source on the left at time zero. The light travels to the right and scatters from the liquid in the bottle or from surfaces at various times and by various paths, and some photons end up at the camera while others continue on to other areas. Light traveling by different paths but with the same delay will arrive simultaneously at the camera, e.g., paths 1a and 1b. The camera therefore sees an image formed from light that has a narrowly defined path length (corresponding to a time T1) from the source. This light will only be imaged at the sensor when the laser-to-camera synchronization delay is set to T1. Once they have captured a complete image frame of light coming from many different directions at delay T1, they can adjust the delay to T2, which may correspond to the light paths shown as 2a and 2b. Delays T3 and T4 would capture light coming from progressively longer paths, 3 and 4.
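
As a back-of-the-envelope illustration of this path-length/delay relationship, a delay T simply picks out the set of paths whose total length equals c·T. The path lengths below are made-up stand-ins for the diagram's labelled paths; only the speed of light is real.

```python
# Illustrative arithmetic only: relate a light-path length to the camera delay
# at which it is imaged. Equal-length paths (1a and 1b) arrive at the same time.
C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond

paths_mm = {
    "1a": 150.0, "1b": 150.0,   # same total length -> same arrival time (T1)
    "2a": 180.0, "2b": 180.0,   # longer paths -> later delay (T2)
    "3":  210.0,                # still longer (T3)
    "4":  240.0,                # longest shown (T4)
}

for name, length_mm in paths_mm.items():
    delay_ps = length_mm / C_MM_PER_PS
    print(f"path {name}: {length_mm:6.1f} mm -> imaged at delay {delay_ps:6.1f} ps")
```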

In practice, this process is achieved by automatically capturing 480 frames in quick succession at 1.71-picosecond intervals (corresponding to a 0.5-millimeter incremental motion of the light). The resulting movie can be played back at any speed. Because they use a 1D imager, they must assemble the horizontal lines of the image from different runs to make the final movie. Additionally, the same movie must be captured millions of times so that enough data can be averaged together to get a sufficiently clean signal.
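
A quick sanity check on those numbers, using only the speed of light plus the frame interval and frame count quoted above:

```python
# Verify that a 1.71 ps frame interval corresponds to ~0.5 mm of light travel,
# and see how much of the 13 ns pulse period the 480 frames actually cover.
C_MM_PER_PS = 0.299792458          # speed of light, mm per picosecond

frame_interval_ps = 1.71           # interval between successive frames
n_frames = 480                     # frames per captured movie

distance_per_frame_mm = C_MM_PER_PS * frame_interval_ps
total_window_ps = n_frames * frame_interval_ps

print(f"light travels {distance_per_frame_mm:.2f} mm per frame")   # ~0.51 mm
print(f"total recorded window: {total_window_ps:.0f} ps "
      f"(~{total_window_ps / 1000:.2f} ns of the 13 ns pulse period)")
```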

This overall technique holds promise for various future applications: super-sensors could see around corners, or determine accurate 3D scene structure using time-of-flight information. At the moment there is a move to use active sensing to determine the 3D structure of a scene (e.g., the Microsoft Kinect). Using detailed time-of-flight information and second-order reflections, it may be possible to enhance this and overcome some of the occlusion problems involved in reconstructing 3D structure from range data taken from multiple cameras. Additionally, sensors based on the MIT research could rapidly measure surface reflectance functions (BRDFs) and sub-surface scattering, giving more realism to computer graphics created from the imaged scenes or materials. Detailed imaging of sub-surface scattering from the human body could also help to diagnose disease. Watch this space.
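
For the time-of-flight ranging mentioned above, the underlying arithmetic is just d = c·t/2 for a round trip. A minimal, purely illustrative sketch (not the MIT system, and the 10 ns example time is made up):

```python
# Principle behind active time-of-flight depth sensing: the round-trip time
# of a light pulse gives the distance to the reflecting surface.
C_M_PER_S = 299_792_458.0

def distance_from_round_trip(t_seconds):
    """Distance to the surface is half the round-trip path length."""
    return C_M_PER_S * t_seconds / 2.0

# Example: a reflection arriving 10 ns after the pulse left the emitter.
print(f"{distance_from_round_trip(10e-9):.2f} m")  # ~1.50 m
```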