Microsoft just recently presented the paper “High quality streamable free-viewpoint video” at SIGGRAPH. In this presentation, they are capturing live 3D views of actors on a stage using multiple cameras and using computer vision to construct detailed texture mapped mesh models which are then compressed for live viewing. On the viewer you have the freedom to move around the model in 3D.
I contributed to this project for a year or so when I was employed at Microsoft, working on 3D reconstruction from multiple infra-red camera views, so it was nice to get an acknowledgment. Some of this work was inspired by our earlier work at Microsoft Research which I co-presented at SIGGRAPH in 2004.
It’s very nice to see how far they have progressed with this project and to see the possible links that it can have with the Hololens virtual reality system.
When training neural networks it is a good idea to have a training set which has examples that are randomly ordered. We want to ensure that any sequence of training set examples, long or short, has statistics that are representative of the whole. During training we will be adjusting weights, often by using stochastic gradient descent, and so we ideally would like the source statistics to remain stationary.
During on-line training, such as with a robot, or when people learn, adjacent training examples are highly correlated. Visual scenes have temporal coherence and people spend a long time at specific tasks, such as playing a card game, where their visual input, over perhaps hours, is not representative of the general statistics of natural scenes. During on-line training we would expect that a neural net weights would become artificially biased by having highly correlated consecutive training examples so that the network would not be as effective at tasks requiring balanced knowledge of the whole training set.
If you are training neural networks or experimenting with natural image statistics, or even just making art, then you may want a database of natural images.
I generated an image patch database that contains 500,000 28×28 or 64×64 sized monochrome patches that were randomly sampled from 5000 representative natural images, including a mix of landscape, city, and indoor photos. I am offering them here for download from Dropbox. There are two files:
image_patches_28x28_500k_nofaces.dat (334MB compressed)
image_patches_64x64_500k_nofaces.dat (1.66GB compressed)
The first file contains 28×28 pixel patches and the second one contains 64×64 patches. The patches were sampled from a corpus of personal photographs at many different locations and uniformly in log scale. A concerted effort was made to avoid images with faces, so that these could be used as a non-face class for face detector training. However there are occasional faces that have slipped through but the frequency is less than one in one thousand. Continue reading