A photo is worth a thousand words, but what if the image could also represent thousands of other images?
New software developed by University of California, Berkeley computer scientists seeks to tame the vast amount of visual data in the world by generating a single photo that can represent massive clusters of images.
The tool can give users the photographic gist of, say, brides and grooms at weddings by generating a single image that averages the key visual features of the photos in a cluster.
Users can also give extra weight to specific features to create subcategories and quickly sort the image results.
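The article does not spell out the algorithm, but the core idea of a (weighted) image average can be illustrated with a minimal sketch. The function below is an assumption-laden stand-in, not the AverageExplorer implementation: it simply averages pixel values, with optional per-image weights playing the role of the "extra weight" users can assign, and it omits the image alignment the real system performs before averaging.

```python
import numpy as np

def average_images(images, weights=None):
    """Pixel-wise (weighted) average of equally sized images.

    images  -- list of H x W x 3 float arrays in [0, 1]
    weights -- optional per-image weights; a larger weight pulls the
               result toward that image (a crude stand-in for the user
               emphasis described above).  AverageExplorer also aligns
               images before averaging, which this sketch omits.
    """
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    # np.average normalises the weights, so they need not sum to 1.
    return np.average(stack, axis=0, weights=weights)
```

For example, averaging an all-black and an all-white image with weights `[1, 3]` yields a light-grey image three quarters of the way toward white.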
The research was led by Alexei Efros, associate professor of electrical engineering and computer sciences, who noted that an estimated 3.5 trillion photos have been taken since photography was invented, about 10 per cent of them within the past year.
"Visual data is among the biggest of Big Data," said Efros.
"We have this enormous collection of images on the Web, but much of it remains unseen by humans because it is so vast. People have called it the dark matter of the Internet. We wanted to figure out a way to quickly visualise this data by systematically 'averaging' the images," he said.
Efros worked with Jun-Yan Zhu, UC Berkeley computer science graduate student and the paper's lead author, and Yong Jae Lee, former UC Berkeley postdoctoral researcher, to develop the system, which they have dubbed AverageExplorer.
The researchers provided examples of potential applications of this system, such as in online shopping, where a consumer may want to quickly home in on two-inch wedge heels in the perfect shade of red.
Lee, now an assistant professor of computer science at UC Davis, said the system could also help computer vision systems learn to distinguish key features in an image, such as the tires on a car or the eyes on a face.
When users mark those features on an average image, the entire collection of images is automatically annotated as well.
"In computer vision, annotations are used to train a system to detect objects, so you might mark the eyes, nose and mouth to teach the computer what a human face looks like," said Lee.
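The propagation step described above can be sketched as follows. This is a hypothetical illustration, not the paper's method: it assumes each image in the collection comes with a 2x2 transform mapping average-image coordinates into that image's coordinates (whatever alignment AverageExplorer actually computes is not described here), so a landmark marked once on the average can be copied to every image.

```python
import numpy as np

def propagate_annotation(point, alignments):
    """Transfer a landmark marked on the average image to each image.

    point      -- (x, y) coordinates on the average image
    alignments -- one 2x2 transform per collection image, mapping
                  average-image coordinates to that image's
                  coordinates (an assumed interface; the real
                  system's alignments are not specified here)
    """
    p = np.asarray(point, dtype=np.float64)
    # Apply each image's transform to the marked point.
    return [tuple(np.asarray(A, dtype=np.float64) @ p) for A in alignments]
```

Marking the eyes once on an average face would then annotate the whole aligned collection in one pass, which is the labour saving Lee describes.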