For the past three years we’ve been involved in the Tripod project. The goal of Tripod is to create new, easy ways to find visual media. One of the approaches is to use the geographic metadata that is now stored with pictures to figure out what is in a picture. Once you know what is in a picture, you could search for, say, ‘Stephanskirche’ and get pictures of that particular church, even if no user has manually added tags or captions.
The amount of metadata available for photos is increasing with the number of sensors built into cameras. All digital cameras already record the focal length, aperture and similar settings. Using information from GPS, compass and accelerometer sensors, you can also record the location, the direction, and the roll, pitch and yaw of the camera. With this information you can recreate the photo in a 3D environment. If you have geodata available in that 3D environment, you can return information about the visible features.
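To give an idea of how camera metadata turns into a virtual camera, here is a minimal Python sketch. It uses the standard angle-of-view formula for a rectilinear lens and converts a compass heading and pitch into a viewing direction. The function names and the sensor and focal length values are assumptions made for illustration; they are not taken from Tripod.

```python
import math

def field_of_view_deg(focal_length_mm, sensor_size_mm):
    """Angle of view along one sensor dimension for a rectilinear lens."""
    return math.degrees(2 * math.atan(sensor_size_mm / (2 * focal_length_mm)))

def view_direction(heading_deg, pitch_deg):
    """Unit vector (east, north, up) for a compass heading and a pitch.

    Heading is measured clockwise from north, pitch upwards from the horizon.
    """
    heading = math.radians(heading_deg)
    pitch = math.radians(pitch_deg)
    east = math.sin(heading) * math.cos(pitch)
    north = math.cos(heading) * math.cos(pitch)
    up = math.sin(pitch)
    return east, north, up

# Hypothetical camera: 36 mm wide sensor, 18 mm focal length,
# pointing due east and level with the horizon.
horizontal_fov = field_of_view_deg(18.0, 36.0)
direction = view_direction(90.0, 0.0)
```

Together with the GPS position, the field of view and the direction are enough to place a camera in a 3D scene. Roll is left out here to keep the sketch short.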
Take for instance this picture:
We know the location and the direction of the picture. With a simple GIS operation you can find out that the picture was taken in Bamberg, Germany. However, you don’t yet know what’s in the picture. For a start, the picture was actually shot on the ‘Obere Seelgasse’, which is two streets away from the church. The area is also both urban and hilly. This means that either a house in that street or a nearby hill could be blocking the view of the church.
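The simple GIS operation mentioned above is essentially a point-in-polygon test. The sketch below shows one common way to do it, ray casting, in plain Python. The outline is a made-up rectangle roughly around Bamberg and the photo position is hypothetical; a real lookup would use actual administrative boundaries and most likely a GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Made-up rectangle roughly around Bamberg, NOT a real city boundary.
BAMBERG_OUTLINE = [(10.82, 49.84), (10.96, 49.84), (10.96, 49.93), (10.82, 49.93)]

# Hypothetical photo position (longitude, latitude).
taken_in_bamberg = point_in_polygon(10.886, 49.888, BAMBERG_OUTLINE)
```

This only answers where the photographer stood, which is exactly why it cannot tell you what the camera was looking at.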
So you need to do a 3D analysis of the image. We have a detailed 3D model of Bamberg and a digital terrain model, from which we can calculate which features are in view. To do this we use the Web Perspective View Service (WPVS) from deegree. This is a proposed OGC standard which allows us both to render a perspective view for a given location, orientation and field of view, and to query geo-databases using positions in the photo.
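To illustrate why the terrain and building heights matter, here is a much simplified line-of-sight check along a single elevation profile. This is not how the WPVS works internally (it renders the full 3D scene); it is only a sketch of the underlying idea, and the function name and all elevations are invented for the example.

```python
def is_visible(profile, eye_height, target_height):
    """Check the line of sight along a terrain profile.

    profile is a list of (distance_m, elevation_m) samples from the camera
    (first entry) to the target (last entry). Intermediate elevations
    should include any buildings standing on the terrain.
    """
    d0, ground0 = profile[0]
    d1, ground1 = profile[-1]
    eye = ground0 + eye_height
    target = ground1 + target_height
    for distance, elevation in profile[1:-1]:
        # Elevation of the sight line at this distance from the camera.
        t = (distance - d0) / (d1 - d0)
        sight = eye + t * (target - eye)
        if elevation >= sight:
            return False  # something in between blocks the view
    return True

# Invented profile: a 290 m obstacle sits between camera and target.
profile = [(0, 260), (100, 262), (200, 290), (300, 270)]

tall_tower_visible = is_visible(profile, eye_height=1.7, target_height=40)
low_roof_visible = is_visible(profile, eye_height=1.7, target_height=10)
```

With these numbers the top of a 40 m tower clears the obstacle, while a 10 m roof at the same spot stays hidden behind it.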
Using the WPVS we get this image:
For automatic identification it is easier if each building has a single color, because we can then calculate the area of the image occupied by a specific building. So we had to uglify the image by removing anti-aliasing and shading. We wrote a service called the Feature Identification Service (FIS) which, given a georeferenced photo, determines the most prominent features and returns a list of visible features. In this case it would tell us that the photo contains the Stephanskirche in Bamberg.
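The area calculation can be sketched in a few lines of Python. This is not the actual FIS code, just an illustration of the principle: count the pixels of each flat color and rank the features by the share of the image they cover. The function name, the color-to-feature mapping and the tiny test image are all assumptions made for the example.

```python
from collections import Counter

def rank_features(pixels, color_to_feature, background=(0, 0, 0)):
    """Rank features by the share of the image their color occupies.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    Returns (feature_name, fraction_of_image) pairs, largest first.
    """
    counts = Counter(color for row in pixels for color in row)
    total = sum(counts.values())
    ranked = []
    for color, count in counts.most_common():
        # Skip the background and any color we cannot map to a feature.
        if color == background or color not in color_to_feature:
            continue
        ranked.append((color_to_feature[color], count / total))
    return ranked

# Hypothetical color coding and a tiny 2 x 4 "rendering".
RED, BLUE, SKY = (255, 0, 0), (0, 0, 255), (0, 0, 0)
features = {RED: "Stephanskirche", BLUE: "House on Obere Seelgasse"}
image = [
    [SKY, RED, RED, SKY],
    [BLUE, RED, RED, BLUE],
]

ranking = rank_features(image, features)
most_prominent = ranking[0][0]
```

Exact color matching like this only works because the rendering has no anti-aliasing or shading; blended edge pixels would otherwise show up as colors that belong to no building.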