Rotation is the enemy

Last week I have published a simple tool that calculates (among a few other things) motion blur resulting from camera movement relative to the photographed object. Looking at the results of those calculations, one could say that motion blur is a very minor issue in UAV photography: at a platform speed of 30 km/h and a shutter speed of 1/1000 s, motion blur is as low as 0.8 cm. Flying a Canon G12 at the wide angle limit (28 mm) and 200 m above ground, this amounts to only 0.25 image pixels. From the calculation results of UAVphoto, motion blur does not appear to be a relevant issue. The need to take images at short intervals to achive sufficient overlap appears to be much more important when using a UAV. But why do I even get blurred images when using a kite that is almost immobile relative to the ground?

The point is that motion blur due to translation (i.e. linear movement of the camera relative to the object) is only one reason for blurred images. Another (and much more relevant) reason is rotation of the camera. Unfortunately, this is also much harder to measure and to control. To show how important rotation is for image blur, I have added the calculation of rotation blur to the new version of UAVphoto. Two types of rotation have to be distinguished: rotation about the lens axis and rotation about the focal point but perpendicular to the lens axis. I am not using the terms pitch, roll and yaw here because the relation of platform pitch, roll and yaw to rotation about different camera axes depends on how the camera is mnounted to the platform.

Rotation about the lens axis results in rotation blur that is zero at the image centre and reaches a maximum at the image corners. Rotation about an axis orthogonal to the lens axis results in rotation blur that is at first sight indistinguishable from motion blur due to high speed translation movement. Of course, all types of blur combine to the total image blur. Rotation blur about the lens axis is independend of focal length. Orthogonal rotation blur, on the other hand, increases with increasing focal length. In both cases an increase in shutter speed will result in a proportional decrease in image blur.

Most UAV rotation movements are due to short-term deflections by wind gusts or steering. Wind gusts are also the main source of rotation movements of kite-carried cameras. Let’s say we’re using a Canon G12 at the wide angle limit (28 mm). The maximum rotation rate which will not result in image blur (using a 0.5 pixel threshold) is 12.4 °/s (or 29 s for a full circle) for rotation about the lens axis and 8.1 °/s (or 44 s for a full cirlce) for rotation orthogonal to the lens axis. At a focal length of 140 mm, the maximum rotation rate orthogonal to the lens axis is only 1.9 (or 189 s for a full circle). If all this sounds very slow to you, you’ve got the point: even slow rotation of the camera during image capture is a serious issue for UAV photography, in most cases much more important than flying speed.

UAVphoto – a simple calculation tool for aerial photography

I have to admit that I am sometimes a bit lazy. Rather than solving a problem once and working with the solution, in some cases I keep twiddling with the same problem again and again. Calculating things like viewing angles, ground resolution, motion blur or image overlap for aerial photography is a case in point. There must be a dozen or so spreadsheet files on my various computers which I used to do such calculations. I kept re-inventing the wheel again and again for myself and when others asked me for help.


Now I finally got around writing a small piece of software for this specific task. It is a simple tool that allows to calculate parameters like ground pixel size, motion blur and sequential image overlap from UAV flight parameters (velocity and altitude) and camera parameters (focal length, shutter time, image interval etc.). Calculation assumes a vertical camera view for simplicity. Image y dimensions are those in flight direction, image x dimensions are those perpendicular to flight direction. Default camera values are for Canon G12 at the wide angle limit. Five to six seconds is the approximate minimum image interval using a CHDK interval script. In continous shooting mode, a minimum interval of approximately one second can be achieved.

Now that I created this tool, why not share it? UAVphoto is published under the GNU General Public License and can be downloaded from Sourceforge.

3D acquisition techniques: Structure from Motion

Structure from Motion (SfM), also known as Multi-View Stereo (MVS), is a technique for creating three-dimensional digital models from a set of photographs. Fundamentally, it is a stereo vision approach, that is, it uses the parallax (the difference in the apparent relative position of an object if seen from different directions) to derive 3D information (distance/depth) from 2D images. (Look at something that has foreground and background, e.g. a plant on your window sill and the house on the other side of the street, alternate between your right and left eye – the shifts you see is the parallax.)

What you need for SfM is several (2, 3, … thousands) photographs capturing an object from many different directions. Then, there are by now several software products that can perform SfM 3D reconstruction. VisualSfM (open source) and Agisoft Photoscan ($179 standard, $3499 professional) appear to be among the more popular solutions. There are several more, and there are quality, speed and usability differences, but essentially they all do the same.

A series of photographs of a rock in the desert.

A series of photographs of a rock in the desert.

The software will first analyse the photographs to find many (usually several thousand) characteristic feature points within each image, using SIFT (Scale-Invariant Feature Transform) or similar algorithms. The feature points are defined by analysing the surrounding pixels which will help to recognise similar feature points in other images. Algorithms like SIFT (rather than just trying to directly match small regions among images, e.g. by correlation) ensure that this works irrespective of scale or rotation.

The next step is to match these feature points among images to find corresponding feature points. Some feature points will not be found in any other image – so what, there are many more. Some others will be mismatched, so outliers will be recognised and excluded. The next step is the actual heart of the SfM approach: bundle adjustment. The theory for this has been around for decades, but only with powerful computer hardware was it possible to actually do this (who wants to do billions of computations on a piece of paper…). What happens during bundle adjustment is that the software tries to find appropriate camera calibration parameters and the relative positions of cameras and the feature points on the object. It is quite a challenging optimisation problem, but after all its not magic but just very sophisticated number crunching.

The algorithm starts by using two images. It assumes the known camera focal length (from the JPG’s EXIF data) or (if no data are available) a standard lens. It then uses the relative (2D) positions of the feature points in the two images to derive a rough estimate of the 3D coordinates of these points relative to the camera positions, then iteratively refines both the camera parameters and the 3D point coordinates until either a certain number of iterations or a certain threshold in the variance of the point positions has been reached. Then, the next image is added, and the software tries to fit the matched feature points of that image into the existing 3D model, iteratively refines the parameters and so on. After all images have been added, an additional bundle adjustment is run to refine the entire model. And then, voilá, there’s your sparse 3D point cloud and your modelled camera calibrations and positions.

Sparse point cloud with modelled camera positions.

Sparse point cloud with modelled camera positions.

Now, there are three or four things still worth considering. One is that you may observe slight differences in modelled camera calibration parameters or 3D data. That’s a relatively common issue when dealing with iteratively optimised data. What comes out of the process is not the “absolute truth” but something that is usually quite close to it.

The second point is that the 3D model somehow floats in space – it is not referenced to any coordinate system. Depending on the software, referencing can be done directly by inserting control points with known positions or by using known camera positions as control points. Alternatively, referencing can be done in external software (for example by applying a coordinate transform to the point cloud using Java Graticule 3D. For some purposes, the non-referenced model is good enough, for others you might at least want to scale it, for yet others a complete rotate-scale-translate transformation to reference the model to a coordinate system is necessary.

The third point is that you may want a denser point cloud. To get this, software like VisualSfM or Agisoft Photoscan uses the camera positions, orientation and calibration together with the sparse point cloud and the actual images to compute depth (distance) maps: for every pixel in the image (but usually resampled to a lower resolution), the distance between camera and object is computed. From these distance maps, a dense point cloud is then created.

Finally, what if you need a meshed model? It is either created directly after dense point cloud generation (Agisoft Photoscan), or meshing can be done in external software.

test descr

Meshed model of the rock in the desert.