Images of the Russian Empire

Introduction

Sergey Prokudin-Gorsky was a Russian chemist and photographer who pioneered color photography in the early 20th century. He took a series of photographs of the Russian Empire, which were later compiled into a collection by the Library of Congress. The collection documents the Russian Empire in the early 20th century, including its landscapes, people, and architecture. The photographs were taken with a special camera that captured three black-and-white exposures of the same scene, each through a different color filter. The idea was that these exposures could later be combined into a color photograph, something that was not possible at the time. This project combines the three black-and-white images to recreate the intended color photograph.

Approach

The fundamental approach in this project is to take the three black-and-white images representing red, green, and blue and align them so that the combined photo is correctly colored. The blue image serves as the reference, and the red and green images are each aligned to it. To align an image, I shift it by every offset from -15 to 15 pixels horizontally and vertically, and keep the offset that best matches the reference image. I used two similarity measures to score each offset: the sum of squared differences (L2) and normalized cross-correlation (NCC). L2 alignment picks the offset that minimizes the sum of squared differences between the two images, while NCC alignment picks the offset that maximizes their normalized cross-correlation. Once the images are aligned, I stack them to create the final color image. Some of the results are shown below.
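The exhaustive search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`l2_score`, `ncc_score`, `align`) are hypothetical, and it assumes a circular shift via `np.roll` with the score computed over the whole image, neither of which the write-up specifies.

```python
import numpy as np

def l2_score(a, b):
    # Sum of squared differences; lower means a better match.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return np.sum(diff ** 2)

def ncc_score(a, b):
    # Normalized cross-correlation of the zero-mean images; higher is better.
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return np.sum(a * b) / denom if denom > 0 else 0.0

def align(source, reference, radius=15, metric="l2"):
    """Return the (dy, dx) shift of `source` that best matches `reference`."""
    best_offset, best_score = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumption: circular shift; pixels wrap around the border.
            shifted = np.roll(source, (dy, dx), axis=(0, 1))
            if metric == "l2":
                score = -l2_score(shifted, reference)  # negate so higher is better
            else:
                score = ncc_score(shifted, reference)
            if best_score is None or score > best_score:
                best_offset, best_score = (dy, dx), score
    return best_offset

def combine(red, green, blue, radius=15, metric="l2"):
    """Align red and green to blue, then stack into an RGB image."""
    r_off = align(red, blue, radius, metric)
    g_off = align(green, blue, radius, metric)
    red_aligned = np.roll(red, r_off, axis=(0, 1))
    green_aligned = np.roll(green, g_off, axis=(0, 1))
    return np.dstack([red_aligned, green_aligned, blue])
```

With a radius of 15 this evaluates 31 x 31 = 961 candidate offsets per channel, which is cheap for small images but grows quickly with image size and search radius.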

Examples

[Figure panels: Cathedral (Unaligned), Cathedral (L2 Aligned), Cathedral (NCC Aligned)]
Single-scale alignment applied to the cathedral image.
[Figure panels: Tobolsk (Unaligned), Tobolsk (L2 Aligned), Tobolsk (NCC Aligned)]
Single-scale alignment applied to the Tobolsk image.
[Figure panels: Monastery (Unaligned), Monastery (L2 Aligned), Monastery (NCC Aligned)]
Single-scale alignment applied to the monastery image.

Improving features through edge detection

As the monastery image above shows, the L2 and NCC alignment methods can sometimes fail to align the images correctly. This can happen when the images have low contrast or contain similar or noisy features. To address this, I applied the Canny edge detection algorithm before alignment to make the features more prominent. With more prominent features and less noise, the alignment methods can find better offsets. Canny edge detection identifies edges by computing the gradient magnitude and direction of the image. The edges are then thinned to a single-pixel width, weak responses caused by noise are suppressed with two thresholds, and the remaining edge pixels are linked to form the final edges. Applying Canny edge detection with threshold values of 20 and 120 yielded a much clearer image across most photos, as shown below.
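To illustrate the role of the two thresholds, here is a deliberately simplified edge map in NumPy. It is a stand-in rather than full Canny: it omits Gaussian smoothing and edge thinning, and it links weak pixels to strong ones in a single pass only. The function name `edge_map` is hypothetical, and because the gradient here uses plain central differences, the values 20 and 120 are not on the same scale as in a real Canny implementation.

```python
import numpy as np

def edge_map(image, low=20, high=120):
    """Gradient-magnitude edge map with two thresholds (simplified, not full Canny)."""
    img = image.astype(np.float64)
    gy, gx = np.gradient(img)            # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)

    strong = magnitude >= high           # definitely edges
    weak = magnitude >= low              # edges only if connected to a strong pixel

    # Mark every pixel whose 3x3 neighbourhood contains a strong pixel.
    padded = np.pad(strong, 1)           # pads with False
    rows, cols = strong.shape
    near_strong = np.zeros_like(strong)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            near_strong |= padded[dy:dy + rows, dx:dx + cols]

    return (weak & near_strong).astype(np.uint8) * 255
```

The resulting binary edge images, one per color channel, would then be passed to the alignment search in place of the raw intensities, so the score depends on where structures are rather than on how bright each channel is.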

[Figure panels: Monastery (Unaligned), Monastery (L2 Aligned), Monastery (L2 Aligned, Canny)]
Monastery image alignment with Canny edge detection.
There is a noticeable improvement with the Canny edge detector.
[Figure panels: Cathedral (Unaligned), Cathedral (L2 Aligned), Cathedral (L2 Aligned, Canny)]
Cathedral image alignment with Canny edge detection.

Multi-scale Approach

Since many of the images in the collection are large, searching over a fixed pixel range has its limitations. For example, if the image is 3000 pixels wide, a search of 15 pixels in each direction covers only 1% of the width, whereas for the earlier `cathedral.jpg` image the same search covers about 8%. Increasing the search range can help, but it significantly increases the computation time, especially for large images. A pyramid approach is therefore used to align the images at multiple scales. The image is repeatedly downsampled, alignment starts at the coarsest scale, and the offset computed at each scale is passed to the next finer scale, where it is refined. My pyramid uses 2x downscaling and stops downsampling once the image is smaller than 256x256. At that coarsest scale the search range covers roughly 10% of the image, so misalignments of that relative size can be handled regardless of the original image size. The results are shown below.
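A coarse-to-fine search along these lines might look like the sketch below. The names (`search`, `downsample`, `pyramid_align`) are hypothetical, and several details are assumptions not stated in the write-up: 2x2 block averaging for downsampling, a circular shift, the sum of squared differences as the score, and a small refinement radius of 2 pixels at the finer scales.

```python
import numpy as np

def search(source, reference, center, radius):
    """Best (dy, dx) within `radius` of `center`, by sum of squared differences."""
    best, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(source, (dy, dx), axis=(0, 1))
            score = np.sum((shifted - reference) ** 2)
            if score < best_score:
                best, best_score = (dy, dx), score
    return best

def downsample(image):
    """Halve each dimension by averaging 2x2 blocks (an odd last row/column is cropped)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def pyramid_align(source, reference, min_size=256, coarse_radius=15, refine_radius=2):
    """Return the (dy, dx) shift of `source` that best matches `reference`."""
    source = source.astype(np.float64)
    reference = reference.astype(np.float64)
    # Coarsest level: image smaller than min_size x min_size, so search the full range.
    if max(source.shape) < min_size:
        return search(source, reference, (0, 0), coarse_radius)
    # Otherwise solve the half-resolution problem first ...
    dy, dx = pyramid_align(downsample(source), downsample(reference),
                           min_size, coarse_radius, refine_radius)
    # ... then double that offset and refine it locally at this resolution.
    return search(source, reference, (2 * dy, 2 * dx), refine_radius)
```

The expensive full-range search only runs on the smallest image; every finer level evaluates just (2 * refine_radius + 1)^2 candidates around the doubled estimate.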
[Figure: Image pyramid (Source)]
[Figure panels: Emir (Unaligned), Emir (L2 Aligned), Emir (L2 Aligned, Canny)]
Emir image alignment with pyramid method and Canny edge detection.
A large improvement can be observed from using the edge detector.
[Figure panels: Church (Unaligned), Church (L2 Aligned), Church (L2 Aligned, Canny)]
Church image alignment with pyramid method and Canny edge detection.
[Figure panels: Self Portrait (Unaligned), Self Portrait (L2 Aligned), Self Portrait (L2 Aligned, Canny)]
Self portrait image alignment with pyramid method and Canny edge detection.

Histogram Equalization

The images in the collection are often underexposed or overexposed, which can lead to poor alignment results. To address this, I used histogram equalization to improve the contrast of the images. Histogram equalization spreads out the intensity values of an image so that the full range of intensities is used. This can improve the alignment of underexposed or overexposed images while also giving the final image better contrast. Two types of histogram equalization are used in this implementation: global and adaptive. Global histogram equalization spreads out the intensity values of the entire image at once, while adaptive histogram equalization divides the image into regions and equalizes the histogram of each region separately. Adaptive equalization can therefore be more effective for images with varying lighting conditions. This holds in most of the examples here, where global histogram equalization tends to oversaturate the image. For adaptive equalization, I used a tile grid size of 8x8 and a contrast limit of 2.
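The global variant can be sketched in a few lines of NumPy by mapping each intensity through the normalized cumulative histogram. This is an illustrative sketch for a single 8-bit channel; the function name `equalize_global` is hypothetical, and how the project applies equalization to the color channels is not stated in the write-up. The adaptive variant, which works tile by tile with a contrast limit, is not shown here.

```python
import numpy as np

def equalize_global(image):
    """Global histogram equalization for an 8-bit single-channel image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()                 # cumulative pixel count per intensity
    cdf_min = cdf[cdf > 0][0]           # count at the darkest intensity present
    total = cdf[-1]
    if total == cdf_min:                # constant image: nothing to spread out
        return image.copy()
    # Map intensities so the darkest present value goes to 0 and the brightest to 255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]
```

Because a single lookup table is applied to the whole image, a large bright region such as sky dominates the mapping, which is consistent with the overexposure seen in the global results.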

[Figure panels: Lugano (Aligned), Lugano (Global), Lugano (Adaptive)]
Lugano image with different histogram equalization methods applied.
Adaptive equalization maintains the colors of the original image while making it slightly more vibrant.
[Figure panels: Onion Church (Aligned), Church (Global), Church (Adaptive)]
Church image with different histogram equalization methods applied.
The sky and the church building are overexposed with global histogram equalization.
[Figure panels: Monastery (Aligned), Monastery (Global), Monastery (Adaptive)]
Monastery image with different histogram equalization methods applied.
The colors of the building and trees are clearer with adaptive histogram equalization.

Full Results

Offset of best result: Red (7, -1), Green (1, -1)
[Figure panels: Unaligned, L2 Aligned, NCC Aligned, L2 Aligned with Canny, With Global Histogram Equalization, With Adaptive Histogram Equalization]

Sources