Image-Based 3D Reconstruction in Rerun

How to easily visualize COLMAP’s 3D reconstruction in Rerun

4 min read · May 3, 2024

This tutorial focuses on visualisation and provides complete code for visualising an image-based 3D reconstruction from COLMAP with the open-source visualisation tool Rerun.

A short video clip was processed offline with the COLMAP pipeline. The result was then visualised in Rerun, which shows the individual camera frames, the estimated camera poses, and the resulting point cloud over time. COLMAP produced a detailed reconstruction of the scene in the video, and Rerun made it possible to step through that reconstruction frame by frame.

In this tutorial, you’ll learn:

How to read the sparse COLMAP reconstruction
How to visualize camera frames
How to visualize the estimated camera poses
How to visualize point clouds

Complete Code

rerun/examples/python/structure_from_motion at docs-latest · rerun-io/rerun


COLMAP

COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline. It offers a wide range of features for the reconstruction of ordered and unordered image collections.

If you’re curious about the topic of image-based 3D reconstruction, I highly suggest the tutorial provided by COLMAP’s team.

To read the sparse COLMAP reconstruction, you will need the read_model function, which is provided by ETH Zurich and UNC Chapel Hill. You can find it here [1].

from __future__ import annotations

from pathlib import Path

from read_write_model import Camera, read_model

def read_and_log_sparse_reconstruction(dataset_path: Path, filter_output: bool, resize: tuple[int, int] | None) -> None:
    print("Reading sparse COLMAP reconstruction")
    cameras, images, points3D = read_model(dataset_path / "sparse", ext=".bin")
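
The filter_output flag in the signature above suggests that noisy points can be dropped before logging. Here is a minimal sketch of one such filter, assuming points3D is a dict keyed by point id (as returned by read_model) and that each point exposes rgb and image_ids. The reduced Point3D stand-in, the filter_points helper, and the threshold of 4 views are illustrative assumptions, not necessarily what the complete code uses.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for the Point3D tuple from read_write_model.py,
# so that the sketch runs without a COLMAP dataset.
Point3D = namedtuple("Point3D", ["id", "xyz", "rgb", "error", "image_ids"])


def filter_points(points3D, min_views=4):
    """Keep points with a non-black colour that are seen in more than `min_views` images."""
    return {
        point_id: point
        for point_id, point in points3D.items()
        if any(point.rgb) and len(point.image_ids) > min_views
    }


points3D = {
    1: Point3D(1, (0.0, 0.0, 1.0), (200, 10, 10), 0.4, [1, 2, 3, 4, 5]),
    2: Point3D(2, (1.0, 0.0, 2.0), (0, 0, 0), 0.3, [1, 2, 3, 4, 5]),  # black: dropped
    3: Point3D(3, (0.0, 1.0, 3.0), (10, 200, 10), 1.9, [1, 2]),  # short track: dropped
}
filtered = filter_points(points3D)
print(sorted(filtered))
```

Points observed in only a few images tend to be poorly triangulated, so a filter like this trades completeness for a cleaner point cloud.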

Logging and Visualising with Rerun

Rerun is a visualisation tool that consists of an SDK and a viewer for logging, visualising and interacting with multimodal data streams. The SDK provides a simple interface to log timestamped multimodal data, which can then be visualized and interacted with in the Rerun viewer.

Key advantages of Rerun:

It’s free and open-source
Supported by an active community
Usable from C++, Python, and Rust
Developer-friendly interface

Timelines

All data logged with Rerun in the following sections is tied to a specific frame. Setting the frame index on the frame timeline before logging associates every subsequent log call with that frame, so the viewer can step through the reconstruction frame by frame.

rr.set_time_sequence("frame", frame_idx)
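
The frame index has to come from somewhere. One option, sketched below, is to parse it from the image file name. This assumes the video frames were exported with the frame number in the name; the file names and the frame_index_from_name helper are hypothetical.

```python
import re


def frame_index_from_name(image_name):
    """Return the first run of digits in the file name as an integer, or None."""
    match = re.search(r"\d+", image_name)
    return int(match.group(0)) if match else None


names = ["frame_0012.jpg", "frame_0003.jpg", "frame_0007.jpg"]
# Sort by file name so that frames are visited in capture order.
frame_indices = [frame_index_from_name(name) for name in sorted(names)]
print(frame_indices)
```

Each resulting index would then be passed as frame_idx to the rr.set_time_sequence call before the data for that frame is logged.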

Logged images (left) and logged images with points (right) | Image by Author

Images with 2D Points

The images are logged as the Image archetype to the camera/image entity.

rr.log("camera/image", rr.Image(rgb).compress(jpeg_quality=75))

The 2D image points that are used to triangulate the 3D points can be visualized by logging them as Points2D to the camera/image/keypoints entity. Note that these keypoints are a child of the camera/image entity, since the points should show in the image plane.

rr.log("camera/image/keypoints", rr.Points2D(visible_xys, colors=[34, 138, 167]))
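
The visible_xys variable above holds only the keypoints that contributed to a 3D point. Here is a rough sketch of how such a selection could be made, assuming xys and point3D_ids are parallel sequences as in COLMAP's image records, where an id of -1 marks a keypoint without a triangulated point. The visible_keypoints helper and its scale parameter (for resized images) are hypothetical.

```python
def visible_keypoints(xys, point3D_ids, points3D, scale=(1.0, 1.0)):
    """Return (xys, ids) for keypoints that have a triangulated 3D point."""
    visible_xys, visible_ids = [], []
    for (x, y), point_id in zip(xys, point3D_ids):
        # Skip keypoints without a 3D point (-1) or whose point was filtered out.
        if point_id != -1 and point_id in points3D:
            # Rescale so the keypoints still line up with a resized image.
            visible_xys.append((x * scale[0], y * scale[1]))
            visible_ids.append(point_id)
    return visible_xys, visible_ids


xys = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
point3D_ids = [7, -1, 9]
points3D = {7: "point 7"}  # id 9 is assumed to have been filtered out earlier
visible_xys, visible_ids = visible_keypoints(xys, point3D_ids, points3D, scale=(0.5, 0.5))
print(visible_xys, visible_ids)
```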

Pinhole Camera with 3D Points

To visualize the images in 3D, both the pinhole projection and the camera pose have to be logged.

The Pinhole camera is logged to the camera/image entity and defines the intrinsics of the camera, that is, how to go from the 3D camera frame to the 2D image plane. The extrinsics are logged as a Transform3D to the camera entity.

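quat_xyzw = image.qvec[[1, 2, 3, 0]]  # COLMAP stores quaternions as wxyz, Rerun expects xyzw
# COLMAP's pose maps world to camera, hence from_parent=True
rr.log("camera", rr.Transform3D(translation=image.tvec, rotation=rr.Quaternion(xyzw=quat_xyzw), from_parent=True))
rr.log(
    "camera/image",
    rr.Pinhole(
        resolution=[camera.width, camera.height],
        focal_length=camera.params[:2],
        principal_point=camera.params[2:],
    ),
)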
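
To make the roles of extrinsics and intrinsics concrete, the following sketch projects a single world point into pixel coordinates in plain Python. It assumes COLMAP's convention that qvec (in w, x, y, z order) and tvec map world coordinates into the camera frame, and a PINHOLE camera whose params are fx, fy, cx, cy, which matches the slicing used above. The helper names and the numbers are illustrative only.

```python
def quat_wxyz_to_xyzw(qvec):
    """Reorder a quaternion from COLMAP's w, x, y, z to x, y, z, w."""
    w, x, y, z = qvec
    return (x, y, z, w)


def rotation_matrix(qvec):
    """Rotation matrix for a unit quaternion given in w, x, y, z order."""
    w, x, y, z = qvec
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def project(point_world, qvec, tvec, params):
    """Project a world point to pixel coordinates with a PINHOLE camera."""
    R = rotation_matrix(qvec)
    # Extrinsics: world frame -> camera frame.
    x, y, z = (sum(R[i][j] * point_world[j] for j in range(3)) + tvec[i] for i in range(3))
    fx, fy, cx, cy = params
    # Intrinsics: camera frame -> image plane.
    return (fx * x / z + cx, fy * y / z + cy)


qvec = (1.0, 0.0, 0.0, 0.0)  # identity rotation
tvec = (0.0, 0.0, 2.0)
params = (800.0, 800.0, 320.0, 240.0)  # fx, fy, cx, cy
pixel = project((0.5, -0.25, 2.0), qvec, tvec, params)
print(quat_wxyz_to_xyzw(qvec), pixel)
```

In the viewer, Rerun applies the same two steps when it places the image plane and the keypoints inside the 3D scene.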

The coloured 3D points were added to the visualisation by logging the Points3D archetype to the points entity.

rr.log("points", rr.Points3D(points, colors=point_colors), rr.AnyValues(error=point_errors))
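
The points, point_colors and point_errors variables above need to be assembled per frame. A minimal sketch, assuming a list of visible point ids and a points3D dict whose values expose xyz, rgb and error; the reduced Point3D stand-in and the gather_point_data helper are hypothetical.

```python
from collections import namedtuple

# Hypothetical, reduced stand-in for the Point3D tuple from read_write_model.py.
Point3D = namedtuple("Point3D", ["xyz", "rgb", "error"])


def gather_point_data(visible_ids, points3D):
    """Collect positions, colours and errors for the given point ids, in order."""
    visible = [points3D[point_id] for point_id in visible_ids]
    points = [p.xyz for p in visible]
    point_colors = [p.rgb for p in visible]
    point_errors = [p.error for p in visible]
    return points, point_colors, point_errors


points3D = {
    7: Point3D((0.0, 0.0, 1.0), (255, 0, 0), 0.5),
    9: Point3D((1.0, 0.0, 2.0), (0, 255, 0), 1.5),
}
points, point_colors, point_errors = gather_point_data([9, 7], points3D)
print(points, point_colors, point_errors)
```

Keeping the three lists in the same order matters, because Rerun matches positions, colours and the extra error values by index.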

Reprojection error

For each image, a Scalar archetype containing the average reprojection error of the keypoints is logged to the plot/avg_reproj_err entity.

rr.log("plot/avg_reproj_err", rr.Scalar(np.mean(point_errors)))
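
For intuition, a reprojection error is the pixel distance between an observed keypoint and the projection of its 3D point into the image. In the article's code the per-point errors come from the COLMAP reconstruction itself; the sketch below only illustrates what such an error measures and how an average is formed, using made-up coordinates and a hypothetical reprojection_error helper.

```python
import math
from statistics import mean


def reprojection_error(observed_xy, projected_xy):
    """Euclidean distance in pixels between an observed and a projected keypoint."""
    return math.dist(observed_xy, projected_xy)


observed = [(100.0, 50.0), (200.0, 80.0)]
projected = [(103.0, 54.0), (200.0, 80.0)]
errors = [reprojection_error(o, p) for o, p in zip(observed, projected)]
avg_error = mean(errors)
print(errors, avg_error)
```

A frame whose average error spikes in the plot is a hint that its pose or its keypoint matches may be less reliable.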

Beyond Structure from Motion

If you found this article useful and insightful, there’s more!

Human Pose Tracking with MediaPipe in 2D and 3D: Rerun Showcase

How to easily visualise MediaPipe’s human pose tracking with Rerun


Real-Time Hand Tracking and Gesture Recognition with MediaPipe: Rerun Showcase

How to visualise MediaPipe’s Hand Tracking and Gesture Recognition with Rerun