Firstly, thanks a ton for making this library. It is extremely helpful.

[Help] Getting the depth of the image plane (camtools)

cs-mshah commented on May 18, 2024

[Help] Getting the depth of the image plane


Comments (4)

yxlao commented on May 18, 2024

> Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap?

As far as I know, COLMAP's reconstruction of points and cameras is not physically scaled. That is, the scale is relative (or arbitrary) as we don't know the physical scale of COLMAP's reconstruction.
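A minimal sketch of why the scale cannot be recovered from images alone: scaling all 3D points and the camera translation by the same factor leaves every projected pixel unchanged. The intrinsics, pose, and points below are made-up values for illustration, and the `project` helper is hypothetical (it is not part of COLMAP or camtools).

```python
import numpy as np


def project(K, R, t, points_world):
    """Project (N, 3) world points to (N, 2) pixels, with x_cam = R @ x_world + t."""
    points_cam = points_world @ R.T + t
    pixels_h = points_cam @ K.T
    return pixels_h[:, :2] / pixels_h[:, 2:3]


# Made-up intrinsics, pose, and points (illustration only).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, -0.2, 0.5])
points = np.array([[0.0, 0.0, 2.0],
                   [0.5, -0.3, 3.0],
                   [-1.0, 0.4, 4.0]])

# Scale the whole scene (points and camera translation) by an arbitrary factor.
scale = 7.3
pixels_original = project(K, R, t, points)
pixels_scaled = project(K, R, scale * t, scale * points)

# The two sets of pixels agree, so the images carry no information about scale.
scale_invariant = np.allclose(pixels_original, pixels_scaled)
print(scale_invariant)
```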

Extrinsic properties: You have to calibrate manually to obtain a physical scale, or provide physical-scale camera poses to COLMAP so that it reconstructs physical-scale points.
Intrinsic properties: The same applies to your question about the "pixel size in mm". You either have to know the physical specifications of your camera in advance, or use a camera calibration technique that captures a known pattern in physical space.
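One common way to do the manual calibration is to measure a real distance between two reconstructed points and rescale everything by the ratio. The sketch below assumes you already have the reconstructed points and camera centers as NumPy arrays; the `rescale_reconstruction` helper, the point indices, and the 150 mm measurement are all hypothetical.

```python
import numpy as np


def rescale_reconstruction(points, camera_centers, idx_a, idx_b, known_distance):
    """Rescale a reconstruction so that points idx_a and idx_b are known_distance apart.

    The output units are whatever units known_distance is given in.
    """
    measured = np.linalg.norm(points[idx_a] - points[idx_b])
    scale = known_distance / measured
    # Camera centers scale with the points. In a world-to-camera convention
    # (t = -R @ C), the translation t scales by the same factor; R is unchanged.
    return scale * points, scale * camera_centers, scale


# Made-up reconstruction in arbitrary units.
points = np.array([[0.0, 0.0, 1.0],
                   [0.3, 0.0, 1.0],
                   [0.0, 0.4, 2.0]])
camera_centers = np.array([[0.0, 0.0, 0.0],
                           [0.1, 0.0, 0.0]])

# Assumption: points 0 and 1 were measured to be 150 mm apart in the real scene.
points_mm, centers_mm, scale = rescale_reconstruction(
    points, camera_centers, idx_a=0, idx_b=1, known_distance=150.0
)
print(scale)
```

The intrinsic matrix $K$ does not need to change after rescaling, since the focal length in pixels is independent of the scene scale.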


yxlao commented on May 18, 2024

I think what you mean by the "depth of the image plane" is the distance from the camera center to the image plane. This distance is referred to as the focal length, and there are two types of focal length representations: focal length in pixels and physical focal length in metric space.

TLDR: In computer vision, the focal length is typically expressed in pixels, as specified in the intrinsic camera matrix $K$. If you want to compute the physical focal length, you'll need additional information, including the sensor size (in metric units) and the resolution of the camera.

Let's break it down. Assume you have the camera intrinsic matrix $K$:

$$ K=\left[\begin{array}{ccc} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{array}\right] $$

Focal Length in Pixels: $f_x$ and $f_y$ in the intrinsic camera matrix $K$ are the focal lengths in pixels. These values do not have any physical scale. You can imagine that scaling the focal length and the sensor size by the same factor will not change the image projection relationship at all. This is the most common representation in computer vision, as we usually don't care about the physical size of the sensor or the physical focal length.
Physical Focal Length: The physical focal length is the focal length of the lens in metric units (e.g., millimeters). To convert between the physical focal length and the focal length in pixels, you'll need to know the sensor size in metric units and the resolution of the camera. To compute the metric focal length from the pixel focal length, you would use the following formulas:

$$ f_{metric_x} = \frac{f_x}{resolution_x} \times sensor_x $$

$$ f_{metric_y} = \frac{f_y}{resolution_y} \times sensor_y $$

Where:

$f_x$ and $f_y$ are the given focal lengths in pixels along the x and y axes, respectively.
$resolution_x$ and $resolution_y$ are the resolution of the camera sensor in pixels along the width (x-axis) and height (y-axis).
$sensor_x$ and $sensor_y$ are the physical sizes of the sensor along the width and height in metric units (typically millimeters).
$f_{metric_x}$ and $f_{metric_y}$ are the calculated physical focal lengths in metric units along the x and y dimensions, respectively.
Also, you may assume that $f_x = f_y$ for typical cameras (uniform square pixels, symmetric lens).
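As a quick worked example of the formulas above, here is a small sketch. The camera numbers (a 6000 x 4000 image from a 36 mm x 24 mm sensor with $f_x = f_y = 5000$ pixels) are hypothetical, and the helper names are made up for illustration.

```python
def focal_px_to_mm(f_px, resolution_px, sensor_mm):
    """Metric focal length: f_metric = f_px / resolution * sensor size."""
    return f_px / resolution_px * sensor_mm


def pixel_pitch_mm(resolution_px, sensor_mm):
    """Physical size of one pixel along one axis, in millimeters."""
    return sensor_mm / resolution_px


# Hypothetical camera: 6000 x 4000 pixels, 36 mm x 24 mm sensor, fx = fy = 5000 px.
f_mm_x = focal_px_to_mm(5000.0, 6000, 36.0)  # about 30 mm
f_mm_y = focal_px_to_mm(5000.0, 4000, 24.0)  # about 30 mm
pitch_x = pixel_pitch_mm(6000, 36.0)         # about 0.006 mm per pixel

print(f_mm_x, f_mm_y, pitch_x)
```

Note that the sensor size has to come from the camera's specifications or from calibration; it cannot be derived from $K$ alone. For comparison, 0.264 mm per pixel is roughly 25.4 mm / 96, which is the 96 DPI screen convention, and is much larger than the few-micrometer pixel pitch of typical camera sensors.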


yxlao commented on May 18, 2024

If you want to project depth images to 3D as point clouds, you may use the functions in ct.project. Typically you'll need the intrinsic and extrinsic camera parameters to project a depth image to 3D. Also, pay attention to the depth image format, as it could be in different units or different scales.

```python
import open3d as o3d
import camtools as ct
import json
import numpy as np

from pathlib import Path


def main():
    # Get paths.
    redwood = o3d.data.SampleRedwoodRGBDImages()
    im_color_path = Path(redwood.color_paths[0])
    im_depth_path = Path(redwood.depth_paths[0])
    camera_intrinsic_path = Path(redwood.camera_intrinsic_path)

    # Load K (intrinsic).
    with open(camera_intrinsic_path, "r") as f:
        camera_intrinsic = json.load(f)
    K = np.array(camera_intrinsic["intrinsic_matrix"]).reshape(3, 3).T

    # Load T (extrinsic), assume identity.
    T = np.eye(4)

    # Load images and depths.
    im_color = ct.io.imread(im_color_path)
    im_depth = ct.io.imread_depth(im_depth_path, depth_scale=1000.0)

    # Create point cloud.
    points, colors = ct.project.im_depth_im_color_to_points_colors(
        im_depth=im_depth, im_color=im_color, K=K, T=T
    )

    # Visualize.
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors)
    o3d.visualization.draw_geometries([pcd])


if __name__ == "__main__":
    main()
```

Running this should open an Open3D window showing the colored point cloud.
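For intuition, the back-projection step can be sketched in plain NumPy. This is not the camtools implementation, only an illustration under stated assumptions: the depth is z-depth (distance along the camera's optical axis), `T` is a 4x4 world-to-camera matrix, and a depth of 0 marks invalid pixels. The `depth_to_points` helper is hypothetical.

```python
import numpy as np


def depth_to_points(im_depth, K, T):
    """Back-project an (H, W) z-depth image to (N, 3) world points.

    Assumptions: T maps world to camera coordinates, and depth <= 0 is invalid.
    """
    height, width = im_depth.shape
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    valid = im_depth > 0
    z = im_depth[valid]

    # Homogeneous pixel coordinates (u, v, 1) for each valid pixel.
    pixels_h = np.stack([u[valid], v[valid], np.ones_like(z)], axis=1)

    # Camera-space points: z * K^-1 @ [u, v, 1].
    points_cam = (pixels_h @ np.linalg.inv(K).T) * z[:, None]

    # Camera space to world space with the inverse extrinsic.
    points_cam_h = np.hstack([points_cam, np.ones((len(z), 1))])
    return (points_cam_h @ np.linalg.inv(T).T)[:, :3]


# Tiny made-up example: a 3x3 depth image with constant depth 2.0.
K = np.array([[100.0, 0.0, 1.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)
im_depth = np.full((3, 3), 2.0)
points = depth_to_points(im_depth, K, T)
print(points.shape)
```

With a constant depth image, all points land on a plane parallel to the image plane. If $f_x = f_y = f$ and the constant depth is set to $f$, pixel $(u, v)$ maps to $(u - c_x, v - c_y, f)$ in camera coordinates, i.e. the image plane itself, expressed in pixel units.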


cs-mshah commented on May 18, 2024

Thanks. The explanation was really helpful. But I wanted to know the actual focal length in mm, since I want to back-project my points to the depth of the image plane itself. Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap? Or should I just assume the standard: 1px = 0.264mm?
