Comments (4)

yxlao commented on May 18, 2024

Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap?

As far as I know, COLMAP's reconstruction of points and cameras is not physically scaled. That is, the scale is relative (arbitrary): the reconstruction is only determined up to an unknown global scale factor.

  • Extrinsic properties: You have to obtain the physical scale manually, or provide physical-scale camera poses to COLMAP so that it reconstructs physical-scale points.
  • Intrinsic properties: The same applies to your question about "pixels scale in mm". You either have to know the physical specifications of your camera in advance, or use one of the camera calibration techniques by capturing a known pattern in physical space.
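
One way to recover the scale manually is to measure a real distance between two reconstructed points and rescale everything by the ratio. Here is a minimal sketch of that idea in plain NumPy. The function name `rescale_reconstruction` and all numeric values are hypothetical and only for illustration; this is not a COLMAP or camtools API.

```python
import numpy as np


def rescale_reconstruction(points, camera_centers, p_a, p_b, known_distance):
    """Uniformly scale points and camera centers so that |p_a - p_b| equals known_distance."""
    reconstructed_distance = np.linalg.norm(np.asarray(p_a) - np.asarray(p_b))
    if reconstructed_distance == 0:
        raise ValueError("The two reference points must be distinct.")
    scale = known_distance / reconstructed_distance
    return np.asarray(points) * scale, np.asarray(camera_centers) * scale, scale


# Hypothetical reconstruction in arbitrary units.
points = np.array([[0.0, 0.0, 2.0], [0.5, 0.0, 2.0], [0.0, 1.0, 4.0]])
camera_centers = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]])

# Assumption: points[0] and points[1] were measured to be 0.25 m apart in the real scene.
points_m, centers_m, scale = rescale_reconstruction(
    points, camera_centers, points[0], points[1], known_distance=0.25
)
print(scale)
```

A uniform scale leaves the camera rotations and the intrinsics in pixels unchanged; only the 3D points and the camera positions (translations) are scaled.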

from camtools.

yxlao commented on May 18, 2024

I think what you mean by the "depth of the image plane" is the distance from the camera center to the image plane. This distance is referred to as the focal length, and there are two types of focal length representations: focal length in pixels and physical focal length in metric space.

TLDR: Typically, the focal length is expressed in pixels in computer vision, as specified in the intrinsic camera matrix $K$. If you want to compute the physical focal length, you'll need additional information, including the sensor size (in metric units) and the resolution of the camera.

Let's break it down. Assume you have the camera intrinsic matrix $K$:

$$ K=\left[\begin{array}{ccc} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{array}\right] $$
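
As a rough sketch of how $K$ is used, the snippet below projects a 3D point in camera coordinates to pixel coordinates. The helper name `project_point` and the numeric values of $K$ are made up for illustration.

```python
import numpy as np


def project_point(K, point_cam):
    """Project a 3D point in camera coordinates to pixel coordinates (u, v)."""
    point_cam = np.asarray(point_cam, dtype=float)
    if point_cam[2] <= 0:
        raise ValueError("The point must be in front of the camera (z > 0).")
    uvw = K @ point_cam
    return uvw[:2] / uvw[2]


# Hypothetical intrinsics: fx = fy = 500 px, principal point at (320, 240).
K = np.array(
    [
        [500.0, 0.0, 320.0],
        [0.0, 500.0, 240.0],
        [0.0, 0.0, 1.0],
    ]
)

pixel = project_point(K, (0.2, -0.1, 2.0))
# The same point with all coordinates doubled lands on the same pixel.
pixel_scaled = project_point(K, (0.4, -0.2, 4.0))
print(pixel, pixel_scaled)
```

Both calls give the pixel (370, 215): the projection depends only on ratios such as x/z, so nothing in $K$ tells you the physical scale of the scene or of the sensor.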

  • Focal Length in Pixels: $f_x$ and $f_y$ in the intrinsic camera matrix $K$ are the focal lengths in pixels. These values do not carry any physical scale: scaling the focal length and the sensor size by the same factor does not change the image projection at all. This is the most common representation in computer vision, where we usually don't care about the physical size of the sensor or the physical focal length.
  • Physical Focal Length: The physical focal length is the focal length of the lens in metric units (e.g., millimeters). To convert between the physical focal length and the focal length in pixels, you need to know the sensor size in metric units and the resolution of the camera. To compute the metric focal length from the pixel focal length, use the following formulas:

$$ f_{metric_x} = \frac{f_x}{resolution_x} \times sensor_x $$

$$ f_{metric_y} = \frac{f_y}{resolution_y} \times sensor_y $$

Where:

  • $f_x$ and $f_y$ are the given focal lengths in pixels along the x and y axes, respectively.
  • $resolution_x$ and $resolution_y$ are the resolution of the camera sensor in pixels along the width (x-axis) and height (y-axis).
  • $sensor_x$ and $sensor_y$ are the physical sizes of the sensor along the width and height in metric units (typically millimeters).
  • $f_{metric_x}$ and $f_{metric_y}$ are the calculated physical focal lengths in metric units along the x and y dimensions, respectively.
  • Also, you may assume that $f_x = f_y$ for typical cameras (uniform square pixels, symmetric lens).
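
To make the formulas concrete, here is a small sketch of the conversion in both directions. The function names and the camera numbers (a 36 mm wide sensor with 6000 pixels across and $f_x$ = 8000 px) are hypothetical; use the specifications of your own camera.

```python
def focal_px_to_mm(f_px, resolution_px, sensor_mm):
    """Metric focal length: f_metric = f_px / resolution * sensor."""
    return f_px / resolution_px * sensor_mm


def focal_mm_to_px(f_mm, resolution_px, sensor_mm):
    """Inverse conversion: f_px = f_metric / sensor * resolution."""
    return f_mm / sensor_mm * resolution_px


# Hypothetical camera: 36 mm wide sensor, 6000 px wide image, fx = 8000 px.
sensor_x_mm = 36.0
resolution_x = 6000
f_x = 8000.0

f_metric_x = focal_px_to_mm(f_x, resolution_x, sensor_x_mm)
pixel_pitch_mm = sensor_x_mm / resolution_x  # physical size of one pixel
f_x_roundtrip = focal_mm_to_px(f_metric_x, resolution_x, sensor_x_mm)

print(f_metric_x, pixel_pitch_mm, f_x_roundtrip)
```

With these numbers the metric focal length comes out at about 48 mm and the pixel size at about 0.006 mm. Note that the pixel size is just $sensor_x / resolution_x$, so it also has to come from the camera's specifications and cannot be read from $K$ alone.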

yxlao commented on May 18, 2024

If you want to project depth images to 3D as point clouds, you may use the functions in ct.project. Typically you'll need the intrinsic and extrinsic camera parameters to project a depth image to 3D. Also, pay attention to the depth image format, as it could be in different units or different scales.

import open3d as o3d
import camtools as ct
import json
import numpy as np

from pathlib import Path


def main():
    # Get paths.
    redwood = o3d.data.SampleRedwoodRGBDImages()
    im_color_path = Path(redwood.color_paths[0])
    im_depth_path = Path(redwood.depth_paths[0])
    camera_intrinsic_path = Path(redwood.camera_intrinsic_path)

    # Load K (intrinsic).
    with open(camera_intrinsic_path, "r") as f:
        camera_intrinsic = json.load(f)
    K = np.array(camera_intrinsic["intrinsic_matrix"]).reshape(3, 3).T

    # Load T (extrinsic), assume identity.
    T = np.eye(4)

    # Load images and depths.
    im_color = ct.io.imread(im_color_path)
    im_depth = ct.io.imread_depth(im_depth_path, depth_scale=1000.0)

    # Create point cloud.
    points, colors = ct.project.im_depth_im_color_to_points_colors(
        im_depth=im_depth, im_color=im_color, K=K, T=T
    )

    # Visualize.
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.colors = o3d.utility.Vector3dVector(colors)
    o3d.visualization.draw_geometries([pcd])


if __name__ == "__main__":
    main()

This should give you:
[Image: Screenshot from 2024-04-07 15-40-49]
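
For reference, the back-projection itself can be sketched in plain NumPy. This is only an illustration of the math, not the camtools implementation. It assumes that `T` is a 4x4 world-to-camera matrix, that the depth image stores z-depth along the optical axis in metric units, that pixels with depth 0 are invalid, and that pixel coordinates are integer indices. The function name `depth_to_points` is made up.

```python
import numpy as np


def depth_to_points(im_depth, K, T=None):
    """Back-project a depth image (H, W) to world-space points (N, 3)."""
    if T is None:
        T = np.eye(4)
    h, w = im_depth.shape
    # u is the column index (x-axis), v is the row index (y-axis).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = im_depth > 0
    z = im_depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    points_cam = np.stack([x, y, z], axis=1)
    # World-to-camera is p_cam = R @ p_world + t, so p_world = R.T @ (p_cam - t).
    R, t = T[:3, :3], T[:3, 3]
    return (points_cam - t) @ R


# Tiny hypothetical example: a 2x2 depth image with one invalid pixel.
im_depth = np.array([[1.0, 0.0], [2.0, 2.0]])
K = np.array([[2.0, 0.0, 0.5], [0.0, 2.0, 0.5], [0.0, 0.0, 1.0]])
points = depth_to_points(im_depth, K)
print(points)
```

Each valid pixel $(u, v)$ with depth $z$ maps to $x = (u - c_x) z / f_x$ and $y = (v - c_y) z / f_y$. Since $f_x$ and $f_y$ are in pixels, the units of the resulting points are set entirely by the units of the depth image.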

cs-mshah commented on May 18, 2024

Thanks. The explanation was really helpful. But I wanted to know the actual focal length in mm, since I want to back-project my points to the depth of the image plane itself. Is there a way to know the size of the pixel in mm or the $sensor_x$, $sensor_y$ for a SIMPLE_PINHOLE or PINHOLE camera used by colmap? Or should I just assume the standard: 1px = 0.264mm?
