Depth Estimation and Boundary Extraction

This project implements a multi-step pipeline for extracting boundary depth from images. The pipeline estimates a depth map from a single (monocular) image, uses that depth map to back-project the image into a 3D triangular mesh, and then extracts planar surfaces to approximate the scene boundary. The approach follows the methodology described in the paper "LooseControl: Lifting ControlNet for Generalized Depth Conditioning."

Installation

Before running the script, make sure a Python environment exists in the project directory and is activated, then install the dependencies:

pip install -r requirements.txt

How It Works:

  1. Depth Map Estimation: The first step is to estimate the depth map of the given image using a monocular depth estimator. In the provided code, this is accomplished using the DepthEstimator class, which utilizes the DPTForDepthEstimation model from the transformers library. The input image is processed and passed through the model to obtain a depth map.

  2. 3D Triangular Mesh Back-Projection: Once the depth map is obtained, the next step is to back-project the image into a 3D triangular mesh within the world space. This involves converting the depth map into a set of 3D points that represent the scene geometry. In the provided code, this step is performed by the createObj method within the BoundaryDepthExtractor class, which generates a 3D object file (model.obj) from the depth map.

  3. Vertical Plane Extraction: For efficiency during training, the code focuses only on vertical planes. This reduces the 3D boundary extraction problem to a simpler 2D problem. The verticalPlaneExtraction method in the BoundaryDepthExtractor class is responsible for this step, although the actual implementation is not provided in the code snippet.

  4. Orthographic Projection: The 3D mesh of the scene is projected onto a horizontal plane using orthographic projection. Because only vertical planes are kept, each of them collapses to a line segment in this top-down view, which makes it straightforward to delineate the 2D boundary that encloses the scene. The orthogonalProjection method in the BoundaryDepthExtractor class performs this step by projecting the 3D points onto the horizontal plane.

  5. 2D Boundary Delineation: After projection, the next step is to delineate the 2D boundary that encapsulates the scene. This is achieved by determining the convex hull of the projected points, which represents the outer boundary of the scene. The boundaryDelineation method in the BoundaryDepthExtractor class performs this step.

  6. Polygon Approximation: The 2D boundary is then approximated with a polygon to simplify the representation. This approximation is done using the Douglas-Peucker algorithm, which reduces the number of points in the boundary while maintaining its overall shape. The polygonApproximation method in the BoundaryDepthExtractor class performs this step.
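Steps 2, 4, 5, and 6 can be sketched in dependency-free Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names (back_project, project_to_floor, convex_hull, douglas_peucker) are hypothetical, the camera is assumed to be a pinhole model with intrinsics fx, fy, cx, cy, the vertical axis is assumed to be the camera's y axis, and the hull is computed with Andrew's monotone chain, which may differ from the routine that boundaryDelineation actually calls.

```python
import math

def back_project(depth, fx, fy, cx, cy):
    """Lift a depth map (list of rows) to 3D camera-space points (pinhole model)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def project_to_floor(points):
    """Orthographic projection onto the horizontal plane: drop the vertical (y) axis."""
    return [(x, z) for x, _, z in points]

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            # Pop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def douglas_peucker(curve, epsilon):
    """Simplify a polyline, dropping points closer than epsilon to the chord."""
    if len(curve) < 3:
        return list(curve)
    (x1, y1), (x2, y2) = curve[0], curve[-1]
    chord = math.hypot(x2 - x1, y2 - y1)

    def dist(p):
        if chord == 0.0:  # closed curve: fall back to distance from the start point
            return math.hypot(p[0] - x1, p[1] - y1)
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / chord

    index, dmax = max(((i, dist(curve[i])) for i in range(1, len(curve) - 1)),
                      key=lambda t: t[1])
    if dmax <= epsilon:
        return [curve[0], curve[-1]]
    left = douglas_peucker(curve[:index + 1], epsilon)
    right = douglas_peucker(curve[index:], epsilon)
    return left[:-1] + right
```

A typical call chain under these assumptions would be convex_hull(project_to_floor(back_project(depth, fx, fy, cx, cy))), followed by douglas_peucker on the hull with its first vertex appended at the end so that the curve is closed. The tolerance epsilon is in the same units as the projected coordinates and controls how aggressively vertices are dropped.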

Input (1242 x 822):

[Image: emptyRoom]

Polygon Approximation:

[Image: screenshot of the polygon approximation]

Re-rendered Image (1242 x 822):

[Image: screenshot of the re-rendered image]

Camera Parameters and Vertices (saved as a JSON file):

{
    "camera": {
        "extrinsic": [
            [
                1.0,
                0.0,
                0.0,
                0.03777360018751525
            ],
            [
                -0.0,
                -1.0,
                -0.0,
                0.0074896122114004515
            ],
            [
                -0.0,
                -0.0,
                -1.0,
                0.9023693128733252
            ],
            [
                0.0,
                0.0,
                0.0,
                1.0
            ]
        ],
        "intrinsic": {
            "width": 1242,
            "height": 822,
            "fx": 711.8728819108087,
            "fy": 711.8728819108087,
            "cx": 620.5,
            "cy": 410.5,
            "intrinsic_matrix": [
                [
                    711.8728819108087,
                    0.0,
                    620.5
                ],
                [
                    0.0,
                    711.8728819108087,
                    410.5
                ],
                [
                    0.0,
                    0.0,
                    1.0
                ]
            ]
        }
    },
    "vertices": [
        [
            [
                0.461700439453125,
                -0.45669275522232056
            ]
        ],
        [
            [
                -0.5396929979324341,
                -0.4580613672733307
            ]
        ],
        [
            [
                -0.5212900042533875,
                0.35700321197509766
            ]
        ],
        [
            [
                0.43428346514701843,
                0.36788055300712585
            ]
        ]
    ]
}
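
As a rough guide to consuming this file, the sketch below parses an abbreviated copy of the sample above, rebuilds the 3x3 intrinsic matrix from fx, fy, cx, and cy, and flattens the nested "vertices" entries (each point is wrapped in an extra list, similar to the N x 1 x 2 layout OpenCV uses for contours) into plain (x, y) pairs. The helper name load_boundary and the derived horizontal field of view are illustrative assumptions; the key names are taken from the sample.

```python
import json
import math

# Abbreviated copy of the sample file above: only the fields used here.
sample = """
{
  "camera": {"intrinsic": {"width": 1242, "height": 822,
                           "fx": 711.8728819108087, "fy": 711.8728819108087,
                           "cx": 620.5, "cy": 410.5}},
  "vertices": [[[0.461700439453125, -0.45669275522232056]],
               [[-0.5396929979324341, -0.4580613672733307]],
               [[-0.5212900042533875, 0.35700321197509766]],
               [[0.43428346514701843, 0.36788055300712585]]]
}
"""

def load_boundary(text):
    """Return (K, polygon, hfov_deg) from the JSON text (hypothetical helper)."""
    data = json.loads(text)
    intr = data["camera"]["intrinsic"]
    # Pinhole intrinsic matrix rebuilt from the scalar fields.
    K = [[intr["fx"], 0.0, intr["cx"]],
         [0.0, intr["fy"], intr["cy"]],
         [0.0, 0.0, 1.0]]
    # Each vertex is stored as [[x, y]]; unwrap the extra nesting level.
    polygon = [tuple(v[0]) for v in data["vertices"]]
    # Horizontal field of view implied by the image width and fx.
    hfov = math.degrees(2.0 * math.atan(intr["width"] / (2.0 * intr["fx"])))
    return K, polygon, hfov

K, polygon, hfov = load_boundary(sample)
```

In the sample, cx and cy equal (width - 1) / 2 and (height - 1) / 2, which places the principal point at the image center.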
