
floornet's Introduction

FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans

By Chen Liu*, Jiaye Wu*, and Yasutaka Furukawa (* indicates equal contribution)

Introduction

This paper proposes FloorNet, a novel neural network that turns RGBD videos of indoor spaces into vector-graphics floorplans. FloorNet consists of three branches: a PointNet branch, a Floorplan branch, and an Image branch. For more details, please refer to our ECCV 2018 paper or visit our project website. This is a follow-up to our floorplan transformation project, which you can find here.

Updates

[12/22/2018] We now provide a free IP solver (not relying on Gurobi) in IP.py. The functionality of IP.py should be similar to that of QP.py, which uses Gurobi to solve the IP problem. Consider the free solver if you don't have a Gurobi license.

Dependencies

Python 2.7, TensorFlow (>= 1.3), numpy, OpenCV 3, CUDA (>= 8.0), Gurobi (free only for academic use).

Data

Dataset used in the paper

We collected 155 scans of residential units and annotated the corresponding floorplan information. Of the 155 scans, 135 are used for training and 20 for testing. We converted the data to tfrecords files, which can be downloaded here (or here if you cannot access the previous link). Please put the downloaded files under the folder data/.

Here are the links to the raw point clouds, annotations, and their associations. Please refer to RecordWriterTango.py to see how to convert the raw data and annotations to tfrecords files.

Using custom data

To generate training/testing data from another data source, the data should be converted to tfrecords as is done in RecordWriterTango.py (an example of our raw data before being processed by RecordWriterTango.py is provided here). Please refer to this guide for how to generate and read tfrecords.

Basically, every data sample (tf.train.Example) should contain at least the following components:

  1. Inputs:

    • a point cloud (50,000 randomly sampled points)
    • a mapping from the point cloud's 3D space to the 2D space of the 256x256 top-view density image.
      • It contains 50,000 indices, one for each point.
      • For point (x, y, z), index = round((y - min(Y) + padding) / (maxRange + 2 * padding) * 256) * 256 + round((x - min(X) + padding) / (maxRange + 2 * padding) * 256).
        • maxRange = max(max(X) - min(X), max(Y) - min(Y))
        • padding can be any small value, say 0.05 * maxRange
    • optional: image features of the RGB video stream, if the image branch is enabled
  2. Labels:

    • Corners and their corresponding types
    • Total number of corners
    • A ground-truth icon segmentation map
    • A ground-truth room segmentation map

Again, please refer to RecordWriterTango.py for exact details.
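The index computation above can be sketched in numpy as follows. The function name and the clipping at the image border are our own additions for illustration, not taken from RecordWriterTango.py:

```python
import numpy as np

def point_to_pixel_indices(points, resolution=256, padding_ratio=0.05):
    """Map 3D points to flattened indices in a top-view density image.

    points: (N, 3) array of (x, y, z) coordinates.
    Returns an (N,) int array of row-major indices into the
    resolution x resolution image, following the formula above.
    """
    xs, ys = points[:, 0], points[:, 1]
    max_range = max(xs.max() - xs.min(), ys.max() - ys.min())
    padding = padding_ratio * max_range
    scale = resolution / (max_range + 2 * padding)
    cols = np.round((xs - xs.min() + padding) * scale).astype(np.int64)
    rows = np.round((ys - ys.min() + padding) * scale).astype(np.int64)
    # Clamp to the image bounds to guard against rounding at the far edge.
    cols = np.clip(cols, 0, resolution - 1)
    rows = np.clip(rows, 0, resolution - 1)
    return rows * resolution + cols
```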

NEW: We added a template file, RecordWriterCustom.py, for using custom data.

Annotator

For reference, a similar (but not identical) annotator written in Python is here. You will need to make some changes to annotate your own data.

Training

To train the network from scratch, please run:

python train.py --restore=0

Evaluation

To evaluate the performance of our trained model, please run:

python train.py --task=evaluate --separateIconLoss

Generate 3D models

We can pop up the reconstructed floorplan to generate 3D models. Please refer to our previous project, FloorplanTransformation, for more details.

Contact

If you have any questions, please contact me at [email protected].

floornet's People

Contributors

art-programmer, chenliu-wustl, koykl, woodfrog


floornet's Issues

Image Data From tfrecord File

Hi,

Is it possible to extract the raw perspective views (original RGBD images) from the tfrecords file? From what I understand, only the features extracted by the Image branch of FloorNet are available. Am I correct?

"metadata.t7" not find

@KoykL @art-programmer

Following RecordWriterTango.py, I want to write my own data into tfrecords and evaluate it, but I don't know where metadata.t7 (used in writeExample) comes from. Is it an intermediate result of machine learning? Is it used to turn the point cloud into a top view?

Question about checkpoint

Did you pre-train the network and supply the pre-trained model in the folder named 'checkpoint'?

I tried to use your checkpoint to test your data, but I got a very bad prediction result, which is the same as the input point cloud. (python train.py --task=test --startIteration=10000)

I also tried to train the model myself. However, I found many training samples where the point cloud and the labels do not match. (I got your tfrecord data from Mega, "Tango_train.tfrecords".) Did I train the network in a wrong way?
(screenshots: data_not_match_1, data_not_match_2)

Error in RecordReader.py during training

Hi,
I followed the readme and tried running train.py with those two files in the data folder. I am getting this error:
Traceback (most recent call last):
File "train.py", line 1522, in
train(args)
File "train.py", line 759, in train
dataset_train = getDatasetTrain(filenames_train, options.augmentation, '4' in options.branches, options.batchSize)
File "/home/FloorNet/RecordReader.py", line 202, in getDatasetTrain
return tf.data.TFRecordDataset(filenames[0]).repeat().map(functools.partial(parse_fn, augmentation=augmentation, readImageFeatures=readImageFeatures), num_parallel_calls=NUM_THREADS).batch(batchSize).prefetch(1)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 840, in map
return ParallelMapDataset(self, map_func, num_parallel_calls)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1857, in init
super(ParallelMapDataset, self).init(input_dataset, map_func)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1826, in init
self._map_func.add_to_graph(ops.get_default_graph())
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/function.py", line 488, in add_to_graph
self._create_definition_if_needed()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/function.py", line 321, in _create_definition_if_needed
self._create_definition_if_needed_impl()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/function.py", line 338, in _create_definition_if_needed_impl
outputs = self._func(*inputs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1791, in tf_map_func
ret = map_func(nested_args)
File "/home/FloorNet/RecordReader.py", line 75, in parse_fn
point_indices, corners, heatmaps = tf.cond(tf.logical_or(tf.equal(flags[0], 0), tf.equal(flags[0], 4)), lambda: augmentWarping(point_indices, corners, heatmaps, gridStride=32, randomScale=2), lambda: (point_indices, corners, heatmaps))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2047, in cond
orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 1897, in BuildCondBranch
original_result = fn()
File "/home/FloorNet/RecordReader.py", line 75, in
point_indices, corners, heatmaps = tf.cond(tf.logical_or(tf.equal(flags[0], 0), tf.equal(flags[0], 4)), lambda: augmentWarping(point_indices, corners, heatmaps, gridStride=32, randomScale=2), lambda: (point_indices, corners, heatmaps))
File "/home/FloorNet/augmentation_tf.py", line 93, in augmentWarping
xsTarget, ysTarget = warpIndices(pointcloudIndices % width, pointcloudIndices / width, gridStride, gridWidth, gridHeight, width, height, gridXsTarget, gridYsTarget)
File "/home/FloorNet/augmentation_tf.py", line 20, in warpIndices
topLeftXsTarget = tf.gather_nd(gridXsTarget, topLeft)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2840, in gather_nd
"GatherNd", params=params, indices=indices, name=name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 609, in _apply_op_helper
param_name=input_name)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'indices' has DataType float64 not in list of allowed values: int32, int64

Can you please point me in the right direction?

Disable Gurobi

Is there a simple way to avoid using Gurobi?
I have already graduated, so I cannot apply for a free academic license, and a Gurobi license is rather expensive for me.
Does not using Gurobi significantly affect the results, or does it only affect the running time?

Question about metadata

Where can we find the definitions of 'topDownTransformation', 'topDownViewAngle', 'videoOrientation' included in "metadata.t7", so that we could attempt to run FloorNet on data that we acquire on our own?

  1. Do we correctly understand from issue #4 that 'topDownViewAngle' is a rotation angle about the Z' axis?
  2. Do we correctly infer that 'videoOrientation' might take values 1 or 2 depending on whether the video is in landscape or portrait orientation, respectively?
  3. What about the topDownTransformation? Where is the information that it contains?

evaluate.py not run

I first ran:
python train.py --batchSize=3
and then ran:
python evaluate.py
However, that script could not be run directly, so I changed the command to:
python train.py --task=evaluate --separateIconLoss
Is this correct?

Alternatives to Gurobi

Hello!
It seems that Gurobi is being used, which is not completely free. Are there any alternatives for evaluating the model without Gurobi? It would also be better if this dependency were mentioned in the README file.

Question about 'reconstructFloorplan'.

The output of model training already includes renderings of the icons and rooms, and information such as corners has already been identified. Why is reconstructFloorplan still needed? Is the logic here still related to the model?

Usage of image branch and possible scaling bug

I am interested in investigating results with and without the image branch of the network. As far as I can tell, you do not provide a checkpoint that utilizes this branch, so I trained the network with the branch enabled on my own.

First of all, I want to know if I am correct that corner_acc.npy and topdown_acc.npy, which you provide within your example, are the weights corresponding to the pretraining of the DRN and HG networks, and that we should therefore use them as-is for both training and validation/testing purposes.

Next, I want to point out a possible bug related to image features. I realized that within RecordWriterTango.py, the RGB values are scaled twice.

One time during loading:

color = color.astype(np.float32) / 255

Second time during processing:

points[:, 3:] = points[:, 3:] / 255 - 0.5

This results in RGB values within the range (-0.5, -0.49607843).
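The arithmetic can be verified independently of the repository code: applying both scalings in sequence collapses the full 8-bit range to that interval.

```python
import numpy as np

# Simulate the two scalings applied in sequence to every possible
# 8-bit color value.
color = np.arange(256, dtype=np.float32)
color = color / 255            # first scaling, at load time
color = color / 255 - 0.5      # second scaling, during processing

print(color.min(), color.max())  # -0.5 and approximately -0.496078
```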

It can be reproduced with the following snippet (similar for the training records):

import numpy as np
import tensorflow as tf
from RecordReader import getDatasetVal

record_filepath = 'Tango_val.tfrecords'
dataset = getDatasetVal([record_filepath], '', True, 1)
iterator = dataset.make_one_shot_iterator()
input_dict, gt_dict = iterator.get_next()

pt = input_dict['points'][0]
points = tf.Session().run([pt])[0]
RGB = points[:, 3:6]
print(np.amin(RGB), np.amax(RGB))

Could such an issue affect how the values are interpreted by the network? Is this related to the already pretrained weights?

Thanks in advance

How can I get a 3D model of the indoor space with a Tango phone?

How did you obtain the obj files of the indoor 3D models with a Tango phone?

Your code handles a 3D model of the interior space. How can I get a 3D model of our own room? Does the Tango phone have an application for scanning 3D models and generating obj files? Also, is metadata.t7 generated by the Tango phone as well?

Questions about floorplan.txt

How is the floorplan.txt file obtained?

  1. If it is manually labeled, is there a labeling tool? I did not see code that generates floorplan.txt in the FloorplanAnnotator.
  2. Which image is the floorplan.txt annotation based on? Is it based on the density map image of the 3D model?

FloorPlan Image

How can I get the final generated floorplan, like the one shown in the video on your website?

After I run evaluate, some images and files are generated under the test directory:
result_door.png result_icon.png result_line.png
doors_out.txt icons_out.txt points_out.txt
Or do I need to read these generated files and produce the floorplan myself?

