In a nutshell: we propose a method for point cloud densification (from a camera, an IMU, and a range sensor) that generalizes well across different sensor platforms. The figure at this link illustrates our improvement over existing works: https://github.com/alexklwong/calibrated-backprojection-network/blob/master/figures/overview_teaser.gif
The slightly longer version: previous methods, when trained on one sensor platform, have trouble generalizing to different ones when deployed in the wild. This is because they overfit to the sensors used to collect the training set. Our method takes an image, a sparse point cloud, and the camera calibration as input, which allows us to use a different calibration at test time. This significantly improves generalization to novel scenes captured by sensors different from those used during training. Among our innovations is a "calibrated backprojection layer" that imposes a strong inductive bias on the network (as opposed to trying to learn everything from the data). This design allows our method to achieve the state of the art in both indoor and outdoor scenarios while using a smaller model and running inference faster.
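To make the role of calibration concrete, here is a minimal sketch of standard pinhole backprojection in plain Python. It is only the textbook geometry, not the network layer itself and not code from the repo; the function name `backproject`, the nested-list depth format, and the toy intrinsics are assumptions made up for illustration.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject(depth: List[List[float]],
                fx: float, fy: float,
                cx: float, cy: float) -> List[Point3D]:
    """Lift valid depth pixels to 3D camera coordinates (pinhole model).

    depth: H x W nested list; values <= 0 mean "no measurement".
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    (Hypothetical helper for illustration only.)
    """
    points = []
    for v, row in enumerate(depth):        # v indexes rows (image y)
        for u, z in enumerate(row):        # u indexes columns (image x)
            if z <= 0:
                continue                   # skip pixels without a depth value
            x = (u - cx) * z / fx          # inverse of u = fx * x / z + cx
            y = (v - cy) * z / fy          # inverse of v = fy * y / z + cy
            points.append((x, y, z))
    return points


# Same pixel and same depth, two different (toy) calibrations.
sparse_depth = [[0.0, 2.0]]
points_cam_a = backproject(sparse_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
points_cam_b = backproject(sparse_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

The same pixel with the same depth lands at a different 3D point once the intrinsics change, which is the intuition behind feeding the calibration in as an input rather than leaving the network to absorb one camera's geometry from the training data.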
For those interested, here are the links:
paper: https://arxiv.org/pdf/2108.10531.pdf
code (pytorch): https://github.com/alexklwong/calibrated-backprojection-network