Supplementary Materials and Code

A Framework for Depth Video Reconstruction from a Subset of Samples and Its Applications

The growth of depth sensing systems over the past decade has facilitated a variety of applications in computer vision. Depending on the system configuration, both direct and indirect sensing techniques encounter image processing issues such as hole filling and depth map super-resolution. In this paper, a framework for depth video reconstruction from a subset of samples is proposed. By recasting classical dense depth estimation as two separate problems, sensing and synthesis, we propose a motion compensation assisted sampling (MCAS) scheme and a spatio-temporal depth reconstruction (STDR) algorithm for reconstructing depth video sequences from a subset of samples. Using an extensible three-dimensional dictionary, the 3D-DWT, and applying the alternating direction method of multipliers (ADMM), the proposed STDR algorithm scales with the temporal volume and processes large-scale depth data efficiently. By exploiting temporal information and the corresponding RGB images, the proposed MCAS scheme achieves efficient one-stage sampling. Experimental results show that the proposed depth reconstruction framework outperforms existing methods and is competitive with our previous work [1], which requires a pilot signal in a two-stage sampling scheme. Finally, to infer reliable depth samples from varying input sources, we present an inference approach using geometric and color similarities. Applications to depth video super-resolution from uniform-grid subsampled data and to dense disparity video estimation from a subset of reliable samples are presented.
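As a rough illustration of the sparse-reconstruction idea behind STDR, the sketch below recovers a single depth frame from a random subset of samples via iterative shrinkage-thresholding (ISTA, a simpler relative of ADMM), with an orthonormal multi-level 2D Haar wavelet standing in for the 3D-DWT dictionary. Everything here (the single-frame setting, the Haar basis, the parameter values) is illustrative only and is not the paper's actual STDR implementation.

```python
import numpy as np

def haar2(x):
    # one level of the orthonormal 2D Haar transform (rows, then columns)
    a, d = (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)
    x = np.concatenate([a, d], axis=0)
    a, d = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2), (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    return np.concatenate([a, d], axis=1)

def ihaar2(c):
    # exact inverse of haar2 (undo columns, then rows)
    n = c.shape[1] // 2
    x = np.empty_like(c)
    x[:, 0::2] = (c[:, :n] + c[:, n:]) / np.sqrt(2)
    x[:, 1::2] = (c[:, :n] - c[:, n:]) / np.sqrt(2)
    m = x.shape[0] // 2
    out = np.empty_like(x)
    out[0::2] = (x[:m] + x[m:]) / np.sqrt(2)
    out[1::2] = (x[:m] - x[m:]) / np.sqrt(2)
    return out

def fwt(x, levels):
    # multi-level transform: keep splitting the low-pass (top-left) band
    c, n = x.copy(), x.shape[0]
    for _ in range(levels):
        c[:n, :n] = haar2(c[:n, :n])
        n //= 2
    return c

def iwt(c, levels):
    x, n = c.copy(), c.shape[0] >> (levels - 1)
    for _ in range(levels):
        x[:n, :n] = ihaar2(x[:n, :n])
        n *= 2
    return x

def soft(c, t):
    # soft-thresholding: the proximal operator of the l1 norm
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def reconstruct(samples, mask, lam=0.01, iters=300, levels=5):
    # ISTA: gradient step on the data-fidelity term, then shrinkage
    # in the wavelet domain to promote sparsity (float inputs assumed)
    x = samples.copy()
    for _ in range(iters):
        x = x - mask * (x - samples)                # pull sampled pixels back to measurements
        x = iwt(soft(fwt(x, levels), lam), levels)  # wavelet-domain shrinkage
    return x
```

Each iteration alternates between restoring the measured pixels and shrinking wavelet coefficients, so information propagates from sampled to unsampled pixels through the coarse scales of the transform.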

Reproducible Matlab Code
  • Matlab Code (*.rar format) [download] (~464 MB)
    • "*.mat" Data for Matlab Code (*.rar format) [download] (~39.8 MB)
    • Unzip the data file and use it to replace the directory ".\Matlab_Code\results_matfiles\"

Experiments on Parameter Selection (Section 3-C)

  • 2D plots of MAE curves obtained by sweeping the parameters (lambda, beta)
    Test cases: Tanks (5th frame), Tanks (72nd-73rd frames), Books (10th-12th frames), Books (22nd-25th frames), Temples (88th-92nd frames), each for temporal window sizes T = 1, 2, 3, 4, 5

Synthetic Data Comparisons (Table III)

Table III: Comparisons for the video reconstruction algorithm. All methods are evaluated over the whole depth video.
(Additional) Table III: Comparisons for the video reconstruction algorithm. All methods are evaluated over the whole depth video.

Real Data Comparisons (Stereo Images)

  • Comparisons for depth video reconstruction algorithms.
  • Experimental results in videos: realdata.avi (~27.8 MB)

Real Data Comparisons (Kinectdata)

  • Color Image (380x540), Depth Image (190x270)
  • Kinect dataset: Kinectdata (~77 MB)
    • Users may directly run "Figure11_Video_Synthesize_kinectdata.m" in the provided code to view the video sequence
RGB Image | Input Depth (190x270) | Proposed Method (190x270) | Guided Filter [4] | Hawe [5] | Triangular Interp.

Depth Video Reconstruction from Uniform-Grid Subsampled (LR) Depth Video

Comparisons:
Table in Figure 12: Performance comparisons for depth video reconstruction from uniform-grid subsampled depth data.
(71st frame) Ground Truth | Sampling Map (MCAS) | Reconstructed Depth (MCAS + STDR) | Ferstl et al. [2] | Bicubic
(30th frame) Ground Truth | Sampling Map (MCAS) | Reconstructed Depth (MCAS + STDR) | Ferstl et al. [2] | Bicubic
(21st frame) Ground Truth | Sampling Map (MCAS) | Reconstructed Depth (MCAS + STDR) | Ferstl et al. [2] | Bicubic
Visual Comparisons: Temporal Consistency
(60th-63rd frames) Ground Truth | Sampling Map (MCAS) | Reconstructed Depth (MCAS + STDR) | Ferstl et al. [2] | Bicubic
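For reference, uniform-grid subsampling as used in this experiment keeps every k-th pixel in each dimension. A minimal sketch (the function name and interface below are illustrative, not taken from the released code):

```python
import numpy as np

def uniform_grid_subsample(depth, k):
    # Keep every k-th pixel of a depth frame (simulating a low-resolution
    # sensor) and return the corresponding full-resolution sampling map.
    mask = np.zeros(depth.shape, dtype=np.uint8)
    mask[::k, ::k] = 1
    return depth[::k, ::k], mask
```

The returned sampling map can be fed to a reconstruction algorithm as the set of known pixel locations.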

Disparity Video Estimation

Comparisons
Columns: D_w3x3 | D_w5x5 | D_w7x7 | D_w9x9 | Reliable Set | Sampling Map | Proposed | Ground Truth (three test sequences)
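The D_w3x3 through D_w9x9 columns above are disparity maps computed with different matching window sizes. One plausible way to form a reliable set from them is to keep only pixels where all window sizes agree; this cross-window agreement criterion is an assumption for illustration, not necessarily the paper's geometric-and-color-similarity inference:

```python
import numpy as np

def reliable_set(disparity_maps, tol=1.0):
    # Pixels where disparity estimates from all matching window sizes
    # agree within `tol` are kept as reliable samples; the rest are
    # left for the reconstruction algorithm to fill in.
    stack = np.stack(disparity_maps)
    return (stack.max(axis=0) - stack.min(axis=0)) <= tol
```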

Temporal Consistency Comparing to Motion Compensation

Comparisons
(22nd-24th frames) Motion Compensation [3] | Proposed | Ground Truth
(14th-16th frames) Motion Compensation [3] | Proposed | Ground Truth
(25th-27th frames) Motion Compensation [3] | Proposed | Ground Truth


[1] L.-K. Liu, S. H. Chan, and T. Q. Nguyen, "Depth reconstruction from sparse samples: Representation, algorithm, and sampling," IEEE Trans. Image Process., vol. 24, no. 6, pp. 1983-1996, Jun. 2015.

[2] D. Ferstl, C. Reinbacher, R. Ranftl, M. Ruether, and H. Bischof, "Image guided depth upsampling using anisotropic total generalized variation," in Proc. IEEE Int. Conf. on Computer Vision (ICCV), Dec. 2013, pp. 993-1000.

[3] S. H. Chan, D. T. Vo, and T. Q. Nguyen, "Subpixel motion estimation without interpolation," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Mar. 2010, pp. 722-725.

[4] K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397-1409, Jun. 2013.

[5] S. Hawe, M. Kleinsteuber, and K. Diepold, "Dense disparity maps from sparse disparity measurements," in Proc. IEEE Int. Conf. on Computer Vision (ICCV), Nov. 2011, pp. 2126-2133.