Detail-preserving depth estimation from a single image based on modified fully convolutional residual network and gradient network