Scene reconstruction from multiple viewpoints is not always possible; in fact, it is feasible in only a small minority of potential applications, which range from robotic manipulators to drones and autonomous vehicles. To overcome this limitation, we propose a fully convolutional 3D neural network that reconstructs a full scene from a single depth image by building a 3D representation of it, automatically filling holes and inserting hidden elements. We evaluated our algorithm on a real-world dataset of tabletop scenes acquired with a Kinect and processed with KinectFusion to obtain ground truth for network training and evaluation. Extensive measurements show that our deep neural network architecture outperforms the previous state of the art in both precision and recall on the scene reconstruction task.
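
To illustrate the two metrics named above, the following is a minimal sketch of how precision and recall could be computed for a reconstructed scene. It assumes that both the prediction and the ground truth are represented as sets of occupied voxel indices; the abstract does not state the exact evaluation protocol, and the function name `voxel_precision_recall` and the toy data are hypothetical.

```python
from typing import Iterable, Set, Tuple

# A voxel is identified by its integer grid index (x, y, z).
Voxel = Tuple[int, int, int]


def voxel_precision_recall(
    predicted: Iterable[Voxel], ground_truth: Iterable[Voxel]
) -> Tuple[float, float]:
    """Return (precision, recall) over occupied voxels.

    Assumption: a predicted voxel counts as correct only if the same
    grid cell is occupied in the ground truth (exact match, no tolerance).
    """
    pred: Set[Voxel] = set(predicted)
    truth: Set[Voxel] = set(ground_truth)

    # Voxels occupied in both the prediction and the ground truth.
    true_positives = len(pred & truth)

    # Precision: fraction of predicted voxels that are actually occupied.
    precision = true_positives / len(pred) if pred else 0.0
    # Recall: fraction of ground-truth voxels that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy example: four occupied ground-truth voxels, three predicted voxels,
# two of which coincide with the ground truth.
ground_truth = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)}
predicted = {(0, 0, 0), (1, 0, 0), (1, 1, 0)}

precision, recall = voxel_precision_recall(predicted, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case, precision penalizes the spurious voxel `(1, 1, 0)`, while recall penalizes the two ground-truth voxels that were missed. A hole-filling method is expected to raise recall, and reporting both values shows whether it does so without hallucinating occupied space.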