In frequency-division duplex (FDD) massive MIMO systems, one critical challenge is that the mobiles must feed back a large downlink channel matrix to the base station, which creates substantial signaling overhead. Estimating a large downlink channel matrix at the mobile may also be costly in terms of power and memory consumption. Prior work addresses these issues using appropriate angle parameterization and compressed sensing techniques, but this approach requires solving a challenging, and sometimes extremely large, sparse inverse problem, which is difficult to solve to global optimality and often incurs unaffordable memory and computational costs. In this work, we propose an alternative framework that exploits the fact that double-directional channels for mmWave massive MIMO usually have low rank. The base station estimates the downlink channel by recovering a low-rank matrix from compressed samples of the channel matrix that are fed back by the mobiles. This way, the mobile users avoid performing resource-consuming tasks. In addition, the number of feedback measurements can be much smaller than the size of the channel matrix without losing channel recovery guarantees. Further, the low-rank estimation problem at the base station has a manageable size that scales gracefully with the channel size. Based on the new model, we propose two methods for channel estimation, based on iterative optimization and deep learning, respectively. Compared with the state of the art, the optimization method reduces computational complexity by 10x and the deep learning approach by up to 1000x, while both achieve high estimation quality in the very-low-sample regime.
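To make the low-rank recovery idea concrete, the following is a minimal sketch of recovering a low-rank matrix from a subset of its entries. It is not the optimization method proposed in this work: it uses a generic rank-constrained iterative imputation (repeated truncated-SVD projection), and every detail in it is an assumption made for illustration, including the function names `rank_r_projection` and `complete_low_rank`, the 64x64 channel size, the rank of 2, the random (rather than geometric) channel model, the noiseless entrywise sampling, and the 35% sampling ratio.

```python
import numpy as np


def rank_r_projection(X, r):
    """Return the best rank-r approximation of X (truncated SVD)."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vh[:r, :]


def complete_low_rank(Y, mask, r, n_iter=500):
    """Estimate a rank-r matrix from the entries of Y where mask is True.

    Each iteration keeps the observed entries, fills the unobserved ones
    with the current estimate, and projects the result back onto the set
    of rank-r matrices.
    """
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        X = rank_r_projection(np.where(mask, Y, X), r)
    return X


# Hypothetical setup: a random complex rank-2 "channel" (assumed sizes).
rng = np.random.default_rng(0)
n_rx, n_tx, r = 64, 64, 2
left = rng.standard_normal((n_rx, r)) + 1j * rng.standard_normal((n_rx, r))
right = rng.standard_normal((r, n_tx)) + 1j * rng.standard_normal((r, n_tx))
H = left @ right

# Only about 35% of the entries are observed (assumed sampling ratio).
mask = rng.random(H.shape) < 0.35
Y = np.where(mask, H, 0)

H_hat = complete_low_rank(Y, mask, r)
rel_err = np.linalg.norm(H_hat - H) / np.linalg.norm(H)
print(f"observed fraction: {mask.mean():.2f}, relative error: {rel_err:.2e}")
```

The sketch assumes the rank is known and the samples are noise-free; in practice the rank would have to be estimated or regularized, and the sampling pattern would follow the feedback scheme actually used.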