This paper presents a computational framework for learning autonomous guidance behavior in unknown or partially known environments. The framework combines receding-horizon trajectory optimization with a spatial value function. The spatial value function describes optimal (for example, minimum-time) guidance behavior as the cost and velocity, at any point in geographical space, for reaching a specified goal state. For guidance in unknown environments, a local spatial value function based on the current vehicle state is updated online using environment data from onboard exteroceptive sensors. The advantage of the proposed framework is that it learns information directly relevant to optimal guidance and control behavior, which enables optimal trajectory planning in unknown or partially known environments. The framework is evaluated by measuring performance over successive runs in three-dimensional indoor flight simulations. The simulated test vehicle is a Blade-Cx2 coaxial miniature helicopter, and the environment is a priori unknown to the learning system. The paper investigates how performance, dynamic behavior, the spatial value function, and control behavior in the body frame change as a result of learning over successive runs.
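To make the notion of a spatial value function concrete, the following minimal sketch computes a minimum-time cost-to-go and a preferred direction of travel for every free cell of a small occupancy grid, and then recomputes both after new obstacles are detected. It is an illustration under strong simplifying assumptions that are not taken from the paper: a two-dimensional, 4-connected grid, a constant vehicle speed, and a Dijkstra search in place of the trajectory optimization. The names `spatial_value_function`, `occupied`, `speed`, and `cell` are hypothetical.

```python
import heapq
import math


def spatial_value_function(occupied, goal, speed=1.0, cell=1.0):
    """Return (cost, heading) maps for a 2D occupancy grid.

    cost[r][c]    : minimum travel time from cell (r, c) to the goal
    heading[r][c] : unit grid step (dr, dc) to take from (r, c) toward the goal

    Assumptions (illustrative only): 4-connected grid, constant speed,
    and a goal cell that is free.
    """
    rows, cols = len(occupied), len(occupied[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    heading = [[None] * cols for _ in range(rows)]
    cost[goal[0]][goal[1]] = 0.0
    heap = [(0.0, goal)]
    step_time = cell / speed  # time to cross one cell

    # Dijkstra search expanding outward from the goal.
    while heap:
        c, (r, k) = heapq.heappop(heap)
        if c > cost[r][k]:
            continue  # stale queue entry
        for dr, dk in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nk = r + dr, k + dk
            if 0 <= nr < rows and 0 <= nk < cols and not occupied[nr][nk]:
                nc = c + step_time
                if nc < cost[nr][nk]:
                    cost[nr][nk] = nc
                    # From the neighbor, the best move points back to (r, k).
                    heading[nr][nk] = (-dr, -dk)
                    heapq.heappush(heap, (nc, (nr, nk)))
    return cost, heading


# Initially the 3x3 environment is believed to be free of obstacles.
grid = [[False] * 3 for _ in range(3)]
cost_before, _ = spatial_value_function(grid, goal=(0, 0))

# Obstacles detected by a (hypothetical) sensor: update the map and recompute.
grid[0][1] = grid[1][1] = True
cost_after, heading_after = spatial_value_function(grid, goal=(0, 0))

print(cost_before[0][2], cost_after[0][2])
```

In this toy example the cost-to-go from the top-right cell rises once the detected obstacles block the direct route, which mirrors, in a highly simplified form, how newly sensed environment data changes the value function and hence the guidance behavior.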