Effective fault feature extraction is the key to fault diagnosis. Previous work has shown that some embedding methods and unsupervised deep learning methods, such as PCA and deep autoencoders, can extract fault features directly from raw signals. In particular, deep autoencoders have been shown to effectively extract the hidden 'trend' associated with machinery health states, which can be used directly for online anomaly detection and prediction. In practical online fault diagnosis, however, successive signals are hard to discriminate because degradation progresses slowly and external noise is present. It is therefore important to optimize the feature extraction process to achieve better online fault tracking. This paper proposes a regularized deep clustering algorithm, which combines an embedding method with semi-guided learning, to guide the optimization of feature extraction. A regularization term on the cluster center points is introduced so that the optimized features converge to a monotonic linear trend. To verify the effectiveness of the method, an accelerated gearbox run-to-failure experiment is carried out. The results show that the method improves on the fault features extracted by the deep autoencoder in two respects: better short-term distinction between fault features and a more consistent long-term trend of gear wear.
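
The abstract does not state the form of the regularization term on the cluster centers. As a purely illustrative sketch, and not the formulation used in this paper, one way such a term could encourage a linear trend is to penalize the second differences of the centers once they are ordered by degradation stage. The function names, the second-difference form, and the assumption that the centers are already ordered are all hypothetical.

```python
import numpy as np


def linear_trend_penalty(centers):
    """Hypothetical regularizer: sum of squared second differences.

    centers: array of shape (K, d); rows are cluster centers, ASSUMED to be
    ordered by degradation stage. The value is zero exactly when the centers
    are an affine function of the stage index, i.e. equally spaced along a
    straight line in feature space.
    """
    centers = np.asarray(centers, dtype=float)
    second_diff = centers[2:] - 2.0 * centers[1:-1] + centers[:-2]
    return float(np.sum(second_diff ** 2))


def linear_trend_penalty_grad(centers):
    """Gradient of linear_trend_penalty with respect to every center."""
    centers = np.asarray(centers, dtype=float)
    second_diff = centers[2:] - 2.0 * centers[1:-1] + centers[:-2]
    grad = np.zeros_like(centers)
    grad[2:] += 2.0 * second_diff     # each term touches center k+2 ...
    grad[1:-1] -= 4.0 * second_diff   # ... center k+1 (coefficient -2) ...
    grad[:-2] += 2.0 * second_diff    # ... and center k
    return grad
```

In a deep clustering setup, a term of this kind would typically be added to the clustering loss with a weight chosen by the user, so that gradient steps on the centers pull them toward an evenly spaced line while the clustering loss keeps them close to the data.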