Spike sorting has long been used to obtain the activity of single neurons from multi-unit recordings by extracting spikes from continuous data and assigning them to putative neurons. A large body of spike sorting algorithms has been developed; these typically project spikes into a low-dimensional feature space and cluster them through iterative computations. However, no consensus has been reached on the optimal feature space or the best way of segmenting spikes into clusters, so human intervention is often required. It is hence desirable to use human knowledge effectively and efficiently in spike sorting while keeping manual intervention to a minimum. Furthermore, the iterative computations commonly involved in clustering are inherently slow and hinder real-time processing of large-scale recordings. In this paper, we propose a novel few-shot spike sorting paradigm that employs a deep adversarial representation neural network to learn from a handful of annotated spikes and robustly classify unseen spikes that share similar properties with the labeled ones. Once trained, the deep neural network implements a parametric function that analytically encodes the categorical distribution of spike clusters; its evaluation can be significantly accelerated by GPUs, supporting real-time processing of hundreds of thousands of recording channels. The paradigm also includes a clustering routine, termed DidacticSort, to aid users in labeling the spikes that will be used to train the deep neural network. We have validated the performance of the proposed paradigm with both synthetic and in vitro datasets.
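
To illustrate why a trained parametric classifier avoids the cost of iterative clustering, the following minimal sketch evaluates a categorical distribution over clusters for a batch of spikes in a single forward pass. It is not the adversarial representation network proposed here: the two-layer architecture, the random weights standing in for trained parameters, and the names `classify_spikes`, `n_samples`, `n_hidden`, and `n_clusters` are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def classify_spikes(spikes, params):
    """Map spike waveforms to per-cluster probabilities in one forward pass.

    spikes: array of shape (n_spikes, n_samples)
    params: tuple (w1, b1, w2, b2) of a hypothetical two-layer network
    returns: array of shape (n_spikes, n_clusters); each row sums to 1
    """
    w1, b1, w2, b2 = params
    hidden = np.maximum(spikes @ w1 + b1, 0.0)  # ReLU hidden layer
    return softmax(hidden @ w2 + b2)


# Hypothetical sizes: 48 samples per waveform, 32 hidden units, 3 clusters.
rng = np.random.default_rng(0)
n_samples, n_hidden, n_clusters = 48, 32, 3

# Random weights stand in for parameters that training would provide.
params = (
    rng.normal(scale=0.1, size=(n_samples, n_hidden)),
    np.zeros(n_hidden),
    rng.normal(scale=0.1, size=(n_hidden, n_clusters)),
    np.zeros(n_clusters),
)

# Synthetic noise stands in for extracted spike waveforms.
spikes = rng.normal(size=(1000, n_samples))

probs = classify_spikes(spikes, params)  # categorical distribution per spike
labels = probs.argmax(axis=1)            # most probable cluster per spike
```

Each spike costs a fixed number of matrix multiplications with no convergence loop, which is the property that makes batched evaluation on a GPU straightforward.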