The locus at which a vector harboring a product transgene integrates into the genome can have a profound effect on the transgene’s transcript level and on the stability of the resulting cell line. The SAM filtering pipeline (SFP) was created to identify the integration site(s) of a transfected vector from next-generation genome sequencing data. It is best suited to targeted sequence data, such as data from sequence capture of probed vector regions. It also works with whole-genome sequencing data, but memory requirements grow with the number of reads in the data set and can be large. The pipeline requires a bwa-mem mapped .sam file as input.
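
To make the idea of filtering a .sam file for integration evidence concrete, the following is a minimal sketch and not the SFP's actual implementation. It assumes the vector sequence was included in the bwa-mem reference under a single contig name (the name `vector` below is hypothetical), and it treats a record as a junction candidate when the read and its mate, or the read and one of its supplementary alignments (the `SA` tag), fall on the vector and on some other contig. The function `junction_candidates` is an illustrative name, not part of the pipeline.

```python
from typing import Iterable, Iterator


def junction_candidates(sam_lines: Iterable[str], vector_name: str) -> Iterator[str]:
    """Yield SAM records that link the vector contig to a different contig.

    Illustrative only: vector_name is an assumed contig name for the vector
    in the reference that bwa-mem mapped against.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname, rnext = fields[2], fields[6]
        if rname == "*":
            continue  # read is unmapped
        partners = set()
        # RNEXT: "=" means same contig as RNAME, "*" means no mate information
        if rnext not in ("=", "*"):
            partners.add(rnext)
        # SA:Z: lists supplementary alignments as rname,pos,strand,CIGAR,mapQ,NM;
        for tag in fields[11:]:
            if tag.startswith("SA:Z:"):
                for entry in tag[5:].split(";"):
                    if entry:
                        partners.add(entry.split(",")[0])
        partners.discard(rname)
        on_vector = rname == vector_name
        if (on_vector and partners) or (not on_vector and vector_name in partners):
            yield line.rstrip("\n")
```

Because the function is a generator over lines, it can be fed an open file handle (for example `junction_candidates(open("mapped.sam"), "vector")`, with a hypothetical file name) and examines one record at a time rather than loading the whole file.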