Experiences of landing machine learning onto market-scale mobile malware detection

Liangyi Gong, Zhenhua Li, Feng Qian, Zifan Zhang, Qi Alfred Chen, Zhiyun Qian, Hao Lin, Yunhao Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

App markets, being crucial and critical for today's mobile ecosystem, have also become a natural malware delivery channel since they actually "lend credibility" to malicious apps. In the past decade, machine learning (ML) techniques have been explored for automated, robust malware detection. Unfortunately, to date, we have yet to see an ML-based malware detection solution deployed at market scales. To better understand the real-world challenges, we conduct a collaborative study with a major Android app market (T-Market) offering us large-scale ground-truth data. Our study shows that the key to successfully developing such systems is manifold, including feature selection/engineering, app analysis speed, developer engagement, and model evolution. Failure in any of the above aspects would lead to the "wooden barrel effect" of the entire system. We discuss our careful design choices as well as our first-hand deployment experiences in building such an ML-powered malware detection system. We implement our design and examine its effectiveness in the T-Market for over one year, using a single commodity server to vet ∼10K apps every day. The evaluation results show that this design achieves an overall precision of 98% and recall of 96% with an average per-app scan time of 1.3 minutes.

Original language: English (US)
Title of host publication: Proceedings of the 15th European Conference on Computer Systems, EuroSys 2020
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450368827
DOI: https://doi.org/10.1145/3342195.3387530
State: Published - Apr 15 2020
Externally published: Yes
Event: 15th European Conference on Computer Systems, EuroSys 2020 - Heraklion, Greece
Duration: Apr 27 2020 to Apr 30 2020

Publication series

Name: Proceedings of the 15th European Conference on Computer Systems, EuroSys 2020

Conference

Conference: 15th European Conference on Computer Systems, EuroSys 2020
Country: Greece
City: Heraklion
Period: 4/27/20 to 4/30/20

  • Cite this

    Gong, L., Li, Z., Qian, F., Zhang, Z., Chen, Q. A., Qian, Z., Lin, H., & Liu, Y. (2020). Experiences of landing machine learning onto market-scale mobile malware detection. In Proceedings of the 15th European Conference on Computer Systems, EuroSys 2020. Association for Computing Machinery, Inc. https://doi.org/10.1145/3342195.3387530