Experiences of landing machine learning onto market-scale mobile malware detection

Liangyi Gong, Zhenhua Li, Feng Qian, Zifan Zhang, Qi Alfred Chen, Zhiyun Qian, Hao Lin, Yunhao Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations


App markets, being crucial and critical for today's mobile ecosystem, have also become a natural malware delivery channel since they actually "lend credibility" to malicious apps. In the past decade, machine learning (ML) techniques have been explored for automated, robust malware detection. Unfortunately, to date, we have yet to see an ML-based malware detection solution deployed at market scales. To better understand the real-world challenges, we conduct a collaborative study with a major Android app market (T-Market) offering us large-scale ground-truth data. Our study shows that the key to successfully developing such systems is manifold, including feature selection/engineering, app analysis speed, developer engagement, and model evolution. Failure in any of the above aspects would lead to the "wooden barrel effect" of the entire system. We discuss our careful design choices as well as our first-hand deployment experiences in building such an ML-powered malware detection system. We implement our design and examine its effectiveness in the T-Market for over one year, using a single commodity server to vet ∼10K apps every day. The evaluation results show that this design achieves an overall precision of 98% and recall of 96% with an average per-app scan time of 1.3 minutes.

Original language: English (US)
Title of host publication: Proceedings of the 15th European Conference on Computer Systems, EuroSys 2020
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450368827
State: Published - Apr 15 2020
Event: 15th European Conference on Computer Systems, EuroSys 2020 - Heraklion, Greece
Duration: Apr 27 2020 → Apr 30 2020

Publication series

Name: Proceedings of the 15th European Conference on Computer Systems, EuroSys 2020


Conference: 15th European Conference on Computer Systems, EuroSys 2020

Bibliographical note

Funding Information:
We sincerely thank our shepherd Prof. Jon Crowcroft and the anonymous reviewers for their valuable feedback. We also appreciate Weizhi Li, Yang Li, Zipeng Wu, and Hai Long for their contributions to the data collection and system deployment of APICHECKER. This work is supported in part by the National Key R&D Program of China under grant 2018YFB1004700, the National Natural Science Foundation of China (NSFC) under grants 61822205, 61902211, 61632020 and 61632013, and the Beijing National Research Center for Information Science and Technology (BNRist).

Publisher Copyright:
© 2020 ACM.

