Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/75972
Title: Android app representation for machine learning based malware detection
Authors: Zheng, Dunyuan
Keywords: DRNTU::Engineering::Electrical and electronic engineering
Issue Date: 2018
Abstract: According to StatCounter, the most popular mobile operating system in the world is Google's Android, with a market share of 75.66%. Because applications can be installed from elsewhere and not only from the official application market, Google Play, malicious applications pose an invisible threat to the security of Android phones. The traditional approach to Android malware detection collects suspicious samples and analyzes each one by comparing it with an existing database. The low accuracy and low efficiency of this method have led researchers and anti-virus companies to look for better techniques. Machine learning and deep learning techniques are now prominently used for malware detection. The project uses a unified framework for malware classification based on representation learning. First, MKLDroid is used to extract several graphs, including control-flow graphs, from applications. The framework integrates five views, of which only three are used for analysis in this project. The Weisfeiler-Lehman (WL) graph kernel is applied to map each original graph to a sequence of graphs, represented as vectors. Graph2vec, a recent method analogous to doc2vec, is used to learn embeddings of the graphs in an unsupervised manner. Instead of applying a kernel function to combine the views, the project simply concatenates the view files. A traditional classification technique, such as a Support Vector Machine, is then applied to the embeddings to evaluate their performance. In the experimental studies, when the same dimension of 64 is used for both the embedding approach and the WL graph view approach, the classifier based on the embeddings achieves 2% to 5% higher accuracy than the one based on WL graphs. In addition, classifying the embeddings is faster than classifying the WL graph views.
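The WL relabeling and view concatenation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `wl_features` and `concatenate_views`, the toy graph, and the toy vectors are all hypothetical, the graph is treated as undirected, and labels are kept as uncompressed strings rather than hashed. The graph2vec embedding and the Support Vector Machine stages are left out.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Return a bag (Counter) of Weisfeiler-Lehman subtree labels for one graph.

    adjacency: dict mapping node -> list of neighbour nodes
    labels:    dict mapping node -> initial string label
    """
    current = dict(labels)
    bag = Counter(current.values())  # iteration 0: the original labels
    for _ in range(iterations):
        relabeled = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            neighbour_labels = sorted(current[n] for n in neighbours)
            relabeled[node] = current[node] + "(" + ",".join(neighbour_labels) + ")"
        current = relabeled
        bag.update(current.values())
    return bag


def concatenate_views(*view_vectors):
    """Join the per-view feature vectors of one app into a single vector."""
    combined = []
    for vector in view_vectors:
        combined.extend(vector)
    return combined


# Hypothetical toy graph (entry - call - exit), treated as undirected here.
toy_adjacency = {0: [1], 1: [0, 2], 2: [1]}
toy_labels = {0: "entry", 1: "call", 2: "exit"}
toy_bag = wl_features(toy_adjacency, toy_labels, iterations=1)

# Hypothetical 2-dimensional vectors from three views of the same app.
app_vector = concatenate_views([0.1, 0.2], [0.3, 0.4], [0.5, 0.6])
```

In this sketch, each app would end up with one concatenated vector, which is the kind of input a conventional classifier could then be trained on.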
URI: http://hdl.handle.net/10356/75972
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: ZhengDunYuan_2018.pdf
Description: Restricted Access
Size: 1.35 MB
Format: Adobe PDF

Page view(s): 63 (checked on Oct 20, 2020)
Download(s): 8 (checked on Oct 20, 2020)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.