Deep Learning Malware Classification

Project Description:

The objective of this project is to develop a Deep Learning model that can classify malware and predict the threat group it belongs to. The model will be trained on greyscale images generated from malware binaries and resized using padding methods that keep the background black.

The first step of this project will be to collect a dataset of malware binaries from different threat groups. This dataset will be preprocessed by converting the binaries to greyscale images and resizing them to a uniform resolution. To ensure a consistent background, padding methods will be applied to some of the images.
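The conversion step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `bytes_to_grey_image` and `pad_to_square`, the row width of 256, and the centre-and-crop behaviour are all assumptions made for the example. It uses NumPy only; a real pipeline would probably also scale images with an image library such as OpenCV.

```python
import math

import numpy as np


def bytes_to_grey_image(data: bytes, width: int = 256) -> np.ndarray:
    """Map each byte (0-255) of a binary to one greyscale pixel, row by row.

    `width` is an assumed row length, not a value taken from the project.
    """
    if not data:
        raise ValueError("empty binary")
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = math.ceil(len(pixels) / width)
    # Zero-fill the last row so the array reshapes cleanly; zero renders as black.
    flat = np.zeros(height * width, dtype=np.uint8)
    flat[: len(pixels)] = pixels
    return flat.reshape(height, width)


def pad_to_square(image: np.ndarray, size: int) -> np.ndarray:
    """Centre the image on a black size x size canvas, cropping any overflow."""
    canvas = np.zeros((size, size), dtype=np.uint8)
    h = min(image.shape[0], size)
    w = min(image.shape[1], size)
    top = (size - h) // 2
    left = (size - w) // 2
    canvas[top:top + h, left:left + w] = image[:h, :w]
    return canvas
```

For example, `pad_to_square(bytes_to_grey_image(raw_bytes), 256)` would give a 256 x 256 array with black borders wherever the binary did not fill the canvas, where `raw_bytes` stands for the contents of one sample file.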

Next, a Deep Learning model will be developed using convolutional neural networks (CNNs) to classify the malware images. The model will be trained on the preprocessed dataset using both the padded and unpadded images to determine which method is better for classification.
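A small CNN classifier of the kind described above might look like the sketch below. The layer sizes, the 128 x 128 input resolution, the five output classes, and the name `build_cnn` are illustrative assumptions; the architecture actually used in the project is not stated here.

```python
import tensorflow as tf


def build_cnn(image_size: int = 128, num_classes: int = 5) -> tf.keras.Model:
    """Build a small CNN for single-channel (greyscale) malware images.

    image_size and num_classes are placeholder values for illustration.
    """
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(image_size, image_size, 1)),
        # Scale pixel values from 0-255 down to 0-1.
        tf.keras.layers.Rescaling(1.0 / 255),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        # One probability per threat group.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

To compare the two preprocessing methods, the same function could be used to build one model for the padded images and a second one for the unpadded images, with both evaluated on matching test sets.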

This work includes:

Method: Convert binaries to greyscale images and resize them to a uniform resolution
Challenge: Poor image quality; tested padding with a black background as a remedy
Training: Train a CNN model on the extended data
Approach: Train the Deep Learning model on both padded and unpadded data
Goal: Accurately predict malware threat groups using a CNN

My Contributions:

Collect Data
Annotate & Label data
Extract & Combine data with colleagues
Split data into train and test datasets at an 8:2 ratio
Train the model using extracted and pre-processed data
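The 8:2 train and test split listed above can be illustrated with a short sketch. The function name `split_train_test` and the fixed seed are assumptions for the example, and it shuffles the whole dataset rather than stratifying by threat group, which the project may or may not have done.

```python
import random


def split_train_test(samples, seed=42):
    """Shuffle samples and split them 8:2 into train and test lists.

    The seed is an arbitrary illustrative value that keeps the split repeatable.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10  # first 80% goes to training
    return shuffled[:cut], shuffled[cut:]
```

Each element of `samples` could be an (image path, label) pair, so that images and their labels stay together through the shuffle.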

GitHub Repository:

github.com/goheesheng/FYPJ_AI_MALWARE

Created Using:

OpenCV (cv2)
CUDA
NumPy
Python | TensorFlow

Dr Brandon Ooi, Lecturer

Testimonial
Presentation Slides