
Deep Learning Malware Classification

Project Description:

The objective of this project is to develop a Deep Learning model that can classify malware and predict the threat group it belongs to. The model will be trained on greyscale images generated from malware binaries. Each image will be resized to a uniform resolution, and padding will be tested as a way to fill the unused area with a black background.

The first step of this project will be to collect a dataset of malware binaries from different threat groups. This dataset will be preprocessed by converting the binaries to greyscale images and resizing them to a uniform resolution. To ensure a consistent background, padding methods will be applied to some of the images.
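The preprocessing step described above can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the project's actual pipeline: the function names (`binary_to_image`, `pad_to_square`, `resize_nearest`), the choice of a near-square row width, and the nearest-neighbour resize are all assumptions made for the example. In practice the resize would more likely be done with OpenCV's `cv2.resize`.

```python
import math

import numpy as np


def binary_to_image(data: bytes) -> np.ndarray:
    """Interpret each byte as one greyscale pixel (0-255) in a 2-D array."""
    if not data:
        raise ValueError("cannot build an image from an empty binary")
    pixels = np.frombuffer(data, dtype=np.uint8)
    # Assumption: pick a row width near sqrt(length) so the image is roughly square.
    width = math.isqrt(len(pixels))
    height = math.ceil(len(pixels) / width)
    # Bytes that do not fill the last row are left as zeros (black).
    canvas = np.zeros(height * width, dtype=np.uint8)
    canvas[: len(pixels)] = pixels
    return canvas.reshape(height, width)


def pad_to_square(image: np.ndarray) -> np.ndarray:
    """Centre the image on a square black canvas instead of stretching it."""
    height, width = image.shape
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    return np.pad(
        image,
        ((top, side - height - top), (left, side - width - left)),
        mode="constant",
        constant_values=0,
    )


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize to size x size by nearest-neighbour sampling (stand-in for cv2.resize)."""
    height, width = image.shape
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]
```

A padded sample would be produced with `resize_nearest(pad_to_square(binary_to_image(data)), 64)`, and an unpadded one by skipping `pad_to_square`, which stretches the image to the target resolution instead.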

Next, a Deep Learning model will be developed using convolutional neural networks (CNNs) to classify the malware images. The model will be trained on the preprocessed dataset using both the padded and unpadded images to determine which method is better for classification.
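As a rough illustration of what such a classifier could look like, here is a small CNN written with the Keras API in TensorFlow. The layer sizes, the `build_cnn` name, and the `image_size` and `num_groups` parameters are assumptions chosen for the sketch; the architecture actually used in the project is not described here.

```python
import tensorflow as tf


def build_cnn(image_size: int, num_groups: int) -> tf.keras.Model:
    """Build a small CNN for single-channel (greyscale) malware images."""
    model = tf.keras.Sequential(
        [
            tf.keras.layers.Input(shape=(image_size, image_size, 1)),
            # Scale pixel values from 0-255 down to 0-1.
            tf.keras.layers.Rescaling(1.0 / 255),
            tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.GlobalAveragePooling2D(),
            # One output probability per threat group.
            tf.keras.layers.Dense(num_groups, activation="softmax"),
        ]
    )
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

To compare the two preprocessing methods, the same architecture could be trained twice, once on the padded images and once on the unpadded ones, and the test accuracy of the two runs compared.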

This work includes:
  • Method: Convert binaries to greyscale images and resize them to a uniform resolution
  • Challenge: Poor image quality; tested padding to produce a black background
  • Training: Train a CNN model on the extended data
  • Approach: Train the Deep Learning model on both padded and unpadded data
  • Goal: Accurately predict malware threat groups using a CNN

My Contributions:

  • Collect Data
  • Annotate & Label data
  • Extract & Combine data with colleagues
  • Split data into training and test sets at an 8:2 ratio
  • Train the model using extracted and pre-processed data
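The 8:2 split listed above can be sketched as below. This version splits each threat group separately so both sets keep similar class proportions. That stratification, the fixed seed, and the `stratified_split` name are assumptions for the example; the project may have used a plain random split.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), holding out test_fraction per label."""
    by_group = defaultdict(list)
    for index, label in enumerate(labels):
        by_group[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

The returned indices can then be used to select the matching images and labels for each set.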
GitHub Repositories:
Created Using:
  • OpenCV (cv2)
  • CUDA
  • NumPy
  • Python | TensorFlow
Dr Brandon Ooi, Lecturer
Testimonial
Presentation Slides
Goh Ee Sheng © 2023