
An Open Source Malware Classifier and Dataset

BSidesSF · 2018 · 28:13 · 835 views · Published 2018-04
Speakers: Phil Roth
Category: Research
Style: Talk
About this talk
Phil Roth - An Open Source Malware Classifier and Dataset

Research in machine learning for static malware detection has been stymied by stale, biased, and otherwise limited public datasets. In this talk, I will introduce an open source dataset of labels for a diverse and representative set of Windows PE files. The dataset also includes feature vectors for building machine learning models, a high-performing pre-trained model for research, and source code to reproducibly generate the features and the model. I’ll also detail the reasoning behind the features and labels and demonstrate how the machine learning model performs on samples in the wild.