
hi everyone i was a bit confused am i live or not i'm a bit nervous but more excited because this is going to be my first talk ever just a brief introduction i'm a new graduate i did my master's in information system security from concordia university of edmonton so while doing my research i did some experiments and the whole presentation today is about the experiments which were performed while doing my research so now i'm going to share my presentation with you
so today's presentation is all about ransomware analysis based on machine learning and the flare virtual machine
and today's agenda is firstly we will briefly introduce what is ransomware what is machine learning and how a ransomware attack works and the types of ransomware families we studied while doing our final research followed by the methodology and outline of the experiments performed and the most important thing the data set which we collected to perform our experiments followed by the discussion part and some of the interesting results which we achieved while doing those experiments and then how we used the flare virtual machine to perform the experiments and then what attacks were performed to achieve the results so the first question is what is ransomware ransomware is malware that prevents or limits users from accessing their system
it's like we are using our own system but all of a sudden we are not able to use it it prevents our access or limits our access or the system can be locked and if it encrypts all the file types it is known as crypto ransomware in that case it forces users to pay a ransom to get the decryption key a recent study reports that ransomware attacks grew by 118 percent new families of ransomware are detected and threat actors use innovative techniques there is an almost constant increase in business detections of ransomware rising a shocking 365 percent from last year to the present year the question is which sectors are attacked by ransomware well i cannot think of even a
single sector which is not attacked by ransomware most of them are education government healthcare hr departments mobile devices and macs how are these sectors actually attacked through any attachment which includes phishing emails or spam or social engineering sometimes drive-by downloads or sometimes the advertisements which are shown on our screen and we just click them but those are malware advertisements now what is machine learning machine learning is a method of data analysis it is a branch of artificial intelligence in which systems learn from data it helps to identify patterns and make decisions with minimal human intervention there are different methods of machine learning which include supervised learning which is trained using labeled data the one we used
in our research the others are unsupervised semi-supervised and reinforcement learning now what is classification classification means predicting the class of given data points it is considered an instance of supervised learning that is learning where a training set of correctly identified observations is available sometimes classes are known as targets labels or categories it belongs to the category of supervised learning where the targets are also provided with the input data
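To make the idea of learning from labeled data concrete, here is a minimal sketch of supervised classification. It is not the setup used in this research, which relied on classifiers in the Weka tool; it swaps in a simple 1-nearest-neighbour rule instead. The feature values, the labels `family_a` and `family_b`, and the function names `predict` and `accuracy` are all hypothetical and only serve the illustration.

```python
# Hypothetical labeled training set: each entry is a toy feature vector
# (files_added, files_deleted, registry_changes) with its class label.
# The numbers are invented for illustration, not taken from the talk's data set.
TRAINING_SET = [
    ((20.0, 70.0, 5.0), "family_a"),
    ((25.0, 65.0, 8.0), "family_a"),
    ((3000.0, 10.0, 900.0), "family_b"),
    ((3200.0, 15.0, 1100.0), "family_b"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, training_set=TRAINING_SET):
    """Return the label of the closest training example (1-nearest-neighbour)."""
    _, best_label = min(
        training_set,
        key=lambda example: squared_distance(features, example[0]),
    )
    return best_label


def accuracy(test_set, training_set=TRAINING_SET):
    """Fraction of labeled test examples whose predicted label is correct."""
    correct = sum(
        1 for features, label in test_set
        if predict(features, training_set) == label
    )
    return correct / len(test_set)


if __name__ == "__main__":
    # An unseen observation gets the class of its nearest labeled neighbour.
    print(predict((22.0, 68.0, 6.0)))
```

The labeled examples play the role of the training set of correctly identified observations, and `predict` assigns a class to a new data point from them. Any other classifier could take the place of the nearest-neighbour rule without changing that overall shape.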
the first thing is infection firstly the ransomware is delivered to the system via an email attachment phishing email infected application or other method the ransomware installs itself on the endpoint and any network devices it can access all of a sudden we are working on our system and we download an attachment from a phishing email hence our computer gets infected once the infection is in our computer what's the next point the ransomware contacts a command and control server which is operated by the cyber criminals behind the attack to generate the cryptographic keys to be used on the local system the next part is encryption the ransomware starts encrypting any file that it can find on our local machine
or even on the network once the files are encrypted it starts doing extortion with the encryption work done the ransomware displays instructions for extortion and ransom payment threatening destruction of data if payment is not made we are using our own system and we are not able to use it just because now it has been infected by a malware known as ransomware and it's asking us for a ransom to unlock the computer to unlock the computer organizations either have to pay the ransom and hope for the cyber criminals to decrypt the affected files or they can attempt recovery by removing infected files and systems from the network and restoring data from clean backups there are four families which we studied
in our research those are wannacry petya cerber and dharma all of these ransomware families come under the category of crypto ransomware it is a very harmful program that encrypts files stored on a computer or mobile device in order to extort money the encryption scrambles the contents of a file so that it is unreadable to restore it for normal use a decryption key is needed to unscramble the file there was a methodology which we used for our experiments it includes the collection of data ransomware samples were collected using virustotal and hybrid analysis to perform the experiments there were two approaches which we used the first one was malware samples which are not classified
by any antivirus vendors and we use classifiers for classification and the second one was ransomware samples that are classified by antivirus providers into their respective families virustotal and hybrid analysis were used to label our ransomware families while doing our experiments now there were some steps followed to perform the experiments the first step was to collect the data set to collect that data set we used virustotal and hybrid analysis once the data set was collected we did behavior analysis by grouping and filtering our data set and then we did hashing by using sha-256 and then we saved the data in an excel sheet in the format of arff
or csv because that was compatible with the tool we used the tool which we used was the weka tool that was configured to do the pre-processing of data the best part about weka was that it can do the malware classification and data mining there are so many classifiers in weka which we can use but the ones we used were smo naive bayes j48 random forest and the last one multilayer perceptron specifically the classifiers are concrete implementations of the algorithms used to get the results about the ransomware behavior analysis our data set included wannacry dharma ransomware cerber ransomware and petya ransomware in wannacry there were 15 samples in
dharma there were 11 samples and in cerber we collected 11 samples again and in petya we collected seven samples so in total we got 55 features and 44 instances that includes how many files got deleted how many files were modified how many files were added or what the infections were the infection doesn't only impact the files but our registry entries library files everything got affected as well now after using the whole data set with 55 features and 44 instances in different classifiers we come to some discussion points the first discussion part says the weka machine learning tool has an option to allocate a percentage split of a data set
for training and test purposes like we have a data set we can divide the data set into two parts the first one is for training the other part is for testing then we evaluated the performance using machine learning metrics that include true positive rate false positive rate precision recall f measure and roc area the data set was created by extracting information from the samples' attributes now the attributes include how many files were opened how many files were modified for example once a computer is infected there were some changes that include how many files were dropped or deleted there were some changes in the registry entries those were deleted or opened or modified it also includes some of the
changes in synchronization mechanisms and signals which include mutexes created and mutexes deleted it includes some of the network behavior changes which include http udp or tcp and there were some changes in the libraries as well which include kernel user32 etc after doing the whole set of experiments we come to the results like there were some classifiers which were really good at understanding the data and gave us better accuracy one of them was j48 classification which got an end result of 97.73 percent accuracy rate we used the same data we split the same data for the testing and training but j48 got really good accuracy as compared to the other classifiers random forest was able to understand our
data as well because it gives us 95.45 percent accuracy which was good but j48 was better than the other ones similarly smo resulted in 86.64 and bayes network gives us 72.73 percent accuracy this is the same thing which we already discussed in our results part but this is how our weka tool shows us the results like how j48 is getting 97.73 and how the other classifiers understand the behavior analysis of ransomware after working with the weka tool we came to a question like we have different families for example we studied wannacry and cerber we have one family and it has different members do different members work the same as the
same family to know about that thing we use the flare virtual machine we rely on a customized virtual machine to perform malware analysis the virtual machine is a windows installation with numerous tweaks and tools to help our analysis the best thing is that it is freely available and it is an open source windows-based security distribution which is designed for reverse engineers malware analysts incident responders and penetration testers we run ransomware samples on the same virtual machine containing the same number of files folders documents etc as it helps us to know the behavior of different members of the same ransomware family in this technique or while doing these experiments the first important thing was to isolate a virtual
machine from the host machine so to achieve that we created one shared folder and after copying files from the host device to the virtual device every time we removed the access to the shared folder so that our host machine should not get affected by the malicious files this is an interesting thing which we actually did for example the first screenshot shows the wannacry attack that's sample one when we actually ran the malicious file of the wannacry attack we came to know that 24 files were added 71 files were deleted and there were some changes in the registry entries as well some of them got updated some of them got added the best thing was we were using same
data every time to know the behavior of our ransomware family in more detail so the wannacry sample 2 gives us completely different behavior because it added files in the thousands like three thousand three hundred and ten that was an interesting part even our registry entries got changed in the thousands and the moment the executable file was downloaded on my system it gave me the message oops your files have been encrypted and it gives you the options for how you can recover your files to recover your files you need to pay to pay it will give you a proper address where you can send your money after running the wannacry ransomware samples on our system we ran some of the samples of petya
when we ran those it gives us the warning do not turn off your pc if you abort this process you could destroy all of your data please ensure that your power cable is plugged in at the end if you click again after the warning is shown it gives you the other page saying you need to pay the ransom to get your own data back this was a very interesting petya sample which we ran on our system because once i ran this sample on my system it starts showing me like you are a victim of the petya ransomware because the hard disks of your computer have been encrypted with the military grade
encryption algorithm and when i ran that sample on my computer it asked me to press any key the moment i pressed any key it started beeping my computer so loud like i was scared for a minute like what's going on with my computer because the beeping was way louder than the volume of my computer so in that case it crashed my whole virtual machine so to do the other experiments i downloaded the virtual machine again it not only affected my virtual machine it affected my host machine as well even though those were completely isolated from each other so to get rid of this problem i actually put my system for two hours on
charging and after that i was able to turn on my host computer so this was a very interesting sample which i ran on my computer so after that dharma ransomware in the case of dharma ransomware the sample starts with mapped drives followed by the root of the operating system drive and encrypts files via an implementation of the aes algorithm after encrypting all the files the malware popped up a ransom note all your files have been encrypted and it tells you how to obtain bitcoins and how to decrypt your files so it gives you the whole set of options coming to the cerber ransomware the most interesting thing about cerber ransomware is it doesn't show you like your computer is getting infected
because it runs silently in the background during the encryption phase and does not provide any indication of infection to us with some ransomware when we just click the executable file it shows you right away your computer has been encrypted but the thing with cerber is like probably for so many hours or probably for minutes it just runs silently in the background during the encryption phase and it only notifies you once everything is done at its end this was the other sample of the cerber ransomware which we ran on our system again it shows us like your documents photos databases and other important files have
been encrypted again you need to pay the ransom to get your own data back so at the end like after doing the research about the behavior analysis and studying the different families and the different members of the same family we come to how to defeat ransomware though i didn't do my research on how to defeat ransomware still there are some recommendations which we can share and if i get a chance to do it in the future definitely i would like to know how to defeat ransomware so the first part is isolate the infection which prevents the infection from spreading by separating the infected computers even in this the best part is that to
isolate the backup of our computer from the host machine and the second one is identify the infection from messages evidence on the computer and identification tools because when the ransomware hits a computer it tells you like this is the wannacry ransomware on your computer and your files have been encrypted so it gives you the evidence in the messages as well so we can determine which malware strain we are dealing with after that if it's possible we can report to the authorities to support and coordinate measures and we can determine our options there are a number of ways to deal with the infection and we determine which approach is the best one and we need to
restore and refresh we need to use safe backups and program and software sources to restore a computer or outfit a new platform and we need to prevent recurrence we need to make sure like what was the vulnerability in our system like our computer got exploited our computer got compromised or affected by the ransomware what was the problem behind that and we need to plan to prevent the recurrence so that it should not happen again with our system so yeah that was all about my presentation today there is my email id if you have any question you can ask me and you can email me and definitely i will try my best to get
back to you and it will be so great if you ask more questions and i can work on them and get back to you i'm so thankful that was my first talk and i'm getting such good comments and thank you to all for listening to my talk and i'm very thankful to you all