Digging The Attack Surface Of Microsoft Rich Text Format Files - An OLE Perspective

Uploaded: 2021-05-07
Duration: 45 min 9 s
Description: View slide decks and full list of talks available at: https://www.bsidesdub.ie/past/2021.php

BSides Dublin · 2021 · 45:09 · 88 views · Published 2021-05

Speakers

Chintan Shah

Tags

Category: Technical

Style: Talk


Transcript

Good morning, good afternoon, good evening, depending on the time zone wherever you are. It's a great opportunity to come and speak at this event. The topic for discussion today is digging the attack surface of Microsoft Rich Text Format, and we are primarily going to discuss it from the OLE standpoint. Before we move ahead, a bit of an introduction. My name is Chintan Shah. I've been working with McAfee for quite some time, and I've been in the network security industry for about 15-plus years now. I've spoken at many other security conferences; recently I spoke at

mr, and before that I've spoken at other international conferences as well. My primary focus as part of my job at McAfee is analyzing malware, APTs and exploits, reversing, and primarily fuzzing and vulnerability research. I hold multiple patents on exploit detection techniques, and one of my core job responsibilities is to translate the research into the product: product improvements and new detection techniques and methods. That's basically what I do. So let's see what is on the agenda for today's session. We are going to touch upon the Microsoft Rich Text Format.

We are going to touch upon the overview as well as what has happened in the past with the MS-RTF file format. We'll discuss in a little detail Object Linking and Embedding and how it has been exploited by attackers. We'll touch upon the OLE attack surface as well, and how it has been abused by attackers to target the Office platform. We will discuss in a little more detail MS-RTF file structure parsing, how to go about inspecting the RTF file format on the network, and how we can build an automated inspection mechanism to classify MS-RTF files. We will talk about the engine

components, architecture and inspection flow, and then we will end this talk with example engine output and what we've seen as initial results when we ran this tool over many of the exploits that we have tested. Over the past few years we have had many exploits that target the Office platform, and a lot of these attacks have exploited OLE through the RTF file format. RTF has been one of the more widely adopted file formats over the past few years, and many of these high-impact attacks have been

exploited via RTF. So it's very popular among attackers, and there are a few reasons for it. [Music] Last year, in May 2020, the US government shared a list of the top vulnerabilities exploited since 2016. They shared the popular exploits which have been used in targeted attacks, and many of these exploits, in fact all of them, have been on the Office platform. One of the top observations that came out of this is that all of these exploits were primarily targeting Microsoft's OLE. So OLE has been another feature which has been targeted

massively by attackers because of the attack surface that it exposes. It exposes a virtually huge attack surface, and without writing any form of complex shellcode an attacker can easily deliver exploits targeting this feature. They can craft the exploit and deliver it to the victim, and it is just super easy to exploit the kind of vulnerabilities which the Office OLE feature exposes. That's one of the prime reasons why Microsoft OLE has been a top attack vector over the past few years, which is also a conclusion from what has been shared by the US government. So the question is: why

RTF? What are the reasons behind the RTF file format being so popular among attackers? One of the reasons is that it is a very versatile format; it is very attacker friendly. Another major reason is that RTF is able to embed many object types. RTF can host objects like images, video, docs, ActiveX controls and fonts. In fact this feature is part of the entire Office application suite. Now, RTF having the OLE feature doesn't mean that the other file formats of the Office applications cannot be exploited. They have been exploited as well,

but there are a few reasons why RTF stands apart from the other file formats in terms of popularity among attackers and why it has been used so massively. One of the reasons is that it serves as a carrier for other file format exploits: you can easily embed many file formats in the RTF file format and deliver the exploits to the victim. The other reason is that, because of the RTF file format's structural and parsing complexity, the visibility into its structure is very

limited among perimeter solutions. For instance, an IPS doesn't have full parsing of, or full visibility into, the structure of RTF, and that's the reason attackers choose to use it more than any other file format. An attacker can craft the RTF file in such a way that it evades a lot of static detections. If you look at the distinction between the RTF file format and the other file formats which the Office applications have, for example DOC, DOCX or the other XML file formats, and the compound binary file format,

one of the prime differences is that the RTF file format can be obfuscated heavily to evade detection, which is very unlikely with the other file formats. You can nest the control words. RTF is primarily made up of control words, as we'll see in the next few slides when we look at the structure, and because of the heavy nesting of control words and the obfuscations that can be introduced, it is very easy for attackers to bypass any static detection mechanism and hide malicious resources in there. These are some of the primary reasons why RTF has been popular among attackers.
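To make the obfuscation point concrete: a hex payload with whitespace and stray control words interleaved no longer matches a byte-level signature until it is normalised. The following is only a minimal sketch under simplifying assumptions; `normalise_hex_stream` is a hypothetical helper, not the engine described in the talk, and a real RTF parser has to handle many more cases (binary `\bin` data, escaped characters, text inside nested groups).

```python
import re

# Hypothetical helper for illustration; not the speaker's engine.
# Control words look like \name, optionally followed by a signed number and one space.
_CONTROL_WORD = re.compile(r"\\[a-zA-Z]+(?:-?\d+)? ?")
# Control symbols are a backslash followed by a single non-letter, e.g. \* or \~.
_CONTROL_SYMBOL = re.compile(r"\\[^a-zA-Z]")


def normalise_hex_stream(raw: str) -> bytes:
    """Strip control words, braces and whitespace, then decode what is left as hex."""
    text = _CONTROL_WORD.sub("", raw)
    text = _CONTROL_SYMBOL.sub("", text)
    digits = re.sub(r"[^0-9a-fA-F]", "", text)
    if len(digits) % 2:  # ignore a dangling half-byte
        digits = digits[:-1]
    return bytes.fromhex(digits)


# Hex data split by spaces, a line break and a nested {\par } group.
obfuscated = "{\\object\\objemb{\\*\\objdata 01 05\n00{\\par }00 d0cf11e0}}"
print(normalise_hex_stream(obfuscated).hex())  # prints: 01050000d0cf11e0
```

One known limitation of this sketch: any letters a to f that appear as ordinary text inside a nested group would be kept as if they were hex digits, so it over-approximates what Word itself would decode.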

And, as I said, because of the limited structure awareness among perimeter security devices and because of the nested structure, IPS and other perimeter security solutions don't inspect RTF as deeply as they should. So these are some of the reasons why RTF has been such a popular, go-to file format for attackers. Now, what are the primary attack vectors in RTF? In fact there are many attack vectors, but there are a couple of them which I would like to highlight here. One is parsing

engine flaws. As I said, RTF is a complex structure with nested control words. As you can see on the right side, there are many control words in RTF, about 1,800-plus, and many of them are largely unexplored, so any of these control words could have a parsing vulnerability. The other thing is that all these control words are able to consume streams of data. What this means for the attacker is that they can craft the RTF exploit in such a way that they get the MS-RTF parser to ignore

the control word, and at the same time they can embed malicious resources like shellcode, a decoy document or executables within the control word's stream data. Attackers can choose to embed these malicious resources there to hide them, and they can also craft the RTF to exploit control word parsing vulnerabilities. We've had many such control word parsing vulnerabilities in the past. The other primary attack vector is Object Linking and Embedding. As we discussed a few slides back, this has been a massive attack vector. It has been exploited heavily in the

past by attackers, and most of these have been logic flaws, where attackers were able to exploit one or another parsing vulnerability or logic flaw in the OLE control. What OLE allows an attacker to do is link to an external object, and that's one of the serious drawbacks of OLE. If you are able to find an OLE control which allows you to link your RTF file to the external world, to an external file anywhere outside, then you can download virtually any file, invoke the respective resource handler, and execute arbitrary code.

That's what we have been observing as the recent trend over the past few years. Object embedding allows attackers to exploit memory corruption vulnerabilities, and there have been many cases in the past where it aids further exploitation by crafting the process heap memory. An attacker could embed any other file within the RTF where it helps to craft the memory of the process and then aids further exploitation. We have had many such cases in the past. So how does it look? This is the first

look at RTF. What you see here is less than one percent of the document. Even if you write just a couple of words of text, the document is so big that what you've been seeing here is less than one percent of it. So imagine the volume of data to process if an IPS or other perimeter security device has to inspect this file format. Because of the performance issues and other issues, perimeter security devices haven't been focusing much on

deep inspection of RTF files, and I think that's one of the facts that attackers have been taking advantage of. Here are a couple of examples where attackers have been abusing the control words. If you look at the right side, both of these control words, the \leveltext control word as well as the \datastore control word, have been abused to embed either shellcode or an executable. And if you look at the control word at the bottom, the RTF pFragments have been used to embed shellcode as well.
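One way to surface this kind of abuse is to list every control word that is followed by an unusually long run of hex digits, regardless of whether it is an OLE control word. The sketch below is an assumption-laden illustration: the function name and the 64-digit threshold are hypothetical choices, not values from the talk, and a regular expression is a stand-in for a proper RTF tokenizer.

```python
import re
from typing import List, Tuple


def find_data_bearing_control_words(rtf: str, min_hex_digits: int = 64) -> List[Tuple[str, int]]:
    """Return (control word, payload size in bytes) for words followed by long hex runs."""
    # A run of at least min_hex_digits hex digits, each optionally followed by whitespace.
    hex_run = r"((?:[0-9a-fA-F]\s*){%d,})" % min_hex_digits
    # The lookahead stops the name from being cut short in the middle of a word.
    pattern = re.compile(r"\\([a-zA-Z]+)(?![a-zA-Z])\s*" + hex_run)
    hits = []
    for match in pattern.finditer(rtf):
        digits = re.sub(r"\s+", "", match.group(2))
        hits.append((match.group(1), len(digits) // 2))
    return hits


sample = "{\\rtf1{\\*\\datastore " + "4d5a" * 40 + "}{\\fs24 hello}}"
print(find_data_bearing_control_words(sample))  # prints: [('datastore', 80)]
```

Each hit would then be decoded and handed to whatever content heuristics are available, for example a check for executable headers or shellcode patterns.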

Many of these control words have in fact been used; I have just put a few examples here. But this is what attackers can do with RTF control words: they can hide malicious resources within the control word's stream of data, and they can obfuscate the control word stream to break immature parsers. It can even bypass many of the static AV detections which are based on signatures. An attacker can craft the RTF file in a way that breaks detections based on static signatures. So these are primarily [Music] examples where the control words have

been abused. So let's take a look at how Object Linking and Embedding has been abused in the past: what it is, how OLE is initialized, and what the attack surfaces are. Object Linking and Embedding is the infrastructure built by Microsoft for interoperability with third-party vendors. It is built entirely on Microsoft's COM infrastructure, and it allows an object created in another application to be embedded into a container application. In our case RTF is the container application, and we create an object in another application. For instance, we

create an object, which could be a PDF file, an ActiveX control or an OOXML file, and OLE allows you to embed or link these objects within the RTF. What it results in is a richer user experience. As you can see here on the right side, you can embed ActiveX controls such as scroll bars, you can embed other documents into the RTF, and you can link your RTF to external objects via the linking capability. All these functionalities are provided by Microsoft's OLE infrastructure, and what it results

in, consequently, is an increased attack surface, because there are many COM objects in Windows. Even when you install software, the vendor will register COM objects on your system, and if one of those COM objects turns out to be vulnerable, then essentially your entire Office security is compromised. So let's take a look at how we can do embedding and linking. If you look at figure one, you can just insert an object, and you have the option to insert a wide variety of documents into RTF. You can embed many forms of documents, and what

it will result in is an \object control word with a nested control word called \objemb. \objemb essentially means that there is an embedded object. As you can see at the bottom, when you embed a PDF document it turns out to be the \object control word followed by the \objemb control word. When you link your RTF to an external object, which could be any URL, it will come out as the \object control word followed by \objautlink, or it could be \objlink as well.
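A classifier can key off exactly these nested control words. The sketch below assumes the text of a single `\object` group has already been isolated; `describe_object` and the label strings are hypothetical names chosen for illustration, and the regular expressions only cover the un-obfuscated form of the control words.

```python
import re
from typing import Dict, Optional

# Nested control words that say how the object is attached.
OBJECT_KINDS = {
    "objemb": "embedded",
    "objlink": "linked",
    "objautlink": "auto-linked",
    "objocx": "activex",
}

_KIND = re.compile(r"\\(objemb|objlink|objautlink|objocx)(?![a-zA-Z])")
# The class name lives in its own destination group: {\*\objclass Name}
_CLASS = re.compile(r"\{\\\*\\objclass\s+([^{}\\]+)\}")


def describe_object(group: str) -> Dict[str, Optional[str]]:
    """Report how an \\object group is attached and which class it names."""
    kind = _KIND.search(group)
    cls = _CLASS.search(group)
    return {
        "kind": OBJECT_KINDS[kind.group(1)] if kind else None,
        "objclass": cls.group(1).strip() if cls else None,
    }


linked = r"{\object\objautlink\objupdate{\*\objclass Word.Document.8}{\*\objdata 0105}}"
print(describe_object(linked))
# prints: {'kind': 'auto-linked', 'objclass': 'Word.Document.8'}
```

The `kind` decides whether to look for an external target or an embedded payload, and the `objclass` decides which analyzer should receive the extracted data.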

It could be either of them, link or auto link. If you choose to embed an ActiveX control, then this is exactly what you'll get, as you see here: an \object control word with \objocx. The object class determines what kind of object it is. If it is a PDF object you'll see the object class as AcroExch.Document.DC, depending on your version, and if it's a linked object you'll see \objautlink. So you'll see something like this over here, and these are the important control words when we are talking about

inspection of RTF, that is, building an automated inspection tool to classify RTF files. Here you can see some of the important OLE control words. \objemb indicates an embedded object; as you can see on the left side, there is the \object control word followed by the \objemb control word, and \objclass determines what kind of object it is. If it is a PDF it could be Acrobat, if it is a Word document it would be Word.Document, if it is OOXML it could be something else, and if it's a package then it would be the OLE class

Package. That's exactly what you can see on the left side of the screenshot. The other important control word is \objocx, which indicates that there is an ActiveX control inside the RTF file. If it's a linked object, as I said previously, it will be \object followed by the \objautlink control word, or it could be \objlink as well. These are some of the important control words related to OLE which should be taken care of when we are building an automated tool to classify RTF files, because we will need to extract data from all of these and then we will need to

inspect it further. If you look at the screenshots closely, there is a control word called \objdata, which is the data to be rendered by the OLE control. That rendering is based on the CLSID, which is determined by ole32.dll, which handles the OLE infrastructure in Office. So those are some of the important control words. Let's quickly take a look at how object initialization and loading happens when we embed any object into an RTF file. As I said, ole32.dll is the in-proc server for instantiating the OLE object. It handles the COM instantiation and

loading of the COM DLL into memory. The object class that accompanies the object data is the program ID (ProgID), which maps to the OLE control. As you could see in the previous slide, the object class is what indicates the program ID. Based on that string, ole32.dll will call a function called CLSIDFromProgID, as I've highlighted on the right side. Once it gets the unique class ID, it will look up that class ID in the registry, pick up the DLL from there, and load the DLL into memory. Then it will follow the

process which I have highlighted at the bottom: it will instantiate a COM object using CoCreateInstance and then hand over the \objdata for rendering. This is how OLE object initialization takes place in the Office suite. Now, as far as attackers are concerned, this opens a huge attack surface, because if you think about it, any DLL which is mapped to a CLSID could be loaded into Office. What attackers can supply here is the class ID as well as the

data which has to be rendered by the COM object. Let's talk a little about object linking. As I indicated earlier, if you go to Object, click on Create from File, provide a URL in the file name field and then check Link to File, you are linking an object, and as I said previously, the result will look something like this. So how is the data stored whenever an object is linked or

embedded into RTF? This is how it looks. You will have an \object control word, which means that there is an object either linked or embedded, and which of the two is determined by the next control word that follows the \object control word. Since we linked it to an external file, we see \objautlink there. If you look at the spec, it is an OLE auto-link type of object, which links your RTF file to an external resource. When you open the document, it will download the file and then invoke the

respective handler. For example, if you have seen CVE-2017-0199, which was an infamous vulnerability exploited until around 2019, it linked the RTF file to an external HTA file, downloaded the HTA file, invoked the HTA handler, and ended up executing arbitrary code. That's how object linking opens up the attack surface. Now, if you look, the object class is Word.Document.8, which indicates that there is an OLE2 compound document file format, which is highlighted in

blue here. The one which you see highlighted in red is the nested control word which is visible when you link your RTF file to an external resource. The one which you see highlighted in green is the OLE 1.0 native data size. The entire object data is stored in the OLE 1.0 native stream, and the last four bytes of the native stream header are the OLE 1.0 native data size, which indicates the size of the OLE data that the RTF carries. What follows next

is the OLE compound binary file format. This compound binary file is the data which has to be rendered by the application intended by the respective OLE control. This is the normal DOC file format: if you look, the signature is D0 CF 11 E0, which means that it is the compound binary file format we used to see before Office 2007. So the entire OLE data is embedded inside the OLE 2.0 format. Let's see how object embedding takes place. It's

basically very similar to the linked object. The only difference is that the control word changes: instead of \objlink it will have \objemb. What I've shown here is an OOXML file embedded within the RTF file. The OOXML file here was an exploit which was delivered via RTF. If you look, the OOXML file has been embedded inside the OLE2 compound document. The program ID string here is Word.Document.12, which indicates that there is an OOXML document embedded inside the RTF file. Now, along with

OOXML, there are many other file formats which could be embedded inside RTF, for example Flash files, PDF documents, ActiveX controls, images and so on. How this helps attackers is that it allows them to use RTF as a carrier for other file formats, which is essentially a huge attack surface. And since RTF has limited visibility on the network and within the IPS or other perimeter security devices, this hides your actual exploit within the RTF and allows you to deliver it to the victims. If you extract the OOXML file and inspect it, you can see

there's shellcode embedded inside: there was a malicious ActiveX control, and if you inspect this ActiveX control, there was shellcode embedded inside it. So essentially attackers were able to deliver the OOXML exploit via the RTF file format. That's what embedding allows you to do. Talking about OLE packages, this is another attack surface which has been exploited widely by malicious actors. What it allows you to do is embed malicious payloads. If you click on

Object and go to Package, you can embed basically any file format. There is no particular file format associated with OLE packages: it could be VBScript, JavaScript, an executable or an XML file. So an attacker can come here and embed any file format they want, and the respective handler will be invoked by the OLE infrastructure based on the file type. What essentially happens here is that packager.dll is responsible for loading the

package data from the document and then processing it. There was a serious issue reported in OLE packages a few years back; in fact it was reported by one of our researchers. Whatever files you embed via OLE packages used to be dropped into the temp directory, and this behavior of OLE packages has been abused heavily by attackers over the past few years. As you see here, a malicious RTF drops the embedded OLE package into the temp directory, and this is exactly the behavior which comes out of

packager.dll. packager.dll processes the package data and drops the embedded file or payload into the temp directory. This behavior can be abused by attackers in many ways, because there are many applications and we never know how an application uses the temp directory. It could pick up the dropped file from the temp directory and execute it. This behavior was used by attackers to target a lot of victims, so this is something we need to take care of when we are building an automated tool for RTF classification as well.

We need to inspect the OLE packages, see if there is any package data there, extract the file, and then probably inspect it through some other engine or a sandbox. So, summarizing the OLE attack surface that attackers have been abusing: one is CLSID-based loading of DLLs. As I indicated, there are many OLE objects in Windows, and a logic flaw in any of them could lead to an entire Office compromise. What attackers can supply here is the CLSID, and they can supply the relevant data

to be processed by that DLL. This DLL could be used to build an entire exploit, it could help attackers bypass Windows mitigations, or it could help attackers exploit memory corruption flaws. So this has been one of the primary attack surfaces. OLE packages have been used by attackers to drop the payload on disk because of the functionality they provide, and since no file format is associated with packages, a package can embed any form of file. The other attack vector in OLE objects is primarily logic flaws.

There could be parsing logic flaws in the OLE objects. For instance, there could be an OLE object which processes your XML data, and if there is any parsing flaw in that OLE object, it could lead to an Office compromise. That's one of the facts that attackers have been taking advantage of. Many OLE objects can also allow attackers to link your RTF to an external resource and invoke the respective handler to execute that external resource. If it is an HTA file, it can invoke mshta.exe to execute

the HTA file, and that can lead to arbitrary code execution. So let's touch upon RTF file structure parsing and how we can build the inspection. What we primarily need here is a robust RTF document parser which can parse the destination control words and extract the data streams out of them. It is critical for the parser to handle stream obfuscations, because attackers can take advantage of control word obfuscations to evade static detections, AV detections, or perimeter detections if they are available in any

perimeter solutions. So we need a robust RTF document parser, and since all the OLE objects are embedded in the OLE2 compound document format, we need a compound document format parser as well, an OLE2 parser. The other one is an OLE package structure parser for extracting the embedded payload and sending it somewhere else for analysis, which could be a sandbox or any other AV detection solution. So we need three parsers: an RTF document parser, an OLE 2.0 compound document parser, and an OLE package structure parser. These three parsers allow us to extract the data

from the RTF file and inspect the relevant sections of it, and other inspection modules can be integrated if needed. For instance, if there is an OOXML file embedded, we could have an OOXML analyzer which inspects it. We could have a PDF analyzer plugged into this RTF inspection module to inspect PDF documents, and similarly for Flash files. Any other file format analyzer can be plugged in, which helps us identify and extract the embedded file and send it to the relevant file format analyzer.
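The plug-in idea can be sketched as a small registry that maps a payload kind to an analyzer function. Everything here is hypothetical scaffolding rather than the architecture of the actual engine: the names `analyzer`, `sniff` and `inspect_payload` are made up for illustration, the file-type guess relies only on well-known leading bytes, and the two analyzers are placeholders that stand in for real format parsers.

```python
from typing import Callable, Dict, List

Analyzer = Callable[[bytes], List[str]]
ANALYZERS: Dict[str, Analyzer] = {}


def analyzer(kind: str) -> Callable[[Analyzer], Analyzer]:
    """Register a function as the analyzer for one payload kind."""
    def register(fn: Analyzer) -> Analyzer:
        ANALYZERS[kind] = fn
        return fn
    return register


def sniff(payload: bytes) -> str:
    """Guess the payload kind from well-known leading bytes."""
    if payload.startswith(b"%PDF-"):
        return "pdf"
    if payload.startswith(b"PK\x03\x04"):        # ZIP container, used by OOXML
        return "ooxml"
    if payload[:3] in (b"FWS", b"CWS", b"ZWS"):  # Flash (SWF) variants
        return "flash"
    return "unknown"


@analyzer("pdf")
def inspect_pdf(payload: bytes) -> List[str]:
    # Placeholder check only; a real PDF analyzer would parse the object tree.
    return ["pdf: contains /JavaScript"] if b"/JavaScript" in payload else []


@analyzer("ooxml")
def inspect_ooxml(payload: bytes) -> List[str]:
    # Placeholder check only; a real analyzer would unzip and walk the parts.
    return ["ooxml: references activeX part"] if b"activeX" in payload else []


def inspect_payload(payload: bytes) -> List[str]:
    """Route an extracted payload to its analyzer and return its findings."""
    kind = sniff(payload)
    handler = ANALYZERS.get(kind)
    return handler(payload) if handler else [f"{kind}: no analyzer registered"]


print(inspect_payload(b"%PDF-1.7 /JavaScript"))  # prints: ['pdf: contains /JavaScript']
```

Adding a Flash analyzer or a sandbox submission step would then be one more decorated function, with no change to the extraction code that calls `inspect_payload`.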

What is our detection focus here? The detection focus is to detect weaponized exploits, so we want to identify exploitation methods instead of just vulnerabilities. For instance, if an RTF links to an external resource such as an HTA file, a JS file or any executable file, it should be an eye-catcher; it should be looked at with a bit of suspicion. So we want to detect the exploitation methods and we want to extract all the data. We even want to inspect non-OLE control words, and there are a few of those, for instance \datastore or \themedata

or any other control words which have been abused by attackers, so we want to inspect them as well. There are many other RTF control words which can carry streams of data, and any of them can be abused, so we want to extract that data and inspect it too. We want to extract the data streams of the OLE control words and inspect them as well. The OLE control words are primarily \objemb, \objocx, \objlink, \objautlink, \objhtml and the like. All these control words carry \objdata, and all these data streams need to be

extracted and inspected further. Another primary thing is that we also want to look at the RTF overlay data, which is data appended at the end of the RTF file. Many times it has been used to hide shellcode or other malicious resources which help attackers execute the attack. In terms of OLE 2.0, we want to inspect the content stream, because that's where the embedded file will be. If an attacker embeds an OOXML file or a PDF file, you will find it in the content stream of the OLE 2.0 document. So we want to inspect,

we want to extract the OLE 2.0 data streams, all the object storages and object streams, and inspect them further to see if there is any malicious content there. This is the high-level block diagram. We have an RTF structure parser, an OLE structure parser, a compound binary file format parser and an Office OOXML parser. We can have Flash file detection, PDF file detection and a sandboxing solution plugged into this module. It is scalable: you can plug in other modules as well, as per the inspection requirements, and

This will help us parse the RTF and then use other modules, which can inspect other file formats, to see whether the RTF file comes out as malicious or not. This is the flow diagram in a little more detail. We want to inspect the OLE objects, so first we want to look at all the OLE objects with the \objemb, \objautlink, and \objocx control words. Then we want to look at the object class, which may be Package, and apart from that we want to look at the many other object class arguments as well: Word.Document.8, Word.Document.12,

or AcroExch.Document.DC, which basically means a PDF document. Based on the \objclass, we can extract the relevant file and forward it to the respective module. Since the data is embedded in the OLE 1.0 native stream, we have to extract the native stream and, depending on the size of the OLE 2 compound document, extract that and inspect all the streams for malicious data, as well as inspect the content stream for an embedded file; then we can forward the file to the respective analyzers. For OLE packages, we can extract them based on the object class.
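
The routing by object class can be sketched as a simple lookup. This is an illustrative assumption, not the talk's implementation: the module names on the right-hand side are hypothetical labels, the class list covers only the ProgIDs mentioned above, and the regular expression assumes an unobfuscated \objclass destination.

```python
import re
from typing import List, Optional

# Value of the \objclass destination, up to the next brace or control word.
OBJCLASS = re.compile(rb"\\objclass\s+([^{}\\]+)")

# (lower-cased ProgID prefix, hypothetical module name) pairs.
ROUTES = [
    ("package", "package_parser"),
    ("word.document.8", "cfb_parser"),
    ("word.document.12", "ooxml_parser"),
    ("acroexch.document", "pdf_detector"),
]


def object_classes(rtf: bytes) -> List[str]:
    """Return every declared object class in the document."""
    return [m.group(1).decode("ascii", "replace").strip() for m in OBJCLASS.finditer(rtf)]


def route_for(objclass: str) -> Optional[str]:
    """Pick the module for a declared class, or None if the class is unknown."""
    name = objclass.lower()
    for prefix, module in ROUTES:
        if name.startswith(prefix):
            return module
    return None


# Hypothetical sample declaring an embedded OOXML document.
SAMPLE = rb"{\rtf1{\object\objemb{\*\objclass Word.Document.12}{\*\objdata 00}}}"

for cls in object_classes(SAMPLE):
    print(cls, "->", route_for(cls))
```

Because the declared class is attacker-controlled, a `None` result (or a class that disagrees with the actual payload bytes) is probably better treated as a reason for closer inspection than as a reason to skip the object.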

If we find the ProgID, the programmatic identifier string, that is, \objclass equal to Package, we can extract that package data and forward it to the package parser to extract any embedded scripts or any embedded executable. At a high level, this is how the RTF inspection can take place. And of course we need to inspect the overlay data and the non-OLE control words as well, because attackers can abuse these to hide malicious resources, as I said earlier. So let's take a look at the engine output and some of the initial results. As you can see here, there is an OOXML object which is

embedded inside the RTF, and the OOXML file itself is malicious. The engine is able to extract the OOXML file out of the OLE 2 compound document and then apply the binary stream inspection heuristics, and it turned out that the OOXML file embedded in the RTF document was malicious. You can see all the malicious heuristics that triggered here, eventually leading to classification of the RTF file as malicious. It was able to extract the OOXML file and reinspect it for further malicious content. So this was one of the

examples. Recently, last year, there was a targeted attack operation which targeted the aerospace and defense industry. What we saw here is that the \objclass was Word.Document.12, which essentially means it was an OOXML document. We went inside this OOXML document and inspected it further, and the attack came out to be a template injection: the embedded OOXML document was linked to another OOXML document on an external source, the C&C server, which was then downloaded and executed. The external one was the real exploit, but when we inspected this document, it came out to be a template

injection technique which was used by the attackers. These kinds of techniques are very important to inspect when we are classifying RTF files, and since RTF, as I said, allows embedding any kind of file format, we have to be able to extract these files and inspect them further. Coming to the initial testing results: we've tested this engine over a variety of exploits, starting from 2012 to around 20, and we've tested 15 000 exploits, around 2000 exploits, and with close to 50 RTF inspection heuristics implemented we came to about a 94.3

percent detection rate. So we had close to 50 heuristics implemented in the RTF inspection engine, which includes RTF heuristics, OLE compound binary file format heuristics, and OOXML heuristics as well. As for false positives, we've had very few with this engine, 0.60 percent, which is very encouraging. With that, I close this discussion, and I'm open to any questions you have.

Thank you so much, Chintan, for that great deep dive into a file format that probably doesn't get as much attention as the normal DOCX files and the like. Really interesting stuff. And I guess

one question I have, for someone maybe getting started in this space and maybe not at the level of implementing some of the automated parsing and categorization: what sort of tools would you recommend to someone analyzing RTF files and the embedded OLE objects in them? I know Didier Stevens has a few popular ones. Is there anything you'd recommend for people starting out in this space?

Yes, sure. There are RTF parsers available, so instead of reinventing the entire wheel and coding the RTF parser yourself, there are open source RTF parsers available. Those who are interested in getting started in this space can

visit the Decalage RTF parser. It is basically part of an entire suite, an OLE parsing framework. What we have done here is modify the RTF parser to suit our requirements, because there could be many corner cases which may not be covered by the open source parsers, so we have to take care of those as well, based on the exploits available. So one is the Decalage RTF parser, and there is rtfdump as well, which is another open source tool

available to get started in this space. There are many other OLE 2 parsers as well, so the entire Decalage office parsing suite can be used to implement this engine.

Yeah, thanks for that. I noticed your research on this subject is available on the McAfee website; I've provided a link to your blog article to give people an opportunity to read through it afterwards as well. Okay, thank you so much, Chintan, for that. Thank you, and again, a virtual round of applause from everyone.
