
BSidesTLV crew knows. Please raise up the ceiling for Ariel Szarf and Doron Karmi! Yeah, first-time speakers. We love our first-time speakers, and they are on brand. Oh yeah, look at that, look at how sharp they are. So, Ariel Szarf, but you should be called Ariel Sharp, yes? Yeah, you're being sharp, so that's your new name. They're going to walk us through the wild, wild world of Google Workspace and Google Cloud, and into security investigations. Now, I don't think there is a single soul in this room that does not use Google Cloud or Google Workspace. Heck, even my baby daughter, who was just on stage, just got a Gmail
account. So we all need to learn from these gentlemen. Are you gentlemen? From these guys; we'll see if they're gentlemen soon. We'll see if you investigate and tell. We'll see what they have to share about Google Workspace and their investigations and forensics. Take it away, guys. Thank you, thank you. One more photo? Okay, Shlomi, you have to tell me to bring them in before I tell them to take it away. He said, let me try. Okay, all right. Now let's give them one more round of applause. Hey, don't be stingy, this is one of the last couple of talks of the day; you can do a big, nice applause for these guys. Okay, thank you.

So, hello everyone, and welcome
to our lecture, "Google Workspace Forensics: Insights from Real-World Hunts and Incident Response". I am Doron Karmi, my name is Ariel, and we are both senior cloud researchers at Mitiga. By the way, if you are home bakers or prefer animals over humans, we are looking for more researchers, so feel free to reach out. Let's dive into what we are going to cover today.

So, our agenda: we are going to talk about Google Workspace, and specifically about Google Workspace logs. We're going to talk about how to retrieve those logs, what challenges are in those logs, and what improvements we at Mitiga apply to those logs to make them more
readable. We're also going to talk about data exfiltration, specifically from Google Drive, including a real case that we found, and at the end we are going to talk about a particular visibility gap that we found in Google Drive.

So let's start. First of all, what is Google Workspace? I'm sure most of you know, but Google Workspace is a collection of cloud-based tools that help organizations and individuals collaborate more easily. It includes Gmail, Google Drive, Google Keep, Calendar, and more. You have to know that this is a very, very popular platform: there are more than six million paying businesses using Google Workspace, which makes it a huge opportunity for
bad actors to perform attacks against it, specifically around data exfiltration.

Let's talk a little bit about the logs themselves. The logs are divided by service: when you enable the logs, you will have a file for each service of Google Workspace, for example admin logs, Google Drive logs, Gmail logs, and more. Most of the services are collected in near real time, with some exceptions, and the typical retention period is six months. Today we are going to focus specifically on Google Drive. Those logs include, among other things, actions that are taken by the users, like sharing files,
accessing files, downloading files, and creating new files, all of them in Google Drive.

So let's see an example of what the logs look like. Okay, I'm missing one picture, but let's do it. This is an example of a log. As we can see, this is the event; it includes the name of the call and the parameters. In the picture that is missing here, you could see that there are also other forensically important fields, like the IP address and the user that actually made the call. You can also see here that the events field is a list, and it's comprised of several
events that are all tied together.

So the structure of the logs is not that easy to read, and there are some challenges that make the life of us, the investigators, harder. Let's see why. First of all, the user agent field is missing. This is a typical log of Google Drive: we can see several types of information included, like the IP address and the user, but there is no user agent. In forensic investigations the user agent is a key field that helps us detect abnormal activity, and in Google Drive logs you won't be able to find it.

In addition, you can find IP address inconsistencies in the log.
Wait, okay, another picture is missing, but you can find IP address inconsistencies in the logs. For example, you could see a few download events, all of them coming from the same user in the very same second, but from three different IP addresses, or a few different IP addresses. I will explain later why this could happen.

Another challenge is that in file-related log entries, such as download in this case, there is no path. You can see information about the document that was accessed, such as the document ID, the document type, and the document title, but you won't be able to know the path of
the file. And when you, as an investigator, come to investigate such a case of exfiltration, it's crucial to know where the files were taken from.

So let's see what we did at Mitiga to make those logs much easier to read and to investigate. Wait, sorry, another image is missing. Okay, I'm going to talk without the pictures, because they are missing. I'm not sure why, but just believe me, that's the case. So let's go back to the events, and I'll show you the slide from the beginning. Okay, so this is a typical event of Google Drive. As you can see, it's comprised of several
elements, each one of them an event, and this is a typical log record. Each event is comprised of the name of the event and the parameters that are tied to it. This is not that easy to ingest and to investigate, because it is a list of events. In each log record, one of the events will be marked as the primary event (here it's set to true). This primary event is the action that the user actually took, and all the rest of the events are events that were generated because of this action. So what we do at Mitiga is
splitting those events. This is another slide, a screenshot from our tool of choice, which is PySpark, and this is after we process the event: we split each sub-event into a new dedicated row. As you can see here, there are two rows, and these used to be part of the same log record. Think about the case where you are trying to look for all the documents that were ever shared, or rather that were ever made public, in your environment. With our technique it's much easier, because you just need to look for the event name and you will be able to see all the
documents that were shared. In the original raw log, you need to iterate through all the events, and through all the lists in those events, in order to be able to conclude that.
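The event splitting described above can be sketched roughly as follows. The speakers use PySpark; this sketch uses plain Python only to stay self-contained. The sample record is hypothetical, and the field names (`id`, `actor`, `ipAddress`, `events`, `name`, `parameters`, `boolValue`) are assumptions about how a Drive audit record is shaped, not a copy of the speakers' data or code.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical sample: one log record that carries two events.
RECORD: Dict[str, Any] = {
    "id": {"time": "2023-02-27T10:15:00Z", "applicationName": "drive"},
    "actor": {"email": "user@example.com"},
    "ipAddress": "203.0.113.7",
    "events": [
        {"name": "change_document_visibility",
         "parameters": [{"name": "primary_event", "boolValue": True}]},
        {"name": "change_document_access_scope",
         "parameters": [{"name": "primary_event", "boolValue": False}]},
    ],
}


def split_events(record: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one row per event, copying the record-level fields onto each row."""
    shared = {key: value for key, value in record.items() if key != "events"}
    for event in record.get("events", []):
        yield {
            **shared,
            "event_name": event["name"],
            "parameters": event.get("parameters", []),
        }


rows: List[Dict[str, Any]] = list(split_events(RECORD))

# With one row per event, "which documents changed visibility?" becomes
# a plain filter on the event name instead of a nested iteration.
visibility_changes = [
    row for row in rows if row["event_name"] == "change_document_visibility"
]
print(len(rows), len(visibility_changes))
```

Each output row keeps the actor, IP address, and timestamp of the original record, so no context is lost by the split.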
Okay, let's go back again, because I need an example.
So, back here, this is an event and those are the parameters. Each event is comprised of a list of dictionaries that represent the parameters of the call. For example, in this case we can see two examples (there are more) of parameters of this call, upload: one of them is primary event, the second one is billable. There are two types of keys inside each dictionary. One of them is name, which is the name of the parameter, and there are four other type-related keys that represent the type of the parameter. For example, in this case we can see that primary event is a Boolean and billable is also a Boolean, but in the picture that is not here you could see other examples,
but okay. So you will see the parameter name, and only one of the type-related fields will be populated, depending on the type of the parameter. In this case these are Booleans, but it could be value, which is a string, or it could be multi-value or int. So what we do at Mitiga in order to make it easier (and this picture I do have)
is flatten all the dictionaries and omit the type-related fields. With this technique we actually lose the type of the parameter, but we still have what actually matters, which is the parameter name and the parameter value. That makes the investigation much faster and much easier, and we are able to access the parameters, or ask questions about those parameters, more quickly.

Lastly, we contextualize the log records with the originating app name. Again, no picture, but the originating app is something you could find in some calls, and this depends on whether a user took the action or an application did. So if
an application took the action, you will see the originating app ID. This ID represents the application, and we enrich the data with the application name, so the investigator is able to ask questions about the application that took the action, whether it is a first-party or a third-party application. In this example I think this is Google Drive; in the picture that is not here it was Slack.

One important thing to note is that when an application takes an action on behalf of a user, the IP address in some cases will be the IP address of the hosting provider of the application, and this is why in some cases you will see
inconsistencies in the IP address. You can see one entry coming from one user with one IP and then, a second later, another entry from the same user from a different IP address. Again, this could mislead and misdirect the investigator while performing an investigation.

Now we are going to talk about a specific data exfiltration case, and Ariel, over to you.

Thanks, Doron. Now we are going to focus on data exfiltration from Google Drive. Just a minute. Cool, let's start from the basics. There are six event names that may be related to data exfiltration from Google Drive. The most obvious is download, of course. A threat actor can also
view files, send them in an email as attachments, or print them. Note that they can print them to PDF, so they don't need to physically print them in order to exfiltrate the data. They can preview them, and, least intuitively, they can copy them to a more convenient location, for example to a public folder.

Just a minute. When you suspect a user, you can check which exfiltration-related events they performed. Also, when you perform a threat hunt, you can search for anomalies in the appearances of these events in order to generate leads to investigate. In this example, each line is a user, and you can see how many exfiltration-related events each user performed over time.
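The per-user counting described above could look something like the sketch below. The six event names are assumptions based on the actions listed in the talk (verify them against your own logs), the row shape (`user`, `time`, `event_name`) is hypothetical, and the fixed threshold is only a stand-in for whatever anomaly logic a real hunt would use.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Assumed names for the six exfiltration-related events; check your own logs.
EXFIL_EVENTS = {
    "download", "view", "print", "preview", "copy", "email_as_attachment",
}


def daily_exfil_counts(rows: Iterable[Dict[str, str]]) -> Counter:
    """Count exfiltration-related events per (user, day)."""
    counts: Counter = Counter()
    for row in rows:
        if row["event_name"] in EXFIL_EVENTS:
            day = row["time"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
            counts[(row["user"], day)] += 1
    return counts


def flag_spikes(counts: Counter, threshold: int) -> List[Tuple[str, str]]:
    """Return the (user, day) pairs whose count reaches the threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical rows, already split to one event per row.
ROWS = (
    [{"user": "alice@example.com", "time": "2023-02-26T09:00:00Z",
      "event_name": "download"}] * 3
    + [{"user": "bob@example.com", "time": "2023-02-27T11:30:00Z",
        "event_name": "download"}] * 50
    + [{"user": "bob@example.com", "time": "2023-02-27T11:31:00Z",
        "event_name": "edit"}]
)

counts = daily_exfil_counts(ROWS)
spikes = flag_spikes(counts, threshold=20)
print(spikes)
```

Plotting `counts` per user over time would give a chart of the kind shown on the slide; the threshold filter just turns the visual spike into a list of leads.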
The green user, for example (the green line), performed approximately 20K exfiltration-related events on February 27th, and it can be really interesting to investigate that in a threat hunt.

Besides that, a threat actor can also share files publicly or with an external user. We are going to talk just about sharing files from a shared drive, not from a private drive. When you share a file or a folder, under the General access section you can choose the group: anyone with the link, your organization, or restricted. Restricted means that only users you explicitly mention here get permissions to this object. You can also choose the access scope here: viewer, commenter, or editor.
In our example, we changed the group under General access from our organization to anyone with the link. This click actually generated four events: two "change document visibility" events and two "change document access scope" events. In this table you also see three parameters extracted from the parameters column: target domain, old value, and new value. At first look it may be confusing, but when you look again you can actually see a pattern. In the first pair of events, the start state ("people within domain with link", a.k.a. our organization, with the "can view" access scope) changed to the clean state, "private" and "none". After that, in the next pair of events, the clean state changed to the
end state, "people with link" and "can view". Note that even though we didn't change the "can view" access scope, it still goes through the clean state.

Now let's talk about sharing a file or a folder with a concrete principal, in the upper section. When you share a file with a concrete principal, it's straightforward: there is one event, called "change user access", and the actor of this event is the user that actually performed the action. But when you share a folder with a concrete principal, something interesting happens in the logs. For the main folder there is one event, called "change user access", and the actor is the user that actually performed the action. But after that,
for each file and folder recursively under the main folder, there is a special event called "change user access hierarchy reconciled", and the actor of these events is the system. Note that all of these events are primary events and not part of a chain like the one Doron described.

Now, story time. Sometimes the simple checks are the most important. I want to tell you about one of our threat hunts with one of our customers. In this threat hunt we started from the basics and searched for anomalies in the appearances of the six events I just described, and we saw something really weird: an external user from the gmail.com domain performed approximately 15K download events at the same timestamp.
It's a lot, and it's really interesting to investigate. So the first question we asked ourselves was: what are the file paths? But as Doron stressed, there are no file paths in the logs. The straightforward solution is to use the API, but there are two problems with that. First, to perform API calls you need proper permissions, and when you are an investigator external to the organization, you don't always have them. Second, API calls give you the current state of the organization, and when you investigate you want the historical state. So we tried to think about what we could do based on the log
records only. We saw that for each file or folder creation there is a create event, but there is also an "add to folder" event. In this event's parameters there are the document ID and title, and also the destination folder ID and title. Based on these events, we built this table. In this table you can see the document IDs and the destination folder details, title and ID; the table is based on the "add to folder" events. Now, if you think about it, if we also have the "add to folder" events for the destination folders, we can search for these IDs in the document ID column and recursively build the paths. That's what we actually did.
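A minimal sketch of that path reconstruction is shown below, together with the parameter flattening described earlier in the talk. The event name `add_to_folder`, the parameter names (`doc_id`, `doc_title`, `destination_folder_id`, `destination_folder_title`), and the four type-related keys are assumptions about the log schema; the sample events are hypothetical, and this is not the speakers' actual implementation.

```python
from typing import Any, Dict, List, Tuple

# Assumed type-related keys; only one is expected to be set per parameter.
TYPE_KEYS = ("value", "multiValue", "intValue", "boolValue")


def flatten_parameters(parameters: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Collapse [{'name': n, '<typeKey>': v}, ...] into {n: v}, dropping the type."""
    flat: Dict[str, Any] = {}
    for param in parameters:
        for key in TYPE_KEYS:
            if key in param:
                flat[param["name"]] = param[key]
                break
    return flat


def build_parent_index(events: List[Dict[str, Any]]) -> Dict[str, Tuple[str, str]]:
    """Map each document ID to (parent folder ID, parent folder title)."""
    parents: Dict[str, Tuple[str, str]] = {}
    for event in events:
        if event["name"] != "add_to_folder":
            continue
        p = flatten_parameters(event["parameters"])
        # A later event for the same document overwrites an earlier one.
        parents[p["doc_id"]] = (p["destination_folder_id"],
                                p["destination_folder_title"])
    return parents


def build_path(doc_id: str, title: str,
               parents: Dict[str, Tuple[str, str]]) -> str:
    """Walk up the parent index; the path is partial if the logs run out."""
    parts = [title]
    seen = {doc_id}
    current = doc_id
    while current in parents:
        folder_id, folder_title = parents[current]
        if folder_id in seen:  # guard against a cycle in malformed data
            break
        parts.append(folder_title)
        seen.add(folder_id)
        current = folder_id
    return "/".join(reversed(parts))


def _add_to_folder(doc_id: str, title: str, dest_id: str, dest_title: str) -> Dict[str, Any]:
    """Build a hypothetical add_to_folder event for the example."""
    return {"name": "add_to_folder", "parameters": [
        {"name": "doc_id", "value": doc_id},
        {"name": "doc_title", "value": title},
        {"name": "destination_folder_id", "value": dest_id},
        {"name": "destination_folder_title", "value": dest_title},
    ]}


EVENTS = [
    _add_to_folder("d2", "Finance", "d1", "Shared"),
    _add_to_folder("f1", "report.docx", "d2", "Finance"),
]
parents = build_parent_index(EVENTS)
print(build_path("f1", "report.docx", parents))  # Shared/Finance/report.docx
```

Because the index is built from log records only, a document whose ancestors were created before the available log window gets a path that stops at the highest folder the logs mention.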
This table is from our lab, don't worry. Here you can see the event names and the timestamps, the document title from the parameters column, and here the calculated document path that we built with the recursive technique. Note, of course, that the paths might be partial with this technique, depending on the log time frame: if you don't have the create events and the "add to folder" events, you can't build the entire path.

So, back to the story. We understood these files were sensitive. Long story short, a former employee downloaded them after an admin gave him permissions, and the CISO took it from there.

Finally, I want to share with you a visibility gap we found in Google
Workspace a few weeks ago. When we investigate, we expect to see consistency in the logs. What do we mean? We already know that there is a download event, so every time a user downloads a file we see a download event in the logs, right? Well, it's not as simple as that. Let's talk a little bit about licenses in Google Workspace. Each user has the Cloud Identity Free license, which enables basic features, and the admin can also purchase paid licenses to enable more features. In this example, the organization has a paid license called Google Workspace Enterprise Plus, but this license isn't assigned to this user. In our research we found that if a user doesn't have any
paid license, there are no log records about their private drive at all. When I say private drive, I mean their private drive in the organization's Google Workspace. So they can copy files, download files, view files, create files, and so on, without any log record. It's crazy to think about.

Based on this finding, we tried to think about how a threat actor could exfiltrate not just the private drive but the shared drive of the organization with minimal log records. Now I want to show you a use case of how a threat actor can do something like that: exfiltrate the shared drive with minimal log records, using our finding. In this use case, the compromised user is an admin user, because an admin user has the
permissions to revoke and assign licenses. So the threat actor can revoke the paid license to avoid logs about the private drive of the compromised user, copy all the shared drive files to the private drive, download all the files from the private drive, and finally reassign the paid license to be as discreet as possible.

Now let's look at the logs together. For revoking and reassigning the license, there are relevant events, user license revoke and user license assignment, under the admin audit log. For copying the files, it's actually interesting. Usually, for a file copy in Google Drive there are two events, a source copy event and a copy event. These events are almost the same, so usually it's not interesting to monitor
both. But in our case there are no copy events, because there are no log records about the private drive. The source copy event is about the original files, and the copy event is about the destination files. So because there are no log records about the private drive, there are no copy events, just source copy events. Finally, for the download of the files, there are no log records.

Based on this research, we understood that in our threat hunts we should also search for license revoke and reassign events within a short time of each other, and also search for source copy events without related copy events. Thank you.

So let's see what we had today. A few
takeaways. First of all, collecting Google Workspace logs is not enough; some preprocessing is required to prepare the logs for investigation. We saw the events, we saw the parameters: the structure of the log is not trivial, some data is missing, and you need to know this in advance and be able to preprocess the logs beforehand. Second, your security team needs an early introduction to the log source. As I said, some data might be missing, such as the user agent, and visibility could be limited in some cases, so you need to be aware of the limitations of the log source in order to be able to investigate it
better. And lastly, the visibility gap that Ariel shared just a second ago might require monitoring events that are not trivial, for example the revocation and reassignment of a license to a user, to be able to understand that data was exfiltrated.

Thank you for coming, this was our lecture. If you would like to know more, these are the QR codes for our blog and advisory. And that's it, thank you.

Thank you, thank you, and thank you Ariel, thank you Doron. Now, we do apologize for the technical difficulties, and I believe these kind gentlemen will enjoy a beverage at our expense at the