
hi everyone, my name is Chandrani, I'm from the Adobe security team, where I work as an adversary intel engineer
this presentation is about the career transition journey I went through at Adobe: the different security roles I've had, what I learned along the way, and what the transition has been like. The agenda is simple. I joined Adobe as an application security researcher, and last year I moved into a role as a data-driven adversary intel engineer, so I'll cover the pain and the gain, what I learned in both of those roles, and the challenges I faced. Before going into the details:
standing on the shoulders of giants: everything I'm going to present today is based on the work and knowledge I've acquired from my past experience, and on a lot of scholarly work in this domain; the opinions, however, are my own, and they are based on my perception of security. I joined Adobe as an application security researcher; prior to that I was at Oracle, primarily as a security developer. At Adobe I was responsible primarily for threat modeling. We would be assigned a product line covering some of the Adobe products, and whenever a product was going through a feature change or any new modification, we would look at the architecture diagram, perform STRIDE on that particular service, and proactively try to find out which security controls we could put in place. As part of that work we would also do some light pen testing: if something looked alarming to us, we would ask for a sandbox environment and do some hands-on testing. There are also various different security teams at Adobe doing various different types of testing, for example pen testing and the bug bounty program, so we were responsible for looking at all those results, prioritizing them, and telling the product teams, okay, this is the one you should look at today rather than tomorrow. It was basically about giving the product teams a holistic view, holistic security guidance.
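To make the STRIDE exercise mentioned above concrete, here is a minimal illustrative sketch; the component name and the review questions are hypothetical examples, not taken from any actual Adobe review:

```python
# Minimal STRIDE walk over one architecture component.
# Component name and question wording are hypothetical, purely to
# illustrate the threat-modeling exercise described in the talk.
STRIDE = {
    "Spoofing": "Can a caller impersonate another identity?",
    "Tampering": "Can data be modified in transit or at rest?",
    "Repudiation": "Can an actor deny an action for lack of audit logs?",
    "Information disclosure": "Can data leak to unauthorized parties?",
    "Denial of service": "Can the service be made unavailable?",
    "Elevation of privilege": "Can a user gain rights they should not have?",
}

def stride_review(component: str) -> list[str]:
    """Return one review question per STRIDE category for a component."""
    return [f"{component} / {cat}: {q}" for cat, q in STRIDE.items()]

questions = stride_review("upload-service")
```

In practice each question would be asked against the architecture diagram of the service under review, and the answers would drive which security controls get proposed.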
apart from this, we were also encouraged to write automation scripts that could find security issues at scale. However, the story behind the curtain is that what we did was more like a governance ribbon: we were not looking at the data per se, at where adversary interest was going or which vulnerabilities they were actually trying to exploit at volume; it was more of a one-size-fits-all kind of attitude. At the time the process at Adobe was also hard to scale, in the sense that the product teams would submit a request, it would be triaged into one of our queues, and we would pick it up based on priority. We tried to employ certain mechanisms to make the process seamless, but there was a lot of manual work involved, it was not completely automated at that time, and there was little room for automation, because there was always a huge pile of review requests coming in, so we had really very little spare time to put in extra effort and build automation on our own. While all this was going on, for about three years, a major org change was happening at Adobe, starting around March 2022. A lot of revamping was going on inside security, and we were all hearing that a new team was going to form, called the adversary intel team. The team came with a lot of cool promises: we will do data-driven cybersecurity, we will look at adversary interest signals, things like that. I spoke to the manager a couple of times and understood that I probably had no knowledge of, and no past experience in, whatever the team was going to work on, but I still joined the team with a lot of courage. The transition path has not been very smooth. I still remember the first workshop we had as a team; it was initially just a three-member team, most of us from an appsec background at Adobe, and we were discussing what a different mindset we would need. In appsec you look at security best practices, at what the industry best practice is, things like that; here we were going to look at data and try to empower the product teams based on data trends. It was going to be a completely different mindset, and it needed a steep learning curve.
the more meetings I attended after that workshop, the more I understood where I was heading and how steep the learning curve would be, and then came the feelings: will I be able to do it? Am I making the right choice? Oh my God. Stepping out of my comfort zone was not easy, because when I understood what new skills I had to learn, the list was long; everything is probably not even listed here, because it would go on and on. I had to learn to work with data schemas programmatically: what the industry norms are, what the best practices are, what the most common attributes I needed to have were. I needed to learn methods of storing and processing data efficiently, and which data platform to use. I had taken a few machine learning courses in my grad days, but that needed a major brush-up, and the same was true for querying skills; in visualization I had no experience at all. So things didn't happen overnight. We have an Azure subscription, so Databricks was an obvious choice, and we started learning anything and everything we could about it, through LinkedIn Learning, Coursera, and paid courses where needed: how the underlying Databricks architecture works, how the distributed Spark data structures work, how a submitted job goes out to the various nodes, because when you face an error you need to understand that underlying architecture to deal with it. Then we learned PySpark. I won't go through all of it, but for machine learning I again took a course, the one offered by Stanford by Andrew Ng, and for data storytelling we also took a paid course. And it's not like we were given three months to complete all of this and come back; it's a continuous process in our team: learn and deliver, learn and deliver. So, enough of learning, now show me some work; Adobe is not paying me just to learn, I have to deliver something. The first project we did came from the fact that, as I was saying, there are multiple security teams at Adobe capturing various test results: there is a pen testing team, there is dynamic testing, someone is running tools like Wireshark. They are capturing multiple attributes in multiple ways, and we wanted a one-stop solution where we could see the trends in the data: where adversary interest is going, where we should focus more, where we should tell the product teams, okay, this area is more important than that one. So we came up with this problem statement: uniformly capture security data and have a single source of truth for all security issues at Adobe. For this we built an automated pipeline that works as that single source of truth. The first stage was data ingestion: we sat with the six different testing teams, understood what attributes they were capturing, and worked out which common ones to keep. Then, in the data processing layer in Azure Databricks, we do further data refining. In Databricks there is a concept of bronze, silver, and gold layers: bronze is the raw data, the raw input you get from the teams
then you do some filtering, some deduplication of the data, some processing, you add timestamps and things like that, and you call that the silver data. Finally you maybe add some machine learning algorithm or some aggregation logic on top of it, and you write that to a Databricks Delta table, which becomes our final aggregation. We then connected the Power BI desktop client to it to put a nice graphical structure on top. Coming to the implementation part: for the schema definition I used the Python pydantic library, which basically lets you create a self-validating schema. For example, if someone submits data in JSON or CSV and an attribute does not match what is in the schema, it will complain before the data is submitted and say, correct it. In the next step of the process, the teams were automatically sending all these test results to the different storage accounts we had set up, and those storage accounts were securely connected to mount locations in Databricks, where we had Auto Loaders running. An Auto Loader is nothing but, you could say, a program that programmatically picks up files from the mount location you have configured and processes them however you have specified, adding details such as the file timestamp or yearly partitions. It also has a lot of advantages, such as a fail-safe mechanism: for example, if a column does not match what you are expecting, it captures the mismatch in an extra rescued-data column and moves on, so it doesn't error out immediately. Finally we write everything to the Delta table and connect it to the visualization layer, which is Power BI. This was the report we generated; it was very well appreciated, and we also shared it with our customers. The next project that we did was
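The self-validating-schema idea described above can be sketched without any third-party dependency; the actual pipeline used pydantic, and the attribute names and allowed severities below are hypothetical, not Adobe's real schema:

```python
from dataclasses import dataclass

# Dependency-free sketch of the self-validating-schema idea (the real
# pipeline used pydantic). Attribute names and the severity list are
# invented for illustration.
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}

@dataclass
class SecurityFinding:
    team: str
    title: str
    severity: str

    def __post_init__(self):
        # Complain at construction time, before anything is submitted
        # downstream, mirroring how a pydantic model validates on load.
        if not self.team or not self.title:
            raise ValueError("team and title must be non-empty")
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

ok = SecurityFinding(team="pentest", title="XSS in search", severity="high")
try:
    SecurityFinding(team="pentest", title="XSS", severity="urgent")
    rejected = False
except ValueError:
    rejected = True
```

The point is the same as in the talk: a record that does not match the schema is rejected at the door, before it ever reaches the bronze layer.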
around a concept we are currently also working on at Adobe, called adversary personas. There are different attackers, and they employ different attack techniques, tools, and so on: for example, there is a researcher persona employing certain tactics, there is a phishing persona, there is the bug bounty hunter. What we have at our disposal is a long text description that comes with the attack tactics and techniques: maybe we are getting the description from the SSC team, or, if it's a bug bounty hunter, we are getting the Jira ticket description, what payloads they used, what steps they used. What we wanted to find out from this text is whether there is any similarity among these personas, so we set out to develop a text-based clustering algorithm that captures the similarity of the attack methods employed by the different personas. (The text I have placed here is not related to anything we actually used.) First we run the preprocessing step: we do lemmatization and stop-word removal, which gives us the processed text. The next step is the vectorization method: here I picked Word2Vec, which applies CBOW, the continuous bag-of-words model. Basically, you give it a word and, based on the vectors that come out, it gives you the nearest neighbors of that word. On top of that we apply the k-means clustering algorithm to get some segmentation of the adversary personas. That was the second project, and it was also a very well appreciated project. The third project: we picked a single, specific persona, and we wanted to see the most common TTPs and vulnerabilities they were using, because a single persona,
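The preprocess-vectorize-cluster pipeline above can be sketched in miniature. This is a dependency-free toy: the descriptions are invented, plain bag-of-words counts stand in for the Word2Vec/CBOW embeddings the real project used, and the k-means initialization is fixed so the run is deterministic:

```python
# Toy sketch of the clustering stage: stop-word removal, bag-of-words
# vectorization (standing in for Word2Vec/CBOW embeddings), and a small
# k-means loop. All descriptions are invented, not real incident text.
STOP_WORDS = {"the", "a", "on", "in", "and", "of", "based"}

docs = [
    "sql injection payload on the login form",
    "union based sql injection attack",
    "phishing email harvesting a credential",
    "phishing lure in a fake login email",
]

def tokenize(text):
    return [w for w in text.lower().split() if w not in STOP_WORDS]

vocab = sorted({w for d in docs for w in tokenize(d)})

def vectorize(text):
    toks = tokenize(text)
    return [toks.count(w) for w in vocab]

vectors = [vectorize(d) for d in docs]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, centroids, iters=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

# Deterministic init: one seed vector from each obvious group.
labels = kmeans(vectors, [list(vectors[0]), list(vectors[2])])
```

On this toy data the two injection-style descriptions land in one cluster and the two phishing-style descriptions in the other, which is the kind of persona segmentation the project was after.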
depending on the system and the product they are attacking, can employ various different tactics, right? So we wanted to see which vulnerability they were using the most, and again what we had at our disposal were long security incident descriptions, coming from various vendors or from the SSC, which carry the footprints. So that was our problem statement. Here we did not go the machine learning route; instead we went the advanced data processing and advanced query language route. After getting the data into a data frame we applied some deduplication logic and initial data processing, but if you look at the fourth row here on the right-hand side, everything is clubbed together: the MITRE identifiers come together, and the ransomware and adversary names come together too. So the next step we did was data normalization, where we normalize the data so that each of those becomes one single column: in this one only the TTPs are in a column, then I have the CVEs. These are again JSON arrays, so we take each JSON array and use the PySpark SQL functions, for example explode, to turn it into separate rows, so that we can write an aggregate function on top of it and see which one is the topmost. The story behind the curtain is that it has not been very smooth. There have been days when I faced Databricks errors and had no clue how to deal with them, searched the internet, or raised a service ticket with Microsoft. For machine learning there is always a plethora of information available, but choosing the approach that will work for you and obtaining accuracy has always been a challenge; we are still learning and developing.
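The normalize, explode, and aggregate steps described above can be sketched without Spark. This is a plain-Python equivalent of the PySpark explode-then-count logic; the incident rows and technique IDs are invented for illustration:

```python
from collections import Counter

# Plain-Python equivalent of the PySpark normalize/explode/aggregate
# step. Each record's "ttps" field is a JSON-style array; the incident
# rows and technique IDs are invented examples.
incidents = [
    {"incident": "INC-1", "ttps": ["T1566", "T1059"]},
    {"incident": "INC-2", "ttps": ["T1566"]},
    {"incident": "INC-3", "ttps": ["T1566", "T1027"]},
]

# "Explode": one (incident, ttp) row per array element, analogous to
# F.explode("ttps") in PySpark.
exploded = [(rec["incident"], ttp)
            for rec in incidents
            for ttp in rec["ttps"]]

# Aggregate: count occurrences per technique, analogous to
# groupBy("ttp").count() ordered by count descending.
top_ttps = Counter(ttp for _, ttp in exploded).most_common()
```

The first entry of `top_ttps` is the most-used technique, which is exactly the "which one is the topmost" question the aggregation answers for a persona.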
the team we are on is very much a startup setup: we have to continuously learn and deliver, at a very fast pace. The key takeaways: I think it takes courage to come out of your comfort zone and do something new, but I feel it is worth taking the risk. If you put your soul into it, if you really want to do something, put your dedication into it, and you will see the light at the end of the tunnel. Different perspectives on security are equally important: initially, as an application security researcher, I was doing threat modeling and architecture review, which is a very important security exercise in a product's security life cycle; the data-driven security piece we are working on now, looking at adversary interest signals, gives a whole new perspective on where adversaries are going. And it's not always that you have to change jobs to get to a whole new exciting role; you can stay in the same company and work in different roles. These are some of the certifications and trainings I have completed, and yeah, that's all. Any questions?
thank you
are you asking about the transition journey, or about the projects we have done? What did we find? Yeah, that's a very good question. Let's take the example of Photoshop. When we applied these techniques there, we saw in the adversary interest signals a lot of domain-squatting problems and a lot of phishing problems, so we are telling the teams, okay, go focus more on those areas. Security is vast, and often when we tell the product teams, you have to do this and this, that also increases their work; but when we give them a very focused direction, I think that helps them, because we are telling them with data, which gives them some proof as well: okay, we have this data showing this vulnerability has been exploited this much for this product, so you should definitely go look at it. I think that is helping. Hope that answers your question.
yeah, so before, I was focused more on how we should be doing it, but now we are questioning what we should go after; that is the question we try to solve first, and then the how. And sorry for not being in presentation mode, I couldn't really fix it; this was my first presentation. So yeah, thank you so much. [Applause]