
LinkedIn OPSEC, Targeting Analysis and Countermeasures

BSides NoVa · 2021 · 28:50 · 212 views · Published 2021-06
About this talk
Examines LinkedIn as a primary reconnaissance and targeting platform used by adversaries ranging from nation-states to financially motivated actors. Presents a data-driven methodology for identifying high-value targets through public profile enumeration and enrichment, then proposes defensive countermeasures including persona development, deception infrastructure, and platform-level security responsibilities.
Original YouTube description
Presented at BSidesNoVA 2021 on June 5th, 2021. This talk will review the continuing need for employee and corporate Operations Security (OPSEC) on social media platforms, with a focus on LinkedIn. We will review the historical parallels between the pre-information-age controls around “unguarded talk” and the threats, attack surface and responsibilities in the age of social media. We will walk through a simple methodology for creating a data-driven targeting model to identify and enrich potential targets on LinkedIn, and the countermeasures that users or defenders can employ to deny, disrupt or deceive an adversary. The primary focus of my research has been and will continue to be LinkedIn. Some see this platform as a professional networking site. I see this social media platform as an index of unlimited potential for targeting our interests. If we examine this threat, we can build the countermeasures that are needed to protect our assets, networks and, most importantly, our people. This talk will examine one targeting use case for the purpose of generating interesting data. We will then pivot the talk into using the same targeting data to generate countermeasures.
Transcript [en]

Welcome to BSides, and good morning. Today we're going to talk about LinkedIn OPSEC, targeting analysis, and countermeasures. It's a subject that I've researched for many years now, and I've had the opportunity to speak about different elements of it to both closed groups and open groups. Many researchers and I have shared our challenges around social media together. As to my background, you can find it, you can figure it out. The most important thing for me to get through in the beginning, of course, is my employer statement: my views, research, and the data expressed in this deck are my own. They do not reflect the views of my

current or previous employers. So when you look at the current state of social media, it's important to look back at the past, at the campaigns that were put out there, propaganda campaigns in some cases. Humans were a means to an end for an adversary; this was always the concern during the Second World War and going into the Cold War. I don't think things have changed that much, and maybe we're due for a campaign of this level, and maybe recent events will push us there. I've seen things that make me think we need to go in this direction again, and maybe a sign for this

could be the historical parallel today: maybe it's "loose social media profiles sink corporations, sink governments, sink industry." That's the way I think about it, and hopefully I'll shift some of your views on it. So the threat that we're discussing today is targeting against LinkedIn. Every adversary, be it a nation-state or a for-profit, financially motivated adversary, uses LinkedIn. Some do it better than others; some are willing to do more with it than others. To tell a good story with data, you use a threat actor, and I've done this many times, looking at North Korean APT or Lazarus groups specifically. I found it interesting that

they were targeting security researchers that were researching them, and targeting hackers, so that was interesting. And secondly, they were targeting COVID-19 research, and that'll play into the data that drives the story for today. So in 2018 the Senate had a hearing, and there have been multiple follow-ups to those hearings. The focus of those hearings was of course on Facebook and Twitter. Twitter is a giant echo chamber. It's great, but we know what it is; we know what damage it can potentially do, and I think it's not too much. Facebook is targeted advertising that can potentially influence someone. But the platform that has the data about what you do and what you have access to

is LinkedIn. So my question has always been: was the focus in the right place? Maybe that focus has been adjusted; I'm on the outside looking in, like most of us, so I'm hoping it will be adjusted. I always like to include some definitions, because many of us at BSides are new. OSINT is what I do; it's what many of us do. You're working to the left, before the adversary executes an operation, and that's where we want to spend our energy. It's passive adversary reconnaissance through publicly available means to allow for collection, analysis, and targeting. And OPSEC is your duty to understand what data is available to an adversary and how you protect that data. The

responsibilities are protecting it. That responsibility doesn't just fall on an individual; it falls onto a platform, and it falls onto a corporation. If you exhibit that you have access to my data, I believe those are the responsibilities of LinkedIn, of Twitter, of Facebook. Then there's corporate responsibility: if I work for a corporation and you are exhibiting that you have access to my crown jewels, I want to have data that lets me know that you're likely to be targeted. And as an individual, you don't want to be the one to be targeted; you don't want to be the reason that your company is breached. So targeted user analytics was a thing that I put together, a project that

I put together: a white paper, an attack model, and a data model, the combined red and blue team capabilities to identify and reduce this attack surface to our users. ARM is the attract, retain, and monetize model. You're the product, and that's something you need to constantly be aware of. I think in our community we understand this, but in order to monetize and attract more people to the platform, your data needs to be sold; it needs to be made available. That's the key to growing the model, and it's why I don't care for it. And then HIBP, we all know about; we all know what that is. So my hypothesis is that our adversaries are

using LinkedIn to harvest and target users with access to targeted data or systems, and that data-driven models can be built to simulate adversary behavior using openly available data gathered from social media platforms. Targets can be externally enriched using OSINT data, breach data, and other methods. Then there's internal enrichment: there are internal data sets that are only available to us as defenders, and they're extremely valuable when correlated against the targeting data models. It gives you a complete picture, and that's the basis of the countermeasures. So the questions that I asked to arrive at these conclusions with targeted user analytics were simply: who has access? I did this for many years, not thinking about what I was doing, but I

looked outside to correlate against inside, and I've sort of formalized the process. What is the probability that someone has access? What can go wrong from the adversarial perspective? What story does the available acquisition and enrichment data tell? And finally, what countermeasures can we introduce? This includes OPSEC for hardening our users, training them on what they're exposing, hardening our environment, and deceiving an adversary. An incident scenario is where it starts, but you don't really need to look at the right. If you get to the far right, at recovery, you're at the game-over point. You want to work to the left. MITRE has a PRE-ATT&CK model, and I've talked with MITRE about some of my work before

as well. The reconnaissance phase is where you want to focus your energy; that's where you want to generate data, because if people are the means to the end, those are the people that will be targeted first. I want to increase the cost and complexity of performing reconnaissance against an organization, so that the adversary will move on to the next person. If you're the bear in the middle of the herd, or you're at the front of the pack on the left side of targeting engagement, the adversary will move on to the next person, and that's somewhere that you want to be. So I simplified this process. I may get the opportunity to show you

the drawn-out full data model, but I may not in the short period of time that we have. So again, it's target acquisition, and acquisition is defined by keywords, by keyword matching. A harvest is matching those keywords to people, matching them to humans, and an acquisition score is the number of keyword matches you get about a potential target. All of this you can do yourself; you don't need anything complex. A human with a web browser, access to a search engine, and Excel can get you started. Enrichment is OSINT, of course, plus the breach data footprint, and I consider that to be meta breach data because we're not going to take in

breach data; that's where Have I Been Pwned comes in. The internal enrichment is phishing metrics. If someone knows about a person inside your network and they know they have access, why not take the easy approach and phish them, target them, spear-phish them more? So you can correlate that type of data, for example, and other behaviors that you're looking for, against these potentially high-risk users inside your enterprise. Again, our countermeasures are hardening our users, personas, and creating human sensors. So as a view of your organization, this is a high-level view; this would be a typical organization. A high-risk user would be someone with an extensive set of keywords. You know, if you have 20

keywords, maybe you have a 17-keyword match, and they're publicly available on the open internet. For a highly attractive defensive persona, if you were to create that persona, it would have approximately that same number of keywords. Don't go beyond the average to a high number of keywords, or you'll look too good to be true; you won't pass the smell test. A low-risk user, that's your general users in your environment. If an adversary persona, the final block, makes a connection to a low-risk user, they become high risk simply because they've given their profile data to that persona. And you always want to match keywords against targets; that's the key. So for

responsible research and preconditions: I've looked into this extensively, and when I talked at DEF CON some time ago I pointed out a particular case that I thought was interesting because it started in California. Is it legal to use a search engine to perform unauthenticated search against LinkedIn to pull user-generated data? That's browsing the internet using a search engine, and the answer is yes. You can see the LinkedIn v. hiQ ruling, which has gone all the way to the Supreme Court. There are some more decisions related to that that need to be made, but that is where we are today. Can we use public data sources for enrichment? The answer is yes. Can researchers aggregate and pull in breach data to build a

complete targeting model for the targets identified in the model? The answer to that is no, we can't do that, but we can use Have I Been Pwned to generate a one or a zero based on interesting data fields that are a match. We know that the bad guys leverage breach data sets and aggregate them against users to target them today. This has been a growing problem, and it's only gotten bigger with the various collections and things that are aggregated and resold. So you want to remove the value of that data set to an adversary using any means that you have at your disposal. And can we create defensive personas on social media platforms? The answer is sometimes, I

believe. So, LinkedIn adversary platform targeting methods: these are the three primary methods that adversaries utilize. Unauthenticated, and again that's what my research follows: Google-fu or Google dorking, and web scraping with Python. Authenticated users are in the network; they're typically personas, or they could be recruiters, so they could be a persona recruiter. We don't have metrics on how many personas or persona recruiters have ever been created on this platform. The data products are the third-party aggregator products, like those hiQ generates, and DataSift, the product that LinkedIn has someone sell. DataSift also sells data products for Facebook or Twitter. So if you didn't think of LinkedIn as a social media platform,

their whole model is based on this and they use the same mechanisms, so you should maybe change the way you think. As for LinkedIn potential use cases, I've looked at just about anything, and I think the only limit is your imagination; you can find anything. But today I focused in on biomedical and pharmaceutical. For biopharmaceutical, the interesting time that we're in made the vaccine research data very interesting. Who has access to vaccine research data is one question. What could go wrong if people with access to vaccine research data are collected from LinkedIn? Then the basic methods to enumerate targets from LinkedIn data, and then

modeling the attack surface and the countermeasures that can be employed. So this is a basic Boolean string, or search engine string. You could use advanced search, or you could just take this string and add keywords if you want to try this at home, or adjust your own, preferably in order to defend your company and/or your interests. The seed file is a set of keywords that indicate that someone may have, or likely has, access to the target. I don't have any maps or models to show today, but here is what you can expect, and this would apply to any vertical: keyword matching to identify high-value

targets. Seed files are grown based on common skills amongst harvested targets. If you have a set of individuals that look like a good target and map to your profile, and three or more have a common skill, then add it to the seed file. That's how you grow a keyword file: you find common skills amongst targets and you add them to the seed file. Then there's the geolocation, where the people that have access to the targeted data are located. If you're looking for a job in a particular area, you've provided geodata to the LinkedIn platform, or you may leak that in your profile: "I'm looking for a job in this area," "I did this with this person here."
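As a rough sketch of the keyword matching and seed-file growth described above, the following Python shows one way a defender might score their own users and grow a seed file. The function names, the sample keywords, and the plain substring matching are assumptions made for illustration; the talk does not specify an implementation, only the "three or more targets share a skill" rule and the idea that the acquisition score is the number of keyword matches.

```python
from collections import Counter


def acquisition_score(profile_text, seed_keywords):
    """Return the number of seed keywords found in the profile text."""
    text = profile_text.lower()
    # Plain substring matching (an assumption): "vaccine" also matches "vaccines".
    return sum(1 for keyword in seed_keywords if keyword.lower() in text)


def grow_seed(seed_keywords, harvested_skills, min_shared=3):
    """Add every skill that at least `min_shared` harvested targets have in common."""
    counts = Counter()
    for skills in harvested_skills:
        # Count each skill once per target, ignoring case.
        counts.update({skill.lower() for skill in skills})
    shared = {skill for skill, n in counts.items() if n >= min_shared}
    return {keyword.lower() for keyword in seed_keywords} | shared


# Hypothetical seed file and harvested skill lists, for illustration only.
seed = {"vaccine", "virology", "clinical trials"}
harvested = [
    ["Virology", "Assay Development", "Clinical Trials"],
    ["assay development", "Vaccine"],
    ["Assay Development", "Immunology"],
    ["Immunology", "Virology"],
]

grown = grow_seed(seed, harvested)
profile = "Senior scientist, vaccine assay development and virology"
print(acquisition_score(profile, seed))   # matches against the original seed file
print(acquisition_score(profile, grown))  # matches after the seed file has grown
```

In this made-up data only "assay development" is shared by three targets, so it is the only keyword added, and the same profile then scores one match higher against the grown seed file.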

And that is another factor that you need to consider. Identified targets reveal peers in industry. From the outside, unauthenticated, it's more difficult, but you can find collaboration by way of language that's distributed publicly, or by chart mapping of relationships between targets. Targets have titles that are assigned a value: a manager, an employee, a VP, whatever it may be. Phishing is always based on this hierarchy: somebody that is very senior in an organization asking someone that is very junior to take an action. We can use this data set to build out the phishing methodology that will be utilized in identifying these targets, and these

targets, once identified, will be targeted repeatedly, because if you have a specialized job then a single model can be used for years and years. Your career is not likely to change. If you are a senior research scientist researching COVID vaccines, SARS, or any of the variants, you'll be doing that for the next 20 years, and that's true of any specialized job. In terms of presenting the data in a way that is safe, this was the best that I could do. With these types of discussions, closed discussions with companies are always very different, and with individuals are very different too. But this is a selection of four, and an approximate harvest targeting

around the top hundred results or more. The average seed match was relatively low: we had five on A, four on average for B, seven for C, and four for D. As for the geodata, at least someone in that subset always leaked geodata. That is always going to be the case, and that can give you a location if the target is a physical location. Unique names are highly problematic, and I'll get more into that, simply because of the alignment of breach data to a unique name versus when your name is John Smith, something very common. So, target enrichment for adversaries: again, what OSINT data is available about the harvested targets?
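The one-or-zero enrichment idea mentioned earlier can be sketched as a small record of binary flags per harvested target. This is only an illustration under stated assumptions: the field names, the `Enrichment` class, and the equal weighting of flags are hypothetical, and the breach flag is meant to hold just a yes/no answer from a breach-notification lookup, never the breach data itself.

```python
from dataclasses import dataclass, astuple


@dataclass
class Enrichment:
    """Binary enrichment flags for one harvested target (hypothetical fields)."""
    osint_hits: bool        # a search on the name returned additional public data
    breach_footprint: bool  # a breach-notification lookup answered yes (1) or no (0)
    geodata_leaked: bool    # the public profile reveals a location
    unique_name: bool       # the name is uncommon, so external data aligns easily

    def score(self) -> int:
        # Equal weighting is an assumption; each flag contributes one or zero.
        return sum(int(flag) for flag in astuple(self))


# Made-up example: three of the four flags are set.
example = Enrichment(osint_hits=True, breach_footprint=True,
                     geodata_leaked=False, unique_name=True)
print(example.score())
```

Keeping each field to a single bit means the model records that exposure exists without the defender ever aggregating or storing breach data.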

A simple Google search can generate more data once you have a name, and a unique name will generate more data, of course. And what breach data is available about the harvested targets? If you have a unique name, you can run a guessing program against firstname.lastname at Yahoo, Hotmail, Gmail, etc., and you can put a finger on the potential footprint and what breach data is associated with the individual. Again, in building a data model for this you could use a one or a zero, because you're just trying to determine if you think someone will be targeted, and then you map that to your attack

surface, to those vectors. So again, target enrichment for adversaries. There's one item I want you to pay close attention to, just because we see so much of it today, and that is SIM swapping. If it is easy to correlate the phone number, or guess the phone number, of the individual that may be targeted, and that is the factor that allows someone to get into your corporate environment or another environment, then you want to take measures and make sure that that individual has hardened access to their telecom provider. But there are lots of elements, and this is a quick visualization. I'm sorry, I'm going to share the slides afterwards; they came out kind of

small when I couldn't do the presentation role. But this maps out that org structure, what it would look like for any particular organization once the data is collected. You have a senior scientist in this case: a VP spear-phishes the senior scientist and asks them to share it externally. Maybe that's not an unusual request in an environment that needs to solve a large problem and needs to collaborate heavily. In another one, threat number two, the adversary is already inside your network; they know who to enumerate and they know who has access to the data. In threat number three, they're also inside your network, and they're again going to compromise vulnerable infrastructure and target the individuals that have

access and attempt to enumerate them. You can target people high, you can target them low, but I think the phishing piece is the most important piece. Vaccine research is expensive. Why would an adversary pay for it when they could potentially steal it? That points back to why North Korea took the approach that they did. They're exquisitely good at using the LinkedIn platform, so that should have been a surprise to no one. So I'd like to jump into our countermeasures. This is what we can do about this problem, and that includes OPSEC training. If you train users on the impact of what they

post, the value of breach data, and how it can be used against them, it will make them stronger, it will make all of us stronger, and it will make your organization stronger. Your security awareness training should show actual metrics about your company and what that attack surface looks like. Make it real; make it actionable. If you show that to a leader in your company, they will want to invest in this type of program, or this type of adjustment if you don't already have something like this. Create your own targeted user analytics: learn the keywords, build and maintain a seed file, search your user base continuously for high-risk data matches, and share knowledge. Partner with your

employees; create that partnership. Your employees don't want to be the one that causes a breach; they don't want to be the one that is compromised. It's not a good position for them and it's not a good position for us. Human sensors are another countermeasure. You can utilize existing high-risk users as sensors: change their email, change their user ID, move them over. Say I have a user that is also a habitual clicker, who has consistently failed phishing tests and has poor OPSEC. You will find this correlation if you look at the data; it's there, because people that exhibit poor behaviors do them everywhere. You can turn a weakness into a strength

by adjusting this and collecting data against them for enhanced monitoring, building internal deception based on these users, and moving them over to a different user ID or a different group. If you're able to do that, it would be a fun activity. I find too often deception isn't based on reality, and the bad guys move past it, so build it that way. Utilize former employees as sensors. Someone switched jobs and left a stale LinkedIn profile; they have stale artifacts that indicate that they work for your organization, but they don't, and you've defanged their access. You can use them as a canary in the coal mine. Take advantage of their lack of updating and, again, base deception infrastructure

upon them. Volunteering an employee as a sensor is another one, because we all know those infosec folks that are introverts and have no public profile. Maybe they should have one that reflects a different skill set than infosec; align them to a crown jewel and use them as a sensor. They'll have the mind for it, and they'll be hardened to look at the data that may come in, both outside and inside. So the next countermeasure is persona development. I was really focused on OPSEC for years, and I found that, just like with simulated phishing, you can't solve it, and it's very difficult to solve it

because people need to find jobs, and you need to expose keywords to find jobs. Sometimes that's just a fact. But personas can help with this. Personas must fit the target profile: use the same keywords to establish attractive targets, and create relationships with other potential targets or other controlled personas in your control. You can also create relationships with adversary personas if you leave one in place. I won't have time to discuss that area; it's a big area and there are a lot of pros and cons. For success methods, build a deception profile over time. If somebody shows up as a senior virologist, or in a very senior position, and they haven't been in the industry, they haven't been

in your organization for a long time, they're too new, and it won't pass the smell test. Start them off and grow them over years. Create multi-platform personas for target validation; I want an adversary to believe that it is real outside of LinkedIn. Build internal corporate deception that aligns to these external artifacts. Again, don't waste your time with a deception program that isn't built from the outside in. LinkedIn issues: there are a few that I will talk about, but again, this could be talked about for quite a while. On user responsibility versus LinkedIn responsibility in regard to the ARM model, I think that a lot of the responsibility falls on the platform, and I'll leave it at that.

Users can do so much, but the platform needs to help them. When there are individuals that are endangering us, it is a platform responsibility, and I believe regulation will come one day. I'd like to see it happen sooner rather than after something happens; it's gone on for a long time, but I believe that will come. 2019 was the first time LinkedIn announced the number of personas they removed, and there were 21 million of them. My theory is that in prior years they were not announced because, prior to the Microsoft acquisition of LinkedIn (and we should be talking about Microsoft, because it is Microsoft), those metrics were not made available because the valuation of any social

media platform is based on the number of users, and a high number of users meant a higher valuation for the acquisition. So, my questions for LinkedIn: have shell companies bought data products from LinkedIn? How many persona recruiters have been detected, to date or prior to 2019? Does LinkedIn plan to inform users if they were contacted by or exposed to a known persona? I ask because I've talked with groups that have run into this scenario; it was them contacting LinkedIn and having to work hard to get a response, and seeing that proactive data would be extremely helpful. And I always want to understand the capital investment that has been made into security as a percentage of

profit. I want to see those metrics, and I think we all would, particularly when we're protecting against state threats. So my proposal is that LinkedIn should protect U.S. interests by offering platform-level protections first to special users, users that align with critical infrastructure, for example. I think this is a responsibility that they must undertake, and I think Microsoft should be pressured by regulators to force this. And I think LinkedIn needs to force a security review. Facebook did it, and Facebook's data was not nearly as important. It should be for all users, because people have had their settings set up for such a long time, and they do not realize that you can use a search engine to glean

this data, some of this profile data. It's such a large number of people that don't understand it, so why not have a security review? The answer is because it would lower the value of the platform in the ARM model. But I think it should happen, because if Facebook did it, why not LinkedIn? And LinkedIn should consider selling or providing free deception products. Why not introduce something that's a potential profit center into a platform, something that can also benefit all of us and benefit our corporations? I would sign up immediately for something like this if they were to provide it. I would love to help them make it, to be honest, but I'd like to see them do it.

And LinkedIn should offer a paid product. Of course, we always say this when we talk about these platforms: offer an encryption scheme that only shares our data with our trusted networks, and have an opt-in model. I would pay for this; there's a large number of us that would be willing to pay for this. Don't monetize me; let me pay what is needed. It may not matter what the cost is; I think we'll pay, and I think that's the gist of it. So in terms of OPSEC, what not to post: keywords that indicate you have access to sensitive business data,

systems, or processes; keywords that indicate you have access to classified data, facilities, or programs. If you work in the government you can count on the training in this area; that's one thing that is done well. Pictures with metadata that could reveal or connect you to a co-worker, or you or a co-worker to a sensitive location. Maybe you and a co-worker are both in the photo. We can get into which platforms strip out EXIF data: LinkedIn does, Facebook does, Twitter does, but there are other places to find photos that don't. And keywords with specific tools you use to protect your enterprise. As part of an operation, an adversary

will obviously want to know what technologies you're utilizing, and there are all kinds of ways that corporations leak this data; make sure you don't do it as an individual. What to be aware of: your social media footprint and the breach data set that can be used against you. Those two married together equal a high-risk scenario for many of our interests. We need to remove the value of breach data, and we need people, corporations, these platforms, and all of us to work together against the social media footprint issues. And so, what to do: assume breach, assume exposure, and don't be the reason that your company is breached. No one wants to be that reason.

So the summary and actions, to round it all out: LinkedIn data is actively used by our adversaries to target our interests. Targeting models built on this data can be reused outside the platform forever, for as long as your career lasts if you're specialized. Blue teams should use LinkedIn data to model high-risk users. Red teams should use LinkedIn data to simulate adversary targeting. Personas of all types are prevalent on the platform, and you should consider building them as sensors. OPSEC is a shared user, platform, and corporate responsibility. We need to push the bar on this or we're all going to be in trouble. And that is it.
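For blue teams who want a starting point, here is a minimal sketch of correlating the external acquisition score with internal phishing metrics to flag high-risk users and candidates for enhanced monitoring. The function name, the tier labels, and both thresholds are assumptions chosen for illustration; the talk gives only the example of a 17-of-20 keyword match and the observation that habitual clickers with poor OPSEC tend to show up in both data sets.

```python
def classify_user(acquisition_score, seed_size, phishing_failures,
                  ratio_threshold=0.5, clicker_threshold=3):
    """Assign a coarse risk tier from external and internal data.

    Both thresholds are hypothetical defaults and should be tuned
    against an organization's own data.
    """
    # Externally exposed: a large share of the seed keywords match the public profile.
    exposed = seed_size > 0 and acquisition_score / seed_size >= ratio_threshold
    # Habitual clicker: repeated failures in internal phishing simulations.
    clicker = phishing_failures >= clicker_threshold
    if exposed and clicker:
        return "enhanced-monitoring candidate"
    if exposed:
        return "high risk"
    return "low risk"


# Made-up users, using the 17-of-20 keyword match example from the talk.
print(classify_user(17, 20, phishing_failures=4))
print(classify_user(17, 20, phishing_failures=0))
print(classify_user(2, 20, phishing_failures=5))
```

A real program would likely add the enrichment flags and title values as further inputs; the point of the sketch is only that the outside-in data and the internal metrics are joined on the same user.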