
Good morning, everyone. So this session is about getting smarter about data privacy, and we are going to talk through what exactly data privacy is and how we can achieve it. We're also going to dive deeper into the concept of creepiness, which is related to big data analytics: how the third parties, your data aggregators, collect your personal information and how they use it.

Before we dive deep into the agenda, a brief introduction about myself. I'm based in Toronto, Canada, where I work as a consultant for a financial institution. I have over eight years of experience in governance, risk, and compliance, and I've also done incident response, software engineering, and help desk, so I pretty much
have heaps of different experiences. I've also published a research paper on doxing; this was for the conference on forensics and privacy in 2016 in Montreal. I think this deck includes the link for it somewhere; if not, just reach out to me and I'm more than happy to give you the link. I'm also a board member for the (ISC)² Toronto chapter, where we have monthly meetings, and I'm also the events coordinator for WiCyS. We do have our WiCyS president, Moana, sitting right there, so say hi to us whenever you get a chance.

Okay, for the agenda, I'm going to walk
you through the distinction, the thin line demarcating your data privacy and your data security, so we're going to walk through that. Then I'm going to define what exactly creepiness is and what the best mitigating measures are to avoid it. We're also going to look through some statistics to see where consumers draw the line; it's an interesting study pulled from research conducted by KPMG. I'm going to walk through the pillars of uncreepy, essentially the grounds of best practices, and then the steps to mitigate creepiness, and in the
end we'll be spending five to ten minutes on the best practices to achieve data security. I'll then conclude the session and leave the last 10 minutes open for any follow-up questions from the audience.

So, data privacy versus data security: it's essential to understand the difference between the two. When I talk about data privacy, it means how exactly your data is collected, how exactly your data is used, and how exactly your data is shared with third parties, your data aggregators and data brokers. So essentially the governance of collecting, using, and sharing the data is your data privacy. When we talk about data security, it essentially is how that
data is protected. Is that data encrypted at rest? Do we have ACLs in place on who exactly has access to that data? What are the security controls in place? Or, if your website, let's say, processes highly confidential PII, do we have appropriate authentication mechanisms in place? All these things are related to your data security. So data privacy is essentially the proper handling of data, and it includes consent, notice, and regulatory obligations, so something more on the grounds of your data privacy laws. Data security is protecting data from internal and external attackers, which is your data security controls. I've taken an example here, and I'll just read through it. So
consider a scenario where you have gone to great lengths to secure your PII, like your date of birth, your name, your phone number, etc. The data is encrypted, access is restricted, and multiple overlapping monitoring systems are in place, right? So these are essentially your data security controls. However, if that PII was collected without proper consent, that is a violation of data privacy. So there is a thin line demarcating the two, but essentially data protection is necessary to achieve data privacy.

Now let's take a look at some of the threats to your data security, which we briefly talked about. First, unauthorized access and usage: who all has access to the data, and,
especially if it deals with your financial or PII data, is that access being logged? Do you have sufficient data elements which are actually tracking that information, capturing it, and storing it in the form of logs? Then liability due to regulatory non-compliance: have the consumers or the users consented to the use of that personal information? The theft or accidental loss of media: if, let's say, your remote device or your laptop gets stolen or lost, do you have enough safeguards in place, like the ability to remotely wipe that particular device? Improper treatment or sanitization at the end of use: what are the controls in
place for the secure destruction of the information, and what technologies are you relying on to make sure that the sensitive data cannot be recovered at whatever cost? Data leakage and breaches: again, there are heavy penalties associated with this. There was a study conducted by IBM which said that it takes approximately 287 days, so roughly around a year, to contain a data breach, and that the cost of containing a data breach for hybrid cloud environments is approximately 3.1 million. So you definitely want to make sure you have safeguards in place against data leakage and breaches. Corruption, modification, and destruction
of data: okay, this is important. You have your logging and monitoring mechanisms in place to ensure that there is no unauthorized modification that has been done to your data.

Now, why is data privacy so important? First things first, it's your personally identifiable information, right? So you have the right to know how this information is being used, where it is used, and for what purpose it is being collected. The penalties for violating the rules are severe, especially when we talk about GDPR and the different privacy laws; those penalties are quite serious. We saw recently that a CISO was sent to
imprisonment just for hiding a data breach. And most businesses have built their foundation on customer data, so the trust of your customers essentially relies on knowing that the company is protecting their PII, right? That's the essential foundation, so you want to make sure that you adhere to it and have proper safeguards in place to protect your customer data. Then, as we talked about, privacy is the right of every person to be free from uninvited surveillance, and we are going to look into that surveillance in the coming slides. And finally, it saves individuals and companies from the theft of data that
can cause enormous monetary losses, right? Because having a data breach, or getting your company's name published in the news, is no good: it tarnishes the brand image, the customers walk away, you lose the business, you lose time and money, and then there is damage to the reputation as well.

So now that we have defined data privacy, data security, and the common threats, let's look at what exactly creepiness is. I've taken a snapshot from Twitter here, so let's take a look at it, and then I'll dive into what exactly creepiness is. Again, this is some arbitrary XYZ user. So, last night
I had a conversation with my partner about palm reading, and this morning I get an ad for a palm reading app on Instagram. I'm pretty sure this has happened to most of us. Let's say I go to a website and search for flights to Vegas or hotels in Vegas; the next thing I know, I tend to get a lot of recommendations, either in Gmail or in Google Chrome or in the ads, right? The tweet goes on: everyone laughs at me when I bring this up, but I'm so totally convinced Facebook is listening to our conversations to target ads. So the sense of creepiness is that, with these apps,
someone or something out there is tracking our activities, so they get to know about our personal habits and our personal preferences. And when they come around making those recommendations, for some of us that could feel literally as if someone has been stalking us; it could lead to a feeling that someone is literally standing at our shoulder and listening to all our conversations. That creates a sense of being creepy, so this is what creepiness essentially is. Now, how many of us in this room have experienced that? Right, yeah, I'm pretty sure all of us. We are going to look into the driving technologies, what exactly leads to
this creepiness, but the concept is named after that feeling that someone is literally watching over our shoulder when the online ads we see are connected to an email or even a conversation we just had. I'd like to bring up an example: I think the week before last I met a few folks at a conference, and the next thing I know, my Facebook friend request page was popping up with friend suggestions for them. So this was a pretty good example that there is something or someone tracking it, right? The creepiness comes from the impression that the marketer knows, and
when we say marketer, it's essentially your data analytics tool, which knows more about you than you want or expect it to know. Furthermore, they know how to follow you around; in other words, they are spying on you and stalking you.

So what exactly are these technologies? There is social listening: most companies have the right to, let's say, intercept or listen to the emails and conversations between employees, and they do track them, so that's one part of it. Then data-driven marketing, which means the way your data is being collected and the way your data is being shared with a
third-party aggregator; that is your data-driven marketing. Personalized analytics: now, this forms a big part of how exactly these personal recommendations are being made, and it is gaining momentum every single day. You have, say, 10 samples from an individual, and using those analytics tools you can make out what that person's choice or preference would be, and then you bring it back to that person. So this is where personalized analytics comes into the picture. Then your ambient social apps, which are your Facebook, Twitter, Instagram, LinkedIn, maybe TikTok as well; you
name it. All of these have associations with third-party aggregators, and they tend to pass the information around. How these third-party aggregators use it is not something which is controlled by the application to which you have consented to use your personally identifiable information.

So let's take a look at the factors that define creepiness. There are three of them: control, distance, and granularity. If I was to define control in one simple sentence, it is: have I consented to that data being collected, and how much of my personally identifiable information is being collected? So this is something that I
control, right? For instance, if I'm signing up for an account, it makes sense to provide your first name, last name, and date of birth, but it doesn't make sense to provide your social insurance number or any other personally identifiable information. So control essentially is whether the consumer or the user has consented to the collection of that personally identifiable information. Then distance, which means: whenever your application collects that particular data, how far is that data getting passed on to the third-party brokers and aggregators, and how exactly is the data analytics tool being applied to it? That depth or that distance defines a lot of it. Now,
if that data is being passed over to, let's just say, thousands of data aggregators, all of them are going to have their own data analytics tools, so you are going to receive a lot of personalization based upon it. So you clearly want to make sure that there is less distance, which means that the application you are passing the data to keeps it to itself instead of passing the data on to other places, because that increases the creepiness factor. Granularity is more or less closely related to control. In simple terms, granularity is the level of detail of the data set that is being collected: is it your name, your social insurance number, your age or
your date of birth, your contact information? Anything related to that is your granularity.

So let me bring up some interesting statistics and facts. Again, this was a study conducted by KPMG, and it is just to give you an idea of where exactly we stand in the present world. I believe the study was based on interviewing 1,500 individuals, working professionals. 55 percent of people said that they had decided against buying something online due to privacy concerns, and that's quite normal; it happens with me too. Then a little over two-thirds of people are not comfortable with smartphone and tablet apps using their personal data, and that's essentially
where we disable the cookies, check the cookies, or clear the cookies. Fifty percent of people would accept free or cheaper products in exchange for less privacy; yeah, it works. Then 50 percent of respondents already delete their internet browser cookies or manage their social media privacy settings, which is good, but the number baffles me: it's just 50 percent. One-third use incognito or do-not-track modes when browsing the web, pretty much a safe practice, but it's just one-third of people. And 75 percent of respondents said that they were uneasy about their online shopping data being sold to third parties. So these numbers
and facts are something to think about.

Then, if we talk about the pillars of uncreepy, there are three broad pillars on which we can define how exactly we can be uncreepy. The name is kind of self-explanatory, but we'll go through all three of these. Transparency is the first one: how exactly your data is being collected and what exactly it is being put to use for. That means a clear explanation of what data you collect and how you use it, privacy experiences that are human-understandable, and responsiveness when customers request that information, right? You have the right to know how that information is being used, collected, and shared.
Meaningful choice: options for data collection that include progressive capture; data minimization without breaking the experience, which means that you are limiting the data collection only to those data sets that are actually needed; and allowing opt-out versus nuclear opt-out, which means that you give the individual the ability to actually opt out if the consumer has not consented to the privacy terms and conditions. Then fair value exchange: recognize that value has different meanings, right? For some consumers, receiving a lot of personal suggestions and recommendations might not be creepy
for them, but some other individuals who receive a lot of those suggestions might feel that someone is literally stalking them, or they might find it creepy. So ensure the trade-off of data for value is balanced and fair, and pass the reasonable expectation test with each new use of data. So that's something to think about when we talk about the pillars of uncreepy.

In the next few slides we are going to take a look at some of the measures to reduce creepiness, but again, this is not an exhaustive list.
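
Editor's sketch: the data minimization and opt-out points under meaningful choice can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions; the field names, the `signup` and `profile` stages, and the per-purpose preference keys are hypothetical and do not come from the talk.

```python
from typing import Optional

# Hypothetical field lists: which fields count as "needed" is an assumption.
REQUIRED_AT_SIGNUP = {"first_name", "last_name", "email"}
ASKED_LATER = {"date_of_birth", "phone"}  # progressive capture


def minimize(submitted: dict, stage: str = "signup") -> dict:
    """Keep only the fields allowed at this stage; drop everything else."""
    allowed = set(REQUIRED_AT_SIGNUP)
    if stage == "profile":
        allowed |= ASKED_LATER
    return {k: v for k, v in submitted.items() if k in allowed}


def opt_out(prefs: dict, purpose: Optional[str] = None) -> dict:
    """Opt out of one purpose, or of everything when no purpose is given."""
    if purpose is None:
        return {p: False for p in prefs}
    return {**prefs, purpose: False}
```

Using an allowlist rather than a blocklist means that a field nobody planned for, such as a social insurance number, is dropped by default instead of being stored by accident.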
One of the important things to understand is that one size does not fit all. An algorithm which a big company uses for its data analytics, to suggest personal recommendations, might be feasible there, but if you just replicate it and apply it to a small startup, that algorithm might be looked at as something which is creepy, something which is invading the privacy of individuals, something which gives individuals a sense that, yes, someone is literally stalking them. So it's important to note that one size does not fit all.

Now, analytics teams may have to take the lead. If I was to redo these slides,
I would get rid of the word "may": analytics teams are a big driving factor when it comes to taking the lead. The leadership needs to extend beyond simply developing the best possible analytic models; they need to keep human emotions and considerations in mind. Analytics leaders should avoid practices and applications that expose the company to customer, societal, legal, and financial risk. An analytics leader may decide to implement a model with slightly weaker predictive power rather than one that is stronger but whose workings cannot easily be explained to customers and regulators. So you really have to strike that balance on what the analytics is doing, and you have to draw that thin line so that it's not actually invading
the privacy of individuals.

Apply business sense when using personal data and algorithms. You definitely want to have a sense of what exactly the analytics tool is doing. Even though a particular use of personal data and algorithms may be feasible, it's not necessarily wise, so it's important to ask this question: how is the customer likely to respond to this recommendation, ad, or automated decision? That should be kept in mind when you are building those data analytics algorithms and solutions. For example, it's possible for a gambling company to send
offers to people with known gambling addiction problems, right? But let's just say someone is not addicted to gambling and you are sending them suggestions; that does not make sense. So you really have to ask this question: how exactly are the customers likely to respond to this recommendation?

Then, expand governance for the use of personal data, algorithms, and applications. Essentially what this means is how your data is collected, where it is stored, who has access, whether logging is in place, and how that data is being shared with third-party aggregators. Most companies do not have good
processes for identifying, discussing, and deciding on ethical issues, so this is something which is clearly lacking, and better governance is needed. Governance committee members should have a mix of business, legal, ethical, statistical modeling, and systems engineering backgrounds. You definitely want to have that element of governance when you are collecting the data and when you are sharing it.

Then, be transparent about the use of personal data and algorithms. Transparency is one of the important things: the customers or consumers have the right to know how the data is being collected. People should be aware of what personal data is collected, analyzed, used, and shared, and the extent to which
personalization is automated. The public, the media, and emerging laws and regulations are also calling for greater transparency. We'll be taking a look at some of the privacy laws in the next few slides, but essentially you want to be transparent about how this particular data is being collected, and about whether only the necessary data elements are being collected.

Then, be alert and seek technological solutions. When we say technological solutions, this essentially refers to having your data security controls in place. For example, if an XYZ application or website has collected personally identifiable information and there is a need to pass that PII to a
third-party aggregator or a data broker, then how is that data being stored, and how is it being secured with that particular third-party aggregator? This is something you need to be thoughtful about when you're looking at your technological solutions and your data security controls. Many companies have personal data stored in a variety of databases, used in multiple applications, and employed for different purposes. So it's important to be able to show consent to collect the data, know who has access to it, identify how it's being transformed, track what data is used, describe the analysis performed on it, and then explain its use in algorithms
as well as decisions.

Then, act on current and future laws and regulations. This is important because if you don't abide by your data privacy laws and regulations, there is a heavy penalty associated with that. The GDPR went into effect in May 2018, and it affects all companies doing business with European citizens. The regulation requires opt-in authorization to collect any personal data, which means giving consumers the right to refuse if they don't want their personal data to be collected. The request must be specific, which means that it should not be ambiguous: it should clearly list what data is being collected and how it's going to be used. So that's where the
little thing, the terms and agreements or the privacy terms which we usually skip, comes into the picture. The collection and use of personal data must be for a specific, well-understood business purpose. Again, it should be something which is needed for the business; it should not be something which is just collected randomly and out of the blue.

Then, have people and processes that can explain analytical output. That's quite important to understand, because we rely so heavily on automation, on the data analytics tools, to do all that analysis for us. It's important for humans to intervene and study that analysis to make sure that we are not crossing the line between privacy as
well as your creepiness factor. So companies need to recognize the analytics interpreter or translator role: people with the ability and responsibility to explain analytical models. And when selecting from among modeling alternatives, keep explainability as a consideration. You should always be able to explain whatever the analytics tool has inferred or drawn; there should be an element of human intervention in studying those analytics.

The next important one is: be careful when automating decisions of the heart. Data analytics, yes, can be applied when we are talking about, let's say, receiving personal preferences, recommendations, or suggestions; that's fine. But can those data analytics be
applied everywhere, across all the institutions and all the technologies? The answer is no. For example, if we are talking about something medical, something like making the decision of whether a person needs a heart pacemaker or not, we cannot rely on this alone. Obviously the data analytics can help, but at the end of the day the yea-or-nay decision has to be made by a human. That's why it's important to have that human intervention, and why we say that someone needs to study the results and outcomes of those data analytics. So analytics managers and professionals should start by thinking carefully
before automating decisions of the heart. The potential negative consequences could obviously pose a threat to lives and could be really extensive, and if the organization chooses to proceed, it should be especially careful in building new models; it's something that needs to be given careful consideration. So these are some of the things that need to be kept in mind when we are talking about the mitigating factors for creepiness.

Now, when we talk about data security technologies, what are the technologies in place to secure your data? The first one is data loss prevention. DLP essentially concerns how exactly your data is being
passed on externally, and whether you have controls for that. Let's just say today I try to send a large amount of PII to my personal email or my personal Outlook from my work laptop: are there sufficient safeguards or controls in place to prevent that, or to block that action from happening? So DLP essentially comprises discovery and classification. Discovery means where exactly your highly confidential or PII data resides; classification means how that data is classified: is it restricted, confidential, internal, or external? And then obviously monitoring and enforcement, because whatever tool or technology you are using as a DLP
solution, it's important to monitor those events so that action can be taken immediately if there is, let's just say, a data breach or a data leak. Some of the things that your DLP policy needs to address are: what kind of data is permitted to be stored in the cloud? Where can the data be stored, as in which jurisdiction? How should it be stored, as in encryption and storage access considerations? What kind of data access is permitted, as in which devices, which networks, which applications, and which tunnels? All these technical details are something to think about.
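
Editor's sketch: the discovery, classification, and enforcement steps of DLP described above could look roughly like this in Python. It is a deliberately crude illustration, not how any particular DLP product works; the regular expressions, the `corp.example` domain, and the two classification labels are assumptions made up for the example.

```python
import re

# Hypothetical detectors; real DLP tools use far richer rules than two regexes.
PII_PATTERNS = {
    "sin_like": re.compile(r"\b\d{3}[- ]?\d{3}[- ]?\d{3}\b"),  # 9 digits, SIN-shaped
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}
INTERNAL_DOMAINS = {"corp.example"}  # assumed company mail domain


def classify(text: str) -> str:
    """Discovery + classification: label text 'restricted' if any PII pattern hits."""
    hits = [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
    return "restricted" if hits else "internal"


def allow_outbound(text: str, recipient: str) -> bool:
    """Enforcement: block restricted content addressed outside the company."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in INTERNAL_DOMAINS:
        return True
    return classify(text) != "restricted"
```

For instance, `allow_outbound("SIN 123-456-789", "me@personal.example")` is blocked, while the same text sent to an address at `corp.example` is allowed. A real deployment would also log every blocked attempt, which is the monitoring part mentioned in the talk.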
Under what conditions is data allowed to leave the cloud? Most of these companies, like your third-party aggregators and brokers, rely heavily on storing your data on cloud premises, so you definitely want to make sure that the cloud, be it a public cloud or a private cloud, has sufficient controls in place to protect that data from a data leak or a data breach.

Some other data security technologies: encryption. You definitely want to make sure that your data is encrypted at rest and encrypted in transit, and, if you have an application whose APIs are making calls to the database, that the authentication to that API
is controlled through proper authentication mechanisms. Key management is important as well: who has the data encryption keys, how often those keys are rotated, and what security controls are in place to protect them. Masking and obfuscation is another: if you are using PII data in your lower environments, which you shouldn't, there has to be masking as well as scrambling of that data in place. Then anonymization and tokenization are some of the others. Anonymization means that you cannot identify which set of PII data belongs to which individual, because it has been anonymized, right? And then
you assign a token. Tokenization is more commonly used in the medical world, where you assign a unique ID instead of displaying the patient's entire PII, which is personally identifiable information.

Some other data security best practices to look at: hardening of the devices, making sure that patches are up to date, all guest accounts are removed, and all unused ports are closed. You get rid of the default settings, so no default passwords remain, and obviously strong password policies are in effect; I would add multi-factor authentication to this too. Any admin accounts are
significantly secured as well as logged, all unnecessary services are disabled, physical access is severely limited and controlled, and systems are patched, maintained, and updated according to vendor guidance. Sorry, I have been transitioning across different weathers in the last three weeks, so my throat is one of the things that has been hit hard.

Talking about data privacy in the cloud: privacy rights or obligations are related to the collection, use, disclosure, storage, and destruction of personal data, as we talked about. So some of the things you actually need to think about when we talk about data privacy are: what information in the cloud is regulated under your data protection
laws, then who is responsible for... where is this personal data processed, obviously your privacy breaches, and then who is responsible for protecting the privacy. So these are some of the things that you need to think about when you're actually planning for data privacy in the cloud.
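
Editor's sketch: the tokenization idea mentioned earlier under data security technologies, where a unique ID stands in for a patient's PII, can be illustrated as below. This is a minimal in-memory sketch under stated assumptions; the `TokenVault` class and the `tok_` prefix are hypothetical, and a real vault would be a separately secured, access-controlled store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to original values."""

    def __init__(self) -> None:
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token for a value, creating one if needed."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look the original value back up; only the vault can do this."""
        return self._token_to_value[token]
```

Downstream systems store and display only the token, so a leak of those systems exposes no PII unless the vault itself is also compromised.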
Okay, so some of the factors you have to think about when you're talking about data privacy: segregation of roles and appointments. This is quite important if there is, you know, an admin or a super user or anything like that, as in segregation of duties. Training and instruction: security awareness and training is quite underrated, but it definitely forms a big part of how you implement those security controls and how you secure your data. Authentication techniques and procedures: you definitely want to rely on strong authentication mechanisms, and if your website processes highly confidential or PII data, you want to make sure that there is enough logging in place and that your logging is capturing
sufficient data elements. Authorization techniques and procedures: what kind of access, and what level of access, is granted? You want control over the time validity of assigned authorization profiles, so that after a time limit that access expires. Then personal data breach management plans, techniques, and procedures, obviously along with your company policies and standards. Then log activities according to the criticality of the personal data or the purpose of processing. And data retention control according to the purpose of processing, meaning that,
once your data retention period is over, how exactly do you scrub that data? How do you get rid of it: is it through deleting, erasing, or shredding? With all of those, you definitely want to make sure that the data cannot be recovered, so that's something to think about.

And then this slide depicts some of the privacy laws which came into effect. In 1974 you had the U.S. Privacy Act. Then HIPAA, which is the Health Insurance Portability and Accountability Act, protects your health information, and it came into effect in 1996. In 1999 you had the GLBA, which is
the Gramm-Leach-Bliley Act; it protects your financial non-public personal information. Then the Children's Online Privacy Protection Act, COPPA, came into effect in 2000, which protects the data of children under 12 years; I think it has been revised to 13 years. The Privacy Rule came into effect in 2010. SOX was in 2002, to protect the public from fraudulent practices by corporations. FISMA, which is the Federal Information Security Management Act, came into effect in 2002 as well; it ordered agencies to protect highly confidential and confidential data, which includes financial data as well. ISO 27001, that's in 2013; it functions as a framework for an
information security management system. And then you have your GDPR, which is in 2018 and aims to protect European citizens' personal data. So it's important to be aware of which jurisdiction you are in and what kind of data privacy law is applicable, especially if you have your data storage, or if you are relying on third-party brokers, in different jurisdictions. So definitely something to think about, because violation of the privacy laws, as I stated before, leads to heavy penalties. So to conclude this session: we know that the current use of personal data and algorithms is creating both business opportunities and challenges. What may be creepy for one individual may not be creepy for
another individual. Data analytics is booming day by day; there are these data analytics tools, more work is being done to develop these models, and there is a large amount of data that's being collected, so it's important to know for what purpose it's collected in order to address both the opportunities as well as the challenges. Then it is clear that the public, the media, and state and federal governments are going to press for greater transparency and accountability, and insist that personal data and algorithms be used in fair, ethical and legal ways. So definitely transparency, having clear visibility on what exactly is being collected and how exactly the data is being collected, is another thing to be kept in mind,
and then analytics managers, professionals and companies already heavily using personal data and algorithms are aware of the growing pressures and demands. However, the need for algorithmic transparency is going to expand and will be felt by all companies that apply analytics to personal data. So essentially, if I was to summarize this in one sentence: there should be human intervention when designing your analytics solutions and your analytic models. What is the outcome of that particular data analytics, and is this outcome justifiable? Let's just say if we were to recommend that outcome to an individual, would that be considered an invasion of privacy, or would that be considered normal behavior,
that is, as uncreepy? So something to think about, and that call essentially needs your human intervention, your human force, to actually apply the brains and think about it, because if we rely on automation for data analytics it's easier to pass that line which demarcates between your privacy and something which is creepy. So with that I leave the last five to seven minutes of this session open for any questions from the audience. If you have any other comments or questions after that, I can be reached on my email, or that's the link to my LinkedIn profile. So I'm open to questions.
Yes? I believe that there is an accounting model to actually determine the reputational cost of a breach. Is that something that you're aware of?
Right, the reputational cost of the breach. So yeah, thank you for that. Yes, when I was building this deck I did read that there are studies underway, there is research underway, to take a deeper dive into analytics solutions which actually present that: let's just say if there is a data breach, what would be the cost to the reputation, what would be the cost to the image, what would be the cost to the business? There are studies underway. Again, how that particular research applies to a small startup, a medium startup, or a big company like Facebook or Google, that kind of
differentiates. So this again gets us back to the point that one theory does not fit all, one size does not fit all. But yeah, you're right when we say that there are studies underway, there is research underway. I'm not familiar with any particular analytics solution in general, but yes, while building out this deck I did come across that, yes, there are studies underway.
Any further questions?
Okay, I think with that I would like to hand it over to the next speaker. Thank you so much for attending this session, thank you. [Applause]