Protecting Locations with Differential Privacy under Temporal Correlations

Yonghui Xiao
Dept. of Math and Computer Science, Emory University

Li Xiong
Dept. of Math and Computer Science, Emory University

ABSTRACT

Concerns on location privacy frequently arise with the rapid development of GPS-enabled devices and location-based applications. While spatial transformation techniques such as location perturbation or generalization have been studied extensively, most techniques rely on syntactic privacy models without rigorous privacy guarantees. Many of them only consider static scenarios or perturb the location at single timestamps without considering the temporal correlations of a moving user's locations, and hence are vulnerable to various inference attacks. While differential privacy has been accepted as a standard for privacy protection, applying differential privacy in location-based applications presents new challenges, as the protection needs to be enforced on the fly for a single user and needs to incorporate temporal correlations between a user's locations.

In this paper, we propose a systematic solution to preserve location privacy with a rigorous privacy guarantee. First, we propose a new definition, δ-location set based differential privacy, to account for the temporal correlations in location data. Second, we show that the well-known ℓ1-norm sensitivity fails to capture the geometric sensitivity in multidimensional space, and we propose a new notion, sensitivity hull, based on which the error of differential privacy is bounded. Third, to obtain the optimal utility we present a planar isotropic mechanism (PIM) for location perturbation, which is the first mechanism achieving the lower bound of differential privacy. Experiments on real-world datasets also demonstrate that PIM significantly outperforms baseline approaches in data utility.

Categories and Subject Descriptors

C.2.0 [Computer-Communication Networks]: General - Security and protection; K.4.1 [Computers and Society]: Public Policy Issues - Privacy

Keywords

Location privacy; Location-based services; Differential privacy; Sensitivity hull; Planar isotropic mechanism

1. INTRODUCTION

Technology and usage advances in smartphones with localization capabilities have provided tremendous opportunities for location-based applications. Location-based services (LBS) [20, 8] range from searching points of interest to location-based games and location-based commerce. Location-based social networks allow users to share locations with friends, to find friends, and to provide recommendations about points of interest based on their locations.

One major concern of location-based applications is location privacy [3]. To use these applications, users have to provide their locations to the respective service providers or other third parties.
This location disclosure raises important privacy concerns, since digital traces of users' whereabouts can expose them to attacks ranging from unwanted location-based spams/scams to blackmail or even physical danger.

Gaps in Existing Works and New Challenges. Many location privacy protection mechanisms have been proposed during the last decade [23, 15] in the setting of LBS or continual location sharing. In such a setting, a user sends her location to untrusted service providers or other parties in order to obtain some service (e.g. to find the nearest restaurant). One solution is the Private Information Retrieval (PIR) technique, which is based on cryptography instead of revealing individual locations (e.g. [32]). However, such techniques tend to be computationally expensive and impractical, in addition to requiring different query plans to be designed for different query types.

Most solutions proposed in the literature are based on location obfuscation, which transforms the exact location of a user to an area (location generalization) or a perturbed location (location perturbation) (e.g. [14, 1]). Unfortunately, most spatial transformation techniques proposed so far rely on syntactic privacy models such as k-anonymity, or on ad hoc uncertainty models, and do not provide rigorous privacy. Many of them only consider static scenarios or perturb the location at single timestamps without considering the temporal correlations of a moving user's locations, and hence are vulnerable to various inference attacks. Consider the following examples.

[Figure 1: Examples of privacy breach caused by temporal correlations of user locations. Two map panels with East/North axes; only the caption is recoverable from the extracted text.]

I. Suppose a user moved from school to the cafeteria (where ★ is) in Figure 1 (left). Three perturbed locations were released by selecting a point probabilistically in each of the three circles (by some spatial cloaking method). Even though the individual locations were seemingly protected at each timestamp, considering them together with road constraints or the user's moving pattern will enable an adversary to accurately figure out that the user is in the cafeteria, resulting in a privacy breach.

II. Suppose a user's location ★ is protected in a circle, as shown in Figure 1 (right). If, by estimation based on previous locations, the user can only be in the five places shown in the figure at the current timestamp, then the obfuscated location actually exposes the true location. Thus, technically, the radius of the circle (in location obfuscation) should be subject to temporal correlations.

While such temporal correlations can be commonly modeled by Markov chains [37, 17, 25], and a few works have considered such Markov models [37, 17], it remains a challenge to provide rigorous privacy protection under temporal correlations for continual location sharing.

Differential privacy [9] has been accepted as a standard for privacy preservation. It was originally proposed to protect aggregated statistics of a dataset by bounding the knowledge gain of an adversary whether a user opts in or out of a dataset. Applying differential privacy for location protection is still at an early stage. In particular, several works (e.g. [6, 33, 12]) have applied differential privacy to location or trajectory data, but in a data publishing or data aggregation setting. In this setting, a trusted data publisher with access to a set of location snapshots or user trajectories publishes an aggregate or synthetic view of the original data while guaranteeing user-level differential privacy, i.e.
protecting the presence of a user's location or entire trajectory in the aggregated data.

There are several challenges in applying differential privacy in the new setting of continual location sharing. First, standard differential privacy only protects user-level privacy (whether a user opts in or out of a dataset), while in our setting the protection needs to be enforced on the fly for a single user. Second, as shown in Figure 1, temporal correlations based on road networks or the user's moving patterns exist, and the privacy guarantee needs to account for such correlations. Finally, there is no effective location release mechanism with differential privacy under such a model.

Contributions. In this paper, we propose a systematic solution to preserve location privacy with a differential privacy guarantee. As shown in Figure 2, we consider a moving user with a sensitive location stream who needs to share her locations with an untrusted location-based application host or other parties. A user's true locations are only known by the user. The sanitized locations released by the privacy mechanisms are observable to the service providers, as well as to adversaries. To enable private location sharing, we address (and take advantage of) the temporal correlations, which cannot be concealed from adversaries and hence are assumed to be public. Our contributions are summarized as follows.

[Figure 2: Problem setting. The user's true locations at timestamps t1, ..., t5 are fed to the privacy mechanisms, which release only perturbed locations to the untrusted location-based applications.]

First, we propose δ-location set based differential privacy to protect the true location at every timestamp. The neighboring databases in standard differential privacy are any two databases under one operation: adding or removing a record (or a user). However, this is not applicable in a variety of settings [21, 5], which has led to new and extended notions such as δ-neighborhood [13] or event-level [11] differential privacy. In our problem, location changes between two consecutive timestamps are determined by temporal correlations modeled through a Markov chain [37, 17]. Accordingly, we propose a δ-location set to include all probable locations (where the user might appear). Intuitively, to protect the true location, we only need to hide it in the δ-location set, in which any pair of locations is indistinguishable.

Second, we show that the well-known ℓ1-norm sensitivity in standard differential privacy fails to capture the geometric sensitivity in multidimensional space. Thus we propose a new notion, sensitivity hull, to capture the geometric meaning of sensitivity. We also prove that the lower bound of error is determined by the sensitivity hull.

Third, we present an efficient location perturbation mechanism, called the planar isotropic mechanism (PIM), to achieve δ-location set based differential privacy.

I. To our knowledge, PIM is the first optimal mechanism that can achieve the lower bound of differential privacy (the state-of-the-art differentially private mechanisms [18, 4] for linear queries can be O(log(d)) approximately optimal, where d is the number of dimensions). The novelty is that in two-dimensional space we efficiently transform the sensitivity hull to its isotropic position such that the optimality is guaranteed.

II. We also implement PIM on real-world datasets, showing that it preserves location utility for location-based queries and significantly outperforms the baseline Laplace mechanism (LM).
2. PRELIMINARIES

We denote scalar variables by normal letters, vectors by bold lowercase letters, and matrices by bold capital letters. We use ||·||_p to denote the ℓp norm, x_i to denote the ith element of x, E(·) to denote the expectation, and x^T to denote the transpose of vector x. Table 1 summarizes some important symbols for convenience.

Table 1: Notation

  s_i      a cell in a partitioned map, i = 1, 2, ..., m
  u, x     location in state and map coordinates
  u*, x*   true location of the user
  z        the released location in map coordinates
  p_t^-    prior probability (vector) at timestamp t
  p_t^+    posterior probability (vector) at timestamp t
  ΔX       δ-location set
  K        sensitivity hull

2.1 Two Coordinate Systems

We use two coordinate systems, the state coordinate and the map coordinate, to represent a location for the Markov model and the map model respectively. Denote by S the domain of space. If we partition S into the finest granularity, denoted by cells, then S = {s_1, s_2, ..., s_m}, where each s_i is a unit vector with the ith element being 1 and the other m−1 elements being 0. Each cell can represent a state (location) of a user. On the other hand, if we view the space as a map with longitude and latitude, then a 2×1 vector can be used to represent a user's location x with two components x_1 and x_2. Figure 3 shows an example using these two coordinate systems.

[Figure 3: Two coordinate systems. A partitioned grid map whose cells are numbered 1, 2, ..., m, with map axes labeled 1 through 5.]

If a user is in s_7, the state coordinate and map coordinate are as follows. Note that the two coordinate systems can be transformed into each other. We skip how to transform them and treat u and x interchangeably.

    u = s_7 = [0 0 0 0 0 0 1 0 ... 0]
    x = [2, 4]^T, with x_1 = 2 and x_2 = 4

As time evolves, the trace of a user can be represented by a series of locations: x_1, x_2, ..., x_t in map coordinates, or u_1, u_2, ..., u_t in state coordinates.
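To make the conversion the paper skips concrete, below is a minimal Python sketch of how the two coordinate systems could map onto each other. The 5×5 grid size, the row-major numbering from the top-left cell, and the function names are our assumptions, chosen only to reproduce the s_7 ↔ [2, 4]^T example above; they are not from the paper.

```python
import numpy as np

# Assumed layout (hypothetical, chosen to match the s_7 example): cells are
# numbered 1..m row by row starting from the top-left; x1 counts East and
# x2 counts North, both 1-based.
W, H = 5, 5        # grid width and height
M_CELLS = W * H    # m, the number of cells

def state_vector(i: int) -> np.ndarray:
    """One-hot state coordinate s_i for cell i (1-based)."""
    s = np.zeros(M_CELLS)
    s[i - 1] = 1.0
    return s

def cell_to_map(i: int) -> np.ndarray:
    """Map coordinate x = (x1, x2)^T of cell i."""
    row_from_top, col = divmod(i - 1, W)
    return np.array([col + 1, H - row_from_top])

def map_to_cell(x: np.ndarray) -> int:
    """Inverse conversion from map coordinate back to the cell index."""
    col, row_from_top = int(x[0]) - 1, H - int(x[1])
    return row_from_top * W + col + 1

u = state_vector(7)   # seventh element is 1, as in the text
x = cell_to_map(7)    # array([2, 4]), matching x = [2, 4]^T above
assert map_to_cell(x) == 7
```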
2.2 Mobility and Inference Model

Our approach uses a Markov chain [37, 17, 25] to model the temporal correlations between a user's locations. Other constraints, such as road networks, can also be captured by it. However, we note that the Markov model, like any mobility model, may have limits in terms of predictability [38]; we will discuss our solution to address these limits later.

In our problem setting, a user's true locations are unobservable, i.e. only known by the user. The sanitized locations released by the perturbation mechanism are observable to the service provider, as well as to adversaries. Thus, from an adversarial point of view, this process is a Hidden Markov Model (HMM).

At timestamp t, we use a vector p_t to denote the probability distribution of the user's location (in each cell). Formally,

    p_t[i] = Pr(u_t* = s_i) = Pr(x_t* = the coordinate of s_i)

where p_t[i] is the ith element of p_t and s_i ∈ S. In the example of Figure 3, if the user is located in cells {s_2, s_3, s_7, s_8} with a uniform distribution, the probability vector can be expressed as

    p = [0 0.25 0.25 0 0 0 0.25 0.25 0 ... 0]

Transition Probability. We use a matrix M to denote the probabilities that a user moves from one location to another. Let m_ij be the element of M at the ith row and jth column; then m_ij represents the probability that the user moves from cell i to cell j. Given the probability vector p_{t-1}, the probability at timestamp t becomes p_t = p_{t-1} M. We assume the transition matrix M is given in our framework.

Emission Probability. If, given a true location u_t*, a mechanism releases a perturbed location z_t, then the probability Pr(z_t | u_t* = s_i) is called the emission probability in the HMM. This probability is determined by the release mechanism and should be transparent to adversaries.

Inference and Evolution. At timestamp t, we use p_t^- and p_t^+ to denote the prior and posterior probabilities of the user's location before and after observing the released z_t, respectively. The prior probability can be derived from the posterior probability at the previous timestamp t−1 and the Markov transition matrix as p_t^- = p_{t-1}^+ M. Given z_t, the posterior probability can be computed using Bayesian inference. For each cell s_i:

    p_t^+[i] = Pr(u_t* = s_i | z_t) = Pr(z_t | u_t* = s_i) p_t^-[i] / Σ_j Pr(z_t | u_t* = s_j) p_t^-[j]    (1)

The inference at each timestamp can be efficiently computed by the forward-backward algorithm for HMMs, which will be incorporated in our framework.

2.3 Differential Privacy and Laplace Mechanism

Definition 2.1 (Differential Privacy). A mechanism A satisfies ε-differential privacy if for any output z and any neighboring databases x1 and x2, where x2 can be obtained from x1 by either adding or removing one record (this is the definition of unbounded differential privacy [21]; bounded neighboring databases can be obtained by changing the value of exactly one record), the following holds:

    Pr(A(x1) = z) / Pr(A(x2) = z) ≤ e^ε

The Laplace mechanism [10] is commonly used in the literature to achieve differential privacy. It is built on the ℓ1-norm sensitivity, defined as follows.

Definition 2.2 (ℓ1-norm Sensitivity). For any query f(x): x → R^d, the ℓ1-norm sensitivity is the maximum ℓ1 norm of f(x1) − f(x2), where x1 and x2 are any two instances in neighboring databases:

    S_f = max_{x1, x2 ∈ neighboring databases} ||f(x1) − f(x2)||_1

where ||·||_1 denotes the ℓ1 norm.

A query can be answered by f(x) + Lap(S_f / ε) to achieve ε-differential privacy, where Lap(·) ∈ R^d are i.i.d. random noises drawn from the Laplace distribution.

2.4 Utility Metrics

To measure the utility of the perturbed locations, we follow the analysis of metrics in [37] and adopt the expected distance (called correctness in [37]) between the true location x* and the released location z as our utility metric:

    Error = sqrt( E ||z − x*||_2^2 )    (2)

In addition, we also study the utility of released locations in the context of location-based queries, such as finding the nearest k Points of Interest (POI). We will use precision and recall as our utility metrics in this context, which we explain later in the experiment section.

2.5 Convex Hull

Our proposed sensitivity hull is based on the well-studied notion of a convex hull in computational geometry. We briefly provide the definition here.

Definition 2.3 (Convex Hull). Given a set of points X = {x_1, x_2, ..., x_n}, the convex hull of X is the smallest convex set that contains X.

Note that a convex hull in two-dimensional space is a polygon (also called a convex polygon or bounding polygon). Because it is well studied and implementations are available [31], we skip the details and only use Conv(X) to denote the function computing the convex hull of X.
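To illustrate the two computational building blocks of this section, here is a minimal Python sketch (ours, not the authors' code) of the Bayesian update in Eq. (1) together with the standard Laplace mechanism of Definition 2.2. The transition matrix, the emission vector, and the identity query below are toy placeholders, not values from the paper.

```python
import numpy as np

def prior(p_post_prev: np.ndarray, M: np.ndarray) -> np.ndarray:
    """Markov prediction step: p_t^- = p_{t-1}^+ M (row vector times matrix)."""
    return p_post_prev @ M

def posterior(p_prior: np.ndarray, emission: np.ndarray) -> np.ndarray:
    """Bayesian update of Eq. (1); emission[i] = Pr(z_t | u_t* = s_i)."""
    unnorm = emission * p_prior
    return unnorm / unnorm.sum()

def laplace_mechanism(f_x: np.ndarray, sensitivity: float, eps: float) -> np.ndarray:
    """Answer a query with f(x) + Lap(S_f / eps) noise, i.i.d. per dimension."""
    return f_x + np.random.laplace(loc=0.0, scale=sensitivity / eps, size=f_x.shape)

# Toy usage on a hypothetical 4-cell map.
M = np.array([[0.7, 0.3, 0.0, 0.0],
              [0.2, 0.5, 0.3, 0.0],
              [0.0, 0.3, 0.5, 0.2],
              [0.0, 0.0, 0.4, 0.6]])             # assumed transition matrix
p_plus_prev = np.array([0.25, 0.25, 0.25, 0.25]) # posterior at t-1
p_minus = prior(p_plus_prev, M)                  # prior at t
emission = np.array([0.1, 0.6, 0.2, 0.1])        # placeholder emission probs
p_plus = posterior(p_minus, emission)            # posterior at t via Eq. (1)

x_true = np.array([2.0, 4.0])                    # identity query f(x*) = x*
z = laplace_mechanism(x_true, sensitivity=1.0, eps=0.5)
err = np.linalg.norm(z - x_true)                 # one-sample estimate of Eq. (2)
```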
3. PRIVACY DEFINITION

To apply differential privacy in the new setting of continual location sharing, we conduct a rigorous privacy analysis and propose δ-location set based differential privacy in this section.

3.1 δ-Location Set

The nature of differential privacy is to hide a true database among its neighboring databases when releasing a noisy answer derived from the database. In standard differential privacy, neighboring databases are obtained by either adding or removing a record (or a user) in a database. However, this is not applicable in our problem. Thus we propose a new notion, the δ-location set, to hide the true location at every timestamp.

Motivations. We first discuss the intuitions that motivate our definition.

First, because the Markov model is assumed to be public, adversaries can make inferences using previously released locations. Thus we, as data custodians in a privacy mechanism, can also track the temporal inference at every timestamp. At any timestamp, say t, a prior probability of the user's current location can be derived, denoted by p_t^-, as follows:

    p_t^-[i] = Pr(u_t* = s_i | z_{t-1}, ..., z_1)

Similar to hiding a database among its neighboring databases, we can hide the user's true location in a set of probable locations derived from this prior.
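The excerpt ends before the formal definition of the δ-location set, but the intuition above (a set of probable locations, derived from the prior p_t^-, in which the true location is hidden) suggests one plausible construction, sketched below: keep the most probable cells until they cover at least 1−δ of the prior mass. The threshold rule and the function name are our assumptions, not necessarily the paper's exact definition.

```python
import numpy as np

def delta_location_set(p_prior: np.ndarray, delta: float) -> set:
    """A plausible delta-location set: the smallest set of cells whose prior
    probability sums to at least 1 - delta, built greedily from the most
    probable cells. (Illustrative; the formal definition follows this excerpt.)"""
    order = np.argsort(p_prior)[::-1]                # cells by decreasing prior
    cum = np.cumsum(p_prior[order])
    k = int(np.searchsorted(cum, 1.0 - delta)) + 1   # smallest prefix covering 1-delta
    return {int(i) + 1 for i in order[:k]}           # 1-based cell indices

p_minus = np.array([0.05, 0.4, 0.3, 0.2, 0.05])  # example prior p_t^-
print(delta_location_set(p_minus, delta=0.1))    # {2, 3, 4}: covers 0.9 of the mass
```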
