There is no such thing as anonymized location data when you know where and when someone sleeps and works.

It's a rhetorical fiction the ad industry tells itself.

reply
Right, there's probably no other phone in the world that typically stops for hours within 1000 feet of my bed and typically stops on Monday-Friday within 1000 feet of my work-desk.
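
To make that concrete, here is a minimal sketch of how a home/work pair could be read off a set of "anonymous" pings. Everything in it is an assumption for illustration: the ping format, the grid size, the night and office hours, and the function names are made up, not taken from any real broker's pipeline.

```python
from collections import Counter, defaultdict
from datetime import datetime

def grid_cell(lat, lon, size_deg=0.003):
    """Snap a coordinate to a coarse grid cell (roughly 300 m of latitude)."""
    return (round(lat / size_deg), round(lon / size_deg))

def home_work_pairs(pings):
    """pings: iterable of (device_id, iso_timestamp, lat, lon).

    Home is guessed as the most common night-time cell, work as the most
    common weekday daytime cell. Devices missing either are skipped.
    """
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for device_id, ts, lat, lon in pings:
        t = datetime.fromisoformat(ts)
        cell = grid_cell(lat, lon)
        if t.hour < 6 or t.hour >= 22:
            night[device_id][cell] += 1
        elif t.weekday() < 5 and 9 <= t.hour < 17:
            day[device_id][cell] += 1
    pairs = {}
    for device_id in night.keys() & day.keys():
        home = night[device_id].most_common(1)[0][0]
        work = day[device_id].most_common(1)[0][0]
        pairs[device_id] = (home, work)
    return pairs

def unique_devices(pairs):
    """Device IDs whose (home, work) pair is shared with no other device."""
    counts = Counter(pairs.values())
    return sorted(d for d, p in pairs.items() if counts[p] == 1)

# Synthetic pings: A and B share a home cell, A and C share a work cell.
SAMPLE_PINGS = [
    ("A", "2024-01-08T02:00:00", 40.7000, -74.0000),
    ("A", "2024-01-08T11:00:00", 40.7600, -73.9800),
    ("B", "2024-01-08T02:00:00", 40.7000, -74.0000),
    ("B", "2024-01-08T11:00:00", 40.7100, -74.0100),
    ("C", "2024-01-08T02:00:00", 40.7500, -73.9500),
    ("C", "2024-01-08T11:00:00", 40.7600, -73.9800),
]

print(unique_devices(home_work_pairs(SAMPLE_PINGS)))
```

In this toy data every device shares a home or a work cell with someone else, yet each (home, work) pair is still unique, which is the point being made above.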
reply
Now think what Lavrenti Beria and an LLM could have done with that.
reply
Somebody once said that if Stalin had access to television, he would never have to kill 20+ million ppl. What would he do with all that data? No idea.
reply
Only thing better to rule with is a network connected telescreen that monitors and issues orders to the proles.
reply
Pretty sure it would be hard to enslave these people through television
reply
Would it be? I'd argue the current US administration is entirely propped up by television. Hell, the president seems to "rule" based on what Fox News said last night.
reply
A slightly different and no more charitable perspective is that the people pulling the president's strings are the same people pulling Fox News's strings.
reply
I think this raises the question of what anonymous data means. Sure, my visit to HN is "anonymous" in that it doesn't say "abustamam visited this site", but piece together all the other visits that carry my "anonymous ID" and eventually it paints a pretty nice picture of who I am.
reply
Does it map to a single, identifiable person or something close enough that the distinction is meaningless?

Then it's not anonymous.

Simple as that.

reply
And with LLMs now it’s easier than ever to piece the parts together. Companies were doing it before we even knew what LLMs were capable of.

Edit: It's a rhetorical fiction the ad industry tells us.

reply
We should have learned this lesson 20 years ago when researchers were able to deanonymize a lot of the Netflix Prize dataset, which contained nothing except movie ratings and their associated dates.

https://arxiv.org/abs/cs/0610105

If movie ratings are vulnerable to pattern-matching from noisy external sources, then it should be obvious that location data is enormously more vulnerable.
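
For anyone who hasn't seen how this kind of linkage works, here is a rough sketch of the idea. It is a much simplified stand-in and not the algorithm from the linked paper; the data layout, tolerances, and function names are assumptions made up for the example.

```python
from datetime import date

def match_score(known, record, day_tolerance=3, rating_tolerance=1):
    """Count how many known (movie, rating, date) observations appear,
    approximately, in one anonymized record ({movie: (rating, date)})."""
    score = 0
    for movie, rating, when in known:
        if movie in record:
            r, d = record[movie]
            close_rating = abs(r - rating) <= rating_tolerance
            close_date = abs((d - when).days) <= day_tolerance
            if close_rating and close_date:
                score += 1
    return score

def best_match(known, dataset, min_gap=2):
    """Return the record ID that beats the runner-up by at least min_gap,
    or None when no record stands out clearly."""
    scored = sorted(
        ((match_score(known, rec), rid) for rid, rec in dataset.items()),
        reverse=True,
    )
    if not scored:
        return None
    top_score, top_id = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    return top_id if top_score - runner_up >= min_gap else None

# Synthetic "anonymized" dataset and a few noisy outside observations.
DATASET = {
    "u1": {"m1": (5, date(2005, 3, 1)), "m2": (3, date(2005, 3, 9))},
    "u2": {"m1": (4, date(2005, 4, 2)), "m3": (2, date(2005, 4, 10)),
           "m4": (5, date(2005, 5, 20))},
    "u3": {"m3": (5, date(2005, 1, 15)), "m4": (3, date(2005, 2, 1))},
}
KNOWN = [
    ("m1", 4, date(2005, 4, 3)),
    ("m3", 3, date(2005, 4, 11)),
    ("m4", 5, date(2005, 5, 18)),
]

print(best_match(KNOWN, DATASET))
```

The outside observations are off by a day or two and by a rating point, and one record still stands out. A single vague observation would return None instead.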

reply
> In contrast to previous attacks on micro-data privacy [22], our de-anonymization algorithm does not assume that the attributes are divided a priori into quasi-identifiers and sensitive attributes. Examples include anonymized transaction records (if the adversary knows a few of the individual's purchases, can he learn all of her purchases?), recommendation and rating services (if the adversary knows a few movies that the individual watched, can he learn all movies she watched?), Web browsing and search histories [12], and so on. In such datasets, it is impossible to tell in advance which attributes might be available to the adversary;

Is location data high-dimensional, though?

reply
exactly. calling it 'anonymized' is pure security theater once you have enough data points to map out someone's daily routine.

waiting for legislation or EULAs to fix this is a lost cause since adtech always finds a loophole. the fix has to be architectural: stateless proxies that strip device identifiers at the edge before they even hit upstream servers. if the payload never touches a persistent db there is literally nothing to de-anonymize. stateless infra is the only sane way forward
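
a rough sketch of what the stripping step could look like, assuming a JSON-style payload. the field names in `DEVICE_ID_FIELDS` are hypothetical, and the coordinate rounding is an extra assumption on top of what is described above, not part of it.

```python
# Hypothetical identifier fields; a real payload schema will differ.
DEVICE_ID_FIELDS = {"idfa", "gaid", "device_id", "ip", "user_agent"}

def scrub(payload, precision=2):
    """Return a copy of the payload with identifier fields dropped and
    coordinates rounded. Nothing is stored; the input is left untouched."""
    clean = {k: v for k, v in payload.items() if k not in DEVICE_ID_FIELDS}
    for key in ("lat", "lon"):
        if key in clean:
            clean[key] = round(clean[key], precision)
    return clean

request = {
    "idfa": "hypothetical-ad-id",
    "ip": "203.0.113.7",
    "lat": 40.712776,
    "lon": -74.005974,
    "query": "weather",
}
print(scrub(request))
```

this only covers the edge step. whatever is forwarded upstream still has to be free of stable identifiers for the argument to hold.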

reply
To be honest, I feel like this is where iOS and Android are failing us. Why is every app allowed to embed a bunch of trackers? Only blocking cross-app tracking on user request as iOS does is not enough (and data of different apps/websites can be correlated externally).
reply
i'm not sure about allowed. perhaps required may be closer.

why would someone include tech that makes people think twice about using the app, unless it is required if you want to "sell" in a particular venue.

if you're developing geolocation based apps, location tracking is a core function.

a calendar absolutely does not require location tracking beyond which side of the prime meridian you are on.

reply
> if you're developing geolocation based apps, location tracking is a core function.

But the subsequent sale of that data is not, and that is the discussion here.

reply
and the reason why that data is available for sale starts with forced collection of data, if you want to participate in an app store as a developer.

you can't sell what you don't have unless you lie lower than a rug.

fix the data collection problem and a second order effect of no data for sale emerges.

reply
Are you suggesting Android/iOS app developers are forced into data collection somehow? If so, how? I'm genuinely curious.
reply
> why would someone include tech that makes people think twice about using the app, unless it is required if you want to "sell" in a particular venue.

Because the overwhelming majority of people don't think twice about this tech.

I do, and that's why I use a lot of web tools or old-fashioned phone calls, but most people think metadata=unimportant and assume that the purpose of the app is what it does for them rather than to gather their personal information for sale.

reply
Because we don’t enforce antitrust law in this country and the people that make those decisions profit from the ads.
reply
> To be honest, I feel like this is where iOS and Android are failing us. Why is every app allowed to embed a bunch of trackers? Only blocking cross-app tracking on user request as iOS does is not enough (and data of different apps/websites can be correlated externally).

Even if Google and Apple both want to commit to fighting this, it becomes a game of whack-a-mole, because there are all sorts of different ways to track users that the platforms can't control.

As an easy example: every time you share an Instagram post/video/reel, they generate a unique link that is tracked back to you so they can track your social graph by seeing which users end up viewing that link. (TikTok does the same thing, although they at least make it more obvious by showing that in the UI with "____ shared this video with you").
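
A minimal sketch of how per-share links could feed a social graph. This is an illustration only, not how Instagram or TikTok actually implement it; the class name, the `s` query parameter, and the example.com URL are all made up.

```python
import secrets
from urllib.parse import parse_qs, urlsplit

class ShareLinkTracker:
    """Toy model: each share gets its own token, and each view of that
    token adds a (sharer, viewer) edge to an inferred social graph."""

    def __init__(self, base_url="https://example.com/p/"):
        self.base_url = base_url
        self.token_owner = {}  # token -> (sharer, post_id)
        self.edges = set()     # (sharer, viewer) pairs

    def create_link(self, sharer, post_id):
        token = secrets.token_urlsafe(8)
        self.token_owner[token] = (sharer, post_id)
        return f"{self.base_url}{post_id}?s={token}"

    def record_view(self, url, viewer):
        """Return the sharer the link traces back to, or None if the
        URL carries no known token."""
        token = parse_qs(urlsplit(url).query).get("s", [None])[0]
        owner = self.token_owner.get(token)
        if owner is None:
            return None
        sharer, _post_id = owner
        self.edges.add((sharer, viewer))
        return sharer

tracker = ShareLinkTracker()
link = tracker.create_link("alice", "post42")
print(tracker.record_view(link, "bob"))
```

Nothing here depends on an advertising ID or any OS-level permission, which is why a platform has a hard time blocking it. Stripping the query string before opening the link breaks the association in this toy model.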

reply
How is this legal under the GDPR? There are clear examples in the Citizen Lab document of a user being tracked inside of the EU from outside.

Is there not also a requirement for clear consent? I.e., a weather app can’t track your precise location?

reply
Companies exist that de-anonymize other data brokers' data. This lets the other data brokers claim they have anonymized data while end users get everything.
reply
you could probably run an anonymization company at the same time you run a de-anonymization company
reply
Best of both worlds - legal and profitable /s
reply
> enough samples that you can apply statistics to find precise locations, in many cases you can de-anonymize the IDs

I think a lot of people don't realize the power of a big enough sample size. With enough samples even something pretty innocent looking like your daily step counter could make you identifiable.

As far as I know we don't have large enough databases to make this happen in practice, but I don't think this is impossible in the future.

reply
How large are you estimating is "large enough"?
reply
In what sense can the latitude and longitude of my house be called anonymous data?
reply
Ultimately, a map is anonymous data containing lat/lon of everyone's house

Alone, these points are not deanonymizing; they become so when other data is associated with them.

reply
Location and identity are inextricably linked. You can't destroy identity without also destroying location and location is critical for myriad purposes.

The analytic reconstruction of identity from location is far more sophisticated than the scenarios people imagine. You don't need to know where they live to figure out who they are. Every human leaves a fingerprint in space-time.

reply
> and location is critical for myriad purposes.

It's not though.

Critical for myriad elective purposes? Sure.

reply
Only if you consider the entire concept of logistics in civilization as "elective".
reply
Seems hyperbolic. We had logistics that functioned extremely well before we had customer location data for sale on 3rd party sites.
reply
If you re-read the comment they didn't say that selling it was intrinsic.
reply
The article is about privacy tracking spyware cookies. I think making statements in that context about how modern logistics don't work without location data implies you mean location data from those sources. I mean, I suppose it doesn't have to, but then it just feels off topic, no?
reply
I don't follow what you mean by 'logistics in civilization' as that's pretty vague and amorphous.

Could you be more specific with maybe a single example of where my physical geographic location is electronically critical for a purpose that isn't elective/optional/avoidable?

(And I'm not just trying to be obtuse. I think you're touching on at least part of the 'heart' of both this conversation and that of digital ID verification.)

reply
How does tracking the movements of individual humans aid shipping and logistics, other than providing traffic data to freight companies? How did we manage to have global supply chains prior to GPS being invented?

Edit: I assume I am missing a crucial part of logistics that you’re familiar with, genuinely curious.

reply
From what I've seen none of this is that complex, one could simply 'draw a circle around your house' and get all the "anonymized" device pings and just trace those.
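
Roughly what I mean, as a sketch on made-up data. The ping format, the 50 m radius, and the function names are assumptions, not anyone's actual tooling.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))

def devices_near(pings, lat, lon, radius_m=50):
    """IDs of devices with at least one ping inside the circle.
    pings: iterable of (device_id, iso_timestamp, lat, lon)."""
    return {d for d, _ts, plat, plon in pings
            if haversine_m(lat, lon, plat, plon) <= radius_m}

def trace(pings, device_ids):
    """Time-ordered trail of every ping for the selected devices."""
    trails = {d: [] for d in device_ids}
    for d, ts, plat, plon in pings:
        if d in trails:
            trails[d].append((ts, plat, plon))
    for trail in trails.values():
        trail.sort()
    return trails

# Synthetic pings: "x" is seen next to the house, "y" never is.
PINGS = [
    ("x", "2024-01-08T23:10:00", 40.7001, -74.0000),
    ("x", "2024-01-08T12:30:00", 40.7600, -73.9800),
    ("y", "2024-01-08T23:15:00", 40.7100, -74.0000),
]
HOUSE = (40.7000, -74.0000)

ids = devices_near(PINGS, *HOUSE)
print(trace(PINGS, ids))
```

The IDs stay "anonymous" the whole time; the circle is what ties them to an address.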
reply
Yep. With side channels, or thinking one level above the laws, it's trivial to get around them. We need better laws.
reply
> A lot of geolocation data on the market is anonymized

A lot isn't good enough.

reply