

17 July 2015

1.7 billion "anonymous" comments from 5% of the internet


A sends:

1.7 billion "anonymous" comments from 5% of the internet

All in one searchable database:

https://archive.org/details/2015_reddit_comments_corpus_sqlite

Cryptome: What will this be used for?

A:

It depends on who the user is. Law enforcement and private investigators will use the information to try to:

1. Identify individuals based on behavioral analysis of comments, etc.

2. De-anonymize individuals and leverage this information on other platforms, i.e., by checking for identical or similar usernames and using the behavioral analysis to predict other (online or offline) hangouts and activities, in order to build a more complete picture.
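
To make the "identical/similar usernames" check in item 2 concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names, the normalization rule, and the 0.8 threshold are all hypothetical and are not taken from the database or from any tool an investigator is known to use. It relies only on the standard library's difflib.

```python
from difflib import SequenceMatcher


def normalize(username: str) -> str:
    """Lowercase the name and drop separators such as '_', '.', and '-'.

    Assumption: separators and letter case carry no identifying signal.
    """
    return "".join(ch for ch in username.lower() if ch.isalnum())


def username_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two usernames."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def similar_usernames(target: str, candidates: list[str], threshold: float = 0.8):
    """Return (candidate, score) pairs at or above the threshold, best first.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    scored = [(name, username_similarity(target, name)) for name in candidates]
    matches = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical handles, invented for this example.
    for name, score in similar_usernames(
        "night_owl_42", ["NightOwl42", "daywalker", "night.owl.42"]
    ):
        print(f"{name}: {score:.2f}")
```

A score like this only flags candidates for a closer look. Common handles collide constantly, so a name match on its own says very little, which is why the answer above pairs it with behavioral analysis.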

Sociologists and psychologists will use it to build behavioral models of individuals acting on their own and of ad-hoc groups of individuals that form without any external organization, goal, etc.

Members of the public and historians will use it to look for public figures and study what they wrote, in order to better understand them. More importantly, the public should treat this database as a wake-up call that the driving force behind Big Data isn't Big Brother - it's the masses. Between this, the Dark Net Market archives, and some other releases in the last few weeks, it's becoming more apparent that the "right to be forgotten" may be recognized by some governments, but private individuals and researchers, not just megacorps, remain major obstacles to it.

This is, simply put, the biggest example of open source SIGINT to date. The fact that it was done legally and openly, and not as the result of a hack or data leak, may make it seem less newsworthy - but if anything, that makes it even more alarming to privacy advocates. It's not a one-off either; it's just one of the biggest signposts we've seen so far.