The Library of Congress recently announced an ambitious archiving project. Every posting on Twitter will be collected and archived forever. I still haven't decided whether this is a good thing or not.
The Benefits
There are some serious benefits to archiving this information. For example, people use Twitter to post their immediate thoughts and activities. This isn't just a snapshot of American life; this is a snapshot of the world! If we ask what life was like 100, 200, or even 500 years ago, we need to estimate based on the available materials. Few people wrote down their daily actions and thoughts. Writing was a time-consuming hobby. And it was expensive -- paper wasn't free.
With Twitter, people constantly post the most mundane things. But those little details are the most enlightening. In 500 years, the question won't be "what was life like back in 2010?" because there will be detailed histories from regular people all over the world. They talk about daily life, common activities, desires, and even opinions about political and environmental topics.
Twitter also permits researchers to follow trends in vocabulary, observe the impact of technologies, and even watch how news and events propagate through society. Twitter is an information goldmine.
The Downside
While there is a definite research benefit, it comes at the price of privacy. We have become a world that never forgets. Frankly, not everything done in public needs to be recorded forever. For example, a quiet conversation between friends -- even if held in public -- may still carry an expectation of privacy. And just because I step outside my house does not mean someone should record my every step. Although the Internet operates with a very public profile, not every public comment needs to be recorded as if the paparazzi found everyone important. Many people post things to Twitter that they would never have written if they thought it would be archived for posterity.
Today, there are people who find that items posted to public forums have very serious consequences, like the woman who was denied a teaching certificate because of a MySpace picture. Students have been expelled over Facebook comments, and two people were arrested for directing protesters via Twitter.
Imagine what people will find if they look more closely at archived tweets... Old-school politicians only needed to worry about items recorded in the media. But we now have an entire generation that is living online. That innocent tweet you made when you were 12 could be the reason you lose your run for governor when you turn 34. And that snide remark posted when you were 14 could be why you are passed over for a promotion when you turn 28, are laid off when you turn 29, and can't find a new job.
Roach Motel
The Library of Congress is supposed to have a copy of every book that has been published in the United States. Its American Memory Project includes detailed interviews of everyday people, recording current thoughts, fears, and daily life for posterity. And the National Digital Preservation Project has been archiving the Web since December of 2000.
However, there is a big difference between these projects and Twitter. When a book is published, there is an expectation that people will read it. There is an expectation that it will be archived. When a web page is published, it is published for the world; most webmasters hope that Google will index it. Cached or archived web pages are the norm. In fact, the Internet Archive's Wayback Machine has been mirroring sites since 1996. And while the American Memory Project records detailed snapshots of real people, the people are interviewed for the purpose of being archived.
But what about Twitter? These people weren't interviewed. And while it is a public forum, there hasn't been an expectation that someone would archive and datamine all tweets forever. Moreover, Twitter is used by people worldwide, and not just in the United States. Did people outside the United States have any expectation of privacy -- even on public tweets?
Where does it end?
Besides posting text, many people tweet pictures. Will those also be archived?
I've been periodically downloading pictures from Twitpic. I've probably archived close to 100,000 pictures. I'm doing it because I want unmodified samples from digital cameras (great for photo ballistics). I've also noticed certain trends. For example, I know what type of pictures to expect depending on the time of day. I can actually associate some camera models with certain regional and socioeconomic classes, I know who the target customer for an iPhone is, and I know the most common thing people photograph: their food. It is really weird... it doesn't have to be a great presentation at an expensive restaurant. It is usually just "here's what I'm about to eat for lunch".
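Spotting that kind of trend mostly comes down to counting. Here is a rough sketch of such a tally in Python. It is not the tooling actually used on this archive: it assumes the camera model and the hour each picture was taken have already been pulled out of the metadata, and the sample records, model names, and function names are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records: (camera model, hour of day the picture was taken).
# These values are invented for illustration; they are not real measurements.
SAMPLES = [
    ("phone-A", 12),
    ("phone-A", 12),
    ("phone-B", 8),
    ("phone-A", 19),
    ("compact-C", 19),
    ("compact-C", 19),
]


def models_by_hour(samples):
    """Count how often each camera model appears in each hour of the day."""
    counts = defaultdict(Counter)
    for model, hour in samples:
        counts[hour][model] += 1
    return counts


def most_common_model(counts, hour):
    """Return the most frequent model for an hour, or None if there are no samples."""
    if hour not in counts or not counts[hour]:
        return None
    return counts[hour].most_common(1)[0][0]


if __name__ == "__main__":
    counts = models_by_hour(SAMPLES)
    for hour in sorted(counts):
        print(hour, most_common_model(counts, hour), dict(counts[hour]))
```

The same grouping works for any other attribute that can be attached to a picture, such as region, by swapping it in for the hour.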
Besides food, people also post semi- and fully-exposed self-portraits. While it is stupid to take those kinds of pictures, and really stupid to post them in a public forum, should their stupidity be archived forever? (Not in my archives. I delete any amateur porn that I download from Twitpic. I just need camera samples and I prefer ones that are workplace safe. Besides, most of these people are really not as attractive as they think they are.)