On Pseudonymity, Privacy and Responsibility on Google+

[This was originally posted on Google+ (https://plus.google.com/117903011098040166012/posts/asuDWWmaFcq) where it went viral for a while. It’s still my most popular post. Since then of course Google finally gave up on their “real names” policy. Turns out it didn’t actually improve the quality of discussion at all–and it hurt people. Facebook, OTOH, still deletes accounts using pseudonyms, and it continues to be a tool of attackers to shut down victims.]


Google has said that they plan to “address” the issue of pseudonymity in the near future. I hope that these thoughts and experiences may help inform that decision.

Protections for anonymous speech are vital to democratic discourse. Allowing dissenters to shield their identities frees them to express critical, minority views . . . Anonymity is a shield from the tyranny of the majority. . . . It thus exemplifies the purpose behind the Bill of Rights, and of the First Amendment in particular: to protect unpopular individuals from retaliation . . . at the hand of an intolerant society.
———— 1995 Supreme Court ruling in McIntyre v. Ohio Elections Commission

This whole persona/pseudonym argument may seem like a tempest in a teapot, but the fact is, the forum for public discourse is no longer the town hall, or newspaper, or fliers on the street. It is here on the Internet, and it is happening in communities like this, hosted by private sector companies. Freedom of speech is not guaranteed in these places. As +Lawrence Lessig once said,“the code is the law.” The code that Google applies, the rules they set up now in the software, are going to influence our right to speak out now and in the future. It is imperative that we impress upon Google the importance of providing users with the same rights (and responsibilities) as exist in the society that nurtured Google and brought about its success.

I’m going to try to summarize the discussion as I’ve seen it over the past few weeks. Since this is a long post (tl;dr), here’s a description of what’s coming so if you want, you can skip to the section that you’re interested in.

First I’m going to address some red herrings; arguments that actually have no bearing on pseudonyms. I will explain why I think we should be having this discussion about a company’s product. I’ll explain, through painful personal disclosure, the experience of close friends, and other examples, why someone might want to use a pseudonym. Then I will address the arguments I have heard against pseudonyms (and some of them are quite valid), and what some alternatives might be.

I apologize for the length of this post, I know it could be trimmed.

U.S. Soldier’s Guide to Iraq—Circa 1943

U.S. Soldier’s Guide to Iraq—“Circa 1943
MSNBC/Newsweek

You aren’t going to Iraq to change the Iraqis. Just the opposite. We are fighting this war to preserve the principle of ‘live and let live.’ Maybe that sounded like a lot of words to you at home. Now you have a chance to prove it to yourself and others. If you can, it’s going to be a better world to live in for all of us.”…

It is a good idea in any foreign country to avoid any religious or political discussions. This is even truer in Iraq than most countries, because it happens that here the Moslems themselves are divided into two factions something like our division into Catholic and Protestant denominations—so don’t put in your two cents worth when Iraqis argue about religion. There are also political differences in Iraq that have puzzled diplomats and statesmen.”

Seventy years ago, and we understood the issues better than we do now.

Protecting Property in New Orleans—The Law Looks the Other Way for Some

New York Times

ORLEANS, Sept. 8 – Waters were receding across this flood-beaten city today as police officers began confiscating weapons, including legally registered firearms, from civilians in preparation for a mass forced evacuation of the residents still living here.

No civilians in New Orleans will be allowed to carry pistols, shotguns or other firearms, said P. Edwin Compass III, the superintendent of police. “Only law enforcement are allowed to have weapons,” he said.

But that order apparently does not apply to hundreds of security guards hired by businesses and some wealthy individuals to protect property. The guards, employees of private security companies like Blackwater, openly carry M-16’s and other assault rifles. Mr. Compass said that he was aware of the private guards, but that the police had no plans to make them give up their weapons.

That’s an interesting distinction. One can understand the practical aspects. Those who can afford to hire armed security forces presumably can afford to keep them healthy and fed. And those forces (perhaps) are less likely to engage in illegal activities than non-incorporated forces. But fundamentally, this means that people with money can protect their property by means that violate the law, but people without money cannot. Whether the decision is valid or not, the result is that the poor will lose more than the rich.

Definition: Buffer Overflow

Buffer Overflow

If you read any press on computer security problems, at some point
you are likely to come across the phrase “Buffer Overflow”–it’s by
far the most common security error that programmers make. It’s common
for several reasons.

  • It has nothing to do (by itself) with security.
  • It’s an easy error to make, and a hard one to detect.
  • It’s human nature not to expect the unexpected.

So what is a buffer overflow? I’ll start off extremely non-technical
here, and gradually bump up the level until the final section, at
which point if you don’t understand programming and call stacks you
may want to stop reading, and if you do understand them, you may
decide to start reading.

First, here’s the non-technical explanation.

You need to tell a co-worker something important, you go to their
office, expecting a conversation something like this:

“Hello.”
“Hi.”
“I though you should know about this new thing.”
“Oh? What is it?”
You tell them the important thing.

Instead the conversation goes like this:

“Hello.”
“Hey! Just the person I wanted to see! Did you hear about this
crazy election thing,”…followed by five minutes of political
diatribe. By the end of the conversation, not only have you
forgotten what you came in to say, you’re on the way out the door
with a poster to protest something.

Your buffer just overflowed, and you were hijacked for a purpose
other than your original intent. You had an expectation of how the
conversation would go (the protocol) and it was violated, with the
result that you ended up doing something different. That’s exactly
what happens to a program when someone exploits a buffer-overflow
problem.

Now a slightly more technical explanation.

When a program is designed, it is designed with an interface to the
outside world. That interface is not just what you see on the
screen, but also how it communicates with other programs and the
operating system. The interface is typically defined in terms of
either an API (a set of programming conventions for direct
communication with another piece of code) or a protocol (a definition
of a set of data and commands to be passed between programs). Think
of the API as how your brain tells you arm to pick something up, the
protocol as how you ask someone to pass the salt. Of course the
protocols are not always executed directly. Your brain tends to use
the mouth API to tell someone to pass the salt, rather than using
telepathy directly, and many programs use standard sets of code
provided by the operating system when they want to use a protocol.

Now, these APIs and protcols specify the form of the information to
be passed back and forth. For instance, a specification might say
that the correct response to an initial communication is no more than
five letters long (e.g. “Hello”). In the days before people had to
worry about hostile programs, code was written assuming that the
program you were talking to was going to be following the rules of
the protocol. If the protocol said “five letters” then there wasn’t
a lot of point in leaving room for six. Sure, your program might
crash if there were six, but it wasn’t your bug, it was a bug in
the program talking to you–it should have sent five letters.

So that’s a buffer overflow. You expect one thing, and somebody
sends you something much bigger. The “buffer” that you had set aside
to store that information doesn’t have room for what you get, and you
end up writing those six (or six hundred) letters on top of other
things that you were trying to remember. Obviously that’s not going
to be a good thing for the continued functioning of your program, but
it turns out it’s also a major security problem.

And still a bit more technical.

Computers tend to think in terms of two things–code and data. Code
consists of the instructions for the computer, telling it what to do.
Data is what it does it to and with. When you run a program, it
loads into memory both the code and the data that code needs. When
that program communicates with some other program, it is receiving
data, and it will then use the code that it already has to figure out
what to do next. This makes remote communication relatively safe.
The remote program can only tell the local program to do within the
constraints of the original code. Assuming nobody has done anything
stupid (which is not generally a good assumption), the remote program
cannot tell the local program to do anything that wasn’t originally
intended.

Modern computer architectures have an unfortunate design, however.
They don’t really no the difference between data and code. If
somebody can convince your program to try running the data that it
has in memory, it will do so quite happily. So a malicious program
has two goals. First it wants to get some code to your machine, and
then it wants to persuade somebody to run it. This is of course, no
different than an email virus writer’s goal. In that case, they
expect you to run it, in the case of a buffer overflow, they expect
the broken program to run it. Email viruses are so successful
because users often don’t know the difference between data and code
either (and some operating systems helpfully try to hide the
difference so as no to confuse them).

It turns out that if a malicious programmer can find a target program
that didn’t check for a buffer overflow, it can be very trivial to
get that program to execute code provided by the remote program. So
easy, in fact, that there are standard packages out there that
provide the entire payload for the overflow–all the script kiddie
(we’ll define that sometime, but suffice to say it isn’t a compliment
of someone’s hacking prowess) has to do is find the write length for
the buffer overflow and bang–they have control of your computer.

Before you panic, remember that doing this requires that they have
remote access to a program on your computer already, and that that
program have a buffer overflow problem. That means (for an internet
exploit) that your computer has to have some program that is
listening to external connections (e.g. print server, file
sharing…) or that you have a malicious user at your computer (or
you helpfully downloaded and ran their software).

Now let’s get completely technical.

How does a buffer overflow exploit work from a programmer’s perspective?

First you find some place in that program where it’s reading data and
assuming that it’s going to be reading something rational. E.g.

        char    buf[4];      /* Store 4 characters */
        gets(buf)               /* Read any number of characters from the input
                                                and put them in buf */

where the input turns out to be more than 4 characters long.

Now the question is, where is the data stored in “buf” located?

If “buf” is a global variable, then that data is probably allocated
in a data segment somewhere, and you’re going to try and overwrite
some other piece of data which will result in something useful (e.g.
a place where the program was going to execute one program, now
executes another). That’s tricky and hard to do without source code.

However “buf” is probably a local variable, allocated on the stack.
So instead of overwriting data, your goal is to overwrite the stack
itself. So you are going to put in buf some amount of padding (that
will overwrite the rest of the data stored on the stack), followed by
some machine code that overwrites the part of the stack that had code
on it. You’ll set things up so that your code will be executed
(possibly when this particular function returns) instead of the code
that normally would have been executed. Now you’re home free. Since
there are plenty of examples of sample exploit machine code, all you
need to do when you find a new buffer overflow is figure out the
appropriate offset–the rest of the work has been done already. You
don’t need to transfer very much data, just enough to run something
that connects you to the remote machine–from there you can transfer
the rest of the software you want to install remotely.

This is where security-by-obscurity comes in handy. Want to lessen
the chance of buffer-overflow attacks? Just run some obscure piece
of hardware. Run a Mac, or even Linux on the PowerPC1Of course with Apple switching to an Intel platform, some of that obscurity goes away, but exploits still have to vary from operating system to operating system, even if the underlying processor is the same.. It’s not that
there aren’t buffer-overflow problems, but their are less handy
examples of how to exploit them running around. Less examples, less
successful attacks. It’s not a solution of course (especially if
everyone does it :-), but it is one way to slightly increase your
odds of remaining secure.

There are machine/OS architectures that would make buffer overflows
much harder to exploit. Disable dynamic creation and execution of
code on the stack for one. Or keep a separate data stack. And there
are tools out there which will put watchdog data on the stack, and
then watch it to make sure it doesn’t get overwritten (effective, but
rather painful from a performance standpoint). But fundamentally,
where there are bugs, there are exploits. And modern software, with
it’s layers and layers of abstraction that no one person can fully
grok, has a hell of a lot of bugs.

Why I do *not* support Family PC’s Parent’s Bill of Rights

When was it that everyone started to talk about
rights, and forgot all about responsibilities?

[A note from the future, in 2017. I was right. I didn’t get more conservative, and they grew up safe and awesome. One’s editing movies. The other’s working QA at a robotics company. Let’s hear it for sensible parenting.]

Kee's kids--A long time ago
Before I begin, let me set
some context. I’m a parent, I have two terminally cute daughters; one six, the
other four. I’ve heard that the number one correlation between sexual conservatism
and other factors is whether a person has daughters. Maybe things will be different
when they reach adolescence, but so far my values haven’t changed.

So now we have had a
summit,
and
everyone’s talking about how to protect the rights of
parents on the internet.
This is apparently something that greatly concerns many parents,
although
from a reading of the statistics,
I can only assume that it’s primarily a concern of parents who are
not
on the internet, since those that are, aren’t even using the available
tools. But I don’t mean to belittle the core desire–parents
want to make sure that children’s exposure to new concepts and people is
consistent with their beliefs, whether that exposure is on the internet, the
street, or the corner store.

And that’s the fundamental issue I have with all this ruckus. The
internet doesn’t exist as a thing, it isn’t something that’s safe or not
safe. The internet is a community of people, and the things you
have to teach your kids in this community are the same as the things you teach
them in your own. Be polite, don’t interrupt, don’t speak unless
you have something to say, stay away from the seamier parts of town, and of
course, don’t go off alone with strangers. Those are values I try and teach my
kids. If I haven’t gotten them across by the time they learn to
send email, it’s probably too late anyway. But these values are not
specific to
the internet–I expect them to be applied online, and down at the
coffee shop.

I think I know where things went wrong. Some parents thought that if
their kid was staying home in front of the computer, that they were safe
and could be left alone–just like when they were sitting in front
of the television. They were wrong of course, the two mediums are not
comparible–there is far more violence and sex on television.

But on the internet, no one knows you’re a dog. What good does it do to teach
your kids right from wrong, if someone can pretend to be a teenage soulmate,
when they are actually a lecherous old man? There is some validity in this,
but frankly, anyone who has spent much time in online communities very quickly
learns that identity is both central to, and yet completely apart from, the
online experience. I spent my freshman year in college hooked on “the con”,
as it was called by those of us with access to
Dartmouth’s
Time Sharing System
years before AOL’s forums and IRC. We all knew
the story of the guy that gets all excited about this great girl he’s been chatting
with for hours, only to walk over to his roommate’s cubby to tell him the news–and
find out he’s been chatting with him all this time. The notion of an
online identity, or identities, that is separate from your physical one is fundamental
to the system–our children will understand that long before their parents.
This isn’t the dark side of the internet, this is one of the liberating things
about the internet. (Note that having multiple identities is not the same as
being anonymous, I’ll talk more about that some other time–if you want some
mandatory reading on that subject, check out “The Transparent Society : Will Technology Force Us to Choose Between
Privacy and Freedom”
by David Brin.)

“But…,” (says my wife), “it is different, you think they
are safe because they are in the house. With other communities, you know where
they are.” Well, one can hope, but the teen
pregnancy
rate in the U.S. would seem to argue otherwise. The fact of the
matter is that, the older your kids get, the less control you have over them.
That’s why I see this whole thing very differently. This isn’t an issue of parent’s
rights, it’s a question of parent’s responsibilities, and that’s
a word that seems to be very much out of favor recently. All the internet is
doing is bringing home that our job as parents is not to control, but to guide.

Here is FamilyPC’s “Internet Bill of Rights” proposal, with my responses. They asked if they
could publish by responses, so if you see it in a physical copy, let me
know which issue.

  1. Parental blocking software should be integrated into every Internet
    browser.


    Including experimental
    ones? Ones meant for developer use only? Ones running on PDAs that
    don’t
    have the necessary memory or CPU? No. If the demand is there, the
    industry
    will provide it. Protection of children is the responsibility of
    the parent,
    not something that can be regulated.
  2. Web site creators must
    rate their
    sites in an industry-standard way that is recognizable by the browser
    (for now this means using RSACi or SafeSurf or PICS).


    Aside from being
    unenforcable, there is no need to do this. If people stop going to
    unrated
    sites, then sites will rate themselves. The fact of the matter is the
    majority of sites that rate themselves are adult sites–they don’t
    want
    minors on their sites. The rest of sites don’t have the time or
    interest
    to rate themselves.
  3. An arbitration board should be created to
    arbitrate
    discrepancies in site ratings.


    That implies that ratings have the
    force
    of law. Ratings are going to be relative my definition. An independent
    board cannot be created to legislate free speech. If we were
    talking about
    signs on a front yard, this wouldn’t stand up in court for a
    second.

  4. Webmasters who do not comply with voluntary ratings should not be
    listed
    on the major search services.


    Absolutely not. This restricts adult
    access
    to sites, never mind access to sites outside of the United States.
    Search
    engines are already beginning to offer alternative, rated-only search
    facilities. There is no need to legislate this.
  5. Children’s chat
    rooms
    will be monitored to keep them safe; monitoring can be human or
    electronic.


    If you are worried about what your children say to whom, then
    monitor
    them. Don’t forget to tape phone conversations and follow them to the
    school bathroom as well. Chat room monitoring is neither practical or
    workable.
  6. Web sites must fully disclose what they do with
    information
    collected from people who register at their sites.


    This is a
    general issue
    that has nothing to do with the specific issue you are addressing
    here.
  7. Advertising must be clearly labeled as advertising and kept
    separate
    from editorial content.


    Ditto.
  8. If online shopping is involved,
    advertisers
    must require parental permission prior to purchase. Parents will be
    able
    to cancel an order mistakenly sent by a minor at no charge to the
    parent.


    The standards here should be the same as they are anywhere else.
    Use of
    a credit card is deemed to be an indication of adult status.
  9. If an
    advertiser communicates with a child by e-mail, the parent should
    be notified
    and should have the option, with each mailing, to discontinue
    mailings.


    If you want to disallow communications with children by advertisers, I
    might consider that a good goal. However, “on the internet no one
    knows
    your a dog”. It’s impossible to tell whether you are communicating
    with
    a child on the internet. As for the ability to remove yourself from
    commercial
    mailings–go for it, but this is a general issue, not one specific to
    children’s/parent’s rights.
    Frankly I find the whole concept of a
    “Parent’s
    Bill of Rights” to be misguided. First we need to construct a Parent’s
    Bill of Responsibilities. For the past 15 years my closing email
    signature
    has been the same. And every year I feel it is more and more
    appropriate.
    “I’m not sure which upsets me more; that people are so unwilling to
    accept
    responsibility for their actions, or that they are so eager to
    regulate
    everyone else’s.”