Trapped in the Filter

by Jonathan Wallace

The Supreme Court just overruled the decision of a trial court invalidating the Children's Internet Protection Act (CIPA), a law which required libraries accepting federal E-rate funding to install software filters. By doing so, the court endorsed several overlapping types of dishonest behavior.

CIPA was a fake-out. Congress passed a law which described itself as limited and benevolent, but which is actually sweeping and violent. CIPA, as the title indicates, pretends to be aimed only at the protection of children from pornography. In reality, it requires the installation of filtering software, censorware, on every Internet-connected computer, whether used by adults or children, in any library accepting even one dollar of E-rate funding.

CIPA also purports to ban only sexual and violent images. But all the software available for library use is unable, given the existing state of the technology, to recognize or categorize images. As a result, a law which bans pictures has been enforced by software which categorizes text as inappropriate, an approach not explicitly described in the law.

CIPA can be read as mandating the use of a nonexistent product: software which can reliably look at images and determine if they are sexual. Since this product does not exist, the companies stepping up to the plate to provide the software used by libraries effectively claim to address the requirements of CIPA, when their products do not come close to doing so. Many of these companies make claims about the efficacy of their products in filtering the Internet and in categorizing text which are wildly wrong. Or they stand silent, mindful of the truthfulness mandated by securities laws, and let the far right and the Justice Department make their claims for them. Which amounts to the same thing.

The fundamental problem in filtering the Net is its size. Nobody knows exactly how large the web is, but Google as of today is purporting to search more than three billion individual web pages. For convenience, I will use this figure as if it were the entire Web, even though we know it isn't.

There are two basic approaches used by censorware companies. One is to have teams of human beings scanning the web, categorizing sites and adding them to the company's blacklist. The other is to write software which does this job. Many of the censorware vendors use a combination of both approaches.

No team of humans could scan and categorize the entire web in any reasonable period of time. If you don't agree with this proposition, take a calculator and work it out for yourself. Assume a team of 100 people (censorware companies keep the actual size of their teams a closely guarded secret, and the evidence is that they are far smaller than this). There are about 2000 working hours in a year, which comes to 120,000 minutes. Assume that a human being can accurately review and categorize a web site in one minute. That would be 120,000 sites per worker per year, assuming no vacations, lunch breaks, bathroom breaks, staff meetings, or inattention. The whole team, working at this inhuman rate, would review 12 million sites a year. In a mere 250 years, they would have reviewed the entire segment of the web covered by Google.
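For readers who would rather not reach for the calculator, the back-of-the-envelope arithmetic above can be sketched in a few lines of code. The figures are the same ones used in the text; the team size and review speed are the deliberately generous assumptions stated there, not data about any actual company.

```python
# Estimate how long 100 human reviewers would take to categorize
# a 3-billion-page web, using the essay's generous assumptions.

PAGES_ON_WEB = 3_000_000_000   # Google's index, standing in for the whole web
TEAM_SIZE = 100                # assumed; real teams are reportedly far smaller
HOURS_PER_YEAR = 2000          # roughly one full working year
MINUTES_PER_SITE = 1           # optimistic: one site accurately reviewed per minute

minutes_per_year = HOURS_PER_YEAR * 60                            # 120,000 minutes
sites_per_worker = minutes_per_year // MINUTES_PER_SITE           # 120,000 sites/year
sites_per_team = sites_per_worker * TEAM_SIZE                     # 12 million sites/year

years_needed = PAGES_ON_WEB / sites_per_team
print(years_needed)  # 250.0 years, with no breaks and no web growth
```

And of course the web does not stand still for 250 years; the backlog grows faster than any human team could clear it.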

Of course, the real teams are far smaller, require vacations and breaks, and cover many fewer sites in a year. There are other problems. There is a high percentage of human error. I interviewed former members of one of these teams, and learned that they are often required to meet quotas of so many sites per day, and that they believe meeting these quotas forces them to spend inadequate time on each site. They are not given substantial training in the fine legal distinctions between obscenity, indecency and literature. They are not lawyers either; they are art students, homemakers, customer service reps and a wide variety of other people who one day find themselves sitting at a terminal, entrusted with the heady responsibility of deciding what you can and can't see in a library.

Because of the impossibility of using human teams to catalog any significant portion of the web, the companies rely very heavily on the other approach, writing "spiders" to crawl the web and categorize sites. These programs rely heavily on keywords which the company has decided are likely to reveal the presence of inappropriate material. This is why a dry series of essays I wrote about the social impact of pornography has been characterized as porn by several of these products; some of the keywords were present, without there being any porn on the site. (In fact, parts of my site have been blocked by numerous censorware products for equally bad reasons, and at least one product blocked the entire site.) Due to the size of the web and the number of sites the spiders crawl in a day, it is not practical to have humans review every site characterized as porn by a spider. Some companies claimed in the past that every banned site was human-reviewed; most no longer make that claim. You need only review a sampling of inappropriately blocked sites (my essays, a Liza Minnelli fan page, the American Association of University Women, a web site for a girls' hockey league, etc. etc.) to determine that no human could have reviewed these.
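To see why keyword matching produces exactly the kind of mistake described above, consider a minimal sketch of this style of classification. The keyword list, threshold, and function name here are hypothetical illustrations of the general technique, not any vendor's actual blacklist or algorithm:

```python
# Sketch of keyword-driven classification, the technique spiders rely on.
# The keyword set and threshold are invented for illustration only.

BLOCK_KEYWORDS = {"pornography", "obscenity", "nude", "xxx"}

def looks_like_porn(page_text: str, threshold: int = 2) -> bool:
    """Flag a page containing 'too many' suspect keywords,
    with no understanding of context or intent."""
    words = page_text.lower().split()
    hits = sum(1 for w in words if w.strip(".,;:\"'()") in BLOCK_KEYWORDS)
    return hits >= threshold

# A dry legal essay ABOUT pornography trips the filter...
essay = ("This essay examines whether pornography should be protected "
         "speech, and how courts distinguish obscenity from literature.")
print(looks_like_porn(essay))    # True: a false positive

# ...while a page that avoids the keywords sails through.
evasive = "Hot pics inside! Click here for the good stuff."
print(looks_like_porn(evasive))  # False: a false negative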

"But Google uses software to review the web. Why shouldn't censorware?" The answer is that the impact is very different. Depending on how you have defined your search, you may receive many irrelevant responses from Google. All Google claims is that a particular word or phrase has occurred on the page it found for you. It is up to you, as an autonomous human being exercising discretion and judgment, to decide if the page is relevant. Censorware uses algorithms less powerful than Google's to decide, by contrast, that you cannot see my essays on pornography. It treats you as if you were a child.

Another issue of stunning importance raised by censorware is whether libraries can appropriately delegate their decision-making processes to twenty-year-old art students with no training clicking through hundreds of sites a day. Even if librarians could create their own censorware by reviewing web sites and deciding which to "acquire" for the library, there seems something inherently suspect in allowing strangers to do it--especially when the strangers refuse to disclose to the libraries the contents of their blacklists or even the methodologies used to compile them.

The Supreme Court majority didn't agree. The decision is what my cousin Marty, the personal injury attorney, used to call "a fuck-you" (oops, there goes your chance of reading this in a public library).

Chief Justice Rehnquist wrote the plurality opinion. He rapidly zeroes in on the fundamental question of whether the software's "overblocking" of innocent and useful web sites creates a First Amendment problem. Ultimately, anyone's opinion on the issue will be driven by whether the blocking of sites like mine as porn makes you indignant or not. The Chief Justice is not bothered. "A public library does not acquire Internet terminals in order to create a public forum for web publishers to express themselves, any more than it collects books in order to provide a public forum for the authors of books to speak." What is the purpose, then? The Chief Justice says it is "to facilitate research, learning and recreational pursuits by furnishing materials of requisite and appropriate quality."

If a person in authority says something grammatically and emphatically enough, it sounds true, but there is much less to this statement than meets the eye. If a library's job is to provide an appropriate range of opinions and creative expressions to its users, how is this different from providing a forum for the authors to communicate with their audiences? One implies, even demands, the other.

Significant portions of my site, such as An Auschwitz Alphabet, and even the pornography essays cited above, are routinely used as sources by high school and college students writing papers. If you block me on a library computer, you are segregating me from a significant section of my audience who I believe would benefit from being able to read my views. If that segregation has occurred because some artificial stupidity software misapprehended the nature of my site, or a homemaker working part-time for a censorware company didn't take enough time to understand what my site is about, don't I have reason to complain?

Not according to Chief Justice Rehnquist. Citing from the explanatory material Congress attached to CIPA, he says, "The Internet is just another method for making information available in a school or library." Apparently the diversity and quality of the information are not especially important.

You can tell in the opening paragraphs of a Supreme Court opinion which way the court is going to break on an issue, by the vocabulary adopted and the amount of information the court chooses to disclose. The Chief Justice chooses not to give a single example of an inappropriately blocked site. He drily acknowledges that overblocking exists, and then makes a statement pregnant with meaning:

Assuming that such erroneous blocking presents Constitutional difficulties, any such concerns are dispelled by the ease with which patrons may have the filtering software disabled.

The first clause, beginning with "assuming", is truly chilling, and justifies my characterization of the plurality opinion as a "fuck-you". The Chief Justice says that the wholesale mischaracterization of innocuous or socially useful sites as porn by a program that then prevents you from seeing them, in a forum in which a large number of Americans secures Internet access, may not present any kind of free speech problem. That's not food for thought; it's poison for thought.

The rest of the sentence is a howler. As the trial court discovered, censorware is not easily turned off in a library. Most products forward you through a single proxy server. The library does not have the ability to get you out to the Internet through any other port, and there is no way to switch off the proxy. The best the library can do is contact the censorware company and request that the particular site the library patron wishes to view be removed from the blacklist. Sometimes the censorware companies don't respond to such requests; when they do, it can take weeks, rendering the unblocking futile, as the patron has probably forgotten, lost interest, or turned in her research paper by now. Finally, there have been instances of censorware companies manually unblocking a site several times, only to have it re-added to the blacklist by the spider.

Censorware is by far the shoddiest, most negligent and scattershot form of censorship ever reviewed by the Supreme Court. For the Court to uphold CIPA, it was probably necessary to take the derisive view that none of the players--the librarian, the Web publisher, the patron--have any rights that really matter. Shades of what the court said in the 19th century when it held that a Negro slave had no rights that "the white man is bound to respect."

Since censorware itself is a fiction--a startling case of doing something inaccurate, poorly planned and buggy in order to create the fiction of doing something--it is not surprising that the Chief Justice tacked on an additional fiction, that the censorware can be turned off when a user requests. If we can pretend that censorware works, we might as well also pretend that it can be shut off.

There is also the chilling effect. Many people would be embarrassed to approach a librarian and ask for a site to be unblocked. The Supreme Court has historically been very sensitive to this issue; it held forty years ago that a federal law was invalid which required subscribers to visit the Post Office and request their copy of the Communist Party newspaper. The Chief Justice doesn't care: The "Constitution does not guarantee the right to acquire information at a public library without any risk of embarrassment."

Astonishingly, the opinion never broaches the issue of the delegation of the librarian's responsibility to untrained teams following a secret methodology. This is a crucial issue. Perhaps it wasn't properly argued on the appeal; or perhaps the plurality had no way to slide past it, and so preferred not to raise it at all. There is a long line of cases holding that governments cannot delegate their First Amendment responsibilities to private commercial parties. These precedents are never raised in the decision, let alone explained away.

Two other justices concurred in the result without joining Rehnquist's opinion. Justice Kennedy noted the trial court's statement that "unblocking may take days", or even be unavailable, but mystifyingly says that this does not appear to be a "specific finding" of the court. He notes that if this proves to be true, he expects the parties to come back to court to argue that CIPA, though "facially" constitutional, is a violation of the First Amendment "as applied". Justice Breyer, also concurring, shares Justice Kennedy's concerns about the possibility that turning off the software does not happen as quickly as it should, but notes that patrons also have to wait for material to be sent from restricted collections.

Justice Stevens, dissenting, begins with the disappointing statement that he believes that the voluntary use of censorware by libraries is an exercise of local discretion. The only question which interests him is whether Congress can make all libraries use censorware as a condition of receiving E-rate funding. His answer is no. "[A]ll filtering software blocks access to a substantial number of Web sites that contain constitutionally protected speech on a wide variety of topics." He notes somewhat sarcastically that, due to the buggy nature of the software, CIPA's message is that "all speech which gets through the software is supported by the government". He concludes that "This Court should not permit federal funds to be used to enforce this kind of broad restriction of First Amendment rights, particularly when such a restriction is unnecessary to accomplish Congress' stated goal."

Justices Souter and Ginsburg filed their own dissenting opinion. They noted that CIPA does not actually require that the library disable the censorware at a patron's request, or even that it unblock a single site. They understand CIPA to mean that adults will be denied access to a substantial amount of information lawful for them if inappropriate for children, and also to "a substantial quantity of text and pictures harmful to no one". Unlike Kennedy, they believe that an individual library should not have the discretion to impose censorware on adults, due to its overblocking of First Amendment-protected information. Complaining of the majority's obfuscation of the true First Amendment issues at hand, they conclude that there is no good reason to "treat blocking of adult inquiry as anything different from the censorship it presumptively is."

Coming in the same week as the Court's landmark decision recognizing the right of gay people to be left alone to pursue their lifestyles in private--certainly one of the most important rights decisions in decades--the CIPA decision is baffling. I'm left with the impression that most of the justices were not convinced that the issue was important enough to care about. For people like me--small publishers whose web sites are blocked by these products--it is an issue of passionate importance.