Why Filters Can't Work

3/8/98

Statement of Michael Sims (jellicle@inch.com), affiliated with The Censorware Project, to the Senate Commerce, Science and Transportation Committee, to be included in the record for the hearing on "Indecency on the Internet".

[This statement makes reference to two other statements which I believe have been submitted: one by David Burt, who maintains and administers "filteringfacts.org", an Internet website which advocates the use of "filtering software" in libraries; and one by the American Library Association, which does not.]

I am an internet activist concerned with maintaining free speech on the internet in the face of censorship efforts. I have participated in an effort to evaluate one of the better-known commercially available "filtering products", CyberPatrol. Our report is available online at http://www.spectacle.org/cwp/ .

In our report, we showed that CyberPatrol, which has received good press for accuracy, is grossly overbroad. As one of the co-authors, I think I am better qualified to describe the results than David Burt and filteringfacts.org. Despite David Burt's characterization of the report, we showed that CyberPatrol blocked well over a million individual user homepages under its "Full Nudity" and "Sexual Acts" categories. Imagine if you had written your Great American Novel, and found it banned from public libraries because it contained full frontal nudity and graphic description of sexual acts. But it didn't. In fact, it was about biochemistry, or pet grooming, or Nike shoes, or mathematics, or mirrors, or the Army Corps of Engineers, or any of the other subjects of the many sites we found banned for containing full frontal nudity, when they never had and never would. And this gross error couldn't be corrected. The library told you they subscribed to a list, managed by a private company, which even they couldn't look at, which determined what books they would make available, and it seems that your novel wasn't on it. Every morning, the company employees came in with two big black bags. They put some books on the shelves from the one bag, and removed unknown books from the shelves and took them away in the other, and wouldn't respond to any queries about why YOUR book was called "pornography" and had been taken off the shelves.

You would be fighting mad in short order, I imagine.

This is precisely the situation with implementing censorware in libraries today. CyberPatrol's accuracy can best be called abysmal. If you define "accuracy" as

"number of correctly banned web pages divided by total number of webpages banned"

then CyberPatrol is rather less than 1% accurate - if you come across a banned webpage, the odds are much less than 1% that it actually contains any "pornography" by any definition. But you won't be able to evaluate that for yourself - since you'll be prevented from viewing the site at all - so there's no way for the average user to remedy the inherent flaws of companies which try to categorize hundreds of millions of webpages using a staff of a dozen part-time, minimum-wage employees.
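To make the definition above concrete, here is a minimal sketch of the calculation. The function name and the sample counts are hypothetical and chosen only for illustration; they are not figures from our report.

```python
def blocklist_accuracy(correctly_banned: int, total_banned: int) -> float:
    """Return correctly banned pages divided by total pages banned."""
    if total_banned <= 0:
        raise ValueError("total_banned must be positive")
    if not 0 <= correctly_banned <= total_banned:
        raise ValueError("correctly_banned must be between 0 and total_banned")
    return correctly_banned / total_banned


# Hypothetical sample: of 1,000 banned pages checked by hand,
# suppose only 5 actually contained the material they were banned for.
sample_accuracy = blocklist_accuracy(correctly_banned=5, total_banned=1000)
print(f"{sample_accuracy:.1%}")
```

Note that this measures only how many banned pages deserved the ban under the stated criteria. It says nothing about offending pages the list missed, which would need a separate count.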

Companies rely upon computers to determine what should and shouldn't be blocked. The vast size and rates of both growth and change of internet websites make human evaluation impossible - if a company used humans to evaluate all banned sites, they would have to employ THOUSANDS of people and the software would consequently cost thousands of dollars per copy. The search website Yahoo.com has graphically demonstrated this. Yahoo employs fewer than fifty people to maintain its directory of websites. They don't have to search the internet for new sites - all they have to do is categorize submissions from people who want to be listed in the directory (who must request a specific category to be placed in to begin with). Yet Yahoo is deluged. They are swamped with requests to be put in the directory. Most submissions are discarded without action, since the number of submissions so far outstrips the abilities of this group of people to categorize them. I've been trying to get various websites listed there for months, without success.

Imagine now that you had to search the internet to find all potential sites. The problem becomes hundreds of times worse. These companies employ only a few humans and some computer search tools to automate the process of scanning the WWW for potentially offensive sites. Many sites get added to the blacklist without ever being seen by a human. For example, a page discussing Windows emulation software was banned under CyberPatrol's category for alcohol. Why? Because the acronym used for the software was WINE. No human ever viewed this page filled with densely packed text about obscure software, despite CyberPatrol's obviously false claim that humans see all banned sites. A computer "viewed" it, a computer banned it. It is here that I take issue with the ALA's statement. They say:

"When a library installs commercial filters or blocking software, it transfers the professional judgement about the information needs of the community from the librarian to anonymous third parties - often part time workers with no credentials and no ties to the community - who evaluate sites for the software manufacturer."

More correctly, the librarian transfers their judgement to a computer program, which may or may not be superficially overseen by a few humans. This is no substitute for a human with a Master's in Library Science.
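The WINE example shows how such a program can go wrong. The sketch below is purely hypothetical: CyberPatrol's actual matching rules are not public, so the keyword list, category name, and function are my own assumptions, meant only to show how naive keyword matching flags a page without any understanding of its subject.

```python
import re

# Hypothetical keyword list; not taken from any real product.
BLOCKED_KEYWORDS = {
    "alcohol": {"wine", "beer", "liquor"},
}


def categorize(text: str) -> list[str]:
    """Return every category whose keywords appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        category
        for category, keywords in BLOCKED_KEYWORDS.items()
        if words & keywords
    )


# A page about software, not alcohol, still lands in the "alcohol" category.
page = "WINE lets you run Windows programs on Unix by emulating the Windows API."
print(categorize(page))
```

A program like this sees only the string "wine". It takes a human reader to notice that the page is about software.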

Imposing censorware in libraries and schools invokes a slew of Constitutional problems. A private, unseen list of blacklisted works using unknown criteria wouldn't be tolerable even if run by the government, and will be doubly intolerable when administered by a private company. As public servants and members of the United States Senate, you swore to support and defend the Constitution of the United States. Obey that oath in the spirit intended - cease wasting public money defending unconstitutional attempts to censor the internet in response to small, vocal pressure groups. Your children will thank you for it when they have a chance to access information about biochemistry in the public library without being BANNED BY CYBERPATROL.