Currently Skimming:

2. Technology
Pages 31-70

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 31...
... This chapter addresses the technological dimensions of this "reference scenario" and some of the things that can be done to protect against it. 2.1 AN ORIENTATION TO CYBERSPACE AND THE INTERNET 2.1.1 Characteristics of Digital Information In the reference scenario, the student is seeking information (content)
From page 32...
... Digital information can be shared more easily than any type of analog information in the past. In the physical world, broadcasting information to groups has serious costs and hence requires a certain wherewithal and commitment.
From page 33...
... 4 It is true that access to the Internet may require an individual to log into a computer or even to an Internet service provider. But for the most part, the identity of the user, once captured for purposes of accessing the Internet, is not a part of the information that is automatically passed on to an applications provider (e.g., a Web site owner).
From page 35...
... Given the vast information resources that it offers coupled with search capabilities for finding many things quickly, it is no wonder that for many people the Internet is the information resource of first resort. 2.1.3 Internet Access Devices In the reference scenario, the student uses a computer to access the Internet.
From page 36...
... 2.1.4 Connecting to the Internet In the reference scenario, the student connects to the Internet. In general, access to cyberspace is provided by one or more Internet service providers (ISPs)
From page 37...
... These services and content are available only to those who subscribe to those online service providers. In other cases, services are available to some non-subscribers (for example, the instant message (IM)
From page 38...
... Advertising entails payments by advertisers to the ISP for the privilege of displaying ads, and thus the user must be willing to accept the presence of ads in return for access privileges. 2.1.5 Identifying Devices on the Internet: The Role of Addressing Every computer or other device connected to the Internet is identified by a series of numbers called an IP address.7 The domain name system is a naming system that translates these computer-readable IP addresses into human-readable forms, namely domain names.
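The relationship between domain names and IP addresses can be illustrated with a minimal sketch. The lookup table and the name `www.example.org` below are hypothetical stand-ins for the real domain name system, which is a distributed service rather than a local dictionary; the point is only that a human-readable name corresponds to an address that is, underneath, a single number.

```python
import ipaddress

# Hypothetical name-to-address table standing in for the domain name system.
# The name and the address (from a range reserved for documentation) are
# illustrative assumptions, not data from the chapter.
TOY_DNS = {"www.example.org": "192.0.2.10"}

def resolve(name):
    """Return the IP address object registered for a name in the toy table."""
    return ipaddress.ip_address(TOY_DNS[name])

addr = resolve("www.example.org")
as_number = int(addr)  # the same IPv4 address viewed as one 32-bit integer
print(addr, addr.version, as_number)
```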
From page 39...
... Also, some Web pages are accessible only through the "https:" protocol. 9 For example, as of November 2001 the Google search engine had indexed 1.6 billion Web pages.
From page 40...
... or private (by invitation only)
· Content of chat and online identities of participants are visible to everyone participating in the chat room
· Chat rooms are an online equivalent of CB radio
· Used to initiate, establish, and maintain online relationships
Instant messages
· One-on-one dialog, and private
· Text-based, but can contain links; images and voice can sometimes be transmitted as well
· Initiation of instant message requires knowledge of user name
· "Buddy lists" allow user to know who is online at the same time as the user
Usenet
· Populated by some 30,000 newsgroups of specialized topics; newsgroups function essentially as online bulletin boards on which users can post anything they wish, often anonymously
· Many newsgroups contain sexually explicit material, and some are oriented primarily toward such material; sexually explicit content on Usenet newsgroups is often more extreme than that on adult-oriented Web sites
· Cost of content distribution is borne by the Internet service provider that carries newsgroups with content rather than by publisher or receiver
· Sexually explicit Usenet newsgroups serve as conduits for advertising of adult-oriented Web sites and as a medium in which sexually explicit content can be exchanged among users
· Internet service providers make choices about what Usenet newsgroups to carry; some carry the full line, and others carry only a subset (e.g., all except those devoted to child pornography)
From page 41...
... Search engines rely on technologies of information retrieval, as discussed in Section 2.2. Given the enormous volume of information on the Web, users in general do not know where to find the information they seek.
From page 42...
... 10 This mode of file sharing first gained widespread publicity with the Napster network, an online service that facilitated the sharing of digital music files among users. The files themselves (the information content of interest to end users) always remained on client systems and never passed through a centralized server (such as one that would host a Web page)
From page 44...
... MUDs and MOOs are complex online games relying mainly on text interactions, while relatively new games like Microsoft's Age of Empires and Electronic Arts' The Sims Online utilize visual representations to create fantastic communities for role playing.
· Instant messaging services allow a two-way, real-time, private dialog between two users. These services include such well-known entities as AOL's Instant Messenger and Yahoo's Messenger.
From page 45...
... Web cameras and streaming media depend on the increasing availability of broadband Internet connections to allow the high-quality real-time transmission of audio and video content. Today's Internet videoconferencing suffers from many of the same problems as Internet telephony, most notably poor quality (low resolution as well as "jitter" in the moving images)
From page 46...
... The availability of devices to convert sound into digital form, to digitize existing images, and to record still and video imagery enables individuals to generate digital content inexpensively and in private. Digital cameras, Web cameras, and camcorders are dropping in price and the pictures they take are increasing in quality, and virtually anyone can publish videos to the Web or can participate in or set up videoconferences at very low cost.13 Thus, while one might have had 13 A 2001 video advertisement from Sony Europe for its Vaio line of notebook computers (which can have a Webcam built into them)
From page 47...
... Furthermore, because digital information can be so freely reproduced, it is essentially impossible to rely on mechanical difficulty or expense of reproduction to curtail the availability of anything to anyone. Once released onto the Internet, content is next to impossible to ban, whether that content involves a political manifesto, sensitive classified information, company trade secrets, one's medical records, or child pornography.14 Finally, the Internet contains an enormous volume of material that changes rapidly.
From page 48...
... law. For example, the mere fact that a domain name has a country suffix such as .ru or .jp does not necessarily mean that its owner is located in Russia or Japan.
From page 49...
... Further, the scale of a "Web catalog" (i.e., the volume of information accessible through popular search engines) is much larger than that of most library catalogs of holdings, and Web search engines often do not provide adequate categorization of Web pages contained in their databases.
From page 50...
... ; and · Provide an interface between the user and the other components of the system to support the user's interaction with those components and with the information objects. Filtering systems, discussed at greater length in Sections 2.3.1 and 12.1, work like information retrieval systems in reverse; that is, they are concerned not with retrieving desirable information, but rather with making sure that undesirable information is not retrieved.
From page 51...
... The matching process is thus itself inevitably uncertain, since the representations on which it depends cannot be complete and certain. Because information retrieval and information filtering are probabilistic, any search engine will find material that is irrelevant to the user's needs and fail to find material that is relevant.
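The two kinds of error named above (retrieving irrelevant material and missing relevant material) are conventionally measured in information retrieval as precision and recall. The following is a minimal sketch of that standard calculation; the page identifiers are made up for illustration and the measures are general practice, not something the excerpt itself defines.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision: fraction of retrieved items that are relevant
    recall:    fraction of relevant items that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical result set: two retrieved pages are irrelevant,
# and one relevant page (page_e) is never found.
retrieved = ["page_a", "page_b", "page_c", "page_d"]
relevant = ["page_a", "page_b", "page_e"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

A filter can be scored the same way by treating "blocked" as "retrieved" and "inappropriate" as "relevant": low precision then corresponds to overblocking and low recall to underblocking.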
From page 52...
... sites, which are then incorporated into the filtering software, which stands between the user's Internet access tool and the Internet itself. Bad sites for black lists can be identified through any of the technologies described below.
From page 54...
... A complication in this analysis of page names is that different hosts can share the same IP address through a process known as name-based virtual hosting, which is a way of assigning multiple domain names to the same IP address. Name-based virtual hosting is made possible by the fact that the HTTP protocol passes the requested domain name (in the Host header of the request) to the server at the given IP address, and the software at that IP address maps the domain name to the appropriate portion of the server.
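The consequence of shared addresses for filtering can be sketched as follows. The domain names, the table contents, and the function names are hypothetical, and the address is used purely for illustration; the sketch only shows that a block list keyed on IP address cannot distinguish between sites that share one address, whereas a list keyed on domain name can.

```python
# Hypothetical name-to-address table: two unrelated sites share one IP address.
DNS = {
    "museum.example": "204.1.23.3",
    "adult.example": "204.1.23.3",
    "news.example": "198.51.100.7",
}

# What the server at the shared address does: it picks content by the
# domain name passed along with the request.
SITES = {
    "museum.example": "museum pages",
    "adult.example": "adult-oriented pages",
}

def serve(host):
    """Map the requested domain name to the matching portion of the server."""
    return SITES.get(host, "not found")

IP_BLOCK_LIST = {"204.1.23.3"}
NAME_BLOCK_LIST = {"adult.example"}

def blocked_by_ip(host):
    """Block on the resolved address: every site sharing it is blocked."""
    return DNS[host] in IP_BLOCK_LIST

def blocked_by_name(host):
    """Block on the domain name: only the listed site is blocked."""
    return host in NAME_BLOCK_LIST

for host in DNS:
    print(host, blocked_by_ip(host), blocked_by_name(host))
```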
From page 55...
... A list that designated 204.1.23.3 as containing inappropriate material would block both domain names. Filtering by Textual Analysis Filtering by textual analysis makes use of information retrieval representation technologies discussed in Section 2.2 and Appendix C
From page 56...
... For instance, due to the nature of search engines, the more times a word that is used in a query appears in a site, the higher up in retrieval rankings that site will be placed. Thus, extended repetition of commonly used search terms in the metadata, which have no relationship to the actual content of the site itself, will result in that site's being retrieved and placed highly in the results when those terms are used.
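The effect of repeating search terms can be illustrated with a deliberately naive ranking function that scores a page by how often the query words appear in it. This is a simplified assumption for illustration only; real search engines combine many more signals, and the page texts below are invented.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_frequency_score(query, text):
    """Sum the number of occurrences of each query word in the text."""
    counts = Counter(tokenize(text))
    return sum(counts[term] for term in tokenize(query))

# Hypothetical pages: the second repeats popular terms that have
# nothing to do with its actual content.
pages = {
    "honest": "Travel guide to Paris with hotel reviews and maps.",
    "stuffed": "travel travel travel hotel hotel cheap travel. Unrelated content here.",
}

query = "travel hotel"
ranked = sorted(pages, key=lambda name: term_frequency_score(query, pages[name]),
                reverse=True)
print(ranked)
```

Under this scoring the stuffed page outranks the honest one, which is the manipulation the passage describes.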
From page 57...
... This is, in effect, the human version of the statistically based automatic text classification described above. The filter then works by establishing which categories of sites are allowed to be presented, reading the appropriate label in the metadata, and refusing all sites that are either on a black list of categories, or not on a white list.
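The label-based decision described above can be sketched in a few lines. The category names and the function signature are hypothetical; the sketch assumes each site carries at most one category label in its metadata and that an unlabeled site is represented as `None`.

```python
def is_allowed(category, black_list=frozenset(), white_list=None):
    """Decide whether a site with the given category label may be shown.

    A site is refused if its category is on the black list, or, when a
    white list is in use, if its category is not on the white list.
    """
    if category in black_list:
        return False
    if white_list is not None and category not in white_list:
        return False
    return True

# Hypothetical category names, for illustration only.
BLACK = frozenset({"adult"})
WHITE = frozenset({"education", "news"})

print(is_allowed("adult", black_list=BLACK))    # refused: on the black list
print(is_allowed("sports", black_list=BLACK))   # shown: not on the black list
print(is_allowed("sports", white_list=WHITE))   # refused: not on the white list
print(is_allowed(None, white_list=WHITE))       # refused: unlabeled site
```

The two modes fail in opposite directions: a black list lets through anything unlabeled or mislabeled, while a white list refuses it.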
From page 58...
... Thus, by combining the various techniques, the level of error can be reduced. For example, if image analysis indicates the high probability of a naked person but textual analysis does not indicate any of the words usually associated with adult-oriented material, analysis of the associated URL finds the domain .gov, and the metadata indicates that the owner of the site is the National Gallery of Art, the filter would be justified in predicting that the site should not be regarded as containing adult-oriented, sexually explicit material, despite the evidence from image analysis.
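One simple way to combine such evidence is a weighted average of per-technique scores compared against a threshold. The sketch below is only an illustration of the idea: the scores, the equal weights, and the 0.5 threshold are invented assumptions, and the chapter does not specify how any actual filter combines its signals.

```python
def combined_score(scores, weights=None):
    """Weighted average of per-technique scores, each in the range 0..1."""
    if weights is None:
        weights = {name: 1.0 for name in scores}  # assumption: equal weights
    total = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total

THRESHOLD = 0.5  # hypothetical cutoff above which a page would be blocked

# Invented scores mirroring the example: image analysis alone is alarming,
# but text, URL (.gov), and metadata (a national art museum) are not.
evidence = {"image": 0.9, "text": 0.05, "url": 0.02, "metadata": 0.01}

image_only_blocks = evidence["image"] > THRESHOLD
combined_blocks = combined_score(evidence) > THRESHOLD
print(image_only_blocks, combined_blocks)
```

With these made-up numbers, image analysis on its own would block the page, while the combined score falls well below the threshold.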
From page 59...
... Some search engines provide users with the option to perform filtered searches. Third-party commercial software vendors sell stand-alone filters that can be installed on a personal computer or into a local area network serving an organization (e.g., a school or a library system)
From page 62...
... Assurance about age must, in general, be provided by reference to a document that provides information about it, and today's infrastructures needed to support online authentication of identity generally do not include such documents. In the physical world, age verification can be provided as a part of the credential being presented; a driver's license generally has a date of birth 23 Indeed, in the physical world, someone who presents a fake ID that is recognized as such by the clerk is subject to arrest.
From page 63...
... , which provide a verification of adult status to other adult Web sites, also use credit cards.26 Because the credit card is generally the user's method of payment for the service, the AVS relies on the credit card to verify the adult status of the user.27 Another approach to age verification is to rely upon databases of public records (i.e., government-issued documents such as voter registrations and/or drivers' licenses)
From page 64...
... At the same time, some minors do own credit cards or prepaid cards that function as credit cards, while other minors are willing to use credit cards borrowed with or without permission from their parents. (Even when parents review credit card statements, either their own or those of their children, they may not be able to identify transactions made with adult-oriented sexually explicit Web sites, as the adult nature of such transactions is often not readily identifiable from information provided on the statement.)
From page 65...
... No technology today or on the horizon can hope to make such fine distinctions in the case of individuals.28 For this reason, biometric technologies as a method for age verification are not considered here. Age verification technologies as integrated into functional systems are discussed in greater detail in Chapter 13.
From page 66...
... This identity is then used for posting messages, sending e-mail, participating in chats, and accessing Web pages. (Some anonymizers enable return paths when necessary; for example, the recipient of an anonymous e-mail may wish to reply to the (anonymous)
From page 67...
... . But in the event that the user chooses to be deceptive (e.g., to avoid restrictions on Internet service based on his or her location)
From page 68...
... But there is no protocol in place to pass this information to relevant parties, and thus such aggregation is not done today. The bottom line is that determining the physical location of most Internet users is a challenging task today, though this task is likely to be easier in the future.
From page 69...
... Even today, fine-grained access control driven by policy is, or soon will be, beyond the scope of human management and may be beyond the scope of mechanistic alternatives. If access control policies are impossible to formulate, the only alternative is an approach that depends on users to exercise self-control.
From page 70...
... It is for this reason, among others, that the committee in later chapters emphasizes social and educational strategies for protecting children from inappropriate sexually explicit material. Finally, many of the issues associated with protecting children from inappropriate material and experiences on the Internet relate to the architecture of the Internet as it exists today, a state of existence that reflects policy and engineering decisions made decades ago.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.