THE INTERNET LAW AND POLICY FORUM WORKING GROUP
ON CONTENT BLOCKING
III. Content Blocking - Technical Aspects
Introduction
The earlier sections of this report have highlighted a trend towards
self-regulatory frameworks to deal with illegal and harmful Internet
content. One unifying feature of all such self-regulatory models
is the perceived need for comprehensive technical blocking means
for content. The primary focus in this section is the role of
technical blocking methods in the context of the content regulation
debate rather than an in-depth review of the technical strengths
and weaknesses of existing types of blocking software.
Are such filters an essential adjunct to self-regulation or are
they merely an optional piece of software for those users who
decide they wish to proactively tailor the nature of the content
they receive to their specific requirements? Current trends in
the European Union and Australia would suggest the latter with
a strong focus being placed on the PICS technology (see below).
This section of the report first seeks to examine technical approaches
to content blocking and some of the issues which arise. It then
looks in more detail at the Platform for Internet Content Selection
("PICS") which appears to be emerging as a de facto
industry standard and finally examines some of the challenges
PICS still faces.
- Technical Approaches/Issues
This section assumes that the user's own ability to terminate
an unwanted connection is insufficient - that the user must be
alerted to the nature of a site (perhaps because of "adult"
material) or must be refused access on legal or ethical grounds.
In order for unwanted material to be blocked there are a number
of steps which must be carried out:
- The offensive material must be susceptible of being identified;
- There must be software implemented at the appropriate point
in the system architecture which is capable of carrying out such
identification; and
- Having identified the material the appropriate response must
be given (words blanked out, error messages conveyed, password prompt
sought, etc.).
Each of these areas faces its own challenges even assuming
that one can determine with certainty what material ought to be
blocked.
- Material must be identifiable
At present commercial blocking software operates on the basis
of relatively simple identifiers of offensive material. In particular
the software available today is capable of blocking specific manually
blacklisted URLs or sites (such as the CyberNOT blacklist) or
automatically blocking pages containing certain key words.
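The two identification methods just described can be illustrated with a minimal sketch. It is not taken from any commercial product: the blacklist entries, key words and the function name `is_blocked` are all hypothetical, and real filters of this kind are considerably more elaborate.

```python
import re
from urllib.parse import urlparse

# Hypothetical blacklist and key word list, for illustration only.
BLACKLISTED_HOSTS = {"blocked.example"}
BLACKLISTED_URLS = {"http://mixed.example/adult/index.html"}
BLOCKED_KEYWORDS = {"badword", "worseword"}


def is_blocked(url: str, page_text: str) -> bool:
    """Return True if the URL is blacklisted or the text contains a blocked key word."""
    host = urlparse(url).hostname or ""
    # Manual blacklist: either a whole site or one specific URL.
    if url in BLACKLISTED_URLS or host in BLACKLISTED_HOSTS:
        return True
    # Automatic key word match on the lower-cased words of the page.
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    return not words.isdisjoint(BLOCKED_KEYWORDS)
```

A page on an unlisted site is therefore blocked only if one of the listed words appears in it, which is why both mirror sites and non-textual content escape this kind of filter.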
Image recognition is still in its infancy and is likely to remain
so for the foreseeable future.
Any increase in the sophistication of blocking technologies or
in the extent of their use has clear cost implications. These
must either be borne directly by the consumer in terms of additional
client software or in terms of recovery by an ISP of its cost
of running proxy servers and filtering software.
The difficulties of identification being limited to URLs, labels
and words are several:
- URLs
Firstly, there are many examples of the reaction to the blocking
of specific URLs being the appearance of numerous mirror sites
(in the case of the German ISPs blocking Radikal (see Section
II above) around 50 mirror sites were recorded). Secondly, it
is relatively easy for URLs to be changed more rapidly than the
means of blocking them where, for example, they are used by a small
number of paedophiles who can be individually notified of a new
and rapidly changing URL. Responsiveness to such changes is more
easily achievable with a proxy server rather than a client based
filter.
- Labels
For a labelling system to be effective the labels must be ubiquitous
and capable of being read by the software implemented at the point
of blocking. One of the major questions is how to ensure that
labels accurately reflect the content they are labelling; any
labelling exercise is largely subjective and it is necessary for
the reader to have an informed understanding of the editorial
stance both of a labelling system (i.e. what the labels
imply) and of the labelling service (i.e. who is attaching
the specific labels). There have recently been objections that
some blocking systems (for example CYBERsitter) have taken overtly
political decisions such as blocking access to feminist sites
in addition to sites that focus on topics such as adult issues
and illegal activities. The introduction of the PICS standard
(see below) is an attempt to overcome these problems.
- Words or expressions
For software to be able to filter words or expressions or indeed
(when such becomes commercially available) to identify types of
images (naked flesh and so on) these need to be available in plain
text or not encrypted in a manner which renders them opaque to
the blocking software. In theory this makes such filtering relatively easy
to circumvent. However, in practice the benefits of opacity are
often outweighed by the disadvantages. Most of the resources that
people want to block are commercial (usually pornographic sites)
and these sites actively want to be seen especially by search
engines. The techniques they use to make themselves visible to
search engines such as padding pages with 'invisible' keywords
are the very techniques which also make them obvious to blocking
software. It is only where sites are happy to be invisible to
search engines etc. that encryption can be readily implemented
in order to avoid blocking technologies.
- Software must recognise for blocking purposes
Conceptually it ought to be possible for blocking software to
be applied to any client with Internet connectivity or at a suitable
point in the system (for example a proxy server or cache operated
by an ISP or by a corporation in conjunction with a firewall).
There will however be cost implications in terms of perceived
system performance at the point at which the software is implemented.
Clearly the software must recognise the relevant factor which
enables identification (URL, label, word). This requires that
the information to be identified is in a form compatible with
the blocking software. The PICS standard endorsed by the W3 Consortium
is an attempt to introduce an industry wide convention setting
out a framework for describing ratings and labels so that software
from different vendors can effectively exchange labels (see below).
Prior to this development software rating systems have had interoperability
limited by the difficulty of ensuring that labels produced by
one product could be read by a client running another.
- Appropriate response required to offensive material
identified
While 'access denied' is the most likely response to an item identified
by a piece of software as offensive or illegal there are a range
of more sophisticated editorial options such as individual words
or images being removed. These may have unintended effects so
that blocking at the paragraph level is likely to be more constructive.
It could also be possible to trigger alarm or monitoring systems
allowing sysops or others to become aware of attempts to obtain
certain types of material. The use of this type of information
clearly gives rise to concerns relating to privacy as well as
possible criminal and civil liability for contributing to subsequent
misuse or failure to report misuse.
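The difference between removing individual words and blocking at the paragraph level can be sketched as follows. This is an illustrative sketch only: the word list, the function names and the replacement notice are hypothetical and do not describe any existing product.

```python
import re

BLOCKED_WORDS = {"badword"}  # hypothetical list, for illustration only


def blank_words(text: str) -> str:
    """Replace each blocked word with asterisks of the same length."""
    def mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in BLOCKED_WORDS else word
    return re.sub(r"[A-Za-z]+", mask, text)


def block_paragraphs(text: str, notice: str = "[paragraph blocked]") -> str:
    """Replace every paragraph that contains a blocked word with a notice."""
    result = []
    for paragraph in text.split("\n\n"):
        words = set(re.findall(r"[a-z]+", paragraph.lower()))
        result.append(notice if words & BLOCKED_WORDS else paragraph)
    return "\n\n".join(result)
```

Word blanking leaves the surrounding sentence in place, which is where the unintended effects mentioned above tend to arise; the paragraph-level version removes the whole context at once.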
- PICS
Since the commencement of this report the first generation of
software filtering tools has arguably been superseded by the PICS
technology. PICS is a set of open technical specifications for
creating rating systems and compatible filtering software for
Internet content which has been developed by the World Wide Web
Consortium.
This allows content providers to create rating labels for
Internet content and embed them within that content. These labels
indicate specific aspects of content such as offensiveness of
language, explicitness of sex and the degree of violence. Rating
services on the other hand determine the substance of the
labels by setting the rating criteria or values. The labels can
be applied by content providers themselves ("self-rating")
or by independent organisations ("third party rating").
In the first instance the labels are embedded in a site's HTML
or expressed as part of the HTTP stream between the client and
the site. In the case of third party label bureaus the labels
can be expressed standalone and matched to a URL. Filtering software
installed on a user's computer (or indeed the web browser) is
used to automatically read the labels and block content that does
not fit the "permissible content" criteria specified by the
Internet user.
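How filtering software might read a label and compare it with the user's criteria can be sketched as below. The sample label follows the general shape of a PICS-1.1 label, but the rating service URL, the category letters and the limits are assumptions made for illustration, and the regular expression is a simplification rather than a full parser for the PICS label syntax.

```python
import re

# Illustrative label; the service URL and category letters are hypothetical.
SAMPLE_LABEL = (
    '(PICS-1.1 "http://ratings.example/v1.html" l gen true '
    'for "http://www.example.com/" r (n 0 s 0 v 2 l 1))'
)


def parse_ratings(label: str) -> dict[str, int]:
    """Extract the category/value pairs from the ratings clause of a label."""
    match = re.search(r"\b(?:r|ratings)\s*\(([^)]*)\)", label)
    if match is None:
        return {}
    tokens = match.group(1).split()
    return {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}


def is_permitted(ratings: dict[str, int], limits: dict[str, int]) -> bool:
    """Permit content only if every limited category is rated at or below its limit.

    In this sketch a category missing from the label counts as a failure.
    """
    return all(cat in ratings and ratings[cat] <= limit
               for cat, limit in limits.items())
```

For example, `is_permitted(parse_ratings(SAMPLE_LABEL), {"v": 1})` is false because the label rates category `v` at 2.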
The flexibility of PICS allows a single Website to have multiple
labels applied by different rating systems. Users are free to
choose which rating systems to install on their computer. They
can also choose whether to block access to content that has no
label at all or to override the block after viewing the material.
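The choices described in this paragraph (which rating systems to install, and what to do with unlabelled content) can be sketched as a single decision function. The system names, categories and the `decide` function are hypothetical; the sketch assumes labels have already been parsed into numeric scores and it leaves out the manual override.

```python
def decide(labels: dict[str, dict[str, int]],
           installed: dict[str, dict[str, int]],
           block_unrated: bool = True) -> str:
    """Return "allow" or "block" for one page.

    labels:    ratings found for the page, keyed by rating system name.
    installed: the user's limits for each rating system they have installed.
    """
    # Only labels from systems the user chose to install are considered.
    relevant = {name: labels[name] for name in installed if name in labels}
    if not relevant:
        # No usable label at all: fall back on the user's unrated policy.
        return "block" if block_unrated else "allow"
    for name, ratings in relevant.items():
        limits = installed[name]
        # In this sketch a category missing from a label is treated as 0.
        if any(ratings.get(cat, 0) > limit for cat, limit in limits.items()):
            return "block"
    return "allow"
```

With `block_unrated` left at its default, any page carrying no label from an installed system is refused, which is the configuration whose consequences are discussed under acceptance by industry and society.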
In assessing the long-term viability of PICS there are two quite
distinct factors which could both respectively seal its fate:
technical shelf life and acceptance by industry and society. Industry
in this context means the service providers, the content providers,
and the browser and filtering software developers. Society means the
users (who may also be content providers), the parents of users,
policy makers, regulators and the conventional morality of the
wider community. If PICS is to become the effective adjunct to
self-regulation as is widely predicted a number of developments
will be required in the short to medium term.
- Technical Shelf Life
There appears to be widespread consensus that technically PICS
is a major step forward in the field of content labelling. Indeed
the PICS standard is fully operational and already being implemented
by one major browser (Microsoft's Internet Explorer 3) and - to
a lesser or greater degree - by most commercial filtering products
(Cyber Patrol, PlanetWeb, Net Nanny, Net Shepherd and Safe Surf
to name but a few).
However, Internet related technical innovation continues at breakneck
speed and there is always the danger that such innovation could
at some point render PICS obsolete (or at least significantly
less effective). The leitmotif of PICS, however, is flexibility
and it is this flexibility which is likely to ensure that it adapts
to deal with new technical challenges. The PICS standard is the
subject of ongoing development by the W3C and indeed now offers
"metadata" or descriptive functions which will make
it as useful for locating content as for blocking it. It therefore
promises to offer further refinements and options developing "organically"
to meet the needs of the industry (see below at Section b(3)).
This report predicts that the industry will see a technical harmonisation
of all commercial filtering software in the short to medium term
with PICS becoming the global technical infrastructure for content
labelling. However it is vital to distinguish what has been termed
the "plumbing" of content labelling from the rating
element which provides the framework for the moral and cultural
value judgements to be applied to the content.
- Acceptance by Industry and Society
If the technical viability and robustness of PICS are not currently
in dispute, the widespread uptake and acceptance of PICS is certainly
contingent upon the availability of effective user friendly rating
systems which meet both user and content provider requirements
in terms of simplicity and flexibility and which - ideally - are
universal in their application. Universal in this context means
the capacity to balance local cultural requirements and specificity
with an open infrastructure permitting compatibility with other
systems and the mapping of one system's ratings to another. Until
such systems become available it is likely that much Internet
material will remain unrated with the result that users will be
denied access to the vast majority of Internet material if they
install PICS compatible filtering software and configure it to
accept only rated pages. This is of course likely to inhibit
the widespread uptake of PICS in the short term.
Prerequisites for effective rating
- A Comprehensive System
Should ratings be expressed as scores on a number of categories
(e.g. violence, language, nudity, sex) or should they be converted
into recommended suitability for an age range as with films and
videos? The RSACi system devised by the Recreational Software
Advisory Council ("RSAC") is based on the above four
categories and there appears to be some consensus that this approach
is preferable to an age based system for reasons discussed at
(2) below. RSAC has recently produced a revised methodology which
sub-categorises the above four in much greater detail; however, arguably
other categories such as racism, blasphemy, drug use, hate, etc.,
must be covered by a truly comprehensive system. The need for
comprehensive systems does not of course preclude specialist systems
being developed in tandem which are tailored to specific moral,
ethical, cultural or religious requirements (e.g. the Jewish
faith).
- A Global System
It is generally acknowledged that rating systems cannot stand
alone; they must be capable of using and interpreting other systems
to build up a wider catchment of rated sites. Unique local systems
may work for small groups of users with a particular interest
but a stand-alone national system is unlikely to overcome the
"chicken and egg" problem that users will not switch
on a system which only reaches a few sites while site owners will
not rate with a system which has few users. The UK is currently
formulating a proposal for such a system and hopes to collaborate
with bodies in the US, Europe and Australia, all of which have expressed
interest in this work (see Section 2 below).
- An Accountable System
A frequent rhetorical question heard in the context of the rating
debate is "who will rate the raters?" There is concern
that the discretionary self rating systems which PICS permits
may be open to abuse in terms of negligent or bad faith rating
by content providers.
It has been stated that "One of the most critical aspects
of any self-regulatory regime is the lengths that it goes to ensure
that people aren't cheating the system". While this statement
was made in relation to self-regulation of computer games it applies
equally to Internet content. Indeed RSAC's Internet rating system,
RSACi, currently the leading database of PICS labels freely available
to the public, operates a system of spot checks to ensure self
raters rate accurately. If they do not, RSAC has contractual rights
allowing it to take legal action against those who wilfully mis-rate.
This seems a good model for self-rating.
In the case of third party rating services the reputation and
credibility of such systems will rest on their track record for
providing consistently accurate ratings in accordance with their
chosen rating system(s). Therefore accountability will be less
of an issue. However no doubt in time there will be rating services
to rate the rating services! Ultimately however de facto standard
systems will emerge as users decide which systems best meet their
individual needs and this is precisely the aim of PICS; to provide
users with maximum choice and control.
- Content Producer Cooperation
It was stated recently that "the biggest problem (with PICS) is
that although many companies are now supporting PICS in their
business the folks who create Web site development tools don't
seem to have even heard of the concept". However the same commentator
predicted that once Web development tools allow PICS labels to
be generated semi-automatically the task will appear less onerous
and time consuming to content producers.
The prevailing view amongst content providers is that they will
be willing to label content provided rating is made quick, cheap
and easy.
In addition the above presupposes that an acceptable rating system
is in place providing them with a framework for rating.
If third party rating services do become the most popular method
of obtaining labels then self-rating may gradually become redundant.
This is not likely in the short term however.
- Browser Compatibility
For PICS to really take off it needs to be supported in all major
browsers as many users may not want the additional expense of
purchasing a filtering product. This is not currently the case
although Netscape promises that a future version of Netscape Navigator
will support PICS, with details to be announced fairly soon.
- User Confidence
The key to ensuring that PICS wins over users is to achieve factors
(a) - (e) above. In other words, once several rating systems become
accepted as industry standards, trusted labelling services are
established, labelling tools exist to facilitate the generation
of labels, and all major filtering software products and browsers
are fully PICS compatible, then users will have the requisite confidence
that all parts of the PICS equation are in place.
In the meantime it is vital that the industry concentrate on educating
users and raising their awareness of the existence of PICS in
tandem with the above developments.
A possible model for a rating system
- Internet Watch Foundation (UK)
The recently formed Advisory Board of the Internet Watch Foundation
("IWF") is currently investigating a way forward to
bridge the gap between the objectives of having a mechanism for
rating and blocking content sensitive to the needs of UK culture
and of having an internationally accepted system to promote widespread
usage and allow users access to the maximum amount of global content.
This report focuses on the work of IWF as it appears to be the most
ambitious scheme for a universal rating system undertaken to date.
The proposal is based on the difference between describing and
blocking content and has been summarised as follows by one of
the Advisory Board members.
- Develop an agreed international methodology for describing
content
There needs to be a common basis for describing content. This
will first require establishing a set of factors of concern -
e.g. nudity, language, violence, etc., and second an agreed system
for calibrating a scale of the extent to which any factor is present.
From this can be developed a common methodology for content producers
to place their content within the matrix of factors.
While no such matrix can ever be value free (the selection of
the factors of concern is inevitably a result of different cultural,
religious and social influences) the aim is to have a common international
basis for describing content which is as objective as possible.
It may therefore be necessary to include factors that some cultures
consider unimportant to obtain maximum agreement.
Such an objective methodology must crucially choose a calibrated
scale of how much a factor is present that is unrelated to the
age of potential users. Different cultures have very different
views about what is suitable at different ages. A simple numeric
scale against agreed criteria is preferable.
Content producers will rate their web sites using the agreed methodology
and insert a PICS label into the URL. This will create a vast
international data set describing global Internet content.
- Adopt universally available PICS compatible blocking
software
Users need to access the data set of content labels and can do
so using browsers and blocking software compatible with PICS.
The current version of Microsoft Internet Explorer with the RSACi
labels has already demonstrated the possibilities of this approach.
Individual users around the world have the capability to set their
browser against all the factors agreed in the international data
set. Such an approach gives maximum freedom of choice to users.
For example a Muslim in Saudi Arabia could set a very restrictive
block on all the factors of concern including one that recognises
modesty of clothing and a user in Northern Europe could take a
more relaxed approach on all or some of the factors.
- Make available a variety of "blocking profiles"
While the maximum choice of factors option outlined above is attractive
some users and some cultures will find this bewildering. Some
users will be anxious to have advice as to what constitutes a
'safe' setting and in some countries there may be a desire to
relate the description of content to existing recognised rating
systems. These desires can be met if software profiles
are produced for individual countries or cultures which draw on
the information in the data set. Profiles are in fact a technical
modification of PICS discussed below in (3).
To take a hypothetical example of a profile in the UK a software
block could be devised based on the widely recognised age related
video classification system. The classification body could examine
the data set and determine for example that an approximation to
a 12 age rating was a score of 2 on language, 1 on nudity, 1 on
sex, etc. Thus a UK profile could be built which matched the data
set to preexisting video classification criteria.
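The hypothetical UK profile can be sketched as a small lookup table. Only the language, nudity and sex limits for the 12 band come from the example above; the 18 band, its values and the function name `age_band` are invented purely to make the sketch complete.

```python
from typing import Optional

# Age bands in ascending order, each with the maximum score allowed per factor.
UK_PROFILE = [
    ("12", {"language": 2, "nudity": 1, "sex": 1}),   # values from the example above
    ("18", {"language": 4, "nudity": 4, "sex": 4}),   # hypothetical values
]


def age_band(scores: dict[str, int]) -> Optional[str]:
    """Return the lowest age band whose limits cover the content's scores.

    A factor missing from the scores is treated as 0. None means the
    content exceeds every band in the profile.
    """
    for band, limits in UK_PROFILE:
        if all(scores.get(factor, 0) <= limit for factor, limit in limits.items()):
            return band
    return None
```

A profile of this kind reads the same international data set as any other; only the table of limits is specific to the country or culture concerned.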
While such a process will inevitably mean losing the fine tuning
offered by the user setting such factors individually it may provide
an additional guide and tool for many people that they find helpful.
Moreover it does allow sensitivity to cultural differences.
An analogy which may help understanding this process is the choice
consumers make in adjusting the settings for a central heating
system sold in a variety of countries. Such systems allow a number
of different factors to be controlled: thermostat temperature
setting, length of time the boiler operates, whether it comes on once
or twice, whether water only is heated or both water and heating,
etc. Most users accept the manufacturer's recommended settings
for their particular climate and needs knowing that they can adjust
individual factors if they want. Others may adopt settings suggested
by energy conservation organisations. In the content blocking
case users might be offered the opportunity to consider adjusting
all the settings individually or accepting a recommended profile.
In time third party rating services will undoubtedly offer 'off
the shelf' profiles as part of their service.
- Ongoing challenges for PICS
The question is whether the W3C developers can keep up with Internet
developments and adapt the PICS technology swiftly enough to assure
users (1) that PICS can effectively filter all content and (2) that,
if it cannot, they will not be denied access to large amounts of
valuable material merely because it is unrated.
- Rating Information Push Services
One of the most significant recent Internet technologies is Information
Push where a user subscribes to an information "channel"
and content is downloaded to the client machine automatically.
Companies that are providing this kind of service include PointCast,
Castanet by Marimba, BackWeb and Tibco's TIBNET. The method used
to get the content to the client varies (not necessarily HTTP
and HTML-based) as does the structure of the channels. This raises
the following questions:
- How would the actual channels themselves be rated in particular
within the current PICS framework?
- How would the labels get to the user?
- How would the rating of individual "articles"
within a channel be achieved?
- If the user has a preference file for blocking channels/articles
how can we ensure this is compatible with all Information Push
services?
These issues need to be investigated and the W3C PICS working
group is already addressing them.
- Newsgroup Ratings
It is the stated intention of IWF in the UK to produce composite
ratings for all newsgroups. This will require a ratings service
to sample groups and produce the ratings and a ratings bureau
to maintain a rating file. It has been proposed that IWF should
facilitate these actions using volunteers - both from ISPs and
independent raters. Although there are ways of reducing this mammoth
task (currently about 22,000 news groups) it is still a daunting
undertaking. The outcome of the work may either be a commercial
product/service or freely available to the UK industry and its
subscribers.
The IWF is also spearheading the INCORE (Internet Content Rating
for Europe) proposal which has been submitted to the EU Commission
as part of a bid for EU funding. One of the three key aims of
INCORE is to address the existence of illegal material on Usenet
newsgroups at European level and develop a rating system for Usenet.
Several technical approaches are being considered, some of which
are based on the PICS technology.
- Profiles
A ratings profile is a set of filtering rules that allows the
user to select or block content from the Internet based on its
URL and PICS label. Profiles are intended to be simple to use
and portable allowing the user to configure their filtering software
easily and share their profiles with other users. There are several
ways that profiles could be used:
- the user could install a pre-determined profile from a trusted
organisation whose values they agree with - this would make it
easy for the user to select Internet content without having to
write their own profile;
- profiles could be sent to search engine servers, proxies and
other Internet services as part of the user's query or request.
This would allow the server to only return information that fits
with the user's profile preventing invalid links getting to the
user;
- users could use many different software filters on their machine
but use the same portable profile with all of them.
Profiles are written in picsRULZ, a common profile specification
language. It allows various types of rules to be written that
block or allow access based on the URL of the resource and its
PICS label.
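The idea of rules that block or allow access based on a URL and a PICS label can be sketched as follows. This is not picsRULZ syntax: the `Rule` class, its fields, the first-match-wins ordering and the example profile are all assumptions chosen to illustrate the concept.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    action: str                                # "allow" or "block"
    url_prefix: str = ""                       # empty prefix matches any URL
    over: dict = field(default_factory=dict)   # fires if any score exceeds these

    def matches(self, url: str, ratings: dict) -> bool:
        if not url.startswith(self.url_prefix):
            return False
        if not self.over:
            return True  # a rule with no score condition depends on the URL only
        return any(ratings.get(cat, 0) > limit for cat, limit in self.over.items())


def evaluate(profile: list, url: str, ratings: dict, default: str = "allow") -> str:
    """Apply the first matching rule; fall back to the default action."""
    for rule in profile:
        if rule.matches(url, ratings):
            return rule.action
    return default


# Hypothetical profile: always allow one trusted site, then block on scores.
PROFILE = [
    Rule("allow", url_prefix="http://school.example/"),
    Rule("block", over={"violence": 1, "language": 2}),
]
```

Because the profile is plain data, the same list of rules could in principle be handed to a browser, a proxy or a search engine, which is the portability the text describes.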
The technical specification for PICS profiles still requires some
fine tuning; however, this work is in progress at the W3C.