The World Wide Web is the product
of a vision developed long before the current popularity of the
web. The web is a place where people can deposit content and
also make links between the content on one web page and the content
on other pages, and between the content of one site and the content
of another. In its present state the World Wide Web is a very
disorganised collection of data, information and knowledge where
retrieval is largely a manual process. Phenomenal growth in content
on the World Wide Web is partly responsible for the chaotic situation
that now prevails.
A new vision of the World Wide
Web has recently been formulated. This new vision for the web
is referred to as the Semantic Web. It is essentially seen
as a more structured information resource, one from which
information and knowledge can easily be extracted even though
the content is globally distributed.
The construction of the Semantic
Web requires developments that will enable web content to be
better organised and classified, as well as the application of
knowledge-based technologies that will help to automate the collection
of information and the extraction of knowledge from web content.
The intention is to transform the web into an efficient information
and knowledge source and to enable the development of value added
services based on the web.
Main Issues
The computer was invented as
a device for computation, but it has now become a truly universal
machine, for it also provides a means of entertainment as well
as an entry point to a worldwide network of information exchange.
A technology is now needed that supports access to unstructured
and heterogeneous distributed information and knowledge sources.
This technology is called the Semantic Web.
A major problem is extracting
useful knowledge from information found on the web. Everyone
is facing a deluge of data and information, and its sheer volume
makes it hard to extract the knowledge that is actually useful.
A number of important issues therefore need
to be addressed. The first of these is the development of shared
vocabularies (also known as ontologies). These will provide the
basis of a common understanding of the meaning of words used
in different applications. Without such agreed vocabularies the
present chaos will continue. Trust is also important: users need
to know that data, information and knowledge derive from reliable
sources. This is a major difficulty that will need to be resolved
if the vision of value added services based on the Semantic Web is
to become a reality. Content also needs to be annotated, preferably
by automated means, if knowledge-based services are to be delivered
based on web content.
A number of activities have
been initiated by the World Wide Web Consortium (W3C) that are
directed towards the development of the Semantic Web. Their Semantic
Web activity is based upon evolving the current World Wide Web
into something that better supports automation. The key tools
for this are ontologies and the Resource Description Framework
(RDF). The aim of W3C is for the semantics of ontologies to be
defined by user communities. RDF provides a generic means of
describing resources of any kind. It makes use of the Extensible
Markup Language (XML), and it also helps to automate aspects of
using the web.
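
As an illustration of what a generic means of describing resources
can look like, the sketch below models descriptions as
subject-predicate-object statements, the form used by the resource
description framework. It is a minimal sketch in plain Python, not
an implementation of the W3C specifications: the `Triple` type, the
`match` and `describe` helpers and every `example.org` identifier
are assumptions made for this example. Only the Dublin Core
namespace is an existing shared vocabulary.

```python
from typing import Dict, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement about a resource: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical resource identifiers, for illustration only.
EX = "http://example.org/"
# Dublin Core element namespace, used here as a ready-made shared vocabulary.
DC = "http://purl.org/dc/elements/1.1/"

STORE: List[Triple] = [
    Triple(EX + "report", DC + "title", "An Overview of the Semantic Web"),
    Triple(EX + "report", DC + "creator", EX + "people/alice"),
    Triple(EX + "people/alice", EX + "terms/name", "Alice"),
]


def match(
    store: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def describe(store: List[Triple], subject: str) -> Dict[str, str]:
    """Collect everything stated about one resource as a predicate-to-value map."""
    return {t.predicate: t.obj for t in match(store, subject=subject)}


if __name__ == "__main__":
    for predicate, value in describe(STORE, EX + "report").items():
        print(predicate, "->", value)
```

Because every statement has the same three-part shape, a program
can query descriptions of documents, people or anything else with
the same code, which is one way such a framework helps to automate
use of the web.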
The Semantic Web activity in
W3C centres on working and interest groups looking at issues
such as the Resource Description Framework, the development of
specifications, and better integration of the Resource Description
Framework with web services and the Extensible Markup Language.
A web ontology group is working to produce a richer and more
expressive language through which user communities can expose
the more detailed semantics of their ontologies.
A difficulty is creating
ontologies, as these need to be consensual. This is sometimes
routine, but sometimes very hard, depending on the situation.
At the moment there are no good design guidelines for developing
ontologies. The key issue is capturing the rationale for representing
the world in a particular way. The question of how to maintain
ontologies in areas where rapid changes and developments are
taking place is also a major problem, one for which there is
as yet no solution.
Maintenance of the content
is also an important issue. This is often addressed as an afterthought
but is in fact a central issue. There is a need to acquire knowledge
with a view to future maintenance. Not enough attention has so
far been paid to this subject.
The issue of how to deal with
both general and specific knowledge needs to be addressed. One
way forward is to create high-level abstract ontologies, and
then to specialise these to particular cases. However there is a
view that ontologies are task dependent and that mapping between
ontologies or merging ontologies is hard. There is also no single
best ontology, and it is unlikely that any one ontology will be
universally accepted.
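
The two ideas above, specialising an abstract ontology and mapping
between ontologies, can be sketched in a few lines of Python. This
is a toy model under stated assumptions, not a real ontology
language: an ontology is reduced to a table linking each term to
its more general parent, and every term, table and function name
here is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Hierarchy = Dict[str, Optional[str]]

# Hypothetical high-level ontology: abstract terms with no parent.
UPPER: Hierarchy = {"Agent": None, "Document": None}
# Hypothetical specialisation of those terms for a publishing task.
PUBLISHING: Hierarchy = {
    "Author": "Agent",
    "Article": "Document",
    "Review": "Article",
}


def combine(*layers: Hierarchy) -> Hierarchy:
    """Stack a specialised layer on top of a more abstract one."""
    combined: Hierarchy = {}
    for layer in layers:
        combined.update(layer)
    return combined


def ancestors(term: str, hierarchy: Hierarchy) -> List[str]:
    """List the increasingly general terms above `term` (assumes no cycles)."""
    chain: List[str] = []
    parent = hierarchy.get(term)
    while parent is not None:
        chain.append(parent)
        parent = hierarchy.get(parent)
    return chain


def is_a(term: str, general: str, hierarchy: Hierarchy) -> bool:
    """True if `term` equals `general` or is a specialisation of it."""
    return term == general or general in ancestors(term, hierarchy)


# Hypothetical, deliberately partial mapping from a library vocabulary.
LIBRARY_TO_PUBLISHING: Dict[str, str] = {"Writer": "Author", "Paper": "Article"}


def translate(
    terms: List[str], mapping: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Split terms into those the mapping covers (translated) and those it misses."""
    mapped = [mapping[t] for t in terms if t in mapping]
    unmapped = [t for t in terms if t not in mapping]
    return mapped, unmapped


if __name__ == "__main__":
    ontology = combine(UPPER, PUBLISHING)
    print(is_a("Review", "Document", ontology))
    print(translate(["Writer", "Paper", "Shelfmark"], LIBRARY_TO_PUBLISHING))
```

The unmapped terms returned by `translate` hint at why mapping is
considered hard: a vocabulary built for one task often contains
concepts with no counterpart in a vocabulary built for another.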
Conclusions and Future Directions
The creation of the Semantic
Web provides an opportunity for technologies to be used to serve
people. Application of technologies offers the potential to make
the web easier to use, more user friendly and to automate the
extraction of knowledge from the data and information on the
web. The challenges that lie ahead include the construction of
shared vocabularies, the development of automated means of annotating
web content, especially legacy content from current web sites,
dealing with the authentication of content (a question of quality
and trust) and maintenance of content. A major challenge is using
knowledge technologies to create a more automated approach to
using the web and extracting knowledge from the vast resource
that is the World Wide Web. Ensuring the quality of the knowledge
provided is a major issue to be resolved if knowledge-based web
services are to be widely accepted.