UltraXML in the Journal and Book Industry
Optionally Supported DTDs
DocBook is a DTD maintained by the
DocBook Technical Committee of OASIS. It is a set of tags for describing
books, articles, and other prose documents, particularly technical
documentation. DocBook is defined using the native DTD syntax of SGML and
XML.
DocBook is a large and robust
DTD and its main structures correspond to the general notion of what
constitutes a book. DocBook has been adopted by a large and growing
community of authors writing books of all kinds.
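As a rough illustration of how DocBook's main structures mirror a book, the sketch below parses a tiny DocBook-style fragment with Python's standard library and lists its chapter titles. The element names book, chapter, title, and para come from DocBook, but the sample content is invented and the fragment is not validated against the DTD here.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up DocBook-style fragment (no DOCTYPE, not validated here).
DOCBOOK_SAMPLE = """
<book>
  <title>Sample Manual</title>
  <chapter>
    <title>Getting Started</title>
    <para>Installation notes go here.</para>
  </chapter>
  <chapter>
    <title>Reference</title>
    <para>Command descriptions go here.</para>
  </chapter>
</book>
"""

def chapter_titles(xml_text):
    """Return the book title and the list of chapter titles."""
    book = ET.fromstring(xml_text)
    book_title = book.findtext("title")
    chapters = [ch.findtext("title") for ch in book.findall("chapter")]
    return book_title, chapters

if __name__ == "__main__":
    title, chapters = chapter_titles(DOCBOOK_SAMPLE)
    print(title)
    for number, name in enumerate(chapters, start=1):
        print(f"  Chapter {number}: {name}")
```

The nesting of chapters inside a book, each with its own title, is the point of the example; a real DocBook document would normally carry a DOCTYPE declaration and many more elements.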
DocBook began in 1991 as a joint
project of HaL Computer Systems and O'Reilly. Its popularity grew, and
eventually it spawned its own maintenance organization, the Davenport Group.
In mid-1998, it became a Technical Committee (TC) of the Organization for
the Advancement of Structured Information Standards (OASIS).
DocBook XML 4.3 is the current XML version of DocBook.

The National Center for Biotechnology
Information (NCBI) of the National
Library of Medicine (NLM) created the Journal Archiving and Interchange
Document Type Definition (DTD) with the intent of providing a common format
in which publishers and archives can exchange journal content.
This DTD was created from the
Journal Archiving and Interchange DTD Suite, which provides a set of XML
modules that define elements and attributes for describing the textual and
graphical content of journal articles as well as some non-article material
(such as letters, editorials, and book and product reviews).
The Suite of Modules
The intent of this DTD Suite is
to preserve the intellectual content of journals independent of the form in
which that content was originally delivered. The Suite has been written as a
set of XML DTD modules, each of which is a separate physical file. No module
is an entire DTD by itself, but these modules can be combined into a number
of different DTDs.
The Archiving and Interchange DTD may be used as is, or the Suite can be used to construct DTDs for
authoring and archiving journal articles as well as DTDs for transferring
journal articles from publishers to archives and between archives. Details
on creating DTDs from the Suite are available in the Tag Library. Although
the full Suite was developed to support electronic production, the
structures should be adequate to support some print production as well.

The Journal Publishing Document
Type Definition (DTD) was created by the
National Center for Biotechnology Information (NCBI), a center of the
National Library of Medicine (NLM),
with the intent of providing a common format for the creation of journal
content in XML.
The Journal Publishing DTD
defines a document type for journal articles and some non-article journal
material such as product and book reviews, editorials, and letters to the
editor. The DTD was written to describe both the metadata for a journal
article and the content of the article, but it can also describe just the
article header metadata. This is a prescriptive DTD, optimized for the
authoring and initial XML tagging of journal material. Although designed for
biomedical journals, this DTD should be sufficiently general to describe not
only STM journals but technical journals in any field.
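To make the split between article metadata and article content concrete, here is a minimal sketch that reads a simplified article fragment and reports whether it carries a body or only header metadata. The element names (article, front, article-meta, title-group, article-title, body, sec) follow the NLM tag set as far as we know it, but the fragments are invented, omit elements the DTD would require, and are not validated.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragments; a valid article needs more metadata.
FULL_ARTICLE = """
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An Invented Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Body text goes here.</p>
    </sec>
  </body>
</article>
"""

HEADER_ONLY = """
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Header-Only Record</article-title>
      </title-group>
    </article-meta>
  </front>
</article>
"""

def summarize(xml_text):
    """Report the article title, whether a body exists, and section titles."""
    article = ET.fromstring(xml_text)
    title = article.findtext("front/article-meta/title-group/article-title")
    body = article.find("body")
    sections = [] if body is None else [s.findtext("title") for s in body.findall("sec")]
    return {"title": title, "has_body": body is not None, "sections": sections}

if __name__ == "__main__":
    for sample in (FULL_ARTICLE, HEADER_ONLY):
        print(summarize(sample))
```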
The DTD was constructed using
the modules of the Archiving and Interchange DTD Suite and has been modelled
along the same philosophical lines as the Journal Archiving and Interchange
DTD, which is a DTD for interchange and storage of journal material.
However, because this is a publishing DTD optimized for the creation of new
material, the DTD is far smaller (fewer elements, and fewer choices in many
contexts) than was the full Journal Archiving and Interchange DTD. Where, in
the interchange DTD, there may have been several ways to express the same
information, only one way is provided for this publishing DTD. The intention
was not to limit the expressive power of this DTD but rather to remove the
arbitrary choices that a full interchange DTD must offer in order to make
conversion from a wide variety of formats as easy as possible. The
philosophy for the interchange DTD was to accept as many varied forms of
many structures as possible. The philosophy of this DTD is to prefer a
single structural form, or at least a single style of tagging.
The only element in the
Publishing DTD that is not in the Archiving and Interchange model is the NLM
Citation Model. This citation model, although loose enough to accommodate
the full range of citation types in the NLM Guidelines, is far more
prescriptive than the Citation model. This model and the extensive examples
of tagged citations provided are intended to encourage the creation of
citations according to NLM's guidelines.
This Tag Library describes the
Journal Publishing DTD as well as elements from the Archiving and
Interchange DTD Suite. This Tag Library provides:
- an introduction to version 1.1 of the Tag Library explaining modifications
  to the Archiving and Interchange DTD Suite
- an introduction to the design principles for this DTD and the Suite as a
  whole
- an explanation of how to read and use this Tag Library
- a top-level introduction to this DTD
- large reference sections that describe the elements, attributes, and
  Parameter Entities defined in this DTD
- tree-like hierarchical diagrams that show the structures in the Journal
  Publishing DTD
- tables and appendices useful for reference on the fine points of the DTD
- reading copies of the two modules that compose the DTD and the rest of the
  modules that make up the full Suite
- a full article sample, both as a PDF file showing the format and as an XML
  file valid to this DTD
Version 1.1 of the Journal Publishing DTD is a fully backward-compatible
revision of version 1.0. That is, all documents that were valid according to
version 1.0 of the DTD will also be valid according to version 1.1.
The DTD was changed based on
experience using the DTD. Some users have been converting articles tagged
according to other DTDs into Journal Publishing articles and found that they
had to lose information (such as semantic identification of some sections)
in the transformation. The changes to the DTD are largely to allow
preservation of such information.

ONIX (ONline Information eXchange) is a standard format that publishers can
use to distribute electronic information about their books to wholesale,
e-tail, and retail booksellers, other publishers, and anyone else involved in
the sale of books.
ONIX was developed as a solution
to two modern problems: (1) the need for richer book data online; and (2)
the widely varying format requirements of the major book wholesalers and
retailers. Throughout 1999, the Association of American Publishers (AAP)
worked together with the major wholesalers, online retailers, and book
information services to create a universal, international format in which
all publishers, regardless of their size, could exchange information about
books. The group unveiled ONIX 1.0 in January 2000.
An ONIX message is a set of data elements defined by "tags" that is written
in the computer language XML (eXtensible Markup Language) and that conforms
to a specific template, or set of rules, also known as the ONIX DTD
(Document Type Definition). The DTD defines, among other things, how to
order the data elements, and how the elements are interrelated.
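The idea of a message built from tagged data elements in a prescribed order can be sketched as follows. This is only an illustration: the element names and their order are hypothetical placeholders rather than the content model of the real ONIX DTD, and the ISBN is a dummy value.

```python
import xml.etree.ElementTree as ET

# Hypothetical element order for one product record. A real ONIX DTD defines
# its own element names and content model; these are placeholders.
PRODUCT_FIELD_ORDER = ("RecordReference", "ISBN", "Title", "Contributor")

def build_message(records):
    """Serialize a list of dicts as a small ONIX-style XML message."""
    message = ET.Element("ONIXMessage")
    for record in records:
        product = ET.SubElement(message, "Product")
        # Emit children in one fixed order, because a DTD content model
        # generally prescribes the sequence of child elements.
        for tag in PRODUCT_FIELD_ORDER:
            if tag in record:
                ET.SubElement(product, tag).text = record[tag]
    return ET.tostring(message, encoding="unicode")

if __name__ == "__main__":
    sample = [{
        "Title": "An Invented Book",
        "ISBN": "0000000000",  # placeholder, not a real ISBN
        "Contributor": "A. N. Author",
        "RecordReference": "example-0001",
    }]
    print(build_message(sample))
```

Whatever order the keys have in the input dictionary, the output follows the fixed sequence, which is the kind of constraint a DTD imposes on a conforming message.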
Much of ONIX is based on the
pre-existing EPICS (EDItEUR Product Information Communication Standards), a
much broader standard for defining products which was developed
internationally by EDItEUR, drawing on the combined experience of Book
Industry Study Group (BISG) in the US and Book Industry Communication (BIC)
in the UK.
The standard allows a publisher to use either of two levels: Level 1 or
Level 2. Level 2 contains all the information in Level 1. The standard data
elements in Level 1 are targeted to publishers who have not established an
in-house database of product information. Level 2 is targeted to publishers
who feel that the Level 1 data elements are not adequate.
The ONIX standard defines both a list of data fields about a book and how to
send that data in an "ONIX message." ONIX specifies over 200 data elements,
each of which has a standard definition, so that everyone can be sure
they're referring to the same thing. Some of these data elements, such as
ISBN, author name, and title, are required; others, such as book reviews and
cover image, remain optional. While most data elements consist of text
(e.g., contributor biography), many are multimedia files, such as images and
audio files. (It is particularly these optional fields--excerpts, reviews,
cover images, author photos, etc.--that lead to more sales online.)
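The distinction between required and optional data elements can be checked mechanically. The sketch below assumes a simple dictionary per book with hypothetical field names; it is not the ONIX data model, only a way to show the required/optional split described above.

```python
# Hypothetical field names; the text above names ISBN, author name, and title
# as required, and book reviews and cover image as optional.
REQUIRED_FIELDS = ("isbn", "author_name", "title")
OPTIONAL_FIELDS = ("book_reviews", "cover_image")

def check_record(record):
    """Return (missing_required, missing_optional) for one book record."""
    def is_missing(field):
        # Treat absent keys and empty values the same way.
        return not record.get(field)

    missing_required = [f for f in REQUIRED_FIELDS if is_missing(f)]
    missing_optional = [f for f in OPTIONAL_FIELDS if is_missing(f)]
    return missing_required, missing_optional

if __name__ == "__main__":
    record = {
        "isbn": "0000000000",  # placeholder, not a real ISBN
        "title": "An Invented Book",
        "cover_image": "cover.jpg",
    }
    required, optional = check_record(record)
    print("missing required:", required)
    print("missing optional:", optional)
```

A record with missing required fields would be rejected, while missing optional fields are merely worth reporting, since those are the fields the text credits with driving online sales.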
ONIX is now published and maintained by EDItEUR in association with the Book
Industry Study Group (BISG) in the U.S. and BIC in the U.K. The latest
version of ONIX is referred to as ONIX International.