UltraXML for Newspapers, Colour Magazines and Periodicals
Optionally Supported DTDs

The IPTC launched the News
Industry Text Format (NITF) project in the early 1990s when members began looking
for a successor to ANPA 1312 and IPTC 7901. These two formats were
standardized in 1979 and provided a common platform for news services and
newspapers to share content.
When XML was introduced in 1998, the NITF was modified to be compliant.
NITF is an XML-conforming vocabulary. This means that NITF uses the constructs standardized by XML to
describe elements of content within a document, and the descriptive
attributes of that content.
NITF supports the identification and description of a wide range of news characteristics. Highlights
include:
- Who owns the copyright to the item, who may republish it, and who it's
about.
- What subjects, organizations, and events it covers.
- When it was reported, issued, and revised.
- Where it was written, where the action took place, and where it may be
released.
- Why it is newsworthy, based on the editor's analysis of the metadata.
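As a rough illustration of how some of these characteristics can be read back out of a document, the sketch below parses a small NITF-style fragment with Python's standard xml.etree.ElementTree module. The sample document, its values, and the describe() helper are made up for this example. The element names (doc.copyright, date.issue, hl1, location) follow NITF 3.x as commonly documented, but the fragment has not been validated against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical NITF-style fragment; all values are invented for illustration
# and the fragment is not validated against the NITF DTD.
SAMPLE_NITF = """<nitf>
  <head>
    <title>Example story</title>
    <docdata>
      <date.issue norm="20040115T120000Z"/>
      <doc.copyright holder="Example News Agency" year="2004"/>
    </docdata>
  </head>
  <body>
    <body.head>
      <hedline><hl1>Example headline</hl1></hedline>
      <dateline><location>Paris</location></dateline>
    </body.head>
    <body.content>
      <p>Body text of the story.</p>
    </body.content>
  </body>
</nitf>"""


def describe(xml_text):
    """Collect a few who/when/where facts from an NITF-style document."""
    root = ET.fromstring(xml_text)

    def first(tag):
        # Return the first element with this name anywhere in the tree.
        return next(root.iter(tag), None)

    copyright_el = first("doc.copyright")
    issue_el = first("date.issue")
    headline_el = first("hl1")
    location_el = first("location")

    return {
        "copyright_holder": copyright_el.get("holder") if copyright_el is not None else None,
        "issued": issue_el.get("norm") if issue_el is not None else None,
        "headline": headline_el.text if headline_el is not None else None,
        "written_in": location_el.text if location_el is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe(SAMPLE_NITF).items():
        print(key, "=", value)
```

Missing elements yield None rather than an error, since much of this metadata may be absent from a given story.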
NITF is a widely used XML
vocabulary among news publishers worldwide. Users include:
- AFP - Agence France-Presse
- ANSA - Italy's largest newswire
- AP Digital - Part of the Associated Press for interactive markets
- dpa - Deutsche Presse-Agentur
- LexisNexis
- The New York Times
- Primedia Business Magazines & Media

The IPTC started to work on
"an XML-based standard to represent and manage news throughout its
lifecycle, including production, interchange, and consumer use" in 1999.
After only one year of defining requirements, working out specifications, and
development, NewsML 1.0 was approved by the IPTC in October 2000.
NewsML has proved stable in
production environments: since its introduction it has been updated only
twice, and the current version is 1.2, released in October 2003.
NewsML takes the form of an XML
document, which has a series of components, or elements, that are used to
structure and process the actual news content. These elements may have
attributes to specify their properties and can carry content in the form of
other elements (sub elements) and/or character data or external references.
News Metadata
Efficient use of metadata is a
key feature for NewsML and considerable effort has been put into the
development of a core set of metadata. This work was able to draw on the
substantial intellectual capital represented by the earlier IIM (Information
Interchange Model) and NITF (News Industry Text Format) standards, but has
been substantially extended, making use of some advanced XML features.
In general, the design of NewsML
tries to keep the metadata as close as possible to the item it describes,
while much of the metadata is optional.
At the lowest level that could
contain news data - the "ContentItem" - attributes can be added to describe
the physical character of the news representation.
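A minimal sketch of this element-and-attribute structure is shown below, again using Python's xml.etree.ElementTree. The fragment is hypothetical and heavily simplified: it omits parts that a complete NewsML 1.x document would carry, and the file names, values, and the list_content_items() helper are assumptions made for illustration. The element names (NewsItem, NewsComponent, ContentItem, MediaType, Format, Characteristics) are given as commonly documented for NewsML 1.x.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified NewsML 1.x-style fragment (not DTD-valid as is).
SAMPLE_NEWSML = """<NewsML>
  <NewsItem>
    <NewsComponent>
      <NewsLines>
        <HeadLine>Example headline</HeadLine>
      </NewsLines>
      <ContentItem Href="story.xml">
        <MediaType FormalName="Text"/>
        <Format FormalName="NITF"/>
      </ContentItem>
      <ContentItem Href="photo.jpg">
        <MediaType FormalName="Photo"/>
        <Format FormalName="JPEG Baseline"/>
        <Characteristics>
          <SizeInBytes>48213</SizeInBytes>
        </Characteristics>
      </ContentItem>
    </NewsComponent>
  </NewsItem>
</NewsML>"""


def list_content_items(xml_text):
    """Return the physical description attached to each ContentItem."""
    root = ET.fromstring(xml_text)
    items = []
    for content in root.iter("ContentItem"):
        media_type = content.find("MediaType")
        fmt = content.find("Format")
        size = content.findtext("Characteristics/SizeInBytes")
        items.append({
            "href": content.get("Href"),  # external reference to the content
            "media_type": media_type.get("FormalName") if media_type is not None else None,
            "format": fmt.get("FormalName") if fmt is not None else None,
            "size_in_bytes": int(size) if size is not None else None,
        })
    return items


if __name__ == "__main__":
    for item in list_content_items(SAMPLE_NEWSML):
        print(item)
```

Here the metadata sits directly inside the ContentItem it describes, and optional parts such as the size are simply reported as None when absent.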
Details of the latest release of NewsML are available at
NewsML.org.

RSS (variously expanded as Rich Site Summary, RDF Site
Summary, or Really Simple Syndication) defines an XML grammar for
sharing and distributing news. Each RSS text file contains static
information about the web site plus dynamic information about its content
or new stories. Each story is defined by an <item> tag, which contains a
headline (title), URL (link), and description. In early versions such as
RSS 0.91 a channel can contain up to 15 items; RSS 2.0 removes this limit.
RSS is easily parsed using Perl or other open source software.
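To show how little is needed to parse such a file, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The feed text, the example.com URLs, and the parse_rss() helper are made up for illustration; the sketch assumes an un-namespaced RSS 0.9x/2.0 layout with a single channel element.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Headlines from an example site</description>
    <item>
      <title>First headline</title>
      <link>http://www.example.com/stories/1</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://www.example.com/stories/2</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""

FIELDS = ("title", "link", "description")


def parse_rss(xml_text):
    """Return (channel_info, items) from an RSS document string."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("no <channel> element found")

    # Static information about the site sits directly under <channel>.
    channel_info = {name: channel.findtext(name) for name in FIELDS}

    # Dynamic information: one dictionary per <item>.
    items = [
        {name: item.findtext(name) for name in FIELDS}
        for item in channel.findall("item")
    ]
    return channel_info, items


if __name__ == "__main__":
    info, stories = parse_rss(SAMPLE_RSS)
    print(info["title"])
    for story in stories:
        print("-", story["title"], story["link"])
```

RSS 1.0 (the RDF-based variant) uses XML namespaces and places items outside the channel element, so it would need a slightly different lookup than the one sketched here.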
RSS is currently used for a
number of applications, including news and other headline syndication,
weblog syndication, and the propagation of software update lists. It is
generally suited to any situation in which a machine-readable list of textual
items and/or metadata about them needs to be distributed. At least seven
revisions of the RSS format have been defined, many of which are
actively used.
RSS 2.0 Specification

Other Specifications for Aggregation and Syndication
- Publishing Requirements for Industry Standard Metadata (PRISM)
- Atom Publishing Format and Protocol