Thursday, July 12, 2007

RSS Technical notes

RSS (which, in its latest format, stands for "Really Simple Syndication") is a family of web feed formats used to publish frequently updated content such as blog entries, news headlines or podcasts. An RSS document, which is called a "feed," "web feed," or "channel," contains either a summary of content from an associated web site or the full text. RSS makes it possible for people to keep up with their favorite web sites in an automated manner that's easier than checking them manually.

The initials "RSS" are used to refer to the following formats:

  • Really Simple Syndication (RSS 2.0)
  • RDF Site Summary (RSS 1.0 and RSS 0.90)
  • Rich Site Summary (RSS 0.91)

RSS formats are specified using XML, a generic specification for the creation of data formats.
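Because every variant is plain XML, the root element is usually enough to tell the format families apart: RSS 0.91 and 2.0 documents start with an rss element carrying a version attribute, while RSS 0.90 and 1.0 documents start with rdf:RDF. The following is a minimal sketch of that check using only the JDK's built-in XML parser. The class name RssFormatSniffer and its describe method are hypothetical names chosen for this illustration, and the labels it returns are not taken from any specification.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: guesses the RSS family from the root element only.
class RssFormatSniffer {

    /** Returns a rough label for the feed format, based only on the root element. */
    static String describe(String xml) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // lets us read "RDF" from "rdf:RDF"
            root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }

        String name = root.getLocalName();
        if ("rss".equals(name)) {
            // getAttribute returns an empty string when the attribute is absent
            String version = root.getAttribute("version");
            return version.isEmpty() ? "RSS (version attribute missing)" : "RSS " + version;
        }
        if ("RDF".equals(name)) {
            return "RDF-based RSS (0.90 or 1.0)";
        }
        return "not an RSS document";
    }

    public static void main(String[] args) {
        // prints "RSS 2.0"
        System.out.println(describe("<rss version=\"2.0\"><channel/></rss>"));
    }
}
```

This only inspects the root element; it does not confirm that the rest of the document follows the corresponding specification.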

RSS Elements

An RSS document consists of the following elements (only the most important ones are listed here).

  1. RSS: The rss element is the top-level element of an RSS feed. A feed that conforms to the RSS specification must contain a version attribute with the value "2.0". This element is required and must contain a channel element. The rss element must not contain more than one channel.

1.1 Channel: The channel element describes the RSS feed, providing such information as its title and description, and contains items that represent discrete updates to the web content represented by the feed. This element is required and must contain three child elements: description, link and title.

1.1.1 Description: The description element holds character data that provides a human-readable characterization or summary of the feed (required).

1.1.2 Link: The link element identifies the URL of the web site associated with the feed (required).

1.1.3 Title: The title element holds character data that provides the name of the feed (required).

1.1.4 Item: An item element represents distinct content published in the feed such as a news article, weblog entry or some other form of discrete update. A channel may contain any number of items (or no items at all). An item may contain the following child elements: author, category, comments, description, enclosure, guid, link, pubDate, source and title. All of these elements are optional but an item must contain either a title or description.

1.1.4.1 Author: An item's author element provides the e-mail address of the person who wrote the item (optional).

1.1.4.2 Category: An item's category element identifies a category or tag to which the item belongs (optional).

1.1.4.3 Title: An item's title element holds character data that provides the item's headline. This element is optional if the item contains a description element.

1.1.4.4 Comments: An item's comments element identifies the URL of a web page that contains comments received in response to the item (optional).

1.1.4.5 Description: An item's description element holds character data that contains the item's full content or a summary of its contents, a decision entirely at the discretion of the publisher. This element is optional if the item contains a title element.

1.1.4.6 Link: An item's link element identifies the URL of a web page associated with the item (optional).
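The requirements listed above (a version of "2.0", exactly one channel, the three required channel children, and a title or description on every item) can be checked mechanically. The sketch below is one possible way to do that with the JDK's DOM parser. RssRuleChecker, check and children are hypothetical names made up for this example, and the checks cover only the rules mentioned in these notes, not the whole specification.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical checker covering only the rules described in these notes.
class RssRuleChecker {

    /** Returns the direct child elements of parent that have the given tag name. */
    static List<Element> children(Element parent, String name) {
        List<Element> found = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                found.add((Element) n);
            }
        }
        return found;
    }

    /** Returns a list of rule violations; an empty list means all checks passed. */
    static List<String> check(String xml) {
        Element rss;
        try {
            rss = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }

        List<String> problems = new ArrayList<>();
        if (!"rss".equals(rss.getNodeName()) || !"2.0".equals(rss.getAttribute("version"))) {
            problems.add("root must be <rss version=\"2.0\">");
        }

        List<Element> channels = children(rss, "channel");
        if (channels.size() != 1) {
            problems.add("rss must contain exactly one channel");
            return problems; // nothing further to inspect without a single channel
        }

        Element channel = channels.get(0);
        for (String required : new String[] {"title", "link", "description"}) {
            if (children(channel, required).isEmpty()) {
                problems.add("channel is missing " + required);
            }
        }

        int index = 0;
        for (Element item : children(channel, "item")) {
            index++;
            boolean hasTitle = !children(item, "title").isEmpty();
            boolean hasDescription = !children(item, "description").isEmpty();
            if (!hasTitle && !hasDescription) {
                problems.add("item " + index + " needs a title or a description");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String feed = "<rss version=\"2.0\"><channel><title>T</title>"
                + "<link>http://example.com/</link><description>D</description>"
                + "<item><link>http://example.com/1</link></item></channel></rss>";
        check(feed).forEach(System.out::println);
    }
}
```

Returning a list of messages rather than throwing on the first violation makes it possible to report every problem in a feed at once.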

Java API for creating RSS

One Java API for constructing RSS feeds in any of these versions is Informa, an open-source project.

Informa provides a harmonized view of a news channel object model. Both channels and news items have metadata assigned, regardless of the channel format from which they were originally retrieved (RSS 0.9x, RSS 1.0 / RDF, RSS 2.0, Atom 0.3 and Atom 1.0). Informa is in beta, but it is quite stable and fully usable.
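To give an idea of what constructing a feed involves, here is a minimal sketch that builds an RSS 2.0 document with one channel and one item. It deliberately does not use Informa (its API is not reproduced here); it relies only on the DOM and Transformer classes that ship with the JDK. MinimalFeedBuilder, build and addChild are hypothetical names invented for this illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical JDK-only sketch; this is not the Informa API.
class MinimalFeedBuilder {

    /** Appends a child element (with optional text) to parent and returns it. */
    static Element addChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        if (text != null) {
            child.setTextContent(text);
        }
        parent.appendChild(child);
        return child;
    }

    /** Builds an RSS 2.0 feed with the three required channel elements and one item. */
    static String build(String title, String link, String description,
                        String itemTitle, String itemLink) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

            Element rss = doc.createElement("rss");
            rss.setAttribute("version", "2.0");
            doc.appendChild(rss);

            Element channel = addChild(doc, rss, "channel", null);
            addChild(doc, channel, "title", title);
            addChild(doc, channel, "link", link);
            addChild(doc, channel, "description", description);

            Element item = addChild(doc, channel, "item", null);
            addChild(doc, item, "title", itemTitle);
            addChild(doc, item, "link", itemLink);

            // Serialize the DOM tree; special characters in text are escaped for us.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("could not build the feed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("Tech & Notes", "http://example.com/",
                "Technical notes", "RSS Technical notes", "http://example.com/rss-notes"));
    }
}
```

Building the document through the DOM rather than by concatenating strings means characters such as "&" in titles are escaped automatically. A library such as Informa adds a format-independent object model on top of this kind of low-level work.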

For more information

http://www.rssboard.org/rss-profile

http://informa.sourceforge.net/
