feedparser — Universal Feed Parser

Note

For more comprehensive documentation, please visit the feedparser website.

Universal Feed Parser is a Python module for downloading and parsing syndicated feeds. It can handle RSS 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, Atom 1.0, and CDF feeds. It also parses several popular extension modules, including Dublin Core and Apple’s iTunes extensions.

Universal Feed Parser is easy to use; the module is self-contained in a single file, feedparser.py, and it has one primary public function, parse(). parse() takes a number of arguments, but only one is required, and it can be a URL, a local filename, or a raw string containing feed data in any format.

Parsing a feed from a remote URL

>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
>>> d['feed']['title']
u'Sample Feed'

The following example assumes you have saved a feed in atom10.xml.

Parsing a feed from a local file

>>> import feedparser
>>> d = feedparser.parse(r'atom10.xml')
>>> d['feed']['title']
u'Sample Feed'

Universal Feed Parser can also parse a feed in memory.

Parsing a feed from a string

>>> import feedparser
>>> rawdata = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""
>>> d = feedparser.parse(rawdata)
>>> d['feed']['title']
u'Sample Feed'

Values are returned as Python Unicode strings (except when they’re not – see advanced encoding for all the gory details).

feedparser.parse(url_file_stream_or_string, etag=None, modified=None, agent=None, referrer=None, handlers=None, request_headers=None, response_headers=None)

Parse a feed from a URL, file, stream, or string.

request_headers, if given, is a dict from http header name to value to add to the request; this overrides internally generated values.