ccrdf.rdfextract

Module ccrdf.rdfextract

rdfextract.py

A pluggable class for extracting RDF from blocks of text. By default it uses a simple regex to find RDF; it can be extended with any number of other functions for specialized processing.
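
The plug-in idea described above can be sketched roughly as follows. This is a minimal illustration only: the class name `PluggableExtractor` and its `register` and `extract` methods are hypothetical and are not taken from `RdfExtractor`, whose actual interface is not shown on this page.

```python
class PluggableExtractor:
    """Hypothetical stand-in for a pluggable extractor; not the real RdfExtractor API."""

    def __init__(self, extractors=None):
        # Each extractor is a callable taking (text, url) and returning a list of strings.
        self.extractors = list(extractors or [])

    def register(self, func):
        """Add another extractor function; returns it so it can be used as a decorator."""
        self.extractors.append(func)
        return func

    def extract(self, text, url=None):
        """Run every registered extractor and collect the unique blocks in order."""
        results = []
        for func in self.extractors:
            for block in func(text, url):
                if block not in results:
                    results.append(block)
        return results
```

Collecting results from every registered function, rather than stopping at the first hit, is a design choice of this sketch; the real class may behave differently.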

Classes
`RdfExtractor`	A pluggable class for extracting RDF from blocks of text.

Function Summary
	`href_extractor(text, url)` Extracts metadata stored in linked files specified by <a rel="license" href="..." >
	`link_extractor(text, url)` Extracts metadata stored in linked files specified by <link rel="meta" ...> as in: <link rel="meta" type="application/rdf+xml" href="/en/wiki/wiki.phtml?title=Main_Page&action=creativecommons">
	`null_extractor(text, url)` This is a sample extractor with no functionality; it exists in the source for the purpose of documenting the extractor function signature.
	`regex_extractor(text, url)` Extracts RDF segments from a textblock; returns a list of strings.
	`retrieveUrl(url)` Returns the document contained at [url].
	`string_extractor(text, url)` Extracts RDF segments from a block of text using simple string methods; for fallback only.

Variable Summary
`str`	`__copyright__` = `'(c) 2004, Nathan R. Yergler'`
`str`	`__id__` = `'$Id: rdfextract.py 668 2006-07-10 14:27:27Z ny...`
`str`	`__license__` = `'licensed under the GNU GPL2'`
`str`	`__version__` = `'$Revision: 668 $'`

Function Details

href_extractor(text, url)

Extracts metadata stored in linked files specified by <a rel="license" href="..." >

link_extractor(text, url)

Extracts metadata stored in linked files specified by <link rel="meta" ...> as in: <link rel="meta" type="application/rdf+xml" href="/en/wiki/wiki.phtml?title=Main_Page&action=creativecommons">
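
As a rough sketch of the link-following step, the helper below finds `<link rel="meta" ...>` targets and resolves them against the page URL using only the Python 3 standard library. The names `find_rel_links` and `link_extractor_sketch`, and the `fetch` callable, are assumptions made for illustration; the module's actual implementation is not shown here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _RelLinkFinder(HTMLParser):
    """Collects href values from tags of a given name whose rel contains a given token."""

    def __init__(self, tag, rel):
        super().__init__()
        self.tag = tag
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        attrs = dict(attrs)
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if self.rel in rel_tokens and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def find_rel_links(text, url, tag="link", rel="meta"):
    """Return absolute URLs of matching links, resolved against the page url."""
    finder = _RelLinkFinder(tag, rel)
    finder.feed(text)
    return [urljoin(url, href) for href in finder.hrefs]


def link_extractor_sketch(text, url, fetch):
    """Fetch each linked metadata document; fetch is a caller-supplied callable (assumed)."""
    return [fetch(target) for target in find_rel_links(text, url)]
```

Calling the same helper with `tag="a"` and `rel="license"` would cover the `<a rel="license" href="...">` case that `href_extractor` describes.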

null_extractor(text, url)

This is a sample extractor with no functionality; it exists in the source for the purpose of documenting the extractor function signature.

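An extractor function takes the text to search and, per the signatures documented here, a url argument (presumably the location the text was retrieved from). It returns a list of RDF blocks extracted from the text. If no RDF is found, it should return an empty list.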
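
To make the contract concrete, here is a minimal sketch of two functions that follow the documented `(text, url)` signature. Both names are hypothetical; `comment_extractor` is an invented example that looks for RDF inside HTML comments and is not part of this module.

```python
def null_extractor_sketch(text, url):
    """Does nothing; shows the shape every extractor shares."""
    return []


def comment_extractor(text, url):
    """Hypothetical extractor: returns RDF blocks found inside HTML comments."""
    blocks = []
    pos = 0
    while True:
        start = text.find("<!--", pos)
        if start == -1:
            break
        end = text.find("-->", start + 4)
        if end == -1:
            break
        body = text[start + 4:end]
        # Keep only comments that appear to wrap a complete RDF element.
        if "<rdf:RDF" in body and "</rdf:RDF>" in body:
            blocks.append(body.strip())
        pos = end + 3
    return blocks
```

Both functions return a plain list, and an empty one when nothing is found, so a caller can combine results without special cases.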

regex_extractor(text, url)

Extracts RDF segments from a textblock; returns a list of strings.
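
A regex-based extractor of this kind might look like the sketch below. The pattern is an assumption; the regex the module actually uses is not shown in this documentation.

```python
import re

# Assumed pattern: a non-greedy match from an opening <rdf:RDF ...> to the
# nearest closing </rdf:RDF>, spanning newlines.
RDF_PATTERN = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL | re.IGNORECASE)


def regex_extractor_sketch(text, url=None):
    """Return every RDF block found in text as a list of strings."""
    return RDF_PATTERN.findall(text)
```

The non-greedy `.*?` keeps two separate RDF blocks in one document from being merged into a single match.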

retrieveUrl(url)

Returns the document contained at [url].
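
In Python 3, a function with this behavior could be sketched with `urllib.request`, as below. The name `retrieve_url_sketch`, the `timeout` parameter, and the choice to return raw bytes are assumptions; the original module predates Python 3 and its implementation is not shown here.

```python
from urllib.request import urlopen


def retrieve_url_sketch(url, timeout=10):
    """Return the body of the document at url as bytes (assumed return type)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```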

string_extractor(text, url)

Extracts RDF segments from a block of text using simple string methods; for fallback only.
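
A fallback built on plain string methods could be sketched as follows. The tag strings and the scanning loop are assumptions about how such a fallback might work, not the module's own code.

```python
def string_extractor_sketch(text, url=None):
    """Find RDF blocks using str.find only; no regular expressions involved."""
    start_tag, end_tag = "<rdf:RDF", "</rdf:RDF>"
    blocks = []
    pos = 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        end = text.find(end_tag, start)
        if end == -1:
            # Unterminated block: stop rather than return a partial segment.
            break
        end += len(end_tag)
        blocks.append(text[start:end])
        pos = end
    return blocks
```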

Variable Details

__copyright__

Type: str
Value: '(c) 2004, Nathan R. Yergler'

__id__

Type: str
Value: '$Id: rdfextract.py 668 2006-07-10 14:27:27Z nyergler $'

__license__

Type: str
Value: 'licensed under the GNU GPL2'

__version__

Type: str
Value: '$Revision: 668 $'

Generated by Epydoc 2.1 on Mon Jul 10 17:08:26 2006

http://epydoc.sf.net