
Module ccrdf.rdfextract

rdfextract.py

A pluggable class for extracting RDF from blocks of text. By default it uses a simple regex to find RDF; it can be extended with any number of other functions for specialized processing.
Classes
  RdfExtractor
A pluggable class for extracting RDF from blocks of text.
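
The "pluggable" design described above can be sketched as a class that holds a list of extractor functions and runs each one in turn. This page does not document the methods of RdfExtractor, so the class name PluggableExtractor and its register() and extract() methods are hypothetical stand-ins, not the real API.

```python
class PluggableExtractor:
    """Hypothetical stand-in for a pluggable extractor; not the RdfExtractor API."""

    def __init__(self, extractors=None):
        # Each extractor is a callable taking (text, url) and returning a list of strings.
        self.extractors = list(extractors or [])

    def register(self, extractor):
        """Add another extractor function to the end of the chain."""
        self.extractors.append(extractor)

    def extract(self, text, url=""):
        """Run every registered extractor and collect unique blocks in order."""
        results = []
        for extractor in self.extractors:
            for block in extractor(text, url):
                if block not in results:
                    results.append(block)
        return results
```

Dropping duplicates is a design choice of this sketch: several extractors may find the same RDF block, and callers usually want each block once.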

Function Summary
  href_extractor(text, url)
Extracts metadata stored in linked files specified by <a rel="license" href="..." >
  link_extractor(text, url)
Extracts metadata stored in linked files specified by <link rel="meta" ...> as in: <link rel="meta" type="application/rdf+xml" href="/en/wiki/wiki.phtml?title=Main_Page&action=creativecommons">
  null_extractor(text, url)
This is a sample extractor with no functionality; it exists in the source for the purpose of documenting the extractor function signature.
  regex_extractor(text, url)
Extracts RDF segments from a textblock; returns a list of strings.
  retrieveUrl(url)
Returns the document contained at [url].
  string_extractor(text, url)
Extracts RDF segments from a block of text using simple string methods; for fallback only.

Variable Summary
str __copyright__ = '(c) 2004, Nathan R. Yergler'
str __id__ = '$Id: rdfextract.py 668 2006-07-10 14:27:27Z ny...
str __license__ = 'licensed under the GNU GPL2'
str __version__ = '$Revision: 668 $'

Function Details

href_extractor(text, url)

Extracts metadata stored in linked files specified by <a rel="license" href="..." >

link_extractor(text, url)

Extracts metadata stored in linked files specified by <link rel="meta" ...> as in: <link rel="meta" type="application/rdf+xml" href="/en/wiki/wiki.phtml?title=Main_Page&action=creativecommons">
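
As an illustration of the first half of this job, the sketch below locates the <link rel="meta" ...> targets and resolves them against the page URL. It assumes Python 3 and the standard html.parser and urllib.parse modules; the name find_meta_links is hypothetical, and the actual implementation in rdfextract.py may work differently. Fetching the linked files is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _MetaLinkParser(HTMLParser):
    """Collects href values from <link> tags whose rel includes "meta"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "meta" in rel and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def find_meta_links(text, url):
    """Return absolute URLs of linked metadata files (hypothetical helper)."""
    parser = _MetaLinkParser()
    parser.feed(text)
    # Relative hrefs are resolved against the URL of the page itself.
    return [urljoin(url, href) for href in parser.hrefs]
```

Resolving against url is presumably why extractors receive the page URL at all: an href such as the one in the example above is only usable once it is made absolute.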

null_extractor(text, url)

This is a sample extractor with no functionality; it exists in the source for the purpose of documenting the extractor function signature.

An extractor function takes two parameters, text and url, and returns a list of RDF blocks extracted from the text. If no RDF is found, it should return an empty list.
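
A custom extractor following this contract could look like the sketch below. The function whole_document_extractor is hypothetical and not part of the module; it handles the case where the text is itself a standalone RDF document, and returns an empty list otherwise.

```python
def whole_document_extractor(text, url):
    """Hypothetical extractor: treat the whole text as one RDF block.

    Follows the signature documented by null_extractor: takes (text, url)
    and returns a list of strings, or an empty list if no RDF is found.
    """
    stripped = text.strip()
    if stripped.startswith("<rdf:RDF") and stripped.endswith("</rdf:RDF>"):
        return [stripped]
    return []
```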

regex_extractor(text, url)

Extracts RDF segments from a textblock; returns a list of strings.
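
The pattern used by regex_extractor is not shown on this page. A minimal sketch of the technique, assuming RDF blocks are delimited by <rdf:RDF ...> and </rdf:RDF> and using a hypothetical function name, might be:

```python
import re

# Non-greedy match from an opening <rdf:RDF ...> tag to the nearest closing tag.
# re.DOTALL lets "." span newlines, since RDF blocks are usually multi-line.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF\s*>", re.DOTALL)


def regex_extract(text, url=None):
    """Return every RDF block found in text as a list of strings."""
    return RDF_BLOCK.findall(text)
```

The non-greedy quantifier matters: a greedy match would merge two separate blocks on the same page into one string.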

retrieveUrl(url)

Returns the document contained at [url].
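
A present-day equivalent can be sketched with the Python 3 standard library. The module itself dates from 2004-2006 and its implementation is not shown here, so the name retrieve_url and the timeout and encoding parameters are assumptions of this sketch.

```python
from urllib.request import urlopen


def retrieve_url(url, timeout=10, encoding="utf-8"):
    """Fetch url and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Undecodable bytes are replaced rather than raising an error.
        return response.read().decode(encoding, errors="replace")
```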

string_extractor(text, url)

Extracts RDF segments from a block of text using simple string methods; for fallback only.
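
A fallback of this kind can be sketched with str.find alone. The function name and the exact delimiters below are assumptions; the real string_extractor may differ in detail.

```python
def string_extract(text, url=None):
    """Return RDF blocks found by scanning for start and end tags."""
    start_tag, end_tag = "<rdf:RDF", "</rdf:RDF>"
    blocks = []
    pos = 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        end = text.find(end_tag, start)
        if end == -1:
            # Unterminated block: stop rather than return a partial segment.
            break
        end += len(end_tag)
        blocks.append(text[start:end])
        pos = end
    return blocks
```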

Variable Details

__copyright__

Type:
str
Value:
'(c) 2004, Nathan R. Yergler'                                          

__id__

Type:
str
Value:
'$Id: rdfextract.py 668 2006-07-10 14:27:27Z nyergler $'               

__license__

Type:
str
Value:
'licensed under the GNU GPL2'                                          

__version__

Type:
str
Value:
'$Revision: 668 $'                                                     

Generated by Epydoc 2.1 on Mon Jul 10 17:08:26 2006 http://epydoc.sf.net