be.re.repo
Interface TextExtract


public interface TextExtract

The interface that external modules implement to extract the text from a document. Such modules are declared in /configuration/maps.xml with the text-extract element, as in the following example:

 <text-extract>
   <mime-type>text/*</mime-type>
   <!-- Additional mime-type elements can come here. -->
   <!-- Several path elements can come here. -->
   <class>
     <name>be.re.repo.mod.TextExtractPlainText</name>
     <!-- Optional URL relative to maps.xml.
     <jar>modules/my_module.jar</jar>
     -->
   </class>
 </text-extract>
 
 

Author:
Werner Donné

Method Summary
 Reader get(String vcr, String version, InputStream in, String mimeType, Context context)
          The method should return the complete text from the document.
 

Method Detail

get

Reader get(String vcr,
           String version,
           InputStream in,
           String mimeType,
           Context context)
           throws IOException
The method should return the complete text from the document.

Parameters:
vcr - the local path of the resource.
version - the version path of the resource.
in - the input stream of the document.
mimeType - the MIME type of the document.
context - the repository context.
Returns:
A Reader over the extracted characters.
Throws:
IOException
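
Example:
The following is a minimal sketch of a module implementing this interface, not the shipped be.re.repo.mod.TextExtractPlainText. So that it compiles on its own, it declares local stand-ins for Context and TextExtract that only mirror the signature documented above. The class name Utf8TextExtract, the helper readAll, the sample paths, and the choice to decode the document as UTF-8 are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Stand-in for be.re.repo.Context; its real members are not shown on this page.
interface Context {}

// Local copy of the documented signature so the sketch is self-contained.
interface TextExtract {
  Reader get(String vcr, String version, InputStream in, String mimeType, Context context)
      throws IOException;
}

// Hypothetical module for text/* documents.
// Assumption: the document bytes are UTF-8 encoded.
class Utf8TextExtract implements TextExtract {
  @Override
  public Reader get(String vcr, String version, InputStream in, String mimeType, Context context)
      throws IOException {
    if (in == null) {
      throw new IOException("No input stream for " + vcr);
    }
    // Plain text needs no parsing: decoding the bytes yields the complete text.
    return new InputStreamReader(in, StandardCharsets.UTF_8);
  }
}

class TextExtractDemo {
  // Drains a Reader into a String.
  static String readAll(Reader reader) throws IOException {
    StringBuilder text = new StringBuilder();
    char[] buffer = new char[4096];
    int count;
    while ((count = reader.read(buffer)) != -1) {
      text.append(buffer, 0, count);
    }
    return text.toString();
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream("hello world".getBytes(StandardCharsets.UTF_8));
    // The path and version strings are placeholders; the context is unused here.
    try (Reader reader =
        new Utf8TextExtract().get("/docs/a.txt", "v1", in, "text/plain", null)) {
      System.out.println(readAll(reader));
    }
  }
}
```

A real module would import be.re.repo.TextExtract and be.re.repo.Context instead of declaring the stand-ins, and would be registered through the class element of a text-extract entry in maps.xml.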