public abstract class AbstractParser extends Object implements Parser
| Modifier and Type | Field and Description |
|---|---|
| protected StringBuffer | content |
| protected String | encoding |
| protected String | filename |
| protected Locale | locale |
| protected static org.slf4j.Logger | log |
| Constructor and Description |
|---|
| AbstractParser() |
| Modifier and Type | Method and Description |
|---|---|
| String | getAuthor() |
| String | getContent() |
| String | getEncoding(): The character encoding of the content to be parsed |
| String | getFilename(): The original file name of the content to be parsed |
| Locale | getLocale(): The locale of the content to be parsed |
| String | getSourceDate() |
| String | getTags() |
| String | getTitle() |
| String | getVersion() |
| protected abstract void | internalParse(InputStream is): Invoked by the parse method |
| void | parse(File file): Same as parse(InputStream input); use this when you have a file rather than a stream. |
| void | parse(InputStream input): Extracts the text content of the given binary document. |
| void | setEncoding(String encoding) |
| void | setFilename(String filename) |
| void | setLocale(Locale locale) |
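
The method summary suggests a template-method design: the public parse methods do the shared work and hand the stream to the abstract internalParse hook. The following is a minimal sketch of that shape, not the library's source. SketchParser and PlainTextSketchParser are hypothetical stand-ins, the delegation from parse(File) to parse(InputStream) and the reset of the content buffer are assumptions, and the throws clauses on the parse methods are a simplification for the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Entry point declared first so the file can be launched directly as a single source file.
class ParserSketchDemo {
    public static void main(String[] args) throws Exception {
        SketchParser parser = new PlainTextSketchParser();
        parser.setEncoding("UTF-8");
        parser.parse(new ByteArrayInputStream("hello parser".getBytes(StandardCharsets.UTF_8)));
        System.out.println(parser.getContent());
    }
}

// Hypothetical stand-in for AbstractParser: only the shape suggested by the summary
// tables (protected fields, public parse methods, abstract internalParse hook).
abstract class SketchParser {
    protected StringBuffer content = new StringBuffer();
    protected String encoding;

    public String getContent() { return content.toString(); }
    public String getEncoding() { return encoding; }
    public void setEncoding(String encoding) { this.encoding = encoding; }

    // File overload: open a stream and delegate to the stream overload (assumed behaviour).
    public void parse(File file) throws Exception {
        try (InputStream is = new FileInputStream(file)) {
            parse(is);
        }
    }

    // Template method: reset the buffer, then call the subclass hook (assumed behaviour).
    public void parse(InputStream input) throws Exception {
        content = new StringBuffer();
        internalParse(input);
    }

    // Subclasses implement the format-specific extraction here.
    protected abstract void internalParse(InputStream is) throws Exception;
}

// Hypothetical subclass that copies plain text into the content buffer.
class PlainTextSketchParser extends SketchParser {
    @Override
    protected void internalParse(InputStream is) throws IOException {
        // Fall back to UTF-8 when no encoding was set (an assumption of this sketch).
        String charset = (encoding != null) ? encoding : "UTF-8";
        try (Reader reader = new InputStreamReader(is, charset)) {
            char[] buffer = new char[4096];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                content.append(buffer, 0, read);
            }
        }
    }
}
```

With this layout a new document format only needs a subclass that fills the content buffer in internalParse; callers keep using parse and getContent.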
protected static org.slf4j.Logger log
protected StringBuffer content
protected String filename
protected Locale locale
protected String encoding
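public String getContent()
Specified by: getContent in interface Parser

public String getSourceDate()
Specified by: getSourceDate in interface Parser

public String getVersion()
Specified by: getVersion in interface Parser

public String getFilename()
Description copied from interface: Parser
The original file name of the content to be parsed
Specified by: getFilename in interface Parser

public void setFilename(String filename)
Specified by: setFilename in interface Parser

public Locale getLocale()
Description copied from interface: Parser
The locale of the content to be parsed

public String getEncoding()
Description copied from interface: Parser
The character encoding of the content to be parsed
Specified by: getEncoding in interface Parser

public void setEncoding(String encoding)
Specified by: setEncoding in interface Parser

public void parse(File file)
Description copied from interface: Parser
Same as parse(InputStream input); use this when you have a file rather than a stream.

public void parse(InputStream input)
Description copied from interface: Parser
Extracts the text content of the given binary document. The content type of the document is expected to be one of those reported by Parser#getContentTypes() unless the implementation
explicitly permits other content types.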
The implementation can choose either to read and parse the given document immediately or to return a reader that does so incrementally. The only constraint is that the implementation must close the given stream, at the latest, when the returned reader is closed. The caller, on the other hand, is responsible for closing the returned reader.
The implementation should throw an exception only on transient errors, i.e. when it can expect to extract the text content of the same binary successfully at another time. It should make an effort to recover from syntax errors and similar problems.
This method should be thread-safe, i.e. it may be invoked simultaneously by different threads to extract the text content of different documents. The returned reader, on the other hand, does not need to be thread-safe.
Parsing must complete within the number of seconds specified by the parser.timeout configuration property.
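
How the timeout is enforced is not shown on this page. One common way to bound a parse in plain Java is to run it on a worker thread and wait on a Future for a limited time. The sketch below illustrates that general technique only: TimedParse, its run method, and the fallback value are hypothetical names, and the timeout is passed in directly instead of being read from parser.timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Entry point declared first so the file can be launched directly as a single source file.
class TimeoutSketchDemo {
    public static void main(String[] args) {
        // A fast task finishes inside the limit; a slow one is abandoned.
        System.out.println(TimedParse.run(() -> "parsed text", 2, "fallback"));
        System.out.println(TimedParse.run(() -> {
            Thread.sleep(5_000);
            return "too late";
        }, 1, "fallback"));
    }
}

// Hypothetical helper, not part of the documented API.
final class TimedParse {
    private TimedParse() {}

    // Runs parseTask on a worker thread and waits at most timeoutSeconds for its result.
    // Returns fallback when the task times out, fails, or the caller is interrupted.
    static String run(Callable<String> parseTask, long timeoutSeconds, String fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> future = executor.submit(parseTask);
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker that is still parsing
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the parse task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Cancellation here relies on thread interruption, so a parse that never checks for interruption or blocks on non-interruptible I/O may keep running in the background even after the caller has given up.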
protected abstract void internalParse(InputStream is) throws Exception
Throws: Exception

Copyright © 2008-2014 Logical Objects. All Rights Reserved.