public class TextExtractor extends DefaultCallback
| Modifier and Type | Field and Description |
|---|---|
| MutableString | text - The text resulting from the parsing process. |
| MutableString | title - The title resulting from the parsing process. |
Fields inherited from interface Callback: EMPTY_CALLBACK_ARRAY

| Constructor and Description |
|---|
| TextExtractor() |

| Modifier and Type | Method and Description |
|---|---|
| boolean | characters(char[] characters, int offset, int length, boolean flowBroken) - Receive notification of character data inside an element. |
| void | configure(BulletParser parser) - Configure the parser to parse text. |
| boolean | endElement(Element element) - Receive notification of the end of an element. |
| void | startDocument() - Receive notification of the beginning of the document. |
| boolean | startElement(Element element, Map<Attribute,MutableString> attrMapUnused) - Receive notification of the start of an element. |

Methods inherited from class DefaultCallback: cdata, endDocument, getInstance

public final MutableString text

public final MutableString title

public void configure(BulletParser parser)
Specified by: configure in interface Callback
Overrides: configure in class DefaultCallback

public void startDocument()
Description copied from interface: Callback
The callback must use this method to reset its internal state so that it can be reused. It must be safe to invoke this method several times.
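
The reset contract described here can be illustrated with a minimal, self-contained sketch. The class `ResettableCollector` and its methods are hypothetical stand-ins written for this example; they are not the library's `Callback` interface or the actual `TextExtractor` implementation. The sketch only shows the two properties the contract asks for: state left over from a previous document is discarded, and calling the reset method several times in a row is harmless.

```java
// Hypothetical stand-in for a reusable callback; not the library's own types.
final class ResettableCollector {
    // Text accumulated for the document currently being parsed.
    private final StringBuilder text = new StringBuilder();
    // Number of character chunks received for the current document.
    private int chunks;

    /** Clears all per-document state. Repeated calls have no further effect. */
    void startDocument() {
        text.setLength(0);
        chunks = 0;
    }

    /** Appends the given slice of the array to the accumulated text. */
    void characters(char[] data, int offset, int length) {
        text.append(data, offset, length);
        chunks++;
    }

    String text() {
        return text.toString();
    }

    int chunks() {
        return chunks;
    }
}

class ResetDemo {
    public static void main(String[] args) {
        ResettableCollector collector = new ResettableCollector();

        // First document.
        collector.startDocument();
        collector.characters("first".toCharArray(), 0, 5);

        // Second document: resetting twice must be as safe as resetting once.
        collector.startDocument();
        collector.startDocument();
        collector.characters("second".toCharArray(), 0, 6);

        System.out.println(collector.text()); // prints "second"
    }
}
```

Because the reset only assigns fixed values and never depends on the previous state, invoking it any number of times leaves the collector in the same empty state.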
Specified by: startDocument in interface Callback
Overrides: startDocument in class DefaultCallback

public boolean characters(char[] characters, int offset, int length, boolean flowBroken)
Description copied from interface: Callback
You must not write into the character array, as it may be passed to many callbacks.
flowBroken is true if and only if the flow was broken before this run of text. This makes it possible to quickly extract the text of a document without looking at the elements.
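
As a rough illustration of how these parameters fit together, the sketch below copies the slice `[offset, offset + length)` into a private buffer without touching the shared array, and inserts a single space whenever `flowBroken` is set. The class `FlowAwareTextBuffer` is hypothetical and written only for this example; the spacing policy and the meaning of the `true` return value (taken here to mean "keep parsing") are assumptions, not statements about what `TextExtractor` actually does.

```java
// Hypothetical sketch of a characters() handler; not the library's implementation.
final class FlowAwareTextBuffer {
    private final StringBuilder text = new StringBuilder();

    /**
     * Appends one run of character data.
     * Returns true, assumed here to mean that parsing should continue.
     */
    boolean characters(char[] characters, int offset, int length, boolean flowBroken) {
        // A broken flow (for example a new block) is rendered as one separating space,
        // unless the buffer is empty or already ends with a space.
        if (flowBroken && text.length() > 0 && text.charAt(text.length() - 1) != ' ') {
            text.append(' ');
        }
        // Copy the requested slice; the input array itself is only read, never written.
        text.append(characters, offset, length);
        return true;
    }

    String text() {
        return text.toString();
    }
}

class FlowDemo {
    public static void main(String[] args) {
        // One shared array, as a parser might hand to several callbacks.
        char[] shared = "<p>Hello</p><p>world<b>!</b></p>".toCharArray();

        FlowAwareTextBuffer buffer = new FlowAwareTextBuffer();
        buffer.characters(shared, 3, 5, true);   // "Hello", start of a block
        buffer.characters(shared, 15, 5, true);  // "world", start of another block
        buffer.characters(shared, 23, 1, false); // "!", inline, flow not broken

        System.out.println(buffer.text()); // prints "Hello world!"
    }
}
```

Only the `flowBroken` flag is consulted to decide where words separate, which is why no element handling is needed to obtain readable text.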
Specified by: characters in interface Callback
Overrides: characters in class DefaultCallback
Parameters:
characters - an array containing the character data.
offset - the start position in the array.
length - the number of characters to read from the array.
flowBroken - whether the flow is broken at the start of the text.

public boolean endElement(Element element)
Description copied from interface: Callback
This method will never be called for elements without closing tags, even if such a tag is found.
Specified by: endElement in interface Callback
Overrides: endElement in class DefaultCallback
Parameters:
element - the element whose closing tag was found.

public boolean startElement(Element element, Map<Attribute,MutableString> attrMapUnused)
Description copied from interface: Callback
For simple elements, this is the only notification that the callback will ever receive.
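
One consequence of this rule is that a callback cannot assume every start notification is matched by an end notification. The sketch below shows one way to cope with that, under stated assumptions: the class `TitleTracker` is hypothetical, elements are represented by plain strings rather than the library's `Element` type, and nothing here is claimed to be how `TextExtractor` fills its `title` field. State is keyed to a specific named element instead of a depth counter, so an unmatched simple element such as `br` cannot leave the tracker in a wrong state.

```java
// Hypothetical sketch; element names are plain strings, not the library's Element type.
final class TitleTracker {
    private final StringBuilder title = new StringBuilder();
    // True only between the start and end notifications of a "title" element.
    private boolean inTitle;

    boolean startElement(String element) {
        if ("title".equals(element)) {
            inTitle = true;
        }
        // Other elements, including simple ones with no closing tag, change nothing.
        return true;
    }

    boolean endElement(String element) {
        if ("title".equals(element)) {
            inTitle = false;
        }
        return true;
    }

    boolean characters(char[] characters, int offset, int length) {
        // Character data is kept only while inside the title element.
        if (inTitle) {
            title.append(characters, offset, length);
        }
        return true;
    }

    String title() {
        return title.toString();
    }
}

class TitleDemo {
    public static void main(String[] args) {
        TitleTracker tracker = new TitleTracker();

        tracker.startElement("title");
        tracker.characters("Home".toCharArray(), 0, 4);
        tracker.endElement("title");

        // A simple element: only the start notification ever arrives.
        tracker.startElement("br");
        tracker.characters("Body".toCharArray(), 0, 4);

        System.out.println(tracker.title()); // prints "Home"
    }
}
```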
Specified by: startElement in interface Callback
Overrides: startElement in class DefaultCallback
Parameters:
element - the element whose opening tag was found.
attrMapUnused - a map from Attributes to MutableStrings.

Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.