public class LineWordReader extends Object implements WordReader, Serializable
A WordReader that treats each line of a document as a single word.
The intended use of this class is indexing data such as lists of document
identifiers: if the identifiers contain nonalphabetical characters, the default
FastBufferedReader might segment them poorly, for example by breaking a single
identifier into several words.
Note that the non-word returned by next(MutableString, MutableString) is
always empty.
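The line-as-word behaviour can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: `LineWords` is a hypothetical stand-in built on `java.io.BufferedReader`, and `StringBuilder` takes the place of `MutableString`. It mirrors only the documented contract (each line is one word, the non-word is always empty), not the actual implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a line-oriented word reader; not the library class. */
class LineWords {
    private final BufferedReader in;

    LineWords(Reader reader) {
        this.in = new BufferedReader(reader);
    }

    /**
     * Fills word with the next line and empties nonWord.
     * Returns false at end of input, leaving both buffers untouched.
     */
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0); // the non-word is always empty
        return true;
    }

    /** Collects every line of the given text as one word. */
    static List<String> words(String text) {
        LineWords reader = new LineWords(new StringReader(text));
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        List<String> result = new ArrayList<>();
        try {
            while (reader.next(word, nonWord)) result.add(word.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

With this sketch, an identifier such as `doc:42/a` comes back whole, because punctuation inside a line is never treated as a word boundary.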
| Constructor and Description |
|---|
| LineWordReader() |
| Modifier and Type | Method and Description |
|---|---|
| LineWordReader | copy(): Returns a copy of this word reader. |
| boolean | next(MutableString word, MutableString nonWord): Extracts the next word and non-word. |
| LineWordReader | setReader(Reader reader): Resets the internal state of this word reader, which will start again reading from the given reader. |
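How setReader(Reader) and copy() fit together can be sketched without the library. `ResettableLineWords` below is a hypothetical stand-in, with `StringBuilder` in place of `MutableString`. Two details are assumptions rather than documented facts: that setReader returns the same instance so calls can be chained, and that a copy starts with no input attached.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in used only to illustrate setReader() and copy(). */
class ResettableLineWords {
    private BufferedReader in; // null until setReader() is called

    /** Discards any previous input and starts reading from the given reader. */
    ResettableLineWords setReader(Reader reader) {
        this.in = new BufferedReader(reader);
        return this; // assumption: returning this allows call chaining
    }

    /** Returns a fresh reader with the same behaviour (assumed: no input attached). */
    ResettableLineWords copy() {
        return new ResettableLineWords();
    }

    /** Fills word with the next line and empties nonWord; false at end of input. */
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        if (in == null) throw new IllegalStateException("setReader() has not been called");
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0);
        return true;
    }

    /** Attaches text to the given reader and returns its first word, or null if there is none. */
    static String firstWord(ResettableLineWords reader, String text) {
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        try {
            return reader.setReader(new StringReader(text)).next(word, nonWord) ? word.toString() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reusing one instance across inputs through setReader avoids allocating a reader per document, while copy() gives each thread its own instance with matching behaviour.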
public boolean next(MutableString word, MutableString nonWord) throws IOException
Description copied from interface: WordReader
If this method returns true, a new non-empty word, and possibly
a new non-word, have been extracted. It is acceptable
that the first call to this method after creation
or after a call to WordReader.setReader(Reader) returns an empty
word. In other words, both word and nonWord are maximal.
Specified by: next in interface WordReader
Parameters:
word - the next word returned by the underlying reader.
nonWord - the nonword following the next word returned by the underlying reader.
Returns: true if a new word was extracted, false otherwise (in which case word and nonWord are unchanged).
Throws: IOException

public LineWordReader setReader(Reader reader)
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.
Specified by: setReader in interface WordReader
Parameters:
reader - the new reader providing characters.

public LineWordReader copy()
Description copied from interface: WordReader
This method must return a word reader with a behaviour that matches exactly that of this word reader.
Specified by: copy in interface WordReader

Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.