TokenBuffer (Blazegraph Database Platform 2.1.5 API)

java.lang.Object
- com.bigdata.search.TokenBuffer<V>

Type Parameters:
V - The generic type of the document identifier.
```
public class TokenBuffer<V extends Comparable<V>>
extends Object
```
A buffer holding tokens extracted from one or more documents / fields. Each entry in the buffer corresponds to the TermFrequencyData extracted from a field of some document. When the buffer overflows it is flush()ed, writing its contents on the indices.

Author:

Bryan Thompson

Constructor Summary

Constructors
Constructor and Description
`TokenBuffer(int capacity, FullTextIndex<V> textIndexer)` Creates a buffer with the given capacity that writes on the given text indexer.

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`add(V docId, int fieldId, String token)` Adds another token to the current field of the current document.
`protected long`	`deleteFromIndex(int n, byte[][] keys, byte[][] vals)` Deletes from the index.
`void`	`flush()` Write any buffered data on the indices.
`TermFrequencyData<V>`	`get(int index)` Return the `TermFrequencyData` for the specified index.
`void`	`reset()` Discards all data in the buffer and resets it to a clean state.
`int`	`size()` The #of entries in the buffer.
`protected long`	`writeOnIndex(int n, byte[][] keys, byte[][] vals)` Writes on the index.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - TokenBuffer
```
public TokenBuffer(int capacity,
           FullTextIndex<V> textIndexer)
```
    Creates a buffer with the given capacity that writes on the given text indexer.
    
    Parameters:
    capacity - The #of distinct {document,field} tuples that can be held in the buffer before it will overflow. The buffer will NOT overflow until you exceed this capacity.
    textIndexer - The object on which the buffer will write when it overflows or is flush()ed.
- Method Detail
  - reset
```
public void reset()
```
    Discards all data in the buffer and resets it to a clean state.
  - size
```
public int size()
```
    The #of entries in the buffer.
  - get
```
public TermFrequencyData<V> get(int index)
```
    Return the TermFrequencyData for the specified index.
    
    Parameters:
    index - The index in [0:count).
    
    Returns:
    The TermFrequencyData at that index.
    
    Throws:
    
    IndexOutOfBoundsException
  - add
```
public void add(V docId,
       int fieldId,
       String token)
```
    Adds another token to the current field of the current document. If either the field or the document identifier changes, then begins a new field and possibly a new document. If the buffer is full then it will be flush()ed before beginning a new field.
    Note: This method is NOT thread-safe.
    Note: There is an assumption that the caller will process all tokens for a given field in the same document at once. Failure to do this will lead to only part of the term-frequency distribution for the field being captured by the indices.
    
    Parameters:
    docId - The document identifier.
    fieldId - The field identifier.
    token - The token.
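
    The behavior described above can be modeled in a few lines. The following is a minimal sketch and not the Blazegraph source: the class `TokenBufferSketch`, its `Entry` type (a stand-in for TermFrequencyData), the `flushed` list (a stand-in for the indices), and the use of `String` document identifiers are all assumptions made for illustration. It shows that the buffer does not overflow until the capacity is exceeded, and that returning to an earlier field later produces a second, partial entry for that field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the buffering behavior; not the Blazegraph source. */
class TokenBufferSketch {

    /** Stand-in for TermFrequencyData: term counts for one {docId, fieldId} tuple. */
    static final class Entry {
        final String docId;
        final int fieldId;
        final Map<String, Integer> counts = new LinkedHashMap<>();

        Entry(String docId, int fieldId) {
            this.docId = docId;
            this.fieldId = fieldId;
        }
    }

    private final int capacity;
    private final List<Entry> buffer = new ArrayList<>();
    /** Stand-in for the indices: every entry flushed so far. */
    final List<Entry> flushed = new ArrayList<>();
    /** The field currently receiving tokens, or null if there is none. */
    private Entry current;

    TokenBufferSketch(int capacity) {
        this.capacity = capacity;
    }

    void add(String docId, int fieldId, String token) {
        boolean sameField = current != null
                && current.docId.equals(docId) && current.fieldId == fieldId;
        if (!sameField) {
            // Flush only when a new tuple would exceed the capacity.
            if (buffer.size() == capacity) {
                flush();
            }
            current = new Entry(docId, fieldId);
            buffer.add(current);
        }
        current.counts.merge(token, 1, Integer::sum);
    }

    void flush() {
        flushed.addAll(buffer);
        buffer.clear();
        current = null;
    }

    int size() {
        return buffer.size();
    }

    public static void main(String[] args) {
        TokenBufferSketch b = new TokenBufferSketch(2);
        b.add("doc1", 0, "quick");
        b.add("doc1", 0, "quick");
        b.add("doc1", 1, "fox");
        // At capacity, but not over it: nothing has been flushed yet.
        System.out.println(b.size() + " buffered, " + b.flushed.size() + " flushed");
        // A third tuple exceeds the capacity, so the first two are flushed first.
        b.add("doc2", 0, "dog");
        System.out.println(b.size() + " buffered, " + b.flushed.size() + " flushed");
        // Returning to (doc1, 0) starts a NEW entry; its counts are not merged
        // with the entry that was already flushed for the same field.
        b.add("doc1", 0, "quick");
        System.out.println(b.size() + " buffered, " + b.flushed.size() + " flushed");
    }
}
```

    In this model the last call leaves two separate entries for field 0 of `doc1`, one with a count of 2 and one with a count of 1, which is the partial term-frequency distribution that the note above warns about.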
  - flush
```
public void flush()
```
    Write any buffered data on the indices.
    Note: The writes on the terms index are scattered since the key for the index is {term, docId, fieldId}. This method will batch up and then apply a set of updates, but the total operation is not atomic. Therefore search results which are concurrent with indexing may not have access to the full data for concurrently indexed documents. This issue may be resolved by allowing the indexer to write ahead and using a historical commit time for the search.
    Note: If a document is pre-existing, then the existing data for that document MUST be removed unless you know that the fields to be found in the document will not have changed (they may have different contents, but the same fields exist in the old and new versions of the document).
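
    To see why the writes are scattered, consider how keys ordered as {term, docId, fieldId} sort. The sketch below is illustrative only: the class `ScatteredKeysSketch`, the string key encoding, and the sample terms are assumptions, and the real index uses `byte[][]` keys rather than strings. A sorted map stands in for the terms index.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of key ordering; not the Blazegraph key encoding. */
class ScatteredKeysSketch {

    /** Builds a sortable key in {term, docId, fieldId} order (illustrative encoding only). */
    static String key(String term, int docId, int fieldId) {
        return String.format("%s|%08d|%04d", term, docId, fieldId);
    }

    /** Returns a sorted map standing in for the terms index after two documents are written. */
    static TreeMap<String, Integer> buildIndex() {
        TreeMap<String, Integer> index = new TreeMap<>();
        // Document 1, field 0: three distinct terms.
        index.put(key("apple", 1, 0), 2);
        index.put(key("zebra", 1, 0), 1);
        index.put(key("banana", 1, 0), 1);
        // Document 2, field 0: one term.
        index.put(key("mango", 2, 0), 3);
        return index;
    }

    public static void main(String[] args) {
        // Keys sort by term first, so the entries for document 1 are not adjacent.
        for (Map.Entry<String, Integer> e : buildIndex().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

    Because the term is the leading key component, the three entries for document 1 land before and after the entry for document 2 instead of in one contiguous range, so indexing a single document touches many parts of the index.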
  - writeOnIndex
```
protected long writeOnIndex(int n,
                byte[][] keys,
                byte[][] vals)
```
    Writes on the index.
    
    Parameters:
    n -
    keys -
    vals -
    
    Returns:
    The #of pre-existing records that were updated.
  - deleteFromIndex
```
protected long deleteFromIndex(int n,
                   byte[][] keys,
                   byte[][] vals)
```
    Deletes from the index.
    
    Parameters:
    n -
    keys -
    vals -
    
    Returns:
    The #of pre-existing records that were updated.
