AbstractStatementBuffer (Blazegraph Database Platform 2.1.5 API)

java.lang.Object
- com.bigdata.rdf.rio.AbstractStatementBuffer<F,G>

Type Parameters:
F - The generic type of the source Statement added to the buffer by the callers.
G - The generic type of the BigdataStatements stored in the buffer.

All Implemented Interfaces:

IStatementBuffer<F>, IBuffer<F>

Direct Known Subclasses:

AbstractStatementBuffer.StatementBuffer2
```
public abstract class AbstractStatementBuffer<F extends org.openrdf.model.Statement,G extends BigdataStatement>
extends Object
implements IStatementBuffer<F>
```
Class for efficiently converting Statements into BigdataStatements, including resolving term identifiers (or adding entries to the lexicon for unknown terms) as required. The class does not write the converted BigdataStatements onto the database, but that can be easily done using a resolving iterator pattern.

Version:

$Id: AbstractStatementBuffer.java 6022 2012-02-13 17:55:15Z thompsonbry $

Author:

Bryan Thompson

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`AbstractStatementBuffer.StatementBuffer2<F extends org.openrdf.model.Statement,G extends BigdataStatement>` Loads `Statement`s into an RDF database.

Field Summary

Fields
Modifier and Type	Field and Description
`protected static boolean`	`DEBUG`
`protected static boolean`	`INFO`
`protected static org.apache.log4j.Logger`	`log`
`protected boolean`	`readOnly` When `true`, `Value`s will be resolved against the `LexiconRelation` and `Statement`s will be resolved against the `SPORelation`, but unknown `Value`s and unknown `Statement`s WILL NOT be inserted into the corresponding relations.
`protected G[]`	`statementBuffer` Buffer for accepted `BigdataStatement`s.

Constructor Summary

Constructors
Constructor and Description
`AbstractStatementBuffer(AbstractTripleStore db, boolean readOnly, int capacity)`

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`add(F e)` Imposes a canonical mapping on the subject, predicate, and object of the given `Statement` and stores a new `BigdataStatement` instance in the internal buffer.
`void`	`add(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o)` Add an "explicit" statement to the buffer with a "null" context.
`void`	`add(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o, org.openrdf.model.Resource c)` Add an "explicit" statement to the buffer.
`void`	`add(org.openrdf.model.Resource s, org.openrdf.model.URI p, org.openrdf.model.Value o, org.openrdf.model.Resource c, StatementEnum type)` Add a statement to the buffer.
`protected void`	`clear()` Clears the state associated with the `BigdataStatement`s in the internal buffer but does not discard the blank nodes or deferred statements.
`protected BigdataValue`	`convertValue(org.openrdf.model.Value value)` Return a canonical `BigdataValue` instance representing the given value.
`long`	`flush()` Converts any buffered statements and any deferred statements and then invokes `overflow()` to flush anything remaining in the buffer.
`AbstractTripleStore`	`getDatabase()` The database from the ctor.
`AbstractTripleStore`	`getStatementStore()` Note: Returns the same value as `getDatabase()` since the distinction is not captured by this class.
`BigdataValueFactory`	`getValueFactory()` The `ValueFactory` for `Statement`s and `Value`s created by this class.
`protected abstract int`	`handleProcessedStatements(G[] a)` Invoked by `overflow()`.
`boolean`	`isEmpty()` `true` if there are no buffered statements and no buffered deferred statements
`protected void`	`overflow()` Invoked each time the `statementBuffer` buffer would overflow.
`protected void`	`processBufferedValues()` Efficiently resolves/adds term identifiers for the buffered `BigdataValue`s.
`protected void`	`processDeferredStatements()` Processes any `BigdataStatement`s in the `deferredStatementBuffer`, adding them to the `statementBuffer`, which may cause the latter to `overflow()`.
`void`	`reset()` Discards all state (term map, bnodes, deferred statements, the buffered statements, and the counter whose value is reported by `flush()`).
`void`	`setBNodeMap(Map<String,BigdataBNode> bnodes)` Set the canonicalizing map for blank nodes based on their ID.
`int`	`size()` #of buffered statements plus the #of buffered statements that are being deferred.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - log
```
protected static final org.apache.log4j.Logger log
```
  - INFO
```
protected static final boolean INFO
```
  - DEBUG
```
protected static final boolean DEBUG
```
  - readOnly
```
protected final boolean readOnly
```
    When true, Values will be resolved against the LexiconRelation and Statements will be resolved against the SPORelation, but unknown Values and unknown Statements WILL NOT be inserted into the corresponding relations.
  - statementBuffer
```
protected final G[] statementBuffer
```
    Buffer for accepted BigdataStatements. This buffer is cleared each time it would overflow.
- Constructor Detail
  - AbstractStatementBuffer
```
public AbstractStatementBuffer(AbstractTripleStore db,
                       boolean readOnly,
                       int capacity)
```
    Parameters:
    db - The database against which the Values will be resolved (or added). If this database supports statement identifiers, then statement identifiers for the converted statements will be resolved (or added) to the lexicon.
    readOnly - When true, Values (and statement identifiers iff enabled) will be resolved against the LexiconRelation, but entries WILL NOT be inserted into the LexiconRelation for unknown Values (or for statement identifiers for unknown Statements when statement identifiers are enabled).
    capacity - The capacity of the backing buffer.
- Method Detail
  - getDatabase
```
public AbstractTripleStore getDatabase()
```
    The database from the ctor.
    
    Specified by:
    
    getDatabase in interface IStatementBuffer<F extends org.openrdf.model.Statement>
  - getStatementStore
```
public AbstractTripleStore getStatementStore()
```
    Note: Returns the same value as getDatabase() since the distinction is not captured by this class. This MUST be overridden in derived classes which make this distinction.
    
    Specified by:
    
    getStatementStore in interface IStatementBuffer<F extends org.openrdf.model.Statement>
  - getValueFactory
```
public BigdataValueFactory getValueFactory()
```
    The ValueFactory for Statements and Values created by this class.
  - setBNodeMap
```
public void setBNodeMap(Map<String,BigdataBNode> bnodes)
```
    Description copied from interface: IStatementBuffer
    
    Set the canonicalizing map for blank nodes based on their ID. This allows you to reuse the same map across multiple IStatementBuffer instances. For example, the BigdataSail does this so that the same bnode map is used throughout the life of a SailConnection. While RIO provides blank node correlation within a given source, it does NOT provide blank node correlation across sources. You need to use this method to do that.
    Note: It is reasonable to expect that the bnodes map is used by concurrent threads. For this reason, the map SHOULD be thread-safe. This can be accomplished either using Collections.synchronizedMap(Map) or a ConcurrentHashMap. However, implementations MUST still be synchronized on the map reference across operations which conditionally insert into the map in order to make that update atomic and thread-safe. Otherwise a race condition exists for the conditional insert and different threads could get incoherent answers.
    
    Specified by:
    
    setBNodeMap in interface IStatementBuffer<F extends org.openrdf.model.Statement>
    
    Parameters:
    bnodes - The blank nodes map.
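    The synchronization requirement in the note above can be illustrated with a minimal, self-contained sketch. It does not use the Blazegraph classes: `SketchBNode` and `BNodeMapSketch.resolve` are hypothetical names invented for this illustration, standing in for `BigdataBNode` and for whatever code performs the conditional insert.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for BigdataBNode; only the ID matters here.
final class SketchBNode {
    final String id;
    SketchBNode(String id) { this.id = id; }
}

class BNodeMapSketch {
    // A thread-safe map that could be shared across several buffers.
    static final Map<String, SketchBNode> bnodes =
            Collections.synchronizedMap(new HashMap<String, SketchBNode>());

    // The get-then-put pair is only atomic because both calls happen
    // while holding the lock on the map reference.
    static SketchBNode resolve(Map<String, SketchBNode> map, String id) {
        synchronized (map) {
            SketchBNode b = map.get(id);
            if (b == null) {
                b = new SketchBNode(id);
                map.put(id, b);
            }
            return b;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Two threads race to resolve the same blank node ID.
        Runnable task = () -> resolve(bnodes, "b1");
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(bnodes.size());
    }
}
```

    Without the `synchronized` block, both threads could observe `null` from `get` and each insert its own instance, which is the race condition the note warns about.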
  - convertValue
```
protected BigdataValue convertValue(org.openrdf.model.Value value)
```
    Return a canonical BigdataValue instance representing the given value. The scope of the canonical instance is until the next internal buffer overflow (URIs and Literals) or until flush() (BNodes, since blank nodes are global for a given source). The purpose of the canonicalizing mapping is to reduce the buffered BigdataValues to the minimum variety required to represent the buffered BigdataStatements, which improves throughput significantly (40%) when resolving terms to the corresponding term identifiers using the LexiconRelation.
    Note: This is not a true canonicalizing map when statement identifiers are used since values used in deferred statements will be held over until the buffer is flush()ed. This relaxation of the canonicalizing mapping is not a problem since the purpose of the mapping is to provide better throughput and nothing relies on a pure canonicalization of the Values.
    
    Parameters:
    value - A value.
    
    Returns:
    The corresponding canonical BigdataValue for the target BigdataValueFactory. This will be null iff the value is null (allows for the context to be undefined).
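    The idea of a canonicalizing mapping can be sketched in a few lines. This is an illustration of the general technique only, not the class's actual implementation: `CanonicalizingSketch`, `Term`, `canonical`, and `distinctCount` are hypothetical names, and `Term` stands in for a `BigdataValue`.

```java
import java.util.HashMap;
import java.util.Map;

class CanonicalizingSketch {
    // Hypothetical stand-in for a value; equality is by label.
    static final class Term {
        final String label;
        Term(String label) { this.label = label; }
        @Override public boolean equals(Object o) {
            return o instanceof Term && ((Term) o).label.equals(label);
        }
        @Override public int hashCode() { return label.hashCode(); }
    }

    // Maps each distinct value to the first equal instance seen.
    private final Map<Term, Term> distinct = new HashMap<>();

    // Returns the canonical instance; null maps to null, mirroring the
    // "null iff the value is null" contract described above.
    Term canonical(Term t) {
        if (t == null) return null;
        Term existing = distinct.putIfAbsent(t, t);
        return existing != null ? existing : t;
    }

    // Number of distinct values that would need to be resolved.
    int distinctCount() { return distinct.size(); }

    // Ends the scope of the canonical instances (e.g. on overflow).
    void clear() { distinct.clear(); }
}
```

    Because repeated values collapse onto one instance, a later bulk lookup only has to handle `distinctCount()` values rather than one per occurrence.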
  - isEmpty
```
public boolean isEmpty()
```
    true if there are no buffered statements and no buffered deferred statements
    
    Specified by:
    
    isEmpty in interface IBuffer<F extends org.openrdf.model.Statement>
  - size
```
public int size()
```
    #of buffered statements plus the #of buffered statements that are being deferred.
    
    Specified by:
    
    size in interface IBuffer<F extends org.openrdf.model.Statement>
  - add
```
public void add(F e)
```
    Imposes a canonical mapping on the subject, predicate, and object of the given Statement and stores a new BigdataStatement instance in the internal buffer. If the given statement is a BigdataStatement then its StatementEnum will be used. Otherwise the new statement will be StatementEnum.Explicit.
    Note: Unlike the Values, a canonicalizing mapping is NOT imposed for the statements. This is because, unlike the Values, there tends to be little duplication in Statements when processing RDF.
    
    Specified by:
    
    add in interface IStatementBuffer<F extends org.openrdf.model.Statement>
    
    Specified by:
    
    add in interface IBuffer<F extends org.openrdf.model.Statement>
    
    Parameters:
    e - The statement. If it implements BigdataStatement then its StatementEnum will be used (this makes it possible to load axioms into the database as axioms), but the term identifiers on the statement's values will be ignored.
  - add
```
public void add(org.openrdf.model.Resource s,
       org.openrdf.model.URI p,
       org.openrdf.model.Value o)
```
    Description copied from interface: IStatementBuffer
    
    Add an "explicit" statement to the buffer with a "null" context.
    
    Specified by:
    
    add in interface IStatementBuffer<F extends org.openrdf.model.Statement>
    
    Parameters:
    s - The subject.
    p - The predicate.
    o - The object.
  - add
```
public void add(org.openrdf.model.Resource s,
       org.openrdf.model.URI p,
       org.openrdf.model.Value o,
       org.openrdf.model.Resource c)
```
    Description copied from interface: IStatementBuffer
    
    Add an "explicit" statement to the buffer.
    
    Specified by:
    
    add in interface IStatementBuffer<F extends org.openrdf.model.Statement>
    
    Parameters:
    s - The subject.
    p - The predicate.
    o - The object.
    c - The context (optional).
  - add
```
public void add(org.openrdf.model.Resource s,
       org.openrdf.model.URI p,
       org.openrdf.model.Value o,
       org.openrdf.model.Resource c,
       StatementEnum type)
```
    Description copied from interface: IStatementBuffer
    
    Add a statement to the buffer.
    Note: The context parameter (c) is NOT used. The database at this time is either a triple store or a triple store with statement identifiers, and in neither case is the context used.
    
    Specified by:
    
    add in interface IStatementBuffer<F extends org.openrdf.model.Statement>
    
    Parameters:
    s - The subject.
    p - The predicate.
    o - The object.
    c - The context (optional).
    type - The statement type (optional).
  - processBufferedValues
```
protected void processBufferedValues()
```
    Efficiently resolves/adds term identifiers for the buffered BigdataValues.
    If readOnly is true, then the term identifier for unknown values will remain IRawTripleStore#NULL.
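    The difference between resolving and adding can be shown with a small sketch of a lexicon lookup. Everything here is a hypothetical stand-in: `LexiconSketch` replaces the `LexiconRelation`, plain strings replace values, and the `NULL` constant is an assumed sentinel playing the role of `IRawTripleStore#NULL`.

```java
import java.util.HashMap;
import java.util.Map;

class LexiconSketch {
    // Assumed sentinel for "no term identifier assigned".
    static final long NULL = 0L;

    private final Map<String, Long> termIds = new HashMap<>();
    private long nextId = 1L;

    // Returns the term identifier for a known term. For an unknown term,
    // assigns a new identifier unless readOnly is set, in which case the
    // lexicon is left unchanged and NULL is returned.
    long resolve(String term, boolean readOnly) {
        Long id = termIds.get(term);
        if (id != null) return id;
        if (readOnly) return NULL;
        long assigned = nextId++;
        termIds.put(term, assigned);
        return assigned;
    }

    // Number of terms currently in the lexicon.
    int size() { return termIds.size(); }
}
```

    In this sketch a read-only lookup never grows the lexicon, so callers have to be prepared to see `NULL` for terms that have not been written yet.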
  - processDeferredStatements
```
protected void processDeferredStatements()
```
    Processes any BigdataStatements in the deferredStatementBuffer, adding them to the statementBuffer, which may cause the latter to overflow().
  - overflow
```
protected final void overflow()
```
    Invoked each time the statementBuffer buffer would overflow. This method is responsible for bulk resolving / adding the buffered BigdataValues against the db and adding the fully resolved BigdataStatements to the queue on which the iterator is reading.
  - handleProcessedStatements
```
protected abstract int handleProcessedStatements(G[] a)
```
    Invoked by overflow().
    
    Parameters:
    a - An array of processed BigdataStatements.
    
    Returns:
    The delta that will be added to the counter reported by flush().
  - flush
```
public long flush()
```
    Converts any buffered statements and any deferred statements and then invokes overflow() to flush anything remaining in the buffer.
    
    Specified by:
    
    flush in interface IBuffer<F extends org.openrdf.model.Statement>
    
    Returns:
    The total #of converted statements processed so far. (The counter is reset to zero as a side-effect.)
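    The interplay between overflow(), handleProcessedStatements(G[]) and flush() can be sketched with a simplified, self-contained buffer. This is an illustration under stated assumptions rather than the class's implementation: `BufferSketch`, `CollectingBuffer`, `handleProcessed`, and `sink` are hypothetical names, a `List` replaces the `G[]` array, and value resolution and deferred statements are omitted.

```java
import java.util.ArrayList;
import java.util.List;

abstract class BufferSketch<T> {
    private final int capacity;
    private final List<T> buffer = new ArrayList<>();
    private long counter = 0L;

    BufferSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    // Buffers an element, draining the buffer first if it is full.
    void add(T e) {
        if (buffer.size() == capacity) overflow();
        buffer.add(e);
    }

    // Hands the buffered elements to the subclass and accumulates its delta.
    private void overflow() {
        if (buffer.isEmpty()) return;
        List<T> chunk = new ArrayList<>(buffer);
        buffer.clear();
        counter += handleProcessed(chunk);
    }

    // Returns the delta added to the counter reported by flush().
    protected abstract int handleProcessed(List<T> chunk);

    // Drains what remains, reports the total, and resets the counter.
    long flush() {
        overflow();
        long total = counter;
        counter = 0L;
        return total;
    }
}

// Example subclass that simply collects everything it is handed.
class CollectingBuffer extends BufferSketch<String> {
    final List<String> sink = new ArrayList<>();

    CollectingBuffer(int capacity) { super(capacity); }

    @Override
    protected int handleProcessed(List<String> chunk) {
        sink.addAll(chunk);
        return chunk.size();
    }
}
```

    With a capacity of 2, adding three elements triggers one overflow on the third add, and a following flush() drains the last element and returns 3. A second flush() then returns 0 because the counter was reset.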
  - reset
```
public void reset()
```
    Discards all state (term map, bnodes, deferred statements, the buffered statements, and the counter whose value is reported by flush()).
    
    Specified by:
    
    reset in interface IBuffer<F extends org.openrdf.model.Statement>
  - clear
```
protected void clear()
```
    Clears the state associated with the BigdataStatements in the internal buffer but does not discard the blank nodes or deferred statements.
