ParseOp (Blazegraph Database Platform 2.1.5 API)

java.lang.Object
- com.bigdata.bop.CoreBaseBOp
  - com.bigdata.bop.BOpBase
    - com.bigdata.bop.PipelineOp
      - com.bigdata.bop.rdf.update.ParseOp

All Implemented Interfaces:

BOp, IPropertySet, Serializable, Cloneable
```
public class ParseOp
extends PipelineOp
```
This operator parses an RDF data source and writes bindings that represent statements onto the output sink. It is compatible with ChunkedResolutionOp and InsertStatementsOp.
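To make "bindings which represent statements" concrete, the following is a minimal, self-contained sketch that does not use the Blazegraph classes. It assumes each parsed statement becomes one solution keyed by the variable names s, p, o, and c, with c left unbound for triples. The `Statement` class, `toBindings`, and `toChunk` are hypothetical names introduced only for illustration; the real operator works with `Var` and `IBindingSet`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StatementBindingsSketch {

    /** Hypothetical stand-in for a parsed RDF statement; c is null for a triple. */
    static final class Statement {
        final String s, p, o, c;

        Statement(String s, String p, String o, String c) {
            this.s = s;
            this.p = p;
            this.o = o;
            this.c = c;
        }
    }

    /** Map one statement onto a solution keyed by the variable names s, p, o, c. */
    static Map<String, String> toBindings(Statement st) {
        Map<String, String> bindings = new LinkedHashMap<>();
        bindings.put("s", st.s);
        bindings.put("p", st.p);
        bindings.put("o", st.o);
        if (st.c != null) {
            // Only quads carry a context; triples leave c unbound.
            bindings.put("c", st.c);
        }
        return bindings;
    }

    /** Convert a batch of statements into a chunk of solutions for a downstream sink. */
    static List<Map<String, String>> toChunk(List<Statement> statements) {
        List<Map<String, String>> chunk = new ArrayList<>();
        for (Statement st : statements) {
            chunk.add(toBindings(st));
        }
        return chunk;
    }

    public static void main(String[] args) {
        List<Statement> parsed = new ArrayList<>();
        parsed.add(new Statement("ex:alice", "foaf:knows", "ex:bob", null));
        parsed.add(new Statement("ex:bob", "foaf:name", "\"Bob\"", "ex:graph1"));
        for (Map<String, String> solution : toChunk(parsed)) {
            System.out.println(solution);
        }
    }
}
```

Modeling statements as ordinary solutions is what would let downstream operators such as ChunkedResolutionOp and InsertStatementsOp consume the parser output through the normal pipeline.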

Version:

$Id$

TODO Examine the integration point for Truth Maintenance (TM). DataLoader.ClosureEnum and DataLoader.CommitEnum shape the way in which the update plan is generated. They are not options on the ParseOp itself. We need to set up the assertion and retraction buffers so that they have the appropriate scope or, for database-at-once closure, skip those buffers and recompute the closure of the database afterwards. The assertion buffers might be populated after the IV resolution step and before we write on the indices. We then compute the fixed point of the closure over the delta and write that onto the database. We should be able to specify that some sources contain data to be removed (INSERT DATA and REMOVE DATA or UNLOAD src). The operation should combine assertions and retractions to be efficient. See DataLoader.

TODO Add an operator which handles a zip archive, creating a LOAD for each resource in that archive. Recursive directory processing is similar. Both should result in multiple ParseOp instances which can run in parallel. Those ParseOp instances will feed the IV resolution, optional TM, and statement writer operations. If we can make the SOURCE_URI a value expression, then we could flow solutions into the LOAD operation which would be the bindings for the source URI. Then we could hash partition the LOAD operator across a cluster and do a parallel load very easily. If the source for those solutions was the parse of a single RDF file (or streamed URI) containing the files to be loaded, then we would also gain the indirection necessary to load large numbers of files in parallel on a cluster.

TODO In at least the SIDS mode, we need to do some special operations when the statement buffer is flushed. That statement buffer could be fed either directly by the ParseOp or indirectly through solutions modeling statements flowing through the query engine. I am inclined to the latter for better parallelism. Even though there is more on the heap and more latency within the stages, I think that we will get more out of the increased parallelism.

TODO Any annotation here should be configurable from the LoadGraph AST node and (ideally) the SPARQL UPDATE syntax.

FIXME This does not handle SIDS. The StatementBuffer logic needs to get into InsertStatementsOp for that to work, or the plan needs to be slightly different and hit a different insert operator for statements altogether.

FIXME This does not handle Truth Maintenance.

Author:

Bryan Thompson

See Also:
PresortRioLoader, StatementBuffer, DataLoader, DataLoader.Options, RDFParserOptions, DataLoader.ClosureEnum, DataLoader.CommitEnum, Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static interface`	`ParseOp.Annotations` Note: `BOp.Annotations#TIMEOUT` is respected to limit the read time on an HTTP connection.

Field Summary

Fields
Modifier and Type	Field and Description
`protected static Var<?>`	`c` The s, p, o, and c variable names.
`protected static Var<?>`	`o` The s, p, o, and c variable names.
`protected static Var<?>`	`p` The s, p, o, and c variable names.
`protected static Var<?>`	`s` The s, p, o, and c variable names.

Fields inherited from class com.bigdata.bop.CoreBaseBOp
DEFAULT_INITIAL_CAPACITY

Fields inherited from interface com.bigdata.bop.BOp
NOANNS, NOARGS

Constructor Summary

Constructors
Constructor and Description
`ParseOp(BOp[] args, Map<String,Object> annotations)`
`ParseOp(ParseOp op)`

Method Summary

Methods
Modifier and Type	Method and Description
`FutureTask<Void>`	`eval(BOpContext<IBindingSet> context)` Return a `FutureTask` which computes the operator against the evaluation context.
`ParserStats`	`newStats()` Return a new object which can be used to collect statistics on the operator evaluation.

Methods inherited from class com.bigdata.bop.PipelineOp
assertAtOnceJavaHeapOp, assertMaxParallelOne, getChunkCapacity, getChunkOfChunksCapacity, getChunkTimeout, getMaxMemory, getMaxParallel, isAtOnceEvaluation, isBlockedEvaluation, isLastPassRequested, isPipelinedEvaluation, isReorderSolutions, isSharedState

Methods inherited from class com.bigdata.bop.BOpBase
__replaceArg, _clearProperty, _set, _setProperty, annotations, annotationsCopy, annotationsEqual, annotationsRef, argIterator, args, argsCopy, arity, clearAnnotations, clearProperty, deepCopy, deepCopy, get, getProperty, setArg, setProperty, setUnboundProperty, toArray, toArray

Methods inherited from class com.bigdata.bop.CoreBaseBOp
annotationsEqual, annotationsToString, annotationsToString, annotationValueToString, checkArgs, clone, equals, getEvaluationContext, getId, getProperty, getRequiredProperty, hashCode, indent, isController, mutation, shortenName, toShortString, toString, toString

Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait

- Field Detail
  - s
```
protected static final Var<?> s
```
    The s, p, o, and c variable names.
  - p
```
protected static final Var<?> p
```
    The s, p, o, and c variable names.
  - o
```
protected static final Var<?> o
```
    The s, p, o, and c variable names.
  - c
```
protected static final Var<?> c
```
    The s, p, o, and c variable names.
- Constructor Detail
  - ParseOp
```
public ParseOp(BOp[] args,
       Map<String,Object> annotations)
```
  - ParseOp
```
public ParseOp(ParseOp op)
```
- Method Detail
  - newStats
```
public ParserStats newStats()
```
    Description copied from class: PipelineOp
    
    Return a new object which can be used to collect statistics on the operator evaluation. This may be overridden to return a more specific class depending on the operator.
    
    Overrides:
    
    newStats in class PipelineOp
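The note that newStats "may be overridden to return a more specific class" relies on Java's covariant return types. The sketch below shows that language mechanism in isolation. Every class name in it (`BaseStats`, `ParseStats`, `BaseOperator`, `ParsingOperator`) is a hypothetical stand-in, not a Blazegraph type.

```java
public class CovariantStatsSketch {

    /** Hypothetical generic statistics collected for any operator. */
    static class BaseStats {
        long chunksIn;
    }

    /** Hypothetical parser-specific statistics with an extra counter. */
    static class ParseStats extends BaseStats {
        long statementsParsed;
    }

    /** Hypothetical base operator that hands out generic statistics. */
    static class BaseOperator {
        BaseStats newStats() {
            return new BaseStats();
        }
    }

    /** The override narrows the return type, so callers need no cast. */
    static class ParsingOperator extends BaseOperator {
        @Override
        ParseStats newStats() {
            return new ParseStats();
        }
    }

    public static void main(String[] args) {
        ParsingOperator op = new ParsingOperator();
        ParseStats stats = op.newStats(); // no cast required
        stats.statementsParsed += 3;
        System.out.println(stats.statementsParsed);
    }
}
```

Code that only knows the base operator type still receives the specialized object at run time, so generic bookkeeping and operator-specific counters can share one instance.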
  - eval
```
public FutureTask<Void> eval(BOpContext<IBindingSet> context)
```
    Description copied from class: PipelineOp
    
    Return a FutureTask which computes the operator against the evaluation context. The caller is responsible for executing the FutureTask (this gives them the ability to hook the completion of the computation).
    
    Specified by:
    
    eval in class PipelineOp
    
    Parameters:
    context - The evaluation context.
    
    Returns:
    The FutureTask which will compute the operator's evaluation.
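The contract described for eval (the operator returns a FutureTask and the caller executes it) can be illustrated with plain `java.util.concurrent` classes. This is a sketch under stated assumptions, not the query engine's actual code: the `eval` method here is a hypothetical stand-in that only increments a counter, and `runAndHook` is an illustrative helper showing one way a caller might attach its own completion logic.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class FutureTaskHookSketch {

    /** Hypothetical operator: builds the task but does NOT run it. */
    static FutureTask<Void> eval(AtomicInteger evaluations) {
        return new FutureTask<Void>(() -> {
            evaluations.incrementAndGet(); // stands in for the real work
            return null;
        });
    }

    /** Caller side: execute the task and run a hook once it has finished. */
    static void runAndHook(FutureTask<Void> task, Runnable onComplete) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> wrapper = executor.submit(() -> {
                try {
                    task.run();
                } finally {
                    onComplete.run(); // the caller's completion hook
                }
            });
            wrapper.get(); // wait for both the task and the hook
            task.get();    // surfaces any exception thrown by the task
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicInteger evaluations = new AtomicInteger();
        AtomicBoolean hooked = new AtomicBoolean();
        FutureTask<Void> task = eval(evaluations);
        runAndHook(task, () -> hooked.set(true));
        System.out.println("evaluations=" + evaluations.get() + ", hooked=" + hooked.get());
    }
}
```

Returning an unstarted task rather than running it internally leaves scheduling, cancellation, and completion handling with the caller.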
