DataLoader.Options (Blazegraph Database Platform 2.1.5 API)

All Superinterfaces:

RDFParserOptions.Options

All Known Subinterfaces:

AbstractTripleStore.Options, BigdataSail.Options, LocalTripleStore.Options, TempTripleStore.Options

Enclosing class:

DataLoader
```
public static interface DataLoader.Options
extends RDFParserOptions.Options
```
Options for the DataLoader. Note: The default for RDFParserOptions.Options#PRESERVE_BNODE_IDS is conditionally overridden when LexiconRelation.isStoreBlankNodes() is true.

Author:

Bryan Thompson

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`BUFFER_CAPACITY` Optional property specifying the capacity of the `StatementBuffer` (default is "100000" statements).
`static String`	`CLOSURE` Optional property controlling whether and when the RDFS(+) closure is maintained on the database as documents are loaded (the default is given by `DEFAULT_CLOSURE`).
`static String`	`COMMIT` Optional property specifying whether and when the `DataLoader` will `ITripleStore.commit()` the database (the default is given by `DEFAULT_COMMIT`).
`static String`	`DEFAULT_BUFFER_CAPACITY`
`static String`	`DEFAULT_CLOSURE`
`static String`	`DEFAULT_COMMIT`
`static String`	`DEFAULT_DUMP_JOURNAL` The default value (`false`) for `DUMP_JOURNAL`.
`static String`	`DEFAULT_DURABLE_QUEUES` The default value (`false`) for `DURABLE_QUEUES`.
`static String`	`DEFAULT_FLUSH` The default value (`true`) for `FLUSH`.
`static int`	`DEFAULT_GZIP_BUFFER_SIZE`
`static String`	`DEFAULT_IGNORE_INVALID_FILES` The default value (`false`) for `IGNORE_INVALID_FILES`.
`static String`	`DEFAULT_QUEUE_CAPACITY`
`static String`	`DEFAULT_VERBOSE` The default value (`0`) for `VERBOSE`.
`static String`	`DUMP_JOURNAL` When true, runs DumpJournal after each commit (with the -pages option) to obtain a distribution of the BTree index page sizes.
`static String`	`DURABLE_QUEUES` When `true`, the data loader will rename each file as it is processed to either `file.good` or `file.fail` to indicate success or failure.
`static String`	`FLUSH` When `true` the `StatementBuffer` is flushed by each `DataLoader.loadData(String, String, RDFFormat)` or `DataLoader.loadData(String[], String[], RDFFormat[])` operation and when `DataLoader.doClosure()` is requested.
`static String`	`GZIP_BUFFER_SIZE` Java property to override the default GZIP buffer size used for `GZIPInputStream` and `GZIPOutputStream`.
`static String`	`IGNORE_INVALID_FILES` When `true`, the loader will not break on unresolvable parse errors, but instead skip the file containing the error.
`static String`	`QUEUE_CAPACITY` Optional property specifying the capacity of the blocking queue used by the `StatementBuffer` -or- ZERO (0) to disable the blocking queue and perform synchronous writes (default is "10").
`static String`	`VERBOSE` When greater than ZERO (0), significant information may be reported at each commit point.

Fields inherited from interface com.bigdata.rdf.rio.RDFParserOptions.Options
DATATYPE_HANDLING, DEFAULT_DATATYPE_HANDLING, DEFAULT_PRESERVE_BNODE_IDS, DEFAULT_STOP_AT_FIRST_ERROR, DEFAULT_VERIFY_DATA, PRESERVE_BNODE_IDS, STOP_AT_FIRST_ERROR, VERIFY_DATA

- Field Detail
  - GZIP_BUFFER_SIZE
```
static final String GZIP_BUFFER_SIZE
```
    Java property to override the default GZIP buffer size used for GZIPInputStream and GZIPOutputStream. The value is the buffer size in bytes; the default is 65535. Example: -Dcom.bigdata.journal.DataLoader.gzipBufferSize=65535. See BLZG-1777.
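    The following is a minimal sketch of how an application could mirror this setting in its own code; it is not the DataLoader implementation. The class name `GzipBufferSizeSketch` and its helper methods are hypothetical. The property name and the default of 65535 are taken from the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBufferSizeSketch {

    // Property name copied from the -D example in the description above.
    static final String GZIP_BUFFER_SIZE = "com.bigdata.journal.DataLoader.gzipBufferSize";

    // Default buffer size in bytes, as documented above.
    static final int DEFAULT_GZIP_BUFFER_SIZE = 65535;

    /** Returns the system property value if it is set, otherwise the default. */
    static int resolveBufferSize() {
        return Integer.getInteger(GZIP_BUFFER_SIZE, DEFAULT_GZIP_BUFFER_SIZE);
    }

    /** Compresses data using a GZIPOutputStream with the given buffer size. */
    static byte[] compress(byte[] data, int bufferSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(sink, bufferSize)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Decompresses data using a GZIPInputStream with the given buffer size. */
    static byte[] decompress(byte[] compressed, int bufferSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                new GZIPInputStream(new ByteArrayInputStream(compressed), bufferSize)) {
            byte[] chunk = new byte[4096];
            for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
                sink.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        int size = resolveBufferSize();
        byte[] original = "<s> <p> <o> .".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decompress(compress(original, size), size);
        System.out.println("buffer size = " + size + ", bytes restored = " + roundTrip.length);
    }
}
```

    Running the sketch with `-Dcom.bigdata.journal.DataLoader.gzipBufferSize=131072` would make `resolveBufferSize()` return the overridden value instead of the default.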
  - DEFAULT_GZIP_BUFFER_SIZE
```
static final int DEFAULT_GZIP_BUFFER_SIZE
```
    See Also:
    Constant Field Values
  - COMMIT
```
static final String COMMIT
```
    Optional property specifying whether and when the DataLoader will ITripleStore.commit() the database (the default is given by DEFAULT_COMMIT).
    Note: commit semantics vary depending on the specific backing store. See ITripleStore.commit().
  - DEFAULT_COMMIT
```
static final String DEFAULT_COMMIT
```
  - BUFFER_CAPACITY
```
static final String BUFFER_CAPACITY
```
    Optional property specifying the capacity of the StatementBuffer (default is "100000" statements).
    Note: With BLZG-1522, the QUEUE_CAPACITY can significantly increase the effective amount of data that is being buffered. Caution is recommended when overriding the BUFFER_CAPACITY in combination with a non-zero value of the QUEUE_CAPACITY. The best performance will probably come from small (20k - 50k) buffer capacity values combined with a QUEUE_CAPACITY of 5-20. Larger values will increase the GC burden and could require a larger heap, but the net throughput might also increase.
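    To make the interaction between the two settings concrete, the sketch below computes a rough upper bound on the number of buffered statements. It rests on an assumption that is not stated in this documentation: that each queue slot can hold one full buffer while one more buffer is being filled. The property keys are placeholders; real code would use the `DataLoader.Options.BUFFER_CAPACITY` and `DataLoader.Options.QUEUE_CAPACITY` constants.

```java
import java.util.Properties;

public class BufferTuningSketch {

    // Placeholder keys (assumption): real code would use the
    // DataLoader.Options.BUFFER_CAPACITY and QUEUE_CAPACITY constants.
    static final String BUFFER_CAPACITY = "example.bufferCapacity";
    static final String QUEUE_CAPACITY = "example.queueCapacity";

    // Defaults as documented above: "100000" statements and a queue capacity of "10".
    static final String DEFAULT_BUFFER_CAPACITY = "100000";
    static final String DEFAULT_QUEUE_CAPACITY = "10";

    /**
     * Rough upper bound on buffered statements, assuming one buffer being
     * filled plus one full buffer per queue slot (an illustrative assumption).
     */
    static long estimateBufferedStatements(Properties p) {
        long buffer = Long.parseLong(p.getProperty(BUFFER_CAPACITY, DEFAULT_BUFFER_CAPACITY));
        long queue = Long.parseLong(p.getProperty(QUEUE_CAPACITY, DEFAULT_QUEUE_CAPACITY));
        return buffer * (queue + 1);
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        System.out.println("defaults: " + estimateBufferedStatements(defaults));

        // A smaller buffer in the recommended 20k - 50k range with the same queue capacity.
        Properties tuned = new Properties();
        tuned.setProperty(BUFFER_CAPACITY, "20000");
        tuned.setProperty(QUEUE_CAPACITY, "10");
        System.out.println("tuned:    " + estimateBufferedStatements(tuned));
    }
}
```

    Under that assumption the defaults allow up to 1,100,000 statements in flight, while a 20,000-statement buffer with the same queue capacity allows 220,000, which is one way to read the recommendation for smaller buffers.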
  - DEFAULT_BUFFER_CAPACITY
```
static final String DEFAULT_BUFFER_CAPACITY
```
    See Also:
    Constant Field Values
  - QUEUE_CAPACITY
```
static final String QUEUE_CAPACITY
```
    Optional property specifying the capacity of the blocking queue used by the StatementBuffer -or- ZERO (0) to disable the blocking queue and perform synchronous writes (default is "10"). The blocking queue holds parsed data pending writes onto the backing store and makes it possible for the parser to race ahead while the writer is blocked writing onto the database indices.
    
    See Also:
    BLZG-1552
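    The producer/consumer arrangement described above can be sketched with a standard `ArrayBlockingQueue`. This is an illustration of the general technique, not the StatementBuffer implementation; the class `QueueCapacitySketch` and its `load` and `write` methods are hypothetical, and a "batch" here is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueCapacitySketch {

    // Sentinel batch that tells the writer thread to stop (compared by identity).
    private static final List<String> POISON = new ArrayList<>();

    /** Stand-in for writing a batch onto the backing store; returns statements written. */
    static int write(List<String> batch) {
        return batch.size();
    }

    /**
     * Hands parsed batches to a writer. With queueCapacity == 0 the writes are
     * synchronous; otherwise the caller may race ahead by up to queueCapacity batches.
     */
    static int load(List<List<String>> batches, int queueCapacity) {
        if (queueCapacity == 0) {
            int written = 0;
            for (List<String> batch : batches) {
                written += write(batch);
            }
            return written;
        }
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
        int[] written = {0};
        Thread writer = new Thread(() -> {
            try {
                for (List<String> b = queue.take(); b != POISON; b = queue.take()) {
                    written[0] += write(b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            for (List<String> batch : batches) {
                queue.put(batch); // blocks only when the queue is full
            }
            queue.put(POISON);
            writer.join(); // join() makes the writer's updates visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        }
        return written[0];
    }

    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<>();
        batches.add(List.of("s1", "s2"));
        batches.add(List.of("s3"));
        System.out.println("async: " + load(batches, 10));
        System.out.println("sync:  " + load(batches, 0));
    }
}
```

    Both modes write the same statements; the queue only changes how far the producer can run ahead of the writer, which is why larger capacities trade memory for throughput.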
  - DEFAULT_QUEUE_CAPACITY
```
static final String DEFAULT_QUEUE_CAPACITY
```
    See Also:
    Constant Field Values
  - CLOSURE
```
static final String CLOSURE
```
    Optional property controlling whether and when the RDFS(+) closure is maintained on the database as documents are loaded (the default is given by DEFAULT_CLOSURE).
    Note: The InferenceEngine supports a variety of options. When closure is enabled, the caller's Properties will be used to configure an InferenceEngine object to compute the entailments. It is VITAL that the InferenceEngine is always configured in the same manner for a given database with regard to options that control which entailments are computed using forward chaining and which entailments are computed using backward chaining.
    Note: When closure is being maintained the caller's Properties will also be used to provision the TempTripleStore.
    
    See Also:
    InferenceEngine, InferenceEngine.Options
  - DEFAULT_CLOSURE
```
static final String DEFAULT_CLOSURE
```
  - FLUSH
```
static final String FLUSH
```
    When true the StatementBuffer is flushed by each DataLoader.loadData(String, String, RDFFormat) or DataLoader.loadData(String[], String[], RDFFormat[]) operation and when DataLoader.doClosure() is requested. When false the caller is responsible for flushing the DataLoader.buffer. The default is "true".
    This behavior MAY be disabled if you want to chain load a bunch of small documents without flushing to the backing store after each document and DataLoader.loadData(String[], String[], RDFFormat[]) is not well-suited to your purposes. This can be much more efficient, approximating the throughput for large document loads. However, the caller MUST invoke DataLoader.endSource() (or DataLoader.doClosure() if appropriate) once all documents are loaded successfully. If an error occurs during the processing of one or more documents then the entire data load should be discarded (this is always true).
    This feature is most useful when blank nodes are not in use. When blank nodes are in use, it causes memory to grow and, when statement identifiers are enabled, forces statements using blank nodes to be deferred until the application flushes the DataLoader.
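    The chain-loading contract described above (load many small documents, flush once at the end) can be sketched as follows. `BufferingLoader` is a hypothetical in-memory stand-in, not the Blazegraph DataLoader; its `loadData` takes a list of strings rather than the real method signatures, and only the flush-on-load versus flush-at-`endSource()` behavior is modeled.

```java
import java.util.ArrayList;
import java.util.List;

public class ChainLoadSketch {

    /** Hypothetical stand-in for a loader with a FLUSH-like switch. */
    static class BufferingLoader {
        private final boolean flush;
        private final List<String> buffer = new ArrayList<>();
        final List<String> store = new ArrayList<>();
        int flushCount = 0;

        BufferingLoader(boolean flush) {
            this.flush = flush;
        }

        /** Buffers one document; flushes immediately only when flush is true. */
        void loadData(List<String> statements) {
            buffer.addAll(statements);
            if (flush) {
                flushBuffer();
            }
        }

        /** Must be called by the caller once all documents are loaded. */
        void endSource() {
            flushBuffer();
        }

        private void flushBuffer() {
            if (buffer.isEmpty()) {
                return;
            }
            store.addAll(buffer);
            buffer.clear();
            flushCount++;
        }
    }

    /** Loads every document, then signals the end of the source exactly once. */
    static BufferingLoader chainLoad(boolean flush, List<List<String>> documents) {
        BufferingLoader loader = new BufferingLoader(flush);
        for (List<String> doc : documents) {
            loader.loadData(doc);
        }
        loader.endSource();
        return loader;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(List.of("a"), List.of("b"), List.of("c"));
        System.out.println("flush=true  -> flushes: " + chainLoad(true, docs).flushCount);
        System.out.println("flush=false -> flushes: " + chainLoad(false, docs).flushCount);
    }
}
```

    With three one-statement documents, the stand-in flushes three times when flushing per load and once when flushing is deferred, which is the saving the option is meant to provide.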
  - DEFAULT_FLUSH
```
static final String DEFAULT_FLUSH
```
    The default value (true) for FLUSH.
    
    See Also:
    Constant Field Values
  - IGNORE_INVALID_FILES
```
static final String IGNORE_INVALID_FILES
```
    When true, the loader will not break on unresolvable parse errors, but instead skip the file containing the error. This option is useful when loading large input that may contain invalid RDF, so that the loading process does not fail as a whole when invalid files are detected. Note that an error will still be logged for each file that cannot be loaded, so the failed files can be tracked.
    
    See Also:
    (Add option to make the DataLoader robust to files that cause rio to throw a fatal exception)
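    The skip-and-log behavior described above can be sketched as a simple loop. This is an illustration under assumptions, not the DataLoader code: the class `IgnoreInvalidFilesSketch`, the `loadAll` method, and the use of a `Consumer<String>` as the parser are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class IgnoreInvalidFilesSketch {

    /**
     * Parses each file in turn. When ignoreInvalidFiles is true, a failing file
     * is logged and skipped; otherwise the first failure aborts the load.
     * Returns the files that were skipped.
     */
    static List<String> loadAll(List<String> files, Consumer<String> parser,
                                boolean ignoreInvalidFiles) {
        List<String> failed = new ArrayList<>();
        for (String file : files) {
            try {
                parser.accept(file);
            } catch (RuntimeException e) {
                if (!ignoreInvalidFiles) {
                    throw e;
                }
                // The error is still reported so the failed files can be tracked.
                System.err.println("Skipping " + file + ": " + e.getMessage());
                failed.add(file);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical parser that rejects any file whose name ends in ".bad".
        Consumer<String> parser = f -> {
            if (f.endsWith(".bad")) {
                throw new IllegalArgumentException("parse error");
            }
        };
        List<String> failed = loadAll(List.of("a.nt", "b.bad", "c.nt"), parser, true);
        System.out.println("skipped: " + failed);
    }
}
```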
  - DEFAULT_IGNORE_INVALID_FILES
```
static final String DEFAULT_IGNORE_INVALID_FILES
```
    The default value (false) for IGNORE_INVALID_FILES.
    
    See Also:
    Constant Field Values
  - DURABLE_QUEUES
```
static final String DURABLE_QUEUES
```
    When true, the data loader will rename each file as it is processed to either file.good or file.fail to indicate success or failure. In addition, the default for IGNORE_INVALID_FILES will be true and the default for RDFParserOptions.getStopAtFirstError() will be false.
    
    See Also:
    (durable queues)
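    A minimal sketch of the renaming scheme and the dependent default follows. It assumes that the suffix is appended to the existing file name (for example `data.nt` becomes `data.nt.good`), which this documentation does not spell out. The class `DurableQueueSketch`, its methods, and the property keys are hypothetical; real code would use the `DataLoader.Options` constants.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

public class DurableQueueSketch {

    // Placeholder keys (assumption): real code would use the
    // DataLoader.Options.DURABLE_QUEUES and IGNORE_INVALID_FILES constants.
    static final String DURABLE_QUEUES = "example.durableQueues";
    static final String IGNORE_INVALID_FILES = "example.ignoreInvalidFiles";

    /** Renames the file to name.good or name.fail and returns the new path. */
    static Path markProcessed(Path file, boolean success) {
        String suffix = success ? ".good" : ".fail";
        Path target = file.resolveSibling(file.getFileName() + suffix);
        try {
            return Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The default for IGNORE_INVALID_FILES follows DURABLE_QUEUES unless set explicitly. */
    static boolean ignoreInvalidFiles(Properties p) {
        boolean durable = Boolean.parseBoolean(p.getProperty(DURABLE_QUEUES, "false"));
        return Boolean.parseBoolean(
                p.getProperty(IGNORE_INVALID_FILES, String.valueOf(durable)));
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("sketch", ".nt");
        Path renamed = markProcessed(input, true);
        System.out.println("renamed to " + renamed.getFileName());
        Files.deleteIfExists(renamed);
    }
}
```

    An explicit `IGNORE_INVALID_FILES` setting still takes precedence in this sketch; only the fallback value changes when durable queues are enabled.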
  - DEFAULT_DURABLE_QUEUES
```
static final String DEFAULT_DURABLE_QUEUES
```
    The default value (false) for DURABLE_QUEUES.
    
    See Also:
    Constant Field Values
  - DUMP_JOURNAL
```
static final String DUMP_JOURNAL
```
    When true, runs DumpJournal after each commit (with the -pages option) to obtain a distribution of the BTree index page sizes.
    
    See Also:
    (support dump journal in data loader)
  - DEFAULT_DUMP_JOURNAL
```
static final String DEFAULT_DUMP_JOURNAL
```
    The default value (false) for DUMP_JOURNAL.
    
    See Also:
    Constant Field Values
  - VERBOSE
```
static final String VERBOSE
```
    When greater than ZERO (0), significant information may be reported at each commit point. At ONE (1) it enables a trace of the parser performance (statements loaded, statements per second, etc). At TWO (2) it provides detailed information about the performance counters at each commit. At THREE (3) it provides additional information about the assertion buffers each time it reports on the incremental parser performance.
  - DEFAULT_VERBOSE
```
static final String DEFAULT_VERBOSE
```
    The default value (0) for VERBOSE.
    
    See Also:
    Constant Field Values
