public static interface DataLoader.Options extends RDFParserOptions.Options
Options for the DataLoader.
Note: The default for RDFParserOptions.Options#PRESERVE_BNODE_IDS is conditionally overridden when LexiconRelation.isStoreBlankNodes() is true.

Modifier and Type | Field and Description |
---|---|
static String | BUFFER_CAPACITY: Optional property specifying the capacity of the StatementBuffer (default is "100000" statements). |
static String | CLOSURE: Optional property controlling whether and when the RDFS(+) closure is maintained on the database as documents are loaded (default ). |
static String | COMMIT: Optional property specifying whether and when the DataLoader will ITripleStore.commit() the database (default ). |
static String | DEFAULT_BUFFER_CAPACITY |
static String | DEFAULT_CLOSURE |
static String | DEFAULT_COMMIT |
static String | DEFAULT_DUMP_JOURNAL: The default value (false) for DUMP_JOURNAL. |
static String | DEFAULT_DURABLE_QUEUES: The default value (false) for DURABLE_QUEUES. |
static String | DEFAULT_FLUSH: The default value (true) for FLUSH. |
static int | DEFAULT_GZIP_BUFFER_SIZE |
static String | DEFAULT_IGNORE_INVALID_FILES: The default value (false) for IGNORE_INVALID_FILES. |
static String | DEFAULT_QUEUE_CAPACITY |
static String | DEFAULT_VERBOSE: The default value (0) for VERBOSE. |
static String | DUMP_JOURNAL: When true, runs DumpJournal after each commit (with the -pages option) to obtain a distribution of the BTree index page sizes. |
static String | DURABLE_QUEUES: When true, the data loader will rename each file as it is processed to either file.good or file.fail to indicate success or failure. |
static String | FLUSH: When true the StatementBuffer is flushed by each DataLoader.loadData(String, String, RDFFormat) or DataLoader.loadData(String[], String[], RDFFormat[]) operation and when DataLoader.doClosure() is requested. |
static String | GZIP_BUFFER_SIZE: Java property to override the default GZIP buffer size used for GZIPInputStream and GZIPOutputStream. |
static String | IGNORE_INVALID_FILES: When true, the loader will not break on unresolvable parse errors, but instead skip the file containing the error. |
static String | QUEUE_CAPACITY: Optional property specifying the capacity of the blocking queue used by the StatementBuffer -or- ZERO (0) to disable the blocking queue and perform synchronous writes (default is "10" statements). |
static String | VERBOSE: When greater than ZERO (0), significant information may be reported at each commit point. |

Fields inherited from interface RDFParserOptions.Options: DATATYPE_HANDLING, DEFAULT_DATATYPE_HANDLING, DEFAULT_PRESERVE_BNODE_IDS, DEFAULT_STOP_AT_FIRST_ERROR, DEFAULT_VERIFY_DATA, PRESERVE_BNODE_IDS, STOP_AT_FIRST_ERROR, VERIFY_DATA
static final String GZIP_BUFFER_SIZE
Java property to override the default GZIP buffer size used for GZIPInputStream and GZIPOutputStream.
This specifies the size in bytes to use. The default is 65535.
-Dcom.bigdata.journal.DataLoader.gzipBufferSize=65535
See BLZG-1777
static final int DEFAULT_GZIP_BUFFER_SIZE
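As an illustration of how a buffer size supplied this way might be consumed, here is a minimal sketch that reads the system property named in the -D example and passes it to the standard java.util.zip stream constructors. The class and method names are hypothetical and this is not the DataLoader's own code; only the property name and the default of 65535 come from the text above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the DataLoader API.
final class GzipBufferSizeSketch {

    // Property name as given in the -D example above.
    static final String GZIP_BUFFER_SIZE = "com.bigdata.journal.DataLoader.gzipBufferSize";

    // Default stated in the documentation above.
    static final int DEFAULT_GZIP_BUFFER_SIZE = 65535;

    /** Reads the buffer size from the system properties, falling back to the default. */
    static int gzipBufferSize() {
        return Integer.getInteger(GZIP_BUFFER_SIZE, DEFAULT_GZIP_BUFFER_SIZE);
    }

    /** Compresses the data using a GZIPOutputStream with the configured buffer size. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes, gzipBufferSize())) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Decompresses the data using a GZIPInputStream with the configured buffer size. */
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream in =
                new GZIPInputStream(new ByteArrayInputStream(compressed), gzipBufferSize())) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "<s> <p> <o> .".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Running the sketch with -Dcom.bigdata.journal.DataLoader.gzipBufferSize=131072 would change the buffer size used by both streams without any code change.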
static final String COMMIT
Optional property specifying whether and when the DataLoader will ITripleStore.commit() the database (default ).
Note: commit semantics vary depending on the specific backing store. See ITripleStore.commit().
static final String DEFAULT_COMMIT
static final String BUFFER_CAPACITY
Optional property specifying the capacity of the StatementBuffer (default is "100000" statements).
Note: With BLZG-1522, the QUEUE_CAPACITY can significantly increase the effective amount of data that is being buffered. Caution is recommended when overriding the BUFFER_CAPACITY in combination with a non-zero value of the QUEUE_CAPACITY.
The best performance will probably come from small (20k - 50k) buffer capacity values combined with a QUEUE_CAPACITY of 5-20. Larger values will increase the GC burden and could require a larger heap, but the net throughput might also increase.
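To get a feel for how the two settings interact, the following sketch computes a rough upper bound on the number of statements held in memory. It rests on an assumption that is not stated in this documentation: each queue slot can hold one full buffer, and one further buffer is being filled by the parser. The class and method names are hypothetical.

```java
// Hypothetical sizing helper, not part of the DataLoader API.
final class BufferSizingSketch {

    /**
     * Rough upper bound on buffered statements, ASSUMING each queue slot holds
     * one full buffer and one more buffer is currently being filled.
     */
    static long worstCaseBufferedStatements(int bufferCapacity, int queueCapacity) {
        if (bufferCapacity <= 0 || queueCapacity < 0) {
            throw new IllegalArgumentException("capacities must be non-negative, buffer > 0");
        }
        return (long) bufferCapacity * (queueCapacity + 1L);
    }

    public static void main(String[] args) {
        // Documented defaults: 100000 statements per buffer, queue capacity 10.
        System.out.println(worstCaseBufferedStatements(100_000, 10)); // 1100000
        // Ends of the suggested range: 20k-50k buffers with a queue capacity of 5-20.
        System.out.println(worstCaseBufferedStatements(20_000, 5));   // 120000
        System.out.println(worstCaseBufferedStatements(50_000, 20));  // 1050000
    }
}
```

Under that assumption the documented defaults could buffer on the order of a million statements, which is why raising BUFFER_CAPACITY while the queue is enabled deserves care.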
static final String DEFAULT_BUFFER_CAPACITY
static final String QUEUE_CAPACITY
Optional property specifying the capacity of the blocking queue used by the StatementBuffer -or- ZERO (0) to disable the blocking queue and perform synchronous writes (default is "10" statements). The blocking queue holds parsed data pending writes onto the backing store and makes it possible for the parser to race ahead while the writer is blocked writing onto the database indices.
See BLZG-1552
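The general pattern described here, a bounded queue between a parser and a writer with a capacity of zero meaning synchronous writes, can be sketched with java.util.concurrent. This is a generic illustration using hypothetical names and an in-memory list as the "store"; it is not the StatementBuffer implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Generic producer/consumer sketch, not the StatementBuffer implementation.
final class QueuedWriterSketch {

    // Sentinel batch telling the writer thread that the input is exhausted.
    private static final List<String> END_OF_INPUT = new ArrayList<>();

    /** Writes all batches to an in-memory store; queueCapacity 0 means synchronous writes. */
    static List<String> load(List<List<String>> batches, int queueCapacity) {
        List<String> store = new ArrayList<>(); // stand-in for the backing store
        if (queueCapacity == 0) {
            for (List<String> batch : batches) {
                store.addAll(batch); // the "parser" waits for every write
            }
            return store;
        }
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueCapacity);
        Thread writer = new Thread(() -> {
            try {
                for (List<String> b = queue.take(); b != END_OF_INPUT; b = queue.take()) {
                    store.addAll(b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            for (List<String> batch : batches) {
                queue.put(batch); // blocks only when the queue is full
            }
            queue.put(END_OF_INPUT);
            writer.join(); // after join, the writer's updates to store are visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        }
        return store;
    }

    public static void main(String[] args) {
        List<List<String>> batches = List.of(List.of("s1", "s2"), List.of("s3"));
        System.out.println(load(batches, 10));
        System.out.println(load(batches, 0));
    }
}
```

With a non-zero capacity the producer only stalls once the queue is full, which is the "race ahead" behavior described above; the price is that up to queueCapacity batches sit in memory at once.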
static final String DEFAULT_QUEUE_CAPACITY
static final String CLOSURE
Optional property controlling whether and when the RDFS(+) closure is maintained on the database as documents are loaded (default ).
Note: The InferenceEngine supports a variety of options. When closure is enabled, the caller's Properties will be used to configure an InferenceEngine object to compute the entailments. It is VITAL that the InferenceEngine is always configured in the same manner for a given database with regard to options that control which entailments are computed using forward chaining and which entailments are computed using backward chaining.
Note: When closure is being maintained the caller's Properties will also be used to provision the TempTripleStore.
See InferenceEngine, InferenceEngine.Options
static final String DEFAULT_CLOSURE
static final String FLUSH
When true the StatementBuffer is flushed by each DataLoader.loadData(String, String, RDFFormat) or DataLoader.loadData(String[], String[], RDFFormat[]) operation and when DataLoader.doClosure() is requested. When false the caller is responsible for flushing the DataLoader.buffer. The default is "true".
This behavior MAY be disabled if you want to chain load a bunch of small documents without flushing to the backing store after each document and DataLoader.loadData(String[], String[], RDFFormat[]) is not well-suited to your purposes. This can be much more efficient, approximating the throughput for large document loads. However, the caller MUST invoke DataLoader.endSource() (or DataLoader.doClosure() if appropriate) once all documents are loaded successfully. If an error occurs during the processing of one or more documents then the entire data load should be discarded (this is always true).
This feature is most useful when blank nodes are not in use, but it causes memory to grow when blank nodes are in use and forces statements using blank nodes to be deferred until the application flushes the DataLoader when statement identifiers are enabled.
static final String DEFAULT_FLUSH
The default value (true) for FLUSH.
static final String IGNORE_INVALID_FILES
When true, the loader will not break on unresolvable parse errors, but instead skip the file containing the error. This option is useful when loading large input that may contain invalid RDF, in order to make sure that the loading process does not fail completely when malformed files are detected. Note that an error will still be logged when a file cannot be loaded, so the files that failed can be tracked.
See (Add option to make the DataLoader robust to files that cause rio to throw a fatal exception)
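The skip-and-log behavior described above can be sketched as a loop that either rethrows the first parse error or records the failing file and continues. Everything here is hypothetical: the parse method is a toy stand-in for an RDF parser, and the class is not the DataLoader's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the real loader delegates parsing to an RDF parser.
final class IgnoreInvalidFilesSketch {

    /** Toy stand-in for a parser: rejects any document that does not end with '.'. */
    static void parse(String name, String content) {
        if (!content.trim().endsWith(".")) {
            throw new IllegalArgumentException("parse error in " + name);
        }
    }

    /** Loads every document; returns the names of documents that were skipped. */
    static List<String> loadAll(Map<String, String> docs, boolean ignoreInvalidFiles) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            try {
                parse(doc.getKey(), doc.getValue());
            } catch (IllegalArgumentException e) {
                if (!ignoreInvalidFiles) {
                    throw e; // abort the whole load on the first bad file
                }
                // The error is still reported so that failed files can be tracked.
                System.err.println("Skipping " + doc.getKey() + ": " + e.getMessage());
                failed.add(doc.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("good.nt", "<s> <p> <o> .");
        docs.put("bad.nt", "<s> <p>");
        System.out.println("Skipped: " + loadAll(docs, true));
    }
}
```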
static final String DEFAULT_IGNORE_INVALID_FILES
The default value (false) for IGNORE_INVALID_FILES.
static final String DURABLE_QUEUES
When true, the data loader will rename each file as it is processed to either file.good or file.fail to indicate success or failure. In addition, the default for IGNORE_INVALID_FILES will be true and the default for RDFParserOptions.getStopAtFirstError() will be false.
See (durable queues)
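The renaming convention can be illustrated with java.nio.file. This is a minimal sketch with hypothetical names, not the loader's own code; in particular, replacing an existing target file is an assumption made here for simplicity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch of the file.good / file.fail naming convention.
final class DurableQueueRenameSketch {

    /** Renames file to file.good or file.fail and returns the new path. */
    static Path markProcessed(Path file, boolean success) {
        String suffix = success ? ".good" : ".fail";
        Path target = file.resolveSibling(file.getFileName() + suffix);
        try {
            // ASSUMPTION: an existing target is overwritten; the source does not say.
            return Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("durable-queue");
        Path file = Files.writeString(dir.resolve("data.nt"), "<s> <p> <o> .");
        System.out.println(markProcessed(file, true).getFileName());
    }
}
```

Because the outcome is recorded in the file name, a restarted load can tell which inputs were already handled by looking at the directory alone.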
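static final String DEFAULT_DURABLE_QUEUES
The default value (false) for DURABLE_QUEUES.
static final String DUMP_JOURNAL
When true, runs DumpJournal after each commit (with the -pages option) to obtain a distribution of the BTree index page sizes.
See (support dump journal in data loader)
static final String DEFAULT_DUMP_JOURNAL
The default value (false) for DUMP_JOURNAL.
static final String VERBOSE
When greater than ZERO (0), significant information may be reported at each commit point.
static final String DEFAULT_VERBOSE
The default value (0) for VERBOSE.
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.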