public static interface OverflowManager.Options extends IndexManager.Options, IServiceShutdown.Options
Options for the OverflowManager.
Modifier and Type | Field and Description |
---|---|
static String | ACCELERATE_SPLIT_THRESHOLD: The #of index partitions below which we will accelerate the decision to split an index partition (default "20"). |
static String | BUILD_SERVICE_CORE_POOL_SIZE: The #of threads in the pool handling index segment builds from the old journal. |
static String | COPY_INDEX_THRESHOLD: Index partitions having no more than this many entries as reported by a range count will be copied to the new journal during synchronous overflow processing rather than building a new index segment from the buffered writes (default "1000"). |
static String | DEFAULT_ACCELERATE_SPLIT_THRESHOLD |
static String | DEFAULT_BUILD_SERVICE_CORE_POOL_SIZE |
static String | DEFAULT_COPY_INDEX_THRESHOLD |
static String | DEFAULT_HOT_SPLIT_THRESHOLD |
static String | DEFAULT_JOINS_ENABLED |
static String | DEFAULT_MAXIMUM_BUILD_SEGMENTS_BYTES |
static String | DEFAULT_MAXIMUM_JOURNALS_PER_VIEW |
static String | DEFAULT_MAXIMUM_MOVE_PERCENT_OF_SPLIT |
static String | DEFAULT_MAXIMUM_MOVES |
static String | DEFAULT_MAXIMUM_MOVES_PER_TARGET |
static String | DEFAULT_MAXIMUM_SEGMENTS_PER_VIEW |
static String | DEFAULT_MERGE_SERVICE_CORE_POOL_SIZE |
static String | DEFAULT_MINIMUM_ACTIVE_INDEX_PARTITIONS |
static String | DEFAULT_MOVE_PERCENT_CPU_TIME_THRESHOLD |
static String | DEFAULT_NOMINAL_SHARD_SIZE |
static String | DEFAULT_OPTIONAL_COMPACTING_MERGES_PER_OVERFLOW |
static String | DEFAULT_OVERFLOW_CANCELLED_WHEN_JOURNAL_FULL |
static String | DEFAULT_OVERFLOW_ENABLED |
static String | DEFAULT_OVERFLOW_MAX_COUNT |
static String | DEFAULT_OVERFLOW_TASKS_CONCURRENT |
static String | DEFAULT_OVERFLOW_THRESHOLD |
static String | DEFAULT_OVERFLOW_TIMEOUT: The default timeout in milliseconds for asynchronous overflow processing (equivalent to 10 minutes). |
static String | DEFAULT_PERCENT_OF_SPLIT_THRESHOLD |
static String | DEFAULT_SCATTER_SPLIT_ENABLED |
static String | DEFAULT_TAIL_SPLIT_THRESHOLD |
static String | HOT_SPLIT_THRESHOLD: Deprecated. Hot splits are not implemented and this option does not do anything. It will be going away soon. |
static String | JOINS_ENABLED: Option may be used to disable index partition joins. |
static String | MAXIMUM_BUILD_SEGMENT_BYTES: Option limits the #of IndexSegmentStore bytes that an OverflowActionEnum.Build operation will process (default "20971520"). |
static String | MAXIMUM_JOURNALS_PER_VIEW: Deprecated. Merges are now performed in priority order while time remains in a given asynchronous overflow cycle. |
static String | MAXIMUM_MOVE_PERCENT_OF_SPLIT: This is the maximum percentage (in [0:2]) of a full index partition which will be considered for a move (default ".8"). |
static String | MAXIMUM_MOVES: Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes. |
static String | MAXIMUM_MOVES_PER_TARGET: Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes. Note: This is also used to disable moves by some of the unit tests so we need a way to replace that functionality before this can be taken out. |
static String | MAXIMUM_OPTIONAL_MERGES_PER_OVERFLOW: Deprecated. Merges are now performed in priority order while time remains in a given asynchronous overflow cycle. |
static String | MAXIMUM_SEGMENTS_PER_VIEW: Deprecated. Merges are now performed in priority order while time remains in a given asynchronous overflow cycle. |
static String | MERGE_SERVICE_CORE_POOL_SIZE: The #of threads in the pool handling index partition merges. |
static String | MINIMUM_ACTIVE_INDEX_PARTITIONS: The minimum #of active index partitions on a data service before the resource manager will consider moving an index partition to another service (default "1"). |
static String | MOVE_PERCENT_CPU_TIME_THRESHOLD: The threshold for a service to consider itself sufficiently loaded that it will consider moving an index partition (default ".7"). |
static String | NOMINAL_SHARD_SIZE: The nominal size of a full index partition (~200MB). |
static String | OVERFLOW_CANCELLED_WHEN_JOURNAL_FULL: Deprecated. Asynchronous overflow processing should run to completion with a minimum goal of an incremental build for each index partition having data on the previous journal. |
static String | OVERFLOW_ENABLED: Boolean property determines whether or not IResourceManager.overflow() processing is enabled (default "true"). |
static String | OVERFLOW_MAX_COUNT: Deprecated. This is no longer used, even for testing. |
static String | OVERFLOW_TASKS_CONCURRENT: Deprecated. |
static String | OVERFLOW_THRESHOLD: Floating point property specifying the percentage of the maximum extent at which synchronous overflow processing will be triggered (default DEFAULT_OVERFLOW_THRESHOLD). |
static String | OVERFLOW_TIMEOUT: Deprecated. Asynchronous overflow processing should run to completion with a minimum goal of an incremental build for each index partition having data on the previous journal. |
static String | PERCENT_OF_SPLIT_THRESHOLD: The minimum percentage (where 1.0 corresponds to 100 percent) that an index partition must constitute of a nominal index partition before a head or tail split will be considered (default ".9"). |
static String | SCATTER_SPLIT_ENABLED: Boolean option indicates whether or not scatter splits are allowed (default DEFAULT_SCATTER_SPLIT_ENABLED) on this service. |
static String | TAIL_SPLIT_THRESHOLD: The minimum percentage (in [0:1]) of leaf splits which must be in the tail of the index partition before a tail split of an index partition will be considered (default ".4"). |
DEFAULT_INDEX_CACHE_CAPACITY, DEFAULT_INDEX_CACHE_TIMEOUT, DEFAULT_INDEX_SEGMENT_CACHE_CAPACITY, DEFAULT_INDEX_SEGMENT_CACHE_TIMEOUT, INDEX_CACHE_CAPACITY, INDEX_CACHE_TIMEOUT, INDEX_SEGMENT_CACHE_CAPACITY, INDEX_SEGMENT_CACHE_TIMEOUT
ACCELERATE_OVERFLOW_THRESHOLD, DATA_DIR, DEFAULT_ACCELERATE_OVERFLOW_THRESHOLD, DEFAULT_IGNORE_BAD_FILES, DEFAULT_PURGE_OLD_RESOURCES_DURING_STARTUP, DEFAULT_STORE_CACHE_CAPACITY, DEFAULT_STORE_CACHE_TIMEOUT, IGNORE_BAD_FILES, PURGE_OLD_RESOURCES_DURING_STARTUP, STORE_CACHE_CAPACITY, STORE_CACHE_TIMEOUT
ALTERNATE_ROOT_BLOCK, BUFFER_MODE, CREATE, CREATE_TEMP_FILE, CREATE_TIME, DEFAULT_BUFFER_MODE, DEFAULT_CREATE, DEFAULT_CREATE_TEMP_FILE, DEFAULT_DELETE_ON_CLOSE, DEFAULT_DELETE_ON_EXIT, DEFAULT_DOUBLE_SYNC, DEFAULT_FILE_LOCK_ENABLED, DEFAULT_FORCE_ON_COMMIT, DEFAULT_FORCE_WRITES, DEFAULT_HALOG_COMPRESSOR, DEFAULT_HISTORICAL_INDEX_CACHE_CAPACITY, DEFAULT_HISTORICAL_INDEX_CACHE_TIMEOUT, DEFAULT_HOT_CACHE_SIZE, DEFAULT_HOT_CACHE_THRESHOLD, DEFAULT_INITIAL_EXTENT, DEFAULT_LIVE_INDEX_CACHE_CAPACITY, DEFAULT_LIVE_INDEX_CACHE_TIMEOUT, DEFAULT_MAXIMUM_EXTENT, DEFAULT_MINIMUM_EXTENSION, DEFAULT_READ_CACHE_BUFFER_COUNT, DEFAULT_READ_ONLY, DEFAULT_USE_DIRECT_BUFFERS, DEFAULT_VALIDATE_CHECKSUM, DEFAULT_WRITE_CACHE_BUFFER_COUNT, DEFAULT_WRITE_CACHE_COMPACTION_THRESHOLD, DEFAULT_WRITE_CACHE_ENABLED, DEFAULT_WRITE_CACHE_MIN_CLEAN_LIST_SIZE, DELETE_ON_CLOSE, DELETE_ON_EXIT, DOUBLE_SYNC, FILE, FILE_LOCK_ENABLED, FORCE_ON_COMMIT, FORCE_WRITES, HALOG_COMPRESSOR, HISTORICAL_INDEX_CACHE_CAPACITY, HISTORICAL_INDEX_CACHE_TIMEOUT, HOT_CACHE_SIZE, HOT_CACHE_THRESHOLD, IGNORE_BAD_ROOT_BLOCK, INITIAL_EXTENT, JNL, LIVE_INDEX_CACHE_CAPACITY, LIVE_INDEX_CACHE_TIMEOUT, MAXIMUM_EXTENT, MEM_MAX_EXTENT, MINIMUM_EXTENSION, minimumInitialExtent, minimumMinimumExtension, OFFSET_BITS, OTHER_MAX_EXTENT, READ_CACHE_BUFFER_COUNT, READ_ONLY, RW_MAX_EXTENT, SEG, TMP_DIR, UPDATE_ICU_VERSION, USE_DIRECT_BUFFERS, VALIDATE_CHECKSUM, WRITE_CACHE_BUFFER_COUNT, WRITE_CACHE_COMPACTION_THRESHOLD, WRITE_CACHE_ENABLED, WRITE_CACHE_MIN_CLEAN_LIST_SIZE
DEFAULT_SHUTDOWN_TIMEOUT, SHUTDOWN_TIMEOUT
static final String OVERFLOW_ENABLED
Boolean property determines whether or not IResourceManager.overflow()
processing is enabled (default "true"). When disabled, the journal will
grow without bound, IndexSegments will never be generated, and index
partitions will not be split, joined, or moved away from this
ResourceManager.
static final String DEFAULT_OVERFLOW_ENABLED
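As a rough illustration of how a String-valued option and its String default are typically consumed, the sketch below reads a boolean flag from java.util.Properties. The key string and the class name are hypothetical stand-ins, since the real property name is not given on this page; only the default of "true" comes from the description above.

```java
import java.util.Properties;

/** Sketch only: not the OverflowManager implementation. */
class OverflowEnabledSketch {

    // Hypothetical stand-in for Options.OVERFLOW_ENABLED; the real key
    // string is not shown on this page.
    static final String OVERFLOW_ENABLED = "example.overflowEnabled";

    // The documented default is "true".
    static final String DEFAULT_OVERFLOW_ENABLED = "true";

    /** Reads the flag, falling back to the String default when unset. */
    static boolean isOverflowEnabled(Properties properties) {
        return Boolean.parseBoolean(
                properties.getProperty(OVERFLOW_ENABLED, DEFAULT_OVERFLOW_ENABLED));
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        System.out.println(isOverflowEnabled(properties)); // unset, so the default applies
        properties.setProperty(OVERFLOW_ENABLED, "false");
        System.out.println(isOverflowEnabled(properties)); // explicitly disabled
    }
}
```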
static final String OVERFLOW_MAX_COUNT
Options.OFFSET_BITS for scale-up.
static final String DEFAULT_OVERFLOW_MAX_COUNT
static final String OVERFLOW_THRESHOLD
Floating point property specifying the percentage of the maximum
extent at which synchronous overflow processing will be triggered
(default DEFAULT_OVERFLOW_THRESHOLD). The value is multiplied into the
configured Options.MAXIMUM_EXTENT. If the current extent of the live
journal is GTE the result, then synchronous overflow processing will be
triggered. However, note that synchronous overflow processing cannot be
triggered until asynchronous overflow processing for the last journal is
complete. Therefore, if asynchronous overflow processing takes a long
time, the overflow threshold might not be checked until after it has
already been exceeded.
The main purpose of this property is to trigger overflow processing
before the maximum extent is exceeded. The trigger needs to lead the
maximum extent somewhat since overflow processing cannot proceed
until there is an exclusive lock on the write service, and tasks
already running will continue to write on the live journal.
Overflowing the maximum extent is not a problem as long as the
BufferMode supports transparent extension of the journal. However, some
BufferModes do not, and therefore they cannot be used reliably with the
overflow manager.
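The trigger condition described above can be sketched as a simple comparison. This is a minimal illustration, not the OverflowManager code: the class and method names are hypothetical, and the 0.9 threshold and 200MB maximum extent in main are example values rather than documented defaults.

```java
/** Sketch only: illustrates the documented comparison, not the real code. */
class OverflowThresholdSketch {

    /**
     * Returns true when the live journal's current extent has reached the
     * trigger point, i.e. overflowThreshold multiplied into maximumExtent.
     */
    static boolean shouldTriggerOverflow(long currentExtent,
            long maximumExtent, double overflowThreshold) {
        final double trigger = overflowThreshold * maximumExtent;
        return currentExtent >= trigger;
    }

    public static void main(String[] args) {
        final long maximumExtent = 200L * 1024 * 1024; // example value
        final double threshold = 0.9;                  // example value
        // Well below the trigger point: no overflow yet.
        System.out.println(shouldTriggerOverflow(100L * 1024 * 1024, maximumExtent, threshold));
        // Past the trigger point but still under the maximum extent.
        System.out.println(shouldTriggerOverflow(190L * 1024 * 1024, maximumExtent, threshold));
    }
}
```

Because the threshold leads the maximum extent, the second call reports a trigger while there is still room for running tasks to finish writing on the live journal.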
static final String DEFAULT_OVERFLOW_THRESHOLD
static final String COPY_INDEX_THRESHOLD
DEFAULT_COPY_INDEX_THRESHOLD
static final String DEFAULT_COPY_INDEX_THRESHOLD
static final String ACCELERATE_SPLIT_THRESHOLD
The #of index partitions below which we will accelerate the decision
to split an index partition (default "20"). A new scale-out index is
initially allocated as a single index partition on a single
IDataService. Since each index (partition) is single threaded for
writes, we can increase the potential concurrency if we split the
initial index partition. We accelerate decisions to split index
partitions by reducing the minimum and target #of tuples per index
partition for an index with fewer than the #of index partitions
specified by this parameter. When ZERO (0), this feature is disabled and
we do not count the #of index partitions.
static final String DEFAULT_ACCELERATE_SPLIT_THRESHOLD
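One way to picture the acceleration is as a discount on the target #of tuples while an index has fewer partitions than the threshold. The page does not say how the reduction is computed, so the linear scaling below is purely an assumption for illustration, and all names are hypothetical.

```java
/** Sketch only: the linear discount is an ASSUMPTION, not the documented formula. */
class AccelerateSplitSketch {

    /**
     * Returns a reduced target #of tuples per index partition while the
     * index has fewer partitions than accelerateSplitThreshold. A threshold
     * of ZERO (0) disables the feature, as described above.
     */
    static long adjustedTargetTuples(long targetTuples, int partitionCount,
            int accelerateSplitThreshold) {
        if (accelerateSplitThreshold == 0
                || partitionCount >= accelerateSplitThreshold) {
            return targetTuples; // feature disabled or enough partitions
        }
        // ASSUMPTION: scale the target linearly with the partition count.
        final double fraction =
                (double) Math.max(1, partitionCount) / accelerateSplitThreshold;
        return Math.max(1L, (long) (targetTuples * fraction));
    }

    public static void main(String[] args) {
        // With 5 of 20 partitions the assumed discount is one quarter.
        System.out.println(adjustedTargetTuples(1_000_000L, 5, 20));
        // At or above the threshold the target is unchanged.
        System.out.println(adjustedTargetTuples(1_000_000L, 20, 20));
    }
}
```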
static final String PERCENT_OF_SPLIT_THRESHOLD
The minimum percentage (where 1.0 corresponds to 100 percent) that an
index partition must constitute of a nominal index partition before a
head or tail split will be considered (default ".9"). Values near to and
greater than 1.0 are permissible and imply that the post-split
leftSibling index partition will be approximately a nominal index
partition. However, the maximum percentage may not be greater than 2.0
(200 percent).
static final String DEFAULT_PERCENT_OF_SPLIT_THRESHOLD
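The bound and the comparison described above might be checked as follows. This is a hedged sketch with hypothetical names: the lower bound of zero and the use of byte sizes to measure the partition against a nominal partition are assumptions, while the upper bound of 2.0 and the ".9" default come from the description.

```java
/** Sketch only: names, the zero lower bound, and byte units are assumptions. */
class PercentOfSplitSketch {

    /** Parses the String-valued option and enforces the documented 2.0 maximum. */
    static double parseThreshold(String value) {
        final double threshold = Double.parseDouble(value);
        if (threshold < 0d || threshold > 2.0d) {
            throw new IllegalArgumentException(
                    "threshold must be in [0:2]: " + value);
        }
        return threshold;
    }

    /** True when the partition is a large enough fraction of a nominal partition. */
    static boolean mayConsiderHeadOrTailSplit(long partitionBytes,
            long nominalBytes, double threshold) {
        return (double) partitionBytes / nominalBytes >= threshold;
    }

    public static void main(String[] args) {
        final double threshold = parseThreshold(".9"); // documented default
        final long nominal = 200L * 1024 * 1024;       // ~200MB nominal shard
        System.out.println(mayConsiderHeadOrTailSplit(190L * 1024 * 1024, nominal, threshold));
        System.out.println(mayConsiderHeadOrTailSplit(100L * 1024 * 1024, nominal, threshold));
    }
}
```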
static final String TAIL_SPLIT_THRESHOLD
static final String DEFAULT_TAIL_SPLIT_THRESHOLD
static final String HOT_SPLIT_THRESHOLD
static final String DEFAULT_HOT_SPLIT_THRESHOLD
static final String SCATTER_SPLIT_ENABLED
IndexMetadata.Options#SCATTER_SPLIT_ENABLED
static final String DEFAULT_SCATTER_SPLIT_ENABLED
static final String JOINS_ENABLED
Option may be used to disable index partition joins. Joins can be
driven by ACCELERATE_SPLIT_THRESHOLD behaviors since the target for
the split size increases as a function of the #of index partitions.
For example, a scatter split can cause the adjusted nominal size of a
shard to jump to its configured setting, which will cause the shards
to be "undercapacity" and hence drive JOINs. In order to fix this we
have to somehow discount joins, either by requiring deletes on the
index partition or by waiting some #of overflows since the split,
etc. Alternatively, joins could be ignored unless there are more
partitions of a given index than were (or would be) produced by a
scatter split. For the moment joins are disabled by default.
static final String DEFAULT_JOINS_ENABLED
static final String MINIMUM_ACTIVE_INDEX_PARTITIONS
Note: This makes sure that we don't do a move if there are only a few active index partitions on this service. This value is also used to place an upper bound on the #of index partitions that can be moved away from this service - if we move too many (or too many at once) then this service stands a good chance of becoming under-utilized and index partitions will just bounce around which is very inefficient.
Note: Even when only a single index partition for a new scale-out index is initially allocated on this service, if it is active and growing it will eventually split into enough index partitions that we will begin to re-distribute those index partitions across the federation.
Note: Index partitions are considered to be "active" iff ITx.UNISOLATED
or ITx.READ_COMMITTED operations are run against the index partition
during the life cycle of the live journal. There may be many other index
partitions on the same service that either are never read or are subject
only to historical reads. However, since only the current state of the
index partition is moved, not its history, moving index partitions which
are only the target for historical reads will not reduce the load on the
service. Instead, read burdens are reduced using replication.
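The gating role of this option can be pictured as a predicate evaluated before any move is considered. The sketch below is illustrative only: combining it with the MOVE_PERCENT_CPU_TIME_THRESHOLD load check in a single method is an assumption, and the names are hypothetical. The values "1" and ".7" are the defaults listed in the field summary.

```java
/** Sketch only: combining both checks in one predicate is an assumption. */
class MoveCandidateSketch {

    /**
     * A service considers moving an index partition only when it has at
     * least minimumActive active index partitions and regards itself as
     * sufficiently loaded.
     */
    static boolean mayConsiderMove(int activePartitions, int minimumActive,
            double percentCpuTime, double cpuTimeThreshold) {
        return activePartitions >= minimumActive
                && percentCpuTime >= cpuTimeThreshold;
    }

    public static void main(String[] args) {
        // Enough active partitions and a loaded service.
        System.out.println(mayConsiderMove(3, 1, 0.8, 0.7));
        // No active partitions: historical-read-only shards do not count.
        System.out.println(mayConsiderMove(0, 1, 0.9, 0.7));
    }
}
```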
static final String DEFAULT_MINIMUM_ACTIVE_INDEX_PARTITIONS
static final String MAXIMUM_MOVES
Note: Index partition moves MAY be disabled by setting this property to ZERO (0).
DEFAULT_MAXIMUM_MOVES
static final String DEFAULT_MAXIMUM_MOVES
static final String MAXIMUM_MOVES_PER_TARGET
Note: This is also used to disable moves by some of the unit tests so we need a way to replace that functionality before this can be taken out.
Note: Index partitions are moved to the identified under-utilized services using a round-robin approach which aids in distributing the load across the federation.
Note: Index partition moves MAY be disabled by setting this property to ZERO (0).
DEFAULT_MAXIMUM_MOVES_PER_TARGET
static final String DEFAULT_MAXIMUM_MOVES_PER_TARGET
static final String MAXIMUM_MOVE_PERCENT_OF_SPLIT
static final String DEFAULT_MAXIMUM_MOVE_PERCENT_OF_SPLIT
static final String MOVE_PERCENT_CPU_TIME_THRESHOLD
static final String DEFAULT_MOVE_PERCENT_CPU_TIME_THRESHOLD
static final String MAXIMUM_OPTIONAL_MERGES_PER_OVERFLOW
Once this #of optional compacting merge tasks have been identified
for a given overflow event, the remainder of the index partitions
that are neither split, joined, moved, nor copied will use
incremental builds. An incremental build is generally cheaper since
it only copies the data on the mutable BTree for the lastCommitTime
rather than the fused view. A compacting merge permits the older index
segments to be released and results in a simpler view with fewer
IndexSegments. Either a compacting merge or an incremental build will
permit old journals to be released once the commit points on those
journals are no longer required.
Note: Mandatory compacting merges are identified based on
MAXIMUM_JOURNALS_PER_VIEW and MAXIMUM_SEGMENTS_PER_VIEW. There is NO
limit on the #of mandatory compacting merges that will be performed
during an asynchronous overflow event. However, each mandatory
compacting merge does count towards the maximum #of optional merges.
Therefore, if the #of mandatory compacting merges is greater than this
parameter, then NO optional compacting merges will be selected in a
given overflow cycle.
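The accounting in the note above can be sketched as a budget: mandatory merges are never limited, but each one consumes a slot that an optional merge could have used. The class, method names, and the "Merge"/"Build" labels below are hypothetical and only illustrate that arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: illustrates the merge budget arithmetic described above. */
class OptionalMergeBudgetSketch {

    /** Optional merge slots left once mandatory merges are counted. */
    static int optionalMergeBudget(int maxOptionalMerges, int mandatoryMerges) {
        return Math.max(0, maxOptionalMerges - mandatoryMerges);
    }

    /**
     * Assigns an action to each remaining candidate partition: a compacting
     * merge while budget remains, otherwise an incremental build.
     */
    static List<String> chooseActions(int candidates, int budget) {
        final List<String> actions = new ArrayList<>();
        for (int i = 0; i < candidates; i++) {
            actions.add(i < budget ? "Merge" : "Build");
        }
        return actions;
    }

    public static void main(String[] args) {
        final int budget = optionalMergeBudget(2, 1); // one slot remains
        System.out.println(chooseActions(3, budget));
    }
}
```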
static final String DEFAULT_OPTIONAL_COMPACTING_MERGES_PER_OVERFLOW
static final String MAXIMUM_JOURNALS_PER_VIEW
OverflowActionEnum.Copy is not selected. As long as index partition
splits, builds or merges are performed the #of journals in the view
WILL NOT exceed 2 and will always be ONE (1) after an asynchronous
overflow in which a split, build or merge was performed.
It is extremely important to perform compacting merges in order to release dependencies on old resources (both journals and index segments) and keep down the #of sources in a view. This is especially true when those sources are journals. Journals are organized by write access, not read access. Once the backing buffer for a journal is released there will be large spikes in IOWAIT when reading on an old journal as reads are more or less random.
Note: The MAXIMUM_OPTIONAL_MERGES_PER_OVERFLOW will be ignored if a
compacting merge is recommended for an index partition based on this
parameter.
Note: Synchronous overflow will refuse to copy tuples for an index
partition whose mutable BTree otherwise satisfies the
COPY_INDEX_THRESHOLD if the #of sources in the view exceeds
thresholds which demand a compacting merge.
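The interaction between COPY_INDEX_THRESHOLD and a demanded compacting merge can be sketched as a single predicate. The names below are hypothetical; the "no more than this many entries" comparison and the default of "1000" come from the field summary.

```java
/** Sketch only: illustrates when a copy is allowed during synchronous overflow. */
class CopyDecisionSketch {

    /**
     * A small index partition is copied to the new journal only if its range
     * count is within the threshold and the view does not already demand a
     * compacting merge.
     */
    static boolean shouldCopy(long rangeCount, long copyIndexThreshold,
            boolean compactingMergeDemanded) {
        return rangeCount <= copyIndexThreshold && !compactingMergeDemanded;
    }

    public static void main(String[] args) {
        final long threshold = 1000L; // documented default
        System.out.println(shouldCopy(800L, threshold, false));  // small: copy
        System.out.println(shouldCopy(800L, threshold, true));   // merge demanded: no copy
        System.out.println(shouldCopy(5000L, threshold, false)); // too large: no copy
    }
}
```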
static final String DEFAULT_MAXIMUM_JOURNALS_PER_VIEW
static final String MAXIMUM_SEGMENTS_PER_VIEW
It is extremely important to perform compacting merges in order to
release dependencies on old resources (both journals and index
segments) and keep down the #of sources in a view. However, this is
less important when those resources are IndexSegments since they are
very efficient for read operations. In this case the main driver is to
reduce the complexity of the view, to require fewer open index segments
(and associated resources) in order to materialize the view, and to
make it possible to release index segments and thus have less of a
footprint on the disk.
Note: The MAXIMUM_OPTIONAL_MERGES_PER_OVERFLOW will be ignored if a
compacting merge is recommended for an index partition based on this
parameter.
Note: Synchronous overflow will refuse to copy tuples for an index
partition whose mutable BTree otherwise satisfies the
COPY_INDEX_THRESHOLD if the #of sources in the view exceeds
thresholds which demand a compacting merge.
static final String DEFAULT_MAXIMUM_SEGMENTS_PER_VIEW
static final String MAXIMUM_BUILD_SEGMENT_BYTES
Option limits the #of IndexSegmentStore bytes that an
OverflowActionEnum.Build operation will process (default "20971520").
Given that the nominal size of an index partition is 200M, a reasonable
value for this might be 1/10th to 1/5th of that, so 20-40M. The key is
to keep the builds fast, so they should not do too much work, while
reducing the frequency with which we must do a compacting merge. This
option only affects the #of IndexSegments that will be incorporated
into an OverflowActionEnum.Build operation. When ZERO (0L),
OverflowActionEnum.Build operations will only include the data from the
historical journal.
static final String DEFAULT_MAXIMUM_BUILD_SEGMENTS_BYTES
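The arithmetic behind the suggested range is easy to check. The sketch below assumes "200M" means 200 x 1024 x 1024 bytes, which is consistent with the default of "20971520" being exactly one tenth; the class and method names are hypothetical.

```java
/** Sketch only: works through the 1/10th to 1/5th sizing suggested above. */
class BuildSegmentBytesSketch {

    // ASSUMPTION: "200M" is read as 200 * 1024 * 1024 bytes.
    static final long NOMINAL_SHARD_BYTES = 200L * 1024 * 1024;

    /** Lower end of the suggested range: one tenth of a nominal shard. */
    static long lowerBound() {
        return NOMINAL_SHARD_BYTES / 10;
    }

    /** Upper end of the suggested range: one fifth of a nominal shard. */
    static long upperBound() {
        return NOMINAL_SHARD_BYTES / 5;
    }

    public static void main(String[] args) {
        System.out.println(lowerBound()); // 20971520, the documented default
        System.out.println(upperBound()); // 41943040
    }
}
```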
static final String OVERFLOW_TIMEOUT
The timeout in milliseconds for asynchronous overflow processing
(default DEFAULT_OVERFLOW_TIMEOUT). Any overflow task that does not
complete within this timeout will be canceled.
Asynchronous overflow processing is responsible for splitting, moving, and joining index partitions. The asynchronous overflow tasks are written to fail "safe". Also, each task may succeed or fail on its own. Iff the task succeeds, then its effect is made restart safe. Otherwise clients continue to use the old view of the index partition.
If asynchronous overflow processing DOES NOT complete each time then we run several very serious and non-sustainable risks, including: (a) the #of sources in a view can increase without limit; and (b) the #of journals that must be retained can increase without limit.
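The cancel-on-timeout behavior described for this (now deprecated) option follows a common java.util.concurrent pattern, sketched below. This is a generic illustration with hypothetical names, not the overflow manager's own task handling.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch only: a generic run-with-timeout pattern. */
class OverflowTimeoutSketch {

    /** Runs the task; cancels it and returns false if it misses the timeout. */
    static <T> boolean runWithTimeout(Callable<T> task, long timeoutMillis) {
        final ExecutorService service = Executors.newSingleThreadExecutor();
        final Future<T> future = service.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true; // completed in time
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the late task
            return false;
        } catch (ExecutionException e) {
            return false; // the task itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            service.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The documented default is equivalent to 10 minutes, in milliseconds.
        final long tenMinutes = TimeUnit.MINUTES.toMillis(10);
        System.out.println(tenMinutes); // 600000
        System.out.println(runWithTimeout(() -> "done", tenMinutes));
    }
}
```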
static final String DEFAULT_OVERFLOW_TIMEOUT
static final String OVERFLOW_TASKS_CONCURRENT
static final String DEFAULT_OVERFLOW_TASKS_CONCURRENT
static final String OVERFLOW_CANCELLED_WHEN_JOURNAL_FULL
static final String DEFAULT_OVERFLOW_CANCELLED_WHEN_JOURNAL_FULL
static final String BUILD_SERVICE_CORE_POOL_SIZE
static final String DEFAULT_BUILD_SERVICE_CORE_POOL_SIZE
static final String MERGE_SERVICE_CORE_POOL_SIZE
static final String DEFAULT_MERGE_SERVICE_CORE_POOL_SIZE
static final String NOMINAL_SHARD_SIZE
Note: If you modify this, you may also need to modify the size of the
buffers in the DirectBufferPool used to fully buffer the nodes region
of the index segment file.
static final String DEFAULT_NOMINAL_SHARD_SIZE
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.