public interface IDataService extends ITxCommitProtocol, IService, IRemoteExecutor
The data service interface provides remote access to named indices, provides for both unisolated and isolated operations on those indices, and exposes the ITxCommitProtocol interface to the ITransactionManagerService service for the coordination of distributed transactions. Clients normally write to the IIndex interface. The ClientIndexView provides an implementation of that interface supporting range partitioned scale-out indices which transparently handles lookup of data services in the metadata index and mapping of operations across the appropriate data services.
Indices are identified by name. Scale-out indices are broken into index partitions, each of which is a named index hosted on a data service. The name of an index partition is given by DataService.getIndexPartitionName(String, int). Clients are strongly encouraged to use the ClientIndexView, which encapsulates lookup and distribution of operations on range partitioned scale-out indices.
The data service exposes fully isolated read-write transactions, read-only transactions, lightweight read-historical operations, and unisolated operations on named indices. These choices are captured by the timestamp associated with the operation. When the operation is a transaction, this timestamp is also known as the transaction identifier or tx. The following distinctions are available:
Unisolated operations specify ITx.UNISOLATED as their transaction identifier. Unisolated operations are ACID, but their scope is limited to the commit group on the data service where the operation is executed. Unisolated operations correspond more or less to read-committed semantics except that writes are immediately visible to other operations in the same commit group.
Unisolated operations that allow writes obtain an exclusive lock on the live version of the named index for the duration of the operation. Unisolated operations that are declared as read-only read from the last committed state of the named index and therefore do not compete with read-write unisolated operations. This allows unisolated read operations to achieve higher concurrency. The effect is as if the unisolated read operation runs before the unisolated writes in a given commit group, since the impact of those writes is not visible to unisolated readers until the next commit point.
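The visibility rule for unisolated readers and writers can be illustrated with a minimal sketch. This is not the data service implementation: the class CommitGroupSketch and its methods write, readCommitted, and commit are hypothetical names introduced only for this example, and the index is modelled as an in-memory map.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; not part of the IDataService API. */
final class CommitGroupSketch {
    // Live (mutable) view: writers hold the exclusive lock while touching it.
    private final TreeMap<String, String> live = new TreeMap<>();
    private final ReentrantLock writeLock = new ReentrantLock();
    // Last committed state: an immutable snapshot that readers use without locking.
    private volatile Map<String, String> committed = Map.of();

    /** Unisolated write: exclusive lock on the live view for the duration of the operation. */
    void write(String key, String value) {
        writeLock.lock();
        try {
            live.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }

    /** Read-only unisolated operation: reads the last committed state, never the live view. */
    String readCommitted(String key) {
        return committed.get(key);
    }

    /** Commit point: publish a copy of the live view as the new committed snapshot. */
    void commit() {
        writeLock.lock();
        try {
            committed = Map.copyOf(live);
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        CommitGroupSketch index = new CommitGroupSketch();
        index.write("k", "v1");
        System.out.println(index.readCommitted("k")); // the write is not committed yet
        index.commit();
        System.out.println(index.readCommitted("k")); // visible after the commit point
    }
}
```

Readers in this sketch never block on the write lock, which is the concurrency benefit the paragraph above describes.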
Unisolated write operations MAY be used to achieve "auto-commit" semantics when distributed transactions are not required. Fully isolated transactions are useful when multiple operations must be composed into an ACID unit.
While unisolated operations on a single data service are ACID, clients generally operate against scale-out indices having multiple index partitions hosted on multiple data services. Therefore clients MUST NOT assume that an unisolated operation described by the client against a scale-out index will be ACID when that operation is distributed across the various index partitions relevant to the client's request. In practice, this means that the contract for ACID unisolated operations is limited to either: (a) operations where the data is located on a single data service instance; or (b) unisolated operations that are inherently designed to achieve a consistent result. Sometimes it is sufficient to configure a scale-out index such that index partitions never split some logical unit, for example the {schema + primaryKey} for a SparseRowStore, thereby obtaining an ACID guarantee since operations on a logical row will always occur within the same index partition.
Historical reads are specified using tx, where tx is a timestamp and is associated with the closest commit point less than or equal to the timestamp. A historical read is fully isolated but has very low overhead and does NOT require the caller to open the transaction. The read will have a consistent view of the data as of the most recent commit point not greater than tx. Unlike a distributed read-only transaction, a historical read does NOT impose a distributed read lock. While the operation will have access to the necessary resources on the local data service, it is possible that resources for the same timestamp will be concurrently released on other data services. If you need to map a read operation across the distributed database, then you must use a read-only transaction, which will assert the necessary read lock.
Distributed transactions are coordinated using the ITransactionManagerService service and incur more overhead than both unisolated and historical read operations. Transactions are assigned a start time (the transaction identifier) when they begin and must be explicitly closed by either an abort or a commit. Both read-only and read-write transactions assert read locks which force the retention of resources required for a consistent view as of the transaction start time until the transaction is closed.
Implementations of this interface MUST be thread-safe. Methods declared by this interface MUST block for each operation. Client operations SHOULD be buffered by a thread pool with a FIFO policy so that client requests may be decoupled from data service operations and clients may achieve greater parallelism.
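The rule that a historical read sees the most recent commit point not greater than its timestamp can be sketched with a sorted map of commit times. This is only an illustration under stated assumptions: HistoricalReadSketch, recordCommit, and readAsOf are hypothetical names, and each commit point is reduced to a string label rather than a real index view.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not part of the IDataService API. */
final class HistoricalReadSketch {
    // Commit time -> label standing in for the state committed at that time.
    private final TreeMap<Long, String> commitPoints = new TreeMap<>();

    void recordCommit(long commitTime, String state) {
        commitPoints.put(commitTime, state);
    }

    /**
     * Returns the state for the most recent commit point whose commit time is
     * less than or equal to the timestamp, or null when no such commit exists.
     */
    String readAsOf(long timestamp) {
        Map.Entry<Long, String> entry = commitPoints.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        HistoricalReadSketch history = new HistoricalReadSketch();
        history.recordCommit(10L, "A");
        history.recordCommit(20L, "B");
        System.out.println(history.readAsOf(15L)); // falls back to the commit at time 10
        System.out.println(history.readAsOf(20L)); // an exact match uses that commit
    }
}
```

The sketch does not model resource retention: nothing here prevents an old commit point from being released, which is the gap that the read locks of a read-only transaction close.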
Scale-out indices are transparently broken down into index partitions. When a scale-out index is initially registered, one or more index partitions are created and registered on one or more data services.
Note that each index partition is just an IIndex registered under the name assigned by DataService.getIndexPartitionName(String, int) and whose IndexMetadata.getPartitionMetadata() returns a description of the resources required to compose a view of that index partition from the resources located on a DataService. The IDataService will respond for that index partition IFF there is an index under that name registered on the IDataService as of the timestamp associated with the request. If the index is not registered then a NoSuchIndexException will be thrown. If the index was registered and has since been split, joined or moved then a StaleLocatorException will be thrown (this will occur only for index partitions of scale-out indices). All methods on this and derived interfaces which are defined for an index name and timestamp MUST conform to these semantics.
As index partitions grow in size they may be split into two or more index partitions covering the same key range as the original index partition. When this happens a new index partition identifier is assigned by the metadata service to each of the new index partitions and the old index partition is retired in an atomic operation. A similar operation can move an index partition to a different IDataService in order to load balance a federation. Finally, when two index partitions shrink in size, they may be moved to the same IDataService and an atomic join operation may re-combine them into a single index partition spanning the same key range.
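The split operation described above can be sketched as a sorted map from the left separator key of each partition to its partition identifier. This is an assumption-laden illustration, not the metadata service: PartitionMapSketch, locate, and split are hypothetical names, keys are strings rather than byte[] keys, and synchronized methods stand in for the atomic operation.

```java
import java.util.TreeMap;

/** Hypothetical illustration only; not the metadata index implementation. */
final class PartitionMapSketch {
    // Left separator key (inclusive) -> partition identifier. "" sorts before every other key.
    private final TreeMap<String, Integer> partitions = new TreeMap<>();
    private int nextPartitionId = 0;

    PartitionMapSketch() {
        partitions.put("", nextPartitionId++); // one partition initially covers every key
    }

    /** Finds the identifier of the partition whose key range contains the key. */
    synchronized int locate(String key) {
        return partitions.floorEntry(key).getValue();
    }

    /**
     * Splits the partition containing separatorKey at that key. Both halves get
     * new identifiers and the old identifier is retired in the same step.
     */
    synchronized void split(String separatorKey) {
        String leftKey = partitions.floorKey(separatorKey);
        if (leftKey.equals(separatorKey)) {
            throw new IllegalArgumentException("already a partition boundary: " + separatorKey);
        }
        // Together the two new partitions span the key range of the original partition.
        partitions.put(leftKey, nextPartitionId++);
        partitions.put(separatorKey, nextPartitionId++);
    }

    public static void main(String[] args) {
        PartitionMapSketch map = new PartitionMapSketch();
        int before = map.locate("m");
        map.split("k");
        System.out.println(before + " was retired; 'm' now lives in partition " + map.locate("m"));
    }
}
```

A client that cached the identifier returned before the split would now be holding a stale locator, which is the situation the following paragraphs address.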
Split, join, and move operations all result in the old index partition being dropped on the IDataService. Clients having a stale PartitionLocator record will attempt to reach the now defunct index partition after it has been dropped and will receive a StaleLocatorException.
StaleLocatorException
IDataService clients MUST handle this exception by refreshing their cached PartitionLocator for the key range associated with the index partition which they wish to query and then re-issuing their request. By following this simple rule the client will automatically handle index partition splits, joins, and moves without error and in a manner which is completely transparent to the application. Note that splits, joins, and moves DO NOT alter the PartitionLocator for historical reads, only for ongoing writes. This exception is generally (but not always) wrapped. Applications typically DO NOT write directly to the IDataService interface and therefore DO NOT need to worry about this. See ClientIndexView, which automatically handles this exception.
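The refresh-and-reissue rule can be sketched as a small retry loop. Everything in this example is hypothetical: StaleLocatorRetrySketch, its nested StaleLocator exception (a stand-in declared so the sketch compiles on its own, not the real StaleLocatorException), and the withRefresh helper. Locators are reduced to strings, and the attempt limit is an added assumption to keep the loop bounded.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical illustration only; ClientIndexView is the real implementation of this rule. */
final class StaleLocatorRetrySketch {

    /** Stand-in for the real StaleLocatorException so that the sketch is self-contained. */
    static final class StaleLocator extends RuntimeException {
        StaleLocator(String locator) {
            super("stale locator: " + locator);
        }
    }

    /**
     * Issues the request against the cached locator. When the locator turns out
     * to be stale, refreshes it and re-issues the request, up to maxAttempts times.
     */
    static <R> R withRefresh(String cachedLocator, UnaryOperator<String> refresh,
                             Function<String, R> request, int maxAttempts) {
        String locator = cachedLocator;
        for (int attempt = 1; ; attempt++) {
            try {
                return request.apply(locator);
            } catch (StaleLocator e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the problem to the caller
                }
                locator = refresh.apply(locator);
            }
        }
    }

    public static void main(String[] args) {
        // Partition "p1" was split into "p2": requests against "p1" fail until the refresh.
        String result = withRefresh("p1", stale -> "p2", locator -> {
            if (locator.equals("p1")) {
                throw new StaleLocator(locator);
            }
            return "read from " + locator;
        }, 3);
        System.out.println(result);
    }
}
```

Because the exception may arrive wrapped, a real client would also need to inspect the cause chain before deciding to retry; the sketch omits that step.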
IOException
All methods on this and derived interfaces can throw an IOException. In all cases an unwrapped exception that is an instance of IOException indicates an error in the Remote Method Invocation (RMI) layer.
ExecutionException and InterruptedException
An unwrapped ExecutionException or InterruptedException indicates a problem when running the request as a task in the IConcurrencyManager on the IDataService. The exception always wraps a root cause which may indicate the underlying problem. Methods which do not declare these exceptions are not run under the IConcurrencyManager.
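A caller can apply the exception conventions above by catching each type separately. The following is a minimal sketch using only standard java.util.concurrent classes; RemoteCallErrorSketch, classify, and simulateTaskFailure are hypothetical names, and a local executor stands in for the data service.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; not part of the IDataService API. */
final class RemoteCallErrorSketch {

    /** Invokes the call and reports which layer failed. */
    static String classify(Callable<String> remoteCall) {
        try {
            return "ok: " + remoteCall.call();
        } catch (IOException e) {
            // Unwrapped IOException: treated as a problem in the remote invocation layer.
            return "RMI layer: " + e.getMessage();
        } catch (ExecutionException e) {
            // The task itself failed; the root cause carries the underlying problem.
            return "task failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return "interrupted";
        } catch (Exception e) {
            return "other: " + e;
        }
    }

    /** Runs a failing task on a local executor so that Future.get() wraps its root cause. */
    static String simulateTaskFailure() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> failing = () -> {
                throw new IllegalStateException("no such index");
            };
            Future<String> future = pool.submit(failing);
            return classify(() -> future.get());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(simulateTaskFailure());
        System.out.println(classify(() -> {
            throw new IOException("connection refused");
        }));
    }
}
```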
Modifier and Type | Method and Description |
---|---|
void | dropIndex(String name) - Drops the named index. |
void | forceOverflow(boolean immediate, boolean compactingMerge) - Method sets a flag that will force overflow processing during the next group commit and optionally forces a group commit (Note: This method exists primarily for unit tests and benchmarking activities and SHOULD NOT be used on a deployed federation as the overhead associated with a compacting merge of each index partition can be significant). |
long | getAsynchronousOverflowCounter() - The #of asynchronous overflows that have taken place on this data service (the counter is not restart safe). |
IndexMetadata | getIndexMetadata(String name, long timestamp) - Return the metadata for the named index. |
IQueryPeer | getQueryEngine() - Return the IQueryPeer running on this service. |
boolean | isOverflowActive() - Return true iff the data service is currently engaged in overflow processing. |
boolean | purgeOldResources(long timeout, boolean truncateJournal) - This attempts to pause the service accepting ITx.UNISOLATED writes and then purges any resources that are no longer required based on the StoreManager.Options#MIN_RELEASE_AGE. |
ResultSet | rangeIterator(long tx, String name, byte[] fromKey, byte[] toKey, int capacity, int flags, IFilter filter) - Streaming traversal of keys and/or values in a key range. |
IBlock | readBlock(IResourceMetadata resource, long addr) - Deprecated. This was a first try at adding support for reading low-level records from a journal or index segment in support of the BigdataFileSystem. The API should provide a means to obtain a socket from which record data may be streamed. The client sends the resource identifier (UUID of the journal or index segment) and the address of the record and the data service sends the record data. This is designed for streaming reads of up to 64M or more (a record recorded on the store as identified by the address). |
void | registerIndex(String name, IndexMetadata metadata) - Register a named mutable index on the DataService. |
Future<? extends Object> | submit(Callable<? extends Object> proc) |
<T> Future<T> | submit(long tx, String name, IIndexProcedure<T> proc) - Submit a procedure. |
Methods inherited from interface ITxCommitProtocol: abort, prepare, setReleaseTime, singlePhaseCommit
Methods inherited from interface IService: destroy, getHostname, getServiceIface, getServiceName, getServiceUUID
void registerIndex(String name, IndexMetadata metadata) throws IOException, InterruptedException, ExecutionException
Register a named mutable index on the DataService.
Note: In order to register an index partition the partition metadata property MUST be set. The resources property will then be overridden when the index is actually registered so as to reflect the IResourceMetadata description of the journal on which the index actually resides.
Parameters:
name - The name that can be used to recover the index. In order to create a partition of an index you must form the name of the index partition using DataService.getIndexPartitionName(String, int) (this operation is generally performed by the IMetadataService which manages scale-out indices).
metadata - The metadata describing the index. The LocalPartitionMetadata.getResources() property on the IndexMetadata.getPartitionMetadata() SHOULD NOT be set. The correct IResourceMetadata[] will be assigned when the index is registered on the IDataService.
Throws:
IOException
InterruptedException
ExecutionException
IndexMetadata getIndexMetadata(String name, long timestamp) throws IOException, InterruptedException, ExecutionException
Return the metadata for the named index.
Parameters:
name - The index name.
timestamp - A transaction identifier, ITx.UNISOLATED for the unisolated index view, ITx.READ_COMMITTED, or timestamp for a historical view no later than the specified timestamp.
Throws:
IOException
InterruptedException
ExecutionException
void dropIndex(String name) throws IOException, InterruptedException, ExecutionException
Drops the named index.
Note: In order to drop a partition of an index you must form the name of the index partition using DataService.getIndexPartitionName(String, int) (this operation is generally performed by the IMetadataService which manages scale-out indices).
Parameters:
name - The index name.
Throws:
IllegalArgumentException - if name does not identify a registered index.
IOException
InterruptedException
ExecutionException
ResultSet rangeIterator(long tx, String name, byte[] fromKey, byte[] toKey, int capacity, int flags, IFilter filter) throws InterruptedException, ExecutionException, IOException
Streaming traversal of keys and/or values in a key range.
Note: In order to visit all keys in a range, clients are expected to issue repeated calls in which the fromKey is incremented to the successor of the last key visited until either an empty ResultSet is returned or the ResultSet#isLast() flag is set, indicating that all keys up to (but not including) the toKey have been visited. See ClientIndexView (scale-out indices) and DataServiceTupleIterator (unpartitioned indices), both of which encapsulate this method.
Note: If the iterator can be determined to be read-only and it is submitted as ITx.UNISOLATED then it will be run as ITx.READ_COMMITTED to improve concurrency.
Parameters:
tx - The transaction identifier, or ITx.UNISOLATED IFF the operation is NOT isolated by a transaction, or -tx to read from the most recent commit point not later than the absolute value of tx (a fully isolated read-only transaction using a historical start time).
name - The index name (required).
fromKey - The starting key for the scan (or null iff there is no lower bound).
toKey - The first key that will not be visited (or null iff there is no upper bound).
capacity - When non-zero, this is the maximum #of entries to process.
flags - One or more flags formed by bitwise OR of zero or more of the constants defined by IRangeQuery.
filter - An optional object that may be used to layer additional semantics onto the iterator. The filter will be constructed on the server and in the execution context for the iterator, so it will execute directly against the index for the maximum efficiency.
Throws:
InterruptedException - if the operation was interrupted.
ExecutionException - If the operation caused an error. See Throwable.getCause() for the underlying error.
IOException
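The repeated-call protocol described for rangeIterator can be sketched against an in-memory sorted map. This is an illustration only: ChunkedRangeScanSketch, its nested Chunk class (a stand-in for ResultSet), and the methods rangeChunk, successor, and scanAll are hypothetical names. The sketch assumes keys sort as unsigned byte[] values with a shorter prefix first, so the successor of a key is that key with a zero byte appended, and it requires a positive capacity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical illustration only; ClientIndexView and DataServiceTupleIterator are the real clients. */
final class ChunkedRangeScanSketch {

    /** Stand-in for a ResultSet: one chunk of keys plus a flag marking the end of the range. */
    static final class Chunk {
        final List<byte[]> keys;
        final boolean last;

        Chunk(List<byte[]> keys, boolean last) {
            this.keys = keys;
            this.last = last;
        }
    }

    // Assumed ordering: unsigned lexicographic comparison of byte[] keys.
    private static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;
    private final TreeMap<byte[], String> index = new TreeMap<>(UNSIGNED);

    void put(byte[] key, String value) {
        index.put(key, value);
    }

    /** Returns at most capacity keys from [fromKey, toKey); a null bound means unbounded. */
    Chunk rangeChunk(byte[] fromKey, byte[] toKey, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive in this sketch");
        }
        NavigableMap<byte[], String> view = index;
        if (fromKey != null) view = view.tailMap(fromKey, true);
        if (toKey != null) view = view.headMap(toKey, false);
        List<byte[]> keys = new ArrayList<>();
        for (byte[] key : view.keySet()) {
            if (keys.size() == capacity) {
                return new Chunk(keys, false); // more keys remain in the range
            }
            keys.add(key);
        }
        return new Chunk(keys, true);
    }

    /** Under the assumed ordering, the successor of a key is the key with a 0x00 byte appended. */
    static byte[] successor(byte[] key) {
        return Arrays.copyOf(key, key.length + 1);
    }

    /** Visits every key in [fromKey, toKey) by issuing repeated chunked calls. */
    List<byte[]> scanAll(byte[] fromKey, byte[] toKey, int capacity) {
        List<byte[]> visited = new ArrayList<>();
        byte[] from = fromKey;
        while (true) {
            Chunk chunk = rangeChunk(from, toKey, capacity);
            visited.addAll(chunk.keys);
            if (chunk.keys.isEmpty() || chunk.last) {
                return visited;
            }
            // Resume from the successor of the last key visited.
            from = successor(chunk.keys.get(chunk.keys.size() - 1));
        }
    }

    public static void main(String[] args) {
        ChunkedRangeScanSketch scan = new ChunkedRangeScanSketch();
        for (int i = 1; i <= 5; i++) {
            scan.put(new byte[] {(byte) i}, "v" + i);
        }
        System.out.println(scan.scanAll(null, null, 2).size() + " keys visited");
    }
}
```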
<T> Future<T> submit(long tx, String name, IIndexProcedure<T> proc) throws IOException
Submit a procedure.
Unisolated operations SHOULD be used to achieve "auto-commit" semantics. Fully isolated transactions are useful IFF multiple operations must be composed into an ACID unit.
While unisolated batch operations on a single data service are ACID, clients are required to locate all index partitions for the logical operation and distribute their operation across the distinct data service instances holding the affected index partitions. In practice, this means that the contract for ACID unisolated operations is limited to operations where the data is located on a single data service instance. For ACID operations that cross multiple data service instances the client MUST use a fully isolated transaction.
Parameters:
tx - The transaction identifier, ITx.UNISOLATED for an ACID operation NOT isolated by a transaction, ITx.READ_COMMITTED for a read-committed operation not protected by a transaction (no global read lock), or any valid commit time for a read-historical operation not protected by a transaction (no global read lock).
name - The name of the index partition.
proc - The procedure to be executed.
Returns:
The Future from which the outcome of the procedure may be obtained.
Throws:
RejectedExecutionException - if the task can not be accepted for execution.
IOException - if there is an RMI problem.
Future<? extends Object> submit(Callable<? extends Object> proc) throws RemoteException
Submit a Callable and return its Future. The Callable will execute on the IBigdataFederation.getExecutorService().
Note: This interface is specialized by the IDataService for tasks which need to gain access to the IDataService in order to gain local access to index partitions, etc. Such tasks declare the IDataServiceCallable. For example, scale-out joins use this mechanism.
Specified by:
submit in interface IRemoteExecutor
Returns:
The Future for that task.
Throws:
RemoteException
See Also:
IDataServiceCallable
IBlock readBlock(IResourceMetadata resource, long addr) throws IOException
Deprecated. This was a first try at adding support for reading low-level records from a journal or index segment in support of the BigdataFileSystem.
The API should provide a means to obtain a socket from which record data may be streamed. The client sends the resource identifier (UUID of the journal or index segment) and the address of the record and the data service sends the record data. This is designed for streaming reads of up to 64M or more (a record recorded on the store as identified by the address).
Read a block from the IRawStore described by the IResourceMetadata.
Parameters:
resource - The description of the resource containing that block.
addr - The address of the block in that resource.
Throws:
IllegalArgumentException - if the resource is null
IllegalArgumentException - if the addr is 0L
IllegalStateException - if the resource is not available.
IllegalArgumentException - if the record identified by addr can not be read from the resource.
IOException
void forceOverflow(boolean immediate, boolean compactingMerge) throws IOException, InterruptedException, ExecutionException
Method sets a flag that will force overflow processing during the next group commit and optionally forces a group commit (Note: This method exists primarily for unit tests and benchmarking activities and SHOULD NOT be used on a deployed federation as the overhead associated with a compacting merge of each index partition can be significant).
Normally there is no reason to invoke this method directly. Overflow processing is triggered automatically on a bottom-up basis when the extent of the live journal nears the Options.MAXIMUM_EXTENT.
Parameters:
immediate - The purpose of this argument is to permit the caller to trigger an overflow event even though there are no writes being made against the data service. When true the method will write a token record on the live journal in order to provoke a group commit. In this case synchronous overflow processing will have occurred by the time the method returns. When false a flag is set and overflow processing will occur on the next commit.
compactingMerge - The purpose of this flag is to permit the caller to indicate that a compacting merge should be performed for all indices on the data service (at least, all indices whose data are not simply copied onto the new journal) during the next synchronous overflow. Note that compacting merges of indices are performed automatically from time to time so this flag exists mainly for people who want to force a compacting merge for some reason.
Throws:
IOException
InterruptedException - may be thrown if immediate is true.
ExecutionException - may be thrown if immediate is true.
boolean purgeOldResources(long timeout, boolean truncateJournal) throws IOException, InterruptedException
This attempts to pause the service accepting ITx.UNISOLATED writes and then purges any resources that are no longer required based on the StoreManager.Options#MIN_RELEASE_AGE.
Note: Resources are normally purged during synchronous overflow handling. However, asynchronous overflow handling can cause resources to no longer be needed as new index partition views are defined. This method MAY be used to trigger a release before the next overflow event.
Parameters:
timeout - The timeout (in milliseconds) that the method will await the pause of the write service.
truncateJournal - When true, the live journal will be truncated to its minimum extent (all writes will be preserved but there will be no free space left in the journal). This may be used to force the DataService to its minimum possible footprint for the configured history retention policy.
Returns:
true if successful and false if the write service could not be paused after the specified timeout.
Throws:
IOException
InterruptedException
long getAsynchronousOverflowCounter() throws IOException
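The #of asynchronous overflows that have taken place on this data service (the counter is not restart safe).
Throws:
IOException
boolean isOverflowActive() throws IOException
Return true iff the data service is currently engaged in overflow processing.
Throws:
IOException
IQueryPeer getQueryEngine() throws IOException
Return the IQueryPeer running on this service.
Throws:
IOException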
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.