MutableBucketData (Blazegraph Database Platform 2.1.5 API)

java.lang.Object
- com.bigdata.htree.MutableBucketData

All Implemented Interfaces:

IAbstractNodeData, IKeysData, ILeafData, IDataRecordAccess
```
public class MutableBucketData
extends Object
implements ILeafData
```
Implementation maintains Java objects corresponding to the persistent data and defines methods for a variety of mutations on the ILeafData record which operate by direct manipulation of the Java objects. The bucket page is logically divided into buddy hash buckets. Unlike a B+Tree, the tuples are neither ordered nor dense and the same key MAY appear multiple times within a given buddy bucket.
Note: Overflow pages must be provided for when the #of duplicates in a bucket exceeds the capacity of the bucket (this can be done by conversion of the bucket to a blob record or by chaining the bucket).
Note: package private fields are used so that they may be directly accessed by the HashBucket class.

Version:

$Id: MutableLeafData.java 2265 2009-10-26 12:51:06Z thompsonbry $ TODO Consider mutable implementation based on a compacting record ala GOM. This is especially attractive for the hash tree. The implementation would have to be wholly different from the MutableLeafData class. Instead of managing the IRaba for the keys and values separately, each ITuple would be appended into a byte[] (or IDataRecord). There would be a budget for that backing buffer which is the maximum in-memory size of the bucket. An index would provide random access into the buffer for only those entries which are "live" and a counter is maintain of the #of entries in the buffer which are no longer in use. When the buffer capacity is reached, the buffer is compacted by copying all of the entries accessible from the index into a new buffer and the old buffer is released.
Records which are too large for the buffer should be moved out of line.
This can be used in combination with a dynamically maintained trie for fast hash lookup, or we could just scan the entries.
Lexicon key search can scan the entries using the index. Scanning can have a side-effect in which the position of the entry offsets in the index is swapped if the keys are out of order. This would give us MonetDB style "cracking". The keys would have to be put into full order no later than when the record is made persistent.
Even though mutation is not thread safe, compacting the data record must not cause the assignment of indices to tuples to change when the caller is iterating through the entries by index. TODO Consider disallowing delete markers and version timestamps for the HTree if it is to be used only with in unisolated contexts in which case we can drop support for that stuff from this class.

Author:

Bryan Thompson

Constructor Summary

Constructors
Constructor and Description
`MutableBucketData(int branchingFactor, boolean hasVersionTimestamps, boolean hasDeleteMarkers, boolean hasRawRecords)` Create an empty data record with internal arrays dimensioned for the specified branching factor.
`MutableBucketData(int branchingFactor, ILeafData src)` Copy ctor.

Method Summary

Methods
Modifier and Type	Method and Description
`int`	`capacity()` The capacity of the bucket page (`2^addressBits`).
`AbstractFixedByteArrayBuffer`	`data()` The coded (aka compressed) data.
`boolean`	`getDeleteMarker(int index)` Return `true` iff the entry at the specified index is marked as deleted.
`int`	`getKeyCount()` Return the #of keys in the node or leaf.
`IRaba`	`getKeys()` The object used to contain and manage the keys.
`long`	`getMaximumVersionTimestamp()` The most recent tuple revision timestamp associated with any tuple spanned by this node or leaf.
`long`	`getMinimumVersionTimestamp()` The earliest tuple revision timestamp associated with any tuple spanned by this node or leaf.
`long`	`getNextAddr()` The address of the next leaf in key order, `0L` if it is known that there is no next leaf, and `-1L` if either: (a) it is not known whether there is a next leaf; or (b) it is known but the address of that leaf is not known to the caller.
`long`	`getPriorAddr()` The address of the previous leaf in key order, `0L` if it is known that there is no previous leaf, and `-1L` if either: (a) it is not known whether there is a previous leaf; or (b) it is known but the address of that leaf is not known to the caller.
`long`	`getRawRecord(int index)` Return the address of the raw record on the backing store of the value stored in the tuple having the given index -or- `IAddressManager.NULL` if the value is the actual `byte[]` value associated with the key in the leaf.
`int`	`getValueCount()` The #of values in the leaf (this MUST be equal to the #of keys for a leaf).
`IRaba`	`getValues()` Return the object storing the logical byte[][] containing the values for the leaf.
`long`	`getVersionTimestamp(int index)` The version timestamp for the entry at the specified index.
`boolean`	`hasDeleteMarkers()` Return `true` iff the leaf maintains delete markers.
`boolean`	`hasRawRecords()` Return `true` iff the leaf promotes large `byte[]` values to raw records on the backing store.
`boolean`	`hasVersionTimestamps()` Return `true` iff the leaf maintains version timestamps.
`void`	`insert(int insIndex, byte[] key, byte[] val, boolean rawRecord)` Inserts a new index/value, maintaining version timestamps and delete markers when necessary
`boolean`	`isCoded()` No.
`boolean`	`isDoubleLinked()` No - this class does not support double-linked leaves (only the `IndexSegment` actually uses double-linked leaves).
`boolean`	`isLeaf()` Always returns `true`.
`boolean`	`isReadOnly()` No - this is a mutable data record.
`protected boolean`	`rangeCheckTupleIndex(int index)` Range check a tuple index.
`void`	`remove(int index)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - MutableBucketData
```
public MutableBucketData(int branchingFactor,
                 boolean hasVersionTimestamps,
                 boolean hasDeleteMarkers,
                 boolean hasRawRecords)
```
    Create an empty data record with internal arrays dimensioned for the specified branching factor.
    
    Parameters:
    branchingFactor - The branching factor for the owning B+Tree.
    hasVersionTimestamps - true iff version timestamps will be maintained.
    hasDeleteMarkers - true iff delete markers will be maintained.
    hasRawRecords - true iff raw record markers will be maintained.
  - MutableBucketData
```
public MutableBucketData(int branchingFactor,
                 ILeafData src)
```
    Copy ctor.
    
    Parameters:
    branchingFactor - The branching factor for the owning B+Tree.
    src - The source leaf.
- Method Detail
  - capacity
```
public final int capacity()
```
    The capacity of the bucket page (2^addressBits).
  - rangeCheckTupleIndex
```
protected final boolean rangeCheckTupleIndex(int index)
```
    Range check a tuple index.
    
    Parameters:
    index - The index of a tuple in [0:#capacity()-1].
    
    Returns:
    true
    
    Throws:
    
    IndexOutOfBoundsException - if the index is not in the legal range.
  - isReadOnly
```
public final boolean isReadOnly()
```
    No - this is a mutable data record.
    
    Specified by:
    
    isReadOnly in interface IAbstractNodeData
  - isCoded
```
public final boolean isCoded()
```
    No.
    
    Specified by:
    
    isCoded in interface IAbstractNodeData
  - data
```
public final AbstractFixedByteArrayBuffer data()
```
    Description copied from interface: IAbstractNodeData
    
    The coded (aka compressed) data.
    
    Specified by:
    
    data in interface IAbstractNodeData
    
    Specified by:
    
    data in interface IDataRecordAccess
  - getVersionTimestamp
```
public final long getVersionTimestamp(int index)
```
    Description copied from interface: ILeafData
    
    The version timestamp for the entry at the specified index.
    
    Specified by:
    
    getVersionTimestamp in interface ILeafData
    
    Returns:
    The version timestamp for the index entry.
  - getMinimumVersionTimestamp
```
public final long getMinimumVersionTimestamp()
```
    Description copied from interface: IAbstractNodeData
    
    The earliest tuple revision timestamp associated with any tuple spanned by this node or leaf. If there are NO tuples for the leaf, then this MUST return Long.MAX_VALUE since the initial value of the minimum version timestamp is always the largest possible long integer.
    
    Specified by:
    
    getMinimumVersionTimestamp in interface IAbstractNodeData
  - getMaximumVersionTimestamp
```
public final long getMaximumVersionTimestamp()
```
    Description copied from interface: IAbstractNodeData
    
    The most recent tuple revision timestamp associated with any tuple spanned by this node or leaf. If there are NO tuples for the leaf, then this MUST return Long.MIN_VALUE since the initial value of the maximum version timestamp is always the smallest possible long integer.
    
    Specified by:
    
    getMaximumVersionTimestamp in interface IAbstractNodeData
  - getDeleteMarker
```
public final boolean getDeleteMarker(int index)
```
    Description copied from interface: ILeafData
    
    Return true iff the entry at the specified index is marked as deleted.
    
    Specified by:
    
    getDeleteMarker in interface ILeafData
  - getRawRecord
```
public final long getRawRecord(int index)
```
    Description copied from interface: ILeafData
    
    Return the address of the raw record on the backing store of the value stored in the tuple having the given index -or- IAddressManager.NULL if the value is the actual byte[] value associated with the key in the leaf. When the value is the address of a raw record, the actual byte[] value should be read from the backing store using the decoded address.
    Raw record addresses are created transparently when a large byte[] is associated with a key in the leaf. They are materialized transparently when the tuple associated with the leaf is read. They are deleted when the tuple associated with the leaf is deleted.
    Note: Raw records are managed at the leaf, rather than the IRaba level, because there is not always a backing store associated with an IRaba object. This is similar to how deleted tuples are handled. However, IRabaCoders are responsible for coding the long address stored in the values raba. Raw records are only used for large byte[] values. Highly specialized IRabaCoders can avoid the potential for a conflict with their own coding scheme by ensuring that the index either will not promote large values to raw records or by refraining from inserting large values into the index.
    
    Specified by:
    
    getRawRecord in interface ILeafData
    
    Returns:
    The address of the raw record -or- IAddressManager.NULL
  - getValues
```
public final IRaba getValues()
```
    Description copied from interface: ILeafData
    
    Return the object storing the logical byte[][] containing the values for the leaf. When the leaf maintains delete markers you MUST check whether or not the tuple is deleted before requesting its value.
    Note: When the leaf maintains raw records you MUST check whether or not the value was written onto a raw record before interpreting the data returned by IRaba.get(int). If the length of the value exceeded the configured maximum record length for the index, then the value was written onto a raw record on the backing store and IRaba.get(int) will return the encoded address of that record rather than its data.
    
    Specified by:
    
    getValues in interface ILeafData
    
    See Also:
    ILeafData.hasDeleteMarkers(), ILeafData.getDeleteMarker(int)
  - getKeys
```
public final IRaba getKeys()
```
    Description copied from interface: IKeysData
    
    The object used to contain and manage the keys.
    
    Specified by:
    
    getKeys in interface IKeysData
  - isLeaf
```
public final boolean isLeaf()
```
    Always returns true.
    
    Specified by:
    
    isLeaf in interface IAbstractNodeData
  - getValueCount
```
public final int getValueCount()
```
    Description copied from interface: ILeafData
    
    The #of values in the leaf (this MUST be equal to the #of keys for a leaf).
    
    Specified by:
    
    getValueCount in interface ILeafData
    
    Returns:
    The #of values in the leaf.
  - hasRawRecords
```
public final boolean hasRawRecords()
```
    Description copied from interface: ILeafData
    
    Return true iff the leaf promotes large byte[] values to raw records on the backing store.
    
    Specified by:
    
    hasRawRecords in interface ILeafData
  - hasDeleteMarkers
```
public final boolean hasDeleteMarkers()
```
    Description copied from interface: ILeafData
    
    Return true iff the leaf maintains delete markers.
    
    Specified by:
    
    hasDeleteMarkers in interface ILeafData
  - hasVersionTimestamps
```
public final boolean hasVersionTimestamps()
```
    Description copied from interface: ILeafData
    
    Return true iff the leaf maintains version timestamps.
    
    Specified by:
    
    hasVersionTimestamps in interface IAbstractNodeData
    
    Specified by:
    
    hasVersionTimestamps in interface ILeafData
  - getKeyCount
```
public final int getKeyCount()
```
    Description copied from interface: IKeysData
    
    Return the #of keys in the node or leaf. A node has nkeys+1 children. A leaf has nkeys keys and values. The maximum #of keys for a node is one less than the branching factor of the B+Tree. The maximum #of keys for a leaf is the branching factor of the B+Tree. For a hash bucket, this is the #of entries in the bucket.
    
    Specified by:
    
    getKeyCount in interface IKeysData
    
    Returns:
    The #of defined keys.
  - isDoubleLinked
```
public final boolean isDoubleLinked()
```
    No - this class does not support double-linked leaves (only the IndexSegment actually uses double-linked leaves).
    
    Specified by:
    
    isDoubleLinked in interface ILeafData
  - getNextAddr
```
public final long getNextAddr()
```
    Description copied from interface: ILeafData
    
    The address of the next leaf in key order, 0L if it is known that there is no next leaf, and -1L if either: (a) it is not known whether there is a next leaf; or (b) it is known but the address of that leaf is not known to the caller.
    
    Specified by:
    
    getNextAddr in interface ILeafData
  - getPriorAddr
```
public final long getPriorAddr()
```
    Description copied from interface: ILeafData
    
    The address of the previous leaf in key order, 0L if it is known that there is no previous leaf, and -1L if either: (a) it is not known whether there is a previous leaf; or (b) it is known but the address of that leaf is not known to the caller.
    
    Specified by:
    
    getPriorAddr in interface ILeafData
  - insert
```
public void insert(int insIndex,
          byte[] key,
          byte[] val,
          boolean rawRecord)
```
    Inserts a new index/value, maintaining version timestamps and delete markers when necessary
    
    Parameters:
    insIndex -
    key -
    val -
    rawRecord -
  - remove
```
public void remove(int index)
```
    Parameters:
    index - - slot to be removed

Class MutableBucketData

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Constructor Detail

MutableBucketData

MutableBucketData

Method Detail

capacity

rangeCheckTupleIndex

isReadOnly

isCoded

data

getVersionTimestamp

getMinimumVersionTimestamp

getMaximumVersionTimestamp

getDeleteMarker

getRawRecord

getValues

getKeys

isLeaf

getValueCount

hasRawRecords

hasDeleteMarkers

hasVersionTimestamps

getKeyCount

isDoubleLinked

getNextAddr

getPriorAddr

insert

remove