public interface IDirectoryData extends ITreeNodeData
The hash directory provides an index which maps a subset of the bits in a hash value onto a directory entry. The directory entry provides the storage address of the child page to which a lookup with that hash value would be directed. The directory entry also indicates whether the child is a bucket or another directory. This requires 1 bit per directory entry, which amounts to an overhead of 3% when compared to a record which manages to encode the bucket / directory distinction into the storage address.
The number of entries in a hash directory is a function of the globalDepth of that directory: entryCount := 2^globalDepth. The globalDepth of a child directory is its localDepth in the parent directory. While the localDepth of a persistent child may be computed by scanning and counting the number of references to that child, the copy-on-write policy used to support MVCC for the hash tree requires that the storage address of a dirty child is undefined. Therefore, the localDepth MUST be explicitly stored in the directory record. Assuming 32-bit hash codes, this is a cost of 4 bits per directory entry, which amounts to an 11% overhead when compared to a record which manages to encode that information using a scan of the directory entries.
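The depth relationships above can be sketched in a few lines of Java. This is a minimal illustration, not the bigdata implementation: the class `HashDirectoryMath` and all of its method names are hypothetical, and the choice to index the directory with the most significant unconsumed bits of the hash is an assumption made for the example. The reference-count relationship, 2^(globalDepth - localDepth), is the standard extendible hashing invariant.

```java
// Hypothetical helper: names are illustrative and not part of the bigdata API.
final class HashDirectoryMath {

    /** entryCount := 2^globalDepth. */
    static int entryCount(final int globalDepth) {
        if (globalDepth < 0 || globalDepth > 30)
            throw new IllegalArgumentException("globalDepth=" + globalDepth);
        return 1 << globalDepth;
    }

    /** Number of directory entries referencing one child: 2^(globalDepth - localDepth). */
    static int referenceCount(final int globalDepth, final int localDepth) {
        if (localDepth < 0 || localDepth > globalDepth)
            throw new IllegalArgumentException("localDepth=" + localDepth);
        return entryCount(globalDepth - localDepth);
    }

    /**
     * Recovers localDepth from a scan that counted nrefs references to a child.
     * This only works for persistent children, since the scan needs a defined
     * storage address to compare against.
     */
    static int localDepthFromReferenceCount(final int globalDepth, final int nrefs) {
        if (nrefs <= 0 || Integer.bitCount(nrefs) != 1 || nrefs > entryCount(globalDepth))
            throw new IllegalArgumentException("nrefs=" + nrefs);
        return globalDepth - Integer.numberOfTrailingZeros(nrefs);
    }

    /**
     * Directory slot for a 32-bit hash code. ASSUMPTION: the directory consumes
     * the next globalDepth bits after prefixLength bits already used by
     * ancestor directories, counting from the most significant bit.
     */
    static int slotOf(final int hash, final int prefixLength, final int globalDepth) {
        if (prefixLength < 0 || prefixLength + globalDepth > 32)
            throw new IllegalArgumentException("prefixLength=" + prefixLength);
        if (entryCount(globalDepth) == 1)
            return 0; // a directory of depth zero has a single slot
        return (hash << prefixLength) >>> (32 - globalDepth);
    }

    public static void main(final String[] args) {
        final int globalDepth = 3;
        System.out.println("entries = " + entryCount(globalDepth));
        System.out.println("refs to child at localDepth 1 = " + referenceCount(globalDepth, 1));
        System.out.println("slot = " + slotOf(0xA0000000, 0, 4));
    }
}
```

Storing localDepth explicitly replaces the call to `localDepthFromReferenceCount`, which cannot be evaluated for a dirty child whose address is not yet assigned.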
By far the largest storage cost associated with a directory page is the addresses of the child pages. Bigdata uses long addresses for the IRawStore interface. However, it is possible to get by with int32 addresses when using the RWStore.
Finally, bigdata uses checksums on all data records. Therefore the maximum space available on a 4k page is actually 4096-4 := 4092 bytes. [Yikes! This means that we cannot store a power-of-2 number of addresses efficiently. That means that we really need to use a compressed dictionary representation in order to have efficient storage utilization with good fan out.]
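The arithmetic behind that concern can be worked through with a short sketch. It assumes a 4-byte checksum on a 4096-byte page and counts only the child addresses, ignoring the per-entry depth and type bits discussed earlier. The class `DirectoryPageCapacity` and its method names are hypothetical and exist only for this example.

```java
// Hypothetical calculator: illustrates the page-capacity arithmetic only.
final class DirectoryPageCapacity {

    static final int PAGE_SIZE = 4096;    // bytes in a 4k page
    static final int CHECKSUM_BYTES = 4;  // ASSUMPTION: one 4-byte checksum per record

    /** Bytes left for the record once the checksum is accounted for. */
    static int usableBytes() {
        return PAGE_SIZE - CHECKSUM_BYTES;
    }

    /** Largest number of fixed-width addresses that fit in the usable bytes. */
    static int maxEntries(final int addrBytes) {
        return usableBytes() / addrBytes;
    }

    /** Largest power-of-2 entry count that fits, as an uncompressed directory needs. */
    static int maxPowerOfTwoEntries(final int addrBytes) {
        return Integer.highestOneBit(maxEntries(addrBytes));
    }

    /** Fraction of the full page occupied by addresses at that power-of-2 size. */
    static double utilization(final int addrBytes) {
        return (maxPowerOfTwoEntries(addrBytes) * (double) addrBytes) / PAGE_SIZE;
    }

    public static void main(final String[] args) {
        for (final int addrBytes : new int[] { Long.BYTES, Integer.BYTES }) {
            System.out.println(addrBytes + "-byte addresses: max=" + maxEntries(addrBytes)
                    + ", power of 2=" + maxPowerOfTwoEntries(addrBytes)
                    + ", utilization=" + utilization(addrBytes));
        }
    }
}
```

With 8-byte addresses the page holds at most 511 entries, one short of 512, so an uncompressed directory drops to 256 entries and roughly half the page goes unused. The same effect appears with 4-byte addresses (1023 versus 1024).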
Modifier and Type | Method and Description |
---|---|
byte[] | getOverflowKey() If this is an overflow directory, then there is a single key for which the directory will reference multiple BucketPages storing the associated values. |
boolean | isOverflowDirectory() true iff this is an overflow directory page. |
data, getMaximumVersionTimestamp, getMinimumVersionTimestamp, hasVersionTimestamps, isCoded, isLeaf, isReadOnly
getChildAddr, getChildCount
boolean isOverflowDirectory()
true iff this is an overflow directory page. When a bucket page overflows, an overflow directory page is created as the parent of that bucket page. The children of the overflow directory page may be other overflow directory pages or bucket pages. All bucket pages below an overflow directory page will have the same key. That key is recorded once in each overflow bucket page.
byte[] getOverflowKey()
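As a rough illustration of how a caller might combine the two methods, the sketch below checks whether a directory is an overflow directory for a given key. The interface `OverflowDirectoryView` is a hypothetical stand-in that mirrors only the two documented signatures, and `OverflowLookup.holdsKey` is an invented helper. Because the documentation does not say what getOverflowKey() returns for a directory that is not an overflow directory, the sketch guards the call with isOverflowDirectory().

```java
// Hypothetical stand-in mirroring only the two methods documented for IDirectoryData.
interface OverflowDirectoryView {
    boolean isOverflowDirectory();
    byte[] getOverflowKey();
}

// Hypothetical helper: not part of the bigdata API.
final class OverflowLookup {

    /**
     * Returns true iff dir is an overflow directory whose single key equals
     * the probe key. getOverflowKey() is consulted only for overflow
     * directories, since its result is otherwise unspecified here.
     */
    static boolean holdsKey(final OverflowDirectoryView dir, final byte[] key) {
        return dir.isOverflowDirectory()
                && java.util.Arrays.equals(dir.getOverflowKey(), key);
    }

    public static void main(final String[] args) {
        final byte[] key = { 1, 2, 3 };
        final OverflowDirectoryView dir = new OverflowDirectoryView() {
            @Override public boolean isOverflowDirectory() { return true; }
            @Override public byte[] getOverflowKey() { return key.clone(); }
        };
        System.out.println(holdsKey(dir, new byte[] { 1, 2, 3 }));
    }
}
```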
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.