IndexSegmentBuilder (Blazegraph Database Platform 2.1.5 API)

java.lang.Object
- com.bigdata.btree.IndexSegmentBuilder

All Implemented Interfaces:

Callable<IndexSegmentCheckpoint>
```
public class IndexSegmentBuilder
extends Object
implements Callable<IndexSegmentCheckpoint>
```
Builds an IndexSegment given a source btree and a target branching factor. There are two main use cases:
1. Evicting a key range of an index into an optimized on-disk index. In this case, the input is a BTree that is ideally backed by a fully buffered IRawStore so that no random reads are required.
2. Merging index segments. In this case, the input is typically records emerging from a merge-sort. There are two distinct cases here. In one, we simply have raw records that are being merged into an index. This might occur when merging two key ranges or when external data are being loaded. In the other case we are processing two time-stamped versions of an overlapping key range. In this case, the more recent version may have "delete" markers indicating that a key present in an older version has been deleted in the newer version. Also, key-value entries in the newer version replace (rather than are merged with) key-value entries in the older version. If an entry history policy is defined, then it must be applied here to cause key-value entries whose retention is no longer required by that policy to be dropped.
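The second merge case can be illustrated with a minimal sketch. It assumes string keys and a hypothetical `Entry` type in place of the real tuple classes, and it models only two rules taken from this page: entries in the newer version replace entries in the older version, and delete markers are dropped when a compacting merge is requested and propagated otherwise. It does not model an entry history policy.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative stand-in types only; these are not Blazegraph classes. */
final class MergeSketch {

    /** One version of a tuple; deleted == true models a delete marker. */
    static final class Entry {
        final String value;
        final boolean deleted;

        Entry(String value, boolean deleted) {
            this.value = value;
            this.deleted = deleted;
        }
    }

    /**
     * Merge two versions of an overlapping key range into key order.
     * Entries in the newer version replace entries in the older version.
     * Delete markers are dropped when compactingMerge is true and are
     * propagated to the output otherwise.
     */
    static TreeMap<String, Entry> merge(Map<String, Entry> older,
            Map<String, Entry> newer, boolean compactingMerge) {
        TreeMap<String, Entry> out = new TreeMap<>(older);
        out.putAll(newer); // the newer version wins for any shared key
        if (compactingMerge) {
            out.values().removeIf(e -> e.deleted);
        }
        return out;
    }
}
```

With an older version `{a, b}` and a newer version that deletes `b` and adds `c`, a compacting merge yields `{a, c}`, while an incremental build keeps the delete marker for `b`.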
One pass vs. Two Pass Design Alternatives
There are at least three design alternatives for index segment builds: (A) do an exact range count and generate a perfect plan; (B) fully buffer the source iterator into byte[][] keys, byte[][] vals, boolean[] deleteMarkers, and long[] versionTimestamps and generate an exact plan, consuming the buffered byte[]s directly from RAM; and (C) use the fast range count to generate a plan based on an overestimate of the tuple count and then apply a variety of hacks when the source iterator is exhausted to make the output B+Tree usable, but not well formed.
The disadvantage of (A) is that it requires two passes over the source view, which substantially increases the run time of the algorithm. In addition, the passes can drive evictions in the global LRU and could defeat caching for a view approaching the nominal size for a split. However, with (A) we can do builds for very large source B+Trees. Therefore, (A) is implemented for such use cases.
The disadvantage of (B) is that it requires more memory. However, it is much faster than (A). To compensate for the increased memory demand, we can single thread builds, merges, and splits and fall back to (A) if memory is very tight or the source view is very large.
The disadvantage of (C) is that the "hacks" break encapsulation and leak into the API where operations such as retrieving the right sibling of a node could return an empty leaf (since we ran out of tuples for the plan). Since these "hacks" would break encapsulation, it would be difficult to have confidence that the B+Tree API was fully insulated against the effects of ill-formed IndexSegments. Therefore, I have discarded this approach and backed out changes designed to support it from the code base.
Design alternatives for totally ordered nodes and leaves
In order for the nodes to be written in a contiguous block we either have to buffer them in memory or have to write them onto a temporary file and then copy them into place after the last leaf has been processed. This concern was not present in West's algorithm because it did not attempt to place the nodes and/or leaves contiguously onto the generated B+Tree file.
For the two pass design described above as option (A), the code buffers the nodes and leaves onto TemporaryRawStores. This approach is scalable, which is the concern of (A), but requires at least twice the IO when compared to directly writing the nodes and leaves onto the output file.
When sufficient memory is available, as in cases where (B) would apply, we can write the leaves directly on the backing file (using double-buffering to update the prior/next addrs). Since there are far fewer nodes than leaves, we can buffer the nodes in memory, writing them once the leaves are finished.
Version:

$Id: IndexSegmentBuilder.java 2265 2009-10-26 12:51:06Z thompsonbry $

Author:

Bryan Thompson

See Also:
"Post-order B-Tree Construction" by Lawrence West, ACM 1992 (note that West's algorithm is for a b-tree, where values are stored on internal nodes as well as leaves, not a b+-tree, where values are stored only on the leaves; our implementation is therefore an adaptation), "Batch-Construction of B+-Trees" by Kim and Won, ACM 2001 (the approach outlined by Kim and Won is designed for B+-Trees, but it appears to be less efficient at first glance), IndexSegment, IndexSegmentFile, IndexSegmentCheckpoint

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected static class`	`IndexSegmentBuilder.AbstractSimpleNodeData` Abstract base class for classes used to construct and serialize nodes and leaves written onto the index segment.
`protected static class`	`IndexSegmentBuilder.NOPNodeFactory` Factory does not support node or leaf creation.
`protected static class`	`IndexSegmentBuilder.SimpleLeafData` A class that can be used to (de-)serialize the data for a leaf without any of the logic for operations on the leaf.
`protected static class`	`IndexSegmentBuilder.SimpleNodeData` A class that can be used to (de-)serialize the data for a node without any of the logic for operations on the node.

Field Summary

Fields
Modifier and Type	Field and Description
`protected boolean`	`bufferNodes` When `true` the generated nodes will be fully buffered in RAM.
`long`	`commitTime` The commit time associated with the view from which the `IndexSegment` is being generated (from the ctor).
`boolean`	`compactingMerge` `true` iff the generated `IndexSegment` will incorporate all state for the source index (partition) as of the specified commitTime.
`long`	`elapsed` The process runtime in milliseconds.
`long`	`elapsed_build` The time to write the nodes and leaves into their respective buffers, not including the time to transfer those buffers onto the output file.
`long`	`elapsed_setup` The time to setup the index build, including the generation of the index plan and the initialization of some helper objects.
`long`	`elapsed_write` The time to write the nodes and leaves from their respective buffers onto the output file and synch and close that output file.
`long`	`entryCount` The value specified to the ctor.
`protected static String`	`ERR_NO_TUPLES` Message when the index segment will be empty.
`protected static String`	`ERR_TOO_MANY_TUPLES` Error message when the #of tuples in the `IndexSegment` would exceed `Integer.MAX_VALUE`.
`float`	`mbPerSec` The data throughput rate in megabytes per second.
`IndexMetadata`	`metadata` A copy of the metadata object provided to the ctor.
`protected RandomAccessFile`	`out` The file on which the `IndexSegment` is written.
`File`	`outFile` The file specified by the caller on which the `IndexSegment` is written.
`IndexSegmentPlan`	`plan` The plan for building the B+-Tree.
`UUID`	`segmentUUID` The unique identifier for the generated `IndexSegment` resource.

Constructor Summary

Constructors
Modifier	Constructor and Description
`protected`	`IndexSegmentBuilder(File outFile, File tmpDir, long entryCount, ITupleIterator<?> entryIterator, int m, IndexMetadata metadata, long commitTime, boolean compactingMerge, boolean bufferNodes)` Designated constructor sets up a build of an `IndexSegment` for some caller defined read-only view.

Method Summary

Methods
Modifier and Type	Method and Description
`protected void`	`addChild(IndexSegmentBuilder.SimpleNodeData parent, long childAddr, IndexSegmentBuilder.AbstractSimpleNodeData child)` Record the persistent address of a child on its parent and the #of entries spanned by that child.
`protected void`	`addSeparatorKey(IndexSegmentBuilder.SimpleLeafData leaf)` Copies the first key of a new leaf as a separatorKey for the appropriate parent (if any) of that leaf.
`protected void`	`buildBTree()` Scan the source tuple iterator in key order writing output leaves onto the index segment file with the new branching factor.
`IndexSegmentCheckpoint`	`call()` Build the `IndexSegment` given the parameters specified to the constructor.
`protected void`	`flushNodeOrLeaf(IndexSegmentBuilder.AbstractSimpleNodeData node)` Flush a node or leaf that has been closed (no more data will be added).
`IndexSegmentCheckpoint`	`getCheckpoint()` The `IndexSegmentCheckpoint` record written on the `IndexSegmentStore`.
`protected IndexSegmentBuilder.SimpleNodeData`	`getParent(IndexSegmentBuilder.AbstractSimpleNodeData node)` Return the parent of a node or leaf in the `stack`.
`IResourceMetadata`	`getSegmentMetadata()` The description of the constructed `IndexSegment` resource.
`long`	`getStartTime()` The timestamp in milliseconds when `call()` was invoked -or- ZERO (0L) if `call()` has not been invoked.
`static void`	`main(String[] args)` Driver for index segment build against a named index on a local journal.
`static IndexSegmentBuilder`	`newInstance(File outFile, File tmpDir, long entryCount, ITupleIterator<?> entryIterator, int m, IndexMetadata metadata, long commitTime, boolean compactingMerge, boolean bufferNodes)` A more flexible factory for an `IndexSegment` build which permits override of the index segment branching factor, replacement of the `IndexMetadata`, and the use of the caller's iterator.
`static IndexSegmentBuilder`	`newInstance(ILocalBTreeView src, File outFile, File tmpDir, boolean compactingMerge, long createTime, byte[] fromKey, byte[] toKey)` Builder factory will build an `IndexSegment` from an index (partition).
`static IndexSegmentBuilder`	`newInstance(Object[] a, int alen, IndexMetadata indexMetadata, File outFile, File tmpDir, int m, boolean compactingMerge, long createTime, boolean bufferNodes)` Variant using an array of objects in the desired order.
`protected static IndexSegmentBuilder`	`newInstanceFullyBuffered(ILocalBTreeView src, File outFile, File tmpDir, int m, boolean compactingMerge, long createTime, byte[] fromKey, byte[] toKey, boolean bufferNodes)` A one pass algorithm which materializes the tuples in RAM, computing the exact tuple count as it goes.
`protected static IndexSegmentBuilder`	`newInstanceTwoPass(ILocalBTreeView src, File outFile, File tmpDir, int m, boolean compactingMerge, long createTime, byte[] fromKey, byte[] toKey, boolean bufferNodes)` A two pass build algorithm.
`protected void`	`resetNode(IndexSegmentBuilder.SimpleNodeData parent)` The `stack` contains nodes which are reused for each node or leaf at a given level in the generated B+Tree.
`protected static void`	`usage(String[] args, String msg, int exitCode)` Prints the usage and then exits.
`protected IndexSegmentCheckpoint`	`writeIndexSegment(FileChannel outChannel, long commitTime)` Writes the complete file format for the index segment.
`protected long`	`writeLeaf(IndexSegmentBuilder.SimpleLeafData leaf)` Code the leaf, obtaining its address, update the prior/next addr of the previous leaf, and write that previous leaf onto the output file.
`protected long`	`writeNode(IndexSegmentBuilder.SimpleNodeData node)` Code and write the node onto the `nodeBuffer`.
`protected long`	`writeNodeOrLeaf(IndexSegmentBuilder.AbstractSimpleNodeData node)` Write the node or leaf onto the appropriate output channel.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - ERR_TOO_MANY_TUPLES
```
protected static final String ERR_TOO_MANY_TUPLES
```
    Error message when the #of tuples in the IndexSegment would exceed Integer.MAX_VALUE.
    Note: This is not an inherent limit in the IndexSegment but rather a limit in the IndexSegmentPlan (and perhaps the IndexSegmentBuilder) which presumes that the entry count is an int rather than a long.
    
    See Also:
    Constant Field Values
  - ERR_NO_TUPLES
```
protected static final String ERR_NO_TUPLES
```
    Message when the index segment will be empty.
    
    See Also:
    Constant Field Values
  - outFile
```
public final File outFile
```
    The file specified by the caller on which the IndexSegment is written.
  - entryCount
```
public final long entryCount
```
    The value specified to the ctor.
  - commitTime
```
public final long commitTime
```
    The commit time associated with the view from which the IndexSegment is being generated (from the ctor). This value is written into IndexSegmentCheckpoint.commitTime.
  - compactingMerge
```
public final boolean compactingMerge
```
    true iff the generated IndexSegment will incorporate all state for the source index (partition) as of the specified commitTime.
    Note: This flag is written into the IndexSegmentCheckpoint but it has no other effect on the build process.
  - metadata
```
public final IndexMetadata metadata
```
    A copy of the metadata object provided to the ctor. This object is further modified before being written on the IndexSegmentStore.
  - segmentUUID
```
public final UUID segmentUUID
```
    The unique identifier for the generated IndexSegment resource.
  - out
```
protected RandomAccessFile out
```
    The file on which the IndexSegment is written. The file is closed regardless of the outcome of the operation.
  - bufferNodes
```
protected final boolean bufferNodes
```
    When true the generated nodes will be fully buffered in RAM. Otherwise they will be buffered on the nodeBuffer and then transferred to the output file en masse.
  - plan
```
public final IndexSegmentPlan plan
```
    The plan for building the B+-Tree.
  - elapsed_setup
```
public final long elapsed_setup
```
    The time to setup the index build, including the generation of the index plan and the initialization of some helper objects.
  - elapsed_build
```
public long elapsed_build
```
    The time to write the nodes and leaves into their respective buffers, not including the time to transfer those buffers onto the output file.
  - elapsed_write
```
public long elapsed_write
```
    The time to write the nodes and leaves from their respective buffers onto the output file and synch and close that output file.
  - elapsed
```
public long elapsed
```
    The process runtime in milliseconds.
  - mbPerSec
```
public float mbPerSec
```
    The data throughput rate in megabytes per second.
- Constructor Detail
  - IndexSegmentBuilder
```
protected IndexSegmentBuilder(File outFile,
                   File tmpDir,
                   long entryCount,
                   ITupleIterator<?> entryIterator,
                   int m,
                   IndexMetadata metadata,
                   long commitTime,
                   boolean compactingMerge,
                   boolean bufferNodes)
                       throws IOException
```
    Designated constructor sets up a build of an IndexSegment for some caller defined read-only view.
    
    Note: The caller must determine whether or not deleted index entries are present in the view. The entryCount MUST be the exact #of index entries that are visited by the given iterator. In general, this is not difficult. However, if a compacting merge is desired (that is, if you are trying to generate a view containing only the non-deleted entries) then you MUST explicitly count the #of entries that will be visited by the iterator, i.e., it will require two passes over the iterator to set up the index build operation.
    
    Note: With a branching factor of 4096 a tree of height 2 (three levels) could address 68,719,476,736 entries - well beyond what we want in a given index segment! Well before that the index segment should be split into multiple files. The split point should be determined by the size of the serialized leaves and nodes, e.g., the amount of data on disk required by the index segment and the amount of memory required to fully buffer the index nodes. While the size of a serialized node can be estimated easily, the size of a serialized leaf depends on the kinds of values stored in that index. The actual sizes are recorded in the IndexSegmentCheckpoint record in the header of the IndexSegment.
    
    Parameters:
    outFile - The file on which the index segment is written. The file MAY exist but MUST have zero length if it does exist (this permits you to use the temporary file facility to create the output file).
    tmpDir - The temporary directory in which data are buffered during the build (optional - the default temporary directory is used if this is null).
    entryCount - The #of entries that will be visited by the iterator. This MUST be an exact range count.
    entryIterator - Visits the index entries in key order that will be written onto the IndexSegment.
    m - The branching factor for the generated tree. This can be chosen with an eye to minimizing the height of the generated tree. (Small branching factors are permitted for testing, but generally you want something relatively large.)
    metadata - The metadata record for the source index. A copy will be made of this object. The branching factor in the generated tree will be overridden to m.
    commitTime - The commit time associated with the view from which the IndexSegment is being generated. This value is written into IndexSegmentCheckpoint.commitTime.
    compactingMerge - true iff the generated IndexSegment will incorporate all state for the source index (partition) as of the specified commitTime. This flag is written into the IndexSegmentCheckpoint but does not otherwise affect the build process.
    bufferNodes - When true the generated nodes will be fully buffered in RAM (faster, but imposes a memory constraint). Otherwise they will be written onto a temporary file and then transferred to the output file en masse.
    
    Throws:
    
    IOException
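    The capacity figure quoted in the note above can be checked with a short calculation. The sketch below assumes every node and leaf is completely full, so a tree with branching factor m and a given height addresses m raised to the power (height + 1) entries. The class and method names are hypothetical, and the `fitsInPlan` check only mirrors the `Integer.MAX_VALUE` limit described for `ERR_TOO_MANY_TUPLES`; it is not the builder's own validation code.

```java
/** Back-of-envelope capacity arithmetic; not part of the Blazegraph API. */
final class CapacitySketch {

    /**
     * Upper bound on the #of entries addressable by a B+Tree with
     * branching factor m and the given height (levels = height + 1),
     * assuming every node and every leaf is full.
     */
    static long maxEntries(int m, int height) {
        long n = 1L;
        for (int level = 0; level <= height; level++) {
            n = Math.multiplyExact(n, (long) m); // throws on overflow
        }
        return n;
    }

    /** True iff the entry count can be represented as an int. */
    static boolean fitsInPlan(long entryCount) {
        return entryCount >= 0 && entryCount <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long n = maxEntries(4096, 2);
        System.out.println(n + " entries, fits in plan: " + fitsInPlan(n));
    }
}
```

    For m = 4096 and height 2 this gives 4096 × 4096 × 4096 = 68,719,476,736, which is far larger than `Integer.MAX_VALUE`, so a segment would need to be split long before the tree reached that size.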
- Method Detail
  - getCheckpoint
```
public IndexSegmentCheckpoint getCheckpoint()
```
    The IndexSegmentCheckpoint record written on the IndexSegmentStore.
  - getStartTime
```
public long getStartTime()
```
    The timestamp in milliseconds when call() was invoked -or- ZERO (0L) if call() has not been invoked.
  - newInstance
```
public static IndexSegmentBuilder newInstance(ILocalBTreeView src,
                              File outFile,
                              File tmpDir,
                              boolean compactingMerge,
                              long createTime,
                              byte[] fromKey,
                              byte[] toKey)
                                       throws IOException
```
    Builder factory will build an IndexSegment from an index (partition). Delete markers are propagated to the IndexSegment unless compactingMerge is true.
    
    Parameters:
    src - A view of the index partition as of the createTime. When compactingMerge is false then this MUST be a single BTree since incremental builds are only supported for a BTree source while compacting merges are defined for any IIndex.
    outFile - The file on which the IndexSegment will be written. The file MAY exist, but if it exists then it MUST be empty.
    compactingMerge - When true the caller asserts that src is a FusedView and deleted index entries WILL NOT be included in the generated IndexSegment. Otherwise, it is assumed that only select component(s) of the index partition view are being exported onto an IndexSegment and deleted index entries will therefore be propagated to the new IndexSegment (aka an incremental build).
    createTime - The commit time associated with the view from which the IndexSegment is being generated. This value is written into IndexSegmentCheckpoint.commitTime.
    fromKey - The lowest key that will be included (inclusive). When null there is no lower bound.
    toKey - The first key that will not be included (exclusive). When null there is no upper bound.
    
    Returns:
    An object which can be used to construct the IndexSegment.
    
    Throws:
    
    IOException
  - newInstanceTwoPass
```
protected static IndexSegmentBuilder newInstanceTwoPass(ILocalBTreeView src,
                                     File outFile,
                                     File tmpDir,
                                     int m,
                                     boolean compactingMerge,
                                     long createTime,
                                     byte[] fromKey,
                                     byte[] toKey,
                                     boolean bufferNodes)
                                                 throws IOException
```
    A two pass build algorithm. The first pass is used to obtain an exact entry count for the view. Based on that exact range count we can compute a plan for a balanced B+Tree. A second pass over the view is required to populate the output B+Tree. This flavor also buffers the leaves and nodes on temporary stores, which means that it does more IO. However, this version is capable of processing very large source views.
    
    Throws:
    
    IOException
  - newInstanceFullyBuffered
```
protected static IndexSegmentBuilder newInstanceFullyBuffered(ILocalBTreeView src,
                                           File outFile,
                                           File tmpDir,
                                           int m,
                                           boolean compactingMerge,
                                           long createTime,
                                           byte[] fromKey,
                                           byte[] toKey,
                                           boolean bufferNodes)
                                                       throws IOException
```
    A one pass algorithm which materializes the tuples in RAM, computing the exact tuple count as it goes. This is faster than the two-pass algorithm and is a better choice when the source view and the output index segment are within the normal ranges for an index partition, e.g., an output index segment file of ~200M on the disk.
    FIXME: The unit tests need to run against both builds based on the materialized tuples and builds based on two passes in order to obtain the exact range count. They already do for TestIndexSegmentBuilderWithLargeTrees but not yet for the other test suite variants.
    
    Throws:
    
    IOException
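    The one pass idea, option (B) in the class description, can be sketched as follows. This is a simplified illustration rather than the actual implementation: each source tuple is assumed to be a two-element array holding a key and a value, delete markers and version timestamps are omitted, and the `Buffered` holder is a hypothetical type.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Illustrative only; the real builder works with ITupleIterator. */
final class FullyBufferedSketch {

    /** Tuples materialized in RAM plus the exact count from the same pass. */
    static final class Buffered {
        final byte[][] keys;
        final byte[][] vals;
        final int count;

        Buffered(byte[][] keys, byte[][] vals) {
            this.keys = keys;
            this.vals = vals;
            this.count = keys.length;
        }
    }

    /**
     * Drain the source once. Each element is assumed to be {key, value}.
     * The exact tuple count falls out of the buffering, so no second pass
     * over the source is needed to generate an exact plan.
     */
    static Buffered buffer(Iterator<byte[][]> src) {
        List<byte[]> keys = new ArrayList<>();
        List<byte[]> vals = new ArrayList<>();
        while (src.hasNext()) {
            byte[][] tuple = src.next();
            keys.add(tuple[0]);
            vals.add(tuple[1]);
        }
        return new Buffered(keys.toArray(new byte[0][]),
                vals.toArray(new byte[0][]));
    }
}
```

    The trade-off described in the class documentation is visible here: the whole key range is held in memory at once, which is why the two pass variant remains the fallback for very large source views.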
  - newInstance
```
public static IndexSegmentBuilder newInstance(Object[] a,
                              int alen,
                              IndexMetadata indexMetadata,
                              File outFile,
                              File tmpDir,
                              int m,
                              boolean compactingMerge,
                              long createTime,
                              boolean bufferNodes)
                                       throws IOException
```
    Variant using an array of objects in the desired order. A single root leaf is generated from those objects. The root leaf is then fed into the algorithm to efficiently construct the corresponding read-only IndexSegment.
    
    Parameters:
    a - The array of objects to be written onto the index. The index must know how to generate tuples from these objects. The objects must already be in the natural order of the keys that will be generated for those tuples.
    alen - The #of elements in that array.
    indexMetadata - The IndexMetadata that will serve as the template for the generated IndexSegment.
    outFile - The file on which the IndexSegment will be written. The file MAY exist, but if it exists then it MUST be empty.
    tmpDir - The temporary directory in which data are buffered during the build (optional - the default temporary directory is used if this is null).
    m - The branching factor for the generated IndexSegment.
    compactingMerge - When true the caller asserts that src is a FusedView and deleted index entries WILL NOT be included in the generated IndexSegment. Otherwise, it is assumed that only select component(s) of the index partition view are being exported onto an IndexSegment and deleted index entries will therefore be propagated to the new IndexSegment (aka an incremental build).
    createTime - The commit time associated with the view from which the IndexSegment is being generated. This value is written into IndexSegmentCheckpoint.commitTime.
    bufferNodes - When true the generated nodes will be fully buffered in RAM (faster, but imposes a memory constraint). Otherwise they will be written onto a temporary file and then transferred to the output file en masse.
    
    Returns:
    
    Throws:
    
    IOException - TODO We could pass a flag indicating whether the leaf needs to be sorted after it is generated, but the caller would still be responsible for ensuring that there are no duplicates in the array.
  - newInstance
```
public static IndexSegmentBuilder newInstance(File outFile,
                              File tmpDir,
                              long entryCount,
                              ITupleIterator<?> entryIterator,
                              int m,
                              IndexMetadata metadata,
                              long commitTime,
                              boolean compactingMerge,
                              boolean bufferNodes)
                                       throws IOException
```
    A more flexible factory for an IndexSegment build which permits override of the index segment branching factor, replacement of the IndexMetadata, and the use of the caller's iterator.
    
    Note: The caller must determine whether or not deleted index entries are present in the view. The entryCount MUST be the exact #of index entries that are visited by the given iterator. In general, this is not difficult. However, if a compacting merge is desired (that is, if you are trying to generate a view containing only the non-deleted entries) then you MUST explicitly count the #of entries that will be visited by the iterator, i.e., it will require two passes over the iterator to set up the index build operation.
    
    Note: With a branching factor of 4096 a tree of height 2 (three levels) could address 68,719,476,736 entries - well beyond what we want in a given index segment! Well before that the index segment should be split into multiple files. The split point should be determined by the size of the serialized leaves and nodes, e.g., the amount of data on disk required by the index segment and the amount of memory required to fully buffer the index nodes. While the size of a serialized node can be estimated easily, the size of a serialized leaf depends on the kinds of values stored in that index. The actual sizes are recorded in the IndexSegmentCheckpoint record in the header of the IndexSegment.
    
    Parameters:
    outFile - The file on which the index segment is written. The file MAY exist but MUST have zero length if it does exist (this permits you to use the temporary file facility to create the output file).
    tmpDir - The temporary directory in which data are buffered during the build (optional - the default temporary directory is used if this is null).
    entryCount - The #of entries that will be visited by the iterator. This MUST be an exact range count.
    entryIterator - Visits the index entries in key order that will be written onto the IndexSegment.
    m - The branching factor for the generated tree. This can be chosen with an eye to minimizing the height of the generated tree. (Small branching factors are permitted for testing, but generally you want something relatively large.)
    metadata - The metadata record for the source index. A copy will be made of this object. The branching factor in the generated tree will be overridden to m.
    commitTime - The commit time associated with the view from which the IndexSegment is being generated. This value is written into IndexSegmentCheckpoint.commitTime.
    compactingMerge - true iff the generated IndexSegment will incorporate all state for the source index (partition) as of the specified commitTime. This flag is written into the IndexSegmentCheckpoint but does not otherwise affect the build process.
    bufferNodes - When true the generated nodes will be fully buffered in RAM (faster, but imposes a memory constraint). Otherwise they will be written onto a temporary file and then transferred to the output file en masse.
    
    Throws:
    
    IOException
  - call
```
public IndexSegmentCheckpoint call()
                            throws Exception
```
    Build the IndexSegment given the parameters specified to the constructor.
    
    Specified by:
    
    call in interface Callable<IndexSegmentCheckpoint>
    
    Throws:
    
    Exception
  - buildBTree
```
protected void buildBTree()
```
    Scan the source tuple iterator in key order writing output leaves onto the index segment file with the new branching factor. We also track a stack of nodes that are being written out concurrently on a temporary channel.
    The plan tells us the #of values to insert into each leaf and the #of children to insert into each node. Each time a leaf becomes full (according to the plan), we "close" the leaf, writing it out onto the store and obtaining its "address". The "close" logic also takes care of setting the address on the leaf's parent node (if any). If the parent node becomes filled (according to the plan) then it is also "closed".
    Each time (except the first) that we start a new leaf we record its first key as a separatorKey in the appropriate parent node.
    Note: The root may be a leaf as a degenerate case.
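    The plan-driven leaf filling described above can be sketched in a few lines. The distribution rule used here (spread the entries as evenly as possible over the minimum number of leaves) is an assumption made for illustration; the real `IndexSegmentPlan` may assign entries differently. Keys are plain strings, nodes above the leaf level are omitted, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative leaf-level bulk load; not the IndexSegmentPlan algorithm. */
final class PlanSketch {

    /** #of entries to place in each leaf, assuming an even distribution. */
    static int[] planLeaves(int entryCount, int m) {
        if (entryCount <= 0 || m <= 0) {
            throw new IllegalArgumentException("need tuples and m > 0");
        }
        int nleaves = (entryCount + m - 1) / m; // ceil(entryCount / m)
        int[] perLeaf = new int[nleaves];
        int base = entryCount / nleaves;
        int extra = entryCount % nleaves; // first 'extra' leaves get one more
        for (int i = 0; i < nleaves; i++) {
            perLeaf[i] = base + (i < extra ? 1 : 0);
        }
        return perLeaf;
    }

    /**
     * Scan the keys in order, closing each leaf once it holds the planned
     * #of entries. The first key of every leaf except the first is
     * recorded as a separator key for the parent level.
     */
    static List<String> build(List<String> sortedKeys, int m,
            List<List<String>> leavesOut) {
        int[] plan = planLeaves(sortedKeys.size(), m);
        List<String> separatorKeys = new ArrayList<>();
        int next = 0;
        for (int i = 0; i < plan.length; i++) {
            List<String> leaf =
                    new ArrayList<>(sortedKeys.subList(next, next + plan[i]));
            if (i > 0) {
                separatorKeys.add(leaf.get(0));
            }
            leavesOut.add(leaf); // "close" the leaf
            next += plan[i];
        }
        return separatorKeys;
    }
}
```

    With ten keys `a` through `j` and m = 4, this plan produces three leaves holding 4, 3, and 3 entries, and the separator keys are `e` and `h`. When all keys fit in one leaf, no separator key is produced, which corresponds to the degenerate case where the root is a leaf.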
  - flushNodeOrLeaf
```
protected void flushNodeOrLeaf(IndexSegmentBuilder.AbstractSimpleNodeData node)
```
    Flush a node or leaf that has been closed (no more data will be added).
    
    Note: When a node or leaf is flushed we write it out to obtain its address and set that address on its direct parent using addChild(SimpleNodeData, long, AbstractSimpleNodeData). This also updates the per-child counters of the #of entries spanned by a node.
    
    Parameters:
    node - The node to be flushed.
  - addChild
```
protected void addChild(IndexSegmentBuilder.SimpleNodeData parent,
            long childAddr,
            IndexSegmentBuilder.AbstractSimpleNodeData child)
```
    Record the persistent address of a child on its parent and the #of entries spanned by that child. If all children on the parent become assigned then the parent is closed.
    
    Parameters:
    parent - The parent.
    childAddr - The address of the child (node or leaf).
    child - The child reference.
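    The bookkeeping performed when a child is attached to its parent might look like the sketch below. The `Node` type and its fields are hypothetical simplifications: the child is represented only by its address and the number of entries it spans, and "closing" the parent is reduced to setting a flag rather than flushing it to a store.

```java
/** Illustrative parent bookkeeping; not the SimpleNodeData class. */
final class AddChildSketch {

    static final class Node {
        final long[] childAddr;       // persistent address of each child
        final long[] childEntryCount; // #of entries spanned by each child
        int nchildren;                // #of children assigned so far
        long nentries;                // total #of entries spanned
        boolean closed;               // true once all children are assigned

        Node(int plannedChildren) {
            this.childAddr = new long[plannedChildren];
            this.childEntryCount = new long[plannedChildren];
        }
    }

    /**
     * Record the address of a child and the #of entries it spans on the
     * parent. The parent is closed once the planned #of children is reached.
     */
    static void addChild(Node parent, long childAddr, long childEntries) {
        if (parent.closed) {
            throw new IllegalStateException("parent is already closed");
        }
        int i = parent.nchildren++;
        parent.childAddr[i] = childAddr;
        parent.childEntryCount[i] = childEntries;
        parent.nentries += childEntries;
        if (parent.nchildren == parent.childAddr.length) {
            parent.closed = true;
        }
    }
}
```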
  - resetNode
```
protected void resetNode(IndexSegmentBuilder.SimpleNodeData parent)
```
    The stack contains nodes which are reused for each node or leaf at a given level in the generated B+Tree. This method prepares a node in the stack for reuse.
  - addSeparatorKey
```
protected void addSeparatorKey(IndexSegmentBuilder.SimpleLeafData leaf)
```
    Copies the first key of a new leaf as a separatorKey for the appropriate parent (if any) of that leaf. This must be invoked when the first key is set on that leaf. However, it must not be invoked on the first leaf.
    
    Parameters:
    leaf - The current leaf. The first key on that leaf must be defined.
  - getParent
```
protected IndexSegmentBuilder.SimpleNodeData getParent(IndexSegmentBuilder.AbstractSimpleNodeData node)
```
    Return the parent of a node or leaf in the stack.
    
    Parameters:
    node - The node or leaf.
    
    Returns:
    The parent or null iff node is the root node or leaf.
  - writeNodeOrLeaf
```
protected long writeNodeOrLeaf(IndexSegmentBuilder.AbstractSimpleNodeData node)
```
    Write the node or leaf onto the appropriate output channel.
    
    Returns:
    The address that may be used to read the node or leaf from the file. Note that the address of a node is relative to the start of the node channel and therefore must be adjusted before reading the node from the final index segment file.
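    The adjustment mentioned above amounts to adding the start of the node region to an offset that was recorded relative to the node channel. The sketch below shows only that arithmetic. The parameter names are assumptions, and the real address encoding (see IndexSegmentRegion and IndexSegmentAddressManager) carries more than a raw offset.

```java
/** Illustrative offset arithmetic; not the IndexSegmentAddressManager. */
final class RegionOffsetSketch {

    /**
     * Translate an offset relative to the start of the node channel into
     * an absolute offset in the final index segment file.
     *
     * @param offsetNodes    absolute offset at which the node region starts
     * @param extentNodes    length in bytes of the node region
     * @param relativeOffset offset of the node within the node channel
     */
    static long toAbsolute(long offsetNodes, long extentNodes,
            long relativeOffset) {
        if (relativeOffset < 0 || relativeOffset >= extentNodes) {
            throw new IllegalArgumentException(
                    "offset outside node region: " + relativeOffset);
        }
        return Math.addExact(offsetNodes, relativeOffset);
    }
}
```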
  - writeLeaf
```
protected long writeLeaf(IndexSegmentBuilder.SimpleLeafData leaf)
```
    Code the leaf, obtaining its address, update the prior/next addr of the previous leaf, and write that previous leaf onto the output file.
    Note: For leaf addresses we know the absolute offset into the IndexSegmentStore where the leaf will wind up so we encode the address of the leaf using the IndexSegmentRegion.BASE region.
    Note: In order to write out the leaves as a doubly-linked list with prior-/next-leaf addresses we have to use a "write behind" strategy. Instead of writing out the leaf as soon as it is serialized, we save the uncoded address and a copy of the coded data record on private member fields. When we code the next leaf (or when we learn that there are no more leaves to code because IndexSegmentPlan.nleaves equals nleavesWritten) we patch the coded representation of the prior leaf and write it on the store at the previously obtained address, thereby linking the leaves together in both directions. It is definitely confusing.
    
    Returns:
    The address that may be used to read the leaf from the file backing the IndexSegmentStore.
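    The "write behind" strategy can be sketched as follows. This is an illustration under stated assumptions, not the builder's code: WriteBehindSketch and LeafRecord are hypothetical, a leaf is modelled only by its coded length, 0L stands for "no such leaf", and the caller tells the sketch when the last leaf has been reached.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of write-behind linking of leaves.
final class WriteBehindSketch {
    /** A leaf record with its own address and the addresses of its neighbours. */
    static final class LeafRecord {
        final long addr;
        final long priorAddr;
        long nextAddr; // patched once the next leaf has been coded
        LeafRecord(long addr, long priorAddr) {
            this.addr = addr;
            this.priorAddr = priorAddr;
            this.nextAddr = 0L;
        }
    }

    final List<LeafRecord> written = new ArrayList<>(); // stands in for the output file
    private LeafRecord pending;                         // coded but not yet written
    private long nextOffset;                            // absolute offset of the next leaf

    WriteBehindSketch(long offsetLeaves) {
        this.nextOffset = offsetLeaves;
    }

    /** Codes a leaf, writes the previous leaf now that its next address is known, and returns the new leaf's address. */
    long writeLeaf(int codedLength, boolean isLastLeaf) {
        final long addr = nextOffset;
        nextOffset += codedLength;
        final long priorAddr = (pending == null) ? 0L : pending.addr;
        if (pending != null) {
            pending.nextAddr = addr; // patch the forward link
            written.add(pending);    // only now is the prior leaf written
        }
        pending = new LeafRecord(addr, priorAddr);
        if (isLastLeaf) {
            written.add(pending);    // no successor, so nextAddr stays 0L
            pending = null;
        }
        return addr;
    }
}
```

    Each leaf is written one call late, which is the price for knowing its successor's address before the record reaches the file.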
  - writeNode
```
protected long writeNode(IndexSegmentBuilder.SimpleNodeData node)
```
    Code and write the node onto the nodeBuffer.
    
    Returns:
    A relative address that must be correctly decoded before you can read the compressed node from the file. This value is also set on IndexSegmentBuilder.SimpleNodeData.addr.
    See Also:
    IndexSegmentBuilder.SimpleNodeData, IndexSegmentRegion, IndexSegmentAddressManager
  - writeIndexSegment
```
protected IndexSegmentCheckpoint writeIndexSegment(FileChannel outChannel,
                                       long commitTime)
                                            throws IOException,
                                                   InterruptedException
```
    Writes the complete file format for the index segment. The file is divided up as follows:
    1. fixed length IndexSegmentCheckpoint record (required)
    2. leaves (required)
    3. nodes (may be empty)
    4. the bloom filter (optional)
    5. the IndexMetadata record (required, but extensible)
    The index segment metadata is divided into a base IndexSegmentCheckpoint record with a fixed format containing only essential data and additional metadata records written at the end of the file including the optional bloom filter and the required IndexMetadata record. The latter is where we write variable length metadata including the _name_ of the index, or additional metadata defined by a specific class of index.
    
    Once all nodes and leaves have been buffered we are ready to start writing the data. We skip over a fixed size metadata record, since otherwise we would be unable to pre-compute the offset to the leaves and hence the addresses of the leaves. The node addresses are written in an encoding that requires active translation by the receiver, who must be aware of the offset to the start of the node region. We cannot write the metadata record until we know the offset and length of each of these regions (leaves, nodes, and the bloom filter or other metadata records), since that information is required in order to form their addresses for insertion in the metadata record.
    Parameters:
    outChannel -
    commitTime -
    
    Throws:
    
    IOException
    
    InterruptedException
    
    FIXME: There is no sense of an atomic commit when building a new index segment. We should write zeros into the checkpoint record initially and then, once we are done, seek back to the head of the file and write out the correct checkpoint record.
    Note: There are similar issues involved when we replicate index segment or journal files to verify that they are good.
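    The file layout listed for this method implies that each region's offset is the running sum of the lengths of the regions before it. A minimal sketch, assuming the regions are packed back to back with no padding; LayoutSketch, its field names, and the checkpoint size used in the usage note are hypothetical, not values taken from IndexSegmentCheckpoint.

```java
// Hypothetical sketch: derives region offsets from region lengths, in file order.
final class LayoutSketch {
    final long offsetLeaves;   // leaves start right after the fixed length checkpoint record
    final long offsetNodes;    // nodes follow the leaves (region may be empty)
    final long offsetBloom;    // bloom filter follows the nodes (optional, may be empty)
    final long offsetMetadata; // IndexMetadata record comes last
    final long fileLength;

    LayoutSketch(int checkpointSize, long leavesBytes, long nodesBytes,
                 long bloomBytes, long metadataBytes) {
        offsetLeaves = checkpointSize;
        offsetNodes = offsetLeaves + leavesBytes;
        offsetBloom = offsetNodes + nodesBytes;
        offsetMetadata = offsetBloom + bloomBytes;
        fileLength = offsetMetadata + metadataBytes;
    }
}
```

    For example, new LayoutSketch(1024, 5000, 300, 0, 200) places the leaves at 1024, the nodes at 6024, and the metadata record at 6324. Because the checkpoint size is fixed, the leaf offset is known before anything is written, while the checkpoint record itself can only be filled in after the other offsets are known.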
  - getSegmentMetadata
```
public IResourceMetadata getSegmentMetadata()
```
    The description of the constructed IndexSegment resource.
    
    Throws:
    
    IllegalStateException - if requested before the build operation is complete.
  - usage
```
protected static void usage(String[] args,
         String msg,
         int exitCode)
```
    Prints the usage and then exits.
    
    Parameters:
    args - The command line args.
  - main
```
public static void main(String[] args)
                 throws Exception
```
    Driver for index segment build against a named index on a local journal.
    
    Parameters:
    args - [opts] journal [name]*, where journal is the name of the journal file, where name is the name of a B+Tree registered on that journal, and where opts are any of:
    
    -m #
    
    Override the default branching factor for the index segment.
    
    -alg algorithm
    
    Specify which build algorithm to use. See BuildEnum.
    
    -merge or -build
    
    Specifies whether to do a compacting merge (deleted tuples are purged from the generated index segment) or an incremental build (deleted tuples are preserved). The default is merge.
    
    -O outDir
    
    Specify the name of the directory on which the generated index segment file(s) will be written. This defaults to the current working directory. Each index segment file will be named based on the name of the source index with the .seg extension.
    
    If no names are specified, then an index segment will be generated for each named B+Tree registered on the source journal.
    
    Throws:
    
    Exception
