IKeyBuilder (Blazegraph Database Platform 2.1.5 API)

All Superinterfaces:

IByteArraySlice, IManagedByteArray, ISortKeyBuilder<Object>

All Known Implementing Classes:

KeyBuilder
```
public interface IKeyBuilder
extends ISortKeyBuilder<Object>, IManagedByteArray
```
Interface for building up variable length unsigned byte[] keys from one or more primitive data type values and/or Unicode strings. An instance of this interface may be reset() and reused to encode a series of keys.

A sort key is an unsigned byte[] that preserves the total order of the original data. Sort keys may be formed from multiple fields, but field markers do not appear within the resulting sort key. The original values can be extracted from the fixed length fields of a sort key (such as int, long, float, or double), but they cannot be extracted from Unicode variable length fields: the collation ordering for a Unicode string depends on the Locale, the collation strength, and the decomposition mode, and is a non-reversible operation.

Unicode

Factory methods are defined by KeyBuilder for obtaining instances of this interface that optionally support Unicode. Instances may be created for a given Locale, collation strength, decomposition mode, etc.

The ICU library supports generation of compressed Unicode sort keys and is used by default when available. The JDK java.text package also supports the generation of Unicode sort keys, but it does NOT produce compressed sort keys. The resulting sort keys are therefore (a) incompatible with those produced by the ICU library and (b) much larger than those produced by the ICU library.
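
As a minimal sketch of the JDK fallback described above, the following uses only java.text to produce a Unicode sort key and compares two keys as unsigned bytes. The class name JdkSortKeyDemo and its helper are hypothetical and are not part of KeyBuilder; how KeyBuilder itself wires up the collator is not shown here.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical demo class; it uses only java.text, not ICU or KeyBuilder. */
final class JdkSortKeyDemo {

    /** Returns the JDK collation sort key of s for the given locale and strength. */
    static byte[] sortKey(String s, Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator.getCollationKey(s).toByteArray();
    }

    public static void main(String[] args) {
        byte[] a = sortKey("apple", Locale.US, Collator.TERTIARY);
        byte[] b = sortKey("banana", Locale.US, Collator.TERTIARY);
        // Sort keys are compared as unsigned bytes; they are never decoded back to text.
        System.out.println("apple < banana: " + (Arrays.compareUnsigned(a, b) < 0));
        System.out.println("key length for \"apple\": " + a.length);
    }
}
```

Keys built this way are only comparable with other keys built by the same collator configuration, which is why they cannot be mixed with ICU keys.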

Support for Unicode MAY be disabled using KeyBuilder.Options.COLLATOR, by using KeyBuilder.newInstance() or another factory method that does not enable Unicode support, or by using one of the KeyBuilder constructors that does not support Unicode.

Multi-field keys with variable length fields

Multi-field keys in which variable length fields are embedded within the key present a special problem. Any run of fixed length fields can be compared as unsigned byte[]s. Likewise, any key with a fixed length prefix (including zero length) but a variable length field in its tail can also be compared directly as unsigned byte[]s. However, the introduction of a variable length field into any non-terminal position in a multi-field key must be handled specially since simple concatenation of the field keys will NOT produce the correct total ordering. (This is why SQL requires that text fields compare as if they were padded out with ASCII blanks (0x20) to some maximum length for the field.) A utility method exists specifically for this purpose - see appendText(String, boolean, boolean).
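
The ordering problem can be reproduced with plain byte arrays. The sketch below is illustrative only: the class MultiFieldKeyDemo, its two helpers, and the fixed width of 4 are assumptions made for the example, and the blank padding mimics the SQL rule mentioned above rather than the encoding that appendText actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical demo: a (text, one-byte field) key built two different ways. */
final class MultiFieldKeyDemo {

    /** Naive key: the ASCII bytes of the text immediately followed by the second field. */
    static byte[] naiveKey(String text, int field2) {
        byte[] t = text.getBytes(StandardCharsets.US_ASCII);
        byte[] key = Arrays.copyOf(t, t.length + 1);
        key[t.length] = (byte) field2;
        return key;
    }

    /** Padded key: the text is blank-padded (0x20) to a fixed width before the second field. */
    static byte[] paddedKey(String text, int width, int field2) {
        byte[] t = text.getBytes(StandardCharsets.US_ASCII);
        if (t.length > width) {
            throw new IllegalArgumentException("text longer than field width");
        }
        byte[] key = new byte[width + 1];
        Arrays.fill(key, 0, width, (byte) 0x20);
        System.arraycopy(t, 0, key, 0, t.length);
        key[width] = (byte) field2;
        return key;
    }

    public static void main(String[] args) {
        // Logically ("a", 0x7a) sorts before ("ab", 0x01) because "a" < "ab".
        int naive = Arrays.compareUnsigned(naiveKey("a", 0x7a), naiveKey("ab", 0x01));
        int padded = Arrays.compareUnsigned(paddedKey("a", 4, 0x7a), paddedKey("ab", 4, 0x01));
        System.out.println("naive order correct:  " + (naive < 0));
        System.out.println("padded order correct: " + (padded < 0));
    }
}
```

With naive concatenation the second field's byte (0x7a) is compared against the second character of the longer text (0x62), so the field boundary is lost.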

Version:

$Id$

Author:

Bryan Thompson

See Also:
KeyBuilder.newInstance(), KeyBuilder.newUnicodeInstance(), KeyBuilder.newUnicodeInstance(Properties), SuccessorUtil

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`maxlen` The maximum length of a variable length text field is `65535` (`pow(2,16)-1`).

Method Summary

Methods
Modifier and Type	Method and Description
`IKeyBuilder`	`append(BigDecimal d)` Encode a `BigDecimal` into an unsigned byte[] and append it into the key buffer.
`IKeyBuilder`	`append(BigInteger i)` Encode a `BigInteger` into an unsigned byte[] and append it into the key buffer.
`IKeyBuilder`	`append(byte b)` Appends a byte - the byte is treated as an `unsigned` value.
`IKeyBuilder`	`append(byte[] a)` Appends an array of bytes - the bytes are treated as `unsigned` values.
`IKeyBuilder`	`append(byte[] a, int off, int len)` Append len bytes starting at off in a to the key buffer - the bytes are treated as `unsigned` values.
`IKeyBuilder`	`append(double d)` Appends a double precision floating point value by first converting it into a signed long integer using `Double.doubleToLongBits(double)`, converting that value into a twos-complement number and then appending the bytes in big-endian order into the key buffer.
`IKeyBuilder`	`append(float f)` Appends a single precision floating point value by first converting it into a signed integer using `Float.floatToIntBits(float)`, converting that value into a twos-complement number and then appending the bytes in big-endian order into the key buffer.
`IKeyBuilder`	`append(int v)` Appends a signed integer to the key by first converting it to a lexicographic ordering as an unsigned integer and then appending it into the buffer as 4 bytes using a big-endian order.
`IKeyBuilder`	`append(long v)` Appends a signed long integer to the key by first converting it to a lexicographic ordering as an unsigned long integer and then appending it into the buffer as 8 bytes using a big-endian order.
`IKeyBuilder`	`append(Object val)` Append the value to the buffer, encoding it as appropriate based on the class of the object.
`IKeyBuilder`	`append(short v)` Appends a signed short integer to the key by first converting it to a twos-complement representation supporting unsigned byte[] comparison and then appending it into the buffer as 2 bytes using a big-endian order.
`IKeyBuilder`	`append(String s)` Encodes a Unicode string using the configured `KeyBuilder.Options.COLLATOR` and appends the resulting sort key to the buffer (without a trailing nul byte).
`IKeyBuilder`	`append(UUID uuid)` Appends the UUID to the key using the MSB and then the LSB (this preserves the natural order imposed by `UUID.compareTo(UUID)`).
`IKeyBuilder`	`appendASCII(String s)` Encodes a unicode string by assuming that its contents are ASCII characters.
`IKeyBuilder`	`appendNul()` Append an unsigned zero byte to the key.
`IKeyBuilder`	`appendSigned(byte v)` Converts the signed byte to an unsigned byte and appends it to the key.
`IKeyBuilder`	`appendText(String text, boolean unicode, boolean successor)` Encodes a variable length text field into the buffer.
`byte[]`	`array()` The backing byte[] WILL be transparently replaced if the buffer capacity is extended.
`long[]`	`fromZOrder(int numDimensions)` Inverts `toZOrder(int)` in the sense that it interprets the buffer as a z-order string and returns an array of long values of size numDimensions, reflecting the individual components of the z-order string.
`byte[]`	`getKey()` Return the encoded key.
`boolean`	`isUnicodeSupported()` Return `true` iff Unicode is supported by this object (returns `false` if only ASCII support is configured).
`int`	`len()` The length of the slice is the number of bytes written onto the backing byte[].
`int`	`off()` The offset of the slice into the backing byte[] is always zero.
`IKeyBuilder`	`reset()` Reset the key length to zero before building another key.
`byte[]`	`toByteArray()` An alias for `getKey()`.
`byte[]`	`toZOrder(int numDimensions)` Converts the key into a z-order byte array, assuming numDimensions components of type Long (i.e., 64bit each).

Methods inherited from interface com.bigdata.btree.keys.ISortKeyBuilder
getSortKey

Methods inherited from interface com.bigdata.io.IManagedByteArray
capacity, ensureCapacity, ensureFree

- Field Detail
  - maxlen
```
static final int maxlen
```
    The maximum length of a variable length text field is 65535 (pow(2,16)-1).
    Note: This restriction only applies to multi-field keys where the text field appears in a non-terminal position within the key - that is, as encoded by appendText(String, boolean, boolean). When a text field appears in such a non-terminal position, trailing pad characters are used to maintain lexicographic ordering over the multi-field key.
    
    See Also:
    Constant Field Values
- Method Detail
  - array
```
byte[] array()
```
    Returns the backing byte[]. The backing byte[] WILL be transparently replaced if the buffer capacity is extended, so this method DOES NOT guarantee that the backing array reference will remain constant. Some implementations use an extensible backing byte[] and will replace the reference when the backing buffer is extended.
    
    Specified by:
    
    array in interface IByteArraySlice
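
    The practical consequence is that a reference obtained from array() should not be cached across appends. The sketch below uses a hypothetical GrowableBufferDemo class (not KeyBuilder) with an assumed doubling growth policy to show how an older reference goes stale.

```java
import java.util.Arrays;

/** Hypothetical growable buffer; the doubling policy is an assumption for illustration. */
final class GrowableBufferDemo {

    private byte[] buf = new byte[2];
    private int len = 0;

    /** The current backing array; the reference changes whenever the buffer grows. */
    byte[] array() {
        return buf;
    }

    /** The number of bytes written so far. */
    int len() {
        return len;
    }

    GrowableBufferDemo append(byte b) {
        if (len == buf.length) {
            // Growing allocates a new array, so earlier array() results are now stale.
            buf = Arrays.copyOf(buf, buf.length * 2);
        }
        buf[len++] = b;
        return this;
    }

    public static void main(String[] args) {
        GrowableBufferDemo g = new GrowableBufferDemo();
        byte[] before = g.array();
        g.append((byte) 1).append((byte) 2).append((byte) 3);
        System.out.println("same reference after growth: " + (before == g.array()));
    }
}
```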
  - off
```
int off()
```
    The offset of the slice into the backing byte[], which is always zero. This is the start of the slice in the IByteArraySlice.array().
    
    Specified by:
    
    off in interface IByteArraySlice
  - len
```
int len()
```
    The length of the slice in the IByteArraySlice.array(), which is the number of bytes written onto the backing byte[]. This is set to ZERO (0) by reset(). Note: IByteArraySlice.len() has different semantics for some concrete implementations. ByteArrayBuffer.len() always returns the capacity of the backing byte[] while ByteArrayBuffer.pos() returns the #of bytes written onto the backing buffer. In contrast, KeyBuilder.len() is always the #of bytes written onto the backing buffer.
    
    Specified by:
    
    len in interface IByteArraySlice
  - getKey
```
byte[] getKey()
```
    Return the encoded key. Comparison of keys returned by this method MUST treat the array as an array of unsigned bytes.
    Note that keys are donated to the btree so it is important to allocate new keys when running in the same process space. When using a network api, the api provides the necessary decoupling.
    
    Returns:
    A new array containing the key.
    See Also:
    BytesUtil.compareBytes(byte[], byte[])
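
    Because Java bytes are signed, a comparator for keys has to mask each byte before comparing. The following is a minimal sketch of such a comparison; UnsignedCompareDemo is a hypothetical class written for this example and is not the BytesUtil implementation.

```java
/** Hypothetical demo of lexicographic comparison over unsigned bytes. */
final class UnsignedCompareDemo {

    /** Returns a negative, zero, or positive value as a sorts before, equal to, or after b. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int ai = a[i] & 0xff; // mask to 0..255 so 0x80 sorts after 0x7f
            int bi = b[i] & 0xff;
            if (ai != bi) {
                return ai - bi;
            }
        }
        // All shared bytes are equal: the shorter key (a prefix) sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] lo = { 0x7f };
        byte[] hi = { (byte) 0x80 };
        System.out.println("unsigned: 0x7f < 0x80 is " + (compareUnsigned(lo, hi) < 0));
        System.out.println("signed:   0x7f < 0x80 is " + (lo[0] < hi[0]));
    }
}
```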
  - toByteArray
```
byte[] toByteArray()
```
    An alias for getKey(). Return a copy of the data in the slice.
    
    Specified by:
    
    toByteArray in interface IByteArraySlice
    
    Returns:
    A new array containing data in the slice.
  - reset
```
IKeyBuilder reset()
```
    Reset the key length to zero before building another key.
    
    Specified by:
    
    reset in interface IManagedByteArray
    
    Returns:
    this
  - append
```
IKeyBuilder append(String s)
```
    Encodes a Unicode string using the configured KeyBuilder.Options.COLLATOR and appends the resulting sort key to the buffer (without a trailing nul byte).
    Note: The SuccessorUtil.successor(String) of a string is formed by appending a trailing nul character. However, since IDENTICAL appears to be required to differentiate between a string and its successor (with the trailing nul character), you MUST form the sort key first and then its successor (by appending a trailing nul). Failure to follow this pattern will lead to the successor of the key comparing as EQUAL to the key. For example,
```
            
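            IKeyBuilder keyBuilder = ...;
            
            String s = "foo";
            
            // getKey() returns a new byte[] containing the encoded key.
            byte[] fromKey = keyBuilder.reset().append( s ).getKey();
            
            // right: form the sort key first, then append the nul byte.
            byte[] toKey = keyBuilder.reset().append( s ).appendNul().getKey();
            
            // wrong! The nul character may not survive collation.
            byte[] badToKey = keyBuilder.reset().append( s+"\0" ).getKey();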
            
 
```
    Parameters:
    s - A string.
    
    Returns:
    this
    
    Throws:
    
    UnsupportedOperationException - if Unicode is not supported.
    See Also:
    SuccessorUtil.successor(String), SuccessorUtil.successor(byte[]), FIXME update the javadoc further to speak to handling of multi-field keys.
  - appendText
```
IKeyBuilder appendText(String text,
                     boolean unicode,
                     boolean successor)
```
    Encodes a variable length text field into the buffer. The text is truncated to maxlen characters. The sort keys for strings that differ after truncation solely in the #of trailing pad characters will be identical (trailing pad characters are implicit out to maxlen characters).
    Note: Trailing pad characters are normalized to a representation as a single pad character (1 byte) followed by the #of actual or implied trailing pad characters represented as an unsigned short integer (2 bytes). This technique serves to keep multi-field keys with embedded variable length text fields aligned such that the field following a variable length text field does not bleed into the lexicographic ordering of the variable length text field.
    Note: While the ASCII encoding happens to use one byte for each character that is NOT true of the Unicode encoding. The space requirements for the Unicode encoding depend on the text, the Locale, the collator strength, and the collator decomposition mode.
    Note: The successor option is designed to encapsulate some trickiness around forming the successor of a variable length text field embedded in a multi-field key. In particular, simply appending a nul byte will NOT work (it works fine when the text field is the last field in the key or when it is the only component in the key). This approach breaks encapsulation of the field boundaries such that the resulting "successor" is actually ordered before the original key. This happens because you introduce a 0x0 byte right on the boundary of the next field, effectively causing the next field to have a smaller value. Consider the following example (in hex) where "|" represents the end of the "text" field:
```
     ab cd | 12
 
```
    if you compute the successor by appending a nul byte to the text field you get
```
     ab cd | 00 12
 
```
    which is ordered before the original key!
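    The hex example above can be checked directly with plain byte arrays. This is only a sketch of the failure mode; NaiveSuccessorDemo and its helper are hypothetical names and nothing here uses the IKeyBuilder API.

```java
import java.util.Arrays;

/** Hypothetical demo of the broken "append a nul byte" successor in a multi-field key. */
final class NaiveSuccessorDemo {

    /** Concatenates the text field bytes and the bytes of the following field. */
    static byte[] key(byte[] textField, byte[] nextField) {
        byte[] out = Arrays.copyOf(textField, textField.length + nextField.length);
        System.arraycopy(nextField, 0, out, textField.length, nextField.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] text = { (byte) 0xab, (byte) 0xcd };
        byte[] next = { 0x12 };
        byte[] original = key(text, next);                     // ab cd | 12
        byte[] naive = key(Arrays.copyOf(text, 3), next);      // ab cd 00 | 12
        // The 0x00 lands where the original key has 0x12, so the "successor" sorts first.
        System.out.println("naive successor sorts before original: "
                + (Arrays.compareUnsigned(naive, original) < 0));
    }
}
```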
    Parameters:
    text - The text.
    unicode - When true the text is interpreted as Unicode according to the KeyBuilder.Options.COLLATOR option. Otherwise it is interpreted as ASCII.
    successor - When true, the successor of the text will be encoded. Otherwise the text will be encoded.
    
    Returns:
    The IKeyBuilder.
    See Also:
    http://www.unicode.org/reports/tr10/tr10-10.html#Interleaved_Levels
  - isUnicodeSupported
```
boolean isUnicodeSupported()
```
    Return true iff Unicode is supported by this object (returns false if only ASCII support is configured).
  - appendASCII
```
IKeyBuilder appendASCII(String s)
```
    Encodes a unicode string by assuming that its contents are ASCII characters. For each character, this method simply chops off the high byte and converts the low byte to an unsigned byte.
    Note: This method is potentially much faster than the Unicode aware append(String). However, this method is NOT unicode aware and non-ASCII characters will not be encoded correctly. This method MUST NOT be mixed with keys whose corresponding component is encoded by the unicode aware methods, e.g., append(String).
    
    Parameters:
    s - A String containing US-ASCII characters.
    
    Returns:
    this
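
    The "chop off the high byte" step can be sketched as follows. AsciiEncodeDemo is a hypothetical class written for this example; it illustrates the described behavior rather than reproducing KeyBuilder's code, and it shows why non-ASCII input is not encoded correctly.

```java
import java.util.Arrays;

/** Hypothetical demo: keeps only the low byte of each UTF-16 char. */
final class AsciiEncodeDemo {

    static byte[] encodeAscii(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            // Discard the high byte of the char; only the low 8 bits survive.
            out[i] = (byte) (s.charAt(i) & 0xff);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encodeAscii("foo")));
        // U+0141 has low byte 0x41, so it collides with the ASCII letter 'A'.
        System.out.println(Arrays.equals(encodeAscii("\u0141"), encodeAscii("A")));
    }
}
```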
  - append
```
IKeyBuilder append(byte b)
```
    Appends a byte - the byte is treated as an unsigned value.
    
    Specified by:
    
    append in interface IManagedByteArray
    
    Parameters:
    b - The byte.
    
    Returns:
    this
  - append
```
IKeyBuilder append(byte[] a)
```
    Appends an array of bytes - the bytes are treated as unsigned values.
    
    Specified by:
    
    append in interface IManagedByteArray
    
    Parameters:
    a - The array of bytes.
    
    Returns:
    this
  - append
```
IKeyBuilder append(byte[] a,
                 int off,
                 int len)
```
    Append len bytes starting at off in a to the key buffer - the bytes are treated as unsigned values.
    
    Specified by:
    
    append in interface IManagedByteArray
    
    Parameters:
    off - The offset.
    len - The #of bytes to append.
    a - The array containing the bytes to append.
    
    Returns:
    this
  - append
```
IKeyBuilder append(double d)
```
    Appends a double precision floating point value by first converting it into a signed long integer using Double.doubleToLongBits(double), converting that value into a twos-complement number and then appending the bytes in big-endian order into the key buffer.
    Note: this converts -0d and +0d to the same key.
    
    Parameters:
    d - The double-precision floating point value.
    
    Returns:
    this
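
    One way to realize the steps described above is sketched below. DoubleKeyDemo is a hypothetical class, and the final sign-bit flip is an assumption about how the twos-complement value is made comparable as unsigned bytes; KeyBuilder's actual code may differ in detail.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical demo of an order-preserving 8-byte encoding for doubles. */
final class DoubleKeyDemo {

    static byte[] encode(double d) {
        long v = Double.doubleToLongBits(d);
        if (v < 0) {
            // IEEE 754 is sign-magnitude; convert negatives to twos-complement.
            // This also maps -0d onto the same value as +0d.
            v = Long.MIN_VALUE - v;
        }
        // Flip the sign bit so that unsigned byte order matches signed numeric order.
        v ^= Long.MIN_VALUE;
        // ByteBuffer is big-endian by default.
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(encode(-0d), encode(+0d)));
        System.out.println(Arrays.compareUnsigned(encode(-1.5), encode(2.5)) < 0);
    }
}
```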
  - append
```
IKeyBuilder append(float f)
```
    Appends a single precision floating point value by first converting it into a signed integer using Float.floatToIntBits(float), converting that value into a twos-complement number and then appending the bytes in big-endian order into the key buffer.
    Note: this converts -0f and +0f to the same key.
    
    Parameters:
    f - The single-precision floating point value.
    
    Returns:
    this
  - append
```
IKeyBuilder append(UUID uuid)
```
    Appends the UUID to the key using the MSB and then the LSB (this preserves the natural order imposed by UUID.compareTo(UUID)).
    
    Parameters:
    uuid - The UUID.
    
    Returns:
    this
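
    A minimal sketch of a 16-byte UUID key, MSB first and then LSB. It assumes that each 64-bit half is encoded as a signed long with its sign bit flipped; that assumption and the UuidKeyDemo class are illustrative and are not confirmed by this page.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

/** Hypothetical demo: encodes a UUID as MSB then LSB, each as an order-preserving long. */
final class UuidKeyDemo {

    static byte[] encode(UUID uuid) {
        return ByteBuffer.allocate(16)
                // Assumption: each half uses a sign-bit flip, big-endian.
                .putLong(uuid.getMostSignificantBits() ^ Long.MIN_VALUE)
                .putLong(uuid.getLeastSignificantBits() ^ Long.MIN_VALUE)
                .array();
    }

    public static void main(String[] args) {
        UUID a = new UUID(1L, 5L);
        UUID b = new UUID(1L, 9L);
        int byKey = Arrays.compareUnsigned(encode(a), encode(b));
        System.out.println("key order agrees with compareTo: "
                + (Integer.signum(byKey) == Integer.signum(a.compareTo(b))));
    }
}
```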
  - append
```
IKeyBuilder append(long v)
```
    Appends a signed long integer to the key by first converting it to a lexicographic ordering as an unsigned long integer and then appending it into the buffer as 8 bytes using a big-endian order.
    
    Returns:
    this
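
    A common way to obtain such an ordering is to flip the sign bit before writing the value in big-endian order; the same idea applies to int and short with 4 and 2 bytes. The sketch below assumes that technique. LongKeyDemo is a hypothetical class, and the decode helper illustrates that fixed length fields remain reversible.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical demo of an order-preserving, reversible 8-byte encoding for longs. */
final class LongKeyDemo {

    static byte[] encode(long v) {
        // Flipping the sign bit maps Long.MIN_VALUE to 00..00 and Long.MAX_VALUE to ff..ff.
        return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
    }

    static long decode(byte[] key) {
        // The flip is its own inverse.
        return ByteBuffer.wrap(key).getLong() ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.compareUnsigned(encode(-1L), encode(0L)) < 0);
        System.out.println(decode(encode(42L)));
    }
}
```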
  - append
```
IKeyBuilder append(int v)
```
    Appends a signed integer to the key by first converting it to a lexicographic ordering as an unsigned integer and then appending it into the buffer as 4 bytes using a big-endian order.
    
    Returns:
    this
  - append
```
IKeyBuilder append(short v)
```
    Appends a signed short integer to the key by first converting it to a twos-complement representation supporting unsigned byte[] comparison and then appending it into the buffer as 2 bytes using a big-endian order.
    
    Returns:
    this
  - appendSigned
```
IKeyBuilder appendSigned(byte v)
```
    Converts the signed byte to an unsigned byte and appends it to the key.
    
    Parameters:
    v - The signed byte.
    
    Returns:
    this
  - appendNul
```
IKeyBuilder appendNul()
```
    Append an unsigned zero byte to the key.
    
    Returns:
    this
  - append
```
IKeyBuilder append(BigInteger i)
```
    Encode a BigInteger into an unsigned byte[] and append it into the key buffer.
    The encoding is a 2 byte run length whose leading bit is set iff the BigInteger is negative followed by the byte[] as returned by BigInteger.toByteArray().
    
    Parameters:
    i - The BigInteger value.
    
    Returns:
    this
  - append
```
IKeyBuilder append(BigDecimal d)
```
    Encode a BigDecimal into an unsigned byte[] and append it into the key buffer.
    
    Parameters:
    d - The BigDecimal value.
    
    Returns:
    this
  - append
```
IKeyBuilder append(Object val)
```
    Append the value to the buffer, encoding it as appropriate based on the class of the object. This method handles all of the primitive data types plus UUID and Unicode Strings.
    
    Parameters:
    val - The value.
    
    Returns:
    this
    
    Throws:
    
    IllegalArgumentException - if val is null.
    
    UnsupportedOperationException - if val is an instance of an unsupported class.
  - toZOrder
```
byte[] toZOrder(int numDimensions)
```
    Converts the key into a z-order byte array, assuming numDimensions components of type Long (i.e., 64 bits each). For instance, assume the current key's buffer is 001001011010010001010100 and we call the method with numDimensions=3. The method logically proceeds as follows:
    1. Split the key into numDimensions components, namely: 00100101 10100100 01010100
    2. Merge the components bit by bit: 010 001 110 001 000 111 000 100
    3. The result is this merged array.
    
    Parameters:
    numDimensions - The number of components in the key.
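
    The worked example above uses 8-bit components for brevity, while the method itself assumes 64-bit components. The sketch below reproduces the 8-bit example; ZOrderDemo and interleave are hypothetical names and this is not the library implementation.

```java
import java.util.Arrays;

/** Hypothetical demo: bit-interleaves 8-bit components into a z-order byte array. */
final class ZOrderDemo {

    /** Takes one byte per component and emits bits round-robin, most significant bit first. */
    static byte[] interleave(byte[] comps) {
        int n = comps.length;
        byte[] out = new byte[n]; // n components * 8 bits = n bytes
        int outBit = 0;
        for (int bit = 7; bit >= 0; bit--) {
            for (int c = 0; c < n; c++) {
                if (((comps[c] >> bit) & 1) != 0) {
                    out[outBit >> 3] |= (byte) (0x80 >> (outBit & 7));
                }
                outBit++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 00100101 10100100 01010100, as in the example above.
        byte[] comps = { 0x25, (byte) 0xa4, 0x54 };
        // 010 001 110 001 000 111 000 100 regrouped into bytes is 47 11 c4.
        System.out.println(Arrays.equals(interleave(comps),
                new byte[] { 0x47, 0x11, (byte) 0xc4 }));
    }
}
```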
  - fromZOrder
```
long[] fromZOrder(int numDimensions)
```
    Inverts toZOrder(int) in the sense that it interprets the buffer as a z-order string and returns an array of long values of size numDimensions, reflecting the individual components of the z-order string.
    
    Parameters:
    numDimensions - The number of components encoded in the z-order string.
    
    Returns:
    An array of numDimensions long values, one per component.
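
    The inverse operation can be sketched with 8-bit components instead of the 64-bit longs that the method returns. ZOrderInverseDemo and deinterleave are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

/** Hypothetical demo: splits a z-order byte array back into 8-bit components. */
final class ZOrderInverseDemo {

    /** Assumes each component is 8 bits wide, so z must hold exactly numDimensions bytes. */
    static byte[] deinterleave(byte[] z, int numDimensions) {
        if (z.length != numDimensions) {
            throw new IllegalArgumentException("expected " + numDimensions + " bytes");
        }
        byte[] comps = new byte[numDimensions];
        int totalBits = z.length * 8;
        for (int i = 0; i < totalBits; i++) {
            int bitVal = (z[i >> 3] >> (7 - (i & 7))) & 1; // i-th bit of z, MSB first
            int c = i % numDimensions;                      // component receiving this bit
            int bit = 7 - (i / numDimensions);              // bit position within the component
            comps[c] |= (byte) (bitVal << bit);
        }
        return comps;
    }

    public static void main(String[] args) {
        byte[] z = { 0x47, 0x11, (byte) 0xc4 }; // 010 001 110 001 000 111 000 100
        System.out.println(Arrays.equals(deinterleave(z, 3),
                new byte[] { 0x25, (byte) 0xa4, 0x54 }));
    }
}
```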
