public class IVSolutionSetEncoder extends Object implements IBindingSetEncoder
Solutions are IBindingSets whose bound values are IVs and their cached BigdataValues. The IVs and the cached BigdataValues are efficiently and compactly represented in a format suitable for chunked messages or streaming. Decode is a fast online process. Both encode and decode require the maintenance of a map from the IVs having cached BigdataValues to those cached values.
Each record has the following layout:
nbound nvars ncached (namespace) var[0] ... var[nvars-1] bitmap-for-bound-variables bitmap-for-IVs-with-cached-Values IV[0] ... IV[nbound-1] Value[0] ... Value[ncached-1]
where
nbound is the #of bindings in the binding set. When zero, the rest of the record is omitted.
nvars is the #of new variables in this binding set. The
"schema" used to encode the bindings is based on the ordered set of variables
for which bindings are observed. The encoder writes this information out
incrementally. The decoder builds up this information as it decodes.
ncached is the #of bindings in the binding set for which
there is a cached
BigdataValue which has not already been written
into a previous record. Even if the
IV has a cached
BigdataValue, if the
IV has been previously written into a
record then the
IV is NOT recorded in this record with a cached Value.
Further, if the
IV appears more than once in a given record, the
cached value is only marked in the bitmap for the first such occurrence and
the cached value is only written into the record once.
namespace is the namespace of the lexicon relation. This is written out for the first solution having an IVCache association. It is assumed that all cached BigdataValues are for the same lexicon relation. If no solutions have an IVCache association, then the namespace will never be written into the encoded output.
var is the name of a variable for which a binding was
first observed for the current solution. The names of the variables are
written in the order in which they are first observed. This forms the
implicit "schema" required to decode the
bitmap-for-bound-variables is zero or more bytes providing
a bit map indicating those variables which are bound in this solution out of
the total set of variables which have been observed in the solutions
presented to this encoder.
bitmap-for-IVs-with-cached-Values is zero or more bytes
providing a bit map indicating which IVs are associated with cached values
written into the record. Whether or not an IV has a cached value must be
decided by the caller after processing the record and consulting an
(IV,Value) cache which they maintain over the set of records processed to
date. Cached values are written out (and the bit set) only the first time a
given IV with a cached Value is observed.
BigdataValueSerializer used to decode and materialize the cached
BigdataValues. This information can be sent before the records if it is not known to the caller.
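The incremental "schema" and the bitmap-for-bound-variables described above can be illustrated with a small, self-contained sketch. It is not the Blazegraph implementation: the class `SchemaSketch`, its `Header` type, and the least-significant-bit-first bit order are assumptions made for illustration, and the sketch models only the nbound, var and bitmap-for-bound-variables fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the incremental schema and bound-variable bitmap (hypothetical, not Blazegraph code). */
class SchemaSketch {
    /** Variable name -> bit position, in the order variables were first observed. */
    private final Map<String, Integer> schema = new LinkedHashMap<>();

    /** The header fields computed for one solution. */
    static final class Header {
        final int nbound;            // #of bindings in this solution
        final List<String> newVars;  // variables first observed in this solution
        final byte[] boundBitmap;    // one bit per variable observed so far

        Header(int nbound, List<String> newVars, byte[] boundBitmap) {
            this.nbound = nbound;
            this.newVars = newVars;
            this.boundBitmap = boundBitmap;
        }
    }

    Header encode(Map<String, ?> solution) {
        if (solution.isEmpty()) {
            // nbound == 0: the rest of the record is omitted.
            return new Header(0, new ArrayList<>(), new byte[0]);
        }
        // Extend the schema with any variable not seen in earlier solutions.
        List<String> newVars = new ArrayList<>();
        for (String name : solution.keySet()) {
            if (!schema.containsKey(name)) {
                schema.put(name, schema.size());
                newVars.add(name);
            }
        }
        // Set one bit per bound variable (bit order is an assumption of this sketch).
        byte[] bitmap = new byte[(schema.size() + 7) / 8];
        for (String name : solution.keySet()) {
            int bit = schema.get(name);
            bitmap[bit / 8] |= (byte) (1 << (bit % 8));
        }
        return new Header(solution.size(), newVars, bitmap);
    }
}
```

Encoding a solution binding x and y and then one binding y and z reports only z as a new variable for the second record, while its bitmap is interpreted against all three variables observed so far.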
The decoder materializes the cached values into a map (either a HashMap or HTree, as appropriate for the data scale) as the records are processed. Only one solution needs to be decoded at a time, but the decoder must maintain the (IV,Value) cache across all decoded records. There is no need to indicate the #of records, but IChunkMessage#getSolutionCount() in fact reports exactly that information.
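As a rough sketch of the decoder-side bookkeeping, the following toy class keeps an (IV, Value) cache across records and resolves each IV against it. It assumes IVs can be modelled as `long` identifiers and Values as strings; `CacheDecoderSketch` and its `decode` signature are hypothetical and do not correspond to the Blazegraph decoder API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy decoder state: the (IV, Value) cache survives across records (hypothetical sketch). */
class CacheDecoderSketch {
    private final Map<Long, String> cache = new HashMap<>();

    /**
     * @param ivs    the IVs of one record, in record order
     * @param hasNew hasNew[i] is true if ivs[i] has a value written into this record
     * @param values the values written into this record, in the order of the set flags
     * @return the value resolved for each IV, or null if none was ever cached
     */
    List<String> decode(long[] ivs, boolean[] hasNew, String[] values) {
        // First absorb the values carried by this record into the cache.
        int next = 0;
        for (int i = 0; i < ivs.length; i++) {
            if (hasNew[i]) {
                cache.put(ivs[i], values[next++]);
            }
        }
        // Then resolve every IV, including ones whose value arrived in an earlier record.
        List<String> resolved = new ArrayList<>();
        for (long iv : ivs) {
            resolved.add(cache.get(iv));
        }
        return resolved;
    }
}
```

A later record that mentions an IV without carrying its value still resolves, because the value was cached when it was first written.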
Each solution can be turned into an IBindingSet at the time that it is decoded. If we use a standard ListBindingSet, then we need to resolve each IV against the IV cache, setting its RDF Value as a side effect before returning the IBindingSet to the caller. If we do a custom IBindingSet implementation, then the cached BigdataValue could be lazily materialized by hooking IVCache.getValue(). Either way, the life cycle of the materialized objects will be very short unless they are propagated into new solutions. Short life cycle objects entail very little heap burden.
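The lazy alternative mentioned above can be sketched as follows. This is only an illustration of deferring the cache lookup until the value is first requested; `LazyIVSketch` is a hypothetical stand-in and does not implement the real IVCache interface.

```java
import java.util.Map;

/** Toy IV whose value is looked up in the shared cache only on first access (hypothetical sketch). */
class LazyIVSketch {
    private final long id;
    private final Map<Long, String> cache;
    private String value;   // stays null until getValue() is first called
    int lookups = 0;        // counts cache lookups, for demonstration only

    LazyIVSketch(long id, Map<Long, String> cache) {
        this.id = id;
        this.cache = cache;
    }

    String getValue() {
        if (value == null) {
            // Deferred resolution against the (IV, Value) cache.
            value = cache.get(id);
            lookups++;
        }
        return value;
    }
}
```

Solutions whose values are never inspected pay no lookup cost under this approach, at the price of keeping a reference to the cache in each binding.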
NOTE: the IVSolutionSetEncoder may give us *DIFFERENT* representations for
the same binding set, depending on its internal state. This is relevant
insofar as we cannot perform safe equality checks over encoded values
(the IVBindingSetEncoder provides this guarantee).
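To see why encoded values cannot be compared for equality, consider a toy encoder that, like the format above, writes a variable name only the first time it is observed. `StatefulEncodingSketch` and its one-byte IV are assumptions made for illustration, not the actual wire format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Toy stateful encoder: the output for a binding depends on what was encoded before (hypothetical sketch). */
class StatefulEncodingSketch {
    private final Set<String> seenVars = new HashSet<>();

    /** Encodes a single (variable, IV) binding; the name is written only when first seen. */
    byte[] encode(String var, byte iv) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (seenVars.add(var)) {
            byte[] name = var.getBytes(StandardCharsets.UTF_8);
            out.write(1);                     // nvars = 1: one new variable follows
            out.write(name.length);           // length prefix (names assumed shorter than 256 bytes)
            out.write(name, 0, name.length);
        } else {
            out.write(0);                     // nvars = 0: the schema is unchanged
        }
        out.write(iv);
        return out.toByteArray();
    }
}
```

Encoding the same binding twice with one encoder instance yields two different byte arrays, so byte-wise equality of encoded solutions says nothing about equality of the solutions themselves.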
|Constructor and Description|
|Modifier and Type||Method and Description|
Encode the solution on the stream.
Encode the solution as an
Flush any updates.
Release the state associated with the
public void encodeSolution(DataOutputBuffer out, IBindingSet bset)
out - The stream.
bset - The solution.
public byte[] encodeSolution(IBindingSet bset)
public byte[] encodeSolution(IBindingSet bset, boolean updateCacheIsIgnored)
IVCache associations may be buffered by this method. Use IBindingSetEncoder.flush() to vector any buffered associations.
TODO We typically use a ListBindingSet. If the IBindingSet is large enough, then it would be more efficient to build a variable-to-IV map within this method since we have to look up bindings by variables more than once.
bset - The solution to be encoded.
updateCacheIsIgnored - When true, updates are accumulated for the BigdataValue cache. You must still use IBindingSetEncoder.flush() to vector the accumulated updates.
public void release()
public void flush()
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.