public class PipelineJoinStats extends BaseJoinStats
| Modifier and Type | Field and Description |
|---|---|
| CAT | inputSolutions: The #of input solutions consumed (not just accepted). |
| CAT | outputSolutions: The #of output solutions generated. |

Inherited fields: accessPathChunksIn, accessPathCount, accessPathDups, accessPathRangeCount, accessPathUnitsIn, chunksIn, chunksOut, elapsed, mutationCount, opCount, typeErrors, unitsIn, unitsOut

| Constructor and Description |
|---|
| PipelineJoinStats() |

| Modifier and Type | Method and Description |
|---|---|
| void | add(BOpStats o): Combine the statistics (addition), but do NOT add to self. |
| double | getJoinHitRatio(): The estimated join hit ratio. |
| protected void | toString(StringBuilder sb): Extension hook for BOpStats.toString(). |

public final CAT inputSolutions
Note: This counter is highly correlated with BOpStats.unitsIn but
is incremented only when we begin evaluation of the IAccessPath
associated with a specific input solution.
When PipelineJoin.Annotations.COALESCE_DUPLICATE_ACCESS_PATHS is
true, multiple input binding sets can be mapped onto the
same IAccessPath and this counter will be incremented by the #of
such input binding sets.
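The counting rule for coalesced access paths can be sketched as follows. This is only an illustration under stated assumptions: `CoalesceCountSketch`, `coalesce`, and `countInputSolutions` are hypothetical names, and a plain `String` key stands in for an as-bound IAccessPath. It is not the PipelineJoin implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a String key stands in for an as-bound IAccessPath.
final class CoalesceCountSketch {

    // Group input binding sets by the access path they map onto.
    // The value is the #of input binding sets sharing that access path.
    static Map<String, Integer> coalesce(List<String> asBoundPaths) {
        Map<String, Integer> groups = new LinkedHashMap<>();
        for (String path : asBoundPaths) {
            groups.merge(path, 1, Integer::sum);
        }
        return groups;
    }

    // Each access path is evaluated once, but the counter advances by the
    // #of input binding sets that were mapped onto it.
    static long countInputSolutions(Map<String, Integer> groups) {
        long inputSolutions = 0;
        for (int groupSize : groups.values()) {
            inputSolutions += groupSize;
        }
        return inputSolutions;
    }

    public static void main(String[] args) {
        Map<String, Integer> groups = coalesce(Arrays.asList("p1", "p2", "p1"));
        System.out.println(groups.size() + " access paths, "
                + countInputSolutions(groups) + " input solutions");
    }
}
```

In this sketch, three input binding sets map onto two distinct access paths, so the counter advances by three even though only two access paths are evaluated.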
public final CAT outputSolutions
The #of output solutions generated, used to compute getJoinHitRatio(). Of
necessity, updates to inputSolutions slightly lead updates to
outputSolutions.
Note: This counter is highly correlated with BOpStats.unitsOut.
public double getJoinHitRatio()
The estimated join hit ratio, computed as outputSolutions / inputSolutions. It is ZERO (0) when
inputSolutions is ZERO (0).
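A minimal sketch of this computation, assuming nothing beyond the description above: `JoinHitRatioSketch` is a hypothetical stand-in for this class, and `java.util.concurrent.atomic.LongAdder` substitutes for CAT so that the example is self-contained. The actual method body may differ.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for PipelineJoinStats; LongAdder substitutes for CAT.
final class JoinHitRatioSketch {

    final LongAdder inputSolutions = new LongAdder();
    final LongAdder outputSolutions = new LongAdder();

    // outputSolutions / inputSolutions, or ZERO (0) when no input was consumed.
    double getJoinHitRatio() {
        final long in = inputSolutions.sum();
        if (in == 0) {
            return 0d; // guard against division by zero
        }
        return outputSolutions.sum() / (double) in;
    }
}
```

The cast to `double` matters: dividing two `long` values would truncate ratios below one to zero.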
The join hit ratio is always accurate when the join is fully executed.
However, when a cutoff join is used to estimate the join hit ratio, a
measurement error can be introduced into the join hit ratio if
PipelineJoin.Annotations.COALESCE_DUPLICATE_ACCESS_PATHS is true,
PipelineOp.Annotations.MAX_PARALLEL is GT ONE (1), or
PipelineJoin.Annotations.MAX_PARALLEL_CHUNKS is GT ZERO (0).
When access paths are coalesced, there is an inner loop over the
input solutions mapped onto the same access path. This inner loop
causes inputSolutions to be incremented by the
#of coalesced access paths before any outputSolutions
are counted. Coalescing access paths can therefore cause the join hit
ratio to be underestimated: if the join is cut off while processing a set
of input solutions that were identified as using the same as-bound access
path, more input solutions appear to have been consumed than were actually
applied to produce output solutions.
The worst case can introduce substantial error into the estimated join
hit ratio. Consider a cutoff of 100. If one input solution
generates 100 output solutions and two input solutions are mapped onto
the same access path, then the input count will be 2 and the output count
will be 100, which gives a reported join hit ratio of 100/2
when the actual join hit ratio is 100/1.
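The worked example above can be reproduced with a small simulation. This is a sketch only: `CutoffJoinSketch` and `estimate` are hypothetical names, and the loop is a simplified model of a cutoff join over one group of input solutions that share an access path, not the PipelineJoin code.

```java
// Hypothetical model of a cutoff join over input solutions sharing one access path.
final class CutoffJoinSketch {

    // Returns the reported join hit ratio (outputs / inputs) at the cutoff.
    static double estimate(int inputsOnSamePath, int outputsPerInput,
                           int cutoff, boolean coalesce) {
        long in = 0;
        long out = 0;
        if (coalesce) {
            // Coalesced: all inputs on the access path are counted up front,
            // before any output solution is produced.
            in += inputsOnSamePath;
        }
        outer:
        for (int i = 0; i < inputsOnSamePath; i++) {
            if (!coalesce) {
                in++; // counted only when this input's evaluation begins
            }
            for (int j = 0; j < outputsPerInput; j++) {
                if (++out >= cutoff) {
                    break outer; // the join is cut off here
                }
            }
        }
        return in == 0 ? 0d : out / (double) in;
    }

    public static void main(String[] args) {
        System.out.println(estimate(2, 100, 100, true));  // prints 50.0
        System.out.println(estimate(2, 100, 100, false)); // prints 100.0
    }
}
```

With coalescing, both input solutions are counted before the cutoff is reached on the first one, so the reported ratio is 100/2. Without coalescing, only the first input solution is counted and the ratio is 100/1.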
A similar problem can occur if PipelineOp.Annotations.MAX_PARALLEL or
PipelineJoin.Annotations.MAX_PARALLEL_CHUNKS is GT ONE (1) since input count
can be incremented by the #of threads before any output solutions are
generated. Estimation error can also occur if multiple join tasks are run
in parallel for different chunks of input solutions.
public void add(BOpStats o)
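Description copied from class: BOpStats
Combine the statistics (addition), but do NOT add to self.
Overrides: add in class BaseJoinStats
Parameters: o - Another statistics object.
protected void toString(StringBuilder sb)
Description copied from class: BOpStats
Extension hook for BOpStats.toString().
Overrides: toString in class BaseJoinStats
Parameters: sb - Where to write the additional state.
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.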