public class JGraph extends Object
JoinGraph
bears some
similarity to ROX (Runtime Optimizer for XQuery), but has several significant
differences:
JoinGraph
starts with one or more low cardinality
vertices.JoinGraph
extends all vertices having unexplored edges in each
breadth first expansion.JoinGraph
is
designed to prune all join paths which are known to be dominated by other
join paths for the same set of vertices in each round and iterates until a
join path is identified which uses all vertices and has the minimum expected
cumulative estimated cardinality. Join paths which survive pruning are
re-sampled as necessary in order to obtain better information about edges in
join paths which have a low estimated cardinality in order to address a
problem with underflow of the cardinality estimates.Modifier and Type | Method and Description |
---|---|
Path[] |
chooseStartingPaths(int nedges,
Path[] paths)
Choose the starting vertices.
|
Path[] |
expand(QueryEngine queryEngine,
int limitIn,
int round,
Path[] a,
Map<PathIds,EdgeSample> edgeSamples)
Do one breadth first expansion.
|
int[] |
getOrder(Path p)
Return a permutation vector which may be used to reorder the given
IPredicate [] into the evaluation order selected by the
runtime query optimizer. |
Vertex |
getVertex(int bopId)
|
List<Vertex> |
getVertices() |
Path[] |
pruneJoinPaths(Path[] a,
Map<PathIds,EdgeSample> edgeSamples)
Prune paths which are dominated by other paths.
|
protected int |
resamplePaths(QueryEngine queryEngine,
int limitIn,
int round,
Path[] a,
Map<PathIds,EdgeSample> edgeSamples)
Resample the initial vertices for the specified join paths and then
resample the cutoff join for each given join path in path order.
|
Path[] |
round0(QueryEngine queryEngine,
int limit,
int nedges)
Choose up to nedges edges to be the starting point.
|
Path |
runtimeOptimizer(QueryEngine queryEngine,
Map<PathIds,EdgeSample> edgeSamples)
Find a good join path in the data given the join graph.
|
static String |
showPath(Path x,
Map<PathIds,EdgeSample> edgeSamples)
Show the details of a join path, including the estimated cardinality and
join hit ratio for each step in the path.
|
static String |
showTable(Path[] a)
Comma delimited table showing the estimated join hit ratio, the estimated
cardinality, and the set of vertices for each of the specified join
paths.
|
static String |
showTable(Path[] a,
Path[] pruned)
Return a comma delimited table showing the estimated join hit ratio, the
estimated cardinality, and the set of vertices for each of the specified
join paths.
|
static String |
showTable(Path[] a,
Path[] pruned,
Map<PathIds,EdgeSample> edgeSamples)
Return a comma delimited table showing the estimated join hit ratio, the
estimated cardinality, and the set of vertices for each of the specified
join paths.
|
String |
toString() |
public JGraph(JoinGraph joinGraph)
joinGraph
- The pipeline operator that is executing the RTO. This defines
the join graph (vertices, edges, and constraints) and also
provides access to the AST and related metadata required to
execute the join graph.IllegalArgumentException
- if the joinGraph is null
.IllegalArgumentException
- if the JoinGraph.getVertices()
is null
.IllegalArgumentException
- if the JoinGraph.getVertices()
is an empty array.IllegalArgumentException
- if any element of the JoinGraph.getVertices()
is
null
.IllegalArgumentException
- if any constraint uses a variable which is never bound by the
given predicates.IllegalArgumentException
- if sampleType is null
.public Path runtimeOptimizer(QueryEngine queryEngine, Map<PathIds,EdgeSample> edgeSamples) throws Exception, NoSolutionsException
queryEngine
- The query engine.edgeSamples
- A map that will be populated with the samples associated with
each non-pruned join path. This map is used to associate join
path segments (expressed as an ordered array of bopIds) with
edge sample to avoid redundant effort.NoSolutionsException
- If there are no solutions for the join graph in the data (the
query does not have any results).IllegalArgumentException
- if queryEngine is null
.IllegalArgumentException
- if limit is non-positive.IllegalArgumentException
- if nedges is non-positive.Exception
- TODO It is possible that this could throw a
NoSolutionsException
if the cutoff joins do not use a
large enough sample to find a join path which produces at
least one solution (except that no solutions for an optional
join do not cause the total to fail, nor do no solutions for
some part of a UNION).
TODO We need to automatically increase the depth of search
for queries where we have cardinality estimation underflows
or punt to another method to decide the join order.
TODO RTO: HEAP MANAGMENT : The edgeSamples map holds
references to the cutoff join samples. To ensure that the map
has the minimum heap footprint, it must be scanned each time
we prune the set of active paths and any entry which is not a
prefix of an active path should be removed.
TODO RTO: MEMORY MANAGER : When an entry is cleared from this
map, the corresponding allocation in the memory manager (if
any) must be released. The life cycle of the map needs to be
bracketed by a try/finally in order to ensure that all
allocations associated with the map are released no later
than when we leave the lexicon scope of that clause.public int[] getOrder(Path p)
IPredicate
[] into the evaluation order selected by the
runtime query optimizer.IllegalArgumentException
- if the argument is null
.IllegalArgumentException
- if the given Path
does not cover all vertices in
the join graph.public Path[] chooseStartingPaths(int nedges, Path[] paths)
nedges
- The maximum #of edges to choose.paths
- The set of possible initial paths to choose from.public Path[] round0(QueryEngine queryEngine, int limit, int nedges) throws Exception
Note: An edge can not serve as a starting point for exploration if it uses variables (for example, in a CONSTRAINT) which are not bound by either vertex (since the variable(s) are not bound, the constraint would always fail).
queryEngine
- The query engine.limit
- The cutoff used when sampling the vertices and when sampling
the edges.nedges
- The maximum #of edges to choose. Those having the smallest
expected cardinality will be chosen.Exception
protected int resamplePaths(QueryEngine queryEngine, int limitIn, int round, Path[] a, Map<PathIds,EdgeSample> edgeSamples) throws Exception
queryEngine
- The query engine.limitIn
- The original limit.round
- The round number in [1:n].a
- The set of paths from the previous round. For the first round,
this is formed from the initial set of edges to consider.edgeSamples
- A map used to associate join path segments (expressed as an
ordered array of bopIds) with EdgeSample
s to avoid
redundant effort.Exception
public Path[] expand(QueryEngine queryEngine, int limitIn, int round, Path[] a, Map<PathIds,EdgeSample> edgeSamples) throws Exception
queryEngine
- The query engine.limitIn
- The original limit.round
- The round number in [1:n].a
- The set of paths from the previous round. For the first round,
this is formed from the initial set of edges to consider.edgeSamples
- A map used to associate join path segments (expressed as an
ordered array of bopIds) with EdgeSample
s to avoid
redundant effort.Exception
public Vertex getVertex(int bopId)
bopId
- The bop identifier.Vertex
-or- null
if there is no such
vertex in the join graph.public Path[] pruneJoinPaths(Path[] a, Map<PathIds,EdgeSample> edgeSamples)
If there is a path, [p] != [p1], where [p] is an unordered variant of [p1] (that is the vertices of p are the same as the vertices of p1), and the cumulative cost of [p] is LTE the cumulative cost of [p1], then [p] dominates (or is equivalent to) [p1] and p1 should be pruned.
a
- A set of paths.edgeSamples
- The set of samples for path segments. Samples which are no
longer in use after pruning will be cleared from the map and
their materialized solution sets will be discarded.public static String showTable(Path[] a)
a
- An array of join paths.public static String showTable(Path[] a, Path[] pruned)
a
- A set of paths (typically those before pruning).pruned
- The set of paths after pruning (those which were retained)
(optional). When given, the paths which were pruned are marked
in the table.public static String showTable(Path[] a, Path[] pruned, Map<PathIds,EdgeSample> edgeSamples)
a
- A set of paths (typically those before pruning).pruned
- The set of paths after pruning (those which were retained)
(optional). When given, the paths which were pruned are marked
in the table.edgeSamples
- When non-null
, the details will be shown (using
showPath(Path, Map)
) for each path that is
experiencing cardinality estimate underflow (no solutions in
the data).public static String showPath(Path x, Map<PathIds,EdgeSample> edgeSamples)
p
- The join path (required).edgeSamples
- A map containing the samples utilized by the Path
(required).Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.