public class NativeDistinctFilter extends BOpFilterBase
A DISTINCT filter for SPOs.
Note: While highly scalable, this class will absorb a minimum of one direct buffer per use. This is because we do not have access to the memory manager of the IRunningQuery on which the distinct filter is being run. For this reason, it allocates a private MemStore and uses a finalizer pattern to ensure the eventual release of that MemStore and the backing direct buffers.
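As a rough illustration of the finalizer pattern mentioned in the note above, the sketch below pairs an explicit, idempotent release with a finalizer that acts as a safety net. `PrivateStoreSketch` is a hypothetical stand-in and not the MemStore API; it only assumes that the store owns a direct buffer that must eventually be let go. Note that `finalize()` is deprecated in recent Java versions, so this is shown only to explain the idea.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a filter-private store backed by one direct buffer. */
final class PrivateStoreSketch {

    // The backing direct buffer (assumption: one buffer per store instance).
    private ByteBuffer buffer;

    // Guards against releasing the buffer more than once.
    private final AtomicBoolean released = new AtomicBoolean(false);

    PrivateStoreSketch(int capacityBytes) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    boolean isReleased() {
        return released.get();
    }

    /** Explicit release; safe to call more than once. */
    void release() {
        if (released.compareAndSet(false, true)) {
            // Drop the reference so the direct buffer becomes reclaimable.
            buffer = null;
        }
    }

    /** Safety net: runs only if and when the garbage collector finalizes this object. */
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            release();
        } finally {
            super.finalize();
        }
    }
}
```

The finalizer gives no guarantee about when the release happens, which fits the "eventual release" wording of the note.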
Note: This cannot be used with pipelined joins because one instance would be allocated per as-bound evaluation of the pipelined join.
Note: You can switch the code between the HTree and the BTree by modifying only a few lines. See the comments in the file.
TODO Reads against the index will eventually degrade because we cannot use ordered reads: the iterator filter pattern itself is not vectored. We might be able to fix this with a chunked filter pattern. Otherwise, fixing this will require a more significant refactor.
TODO It would be nicer if we left the MRU 10k in the map and evicted the LRU 10k each time the map reached 20k. This cannot be done with the LinkedHashMap as its API is not sufficient for this purpose. However, similar batch LRU update classes have been defined in the com.bigdata.cache package and could be adapted here for that purpose.
| Modifier and Type | Class and Description |
|---|---|
| static interface | NativeDistinctFilter.Annotations |
| static class | NativeDistinctFilter.DistinctFilterImpl |
DEFAULT_INITIAL_CAPACITY
| Constructor and Description |
|---|
| NativeDistinctFilter(BOp[] args, Map<String,Object> annotations) Required shallow copy constructor. |
| NativeDistinctFilter(NativeDistinctFilter op) Constructor required for com.bigdata.bop.BOpUtility#deepCopy(FilterNode). |
| Modifier and Type | Method and Description |
|---|---|
| protected Iterator | filterOnce(Iterator src, Object context) Wrap the source iterator with this filter. |
| static int[] | getFilterKeyOrder(SPOKeyOrder indexKeyOrder) Return the 3-component key order which has the best locality given that the SPOs will be arriving in the natural order of the indexKeyOrder. |
| static NativeDistinctFilter | newInstance(SPOKeyOrder indexKeyOrder) An instance using the default configuration for the in-memory hash map. |
public NativeDistinctFilter(NativeDistinctFilter op)
Constructor required for com.bigdata.bop.BOpUtility#deepCopy(FilterNode).

public static NativeDistinctFilter newInstance(SPOKeyOrder indexKeyOrder)
indexKeyOrder - The natural order in which the ISPOs will arrive at this filter. This is used to decide on the filter key order which will have the best locality given the order of arrival.

protected final Iterator filterOnce(Iterator src, Object context)
filterOnce in class BOpFilterBase
src - The source iterator.
context - The iterator evaluation context.

public static int[] getFilterKeyOrder(SPOKeyOrder indexKeyOrder)
The return value is an int[3]. The index is the ordinal position of the triples mode key component for the filter keys. The value at that index is the position in the SPOKeyOrder of the quads mode index whose natural order determines the order of arrival of the ISPO objects at this filter.
Thus, given indexKeyOrder = SPOKeyOrder.CSPO, the array:
int[] = {1,2,3}
would correspond to the filter key order SPO, which is the best possible filter key order for the natural order of the SPOKeyOrder.CSPO index.
Note, however, that key orders can be expressed in this manner which are not defined by SPOKeyOrder. For example, given SPOKeyOrder.PCSO the best filter key order is PSO. While there is no PSO key order declared by the SPOKeyOrder class, we can use
int[] = {0,2,3}
which models the PSO key order for the purposes of this class.
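The two mappings described above can be reproduced with a small, self-contained sketch. It uses plain strings such as "CSPO" in place of SPOKeyOrder and assumes a simple rule inferred from the two examples: drop the context position C and keep the remaining positions in their order of arrival. The class and method names are hypothetical, and the real getFilterKeyOrder may decide differently for other key orders.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical illustration of the int[] filter key order encoding. */
final class FilterKeyOrderSketch {

    /**
     * Positions, within the index key order, of the non-context components,
     * kept in arrival order. Assumption: the rule is "drop C, keep the rest".
     */
    static int[] filterKeyOrder(String indexKeyOrder) {
        return IntStream.range(0, indexKeyOrder.length())
                .filter(i -> indexKeyOrder.charAt(i) != 'C')
                .toArray();
    }

    /** Spells out the filter key order named by the positions, e.g. "PSO". */
    static String describe(String indexKeyOrder, int[] filterKeyOrder) {
        StringBuilder sb = new StringBuilder();
        for (int pos : filterKeyOrder) {
            sb.append(indexKeyOrder.charAt(pos));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String order : new String[] {"CSPO", "PCSO", "SPO"}) {
            int[] f = filterKeyOrder(order);
            System.out.println(order + " -> " + Arrays.toString(f)
                    + " (" + describe(order, f) + ")");
        }
    }
}
```

For "CSPO" the sketch yields {1,2,3} (SPO) and for "PCSO" it yields {0,2,3} (PSO), matching the arrays given in the text. A triples mode order such as "SPO" passes through as {0,1,2}.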
Note: This method now accepts triples in support of the ASTConstructIterator.
See also: Annotations#INDEX_KEY_ORDER, CONSTRUCT should apply DISTINCT (s,p,o) filter
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.