public class VertexDistribution extends Object
If we build a table [f,v], where f is a frequency value and v is a vertex identifier, then we can select according to different biases depending on how we normalize the scores for f. For example, if f is ONE (1) and we normalize by sum(f), then we have a uniform selection over the vertices. On the other hand, if f is the #of out-edges for v and we normalize by sum(f), then each vertex is selected with probability proportional to its #of out-edges (that is, the selection is uniform over the out-edges). Since we need to take multiple samples, the general approach is to build the table [f,v] using some policy, compute sum(f), and then select from the table by computing a random value in [0:1] and scanning the table until we find the corresponding row, whose vertex identifier is reported.
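The table-and-scan selection described above can be sketched as follows. This is an illustrative sketch only, not the implementation behind VertexDistribution: the class `FrequencyTableSketch`, its `add` and `select` methods, and the use of `String` vertex identifiers are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the [f,v] table; not the Blazegraph implementation.
class FrequencyTableSketch {
    private final List<String> vertices = new ArrayList<>(); // the v column
    private final List<Double> freqs = new ArrayList<>();    // the f column
    private double sum = 0.0;                                // sum(f)

    // Append one row [f,v] and keep sum(f) up to date.
    void add(String v, double f) {
        vertices.add(v);
        freqs.add(f);
        sum += f;
    }

    // Map a random value u in [0,1) to a vertex by scanning the table
    // until the cumulative frequency exceeds u * sum(f).
    String select(double u) {
        if (vertices.isEmpty()) {
            throw new IllegalStateException("empty table");
        }
        double target = u * sum;
        double cumulative = 0.0;
        for (int i = 0; i < vertices.size(); i++) {
            cumulative += freqs.get(i);
            if (target < cumulative) {
                return vertices.get(i);
            }
        }
        // Guard against floating-point rounding at the upper end.
        return vertices.get(vertices.size() - 1);
    }

    public static void main(String[] args) {
        FrequencyTableSketch table = new FrequencyTableSketch();
        table.add("a", 1); // e.g. one out-edge
        table.add("b", 3); // e.g. three out-edges
        Random r = new Random();
        // "b" is expected roughly three times as often as "a".
        for (int i = 0; i < 5; i++) {
            System.out.println(table.select(r.nextDouble()));
        }
    }
}
```

Setting every f to 1 in this sketch gives a uniform selection over the vertices; setting f to the out-degree biases the selection toward high-degree vertices. The linear scan costs O(n) per draw, which is the approach the description names; a binary search over precomputed cumulative sums would be one alternative.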
Constructor and Description |
---|
VertexDistribution(Random r) |
Modifier and Type | Method and Description |
---|---|
void | addInEdgeSample(org.openrdf.model.Resource v): Add a sample of a vertex having some in-edge. |
void | addOutEdgeSample(org.openrdf.model.Resource v): Add a sample of a vertex having some out-edge. |
org.openrdf.model.Resource[] | getAll(): Return all (without duplicates) vertices from the graph. |
org.openrdf.model.Resource[] | getUnweightedSample(int desiredSampleSize, EdgesEnum edges): Return a sample (without duplicates) of vertices from the graph chosen at random without regard to their frequency distribution. |
org.openrdf.model.Resource[] | getWeightedSample(int desiredSampleSize, EdgesEnum edges): Return a sample (without duplicates) of vertices from the graph chosen randomly according to their frequency within the underlying distribution. |
int | size(): Return the #of samples in the distribution from which a caller-specified number of samples may then be drawn using a random sampling without replacement technique. |
String | toString() |
public VertexDistribution(Random r)

public void addOutEdgeSample(org.openrdf.model.Resource v)
Parameters:
v - The vertex.

public void addInEdgeSample(org.openrdf.model.Resource v)
Parameters:
v - The vertex.

public int size()
See Also:
#getUnweightedSample(int), #getWeightedSample(int)

public org.openrdf.model.Resource[] getWeightedSample(int desiredSampleSize, EdgesEnum edges)
Parameters:
desiredSampleSize - The desired sample size.
edges - The sample is taken from vertices having the specified type(s) of edges. Vertices with zero degree for the specified type(s) of edges will not be present in the returned sample.

public org.openrdf.model.Resource[] getUnweightedSample(int desiredSampleSize, EdgesEnum edges)
Parameters:
desiredSampleSize - The desired sample size.
edges - The sample is taken from vertices having the specified type(s) of edges. Vertices with zero degree for the specified type(s) of edges will not be present in the returned sample.

public org.openrdf.model.Resource[] getAll()
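
The following is a minimal sketch of how an unweighted sample without duplicates, restricted to vertices with non-zero degree for the requested edge type, could be drawn. It is an assumption-laden illustration rather than the library's code: the class `UnweightedSampleSketch`, the `Edges` enum (a stand-in for EdgesEnum, whose constants are not listed on this page), the `String` vertex identifiers, and the choice of a partial Fisher-Yates shuffle are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch; not the Blazegraph implementation.
class UnweightedSampleSketch {
    // Stand-in for EdgesEnum; these constant names are assumptions.
    enum Edges { IN, OUT, ALL }

    // Per-vertex counters: index 0 = in-edge samples, index 1 = out-edge samples.
    private final Map<String, int[]> counts = new LinkedHashMap<>();

    void addInEdgeSample(String v) {
        counts.computeIfAbsent(v, k -> new int[2])[0]++;
    }

    void addOutEdgeSample(String v) {
        counts.computeIfAbsent(v, k -> new int[2])[1]++;
    }

    // Draw up to 'desired' distinct vertices, ignoring their frequencies.
    List<String> unweightedSample(int desired, Edges edges, Random r) {
        // Keep only vertices with non-zero degree for the requested edge type.
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            int in = e.getValue()[0];
            int out = e.getValue()[1];
            int degree = edges == Edges.IN ? in : edges == Edges.OUT ? out : in + out;
            if (degree > 0) {
                candidates.add(e.getKey());
            }
        }
        // Partial Fisher-Yates shuffle: the first n slots become a uniform
        // random sample without replacement.
        int n = Math.min(desired, candidates.size());
        for (int i = 0; i < n; i++) {
            int j = i + r.nextInt(candidates.size() - i);
            Collections.swap(candidates, i, j);
        }
        return new ArrayList<>(candidates.subList(0, n));
    }
}
```

In this sketch a request for more vertices than are eligible simply returns every eligible vertex; whether the real methods behave the same way is not stated on this page.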
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.