Package com.bigdata.rdf.graph


Package com.bigdata.rdf.graph Description

The GAS (Gather Apply Scatter) API was developed for PowerGraph (aka GraphLab 2.1). This is a port of that API to the Java platform and schema-flexible attributed graphs using RDF.

Graph algorithms are stated using the GAS (Gather, Apply, Scatter) API. This API provides a vertex-centric approach to graph processing ("think like a vertex") that can be used to write a large number of graph algorithms (PageRank, triangle counting, connected components, single-source shortest paths (SSSP), betweenness centrality, etc.). The GAS API allows the GATHER operation to be efficiently decomposed using fine-grained parallelism over a cluster.
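To make the vertex-centric idea concrete, here is a minimal in-memory sketch of the three phases applied to SSSP. It is not the com.bigdata.rdf.graph API: the class GasSsspSketch, the Edge type, and the sssp method are hypothetical names chosen for illustration. It assumes non-negative edge weights and runs single-threaded in synchronous rounds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory sketch of Gather-Apply-Scatter; NOT the bigdata GAS interfaces.
class GasSsspSketch {

    // A directed edge with a non-negative weight (illustrative type).
    static final class Edge {
        final String from;
        final String to;
        final double weight;

        Edge(String from, String to, double weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    // Returns the shortest distance from source to every vertex that appears in edges.
    static Map<String, Double> sssp(List<Edge> edges, String source) {
        Map<String, List<Edge>> inEdges = new HashMap<>();
        Map<String, List<Edge>> outEdges = new HashMap<>();
        Map<String, Double> dist = new HashMap<>();
        for (Edge e : edges) {
            inEdges.computeIfAbsent(e.to, k -> new ArrayList<>()).add(e);
            outEdges.computeIfAbsent(e.from, k -> new ArrayList<>()).add(e);
            dist.putIfAbsent(e.from, Double.POSITIVE_INFINITY);
            dist.putIfAbsent(e.to, Double.POSITIVE_INFINITY);
        }
        dist.put(source, 0.0);

        // The initial frontier is every vertex the source can reach in one hop.
        Set<String> frontier = new HashSet<>();
        for (Edge e : outEdges.getOrDefault(source, Collections.emptyList())) {
            frontier.add(e.to);
        }

        while (!frontier.isEmpty()) {
            Map<String, Double> updates = new HashMap<>();
            for (String v : frontier) {
                // GATHER: reduce over in-edges, taking the minimum candidate distance.
                double best = Double.POSITIVE_INFINITY;
                for (Edge e : inEdges.getOrDefault(v, Collections.emptyList())) {
                    best = Math.min(best, dist.get(e.from) + e.weight);
                }
                // APPLY: record the new vertex state only if it improved.
                if (best < dist.get(v)) {
                    updates.put(v, best);
                }
            }
            dist.putAll(updates);

            // SCATTER: schedule the out-neighbors of every vertex whose state changed.
            Set<String> next = new HashSet<>();
            for (String v : updates.keySet()) {
                for (Edge e : outEdges.getOrDefault(v, Collections.emptyList())) {
                    next.add(e.to);
                }
            }
            frontier = next;
        }
        return dist;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("A", "B", 1.0));
        edges.add(new Edge("B", "C", 2.0));
        edges.add(new Edge("A", "C", 5.0));
        edges.add(new Edge("C", "D", 1.0));
        System.out.println(sssp(edges, "A"));
    }
}
```

The GATHER loop over the in-edges of a vertex is a reduction (here, min), so it is the step that could in principle be split across workers; this sketch keeps it sequential for clarity.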

Part of our effort under the XDATA program is to examine how fine-grained parallelism can be leveraged on GPUs and other many-core devices to deliver extreme performance on graph algorithms. We are looking at how the GAS abstraction can be evolved to expose more parallelism.

The interfaces of this API are stated in terms of RDF Value objects (for vertices) and Statement objects (for edges). Link attributes are handled efficiently by the bigdata implementation, which co-locates them in the indices with the links and then applies prefix compression to deliver a compact on-disk footprint. See the section on Reification Done Right (below) for more details.
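The effect of co-location plus prefix compression can be illustrated with front coding over sorted keys. This is only a sketch of the general technique: the class FrontCodingSketch, the Entry type, and the '|'-delimited string keys are assumptions made for illustration, and bigdata's actual key encoding and compression scheme are not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical front-coding sketch; NOT bigdata's actual index encoding.
class FrontCodingSketch {

    // One compressed entry: how many leading characters are shared with the
    // previous key, plus the suffix that differs.
    static final class Entry {
        final int shared;
        final String suffix;

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    // Compresses keys that are already in sorted order.
    static List<Entry> compress(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            int max = Math.min(prev.length(), key.length());
            int n = 0;
            while (n < max && prev.charAt(n) == key.charAt(n)) {
                n++;
            }
            out.add(new Entry(n, key.substring(n)));
            prev = key;
        }
        return out;
    }

    // Rebuilds the original keys from the compressed entries.
    static List<String> decompress(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String key = prev.substring(0, e.shared) + e.suffix;
            out.add(key);
            prev = key;
        }
        return out;
    }

    public static void main(String[] args) {
        // A link, an attribute on that link, and a second link (illustrative keys).
        List<String> keys = Arrays.asList(
            "alice|knows|bob",
            "alice|knows|bob|since|2010",
            "alice|knows|carol");
        for (Entry e : compress(keys)) {
            System.out.println(e.shared + " + \"" + e.suffix + "\"");
        }
    }
}
```

Because the key of a link attribute begins with the key of its link, the two sort next to each other and the attribute entry stores only the short suffix that differs.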

Reification Done Right and Property Graphs

Reification Done Right (RDR) explains the relationship between the somewhat opaque concept of RDF reification (which we use only for interchange) and statements about statements (more generally, the ability to turn any edge into a vertex and make statements about that vertex). There are different ways to handle statements about statements efficiently in the database, but these are internal physical schema design questions. From a user perspective, the main concern should be the performance of the database platform when using this feature. Bigdata uses a combination of inlining and prefix compression to provide a dense, fast, bi-directional encoding of statements about statements and fast access paths whether querying by vertices, property values, or link attributes. You can also write queries using a high-level query language (SPARQL) that are automatically optimized and executed against the graph.
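The idea of turning an edge into a vertex can be sketched with a triple type whose subject may itself be a triple. The names RdrSketch, Triple, and about are hypothetical and exist only for this illustration; they are not the RDF Statement interface or bigdata's internal representation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical model of statements about statements; NOT the bigdata representation.
class RdrSketch {

    // A triple whose subject or object may be a plain value or another Triple.
    static final class Triple {
        final Object subject;
        final String predicate;
        final Object object;

        Triple(Object subject, String predicate, Object object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) o;
            return subject.equals(t.subject)
                && predicate.equals(t.predicate)
                && object.equals(t.object);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, predicate, object);
        }

        @Override
        public String toString() {
            return "<<" + subject + " " + predicate + " " + object + ">>";
        }
    }

    // Returns every triple in the graph whose subject is the given edge.
    static List<Triple> about(List<Triple> graph, Triple edge) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : graph) {
            if (t.subject.equals(edge)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Triple knows = new Triple("alice", "knows", "bob");
        // The edge itself is the subject of a second statement (a link attribute).
        Triple since = new Triple(knows, "since", 2010);

        List<Triple> graph = new ArrayList<>();
        graph.add(knows);
        graph.add(since);
        System.out.println(about(graph, knows));
    }
}
```

The linear scan in about stands in for an index lookup; the paragraph above describes the indexed access paths a database would use instead.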

The RDR approach is more general than the Property Graph Model: anything that you can do with a property graph you can do as efficiently in an intelligently designed RDF database. Further, RDF graphs allow efficient handling of the following cases that are disallowed under the property graph model:

Because RDF is general and places no cardinality constraints on property values, RDF data sets may be freely combined and then leveraged. Data-level collisions simply do not occur.

Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.