Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Super-position of graph traversal results

IP.com Disclosure Number: IPCOM000248760D
Publication Date: 2017-Jan-06

Publishing Venue

The IP.com Prior Art Database

Abstract

Property graphs contain vertices and edges and their properties and are used to express and analyze relationships between entities. Queries are performed by writing and executing traversals. These can get complex or require the use of expensive operations, especially when needing to return vertices and edges (and their properties) for visualization. An application or user interface that supports super-position of results supports the use of simple and efficient queries.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 15% of the total text.

Super-position of graph traversal results

Graphs are a means of expressing and analyzing relationships. A graph contains vertices that represent entities - they could be real world objects like people, organizations and vehicles or they could represent conceptual things. Vertices are connected by edges that represent the relationships between vertices. Vertices and edges have identities, types and properties. Graphs have become an important tool for analysis of relationships and particularly discovery of hitherto unknown relationships. Graphs are stored in graph databases and analysed using graph analysis frameworks and languages.

A graph is queried by writing and performing a traversal. The language used to create a traversal depends on the graph database or service being queried. Examples include the Gremlin language from the Apache Tinkerpop project and the Cypher language from the Neo Technologies neo4j database. For brevity, the remainder of this article will focus on the Gremlin graph query language.

Query results can be returned as an object or array (to be used in a console or application program for example) or in a data format (such as GraphSON) that is optionally transmitted (e.g. over websockets) and then parsed and displayed in a graphical user interface.

Simple gremlin queries are straightforward to write and understand. The following examples assume the use of a database containing a sample graph of continents, countries, airports and the routes between them. It is simple to write a Gremlin query to find all the airports in Texas. In the sample graph there are 25 such airports and we can display them by their 3-character ‘code’ (or any combination of properties): gremlin> g.V().hasLabel('airport').has('region','US-TX').values('code') ==>AUS ==>SAT ==>GGG ==>GRK ==>SJT ==>ABI ==>TYR ==>MAF ==>CRP ==>SPS ==>IAH ==>HRL ==>ACT ==>AMA ==>HOU ==>MFE ==>LBB ==>ELP ==>DAL ==>CLL ==>BPT ==>DFW

2

==>VCT ==>BRO ==>LRD

The example above is a very simple query. Frequently it is necessary to write more complex queries, in order to traverse further through the graph – e.g. ‘show airports in TX with a direct flight to an airport offering a direct flight to any airport close to New York’. It is common, therefore, to create longer and more complex queries with additional terms, traversals and predicates and to perform comparative processing on the results of parts of the query. For example, a query to find the 10 busiest airports is considerably more complex than the previous example:

g.withSack([:]).V().hasLabel('airport').where(out().count().is(gt(100))).sack {m,v -> m[v.value('code')]=g.V(v).out().count().next() ; m }.fold().sack().next().sort{-it.value} ==>FRA=134 ==>LHR=132 ==>JFK=129 ==>CDG=129 ==>AMS=128 ==>ORD=127 ==>ATL=124 ==>EWR=118 ==>DFW=109 ==>LAX=105 ==>IAH=104 ==>YYZ=104 ==>MUC=101

It is possible to write some queries as a set of expressions that are combined with a union operator, which in Gremlin is achieved by using the union(...