Graph – a closer look at the data

Introduction

In a graph database, data is represented as ‘vertices’, sometimes called ‘nodes’. The relationships between vertices are represented by connections called ‘edges’. Graph databases also store metadata or ‘properties’ about vertices and edges.
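To make these three terms concrete, here is a minimal sketch of a property graph in plain Java. It is not the OpenNTF Domino API; the names TinyGraph, Vertex, Edge, addVertex and addEdge are my own and exist only for illustration, and the direction chosen for the example edge is arbitrary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory property graph; not the OpenNTF Domino API. */
class TinyGraph {
    /** A vertex: an id plus free-form properties. */
    static final class Vertex {
        final String id;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Vertex(String id) {
            this.id = id;
        }
    }

    /** An edge: a labelled, directed connection between two vertices. */
    static final class Edge {
        final String label;
        final Vertex out; // the vertex the edge starts from
        final Vertex in;  // the vertex the edge points to
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(String label, Vertex out, Vertex in) {
            this.label = label;
            this.out = out;
            this.in = in;
        }
    }

    private final Map<String, Vertex> vertices = new LinkedHashMap<>();
    private final List<Edge> edges = new ArrayList<>();

    /** Returns the vertex with this id, creating it on first use. */
    Vertex addVertex(String id) {
        return vertices.computeIfAbsent(id, Vertex::new);
    }

    /** Always creates a new edge from 'out' to 'in'. */
    Edge addEdge(String label, Vertex out, Vertex in) {
        Edge edge = new Edge(label, out, in);
        edges.add(edge);
        return edge;
    }

    int vertexCount() {
        return vertices.size();
    }

    int edgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        TinyGraph graph = new TinyGraph();
        Vertex db = graph.addVertex("85256714:00725208");
        db.properties.put("title", "Java AgentRunner");
        Vertex entry = graph.addVertex("-Default-");
        // Direction is picked arbitrarily for this illustration.
        Edge edge = graph.addEdge("hasAcl", entry, db);
        System.out.println(graph.vertexCount() + " vertices, "
                + graph.edgeCount() + " edge, label=" + edge.label);
    }
}
```

The vertices carry the data, the edge carries the relationship, and both can hold properties of their own.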

Domino Explorer

If you look at the data in Domino Explorer after the source databases names.nsf and catalog.nsf have been scanned, you can group the documents into the same two categories: vertices and edges.

If you look further at the Graph-related properties on the Vertex documents, you notice that the next level of division is the Java class with which the vertex was added to the Graph, in the form SomeType vertex = graph.addVertex(id, SomeType.class), where SomeType is a placeholder for the actual class.

Each class defines specific properties for the object and also its relations to other Vertices. These relations are the Edges, which carry properties such as a label and a direction (in/out).

The Adjacency annotation marks getters and adders that represent a Vertex incident to an Edge. AdjacencyUnique extends this idea to ensure that there is only one Edge with a specific label between any two specific Vertices. This allows the user of the graph to call an “add” method and get back the existing adjacency if it already exists.
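The “add returns the existing adjacency” behaviour can be sketched in plain Java. This is only a model of the idea, not how the annotation is implemented; UniqueAdjacencyDemo, Edge and addUnique are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a "unique adjacency"; not the actual annotation handler. */
class UniqueAdjacencyDemo {
    static final class Edge {
        final String label;
        final String outId;
        final String inId;

        Edge(String label, String outId, String inId) {
            this.label = label;
            this.outId = outId;
            this.inId = inId;
        }
    }

    // One entry per (label, out vertex, in vertex) combination.
    private final Map<List<String>, Edge> edges = new HashMap<>();

    /** Creates the edge on the first call; later calls return the same instance. */
    Edge addUnique(String label, String outId, String inId) {
        List<String> key = Arrays.asList(label, outId, inId);
        return edges.computeIfAbsent(key, k -> new Edge(label, outId, inId));
    }

    int edgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        UniqueAdjacencyDemo demo = new UniqueAdjacencyDemo();
        Edge first = demo.addUnique("hasAcl", "-Default-", "85256714:00725208");
        Edge again = demo.addUnique("hasAcl", "-Default-", "85256714:00725208");
        System.out.println(first == again);   // prints true
        System.out.println(demo.edgeCount()); // prints 1
    }
}
```

Calling the adder twice is therefore harmless, which matters when a scan is run more than once.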

So when and where are these relations defined? That depends on your code.

When you “like” a post on Twitter, the relation between you and the tweet is created when you click the “like” icon.

In Domino Explorer relations are created when you run the setup by selecting one of the “Start Scan” buttons.

[Screenshot: Screen Shot 2016-05-23 at 13.46.55]

Scan Databases

When you choose to scan the databases, the ($ReplicaID) view in catalog.nsf is opened, and for each entry found a Vertex is created using the DXDatabase class and committed to the Graph.

// Open the catalog view that lists the databases by replica ID
View allDbs = catalog.getView("($ReplicaID)");
DocumentCollection col = allDbs.getAllDocuments();
for (Document db : col) {
    // Format the replica ID as text; it becomes the key of the vertex
    String replicaId = session.evaluate("@Text(ReplicaID; \"*\")", db).elementAt(0).toString();
    DXDatabase databaseVertex = graph.addVertex(replicaId, DXDatabase.class);
    databaseVertex.setReplicaId(replicaId);
    String dbTitle = db.getItemValueString("Title");
    databaseVertex.setTitle(dbTitle);
    databaseVertex.setFilePath(db.getItemValueString("Pathname"));
    databaseVertex.setServer(db.getItemValueString("Server"));

    graph.commit();
    // ACL
    scanAcl(graph, databaseVertex);
}

Then the Graph is “scanned” with the newly created vertex. Here the database ACL is collected, and for each ACL entry a new vertex is created using a specific class. The relationship between the database and ACL entry vertices is created as well:

DXACLEntry graphAclEntry = graph.addVertex(aclEntry.getName(), DXACLEntry.class);
db.addAclEntry(graphAclEntry);

The addAclEntry method is defined in the DXDatabase class, which is the type of the db object:

@AdjacencyUnique(label = "hasAcl", direction = Direction.IN)
public void addAclEntry(DXACLEntry ae);

If you look at the Vertex documents in the Domino Explorer NSF, you notice these properties for a DXDatabase Vertex:

#Note-UNID 82753B9B92522221033D1650C7BE35D4
$$Key 85256714:00725208
_ODA_GraphType V
form DXDatabase
filePath AgentRunner.nsf
replicaId 85256714:00725208
server CN=dev1/O=quintessens
title Java AgentRunner
_COUNT_OPEN_IN_hasAcl 5
_OPEN_IN_hasAcl

where

  • #Note-UNID is the document unique ID
  • $$Key is the same as the replicaId of the (target) database (it differs for Vertices created with another class)
  • _ODA_GraphType: V stands for Vertex
  • form is the Java class the document is created with
  • _COUNT_OPEN_IN_hasAcl is a counter; I am not sure what it is used for
  • _OPEN_IN_hasAcl is empty for all Vertices

If you look at a DXACLEntry Vertex document, you notice the following properties:

#Note-UNID B9962995F0A05DCBAE35F07C64964A1C
$$Key -Default-
_ODA_GraphType V
form DXACLEntry
_COUNT_OPEN_IN_hasAcl 110
_OPEN_IN_hasAcl 110
level 6
name -Default-
_COUNT_OPEN_OUT_member 1
_OPEN_OUT_member

If you search through the Edge documents you will find a document that matches the note UNIDs above and carries the label “hasAcl”:

[Screenshot: Screen Shot 2016-05-23 at 15.00.21]

It has the following properties:

[Screenshot: Screen Shot 2016-05-23 at 15.01.06]

This is how the graph database recognises a relationship between one Vertex of type DXDatabase and another Vertex of type DXACLEntry.

This is not the only relationship the DXDatabase object has. When the ACLService remote service is called with the replica ID of the database, db.getAclEntries returns more than one matching relationship:

DXDatabase db = graph.getElement(replicaId, DXDatabase.class);
if (db != null) {
    int count = 0;
    Iterable<DXACLEntry> acl = db.getAclEntries();
    for (DXACLEntry entry : acl) {
        JsonJavaObject aclEntry = new JsonJavaObject();
        aclEntry.put("name", entry.getName());
        aclEntry.put("level", entry.getLevel());
        aclEntry.put("levelName", ACL_LEVEL[entry.getLevel()]);
        data.put(count, aclEntry);
        count++;
    }
}

Here, for each relationship found, the DXACLEntry object is collected from the Graph, and its properties are placed in a JsonJavaObject, which is added to an array and returned.
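The same loop can be tried outside Domino with plain collections. The sketch below is a stand-in, not the ACLService code: AclListingDemo, AclEntry and toRows are hypothetical names, maps and a list replace JsonJavaObject and the array, and the contents of ACL_LEVEL are my assumption based on the standard Domino access levels 0 (No Access) to 6 (Manager), since the original array is not shown in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the ACL listing loop, using plain collections. */
class AclListingDemo {
    // Assumed level names, indexed by the numeric Domino ACL level (0..6).
    static final String[] ACL_LEVEL = {
        "No Access", "Depositor", "Reader", "Author", "Editor", "Designer", "Manager"
    };

    static final class AclEntry {
        final String name;
        final int level;

        AclEntry(String name, int level) {
            this.name = name;
            this.level = level;
        }
    }

    /** Turns each entry into a name/level/levelName row, in iteration order. */
    static List<Map<String, Object>> toRows(Iterable<AclEntry> acl) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (AclEntry entry : acl) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("name", entry.name);
            row.put("level", entry.level);
            // Guard the array lookup so an unexpected level cannot throw.
            boolean known = entry.level >= 0 && entry.level < ACL_LEVEL.length;
            row.put("levelName", known ? ACL_LEVEL[entry.level] : "Unknown");
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<AclEntry> acl = Arrays.asList(
                new AclEntry("-Default-", 6),
                new AclEntry("Anonymous", 0));
        for (Map<String, Object> row : toRows(acl)) {
            System.out.println(row);
        }
    }
}
```

The bounds check on the level is an addition of mine; the original loop indexes ACL_LEVEL directly.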

Wrapup

After taking this deeper look under the hood and analyzing the data, I hope you have gained a bit more understanding of the Graph concepts and of their implementation in Domino Explorer via OpenNTF’s Domino API.

 

One thought on “Graph – a closer look at the data”

  1. Nathan T Freeman 2016-May-26 / 5:03 am

    Patrick, thanks for the comprehensive write-up! A few notes…

    This code block:

    DXDatabase databaseVertex = graph.addVertex(replicaId, DXDatabase.class);
    databaseVertex.setReplicaId(replicaId);

    is particularly interesting, because one might assume that the replicaid property is changed every time a scan is run. But the Vertex virtualization layer around the document is actually careful to compare values and only add a property to the list of changed properties if the value is actually different than the current one. In this case, no matter how many times you ran it, since the replicaid value wouldn’t change, the sequence number on the item would also NOT change.

    You wondered what the _count item is for. To understand this, you have to dig into how Edge pointers are stored in the Vertices. If you have an Edge with a label of “hasACL”, then the list of all Edges from any given Vertex will be stored in an item ending in “_hasACL.” This edge list is stored in a binary format using an ODA class called a NoteList, which is simply an ordered list of NoteCoordinate objects. (Dig into the source code to find out what a NoteCoordinate is, but I personally think they’re damn cool.)

    The count of these NoteCoordinate objects is stored as a separate item for two reasons: 1) it provides a checksum against any bugs in the recording of Edge lists, which there are, as evidenced by the fact that you have a count of 5 for an edge that I’m pretty sure should only have 1 entry; and 2) because getting the count of edges is a very common graph operation when you’re just browsing and I wanted an easy way to pull an integer value for things like view sorting. Have you ever wanted to sort your customers by the number of projects they have? There ya go.

    I haven’t dived deep into the model for Domex, but I have to admit that there’s an interesting decision by Oliver here that I would consider a bug. The DXACLEntry class is keyed on the Domino name associated with the entry. So “-Default-” is an ACL Entry that appears in 110 different ACLs. But notice that the entry itself also specifies a Level property, in this case of 6. But I bet that having a -Default- access of Manager is not the case for all 110 ACLs in the system. Most likely only for the last replica that happened to be scanned.

    This leads us to an important lesson about graphs. What I think Oliver meant to do (and before anyone accuses me of condescension, I freely admit I have made this mistake more often than I care to count. I’m probably in the triple digits!) is to have some kind of “ACL Entity”, which could mean a Group, User, Server, Certificate, or even a hardcoded string like “-Default-” or “Anonymous” as a Vertex. And then the ACLEntry would be an EDGE, between the ACL Entity and the ACL, and that Edge would then have a property of Level, indicating the access level that was granted to that entity in that particular ACL.

    At least, that’s how I would do it.

    Again, thank you for your comprehensive write-up and I continue to extend my invitation to work with you directly on your use of the ODA Graph API. I think you know where to find me on Slack.🙂
