For some time the OpenNTF Domino API has included Tinkerpop and an implementation to store content in a graph database structure. Recently I’ve been digging into that further for potential projects, prompted also by a redevelopment of that area of the OpenNTF Domino API and Nathan’s upcoming session at DanNotes.

Graph databases are a common concept outside of Domino and an alternative to RDBMS, as Nathan Freeman recently explained in his Modern Domino session at MWLUG. Two stand-out points from that session were the use of graph databases for big data and the query performance comparisons with RDBMS.

But from a limited awareness of technologies like OrientDB and Neo4j, many Domino developers may wonder how applicable graph databases are to Domino. I’ve got an NSF, why would I need a different database?

But the question about a different database instead of the NSF is the first key point. Graph databases are more a concept of how to structure and access non-hierarchical content within a database, particularly with Tinkerpop, and less so about where the data is stored. The data itself is stored as separate objects containing key/value pairs, with unique IDs to link Vertices (or Nodes) and Edges. This presentation from Caleb Jones is a good introduction to graph databases and Tinkerpop.

The second key aspect is that navigating a graph database is not done by browsing a list of all documents of a particular type. For Domino developers, that’s what we’ve been constrained to, because the entry point to documents in the Notes Client is typically a view, scrolled through. Searching is not commonly used as a key method of access. But on the other hand one of the key concepts of graph databases – navigating from a vertex (document) to related vertices – is common in Domino development: embedded views showing child documents.
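The two points above can be sketched in plain Java. This is only an illustration under my own assumptions, not the Blueprints or OpenNTF Domino API: the class TinyGraph, its methods and the sample property names are all hypothetical. It shows vertices as IDs plus key/value pairs, edges that link vertices by ID, and navigation from one vertex to its related vertices rather than scrolling a list of everything.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, plain-Java illustration; not the Blueprints or OpenNTF Domino API.
public class TinyGraph {

    // A vertex is just an ID plus key/value pairs, much like a document with items.
    public static final class Vertex {
        public final String id;
        public final Map<String, Object> props = new LinkedHashMap<>();

        Vertex(String id) {
            this.id = id;
        }
    }

    // An edge has its own ID and label, and stores the IDs of the two vertices it links.
    public static final class Edge {
        public final String id;
        public final String label;
        public final String outId; // ID of the vertex the edge leaves
        public final String inId;  // ID of the vertex the edge arrives at

        Edge(String id, String label, String outId, String inId) {
            this.id = id;
            this.label = label;
            this.outId = outId;
            this.inId = inId;
        }
    }

    private final Map<String, Vertex> vertices = new LinkedHashMap<>();
    private final List<Edge> edges = new ArrayList<>();

    public Vertex addVertex(String id) {
        Vertex v = new Vertex(id);
        vertices.put(id, v);
        return v;
    }

    public Edge addEdge(String id, String label, Vertex out, Vertex in) {
        Edge e = new Edge(id, label, out.id, in.id);
        edges.add(e);
        return e;
    }

    // Navigate from a vertex to the vertices reached by its outgoing edges with the given label.
    // A linear scan keeps the sketch short; a real store would index edges per vertex.
    public List<Vertex> out(Vertex start, String label) {
        List<Vertex> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.outId.equals(start.id) && e.label.equals(label)) {
                result.add(vertices.get(e.inId));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyGraph g = new TinyGraph();
        Vertex actor = g.addVertex("v1");
        actor.props.put("name", "Sample Actor");
        Vertex movie = g.addVertex("v2");
        movie.props.put("title", "Sample Movie");
        g.addEdge("e1", "appearsIn", actor, movie);

        // Start from the actor and follow the edges, instead of browsing all movies.
        for (Vertex v : g.out(actor, "appearsIn")) {
            System.out.println(v.props.get("title"));
        }
    }
}
```

The out method plays roughly the role that an embedded view of child documents plays in a Notes form: given one document, show what it is connected to.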

So why Tinkerpop and the inclusion in OpenNTF Domino API?

Tinkerpop is a Java framework that is an umbrella for various sub-projects. Blueprints is the fundamental aspect and uses Java to handle vertices and edges as Java objects, with a variety of implementations for different back-end data stores (e.g. Neo4j, MongoDB, OrientDB). OpenNTF Domino API has for some time provided another implementation to that – Domino.

But that implementation is currently in the process of being re-written to take advantage of Frames. Frames is the extremely powerful part of Tinkerpop that allows the vertices and edges to be coded using Java interfaces with annotations. It greatly speeds up development of the underlying data model, and it gives a lot of potential for extensibility, making the concept of using graphs even more appealing.
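To give a feel for the general idea of coding the data model as annotated interfaces, here is a minimal sketch using only the Java standard library. It is not Frames and not the OpenNTF Domino API code: the Prop annotation, the Movie interface and the frame method are hypothetical stand-ins I have made up for illustration, and a plain Map stands in for a vertex's key/value pairs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "annotated interface" idea; this is not the Frames library.
public class FramedSketch {

    // Stand-in annotation naming the key that a getter or setter maps to.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Prop {
        String value();
    }

    // The data model is only an interface; no implementation class is written by hand.
    public interface Movie {
        @Prop("title")
        String getTitle();

        @Prop("title")
        void setTitle(String title);
    }

    // Wrap a bag of key/value pairs in a dynamic proxy that implements the interface.
    @SuppressWarnings("unchecked")
    public static <T> T frame(Map<String, Object> props, Class<T> kind) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString, hashCode and equals are delegated to the underlying map.
                return method.invoke(props, args);
            }
            Prop prop = method.getAnnotation(Prop.class);
            if (prop == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            if (args != null && args.length == 1) {
                props.put(prop.value(), args[0]); // one argument: treat as a setter
                return null;
            }
            return props.get(prop.value());       // no arguments: treat as a getter
        };
        return (T) Proxy.newProxyInstance(kind.getClassLoader(), new Class<?>[] { kind }, handler);
    }

    public static void main(String[] args) {
        Map<String, Object> vertexProps = new HashMap<>();
        Movie movie = frame(vertexProps, Movie.class);
        movie.setTitle("Sample Movie");
        System.out.println(movie.getTitle());
        System.out.println(vertexProps);
    }
}
```

The appeal is that adding a field to the model means adding two annotated method signatures, with no hand-written storage code behind them.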

Gremlin is the language for querying, but there is also a Java implementation. This itself leverages Pipes, which gives Java classes for filtering and transforming the output. And Rexster allows REST access to the data, for those who prefer to interact with the data via JavaScript.

For a while now I’ve been trying to use Eclipse Modeling Framework to create UML diagrams that can be converted into Java classes, so that they both document my application and speed up development. But EMF cannot be used to diagram and document graph databases.

Yet Blueprints not only has implementations but ouplementations, one of which is JUNG, which includes algorithms and visualizations. Over the last few days I’ve been adding that to OpenNTF Domino API and trying to get it to work. The outcome is the following diagram of a movie graph database’s structure based on sample data, generated in Eclipse and using AWT:

[Image: graph]

A few points are worth noting. First, the functionality uses an in-memory graph database: it failed when trying to pick up serialized data (i.e. trying to read from a Domino database rather than in-memory objects). Second, it creates a graph based on actual data objects rather than the abstract interfaces themselves, which is why I created a small, standalone script to create an in-memory graph of sample data rather than querying actual data. Third, if any of the labels being passed is blank, the attempt to draw the diagram fails, which is understandable really and was one of the big mistakes I made.
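One way to guard against the blank-label problem is to normalise labels before handing them to the drawing code. The helper below is a hypothetical sketch, not something taken from JUNG or the OpenNTF Domino API, and the fallback text is an arbitrary choice.

```java
// Hypothetical helper; the class and method names are made up for illustration.
public class LabelGuard {

    // Returns the label unchanged, or the fallback when the label is null or only whitespace.
    public static String safeLabel(String raw, String fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return raw;
    }

    public static void main(String[] args) {
        System.out.println(safeLabel("appearsIn", "(unlabelled)"));
        System.out.println(safeLabel("   ", "(unlabelled)"));
    }
}
```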

But for documentation and visualization of a data model, it gives a powerful, automated, data-driven alternative to manually creating data models of your systems.

10 thoughts on “OpenNTF Domino API, Graph Data Models and Visualization”

  1. So, to create a visualization, you have to load the part of the graph you want to visualize into memory? While I imagine it’s blazingly fast what are the limitations? Can we visualize thousands of nodes and their corresponding relationships? Would the implementation of Rexster solve the serialization issue? Can you style it or are you limited to the default output (good color for the default 😉 )

    For visualizations we’ve been using D3.js (http://d3js.org). This is a JavaScript library whose sole purpose is visualizing data delivered via REST. It does this really well and seems to handle “just about” anything you can throw at it (not thousands of nodes but certainly hundreds). The real limitation is the speed of the delivery of the data.

    But I’m very interested in seeing what you come up with using JUNG and with the improvements Nathan is making to the graph, it should be really exciting.

  2. Wow! A lot of topics covered. First, I’ve since come across a tutorial that talks about how to style it http://www.grotto-networking.com/JUNG/JUNG2-Tutorial.pdf, so you can definitely change colours, as well as styling of lines. That comes from this list of documentation http://jung.sourceforge.net/doc/index.html.

    I think the problem isn’t the amount of data being loaded, but the visualisation of all that data. I threw Nathan’s upcoming Star Wars graph into GraphJung and it was not readable. I know there are options in GraphJung to drill down, but I haven’t tried those.

    I’m interested in D3, I’ll have to have a look at that. GraphJung is useful for documentation purposes, but I don’t think it can be used from a browser. So it’s not useful for visualisations of actual content in an XPages app.

    Rexster is something not currently incorporated in the API. But when I read about it a while back, it certainly looks like something we need to incorporate sooner or later. D3 could be a good use case.

  3. I attended Nathan Freeman’s presentation at Dannotes about the Graph revolution, and I found it very interesting, and currently I am trying to wrap my mind around it. I have been looking at the ODA source code, to get an idea. But what I lack at the moment is sample code that shows me how to implement it in a Domino database. Is that something you plan to provide? You also mention that the code is being rewritten to take advantage of Tinkerpop frames. Any idea when this will be finished?

    Thank you

    1. I think Nathan’s pretty much completed the re-write to take advantage of Tinkerpop frames. His session at DanNotes was the key driver for completion of that. In the tests package of org.openntf.domino project there is code to create the Star Wars graph database implementation that I think Nathan showed. It’s the GraphDemo class, which can be run from Eclipse if you have the XPages SDK installed. You just need to create the relevant databases, put their replica IDs in the static variables at the top of the class, change the replica id for the names.nsf, and set the replica id for a valid person document.

      Longer term, once I’ve got my head round all the developments, I plan on adding Javadocs to it all and either blogging or creating a sample app to demonstrate the functionality. But as you’ve probably seen, there’s a lot of code in there! IBM ConnectED preparation is likely to keep me busy for the next couple of months though.

  4. So I found the code for creating the Star Wars graph database in Nathan’s branch. I have downloaded Eclipse Luna and installed the XPages SDK, but where do I go from here? Should I clone the OpenNTF Domino project and import Nathan’s branch into Eclipse?

    1. If you’re just looking to try it, yes, you can clone the repository and import Nathan’s branch into Eclipse. The key thing though is to choose File > Import and then Maven > Existing Maven Projects.

  5. I do not know much about Maven, so I have had some problems getting Maven to run correctly. Apparently the Maven plugin is not using the Eclipse proxy settings, but instead a settings.xml file. After setting our corporate proxy settings here, the download seems to work. But after importing, I still have some errors in some of the pom.xml files. When doing a Maven install on the main project I get the following error:
    [ERROR] Internal error: java.lang.RuntimeException: Invalid repository URL: ${notes-platform}: no protocol: ${notes-platform} -> [Help 1]
    org.apache.maven.InternalErrorException: Internal error: java.lang.RuntimeException: Invalid repository URL: ${notes-platform}

    The ${notes-platform} must refer to a variable. But where is this set and is it just the install path of the Notes client?

  6. Is the Star Wars graph database still available? I have been looking at Oliver Busse’s SUTOL demo application, and that makes me wonder how I could bring in existing Domino databases, since the documents created in the demo have special fields that the documents in existing databases do not have?

    1. Yes, it’s at org.openntf.domino.tests.ntf.Graph2Demo in org.openntf.domino.core. You can see it on Stash https://stash.openntf.org/projects/ODA/repos/dominoapi/browse/domino/core/src/test/java/org/openntf/domino/tests/ntf/Graph2Demo.java?at=develop. We’re intending to get a new release out in the near future which will include some fixes for graph and the REST access for graph. The REST implementation for graph has a dependency on an enhancement to DAS in ExtLib, but hopefully that pull request will be included in the next release of ExtLib.

  7. Pingback: Stress testing with ODA Graph – Kwintessential Notes
