Getting started with Titan using Cassandra and Solr

25.2.2016 | 4 minutes of reading time

Titan offers several options for the storage backend (BerkeleyDB, Cassandra, HBase) and for the underlying search engine (Lucene, Solr, Elasticsearch). Since DataStax acquired Aurelius and DataStax Enterprise (DSE) Search uses Solr, I wanted to set up an environment that I can easily modify to use DSE later instead of Apache Cassandra.

Prerequisites

My Environment

I am running this setup on Ubuntu 14.04 in a virtual machine, using the latest Java version at the time of writing, “1.8.0_73”.

Please note: This article covers only the basics of setting up Cassandra and Solr. For more details, I recommend reading the Apache Cassandra Getting Started guide and the Solr Quickstart.

Cassandra

For this simple setup I use a single-node cluster, so I leave the settings in cassandra.yaml at their defaults.

To start Cassandra, extract the downloaded Cassandra package and run the cassandra binary in its bin directory:

tar xvfz apache-cassandra-2.1.12-bin.tar.gz
cd apache-cassandra-2.1.12
bin/cassandra
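
Once the process has started, one way to confirm that the node is up is nodetool, which ships in the same bin directory. The following is a minimal sketch: the helper name count_up_nodes is hypothetical, and it assumes the usual "nodetool status" layout in which each node line begins with a two-letter state such as UN (Up/Normal).

```shell
# Count the nodes that "nodetool status" reports as Up/Normal.
# Assumption: node lines start with the two-letter state, e.g. "UN".
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Usage, from inside apache-cassandra-2.1.12:
#   bin/nodetool status | count_up_nodes
```

For the single-node cluster used here, the helper should print 1 once Cassandra has finished starting.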

Solr

Preparation

To start Solr, first extract the downloaded Solr package:

tar xvfz solr-5.3.1.tgz

To be able to use geospatial search, we need to copy the file jts-1.13.jar, which ships with Titan, into the Solr lib folder:

cp titan-1.0.0-hadoop1/lib/jts-1.13.jar solr-5.3.1/server/lib

This step is necessary because the schema.xml provided by Titan uses geo definitions to enable spatial queries. If we don’t copy this jar into the classpath, we run into the following error when trying to create the Solr core:

https://gist.github.com/HashtagMarkus/32075e726e4990059c84

The second way to get rid of this error is to delete the lines in schema.xml where a “geo” JTS property is used. Of course, we are then unable to use geospatial search as shown in the official examples.
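
If you take this second route, it helps to locate the affected lines first. The sketch below simply searches a schema file for the string "jts". The helper name find_jts_lines is hypothetical, and the path in the usage comment is an assumption about where the Titan package keeps its Solr schema.

```shell
# List every line of a schema file that mentions JTS, prefixed with its line number.
find_jts_lines() {
    grep -n -i 'jts' "$1"
}

# Usage (the path is an assumption, adjust it to your Titan package):
#   find_jts_lines titan-1.0.0-hadoop1/conf/solr/schema.xml
```

The output only shows candidate lines; review each one before deleting anything, since the field type and the fields that use it have to be removed together.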

Now we can start Solr:

./solr-5.3.1/bin/solr start

To validate that Solr is running, point your browser to http://localhost:8983/solr/#/

Create Core

In general, we need to create a Solr core for each index we create in Titan. The GraphOfTheGods example, which we want to run once this setup is done, creates two indexes: “vertices” and “edges”. The “vertices” index is used for range searches on the “age” property of our vertices. The “edges” index is used to search for a property named “reason” on some of the edges and to run a geo search.

Before we can create these Solr cores, we need to copy the predefined Solr configuration files into Solr’s configsets folder. These configuration files are included in our Titan package.

https://gist.github.com/HashtagMarkus/8ae4221f02a895984bca

Now we can create our cores:
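
A minimal sketch of this step, using the create_core command of the bin/solr script. It assumes the Titan configuration files were copied into a configset named titan (a hypothetical name, substitute the folder you actually used) and that Solr lives in ./solr-5.3.1. The helper name create_titan_cores and the DRY_RUN switch are also illustrative.

```shell
SOLR_DIR=${SOLR_DIR:-./solr-5.3.1}   # where Solr was extracted
CONFIGSET=${CONFIGSET:-titan}        # hypothetical configset name

# Create one Solr core per Titan index name given as an argument.
# With DRY_RUN=1 the commands are only printed, not executed.
create_titan_cores() {
    for core in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$SOLR_DIR/bin/solr create_core -c $core -d $CONFIGSET"
        else
            "$SOLR_DIR/bin/solr" create_core -c "$core" -d "$CONFIGSET" || return 1
        fi
    done
}

# Usage:
#   create_titan_cores vertices edges
```

Printing the commands first with DRY_RUN=1 is a cheap way to check the paths before Solr is touched.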

To verify that the cores were created successfully, open the Solr panel in your browser and check that both cores are present in the drop-down list.
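
The same check can be done from the command line through Solr's CoreAdmin API. This is a sketch under the assumption that Solr listens on the default address used above; the helper name core_status_url is hypothetical.

```shell
SOLR_URL=${SOLR_URL:-http://localhost:8983/solr}

# Build the CoreAdmin STATUS request URL for a single core.
core_status_url() {
    echo "$SOLR_URL/admin/cores?action=STATUS&core=$1&wt=json"
}

# Usage:
#   for c in vertices edges; do curl -s "$(core_status_url "$c")"; echo; done
```

For a core that exists, the JSON response contains a populated status entry under the core's name; for a missing core, that entry is empty.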

Starting the Gremlin shell and creating Titan sample data

There are several ways to use Titan. For the purpose of this tutorial I run Groovy commands inside the Gremlin shell, which is included in the Titan package. The Gremlin shell comes with the necessary plugins to run all example commands.

In this example I run everything on a single machine. If you want to install Cassandra and Solr on separate machines, you need to make sure the servers are accessible over the network. You’ll also need to edit the titan-cassandra-solr.properties file to point to the correct IP addresses for both Cassandra and Solr.

vi titan-1.0.0-hadoop1/conf/titan-cassandra-solr.properties

Also make sure that the other listed properties are set accordingly. You could also use SolrCloud, but that setup is quite different and is not covered in this post.

https://gist.github.com/HashtagMarkus/88cd82dcc48bffba8e73
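
To review what the file currently points to without opening an editor, you can filter it for the storage and index settings. This is a small sketch; the helper name show_backend_settings is hypothetical. It relies on Titan's convention that storage options (such as storage.backend and storage.hostname) start with "storage." and index options start with "index.".

```shell
# Print the active (uncommented) storage.* and index.* settings of a Titan properties file.
show_backend_settings() {
    grep -E '^(storage|index)\.' "$1"
}

# Usage:
#   show_backend_settings titan-1.0.0-hadoop1/conf/titan-cassandra-solr.properties
```

Commented-out lines are skipped, so the output reflects only the values Titan will actually read.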

Now that we have finished setting up each of our components, it’s time to start the Gremlin console:

cd titan-1.0.0-hadoop1
bin/gremlin.sh

To test whether our setup is correct, we now load the Titan default graph named “GraphOfTheGods”.

https://gist.github.com/HashtagMarkus/2342de47694ffb036d81

In the above example, I first search for the vertex with the property “name = hercules”. Then I follow the outgoing edges to find the names of Hercules’ parents. The last query does a geospatial search to find places within the given radius.

For a complete example of traversing this graph, see the official Titan documentation.
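
For reference, the three steps described above could look like the following. This is a sketch modeled on the Graph of the Gods walkthrough in the Titan documentation, so it may differ in detail from the commands used in this post. The shell snippet writes the Gremlin (Groovy) statements to a script file; the file name is hypothetical, and the coordinates and the 50 km radius are the ones from the Titan documentation.

```shell
SCRIPT=${SCRIPT:-graph-of-the-gods.groovy}   # hypothetical file name

# Write the Gremlin statements to a script file for the Gremlin console.
cat > "$SCRIPT" <<'EOF'
// Open the graph with the Cassandra + Solr configuration and load the sample data.
graph = TitanFactory.open('conf/titan-cassandra-solr.properties')
GraphOfTheGodsFactory.load(graph)
g = graph.traversal()

// Find the vertex named hercules, then follow his outgoing father and mother edges.
hercules = g.V().has('name', 'hercules').next()
g.V(hercules).out('father', 'mother').values('name')

// Geo search: edges whose place lies within 50 km of latitude 37.97, longitude 23.72.
g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
EOF
```

The statements can be pasted into the Gremlin console one at a time from inside titan-1.0.0-hadoop1. The console's :load command should also be able to read the file, though I have not verified that against this exact Titan version.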

Conclusion

Setting up Titan as a highly scalable graph database using Cassandra as storage and Solr as search engine can be a bit tricky. The quick start examples provided by Aurelius, especially those for using Cassandra with Solr, did not work for me out of the box. I hope this post helps you set up a first graph environment.
