Hibernate OGM - JPA for NoSQL?
The fine Hibernate folks have embarked on the ambitious goal of bringing Java Persistence API (JPA) support to the NoSQL space and have made some great progress to this end. The project is called the Hibernate Object/Grid Mapper (OGM), and it's the next framework we're evaluating for connecting to Mongo from Java.
Features
One very cool feature of Hibernate OGM is that it targets multiple data stores, and already has integration available with Infinispan, Ehcache and MongoDB. The project's website lists expanding this list to include more key/value stores and NoSQL products as one of the higher priorities.
Documents are currently searched using Lucene-based full-text queries on predetermined indexed attributes, but a killer feature in the works is support for higher-level queries (e.g. JP-QL syntax). The near-term goal is to support simple queries (e.g. restrictions and many-to-one joins), and eventually to bring in support for complex joins and aggregations.
This project is very much still in progress, so check out the ways you can contribute.
Hibernate OGM in Action
NOTE: For most of the code examples, we will only show relevant snippets. The complete project is available via the Vizuri Mongo-ODM Github repository.
Prerequisites
The standard set of tools will be used as part of our evaluation:
* Apache Maven
* Eclipse (or your favorite IDE)
* MongoDB - Setup Instructions are available here
JBoss Configuration
In order to run our example, we need a modified version of JBoss Application Server. There are already a few examples of doing this using OpenShift (here) and another good example that has the user download the Hibernate OGM source (here). The official documentation has an example that runs without a container.
In the interest of showing another option, we are going to use the Beta2 version of Hibernate OGM which is available in a public repo, and includes a pre-configured module that can be simply dropped into a JBoss AS 7.1.1 instance.
We'll get set up using the following steps:
* Install a fresh JBoss AS 7.1.1 instance by downloading and extracting the archive, which will create a folder named 'jboss-as-7.1.1.Final'. We will refer to the path to this folder as $JBOSS_HOME.
* Download the Hibernate OGM Beta2 JBoss Module and extract it into the $JBOSS_HOME directory. This archive updates the main hibernate module and creates a new ogm hibernate module to support the framework. Because of this, you should not use this modified instance for other projects, as the updated Hibernate module may cause issues.
* Make a temporary directory for the Lucene indexes to work in. This is currently set to '/tmp/.hibernate_ogm_demo_luceneindexes'; the location is configured in persistence.xml if you wish to use a different one.
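If you would rather create the index directory from Java than by hand, a minimal sketch using only the JDK might look like the following. The class and method names (LuceneIndexDir, ensureExists) are made up for illustration and are not part of the example project; the default path is the one from the article's persistence.xml.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the example project.
class LuceneIndexDir {
    // Default location, as configured in the article's persistence.xml.
    static final String DEFAULT_INDEX_BASE = "/tmp/.hibernate_ogm_demo_luceneindexes";

    // Creates the directory and any missing parents; does nothing if it already exists.
    static Path ensureExists(String location) {
        try {
            return Files.createDirectories(Paths.get(location));
        } catch (IOException e) {
            throw new UncheckedIOException("Could not create index directory " + location, e);
        }
    }

    public static void main(String[] args) {
        // Pass a different location as the first argument to override the default.
        String location = args.length > 0 ? args[0] : DEFAULT_INDEX_BASE;
        Path dir = ensureExists(location);
        System.out.println("Index directory ready: " + dir.toAbsolutePath());
    }
}
```

If you pick a different location, remember to change the indexBase property in persistence.xml to match.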
Running the Example
For the impatient, we can make a long story short (too late?) by first running through the example. First we need to start JBoss and mongo:
Mongo Server
First of all, we need our Mongo server instance running. We will just run the instance without modification:
* $MONGO_HOME/bin/mongod (*nix/mac)
* %MONGO_HOME%\bin\mongod.exe (Windows)
JBoss
We will need a running instance of JBoss AS7 in order to run the examples. The Maven plugin allows us to run integration tests (Arquillian) and to deploy the application using Maven commands.
* $JBOSS_HOME/bin/standalone.sh (*nix/mac)
* %JBOSS_HOME%\bin\standalone.bat (Windows)
Mongo Shell
In order to check our progress, we will connect to the Mongo shell as well:
* $MONGO_HOME/bin/mongo (*nix/mac)
* %MONGO_HOME%\bin\mongo (Windows)
The data created from this demonstration will be in the database called "hibernate-ogm-test". We can see what databases are available in the Mongo shell using "show dbs". Our test database shouldn't exist yet; it will be created after we run our test.
Example Code
Clone the project and switch over to the hibernate project:
git clone https://github.com/Vizuri/mongo-odm.git
cd mongo-odm/hibernate-ogm
Run the Arquillian Test
For no additional cost (this week only!), the example project features an Arquillian test to quickly see if we're on the right track. Arquillian allows us to perform integration tests in the manner we are used to performing unit tests:
* The necessary components are packaged into a WAR in memory using ShrinkWrap
* This artifact is deployed to our running application server
* Our test code is executed with the pass/fail semantics we are used to in either JUnit or TestNG
* The test application is undeployed
All of this happens very quickly and is every bit as slick as you are imagining.
To run the test:
mvn -Parq-jbossas-remote clean test
We should be able to log into the Mongo shell and see that a test record was created:
$ mongo
> use hibernate-ogm-test
> db.Quote.find()
You should see some records similar to the following:
{
    "_id" : "486d89ca-083e-4942-aae2-f0c53b954e30",
    "quoteNumber" : "arquillian-1234-1361724546806",
    "quoteVers" : "V1361724546806",
    "shipping" : "Shipping!"
}
Run the Application
To deploy the actual web application, issue the following command (JBoss must be running to deploy, and mongod must be running to test out the application):
mvn clean jboss-as:deploy
You can insert another document into the collection by visiting:
http://localhost:8080/hibernate-ogm/querytest
If that works well, open the default page to see a listing of the quotes and enter new ones:
http://localhost:8080/hibernate-ogm/
Code Walkthrough
At a high level, Hibernate OGM has some things that will look very familiar to those comfortable with JPA, but may have some surprises when it comes to querying. This is because JP-QL style queries are still a work in progress, so we instead use Hibernate Search along with Apache Lucene.
Wiring up the Connection
We define our persistence context per usual in META-INF/persistence.xml:
<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="2.0">
    <persistence-unit name="vizuri-mongo-ogm">
        <provider>org.hibernate.ogm.jpa.HibernateOgmPersistence</provider>
        <class>com.vizuri.mongo.odm.hibernateogm.model.LineItem</class>
        <class>com.vizuri.mongo.odm.hibernateogm.model.Product</class>
        <class>com.vizuri.mongo.odm.hibernateogm.model.Quote</class>
        <properties>
            <property name="hibernate.ogm.datastore.provider" value="org.hibernate.ogm.datastore.mongodb.impl.MongoDBDatastoreProvider"/>
            <property name="hibernate.ogm.mongodb.database" value="hibernate-ogm-test"/>
            <property name="hibernate.ogm.mongodb.host" value="localhost"/>
            <property name="hibernate.search.default.directory_provider" value="filesystem"/>
            <property name="hibernate.search.default.indexBase" value="/tmp/.hibernate_ogm_demo_luceneindexes"/>
        </properties>
    </persistence-unit>
</persistence>
Here we have:
* Defined which entity classes should be managed
* Defined the datastore provider implementation and the connection information (host, database name)
* Specified the indexing configuration
For our trivial example we are just using a temporary directory to store our indexes, but note that this can be configured to be managed by other means such as Infinispan.
Entities
The entities have a lot of familiar JPA annotations (@Entity, @Id, @GeneratedValue, @ManyToOne etc.), as well as some Hibernate Search annotations (namely @Indexed and @Field) to denote which entities we want to perform full-text searches on, and which attributes are to be indexed. Here are the interesting bits from the Quote entity:
@Entity
@XmlRootElement
@Indexed
@Proxy(lazy = false)
public class Quote {
    @Id
    @GeneratedValue(generator = "uuid")
    @GenericGenerator(name = "uuid", strategy = "uuid2")
    private String id;
    @Field
    private String quoteNumber;
    @Field
    private String quoteVers;
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL)
    private Set<LineItem> lineItems;
    /* other boring stuff */
}
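The "uuid2" strategy is why the _id values in the Mongo shell output are 36-character UUID strings rather than Mongo ObjectIds. To get a feel for what such an identifier looks like, here is a rough stand-in that uses plain java.util.UUID. This is not Hibernate's generator, and the class name UuidIdSketch is made up for illustration.

```java
import java.util.UUID;

// Illustration only: approximates the shape of a UUID-based String id.
class UuidIdSketch {
    // Returns a random (version 4) UUID in its canonical 36-character string form.
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String id = newId();
        System.out.println(id + " (length " + id.length() + ")");
    }
}
```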
Entity Manager
We will use a regular EntityManager for creating, deleting and finding entities by ID, and a FullTextEntityManager instance for searching. The FullTextEntityManager is set up and made available via CDI with the helper class Resources:
@PersistenceContext(unitName = "vizuri-mongo-ogm")
EntityManager em;
@Produces
public FullTextEntityManager getFTEM() {
    return Search.getFullTextEntityManager(em);
}
This allows us to inject the FullTextEntityManager into our Stateless EJB services (e.g. from QuoteService):
@Inject
private Provider<FullTextEntityManager> lazyFEM;
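Injecting a Provider rather than the FullTextEntityManager itself means the lookup is deferred until get() is called inside a service method. As a loose analogy only (this is plain java.util.function.Supplier, not CDI, and every name in it is made up), the effect can be pictured like this:

```java
import java.util.function.Supplier;

// Toy analogy for deferred lookup; not CDI and not part of the example project.
class LazyLookupSketch {
    static int lookups = 0; // counts how often the stand-in "producer" runs

    // Stand-in for a producer method such as getFTEM().
    static String produce() {
        lookups++;
        return "manager#" + lookups;
    }

    // Stand-in for a service that holds a provider instead of the object itself.
    static class Service {
        private final Supplier<String> lazy;

        Service(Supplier<String> lazy) {
            this.lazy = lazy;
        }

        // The producer only runs here, when the service actually needs the object.
        String search() {
            return "searched with " + lazy.get();
        }
    }

    public static void main(String[] args) {
        Service service = new Service(LazyLookupSketch::produce);
        System.out.println("lookups after construction: " + lookups);
        System.out.println(service.search());
        System.out.println("lookups after one search: " + lookups);
    }
}
```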
CRUD Operations
The "regular" entity manager can create, update and delete our entities as well as perform find by ID:
em.persist(quote);
em.remove(quote);
JP-QL is not yet functional, however, so we need to perform full-text queries instead. The following excerpt from QuoteService borrows heavily from the blog post on porting the Seam hotel booking example to OGM:
@SuppressWarnings("unchecked")
public List<Quote> findQuotesByCriteria(final SearchCriteria criteria) {
    FullTextEntityManager em = lazyFEM.get();
    List<Quote> quotes = new ArrayList<Quote>();
    final QueryBuilder builder = em.getSearchFactory().buildQueryBuilder()
            .forEntity(Quote.class).get();
    final Query luceneQuery =
            builder.keyword()
                    .onField("quoteNumber")
                    .matching(criteria.getSearchPattern())
                    .createQuery();
    log.debug("Lucene query: " + luceneQuery.toString());
    final FullTextQuery query = em.createFullTextQuery(luceneQuery, Quote.class);
    query.initializeObjectsWith(ObjectLookupMethod.SKIP,
            DatabaseRetrievalMethod.FIND_BY_ID);
    final List<Quote> results = query
            .setFirstResult(criteria.getFetchOffset())
            .setMaxResults(criteria.getFetchSize()).getResultList();
    boolean nextPageAvailable = results.size() > criteria.getPageSize();
    if (nextPageAvailable) {
        // NOTE create new ArrayList since subList creates unserializable list
        quotes = new ArrayList<Quote>(results.subList(0, criteria.getPageSize()));
    } else {
        quotes = results;
    }
    return quotes;
}
The SearchCriteria class is a simple wrapper for a query that provides a means to handle pagination and query semantics.
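The pagination in findQuotesByCriteria only works if the fetch size is larger than the page size. The sketch below assumes, as the comparison against getPageSize() suggests, that one extra row is requested per page so the code can tell whether a next page exists. It runs against an in-memory list instead of a FullTextQuery, and Criteria, fetch, hasNextPage and trim are hypothetical stand-ins rather than the project's actual SearchCriteria.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration of "fetch one extra row" paging; names are hypothetical.
class PagingSketch {
    // Stand-in for the paging part of a SearchCriteria-style class.
    static class Criteria {
        private final int page;     // zero-based page index
        private final int pageSize; // rows shown per page

        Criteria(int page, int pageSize) {
            this.page = page;
            this.pageSize = pageSize;
        }

        int getPageSize() { return pageSize; }
        int getFetchOffset() { return page * pageSize; }
        // Assumption: ask for one row more than we intend to show.
        int getFetchSize() { return pageSize + 1; }
    }

    // Simulates setFirstResult/setMaxResults against an in-memory list.
    static <T> List<T> fetch(List<T> all, Criteria c) {
        int from = Math.min(c.getFetchOffset(), all.size());
        int to = Math.min(from + c.getFetchSize(), all.size());
        return all.subList(from, to);
    }

    // The extra row is present only when more results exist beyond this page.
    static <T> boolean hasNextPage(List<T> fetched, Criteria c) {
        return fetched.size() > c.getPageSize();
    }

    // Drops the extra row, copying into a fresh ArrayList as the article's code does.
    static <T> List<T> trim(List<T> fetched, Criteria c) {
        if (hasNextPage(fetched, c)) {
            return new ArrayList<T>(fetched.subList(0, c.getPageSize()));
        }
        return fetched;
    }

    public static void main(String[] args) {
        List<String> quotes = Arrays.asList("Q1", "Q2", "Q3", "Q4", "Q5");
        Criteria first = new Criteria(0, 2);
        List<String> fetched = fetch(quotes, first);
        System.out.println(trim(fetched, first) + " next=" + hasNextPage(fetched, first));
    }
}
```

Fetching one row too many costs very little and avoids a separate count query just to decide whether to show a "next" link.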
One issue I ran into was that it wasn't clear how to perform a search that would simply get all entities of a given type (e.g. the equivalent of "select q from Quote q"). I finally ended up hacking this by adding "1234" to each of the quote numbers and performing the "all" search by matching on this value. I'm sure there is a more elegant approach to handling this.
Domain Model
The example application loosely represents a sales system that has quotes that contain line items, which reference a product.
To see this in action, start up your JBoss instance and mongod process and run the Arquillian test:
mvn -Parq-jbossas-remote clean test
This set of test methods is in QuoteServiceTest, and we can observe its results by connecting to the mongo shell and running the following queries:
$ mongo
> use hibernate-ogm-test
> db.Quote.find()
> db.LineItem.find()
> db.Product.find()
As you can see, the data is represented in a fairly relational manner, which may run counter to what you are used to when using Mongo. The official docs have an excellent discussion of exactly how the team decided to store data and why.
Parting Thoughts
The promise of the Hibernate OGM project is very cool: being able to immediately leverage standard JPA semantics to operate with key/value and NoSQL products is obviously compelling to Java developers.
As it stands now, the limitation to query only via Hibernate Search can be a fairly big detriment. Our trivial example does not manage the indexes, and it can be fairly easy to get them out of sync (e.g. by deleting the index directory or manually inserting documents in the mongo shell) which makes the queries incorrect. And there are quite a few more keystrokes to perform simple queries.
The good news is that this functionality is being developed, and there are plenty of opportunities to contribute.