Chapter 4. Working with persistent objects – NHibernate in Action

This chapter covers

  • The lifecycle of objects in an NHibernate application
  • Using the session persistence manager
  • Transitive persistence
  • Efficient fetching strategy

You now understand how NHibernate and ORM solve the static, structural aspects of the object/relational mismatch introduced in section 1.3.1. More specifically, you learned how object-oriented structures can be mapped to relational database structures to address issues of granularity, identity, inheritance, polymorphism, and associations.

This chapter covers another crucial subject—the dynamic, behavioral aspects of the object/relational mismatch. Success with NHibernate will not be guaranteed by simply mapping your domain classes to your databases. You must understand the dynamic nature of the problems that come into play at runtime, and which greatly affect the performance and stability of your applications. In our experience, many developers focus mostly on the structural mismatch and rarely pay attention to the more dynamic behavioral aspects.

In this chapter, we discuss the lifecycle of objects—how an object becomes persistent, and how it stops being considered persistent—and the method calls and other actions that trigger these transitions. The NHibernate persistence manager, the ISession, is responsible for managing object state, so you’ll learn how to use this important API.

Retrieving object graphs efficiently is another central concern, so we introduce the basic strategies in this chapter. NHibernate provides several ways to specify queries that return objects without losing much of the power inherent to SQL. Because network latency caused by remote access to the database can be an important limiting factor in the overall performance of .NET applications, you must learn how to retrieve a graph of objects with a minimal number of database hits.

Let’s start by discussing objects, their lifecycle, and the events that trigger a change of persistent state. These basics will give you the background you need when working with your object graph, so you’ll know when and how to load and save your objects. The material may be rather formal, but a solid understanding of the persistence lifecycle will greatly help you in your application development with NHibernate.

4.1. The persistence lifecycle

Because NHibernate is a transparent persistence mechanism, classes are unaware of their own persistence capability. It’s therefore possible to write application logic that is unaware of whether the objects it operates on represent persistent state or temporary state that exists only in memory. The application shouldn’t necessarily need to care that an object is persistent when invoking its methods.

But in any application with persistent state, the application must interact with the persistence layer whenever it needs to transmit state held in memory to the database (or vice versa). To do this, you call NHibernate’s persistence API. When interacting with the persistence mechanism that way, it’s necessary for the application to concern itself with the state and lifecycle of an object with respect to persistence. We’ll refer to this as the persistence lifecycle.

Different ORM implementations use different terminology and define different states and state transitions for the persistence lifecycle. Moreover, the object states used internally may be different from those exposed to the client application. NHibernate defines only three states, hiding the complexity of its internal implementation from the client code. In this section, we explain these three states: transient, persistent, and detached.

Figure 4.1 shows these states and their transitions in a state chart. You can also see the method calls to the persistence manager that trigger transitions. We discuss this chart in this section; refer to it later whenever you need an overview.

Figure 4.1. States of an object and transitions in an NHibernate application

In its lifecycle, an object can transition from a transient object to a persistent object to a detached object. Let’s take a closer look at each of these states.

4.1.1. Transient objects

When using NHibernate, simply creating objects using the new operator will not make them immediately persistent. At this point, their state is transient, which means they aren’t associated with any database table row. This is similar to any other object in a .NET application. As you would expect, their state is lost as soon as they’re no longer referenced by any other object; at that point they become inaccessible and available for garbage collection.

NHibernate considers all transient instances to be nontransactional; a modification to the state of a transient instance isn’t made in the context of any transaction. This means NHibernate doesn’t provide any rollback functionality for transient objects. In fact, NHibernate doesn’t roll back any object changes, as you’ll see later.

Objects that are referenced only by other transient instances are, by default, also transient. To transition an object from transient to persistent state, there are two choices. You can Save() it using the persistence manager, or create a reference to it from an already-persistent instance and take advantage of transitive persistence (section 4.3).

4.1.2. Persistent objects

A persistent instance is any instance with a database identity, as defined in section 3.5. That means a persistent object has a primary key value set as its database identifier.

Persistent instances might be objects instantiated by the application and then made persistent by calling the Save() method of the persistence manager (the NHibernate ISession, discussed in more detail later in this chapter). Persistent instances are then associated with the persistence manager. They might even be objects that became persistent when a reference was created from another persistent object already associated with a persistence manager. Alternatively, a persistent instance might be an instance retrieved from the database by execution of a query, by an identifier lookup, or by navigating the object graph starting from another persistent instance. In other words, persistent instances are always associated with an ISession and are transactional.

Persistent instances participate in transactions—their state is synchronized with the database at the end of the transaction. When a transaction commits, state held in memory is propagated to the database by the execution of SQL INSERT, UPDATE, and DELETE statements. This procedure can also occur at other times. For example, NHibernate may synchronize with the database before execution of a query. This ensures that queries are aware of changes made earlier during the transaction.

We call a persistent instance new if it has been allocated a primary key value but hasn’t yet been inserted into the database. The new persistent instance will remain “new” until synchronization occurs.

Of course, NHibernate doesn’t have to update the database row of every persistent object in memory at the end of the transaction. Saving objects that haven’t changed would be time consuming and unnecessary. ORM software must have a strategy for detecting which persistent objects have been modified by the application in the transaction. We call this automatic dirty checking (an object with modifications that haven’t yet been propagated to the database is considered dirty). Again, this state isn’t visible to the application. We call this feature transparent transaction-level write-behind, meaning that NHibernate propagates state changes to the database as late as possible but hides this detail from the application.
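To make the idea of automatic dirty checking more concrete, here is a minimal toy sketch of one possible strategy: keep a snapshot of each tracked object's state and compare it with the current state when it's time to synchronize. The ToyUnitOfWork class, its Track() and FindDirty() methods, and the simplified User class are all assumptions made for this illustration; this is not how NHibernate is implemented internally.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical entity used only for this illustration.
public class User
{
    public string Username { get; set; }
    public string Password { get; set; }
}

// Toy model of snapshot-based dirty checking; NOT NHibernate's implementation.
public class ToyUnitOfWork
{
    // Snapshot of each tracked object's state, taken when tracking starts.
    private readonly Dictionary<User, string[]> snapshots =
        new Dictionary<User, string[]>();

    private static string[] StateOf(User u)
    {
        return new[] { u.Username, u.Password };
    }

    public void Track(User user)
    {
        snapshots[user] = StateOf(user);
    }

    // Returns the tracked objects whose current state differs from the snapshot.
    public List<User> FindDirty()
    {
        var dirty = new List<User>();
        foreach (var entry in snapshots)
        {
            string[] current = StateOf(entry.Key);
            string[] original = entry.Value;
            if (current[0] != original[0] || current[1] != original[1])
                dirty.Add(entry.Key);
        }
        return dirty;
    }
}

public static class DirtyCheckingDemo
{
    public static void Main()
    {
        var unitOfWork = new ToyUnitOfWork();
        var unchanged = new User { Username = "mark", Password = "a" };
        var changed = new User { Username = "jonny", Password = "b" };
        unitOfWork.Track(unchanged);
        unitOfWork.Track(changed);

        changed.Password = "secret";

        // Only the modified object would need an UPDATE.
        Console.WriteLine(unitOfWork.FindDirty().Count);  // prints 1
    }
}
```

The application code never marks anything as modified; it only changes a property, and the comparison at synchronization time finds the object that needs an UPDATE.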

NHibernate can detect exactly which attributes have been modified, so it’s possible to include only the columns that need updating in the SQL UPDATE statement. This may bring performance gains, particularly with certain databases. But it isn’t usually a significant difference; and, in theory, it could harm performance in some environments. So, by default, NHibernate includes all columns in the SQL UPDATE statement. NHibernate can generate and cache this basic SQL once at startup, rather than on the fly each time an object is saved. If you only want to update modified columns, you can enable dynamic SQL generation by setting dynamic-update="true" in a class mapping. Note that this feature is extremely difficult and time consuming to implement in a hand-coded persistence layer. We talk about NHibernate’s transaction semantics and the synchronization process, or flushing, in more detail in the next chapter.
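As a sketch of the dynamic-update setting mentioned above, the mapping fragment below shows where the attribute goes. The class, table, column, and property names are assumptions chosen for illustration, and a real mapping would also need to identify the assembly and namespace of the class; only dynamic-update="true" comes from the discussion above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <!-- Hypothetical User mapping; names are illustrative only -->
  <class name="User" table="USERS" dynamic-update="true">
    <id name="Id" column="USER_ID">
      <generator class="native"/>
    </id>
    <property name="Username" column="USERNAME"/>
    <property name="Password" column="PASSWORD"/>
  </class>
</hibernate-mapping>
```

With this setting, the UPDATE statement for a modified User would be generated at runtime and include only the changed columns, at the cost of not reusing the statement cached at startup.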

Finally, you can make a persistent instance transient via a Delete() call to the persistence manager API, resulting in the deletion of the corresponding row of the database table.

4.1.3. Detached objects

When a transaction completes and the data is written to the database, the persistent instances associated with the persistence manager still exist in memory. If the transaction was successful, the state of these instances has been synchronized with the database. In ORM implementations with process-scoped identity (see the following sections), the instances retain their association to the persistence manager and are still considered persistent.

But in the case of NHibernate, these instances lose their association with the persistence manager when you Close() the ISession. Because they’re no longer associated with a persistence manager, we refer to these objects as detached. Detached instances may no longer be guaranteed to be synchronized with database state; they’re no longer under the management of NHibernate. But they still contain persistent data. It’s possible, and common, for the application to retain a reference and update a detached object outside of a transaction and therefore without NHibernate tracking the changes.

Fortunately, NHibernate lets you use these instances in a new transaction by reassociating them with a new persistence manager. After reassociation, they’re considered persistent again. This feature has a deep impact on how multitiered applications may be designed. The ability to return objects from one transaction to the presentation layer and later reuse them in a new transaction is one of NHibernate’s main selling points. We discuss this usage in the next chapter as an implementation technique for long-running application transactions. We also show you how to avoid the DTO (anti-) pattern by using detached objects in section 10.3.1.

NHibernate also provides an explicit way of detaching instances: the Evict() method of the ISession. This method is typically used only for cache management (a performance consideration). It’s not common to perform detachment explicitly. Rather, all objects retrieved in a transaction become detached when the ISession is closed or when they’re serialized (if they’re passed remotely, for example). NHibernate doesn’t need to provide functionality for controlling detachment of subgraphs. Instead, the application can control the depth of the fetched subgraph (the instances that are currently loaded in memory) using the query language or explicit graph navigation. Then, when the ISession is closed, this entire subgraph (all objects associated with a persistence manager) becomes detached.

Let’s look at the different states again, but this time consider the scope of object identity.

4.1.4. The scope of object identity

As application developers, we identify an object using .NET object identity (a==b). If an object changes state, is its .NET identity guaranteed to be the same in the new state? In a layered application, that may not be the case.

In order to explore this topic, it’s important to understand the relationship between .NET identity, object.ReferenceEquals(a,b), and database identity, a.Id==b.Id. Sometimes they’re equivalent; sometimes they aren’t. We refer to the conditions under which .NET identity is equivalent to database identity as the scope of object identity.

For this scope, there are three common choices:

  • A primitive persistence layer with no identity scope makes no guarantees that if a row is accessed twice, the same .NET object instance will be returned to the application. This becomes problematic if the application modifies two different instances that both represent the same row in a single transaction (how do you decide which state should be propagated to the database?).
  • A persistence layer using transaction-scoped identity guarantees that, in the context of a single transaction, only one object instance represents a particular database row. This avoids the previous problem and also allows for some caching to be done at the transaction level.
  • Process-scoped identity goes one step further and guarantees that there is only one object instance representing the row in the whole process (.NET CLR).

For a typical web or enterprise application, transaction-scoped identity is preferred. Process-scoped identity offers potential advantages in terms of cache utilization and the programming model for reuse of instances across multiple transactions; but in a pervasively multithreaded application, the cost of always synchronizing shared access to persistent objects in the global identity map is too high. It’s simpler, and more scalable, to have each thread work with a distinct set of persistent instances in each transaction scope.

Speaking loosely, we can say that NHibernate implements transaction-scoped identity. Actually, the NHibernate identity scope is the ISession instance, so identical objects are guaranteed if the same persistence manager (the ISession) is used for several operations. But an ISession isn’t the same as a (database) transaction—it’s a much more flexible element. We explore the differences and the consequences of this concept in the next chapter. Let’s focus on the persistence lifecycle and identity scope again.

If you request two objects using the same database identifier value in the same ISession, the result will be two references to the same in-memory object. The following example demonstrates this behavior, with several Load() operations in two ISessions:

ISession session1 = sessionFactory.OpenSession();
ITransaction tx1 = session1.BeginTransaction();
// Load Category with identifier value 1234
object a = session1.Load(typeof(Category), 1234);
object b = session1.Load(typeof(Category), 1234);
if ( object.ReferenceEquals(a,b) ) {
    System.Console.WriteLine("a and b are identical.");
}
tx1.Commit();
session1.Close();
ISession session2 = sessionFactory.OpenSession();
ITransaction tx2 = session2.BeginTransaction();
// Let's use the generic version of Load()
Category b2 = session2.Load<Category>( 1234 );
if ( ! object.ReferenceEquals(a,b2) ) {
    System.Console.WriteLine("a and b2 are not identical.");
}
tx2.Commit();
session2.Close();

Object references a and b not only have the same database identity, they also have the same .NET identity because they were loaded in the same ISession. Once outside this boundary, NHibernate doesn’t guarantee .NET identity, so a and b2 aren’t identical and the message is printed on the console. A test for database identity—a.Id==b2.Id—would still return true.
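The session-scoped identity guarantee can be pictured as a per-session lookup table keyed by identifier. The following self-contained sketch models that idea with a toy class; ToySession and its Load() method are hypothetical stand-ins invented for this illustration and say nothing about how NHibernate implements the ISession internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
public class Category
{
    public int Id { get; private set; }
    public Category(int id) { Id = id; }
}

// Toy stand-in for a session: NOT NHibernate's implementation.
// It hands out one instance per identifier for as long as it lives.
public class ToySession
{
    private readonly Dictionary<int, Category> identityMap =
        new Dictionary<int, Category>();

    public Category Load(int id)
    {
        Category instance;
        if (!identityMap.TryGetValue(id, out instance))
        {
            // A real persistence layer would read the database row here.
            instance = new Category(id);
            identityMap[id] = instance;
        }
        return instance;
    }
}

public static class IdentityScopeDemo
{
    public static void Main()
    {
        var session1 = new ToySession();
        Category a = session1.Load(1234);
        Category b = session1.Load(1234);

        var session2 = new ToySession();
        Category b2 = session2.Load(1234);

        Console.WriteLine(object.ReferenceEquals(a, b));   // True: same session
        Console.WriteLine(object.ReferenceEquals(a, b2));  // False: different sessions
        Console.WriteLine(a.Id == b2.Id);                  // True: same database identity
    }
}
```

Because each toy session owns its own map, identity is guaranteed only inside one session, which mirrors the behavior of a, b, and b2 described above.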

To further complicate our discussion of identity scopes, we need to consider how the persistence layer handles a reference to an object outside its identity scope. For example, for a persistence layer with transaction-scoped identity such as NHibernate, is a reference to a detached object (that is, an instance persisted or loaded in a previous, completed session) tolerated?

4.1.5. Outside the identity scope

If an object reference leaves the scope of guaranteed identity, we call it a reference to a detached object. Why is this concept useful?

In Windows applications, you usually don’t maintain a database transaction across a user interaction. Users take a long time to think about modifications, so for scalability reasons, you must keep database transactions short and release database resources as soon as possible. In this environment, it’s useful to be able to reuse a reference to a detached instance. For example, you might want to send an object retrieved in one unit of work to the presentation tier and later reuse it in a second unit of work, after it’s been modified by the user. For ASP.NET applications this doesn’t apply, because you shouldn’t keep business objects in memory after the page has been rendered—instead, you reload them on each request, and they don’t require reattachment.

When you need to reattach objects, you won’t usually wish to reattach the entire object graph in the second unit of work. For performance (and other) reasons, it’s important that reassociation of detached instances be selective. NHibernate supports selective reassociation of detached instances. This means the application can efficiently reattach a subgraph of a graph of detached objects with the current (“second”) NHibernate ISession. Once a detached object has been reattached to a new NHibernate persistence manager, it may be considered a persistent instance again, and its state will be synchronized with the database at the end of the transaction. This is due to NHibernate’s automatic dirty checking of persistent instances.

Reattachment may result in the creation of new rows in the database when a reference is created from a detached instance to a new transient instance. For example, a new Bid may have been added to a detached Item while it was on the presentation tier. NHibernate can detect that the Bid is new and must be inserted in the database. For this to work, NHibernate must be able to distinguish between a “new” transient instance and an “old” detached instance. Transient instances (such as the Bid) may need to be saved; detached instances (such as the Item) may need to be reattached (and later updated in the database).

There are several ways to distinguish between transient and detached instances, but the simplest approach is to look at the value of the identifier property. NHibernate can examine the identifier of a transient or detached object on reattachment and treat the object (and the associated graph of objects) appropriately. We discuss this important issue further in section 4.3.4.

If you want to take advantage of NHibernate’s support for reassociation of detached instances in your applications, you need to be aware of NHibernate’s identity scope when designing your application—that is, the ISession scope that guarantees identical instances. As soon as you leave that scope and have detached instances, another interesting concept comes into play.

We need to discuss the relationship between .NET equality and database identity. For a recap of equality, see section 3.5.1. Equality is an identity concept that we, the class developers, can control. Sometimes we have to use it for classes that have detached instances. .NET equality is defined by the implementation of the Equals() and GetHashCode() methods in the domain model’s persistent classes.

4.1.6. Implementing Equals() and GetHashCode()

The Equals() method is called by application code or, more important, by the .NET collections. An ISet collection (in the library Iesi.Collections), for example, calls Equals() on each object you put in the ISet, to determine (and prevent) duplicate elements.

First let’s consider the default implementation of Equals(), defined by System.Object, which uses a comparison by .NET identity. NHibernate guarantees that there is a unique instance for each row of the database inside an ISession. Therefore, the default identity Equals() is appropriate if you never mix instances—that is, if you never put detached instances from different sessions into the same ISet. (The issue we’re exploring is also visible if detached instances are from the same session but have been serialized and deserialized in different scopes.) But as soon as you have instances from multiple sessions, it becomes possible to have an ISet containing two Items that each represent the same row of the database table but don’t have the same .NET identity. This would almost always be semantically wrong. Nevertheless, it’s possible to build a complex application using the built-in identity equality, as long as you exercise discipline when dealing with detached objects from different sessions (and keep an eye on serialization and deserialization). One nice thing about this approach is that you don’t have to write extra code to implement your own notion of equality.

If this concept of equality isn’t what you want, you have to override Equals() in your persistent classes. Keep in mind that when you override Equals(), you must always also override GetHashCode() so the two methods are consistent (if two objects are equal, they must have the same hash code). Let’s look at some of the ways you can override Equals() and GetHashCode() in persistent classes.

Using Database Identifier Equality

A seemingly clever approach is to implement Equals() to compare just the database identifier property (usually a surrogate primary key) value:

public class User {
    //...
    public override bool Equals(object other) {
        if (object.ReferenceEquals(this,other)) return true;
        if (this.Id==null) return false;
        if ( !(other is User) ) return false;
        User that = (User) other;
        return this.Id == that.Id;
    }
    public override int GetHashCode() {
        // Fall back to the default hash code while no identifier is assigned
        return Id==null ?
            base.GetHashCode() :
            Id.GetHashCode();
    }
}

Notice how this Equals() method falls back to .NET identity for transient instances (if Id==null) that don’t have a database identifier value assigned yet. This is reasonable, because they can’t have the same persistent identity as another instance.

Unfortunately, this solution has one huge problem: NHibernate doesn’t assign identifier values until an entity is saved. If the object is added to an ISet before being saved, its hash code changes while it’s contained by the ISet, contrary to the contract defined by this collection. In particular, this problem makes cascade saves (discussed later in this chapter) useless for sets. We strongly discourage this solution (database identifier equality).

You could fix this problem by assigning an identifier yourself at the creation of the entities and using versioning to distinguish transient and detached instances.
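The hash code problem with database identifier equality can be reproduced without a database. The sketch below is a hedged illustration only: it uses System.Collections.Generic.HashSet<T> as a stand-in for the ISet from Iesi.Collections, assumes a nullable long identifier, and simulates the save by assigning the Id by hand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical User with database identifier equality, as discussed above.
public class User
{
    public long? Id { get; set; }   // null until the entity is saved

    public override bool Equals(object other)
    {
        if (object.ReferenceEquals(this, other)) return true;
        if (Id == null) return false;
        User that = other as User;
        return that != null && Id == that.Id;
    }

    public override int GetHashCode()
    {
        return Id == null ? base.GetHashCode() : Id.GetHashCode();
    }
}

public static class HashCodeDemo
{
    public static void Main()
    {
        var users = new HashSet<User>();
        var user = new User();
        users.Add(user);   // stored under the default (identity) hash code

        user.Id = 42;      // simulates the identifier being assigned on save

        // The set now looks for the element under the new hash code.
        Console.WriteLine(users.Contains(user));
        Console.WriteLine(users.Count);
    }
}
```

Barring an accidental collision between the two hash codes, Contains() is expected to report False even though the set still holds the element (Count remains 1), which is exactly the broken state the collection contract forbids.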

Comparing by Value

A better way is to include all persistent properties of the persistent class, apart from any database identifier property, in the Equals() comparison. This is how most people perceive the meaning of Equals(); we call it by-value equality.

When we say “all properties,” we don’t mean to include collections. Collection state is associated with a different table, so it seems wrong to include it. More important, you don’t want to force the entire object graph to be retrieved just to perform Equals(). In the case of User, this means you shouldn’t include the items collection (the items sold by this user) in the comparison. Here is the implementation you could use:

public class User {
    //...
    public override bool Equals(object other) {
        if (object.ReferenceEquals(this,other)) return true;
        if ( !(other is User) ) return false;
        User that = (User) other;
        if ( this.Username != that.Username )
            return false;
        if ( this.Password != that.Password )
            return false;
        return true;
    }
    public override int GetHashCode() {
        int result = 14;
        result = 29 * result + Username.GetHashCode();
        result = 29 * result + Password.GetHashCode();
        return result;
    }
}

But again, this approach has two problems:

  • Instances from different sessions are no longer equal if one is modified (for example, if the user changes his password).
  • Instances with different database identity (instances that represent different rows of the database table) can be considered equal, unless some combination of properties is guaranteed to be unique (the database columns have a unique constraint). In the case of User, there is a unique property: Username.

To get to the solution we recommend, you need to understand the notion of a business key.

Using Business Key Equality

A business key is a property, or some combination of properties, that is unique for each instance with the same database identity. Essentially, it’s the natural key you’d use if you weren’t using a surrogate key. Unlike a natural primary key, it isn’t an absolute requirement that the business key never change—as long as it changes rarely, that’s enough.

We argue that every entity should have a business key, even if it includes all properties of the class (this would be appropriate for some immutable classes). The business key is what the user thinks of as uniquely identifying a particular record, whereas the surrogate key is what the application and database use.

Business key equality means that the Equals() method compares only the properties that form the business key. This is a perfect solution that avoids all the problems described earlier. The only downside is that it requires extra thought to identify the correct business key in the first place. But this effort is required anyway; it’s important to identify any unique keys if you want your database to help ensure data integrity via constraint checking.

For the User class, username is a great candidate business key. It’s never null, it’s unique, and it changes rarely (if ever):

public class User {
    //...
    public override bool Equals(object other) {
        if (object.ReferenceEquals(this,other)) return true;
        if ( !(other is User) ) return false;
        User that = (User) other;
        return this.Username == that.Username;
    }
    public override int GetHashCode() {
        return Username.GetHashCode();
    }
}

For some other classes, the business key may be more complex, consisting of a combination of properties. For example, candidate business keys for the Bid class are the item ID together with the bid amount, and the item ID together with the date and time of the bid. A good business key for the BillingDetails abstract class is the number together with the type (subclass) of billing details. Notice that it’s almost never correct to override Equals() on a subclass and include another property in the comparison. It’s tricky to satisfy the requirements that equality be both symmetric and transitive in this case; and, more important, the business key wouldn’t correspond to any well-defined candidate natural key in the database (subclass properties may be mapped to a different table).

You may have noticed that the Equals() and GetHashCode() methods always access the properties of the other object via the getter properties. This is important, because the object instance passed as other might be a proxy object, not the actual instance that holds the persistent state. This is one point where NHibernate isn’t completely transparent, but it’s a good practice to use properties instead of direct instance variable access anyway.

Finally, take care when you’re modifying the value of the business key properties; don’t change the value while the domain object is in a set.
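To see how business key equality behaves with instances from different sessions, consider the following self-contained sketch. It is an illustration under stated assumptions: HashSet<T> stands in for the ISet from Iesi.Collections, the two "detached" instances are created by hand rather than loaded by NHibernate, and the Id and Password properties exist only to show that they don't take part in the comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical User using Username as its business key.
public class User
{
    public long? Id { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }

    public override bool Equals(object other)
    {
        if (object.ReferenceEquals(this, other)) return true;
        User that = other as User;
        return that != null && Username == that.Username;
    }

    public override int GetHashCode()
    {
        return Username.GetHashCode();
    }
}

public static class BusinessKeyDemo
{
    public static void Main()
    {
        // Two instances standing in for the same row loaded in two sessions;
        // the second one has had its password changed.
        var fromSession1 = new User { Id = 7, Username = "jonny", Password = "old" };
        var fromSession2 = new User { Id = 7, Username = "jonny", Password = "secret" };
        // A transient instance that has no identifier yet.
        var transient = new User { Username = "mark" };

        var users = new HashSet<User> { fromSession1, fromSession2, transient };
        Console.WriteLine(users.Count);               // prints 2

        transient.Id = 8;  // simulates the identifier being assigned on save
        Console.WriteLine(users.Contains(transient)); // prints True
    }
}
```

The two "jonny" instances collapse into one element despite the password change, and the transient instance stays findable after its identifier is assigned, because neither the identifier nor the password contributes to the hash code.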

So far, we’ve talked about how the persistence manager behaves when working with instances that are transient, persistent, or detached. We’ve also discussed issues of scope, and the importance of equality and identity. It’s now time to take a closer look at the persistence manager and explore the NHibernate ISession API in greater detail. We come back to detached objects in more detail in the next chapter.

4.2. The persistence manager

Any transparent persistence tool like NHibernate will include some form of persistence manager API, which usually provides services for the following:

  • Performing basic CRUD operations
  • Executing queries
  • Controlling transactions
  • Managing the transaction-level cache

The persistence manager can be exposed by several different interfaces (in the case of NHibernate, they include ISession, IQuery, ICriteria, and ITransaction). Under the covers, the implementations of these interfaces are coupled tightly.

The central interface between the application and NHibernate is ISession; it’s your starting point for all the operations just listed. For most of the rest of this book, we refer to the persistence manager and the session interchangeably; this is consistent with usage in the NHibernate community.

How do you start using the session? At the beginning of a unit of work, you create an instance of ISession using the application’s ISessionFactory. The application may have multiple ISessionFactory instances if it accesses multiple datasources. But you should never create a new ISessionFactory just to service a particular request, because creation of an ISessionFactory is extremely expensive. On the other hand, ISession creation is extremely inexpensive; the ISession doesn’t even obtain an ADO.NET IDbConnection until a connection is required.

After opening a new session, you use it to load and save objects. Note that this section explains some of the transitions shown earlier in figure 4.1.

4.2.1. Making an object persistent

The first thing you want to do with an ISession is make a new transient object persistent. To do so, you use the Save() method:

User user = new User();
user.Name.Firstname = "Mark";
user.Name.Lastname = "Monster";
using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.Save(user);
    session.Transaction.Commit();
}

First, you instantiate a new transient object user as usual. You can also instantiate it after opening an ISession; they aren’t related yet. You open a new ISession using the ISessionFactory referred to by sessionFactory, and then you start a new database transaction.

A call to Save() makes the transient instance of User persistent. It’s now associated with the current ISession. But no SQL INSERT has yet been executed; the NHibernate ISession never executes any SQL statement until absolutely necessary.

The changes made to persistent objects must be synchronized with the database at some point. This happens when you Commit() the NHibernate ITransaction. In this case, NHibernate obtains an ADO.NET connection (and transaction) and issues a single SQL INSERT statement. Finally, the ISession is closed, and the ADO.NET connection is released.

Note that it’s better (but not required) to fully initialize the User instance before associating it with the ISession. The SQL INSERT statement contains the values that were held by the object at the point when Save() was called. You can, of course, modify the object after calling Save(), and your changes will be propagated to the database as a SQL UPDATE.

Everything between session.BeginTransaction() and Transaction.Commit() occurs in one database transaction. We haven’t discussed transactions in detail yet; we leave that topic for the next chapter. But keep in mind that all database operations in a transaction scope are atomic—they completely succeed or completely fail. If one of the UPDATE or INSERT statements made on Transaction.Commit() fails, all changes made to persistent objects in this transaction will be rolled back at the database level. But NHibernate does not roll back in-memory changes to persistent objects; their state remains exactly as you left it. This is reasonable because a failure of a database transaction is normally non-recoverable, and you have to discard the failed ISession immediately.

4.2.2. Updating the persistent state of a detached instance

Modifying the user after the session is closed has no effect on its persistent representation in the database. When the session is closed, user becomes a detached instance. But it may be reassociated with a new ISession some time later by calling Update() or Lock().

Let’s first look at the Update() method. Using Update() forces an update to the persistent state of the object in the database; a SQL UPDATE is scheduled and will be later committed. Here’s an example of detached object handling:

user.Password = "secret";
using( ISession sessionTwo = sessionFactory.OpenSession() )
using( sessionTwo.BeginTransaction() ) {
    sessionTwo.Update(user);
    user.Username = "jonny";
    sessionTwo.Transaction.Commit();
}

It doesn’t matter if the object is modified before or after it’s passed to Update(). The important thing is that the call to Update() is used to reassociate the detached instance with the new ISession and the current transaction. NHibernate will treat the object as dirty and therefore schedule the SQL UPDATE regardless of whether the object has been updated. This makes Update() a safe way to reassociate objects with a Session, because you know changes will be propagated to the database. There is one exception: when you’ve enabled select-before-update in the persistent class mapping. With this option enabled, a call to Update() will make NHibernate determine whether the object is dirty rather than assuming it is. It does so by executing a SELECT statement and comparing the object’s current state to the current database state. This is still a “safe” option, even though NHibernate won’t force an update if it isn’t needed.

Now, let’s look at the Lock() method. A call to Lock() associates the object with the ISession without forcing NHibernate to treat the object as dirty. Consider this example:

using( ISession sessionTwo = sessionFactory.OpenSession() ) {
    using( sessionTwo.BeginTransaction() ) {
        sessionTwo.Lock(user, LockMode.None);
        user.Password = "secret";
        user.Username = "jonny";
        sessionTwo.Transaction.Commit();
    }
}

When you’re using Lock(), it does matter whether changes are made before or after the object is associated with the session. Changes made before the call to Lock() aren’t propagated to the database, because NHibernate hasn’t witnessed those changes; use Lock() only if you’re sure the detached instance hasn’t been modified beforehand.

The previous code specifies LockMode.None, which tells NHibernate not to perform a version check or obtain any database-level locks when reassociating the object with the ISession. If we specified LockMode.Read or LockMode.Upgrade, NHibernate would execute a SELECT statement in order to perform a version check (and to set an upgrade lock). We take a detailed look at NHibernate lock modes in the next chapter. Having discussed how objects are treated when you reassociate them with a Session, let’s now see what happens when you retrieve objects.

4.2.3. Retrieving a persistent object

The ISession is also used to query the database and retrieve existing persistent objects. NHibernate is especially powerful in this area, as you’ll see later in this chapter and in chapter 7. But special methods are provided on the ISession API for the simplest kind of query: retrieval by identifier. One of these methods is Get(), demonstrated here:

int userID = 1234;
using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    User user = (User) session.Get(typeof(User), userID);
    session.Transaction.Commit();
}

The retrieved object user may now be passed to the presentation layer for use outside the transaction as a detached instance (after the session has been closed). If no row with the given identifier value exists in the database, Get() returns null.

Since NHibernate 1.2, you can use .NET 2.0 generics:

User user = session.Get<User>(userID);

Next, we explain the concept of automatic dirty checking.

4.2.4. Updating a persistent object transparently

Any persistent object returned by Get() or any other kind of query is already associated with the current ISession and transaction context. It can be modified, and its state will be synchronized with the database. This mechanism is called automatic dirty checking, which means NHibernate will track and save the changes you make to an object inside a session:

int userID = 1234;
using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    User user = (User) session.Get(typeof(User), userID);
    user.Password = "secret";
    session.Transaction.Commit();
}

First you retrieve the object from the database with the given identifier. You modify the object, and these modifications are propagated to the database when Transaction.Commit() is called. Of course, as soon as you close the ISession, the instance is considered detached.

Batch updates are also possible, because NHibernate can use the internal batching feature of ADO.NET 2.0. With this feature enabled, NHibernate sends several update statements to the database in a single round trip, which makes these updates much faster. All you have to do is define the batch size as an NHibernate property:

<property name="hibernate.adonet.batch_size">16</property>

By default, the batch size is 0, which means this feature is disabled.

This feature currently works only on .NET 2.0 when using a SQL Server database. And because it uses .NET reflection, it may not work in some restricted environments.

Finally, when using this feature, ADO.NET 2.0 doesn’t return the number of rows affected by each statement in the batch, which means NHibernate may not perform optimistic concurrency checking correctly. For example, if one statement affects two rows and another statement affects no rows (instead of affecting one each), NHibernate will only know that two rows have been affected, and conclude that everything went OK.
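The arithmetic behind this caveat can be shown with a toy model. The following sketch is not NHibernate code; the class and method names are hypothetical, and it assumes that each versioned UPDATE in the batch is expected to affect exactly one row.

```csharp
using System.Linq;

public static class BatchRowCountDemo
{
    // Without batching, the row count of every statement can be checked:
    // each versioned UPDATE is expected to touch exactly one row.
    public static bool PerStatementCheckPasses(int[] rowsAffected)
    {
        return rowsAffected.All(count => count == 1);
    }

    // With batching, only the total for the whole batch is available,
    // so it can only be compared with the number of statements.
    public static bool BatchTotalCheckPasses(int[] rowsAffected)
    {
        return rowsAffected.Sum() == rowsAffected.Length;
    }
}
```

For the row counts { 2, 0 } from the example above, PerStatementCheckPasses returns false, whereas BatchTotalCheckPasses returns true, because 2 + 0 equals the number of statements in the batch.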

4.2.5. Making an object transient

In many use cases, you need persistent (or detached) objects to become transient again, meaning they will no longer have corresponding data in the database. As we discussed at the beginning of this chapter, persistent objects are those that are in the session and have corresponding data in the database. Making them transient removes their persistent state from the database. You can easily do this using the Delete() method:

int userID = 1234;
using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    User user = session.Get<User>(userID);
    session.Delete(user);
    session.Transaction.Commit();
}

The SQL DELETE is executed only when the ISession is synchronized with the database at the end of the transaction.

After the ISession is closed, the user object is considered an ordinary transient instance. Like any other object, it’s destroyed by the garbage collector once nothing references it anymore; at that point, both the in-memory instance and the persistent database row are gone.

Similarly, detached objects may be made transient. (Detached objects have corresponding state in the database but aren’t in the ISession.) You don’t have to reattach a detached instance to the session with Update() or Lock(). Instead, you can directly delete a detached instance as follows:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.Delete(user);
    session.Transaction.Commit();
}

In this case, the call to Delete() does two things: it associates the object with the ISession and then schedules the object for deletion, executed on Transaction.Commit().

You now know the persistence lifecycle and the basic operations of the persistence manager. Using these concepts together with the persistent class mappings we discussed in chapter 3, you can create your own small NHibernate application. (If you like, you can jump to chapter 10 and read about a handy NHibernate helper class for ISessionFactory and ISession management.) Keep in mind that we haven’t shown you any exception-handling code so far, but you should be able to figure out the try/catch blocks yourself (as in chapter 2). Map some simple entity classes and components, and then store and load objects in a standalone console application (write a Main method). But as soon as you try to store associated entity objects (that is, when you deal with a more complex object graph), you’ll see that calling Save() or Delete() on each object of the graph isn’t an efficient way to write applications.

You’d like to make as few calls to the ISession as possible. Transitive persistence provides a more natural way to force object state changes and to control the persistence lifecycle.

4.3. Using transitive persistence in NHibernate

Real, nontrivial applications deal not with single objects but rather with graphs of objects. When the application manipulates a graph of persistent objects, the result may be an object graph consisting of persistent, detached, and transient instances. Transitive persistence is a technique that allows you to propagate persistence to transient and detached subgraphs automatically.

For example, if we add a newly instantiated Category to the already persistent hierarchy of categories, it should automatically become persistent without a call to session.Save(). We gave a slightly different example in chapter 3 when we mapped a parent/child relationship between Bid and Item. In that case, not only were bids automatically made persistent when they were added to an item, but they were also automatically deleted when the owning item was deleted.

More than one model exists for transitive persistence. The best known is persistence by reachability, which we discuss first. Although some basic principles are the same, NHibernate uses its own, more powerful model, as you’ll see later.

4.3.1. Persistence by reachability

An object persistence layer is said to implement persistence by reachability if any instance becomes persistent when the application creates an object reference to the instance from another instance that is already persistent. This behavior is illustrated by the object diagram (note that this isn’t a class diagram) in figure 4.2.

Figure 4.2. Persistence by reachability with a root persistent object

In this example, Computer is a persistent object. The objects Desktop PCs and Monitors are also persistent; they’re reachable from the Computer Category instance. Electronics and Cell Phones are transient. Note that we assume navigation is possible only to child categories and not to the parent—for example, you can call computer.ChildCategories. Persistence by reachability is a recursive algorithm: all objects reachable from a persistent instance become persistent either when the original instance is made persistent or just before in-memory state is synchronized with the data store.
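The recursive nature of the algorithm can be sketched in a few lines. This is a toy model, not NHibernate code: CategoryNode, IsPersistent, and Reachability are hypothetical names, and persistence is reduced to a flag on an in-memory object.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a category object; not a mapped NHibernate class.
public class CategoryNode
{
    public readonly string Name;
    public readonly List<CategoryNode> ChildCategories = new List<CategoryNode>();
    public bool IsPersistent;

    public CategoryNode(string name) { Name = name; }
}

public static class Reachability
{
    // Marks the root and every instance reachable from it as persistent.
    public static void MakePersistent(CategoryNode root)
    {
        Visit(root, new HashSet<CategoryNode>());
    }

    private static void Visit(CategoryNode node, HashSet<CategoryNode> visited)
    {
        if (!visited.Add(node)) return;   // already handled; also guards against cycles
        node.IsPersistent = true;
        foreach (CategoryNode child in node.ChildCategories)
            Visit(child, visited);
    }
}
```

Assuming, for illustration, that Electronics is the parent of both Computer and Cell Phones, calling MakePersistent on the Computer node marks Computer, Desktop PCs, and Monitors. Electronics and Cell Phones stay transient, because the traversal only follows links from parent to child.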

Persistence by reachability guarantees referential integrity; you can re-create any object graph by loading the persistent root object. An application may walk the object graph from association to association without worrying about the persistent state of the instances. (SQL databases have a different approach to referential integrity, relying on foreign-key and other constraints to detect a misbehaving application.)

In the purest form of persistence by reachability, the database has some top-level, or root, object from which all persistent objects are reachable. Ideally, an instance should become transient and be deleted from the database if it isn’t reachable via references from the root persistent object.

Neither NHibernate nor other ORM solutions implement this form; there is no analog of the root persistent object in a SQL database and no persistent garbage collector that can detect unreferenced instances. Object-oriented data stores may implement a garbage-collection algorithm similar to the one implemented for in-memory objects by the CLR, but this option isn’t available in the ORM world; scanning all tables for unreferenced rows won’t perform acceptably.

Persistence by reachability is at best a halfway solution. It helps you make transient objects persistent and propagate their state to the database without many calls to the persistence manager. But (at least, in the context of SQL databases and ORM) it isn’t a full solution to the problem of making persistent objects transient and removing their state from the database. This turns out to be a much more difficult problem. You can’t simply remove all reachable instances when you remove an object; other persistent instances may hold references to them (remember that entities can be shared). You can’t even safely remove instances that aren’t referenced by any persistent object in memory; the instances in memory are only a small subset of all objects represented in the database. Let’s look at NHibernate’s more flexible transitive persistence model.

4.3.2. Cascading persistence with NHibernate

NHibernate’s transitive persistence model uses the same basic concept as persistence by reachability—that is, object associations are examined to determine transitive state. But NHibernate lets you specify a cascade style for each association mapping, which offers more flexibility and fine-grained control for all state transitions. NHibernate reads the declared style and cascades operations to associated objects automatically.

By default, NHibernate does not navigate an association when searching for transient or detached objects, so saving, deleting, or reattaching a Category doesn’t affect the child category objects. This is the opposite of the persistence-by-reachability default behavior. If, for a particular association, you wish to enable transitive persistence, you must override this default in the mapping metadata.

You can map entity associations in metadata with the following attributes:

  • cascade="none", the default, tells NHibernate to ignore the association.
  • cascade="save-update" tells NHibernate to navigate the association when the transaction is committed and when an object is passed to Save() or Update(), saving newly instantiated transient instances and persisting changes to detached instances.
  • cascade="delete" tells NHibernate to navigate the association and delete persistent instances when an object is passed to Delete().
  • cascade="all" means to cascade both save-update and delete, as well as calls to Evict() and Lock().
  • cascade="all-delete-orphan" means the same as cascade="all" but, in addition, NHibernate deletes any persistent entity instance that has been removed (dereferenced) from the association (for example, from a collection).
  • cascade="delete-orphan" has NHibernate delete any persistent entity instance that has been removed (dereferenced) from the association (for example, from a collection).

This association-level cascade style model is both richer and less safe than persistence by reachability. NHibernate doesn’t make the same strong guarantees of referential integrity that persistence by reachability provides. Instead, NHibernate partially delegates referential integrity concerns to the foreign key constraints of the underlying relational database. There is a good reason for this design decision: it lets NHibernate applications use detached objects efficiently, because you can control reattachment of a detached object graph at the association level.

Let’s elaborate on the cascading concept with some example association mappings. We recommend that you read the next section in one sitting, because each example builds on the previous one. Our first example is straightforward; it lets you save newly added categories efficiently.

4.3.3. Managing auction categories

System administrators can create new categories, rename categories, and move subcategories around in the category hierarchy. This structure is shown in figure 4.3.

Figure 4.3. Category class with association to itself

Now you map this class and the association:

<class name="Category" table="CATEGORY">
    ...
    <property name="Name" column="CATEGORY_NAME"/>
    <many-to-one
        name="ParentCategory"
        class="Category"
        column="PARENT_CATEGORY_ID"
        cascade="none"/>
    <set
        name="ChildCategories"
        table="CATEGORY"
        cascade="save-update"
        inverse="true">
        <key column="PARENT_CATEGORY_ID"/>
        <one-to-many class="Category"/>
    </set>
    ...
</class>

This is a recursive, bidirectional, one-to-many association, as briefly discussed in chapter 3. The one-valued end is mapped with the <many-to-one> element and the collection-valued end, the ChildCategories property, with the <set> element. Both refer to the same foreign key column: PARENT_CATEGORY_ID.

Suppose you create a new Category as a child category of Computer (see figure 4.4).

Figure 4.4. Adding a new Category to the object graph

You have several ways to create this new Laptops object and save it in the database. You can go back to the database and retrieve the Computer category to which the new Laptops category will belong, add the new category, and commit the transaction:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    Category computer = session.Get<Category>(computerId);
    Category laptops = new Category("Laptops");
    computer.ChildCategories.Add(laptops);
    laptops.ParentCategory = computer;
    session.Transaction.Commit();
}

The computer instance is persistent (attached to a session), and the ChildCategories association is mapped with cascade="save-update". Hence, this code results in the new laptops category becoming persistent when Transaction.Commit() is called, because NHibernate cascades the dirty-checking operation to the children of computer. NHibernate executes an INSERT statement.

Let’s do the same thing again, but this time create the link between Computer and Laptops outside of any transaction (in a real application, it’s useful to manipulate an object graph in a presentation tier—for example, before passing the graph back to the persistence layer to make the changes persistent):

Category computer = ... // Loaded in a previous session
Category laptops = new Category("Laptops");
computer.ChildCategories.Add(laptops);
laptops.ParentCategory = computer;

The detached computer object and any other detached objects it refers to are now associated with the new transient laptops object (and vice versa). You make this change to the object graph persistent by saving the new object in a second NHibernate session:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.Save(laptops);
    session.Transaction.Commit();
}

NHibernate inspects the database identifier property of the parent category of laptops and correctly creates the relationship to the Computer category in the database. NHibernate inserts the identifier value of the parent into the foreign key field of the new Laptops row in CATEGORY.

Because cascade="none" is defined for the ParentCategory association, NHibernate ignores changes to any of the other categories in the hierarchy (Computer, Electronics). It doesn’t cascade the call to Save() to entities referred to by this association. If you’d enabled cascade="save-update" on the <many-to-one> mapping of ParentCategory, NHibernate would have had to navigate the whole graph of objects in memory, synchronizing all instances with the database. This process would perform badly, because a lot of useless data access would be required. In this case, you neither needed nor wanted transitive persistence for the ParentCategory association.

Why do you have cascading operations? You could have saved the laptops object, as shown in the previous example, without any cascade mapping being used. Well, consider the following case:

Category computer = ... // Loaded in a previous Session
Category laptops = new Category("Laptops");
Category laptopAccessories = new Category("Laptop Accessories");
Category laptopTabletPCs = new Category("Tablet PCs");
laptops.AddChildCategory(laptopAccessories);
laptops.AddChildCategory(laptopTabletPCs);
computer.AddChildCategory(laptops);

(Notice that you use the convenience method AddChildCategory() to set both ends of the association link in one call, as described in chapter 3.)
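As a reminder of what such a convenience method does, here is a minimal sketch in plain C#. It assumes a simplified Category class; a mapped class would also need whatever collection type and member modifiers the mapping requires, and those details are omitted here.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical version of the Category class used in this section.
public class Category
{
    private readonly HashSet<Category> childCategories = new HashSet<Category>();

    public Category(string name) { Name = name; }

    public string Name { get; set; }
    public Category ParentCategory { get; set; }
    public ICollection<Category> ChildCategories { get { return childCategories; } }

    // Sets both ends of the parent/child link in one call.
    public void AddChildCategory(Category child)
    {
        if (child == null) throw new ArgumentNullException("child");

        // If the child already belongs to another parent, unlink it first.
        if (child.ParentCategory != null)
            child.ParentCategory.ChildCategories.Remove(child);

        child.ParentCategory = this;
        childCategories.Add(child);
    }
}
```

Keeping both assignments in one method means the in-memory graph can't end up with a child that points to a parent whose collection doesn't contain it.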

It would be undesirable to have to save each of the three new categories individually. Fortunately, because you mapped the ChildCategories association with cascade="save-update", you don’t need to. The same code you used before to save the single Laptops category will save all three new categories in a new session:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.Save(laptops);
    session.Transaction.Commit();
}

You’re probably wondering why the cascade style is called cascade="save-update" rather than cascade="save". Having just made all three categories persistent previously, suppose you made the following changes to the category hierarchy in a subsequent request (outside of a session and transaction):

laptops.Name = "Laptop Computers";
laptopAccessories.Name = "Accessories & Parts";
laptopTabletPCs.Name = "Tablet Computers";
Category laptopBags = new Category("Laptop Bags");
laptops.AddChildCategory(laptopBags);

You add a new category as a child of the Laptops category and modify all three existing categories. The following code updates three old Category instances and inserts the new one:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.Update(laptops);
    session.Transaction.Commit();
}

Specifying cascade="save-update" on the ChildCategories association accurately reflects the fact that NHibernate determines what is needed to persist the objects to the database. In this case, it reattaches/updates the three detached categories (laptops, laptopAccessories, and laptopTabletPCs) and saves the new child category (laptopBags).

Notice that the last code example differs from the previous two session examples only in a single method call. The last example uses Update() instead of Save() because laptops is now a detached instance that already has a row in the database.

You can rewrite all the examples to use the SaveOrUpdate() method. Then the three code snippets are identical:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    session.SaveOrUpdate(laptops);
    session.Transaction.Commit();
}

The SaveOrUpdate() method tells NHibernate to propagate the state of an instance to the database by creating a new database row if the instance is a new transient instance or by updating the existing row if the instance is a detached instance. In other words, it does exactly the same thing with the laptops category as cascade="save-update" did with the child categories of laptops.

One final question: how did NHibernate know which children were detached and which were new transient instances?

4.3.4. Distinguishing between transient and detached instances

Because NHibernate doesn’t keep a reference to a detached instance, you have to let NHibernate know how to distinguish between a detached instance like laptops (if it was created in a previous session) and a new transient instance like laptopBags.

A range of options is available. NHibernate assumes that an instance is an unsaved transient instance if

  • The identifier property (if it exists) is null.
  • The version property (if it exists) is null.
  • You supply an unsaved-value in the mapping document for the class, and the value of the identifier property matches.
  • You supply an unsaved-value in the mapping document for the version property, and the value of the version property matches.
  • You supply an NHibernate IInterceptor and return true from IInterceptor.IsUnsaved() after checking the instance in your code.

The example domain model uses the primitive type long everywhere as the identifier property type. Because it isn’t nullable, the identifier mapping in all your classes looks like this:

<class name="Category" table="CATEGORY">
    <id name="Id" unsaved-value="0">
        <generator class="native"/>
    </id>
    ...
</class>

The unsaved-value attribute tells NHibernate to treat instances of Category with an identifier value of 0 as newly instantiated transient instances. The default value for the attribute unsaved-value is null if the type is nullable; otherwise, it’s the default value of the type (0 for numerical types). Because you’ve chosen long as the identifier property type, you can omit the unsaved-value attribute in your auction application classes. Technically, NHibernate tries to guess the unsaved-value by instantiating an empty object and retrieving default property values from it.
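The identifier-based rules can be pictured with a small toy function. This is only a model of the decision, not NHibernate’s implementation; TransientCheck and LooksTransient are hypothetical names, and the unsavedValue parameter plays the role of the unsaved-value mapping attribute.

```csharp
public static class TransientCheck
{
    // Toy model of the identifier rules only (version and interceptor rules are ignored).
    public static bool LooksTransient(object identifier, object unsavedValue)
    {
        // A null identifier means the instance was never saved.
        if (identifier == null) return true;

        // Otherwise the instance counts as unsaved when its identifier
        // equals the configured unsaved-value (0 for a long identifier).
        return object.Equals(identifier, unsavedValue);
    }
}
```

With a long identifier and an unsaved-value of 0, a new Category whose Id is still 0 looks transient, whereas a detached Category carrying a generated identifier such as 42 doesn’t.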


Unsaved assigned identifiers

This approach works nicely for synthetic identifiers, but it breaks down in the case of keys assigned by the application, including composite keys in legacy systems. We discuss this issue in section 10.2. Avoid application-assigned (and composite) keys in new applications if possible (this is important for non-versioned entities).


You now have the knowledge to optimize your NHibernate application and reduce the number of calls to the persistence manager if you want to save and delete objects. Check the unsaved-value attributes of all your classes and experiment with detached objects to get a feel for the NHibernate transitive persistence model.

Having focused on how to persist objects with NHibernate, we can now switch perspectives and focus on how you go about retrieving (or loading) them.

4.4. Retrieving objects

Retrieving persistent objects from the database is one of the most interesting (and complex) parts of working with NHibernate. NHibernate provides the following ways to get objects out of the database:

  • Navigating the object graph, starting from an already loaded object, by accessing the associated objects through property accessor methods such as aUser.Address.City. NHibernate automatically loads (or preloads) nodes of the graph while you navigate the graph if the ISession is open.
  • Retrieving by identifier, which is the most convenient and performant method when the unique identifier value of an object is known.
  • Using Hibernate Query Language (HQL), which is a full object-oriented query language.
  • Using the NHibernate ICriteria API, which provides a type-safe and object-oriented way to perform queries without the need for string manipulation. This facility includes queries based on an example object.
  • Using native SQL queries and having NHibernate take care of mapping the ADO.NET result sets to graphs of persistent objects.

Note

Using LINQ for NHibernate is another option available. This lets you specify your NHibernate queries using LINQ. At the time of writing, LINQ for NHibernate looks very promising despite the fact it’s still a work in progress. We don’t cover it in this book, but feel free to investigate further by visiting the NHContrib project website.


In your NHibernate applications, you’ll use a combination of these techniques. Each retrieval method may use a different fetching strategy—that is, a strategy that defines what part of the persistent object graph should be retrieved. The goal is to find the best retrieval method and fetching strategy for every use case in your application while at the same time minimizing the number of SQL queries for best performance.

We don’t discuss each retrieval method in detail in this section; instead, we focus on the basic fetching strategies and how to tune NHibernate mapping files for the best default fetching performance for all methods. Before we look at the fetching strategies, we provide an overview of the retrieval methods. Note that we mention the NHibernate caching system, but we fully explore it in the next chapter.

Let’s start with the simplest case: retrieving an object by giving its identifier value (navigating the object graph should be self-explanatory). You saw a simple retrieval by identifier earlier in this chapter, but there is more to know about it.

4.4.1. Retrieving objects by identifier

The following NHibernate code snippet retrieves a User object from the database:

User user = session.Get<User>(userID);

And here’s the code without .NET 2.0 generics:

User user = (User) session.Get(typeof(User), userID);

The Get() method is special because the identifier uniquely identifies a single instance of a class. Hence it’s common for applications to use the identifier as a convenient handle to a persistent object. Retrieval by identifier can use the cache when retrieving an object, avoiding a database hit if the object is already cached.

NHibernate also provides a Load() method:

User user = session.Load<User>(userID);

At first glance, the difference between these two methods is small. If Load() can’t find the object in the cache or database, an exception is thrown. The Load() method never returns null. The Get() method returns null if the object can’t be found.

The Load() method may return a proxy instead of a real persistent instance (when lazy loading is enabled). A proxy is a placeholder that triggers the loading of the real object when it’s accessed for the first time; we discuss proxies later in this section. It’s important to understand that Load() will return a proxy even if there is no row with the specified identifier; and an exception will be thrown if (and only if) NHibernate tries to load it. On the other hand, Get() never returns a proxy because it must return null if the entity doesn’t exist.

Choosing between Get() and Load() is easy: if you’re certain the persistent object exists, and nonexistence would be considered exceptional, Load() is a good option. If you aren’t certain there is a persistent instance with the given identifier, use Get() and test the return value to see if it’s null.
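The two choices can be wrapped in small helpers, as in the following sketch. The class and method names are hypothetical, and the code assumes an open ISession and a mapped entity type T. NHibernateUtil.Initialize() is used to force the proxy to load, so that a missing row is reported at a predictable point.

```csharp
using NHibernate;

public static class Retrieval
{
    // Get(): a missing row is an expected outcome and is reported as null.
    public static bool Exists<T>(ISession session, object id) where T : class
    {
        return session.Get<T>(id) != null;
    }

    // Load(): a missing row is exceptional. Because Load() may return an
    // uninitialized proxy, initialization is forced here so that the
    // exception (normally ObjectNotFoundException) surfaces inside this method.
    public static T LoadOrThrow<T>(ISession session, object id) where T : class
    {
        T entity = session.Load<T>(id);
        NHibernateUtil.Initialize(entity);
        return entity;
    }
}
```

Forcing initialization gives up the main benefit of Load(), which is avoiding the database hit, so a helper like LoadOrThrow only makes sense when you need the object’s state anyway.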

What if this object is already in the session’s cache as an uninitialized proxy? In this case, Load() will return the proxy as is, but Get() will initialize it before returning it.

Using Load() has a further implication: the application may retrieve a valid reference (a proxy) to a persistent instance without hitting the database to retrieve its persistent state. Load() may not throw an exception when it doesn’t find the persistent object in the cache or database; the exception may be thrown later, when the proxy is accessed.

This behavior has an interesting application. Let’s say lazy loading is enabled on the class Category, and analyze the following code:

using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
    Category parent = session.Load<Category>(anId);
    Console.WriteLine( parent.Id );
    Category child = new Category("test");
    child.ParentCategory = parent;
    session.Save(child);
    session.Transaction.Commit();
}

You first load a category. NHibernate doesn’t hit the database to do this: it returns a proxy. Accessing the identifier of this proxy doesn’t cause its initialization (as long as the identifier is mapped with the access strategy "property" or "nosetter"). Then you link a new category to the proxy, and you save it. An INSERT statement is executed to save the row with the foreign key value of the proxy’s identifier. No SELECT statement is executed!

Now, let’s explore arbitrary queries, which are far more flexible than retrieving objects by identifier.

4.4.2. Introducing Hibernate Query Language

Hibernate Query Language (HQL) is an object-oriented dialect of the familiar relational query language SQL. HQL bears close resemblances to ODMG OQL and EJB-QL (from Java); but unlike OQL, it’s adapted for use with SQL databases, and it’s much more powerful and elegant than EJB-QL. JPA QL is a subset of HQL. HQL is easy to learn with a basic knowledge of SQL.

HQL isn’t a data-manipulation language like SQL. It’s used only for object retrieval, not for updating, inserting, or deleting data. Object-state synchronization is the job of the persistence manager, not the developer.

Most of the time, you’ll only need to retrieve objects of a particular class and restrict by the properties of that class. For example, the following query retrieves a user by first name:

IQuery q = session.CreateQuery("from User u where u.Firstname = :fname");
q.SetString("fname", "Max");
IList<User> result = q.List<User>();

After preparing query q, you bind a value to the named parameter fname. The result is returned as a generic IList of User objects.

Note that, instead of obtaining this list, you can provide one using q.List(myEmptyList), and NHibernate will fill it. This is useful when you want to use a collection with additional functionality (like advanced data binding).

HQL is powerful, and even though you may not use the advanced features all the time, you’ll need them for some difficult problems. For example, HQL supports the following:

  • Applying restrictions to properties of associated objects related by reference or held in collections (to navigate the object graph using query language).
  • Retrieving only properties of an entity or entities, without the overhead of loading the entity itself in a transactional scope. This is sometimes called a report query; it’s more correctly called projection.
  • Ordering the query’s results.
  • Paginating the results.
  • Aggregating with group by, having, and aggregate functions like sum, min, and max.
  • Performing outer joins when retrieving multiple objects per row.
  • Calling user-defined SQL functions.
  • Performing subqueries (nested queries).
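To make two of these features concrete, here is a small sketch that combines ordering and pagination. The UserPaging class and the Lastname property are assumptions made for the example; SetFirstResult() and SetMaxResults() are the IQuery methods that limit the returned rows.

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical helper class, for illustration only.
public static class UserPaging
{
    // Returns one page of users, ordered by last name.
    // pageIndex is zero-based; Lastname is an assumed property of User.
    public static IList<User> GetPage(ISession session, int pageIndex, int pageSize)
    {
        return session.CreateQuery("from User u order by u.Lastname asc")
            .SetFirstResult(pageIndex * pageSize) // number of rows to skip
            .SetMaxResults(pageSize)              // number of rows to return
            .List<User>();
    }
}
```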

We discuss all these features in chapter 8, together with the optional native SQL query mechanism. We now look at another approach to issuing queries with NHibernate: Query by Criteria.

4.4.3. Query by Criteria

The NHibernate Query by Criteria (QBC) API lets you build a query by manipulating criteria objects at runtime. This approach lets you specify constraints dynamically without direct string manipulations, but it doesn’t lose much of the flexibility or power of HQL. On the other hand, queries expressed as criteria are often less readable than queries expressed in HQL.

Retrieving a user by first name is easy using a Criteria object:

ICriteria criteria = session.CreateCriteria(typeof(User));
criteria.Add( Expression.Like("Firstname", "Pierre Henri") );
IList result = criteria.List();

An ICriteria is a tree of ICriterion instances. The Expression class provides static factory methods that return ICriterion instances. Once the desired criteria tree is built, it’s executed against the database.

Many developers prefer QBC, considering it a more object-oriented approach. They also like the fact that the query syntax may be parsed and validated at compile time, whereas HQL expressions aren’t parsed until runtime.

The nice thing about the NHibernate ICriteria API is the ICriterion framework. This framework allows extension by the user, which is difficult in the case of a query language like HQL.
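The following sketch shows what building constraints dynamically can look like: a restriction is added only when the caller supplies a value. The UserSearch class and the Lastname property are hypothetical. The using directive assumes NHibernate 1.2, where these types live in the NHibernate.Expression namespace (later versions moved them to NHibernate.Criterion).

```csharp
using System.Collections;
using NHibernate;
using NHibernate.Expression; // NHibernate 1.2 namespace; an assumption about your version

// Hypothetical helper class, for illustration only.
public static class UserSearch
{
    // Adds a criterion only for the arguments that were actually provided.
    public static IList Find(ISession session, string firstname, string lastname)
    {
        ICriteria criteria = session.CreateCriteria(typeof(User));

        if (!string.IsNullOrEmpty(firstname))
        {
            // Matches first names that start with the given text.
            criteria.Add(Expression.Like("Firstname", firstname, MatchMode.Start));
        }
        if (!string.IsNullOrEmpty(lastname))
        {
            criteria.Add(Expression.Eq("Lastname", lastname));
        }

        criteria.AddOrder(Order.Asc("Lastname"));
        return criteria.List();
    }
}
```

No query string is concatenated at any point; the criteria tree grows by one node for each constraint that applies.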

4.4.4. Query by Example

As part of the QBC facility, NHibernate supports Query by Example (QBE). The idea behind QBE is that the application supplies an instance of the queried class with certain property values set (to nondefault values). The query returns all persistent instances with matching property values. QBE isn’t a particularly powerful approach, but it can be convenient for some applications. The following code snippet demonstrates an NHibernate QBE:

User exampleUser = new User();
exampleUser.Firstname = "Max";
ICriteria criteria = session.CreateCriteria(typeof(User));
criteria.Add( Example.Create(exampleUser) );
IList result = criteria.List();

A typical use case for QBE is a search screen that allows users to specify a range of property values to be matched by the returned result set. This kind of functionality can be difficult to express cleanly in a query language; string manipulations would be required to specify a dynamic set of constraints.
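A search screen of this kind might be sketched as follows. The UserSearchScreen class is hypothetical, and the example assumes the screen has already copied the user's input into exampleUser. The Example options used here relax the matching: string properties are compared with like, case is ignored, and zero-valued properties are left out of the query.

```csharp
using System.Collections;
using NHibernate;
using NHibernate.Expression; // NHibernate 1.2 namespace; an assumption about your version

// Hypothetical helper class, for illustration only.
public static class UserSearchScreen
{
    // exampleUser carries the values the user typed into the search form.
    public static IList Search(ISession session, User exampleUser)
    {
        Example example = Example.Create(exampleUser)
            .EnableLike(MatchMode.Anywhere) // substring match on string properties
            .IgnoreCase()                   // case-insensitive string comparison
            .ExcludeZeroes();               // skip numeric properties left at zero

        return session.CreateCriteria(typeof(User))
            .Add(example)
            .List();
    }
}
```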

Both the QBC API and the example query mechanism are discussed in more detail in chapter 8.

You now know the basic retrieval options in NHibernate. We focus on strategies for fetching object graphs in the rest of this section. A fetching strategy defines what part of the object graph (or, what subgraph) is retrieved with a query or load operation.

4.4.5. Fetching strategies

In traditional relational data access, you fetch all the data required for a particular computation with a single SQL query, taking advantage of inner and outer joins to retrieve related entities. Some primitive ORM implementations fetch data piecemeal, with many requests for small chunks of data in response to the application’s navigating a graph of persistent objects. This approach doesn’t make efficient use of the relational database’s join capabilities. In fact, this data-access strategy scales poorly by nature. One of the most difficult problems in ORM—probably the most difficult—is providing for efficient access to relational data, given an application that prefers to treat the data as a graph of objects.

For the kinds of applications we’ve often worked with (multiuser, distributed, web, and enterprise applications), object retrieval using many round trips to/from the database is unacceptable. We argue that tools should emphasize the R in ORM to a much greater extent than has been traditional.

The problem of fetching object graphs efficiently (with minimal access to the database) has often been addressed by providing association-level fetching strategies specified in metadata of the association mapping. The trouble with this approach is that each piece of code that uses an entity requires a different set of associated objects, so a single strategy fixed in the mapping isn’t enough. We argue that what is needed is support for fine-grained runtime association fetching strategies. NHibernate supports both: it lets you specify a default fetching strategy in the mapping file and then override it at runtime in code.

NHibernate allows you to choose among four fetching strategies for any association, in association metadata and at runtime:

  • Immediate fetching: The associated object is fetched immediately, using a sequential database read (or cache lookup).
  • Lazy fetching: The associated object or collection is fetched “lazily,” when it’s first accessed. This results in a new request to the database (unless the associated object is cached).
  • Eager fetching: The associated object or collection is fetched together with the owning object, using a SQL outer join, and no further database request is required.
  • Batch fetching: This approach may be used to improve the performance of lazy fetching by retrieving a batch of objects or collections when a lazy association is accessed. (Batch fetching may also be used to improve the performance of immediate fetching.)

Let’s look more closely at each fetching strategy.

Immediate Fetching

Immediate association fetching occurs when you retrieve an entity from the database and then immediately retrieve another associated entity or entities in a further request to the database or cache. Immediate fetching isn’t usually an efficient fetching strategy unless you expect the associated entities to almost always be cached already.

Lazy Fetching

When a client requests an entity and its associated graph of objects from the database, it isn’t usually necessary to retrieve the whole graph of every (indirectly) associated object. You wouldn’t want to load the whole database into memory at once; for example, loading a single Category shouldn’t trigger the loading of all Items in that category.

Lazy fetching lets you decide how much of the object graph is loaded in the first database hit and which associations should be loaded only when they’re first accessed. Lazy fetching is a foundational concept in object persistence and the first step to attaining acceptable performance.

Since NHibernate 1.2, all associations are configured for lazy fetching by default; you can easily change this behavior by setting default-lazy="false" in <hibernate-mapping> of your mapping files. But we recommend that you keep this strategy and override it at runtime by queries that force eager fetching to occur.

Eager (Outer Join) Fetching

Lazy association fetching can help reduce database load and is often a good default strategy. But it’s like a blind guess as far as performance optimization goes.

Eager fetching lets you explicitly specify which associated objects should be loaded together with the referencing object. NHibernate can then return the associated objects in a single database request, utilizing a SQL outer join. Performance optimization in NHibernate often involves judicious use of eager fetching for particular transactions. Even though default eager fetching may be declared in the mapping file, it’s more common to specify the use of this strategy at runtime for a particular HQL or criteria query.
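As a sketch of what specifying eager fetching at runtime can look like, the two methods below load an Item together with its Bids collection, once with HQL and once with a criteria query. The ItemLoader class is hypothetical, and the identifier property name Id and its type long are assumptions about the Item mapping.

```csharp
using NHibernate;
using NHibernate.Expression; // NHibernate 1.2 namespace; an assumption about your version

// Hypothetical helper class, for illustration only.
public static class ItemLoader
{
    // HQL: "left join fetch" asks for the Bids collection in the same SELECT.
    public static Item LoadWithBidsHql(ISession session, long itemId)
    {
        return (Item) session
            .CreateQuery("from Item i left join fetch i.Bids where i.Id = :id")
            .SetInt64("id", itemId)
            .UniqueResult();
    }

    // Criteria: FetchMode.Join overrides the mapped strategy for this query only.
    public static Item LoadWithBidsCriteria(ISession session, long itemId)
    {
        return (Item) session.CreateCriteria(typeof(Item))
            .Add(Expression.Eq("Id", itemId))
            .SetFetchMode("Bids", FetchMode.Join)
            .UniqueResult();
    }
}
```

Both variants leave the mapping untouched, so other use cases keep the lazy default.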

Batch Fetching

Batch fetching isn’t strictly an association fetching strategy; it’s a technique that may help improve the performance of lazy (or immediate) fetching. Usually, when you load an object or collection, your SQL WHERE clause specifies the identifier of the object or the object that owns the collection. If batch fetching is enabled, NHibernate looks to see what other proxied instances or uninitialized collections are referenced in the current session and tries to load them at the same time by specifying multiple identifier values in the WHERE clause.

We aren’t great fans of this approach; eager fetching is almost always faster. Batch fetching is useful for inexperienced users who wish to achieve acceptable performance in NHibernate without having to think too hard about the SQL that will be executed.

We now declare the fetching strategy for some associations in our mapping metadata.

4.4.6. Selecting a fetching strategy in mappings

NHibernate lets you select default association fetching strategies by specifying attributes in the mapping metadata. You can override the default strategy using features of NHibernate’s query methods, as you’ll see in chapter 8. A minor caveat: You don’t have to understand every option presented in this section immediately; we recommend that you get an overview first and use this section as a reference when you’re optimizing the default fetching strategies in your application.

A wrinkle in NHibernate’s mapping format means that collection mappings function slightly differently than single-point associations; we cover the two cases separately. Let’s first consider both ends of the bidirectional association between Bid and Item.

Single Point Associations

For a <many-to-one> or <one-to-one> association, lazy fetching is possible only if the associated class mapping enables proxying. For the Item class, you enable proxying by specifying lazy="true" (since NHibernate 1.2, this is the default value):

<class name="Item" lazy="true">

Now, remember the association from Bid to Item:

<many-to-one name="item" class="Item"/>

When you retrieve a Bid from the database, the association property may hold an instance of an NHibernate generated subclass of Item that delegates all method invocations to a different instance of Item that is fetched lazily from the database (this is the more elaborate definition of an NHibernate proxy).

In order to delegate method (and property) invocations, these members need to be virtual. NHibernate 1.2 uses a validator that verifies that proxied entities have a default constructor which isn’t private, that they aren’t sealed, that all public methods and properties are virtual, and that there is no public field. It’s possible to turn off this validator, but think carefully about why you’re doing so. Here is the element to add to your configuration file to turn it off:

<property name="hibernate.use_proxy_validator">false</property>

Or you can do it programmatically, before building the session factory, using cfg.Properties[NHibernate.Cfg.Environment.UseProxyValidator]="false".
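For reference, a class that follows the validator’s rules might look like the sketch below. It isn’t the book’s actual Item class: the members are invented for illustration, and only the shape matters (not sealed, a default constructor that isn’t private, virtual public members, and no public fields).

```csharp
// Illustrative entity shaped to satisfy the proxy validator; members are hypothetical.
public class Item
{
    // State is kept in private fields; no field is public.
    private long id;
    private string name;

    // Default constructor that isn't private, so a generated subclass can call it.
    protected Item() { }

    public Item(string name)
    {
        this.name = name;
    }

    // Every public property and method is virtual so the proxy can intercept it.
    public virtual long Id
    {
        get { return id; }
        protected set { id = value; }
    }

    public virtual string Name
    {
        get { return name; }
        set { name = value; }
    }

    public virtual bool HasName()
    {
        return !string.IsNullOrEmpty(name);
    }
}
```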

NHibernate uses two different instances so that even polymorphic associations can be proxied—when the proxied object is fetched, it may be an instance of a mapped subclass of Item (if there were any subclasses of Item, that is). You can even choose any interface implemented by the Item class as the type of the proxy. To do so, declare it using the proxy attribute, instead of specifying lazy="true":

<class name="Item" proxy="ItemInterface">

As soon as you declare the proxy or lazy attribute on Item, any single-point association to Item is proxied and fetched lazily, unless that association overrides the fetching strategy by declaring the outer-join attribute.

There are three possible values for outer-join:

  • outer-join="auto": The default. When the attribute isn’t specified, NHibernate fetches the associated object lazily if the associated class has proxying enabled, or eagerly using an outer join if proxying is disabled.
  • outer-join="true": NHibernate always fetches the association eagerly using an outer join, even if proxying is enabled. This allows you to choose different fetching strategies for different associations to the same proxied class. It’s equivalent to fetch="join".
  • outer-join="false": NHibernate never fetches the association using an outer join, even if proxying is disabled. This is useful if you expect the associated object to exist in the second-level cache (see chapter 6). If it isn’t available in the second-level cache, the object is fetched immediately using an extra SQL SELECT. This option is equivalent to fetch="select".

If you wanted to re-enable eager fetching for the association, now that proxying is enabled, you would specify

<many-to-one name="item" class="Item" outer-join="true"/>

For a one-to-one association (discussed in more detail in chapter 7), lazy fetching is conceptually possible only when the associated object always exists. You indicate this by specifying constrained="true". For example, if an item can have only one bid, the mapping for the Bid is

<one-to-one name="item" class="Item" constrained="true"/>

The constrained attribute has an interpretation similar to that of the not-null attribute of a <many-to-one> mapping. It tells NHibernate that the associated object is required and thus can’t be null.

To enable batch fetching, you specify the batch-size in the mapping for Item:

<class name="Item" lazy="true" batch-size="9">

The batch size limits the number of items that may be retrieved in a single batch. Choose a reasonably small number here.

You’ll meet the same attributes (outer-join, batch-size, and lazy) when we consider collections, but the interpretation is slightly different.

Collections

In the case of collections, fetching strategies apply not just to entity associations but also to collections of values (for example, a collection of strings could be fetched by an outer join).

Just like classes, collections have their own proxies, which we usually call collection wrappers. Unlike classes, the collection wrapper is always there, even if lazy fetching is disabled (NHibernate needs the wrapper to detect collection modifications).

Collection mappings may declare a lazy attribute, an outer-join attribute, neither, or both (specifying both isn’t meaningful). The meaningful options are as follows:

  • outer-join="false" lazy="false": The collection is fetched from the second-level cache or by an immediate extra SQL SELECT. Before NHibernate 1.2, this was also the behavior when neither attribute was specified. This option is most useful when the second-level cache is enabled for this collection.
  • outer-join="true": NHibernate fetches the association eagerly using an outer join. At the time of this writing, NHibernate is able to fetch only one collection per SQL SELECT, so it isn’t possible to declare multiple collections belonging to the same persistent class with outer-join="true".
  • lazy="true": NHibernate fetches the collection lazily, when it’s first accessed. Since NHibernate 1.2, this is the default option, and we recommend that you keep this option as a default for all your collection mappings.

We don’t recommend eager fetching for collections, so you’ll map the item’s collection of bids with lazy="true". This option is almost always used for collection mappings (although it’s the default since NHibernate 1.2, we’ll continue to write it explicitly for emphasis):

<set name="Bids" lazy="true">
<key column="ITEM_ID"/>
<one-to-many class="Bid"/>
</set>

You can even enable batch fetching for the collection. In this case, the batch size doesn’t refer to the number of bids in the batch; it refers to the number of collections of bids:

<set name="Bids" lazy="true" batch-size="9">
<key column="ITEM_ID"/>
<one-to-many class="Bid"/>
</set>

This mapping tells NHibernate to load up to nine collections of bids in one batch, depending on how many uninitialized collections of bids are currently present in the items associated with the session. In other words, if five Item instances have persistent state in an ISession, and all have an uninitialized Bids collection, NHibernate will automatically load all five collections in a single SQL query if one is accessed. If there are 11 items, only 9 collections will be fetched in that batch; the remaining 2 are loaded in a later batch when one of them is accessed. Batch fetching can significantly reduce the number of queries required for hierarchies of objects (for example, when loading the tree of parent and child Category objects).
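The effect is easiest to see in code that walks several items and touches each Bids collection. The sketch below is illustrative only: the BidCounter class is hypothetical, and it assumes the Bids collection is mapped with lazy="true" and batch-size="9" as shown above.

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical helper class, for illustration only.
public static class BidCounter
{
    // Counts all bids across all items, relying on batch fetching of the Bids collections.
    public static int CountAllBids(ISession session)
    {
        IList<Item> items = session.CreateQuery("from Item").List<Item>();

        int total = 0;
        foreach (Item item in items)
        {
            // Accessing an uninitialized Bids collection triggers a load.
            // With batch-size="9", that load also initializes up to eight other
            // pending Bids collections of items held by the same session.
            total += item.Bids.Count;
        }
        return total;
    }
}
```

Without the batch-size attribute, the same loop would issue one SELECT per item in addition to the initial query.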

Let’s talk about a special case: many-to-many associations (we discuss this mapping in more detail in chapter 7). You usually use a link table (some developers also call it a relationship table or association table) that holds only the key values of the two associated tables and therefore allows a many-to-many multiplicity. This additional table must be considered if you decide to use eager fetching. Look at the following straightforward many-to-many example, which maps the association from Category to Item:

<set name="Items" outer-join="true" table="CATEGORY_ITEM">
<key column="CATEGORY_ID"/>
<many-to-many column="ITEM_ID" class="Item"/>
</set>

In this case, the eager fetching strategy refers only to the association table CATEGORY_ITEM. If you load a Category with this fetching strategy, NHibernate automatically fetches all link entries from CATEGORY_ITEM in a single outer join SQL query, but not the item instances from ITEM!

The entities contained in the many-to-many association can also be fetched eagerly with the same SQL query. The <many-to-many> element lets you customize this behavior:

<set name="Items" outer-join="true" table="CATEGORY_ITEM">
<key column="CATEGORY_ID"/>
<many-to-many column="ITEM_ID" outer-join="true" class="Item"/>
</set>

NHibernate now fetches all Items in a Category with a single outer join query when the Category is loaded. But keep in mind that we usually recommend lazy loading as the default fetching strategy and that NHibernate is limited to one eagerly fetched collection per mapped persistent class.

Setting the Fetch Depth

We now discuss a global fetching strategy setting: the maximum fetch depth. This setting controls the number of outer-joined tables NHibernate uses in a single SQL query. Consider the complete association chain from Category to Item, and from Item to Bid. The first is a many-to-many association, and the second is one-to-many; hence both associations are mapped with collection elements. If you declare outer-join="true" for both associations (don’t forget the special <many-to-many> declaration) and load a single Category, how many queries will NHibernate execute? Will only the Items be eagerly fetched, or also all the Bids of each Item?

You probably expect a single query with an outer join operation including the CATEGORY, CATEGORY_ITEM, ITEM, and BID tables. But this isn’t the case by default.

NHibernate’s outer join fetch behavior is controlled with the global configuration option hibernate.max_fetch_depth. If you set this to 1 (also the default), NHibernate fetches only the Category and the link entries from the CATEGORY_ITEM association table. If you set it to 2, NHibernate executes an outer join that also includes the Items in the same SQL query. Setting this option to 3 won’t, as you might have expected, also include the bids of each item in the same SQL query. The limitation to one outer joined collection applies here, preventing slow Cartesian products.

Recommended values for the fetch depth depend on the join performance and the size of the database tables; test your applications with low values (less than 4) first, and decrease or increase the number while tuning your application. The global maximum fetch depth also applies to single-ended associations (<many-to-one>, <one-to-one>) mapped with an eager fetching strategy or using the auto default.
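As a configuration sketch, the option can be set alongside the other properties of the session factory, in the same style as the proxy validator setting shown earlier. The value 2 is only an example, chosen to match the Category-to-Item discussion above.

```xml
<!-- Goes with the other property elements of the session-factory configuration. -->
<!-- The value 2 is an example, not a recommendation for every application. -->
<property name="hibernate.max_fetch_depth">2</property>
```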

Keep in mind that eager fetching strategies declared in the mapping metadata are effective only if you use retrieval by identifier, use the criteria query API, or navigate through the object graph manually. Any HQL query may specify its own fetching strategy at runtime, thus ignoring the mapping defaults. You can also override the defaults (that is, not ignore them) with criteria queries. This is an important difference, and we cover it in more detail in section 8.3.2.

But you may sometimes want to initialize a proxy or a collection wrapper manually with a simple API call.

Initializing Lazy Associations

A proxy or collection wrapper is automatically initialized when any of its methods are invoked (except the identifier property getter, which may return the identifier value without fetching the underlying persistent object). But it’s only possible to initialize a proxy or collection wrapper if it’s currently associated with an open ISession. If you close the session and try to access an uninitialized proxy or collection, NHibernate throws a LazyInitializationException.

Because of this behavior, it’s sometimes useful to explicitly initialize an object before closing the session. This approach isn’t as flexible as retrieving the complete required object subgraph with an HQL query, using arbitrary fetching strategies at runtime.

You use the static method NHibernateUtil.Initialize() for manual initialization:

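// Declared outside the using block so it's still in scope after the session is closed
Category cat;
using( ISession session = sessionFactory.OpenSession() )
using( session.BeginTransaction() ) {
cat = session.Get<Category>(id);
// Force the lazy collection to load while the session is still open
NHibernateUtil.Initialize( cat.Items );
session.Transaction.Commit();
}
// Safe after the session is closed: the collection is already initialized
foreach(Item item in cat.Items)
//...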

NHibernateUtil.Initialize() may be passed a collection wrapper, as in this example, or a proxy. You may also, in similar rare cases, check the current state of a property by calling NHibernateUtil.IsInitialized(). (Note that Initialize() doesn’t cascade to any associated objects.)
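The two calls can be combined, as in the following sketch. The CategoryLoader class is hypothetical, and the identifier type long is an assumption about the Category mapping.

```csharp
using NHibernate;

// Hypothetical helper class, for illustration only.
public static class CategoryLoader
{
    // Returns a Category whose Items collection can be used after the session is closed.
    public static Category LoadWithItems(ISessionFactory sessionFactory, long id)
    {
        using (ISession session = sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            Category cat = session.Get<Category>(id);

            // Get() returns null when no row exists, so guard before touching Items.
            if (cat != null && !NHibernateUtil.IsInitialized(cat.Items))
            {
                NHibernateUtil.Initialize(cat.Items);
            }

            tx.Commit();
            return cat;
        }
    }
}
```

Because Initialize() doesn’t cascade, any lazy associations of the individual Item instances remain uninitialized after this call.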

Another solution for this problem is to keep the session open until the application thread finishes, so you can navigate the object graph whenever you like and have NHibernate automatically initialize all lazy references. This is a problem of application design and transaction demarcation; we discuss it again in section 9.1. But your first choice should be to fetch the complete required graph, using HQL or criteria queries, with a sensible and optimized default fetching strategy in the mapping metadata for all other cases. NHibernate allows you to look at the underlying SQL that it sends to the database, so it’s possible to tune object retrieval if performance problems are observed. This is discussed in the next section.

4.4.7. Tuning object retrieval

In most cases, your NHibernate applications will perform well when it comes to fetching data from the database. But occasionally, you may notice that some areas of your application aren’t performing as well as they should. There can be many reasons for this; you need to understand how to analyze and tune your NHibernate applications so they work efficiently with the database. Let’s look at the steps involved when you’re tuning the object-retrieval operations in your application.

Enable the NHibernate SQL log, as described in chapter 3. You should also be prepared to read, understand, and evaluate SQL queries and their performance characteristics for your specific relational model: will a single join operation be faster than two selects? Are all the indexes used properly, and what is the cache-hit ratio inside the database? Get your DBA to help you with the performance evaluation; only she will have the knowledge to decide which SQL execution plan is the best.

Step through your application use case by use case, and note how many and what SQL statements NHibernate executes. A use case can be a single screen in your web application or a sequence of user dialogs. This step also involves collecting the object-retrieval methods you use in each use case: walking the graph, retrieval by identifier, HQL, and criteria queries. Your goal is to bring down the number (and complexity) of SQL queries for each use case by tuning the default fetching strategies in metadata.

You may encounter two common issues:

  • If the SQL statements use join operations that are too complex and slow, set outer-join to false for <many-to-one> associations (this is enabled by default). Also try to tune with the global hibernate.max_fetch_depth configuration option, but keep in mind that this is best left at a value between 1 and 4.
  • If too many SQL statements are executed, use lazy="true" for all collection mappings (the default since NHibernate 1.2); with lazy="false", NHibernate executes an immediate additional fetch for the collection elements (which, if they’re entities, can cascade further into the graph). In rare cases, if you’re sure, enable outer-join="true" and disable lazy loading for particular collections. Keep in mind that only one collection property per persistent class may be fetched eagerly. Use batch fetching with values between 3 and 15 to further optimize collection fetching if the given unit of work involves several of the same collections or if you’re accessing a tree of parent and child objects.

After you set a new fetching strategy, rerun the use case and check the generated SQL again. Note the SQL statements, and go to the next use case.

After you optimize all use cases, check every use case again and see if any optimizations had side effects for others. With some experience, you’ll be able to avoid negative effects and get it right the first time.

This optimization technique isn’t practical only for the default fetching strategies; you can also use it to tune HQL and criteria queries, which can ignore and override the default fetching for specific use cases and units of work. We discuss runtime fetching in chapter 8.

In this section, you’ve started to think about performance issues, especially issues related to association fetching. The quickest way to fetch a graph of objects is to fetch it from the cache in memory, as we show in the next chapter.

4.5. Summary

The dynamic aspects of the object/relational mismatch are just as important as the better-known and better-understood structural mismatch problems. In this chapter, we were primarily concerned with the lifecycle of objects with respect to the persistence mechanism. We discussed the three object states defined by NHibernate: persistent, detached, and transient. Objects transition between these states when you invoke methods of the ISession interface, or when you create and remove references from a graph of already persistent instances. This latter behavior is governed by the configurable cascade styles available in NHibernate’s model for transitive persistence. This model lets you declare the cascading of operations (such as saving or deletion) on a per-association basis, which is more powerful and flexible than the traditional persistence by reachability model. Your goal is to find the best cascading style for each association and therefore minimize the number of persistence manager calls you have to make when storing objects.

Retrieving objects from the database is equally important: you can walk the graph of domain objects by accessing properties and let NHibernate transparently fetch objects. You can also load objects by identifier, write arbitrary queries in HQL, or create an object-oriented representation of your query using the Query by Criteria API. In addition, you can use native SQL queries in special cases.

Most of these object-retrieval methods use the default fetching strategies we defined in mapping metadata (HQL ignores them; criteria queries can override them). The correct fetching strategy minimizes the number of SQL statements that have to be executed by lazily, eagerly, or batch-fetching objects. You optimize your NHibernate application by analyzing the SQL executed in each use case and tuning the default and runtime fetching strategies.

Next, we explore the closely related topics of transactions and caching.