Chapter 7. Retrieving objects efficiently – NHibernate in Action

Chapter 7. Retrieving objects efficiently

This chapter covers

  • NHibernate query features
  • HQL, criteria, and native SQL
  • Advanced, reporting, and dynamic queries
  • Runtime fetching and query optimization

Queries are the most interesting part of writing good data access code. A complex query may require a long time to get right, and its impact on the performance of an application can be tremendous. As with regular SQL, writing NHibernate queries becomes much easier with experience.

If you’ve been using handwritten SQL for a number of years, you may be concerned that ORM will take away some of the expressiveness and flexibility you’re used to. This is seldom the case; NHibernate’s powerful query facilities allow you to do almost anything you would in SQL, and in some cases more. For the rare cases where you can’t make NHibernate’s own query facilities do exactly what you want, NHibernate allows you to retrieve objects using your database’s native SQL dialect.

In section 4.4, we mentioned that there are three ways to express queries in NHibernate. First is the HQL :

session.CreateQuery("from Category c where c.Name like 'Laptop%'");

Next is the ICriteria API:

session.CreateCriteria(typeof(Category))
.Add( Expression.Like("Name", "Laptop%") );

Finally, there is direct SQL, which automatically maps result sets to objects:

session.CreateSQLQuery(
"select {c.*} from CATEGORY {c} where NAME like 'Laptop%'",
"c",
typeof(Category));

This chapter covers in-depth techniques using all three methods. You can also use this chapter as a reference; some sections are written in a less verbose style but show many small code examples for different use cases.

Before we continue, we briefly introduce another method of querying NHibernate: LINQ-to-NHibernate. This allows you to write LINQ queries to query NHibernate like this:

from cat in session.Linq<Category>()
where cat.Name.StartsWith("Laptop")
select cat

LINQ-to-NHibernate is an exciting and welcome addition. Unfortunately, at the time of writing it is still very much in Beta, so we can’t cover it in detail here. If you want to try using LINQ-to-NHibernate, you can find the source code in the NHContrib project at http://nhcontrib.wiki.sourceforge.net/. We recommend downloading the source code and examining the unit tests to see up-to-date examples of how it can be used.

Let’s continue with our investigations into HQL, the Criteria API, and direct SQL queries. We start our exploration by showing you how queries are executed with NHibernate. Rather than focus on the queries themselves, we focus on the various techniques for creating and running queries using NHibernate. Following that, we move on to discuss the particulars of how queries are composed.

7.1. Executing queries

The IQuery and ICriteria interfaces both define several methods for controlling execution of a query. To execute a query in your application, you need to obtain an instance of one of these query interfaces using the ISession. Let’s take a quick look at how you can do that.

7.1.1. The query interfaces

To create a new IQuery instance, call either CreateQuery() or CreateSQLQuery(). IQuery can be used to prepare an HQL query as follows:

IQuery hqlQuery = session.CreateQuery("from User");

This query is now set up to fetch all user objects in the databases. You can achieve the same thing using the CreateSQLQuery() method, using the native SQL dialect of the underlying database:

IQuery sqlQuery = session.CreateSQLQuery(
"select {u.*} from USERS {u}", "u",
typeof(User));

You’ll learn more about running SQL queries in section 8.5.4. Finally, here’s how you use the strongly typed ICriteria interface to do the same thing in a different way:

ICriteria crit = session.CreateCriteria(typeof(User));

This last example uses CreateCriteria() to get back a list of objects. Notice that the root entity type you want the query to return is specified as User. We study criteria queries in detail later.


A note about CaveatEmptor

Some queries in this chapter won’t work with the CaveatEmptor code that accompanies this book. This is because certain techniques illustrated here require variations in the way classes and their mappings are defined.


Now we continue our discussion of creating and running queries by looking at another useful concept: pagination.

Paging the Result

Pagination is a commonly used technique, and you’ve probably seen it in action. For example, an eCommerce web site may display lists of products over a number of pages, each showing only 10 or 20 products at a time. Typically, users navigate to the next or previous pages by clicking appropriate links on the page. When writing data access code for this scenario, you need to work out how to show the correct page of records at any given time—and that’s what pagination is all about.

In NHibernate, both the IQuery and ICriteria interfaces make pagination simple, as demonstrated here:

IQuery query =
session.CreateQuery("from User u order by u.Name asc");
query.SetFirstResult(0);
query.SetMaxResults(10);

The call to SetMaxResults(10) limits the query result set to the first 10 objects selected by the database. What if you wanted to get some results for the next page?

ICriteria crit = session.CreateCriteria(typeof(User));
crit.AddOrder( Order.Asc("Name") );
crit.SetFirstResult(10);
crit.SetMaxResults(10);
IList<User> results = crit.List<User>();

Starting from the tenth object, you retrieve the next 10 objects. Note that there is no standard way to express pagination in SQL, and each database vendor often provides a different syntax and approach. Fortunately, NHibernate knows the tricks for each vendor, so paging is easily done regardless of your particular database.

IQuery and ICriteria also expose a fluent interface that allows method chaining. To demonstrate, we’ve rewritten the two previous examples to take advantage of this technique:

IList<User> results =
session.CreateQuery("from User u order by u.Name asc")
.SetFirstResult(0)
.SetMaxResults(10)
.List<User>();
IList<User> results =
session.CreateCriteria(typeof(User))
.AddOrder( Order.Asc("Name") )
.SetFirstResult(40)
.SetMaxResults(20)
.List<User>();

Chaining method calls this way is considered to be less verbose and easier to write, and it’s possible to do with many of NHibernate’s APIs.

Now that you’ve created queries and set up pagination, we look at how you get the results of a query.

Listing and Iterating Results

The List() method executes the query and returns the results as a list:

IList<User> result = session.CreateQuery("from User").List<User>();

When writing queries, sometimes you want only a single instance to be returned. For example, if you want to find the highest bid, you can get that instance by reading it from the result list by index: result[0]. Alternatively, you can use SetMaxResults(1) and execute the query with the UniqueResult() method:

Bid maxBid =
(Bid) session.CreateQuery("from Bid b order by b.Amount desc")
.SetMaxResults(1)
.UniqueResult();
Bid bid = (Bid) session.CreateCriteria(typeof(Bid))
.Add( Expression.Eq("Id", id) )
.UniqueResult();

You need to be sure your query returns only one object; otherwise, an exception will be thrown.

The IQuery and ISession interfaces also provide an Enumerable() method, which returns the same result as List()or Find(), but which uses a different strategy for retrieving the results. When you use Enumerable() to execute a query, NHibernate retrieves only the primary key (identifier) values in the first SQL select; it tries to find the rest of the state of the objects in the cache before querying again for the rest of the property values. You can use this technique to optimize loading in specific cases, as discussed in section 7.7.


Why not use ISession.Find() instead of IQuery.List()?

The ISession API provides shortcut methods for simple queries. Instead of creating an IQuery instance, you can also call ISession.Find("from User"). The result is the same as from IQuery.List(). The same is true for Enumerable().

But the query shortcut methods on the ISession API will be removed in the future to reduce the bloat of session methods. We recommend always using the IQuery API.


Finally, another important factor when constructing queries is binding parameters. The IQuery interface lets you achieve this in a flexible manner, as we discuss next.

7.1.2. Binding parameters

Allowing developers to bind values to queries is an important feature for any data access library because it permits you to construct queries that are both maintainable and secure. We demonstrate the types of parameter binding available in NHibernate; but first, let’s look at the potential problems of not binding parameters.

The Problem of SQL Injection Attacks

Consider the following code:

string queryString =
"from Item i where i.Description like '" + searchString + "'";
IList result = session.CreateQuery(queryString).List();

This code is plainly and simplly bad! You may know why: it can potentially leave your application open to SQL injection attacks. In such an attack, a malicious user attempts to trick your application into running the user’s own SQL against the database, in order to cause damage or circumnavigate application security. If that user typed this searchString

foo' and CallSomeStoredProcedure() and 'bar' = 'bar

the queryString sent to the database would be

from Item i where i.Description like 'foo' and CallSomeStoredProcedure()
and 'bar' = 'bar'

As you can see, the original queryString would no longer be a simple search for a string, but would also execute a stored procedure in the database!

One of the main problems here is that the application isn’t checking the values passed in from the user interface. Because of this, the quote characters aren’t escaped, and users can inject their own SQL. Users may even accidentally crash your application just by putting a single quote in the search string. The golden rule is, “Never pass unchecked values from user input to the database!”

Fortunately, you can easily avoid this problem by using parameters. With parameters, your query may look like this:

string queryString =
"from Items I where i.Description like :searchString"

When you use parameters, queries and parameters are sent to the database separately, so the database can ensure they’re dealt with securely and efficiently.

Another reason to use parameters is that they help NHibernate be more efficient. NHibernate keeps track of the queries you execute; when parameters are used, it needs to keeps only one copy of the query in memory, even if the query is run thousands of times with different parameters each time.

Now you understand the importance of parameters. How do you use them in your NHibernate queries? There are two approaches to parameter binding: named parameters and positional parameters. We discuss these in turn.

Using Named Parameters

Using named parameters, you can rewrite the earlier query as follows:

string queryString =
"from Item item where item.Description like :searchString";

The colon followed by a parameter name indicates a named parameter. Then you can use the IQuery interface to bind a value to the searchString parameter:

IList result = session.CreateQuery(queryString)
.SetString("searchString", searchString)
.List();

Because searchString is a user-supplied string variable, you use the SetString() method of the IQuery interface to bind it to the named parameter (searchString).

Often, you’ll need multiple parameters:

string queryString = @"from Item item
where item.Description like :searchString
and item.Date > :minDate";
IList result = session.CreateQuery(queryString)
.SetString("searchString", searchString)
.SetDate("minDate", minDate)
.List();

This code is cleaner, much safer, and performs better, because a single compiled SQL statement can be reused if only bind parameters change.

Using Positional Parameters

If you prefer, you can use positional parameters:

string queryString = @"from Item item
where item.Description like ?
and item.Date > ?";
IList result = session.createQuery(queryString)
.SetString(0, searchString)
.SetDate(1, minDate)
.List();

Not only is this code less self-documenting than the alternative that uses named parameters, but it’s also much more vulnerable to breakage if you change the query string slightly:

string queryString = @"from Item item
where item.Date > ?
and item.Description like ?";

Every change of the bind parameters’ positions requires a change to the parameter-binding code. This leads to fragile and maintenance-intensive code. We recommend that you avoid positional parameters:

string userSearch =
@"from User u where u.Username like :searchString
or u.Email like :searchString";
IList result = session.CreateQuery(userSearch)
.SetString("searchString", searchString)
.List();

Notice how the named parameter may appear multiple times in the query string.

Binding Arbitrary Arguments

You’ve used SetString() and SetDate() to bind arguments to query parameters. The IQuery interface provides similar convenience methods for binding arguments of most of the NHibernate built-in types: everything from SetInt32() to SetTime-stamp() and SetEnum().

A particularly useful method is SetEntity(), which lets you bind a persistent entity:

session.CreateQuery("from Item item where item.Seller = :seller")
.SetEntity("seller", seller)
.List();

In addition, a generic method allows you to bind an argument of any NHibernate type:

string queryString = @"from Item item
where item.Seller=:seller and
item.Description like :desc";
session.CreateQuery(queryString)
.SetParameter( "seller", seller,
NHibernateUtil.Entity(typeof(User)) )
.SetParameter( "desc", description, NHibernateUtil.String )
.List();

This even works for custom user-defined types like MonetaryAmount:

IQuery q =
session.CreateQuery("from Bid bid where bid.Amount > :amount");
q.SetParameter( "amount",
givenAmount,
NHibernateUtil.Custom(typeof(MonetaryAmountUserType)) );
IList<Bid> result = q.List<Bid>();

For some parameter types, it’s possible to guess the NHibernate type from the class of the parameter value. In this case, you don’t need to specify the NHibernate type explicitly:

string queryString = @"from Item item
where item.Seller = :seller and
item.Description like :desc";
session.CreateQuery(queryString)
.SetParameter("seller", seller)
.SetParameter("desc", description)
.List();

As you can see, it even works with entities, such as seller. This approach works nicely for string, int, and bool parameters, for example, but not so well for DateTime, where the NHibernate type may be Timestamp or DateTime. In that case, you have to use the appropriate binding method or explicitly use NHibernateUtil.DateTime (or any other NHibernate type) as the third argument to SetParameter().

If you have a POCO with Seller and Description properties, you can use the SetProperties() method to bind the query parameters. For example, you can pass query parameters in an instance of the Item class:

Item item = new Item();
item.Seller = seller;
item.Description = description;
string queryString = @"from Item item
where item.Seller=:seller and
item.Description like :desc";
session.CreateQuery(queryString).SetProperties(item).List();

SetProperties() matches the names of POCO properties to named parameters in the query string, using SetParameter() to guess the NHibernate type and bind the value. In practice, this turns out to be less useful than it sounds, because some common NHibernate types aren’t guessable (DateTime in particular).

The parameter-binding methods of IQuery are null-safe, making this code legal:

session.CreateQuery("from User as u where u.Email = :email")
.SetString("email", null)
.List();

But the result of this code is almost certainly not what you intended. The resulting SQL will contain a comparison like username = null, which always evaluates to null in SQL ternary logic. Instead, you must use the is null operator:

session.CreateQuery("from User as u where u.Email is null").List();

So far, the HQL code examples we’ve shown all use embedded HQL query string literals. This isn’t unreasonable for simple queries; but once we begin considering complex queries that must be split over multiple lines, it starts to get unwieldy.

7.1.3. Using named queries

We don’t like to see HQL string literals scattered all over C# code unless they’re necessary. NHibernate lets you store query strings outside of your code, a technique that is called named queries. This approach allows you to store all queries related to a particular persistent class along with the other metadata of that class in an XML mapping file. You use the name of the query to call it from the application.

The GetNamedQuery() method obtains an IQuery instance for a named query:

session.GetNamedQuery("FindItemsByDescription")
.SetString("description", description)
.List();

In this example, you execute the named query FindItemsByDescription after binding a string argument to a named parameter. The named query is defined in mapping metadata, such as in Item.hbm.xml, using the <query> element:

<query name="FindItemsByDescription"><![CDATA[
from Item item where item.Description like :description
]]></query>

Named queries don’t have to be HQL strings; they may even be native SQL queries—and your C# code doesn’t need to know the difference:

<sql-query name="FindItemsByDescription"><![CDATA[
select {i.*} from ITEM {i} where DESCRIPTION like :description
]]>
<return alias="i" class="Item"/>
</sql-query>

This is useful if you think you may want to optimize your queries later by fine-tuning the SQL. It’s also a good solution if you have to port a legacy application to NHibernate, where SQL code was isolated from the handcoded ADO.NET routines. With named queries, you can easily port the queries one by one to mapping files.

7.1.4. Using query substitutions

It’s often necessary, or at least useful, to use a different word to name an object in a query. For example, with a Boolean property like User.IsAdmin, you write

from User u where u.IsAdmin = 1

But by adding this property to your configuration file

<property name="hibernate.query.substitutions">
true 1, false 0
</property>

you can write

from User u where u.IsAdmin = true

Note that you can also use this feature to rename SQL functions.

We’ve now wrapped up our discussion of creating and running queries. It’s time to focus on the queries themselves. The next section covers HQL, starting with simple queries and moving on to far more advanced topics.

7.2. Basic queries for objects

Let’s start with simple queries, to become familiar with the HQL syntax and semantics. Although we show the criteria alternative for most HQL queries, keep in mind that HQL is the preferred approach for complex queries. Usually, the criteria can be derived if you know the HQL equivalent; it’s much more difficult the other way around.

7.2.1. The simplest query

The simplest query retrieves all instances of a particular persistent class. In HQL, it looks like this:

from Bid


Testing NHibernate queries

You can use the open source tool NHibernate Query Analyzer to execute NHibernate queries ad hoc. It lets you select NHibernate mapping documents (or write them), set up the NHibernate configuration, and then view the result of HQL queries you type interactively. More details are provided in section 7.7.4.


Using the ICriteria interface, it looks like this:

ICriteria c = session.CreateCriteria(typeof(Bid));

Both generate the following SQL behind the scenes:

select B.BID_ID, B.AMOUNT, B.ITEM_ID, B.CREATED from BID B

Even for this simple case, you can see that HQL is less verbose than SQL.

7.2.2. Using aliases

When you query a class using HQL, you often need to assign an alias to the queried class, which you use as reference in other parts of the query:

from Bid as bid

The as keyword is always optional. The following is equivalent:

from Bid bid

Think of this as being like the temporary variable declaration in the following C# code:

for ( int i = 0; i < allQueriedBids.Count; i++ ) {
Bid bid = (Bid) allQueriedBids[i];
//...
}

You assign the alias bid to queried instances of the Bid class, allowing you to refer to their property values later in the code (or query). To remind yourself of the similarity, we recommend that you use the same naming convention for aliases that you use for temporary variables (camelCase, usually). We use shorter aliases in some of the examples in this book (for example, i instead of item) to keep the printed code readable.


Note

You never write HQL keywords in uppercase; you never write SQL keywords in uppercase either. It looks ugly and antiquated—most modern terminals can display both uppercase and lowercase characters. HQL isn’t case-sensitive for keywords, so you can write FROM Bid AS bid if you like shouting.


By contrast, a criteria query defines an implicit alias. The root entity in a criteria query is always assigned the alias this. We discuss this topic in more detail later, when you’re joining associations with criteria queries. You don’t have to think much about aliases when using the ICriteria API.

7.2.3. Polymorphic queries

We’ve described HQL as an object-oriented query language, so it should support polymorphic queries—that is, queries for instances of a class and all instances of its subclasses, respectively. You already know enough HQL that we can demonstrate this. Consider the following query:

from BillingDetails

This query returns objects of the type BillingDetails, which is an abstract class. In this case, the concrete objects are of the subtypes of BillingDetails: CreditCard and BankAccount. If you only want instances of a particular subclass, you can use

from CreditCard

The class named in the from clause doesn’t need to be a mapped persistent class; any class will do. The following query returns all persistent objects in the entire database:

from System.Object

This technique also works for interfaces. The following query returns all serializable persistent objects (those implementing the interface ISerializable):

from System.ISerializable

Criteria queries also support polymorphism:

session.CreateCriteria(typeof(BillingDetails)).List();

This query returns instances of BillingDetails and its subclasses. Likewise, the following criteria query returns all persistent objects:

session.CreateCriteria(typeof(System.Object)).List();

Polymorphism applies not only to classes named explicitly in the from clause, but also to polymorphic associations, as you’ll see later.

Now that we’ve discussed the from clause, let’s move on to the other parts of HQL.

7.2.4. Restriction

You usually don’t want to retrieve all instances of a class when you run a query. Instead, you want to express some constraints on the property values of your objects, so only a subset of objects is retrieved. This is called restriction, and in both HQL and SQL, you achieve it using the where clause.

A where clause can be simple or complex, but let’s start with a simple HQL example:

from User u where u.Email = 'foo@hibernate.org'

Notice that the constraint is expressed in terms of a property, Email, of the User class, and that you use an object-oriented notion: just as in C#, u.Email may not be abbreviated to plain Email.

For a criteria query, you must construct an ICriterion object to express the constraint. The Expression class provides factory methods for built-in ICriterion types. Let’s create the same query using criteria and immediately execute it:

ICriterion emailEq = Expression.Eq("Email", "foo@hibernate.org");
ICriteria crit = session.CreateCriteria(typeof(User));
crit.add(emailEq);
User user = (User) crit.UniqueResult();

You create an ICriterion instance holding the simple Expression for an equality comparison and add it to the ICriteria. The UniqueResult() method executes the query and returns exactly one object as a result.

Usually, you’ll write this less verbosely, using method chaining:

User user = (User) session.CreateCriteria(typeof(User))
.Add( Expression.Eq("Email", "foo@hibernate.org") )
.UniqueResult();

The SQL generated by these queries is as follows:

select U.USER_ID, U.FIRSTNAME, U.LASTNAME, U.USERNAME, U.EMAIL
from USER U
where U.EMAIL = 'foo@hibernate.org'

It’s common to have a restriction that should always be used; most of the time, it’s used to ignore deprecated data. You may, for example, have an Active property and write the following:

select User u where u.Email = 'foo@hibernate.org' and u.Active = 1

But this is dangerous, because you may forget the restriction; a better solution, in this case, is to change your mapping:

<class name="User" where="ACTIVE=1">

Now, you can write

from User u where u.Email = 'foo@hibernate.org'

This query generates the following SQL query:

select U.USER_ID, U.FIRSTNAME, U.LASTNAME, U.USERNAME, U.EMAIL
from USER U
where U.EMAIL = 'foo@hibernate.org' and U.ACTIVE = 1

Note that the ACTIVE column doesn’t have to be mapped. And, since NHibernate 1.2.0, this attribute is also used when calling ISession.Load() and ISession.Get(). This feature is also available for collections:

<bag name="Users" where="ACTIVE=1">

Here, the collection Users will contain only users whose ACTIVE value is 1.

This approach can be useful, but we recommend considering filters for most scenarios (see section 7.5.2, “Collection filters”). You can, of course, use various other comparison operators for restriction.

7.2.5. Comparison operators

A restriction is expressed using ternary logic. The where clause is a logical expression that evaluates to true, false, or null for each tuple of objects. You construct logical expressions by comparing properties of objects to other properties or literal values using HQL’s built-in comparison operators.


What is ternary logic?

A row is included in a SQL result set if and only if the where clause evaluates to true. In C#, notNullObject==null evaluates to false, and null==null evaluates to true. In SQL, NOT_NULL_COLUMN=null and null=null both evaluate to null, not true. Thus, SQL needs a special operator, IS NULL, to test whether a value is null. This ternary logic is a way of handling expressions that may be applied to null column values. It’s a (debatable) SQL extension to the familiar binary logic of the relational model and of typical programming languages such as C#.


HQL supports the same basic operators as SQL: =, <>, <, >, >=, <=, between, not between, in, and not in. For example:

from Bid bid where bid.Amount between 1 and 10
from Bid bid where bid.Amount > 100
from User u where u.Email in ( 'foo@hibernate.org', 'bar@hibernate.org' )

In case of criteria queries, all the same operators are available via the Expression class:

session.CreateCriteria(typeof(Bid))
.Add( Expression.Between("Amount", 1, 10) )
.List();
session.CreateCriteria(typeof(Bid))
.Add( Expression.Gt("Amount", 100) )
.List();
string[] emails = { "foo@NHibernate.org", "bar@NHibernate.org" };
session.CreateCriteria(typeof(User))
.Add( Expression.In("Email", emails) )
.List();

Because the underlying database implements ternary logic, testing for null values requires some care. Remember that null = null doesn’t evaluate to true in the database, but to null. All comparisons that use the null operator evaluate to null. Both HQL and the ICriteria API provide an SQL-style is null operator:

from User u where u.Email is null

This query returns all users with no email address. The same semantic is available in the ICriteria API:

session.CreateCriteria(typeof(User))
.Add( Expression.IsNull("Email") )
.List();

You also need to be able to find users who have an email address:

from User u where u.Email is not null
session.CreateCriteria(typeof(User))
.Add( Expression.IsNotNull("Email") )
.List();

Finally, the HQL where clause supports arithmetic expressions (but the ICriteria API doesn’t):

from Bid bid where ( bid.Amount / 0.71 ) - 100.0 > 0.0

For string-based searches, you need to be able to perform case-insensitive matching and matches on fragments of strings in restriction expressions.

7.2.6. String matching

The like operator allows wildcard searches, where the wildcard symbols are % and _, just as in SQL:

from User u where u.Firstname like "S%"

This expression restricts the result to users with a first name starting with a capital S. You can also negate the like operator, for example by using a substring match expression:

from User u where u.Firstname not like "%Foo S%"

For criteria queries, wildcard searches may either use the same wildcard symbols or specify a MatchMode. NHibernate provides the MatchMode as part of the ICriteria query API; you use it to write string match expressions without string manipulation. These two queries are equivalent:

session.CreateCriteria(typeof(User))
.Add( Expression.Like("Firstname", "S%") )
.List();
session.CreateCriteria(typeof(User))
.Add( Expression.Like("Firstname", "S", MatchMode.Start) )
.List();

The allowed MatchModes are Start, End, Anywhere, and Exact.

An extremely powerful feature of HQL is the ability to call arbitrary SQL functions in the where clause. If your database supports user-defined functions (most do), you can put this functionality to all sorts of uses, good or evil. For the moment, let’s consider the usefulness of the standard ANSI SQL functions upper() and lower(). They can be used for case-insensitive searching:

from User u where lower(u.Email) = 'foo@hibernate.org'

The ICriteria API doesn’t currently support SQL function calls. But it does provide a special facility for case-insensitive searching:

session.CreateCriteria(typeof(User))
.Add( Expression.Eq("Email", "foo@hibernate.org").IgnoreCase() )
.List();

Unfortunately, HQL doesn’t provide a standard string-concatenation operator; instead, it supports whatever syntax your database provides. Here is an example for SQL Server:

from User user
where ( user.Firstname + ' ' + user.Lastname ) like 'S% K%'

We return to more exotic features of the HQL where clause later in this chapter. We only used single expressions for restrictions in this section; let’s combine several with logical operators.

7.2.7. Logical operators

Logical operators (and parentheses for grouping) are used to combine expressions:

from User user
where user.Firstname like "S%" and user.Lastname like "K%"
from User user
where ( user.Firstname like "S%" and user.Lastname like "K%" )
or user.Email in ( 'foo@hibernate.org', 'bar@hibernate.org' )

If you add multiple ICriterion instances to the one ICriteria instance, they’re applied conjunctively (that is, using and):

session.CreateCriteria(typeof(User))
.Add( Expression.Like("Firstname", "S%") )
.Add( Expression.Like("Lastname", "K%") )

If you need disjunction (or), you have two options. The first is to use Expression.Or() together with Expression.And():

ICriteria crit = session.CreateCriteria(typeof(User))
.Add(
Expression.Or(
Expression.And(
Expression.Like("Firstname", "S%"),
Expression.Like("Lastname", "K%")
),
Expression.In("Email", emails)
)
);

The second option is to use Expression.Disjunction() together with Expression. Conjunction():

ICriteria crit = session.CreateCriteria(typeof(User))
.Add( Expression.Disjunction()
.Add( Expression.Conjunction()
.Add( Expression.Like("Firstname", "S%") )
.Add( Expression.Like("Lastname", "K%") )
)
.Add( Expression.In("Email", emails) )
);

We think both options are ugly, even after spending five minutes trying to format them for maximum readability. Unless you’re constructing a query on the fly, the HQL string is much easier to understand. Complex criteria queries are useful only when they’re created programmatically; for example, in the case of a complex search screen with several optional search criteria, you may have a CriteriaBuilder that translates user restrictions to ICriteria instances.

7.2.8. Ordering query results

All query languages provide a mechanism for ordering query results. HQL provides an order by clause, similar to SQL.

This query returns all users, ordered by username:

from User u order by u.Username

You specify ascending and descending order using asc or desc:

from User u order by u.Username desc

Finally, you can order by multiple properties:

from User u order by u.Lastname asc, u.Firstname asc

The ICriteria API provides a similar facility:

IList results = session.CreateCriteria(typeof(User))
.AddOrder( Order.Asc("Lastname") )
.AddOrder( Order.Asc("Firstname") )
.List();

Thus far, we’ve only discussed the basic concepts of HQL and criteria queries. You’ve learned how to write a simple from clause and use aliases for classes. You’ve combined various restriction expressions with logical operators. But you’ve focused on single persistent classes—that is, you’ve only referenced a single class in the from clause. An important query technique we haven’t discussed yet is the joining of associations at runtime.

7.3. Joining associations

When you’re querying databases, sometimes you want to combine data in two or more relations. This is achieved using a join. For example, you may join the data in the ITEM and BID tables, as shown in figure 7.1. Note that not all columns and possible rows are shown; hence the dotted lines.

Figure 7.1. The ITEM and BID tables are obvious candidates for a join operation.

When most people hear the word join in the context of SQL databases, they think of an inner join. An inner join is one of several types of joins, and it’s the easiest to understand. Consider the SQL statement and result in figure 7.2. This SQL statement is an ANSI-style join.

Figure 7.2. The result table of an ANSI-style inner join of two tables

If you join the tables ITEM and BID with an inner join, using their common attributes (the ITEM_ID column), you get all items and their bids in a new result table. Note that the result of this operation contains only items that have bids. If you want all items, and null values instead of bid data when there is no corresponding bid, you use a (left) outer join, as shown in figure 7.3.

Figure 7.3. The result of an ANSI-style left outer join of two tables

You can think of a table join as working in this way: First, you get a Cartesian product of the two tables by taking all possible combinations of ITEM rows with BID rows. Second, you filter these joined rows using a join condition. Note that the database has much more sophisticated algorithms to evaluate a join; it usually doesn’t build a memory-consuming product and then filter all rows. The join condition is a Boolean expression that evaluates to true if the joined row is to be included in the result. In the case of the left outer join, each row in the (left) ITEM table that never satisfies the join condition is also included in the result, with null values returned for all columns of BID. (A right outer join retrieves all bids and null if a bid has no item—certainly not a sensible query in this situation.)

In SQL, the join condition is usually specified explicitly; it isn’t possible to use the name of a foreign-key constraint to specify how two tables are to be joined. Instead, you have to specify the join condition in the on clause for an ANSI-style join or in the where clause for a so-called theta-style join, where I.ITEM_ID = B.ITEM_ID.

7.3.1. NHibernate join options

In NHibernate queries, you don’t usually specify a join condition explicitly. Rather, you specify the name of a mapped class association so that NHibernate can work out the join for you. For example, the Item class has an association named bids with the Bid class. If you name this association in your query, NHibernate has enough information in the mapping document to then deduce the join expression. This helps make queries less verbose and more readable.

HQL provides four ways of expressing inner and outer joins:

  • An ordinary join in the from clause
  • A fetch join in the from clause
  • A theta-style join in the where clause
  • An implicit association join

We discuss all of these options in this chapter. Because the ordinary and fetch from clause joins have the clearest syntax, we discuss these first.

When you’re working with NHibernate, there are usually several reasons to use a join, and it’s important to note that NHibernate lets you differentiate between the purposes for joining. Let’s put this in the context of a short example. If you’re querying Items, there are three possible reasons why you may be interested in joining the Bids:

  • You want to retrieve Items returned on the basis of some criterion that should be applied to their Bids. For example, you may want all Items that have a bid of more than $100; this requires an inner join.
  • You’re running a query where you’re mainly interested in only the Items without any special criterion for Bids. You may or may not want to access the Bids for an item, but you want the option for NHibernate to lazily load them when you first access the collection.
  • You want to execute an outer join to load all Items along with their Bids in the same SELECT (eager fetching).

Remember that your default preference should be to map all associations lazily; an eager, outer-join fetch query can be used to override this default fetching strategy at runtime. We discuss this scenario first.

7.3.2. Fetching associations

In HQL, you can specify that an association should be eagerly fetched by an outer join using the fetch keyword in the from clause:

from Item item
left join fetch item.Bids
where item.Description like '%part%'

This query returns all Items with a description that contains the string part, and all their Bids, in a single select. When executed, it returns a list of Item instances, with their bids collections fully initialized. We call this a from clause fetch join. The purpose of a fetch join is performance optimization: you use this syntax only because you want eager initialization of the bids collections in a single SQL select.

You can do the same thing using the ICriteria API:

session.CreateCriteria(typeof(Item))
.SetFetchMode("Bids", FetchMode.Eager)
.Add( Expression.Like("Description", "part", MatchMode.Anywhere) )
.List();

Both of these queries result in the following SQL:

select I.DESCRIPTION, I.CREATED, I.SUCCESSFUL_BID, B.BID_ID,
B.AMOUNT, B.ITEM_ID, B.CREATED
from ITEM I
left outer join BID B on I.ITEM_ID = B.ITEM_ID
where I.DESCRIPTION like '%part%'

You can also prefetch many-to-one or one-to-one associations using the same syntax:

from Bid bid
left join fetch bid.Item
left join fetch bid.Bidder
where bid.Amount > 100
session.CreateCriteria(typeof(Bid))
.SetFetchMode("Item", FetchMode.Eager)
.SetFetchMode("Bidder", FetchMode.Eager)
.Add( Expression.Gt("Amount", 100 ) )
.List();

These queries execute the following SQL:

select I.DESCRIPTION, I.CREATED, I.SUCCESSFUL_BID,
B.BID_ID, B.AMOUNT, B.ITEM_ID, B.CREATED,
U.USERNAME, U.PASSWORD, U.FIRSTNAME, U.LASTNAME
from BID B
left outer join ITEM I on I.ITEM_ID = B.ITEM_ID
left outer join USER U on U.USER_ID = B.BIDDER_ID
where B.AMOUNT > 100

Note that the left keyword is optional in HQL, so you can rewrite the previous examples using join fetch. Although this looks straightforward to use, you must consider and remember a couple of things.

First, HQL always ignores the mapping document eager fetch (outer join) setting. If you’ve mapped some associations to be fetched by outer join, by setting outer-join="true" or fetch="join" on the association mapping, any HQL query will ignore this preference. With HQL, if you want eager fetching, you need to ask for it in the query string. HQL is designed to be as flexible as possible: you can completely (re)define the fetching strategy that should be used at runtime. In comparison, the criteria will take full notice of your mappings! If you specify outer-join="true" in the mapping file, the criteria query will fetch that association by outer join—just like ISession.Get() or ISession.Load() for retrieval by identifier. For a criteria query, you can explicitly disable outer-join fetching by calling SetFetchMode("Bids", FetchMode.Lazy).

NHibernate currently limits you to fetching just one collection eagerly. This is a reasonable restriction, because fetching more than one collection in a single query would be a Cartesian product result. This restriction may be relaxed in a future version of NHibernate, but we encourage you to think about the size of the result set if more than one collection is fetched in an outer join. The amount of data that must be transported between database and application can easily grow into the megabyte range, and most of it is thrown away immediately (NHibernate flattens the tabular result set to build the object graph). You may fetch as many one-to-one or many-to-one associations as you like.

If you fetch a collection, NHibernate doesn’t return a distinct result list. For example, an individual Item may appear several times in the result IList, if you outer-join fetch the bids. You’ll probably need to make the results distinct yourself using, for example, distinctResults = new HashedSet(resultList);. An ISet doesn’t allow duplicate elements.

This is how NHibernate implements what we call runtime association fetching strategies, a powerful feature that is essential for achieving high performance in ORM. Let’s continue with the other join operations.

7.3.3. Using aliases with joins

We’ve already discussed the role of the where clause in expressing restriction. Often, you’ll need to apply restriction criteria to multiple associated classes (joined tables). If you want to do this using an HQL from clause join, you need to assign an alias to the joined class:

from Item item
join item.Bids bid
where item.Description like '%part%'
and bid.Amount > 100

This query assigns the alias item to the class Item and the alias bid to the joined Item’s bids. You then use both aliases to express your restriction criteria in the where clause. The resulting SQL is as follows:

select I.DESCRIPTION, I.CREATED, I.SUCCESSFUL_BID,
B.BID_ID, B.AMOUNT, B.ITEM_ID, B.CREATED
from ITEM I
inner join BID B on I.ITEM_ID = B.ITEM_ID
where I.DESCRIPTION like '%part%'
and B.AMOUNT > 100

The query returns all combinations of associated Bids and Items. But unlike a fetch join, the Bids collection of the Item isn’t initialized by the query! What do we mean by a combination here? We mean an ordered pair: (bid, item). In the query result, NHibernate represents an ordered pair as an array. Let’s discuss a full code example with the result of such a query:

IQuery q = session.CreateQuery("from Item item join item.Bids bid");
foreach( object[] pair in q.List() ) {
Item item = (Item) pair[0];
Bid bid = (Bid) pair[1];
}

Instead of an IList of Items, this query returns an IList of object[] arrays. At index 0 is the Item, and at index 1 is the Bid. A particular Item may appear multiple times, once for each associated Bid.

This is all different from the case of a query with an eager fetch join. The query with the fetch join returned an IList of Items, with initialized Bids collections.

If you don’t want the Bids in the query result, you can specify a select clause in HQL. This clause is optional (it isn’t optional in SQL), so you only have to use it when you aren’t satisfied with the result returned by default. You use the alias in a select clause to retrieve only the selected objects:

select item
from Item item
join item.Bids bid
where item.Description like '%part%'
and bid.Amount > 100

Now the generated SQL looks like this:

select I.DESCRIPTION, I.CREATED, I.SUCCESSFUL_BID,
from ITEM I
inner join BID B on I.ITEM_ID = B.ITEM_ID
where I.DESCRIPTION like '%part%'
and B.AMOUNT > 100

The query result contains just Items, and because it’s an inner join, only Items that have Bids:

IQuery q = session.CreateQuery("select i from Item i join i.Bids b");
foreach( Item item in q.List<Item>() {
//...
}

As you can see, using aliases in HQL is the same for both direct classes and joined associations. You assign aliases in the from clause and use them in the where and the optional select clauses. The select clause in HQL is much more powerful; we discuss it in detail later in this chapter.

ICriteria Joins

There are two ways to express a join in the ICriteria API; hence there are two ways to use aliases for restriction. The first is the CreateCriteria() method of the Criteria interface. It means that you can nest calls to CreateCriteria():

ICriteria itemCriteria = session.CreateCriteria(typeof(Item));
itemCriteria.Add( Expression.Like("Description",
"part",
MatchMode.Anywhere) );
ICriteria bidCriteria = itemCriteria.CreateCriteria("Bids");
bidCriteria.Add( Expression.Gt( "Amount", 100 ) );
IList results = itemCriteria.List();

You’ll usually write the query as follows, using method chaining:

IList results =
session.CreateCriteria(typeof(Item))
.Add( Expression.Like("Description", "part", MatchMode.Anywhere) )
.CreateCriteria("Bids")
.Add( Expression.Gt("Amount", 100) )
.List();

The creation of an ICriteria instance for the Bids of the Item results in an inner join between the tables of the two classes. Note that you may call List() on either ICriteria instance without changing the query results.

The second way to express this query using the ICriteria API is to assign an alias to the joined entity:

IList results =
session.CreateCriteria(typeof(Item))
.CreateAlias("Bids", "bid")
.Add( Expression.Like("Description", "%part%") )
.Add( Expression.Gt("bid.Amount", 100) )
.List();

This approach doesn’t use a second instance of ICriteria; properties of the joined entity must be qualified by the alias assigned in CreateAlias(). Properties of the root entity (Item) may be referred to without the qualifying alias or by using the alias "this". Thus the following is equivalent:

IList results =
session.CreateCriteria(typeof(Item))
.CreateAlias("Bids", "bid")
.Add( Expression.Like("this.Description", "%part%") )
.Add( Expression.Gt("bid.Amount", 100)
.List();

By default, a criteria query returns only the root entity—in this case, the Items—in the query result. Let’s summarize with a full example:

IList<Item> results =
session.CreateCriteria(typeof(Item))
.CreateAlias("Bids", "bid")
.Add( Expression.Like("this.Description", "%part%") )
.Add( Expression.Gt("bid.Amount", 100) )
.List<Item>();
foreach( Item item in results ) {
// Do something
}

Keep in mind that the Bids collection of each Item isn’t initialized. A limitation of criteria queries is that you can’t combine a CreateAlias with an eager fetch mode; for example, SetFetchMode("Bids", FetchMode.Eager) isn’t valid.

Sometimes you’d like a less verbose way to express a join. In NHibernate, you can use an implicit association join.

7.3.4. Using implicit joins

So far, you’ve used simple qualified property names like bid.Amount and item. Description in your HQL queries. HQL supports multipart property path expressions for two purposes:

  • Querying components
  • Expressing implicit association joins

The first use is straightforward:

from User u where u.Address.City = 'Bangkok'

You express the parts of the mapped component Address with dot notation. This usage is also supported by the ICriteria API:

session.CreateCriteria(typeof(User))
.Add( Expression.Eq("Address.City", "Bangkok") );

The second usage, implicit association joining, is available only in HQL. Here’s an example:

from Bid bid where bid.Item.Description like '%part%'

This results in an implicit join on the many-to-one associations from Bid to Item. Implicit joins are always directed along many-to-one or one-to-one associations, never through a collection-valued association (you can’t write item.Bids.Amount).

Multiple joins are possible in a single-property path expression. If the association from Item to Category was many-to-one (instead of the current many-to-many), you could write

from Bid bid where bid.Item.Category.Name like 'Laptop%'

We frown on the use of this syntactic sugar for more complex queries. Joins are important, and especially when optimizing queries, you need to be able to see at a glance how many of them there are. Consider the following query (again, using a many-to-one from Item to Category):

from Bid bid
where bid.Item.Category.Name like 'Laptop%'
and bid.Item.SuccessfulBid.Amount > 100

How many joins are required to express this in SQL? Even if you get the answer right, we bet it takes you more than a few seconds. The answer is three; the generated SQL looks something like this:

select ...
from BID B
inner join ITEM I on B.ITEM_ID = I.ITEM_ID
inner join CATEGORY C on I.CATEGORY_ID = C.CATEGORY_ID
inner join BID SB on I.SUCCESSFUL_BID_ID = SB.BID_ID
where C.NAME like 'Laptop%'
and SB.AMOUNT > 100

It’s more obvious if you express the same query like this:

from Bid bid
join bid.Item item
where item.Category.Name like 'Laptop%'
and item.SuccessfulBid.Amount > 100

You can even be more verbose:

from Bid as bid
join bid.Item as item
join item.Category as cat
join item.SuccessfulBid as winningBid
where cat.Name like 'Laptop%'
and winningBid.Amount > 100

Let’s continue with join conditions using arbitrary attributes, expressed in theta style.

7.3.5. Theta-style joins

A Cartesian product lets you retrieve all possible combinations of instances of two or more classes. This query returns all ordered pairs of Users and Category objects:

from User, Category

Obviously, this generally isn’t useful. There is one case where it’s commonly used: theta-style joins.

In traditional SQL, a theta-style join is a Cartesian product, together with a join condition in the where clause, which is applied on the product to restrict the result. In HQL, the theta-style syntax is useful when your join condition isn’t a foreign-key relationship mapped to a class association. For example, suppose you store the User’s name in log records instead of mapping an association from LogRecord to User. The classes don’t “know” anything about each other, because they aren’t associated. You can then find all the Users and their LogRecords with the following theta-style join:

from User user, LogRecord log where user.Username = log.Username

The join condition here is the username, presented as an attribute in both classes. If both entities have the same username, they’re joined (with an inner join) in the result. The query result consists of ordered pairs:

IList results = session.CreateQuery(
@"from User user, LogRecord log
where user.Username = log.Username"
)
.List();
foreach( Object[] pair in results )
User user = (User) pair[0];
LogRecord log = (LogRecord) pair[1];
}

You can change the result by adding a select clause.

You probably won’t need to use theta-style joins often. Note that the ICriteria API doesn’t provide any means for expressing Cartesian products or theta-style joins. It’s also currently not possible in NHibernate to outer-join two tables that don’t have a mapped association.

7.3.6. Comparing identifiers

It’s extremely common to perform queries that compare primary key or foreign key values to either query parameters or other primary or foreign key values. If you think about this in more object-oriented terms, what you’re doing is comparing object references. HQL supports the following:

from Item i, User u
where i.Seller = u and u.Username = 'steve'

In this query, i.Seller refers to the foreign key to the USER table in the ITEM table (on the SELLER_ID column), and User refers to the primary key of the USER table (on the USER_ID column). This next query uses a theta-style join and is equivalent to the much preferred ANSI style:

from Item i join i.Seller u
where u.Username = 'steve'

On the other hand, the following theta-style join can’t be re-expressed as a from clause join:

from Item i, Bid b
where i.Seller = b.Bidder

In this case, i.Seller and b.Bidder are both foreign keys of the USER table. Note that this is an important query in the example application; you use it to identify people bidding for their own items.

You may also want to compare a foreign key value to a query parameter—for example, to find all Comments from a User:

User givenUser = LoadUser(1)
IQuery q =
session.CreateQuery("from Comment c where c.FromUser = :user");
q.SetEntity("user", givenUser);
IList results = q.List();

Alternatively, sometimes you may prefer to express these kinds of queries in terms of identifier values rather than object references. You can refer to an identifier value by either the name of the identifier property (if there is one) or the special property name id. Every persistent entity class has this special HQL property, even if you don’t implement an identifier property on the class (see section 3.5.2).

These queries are equivalent to the previous queries:

from Item i, User u
where i.Seller.id = u.id and u.Username = 'steve'
from Item i, Bid b
where i.Seller.id = b.Bidder.id

But you can now use the identifier value as a query parameter:

long userId = 1;
IQuery q =
session.CreateQuery("from Comment c where c.FromUser.id = :id");
q.SetInt64("id", userId);
IList results = q.List();

You may have noticed that there is a world of difference between the following queries:

from Bid b where b.Item.id = 1
from Bid b where b.Item.Description like '%part%'

The second query uses an implicit table join; the first has no joins at all.

We’ve now covered most of the features of NHibernate’s query facilities that are commonly needed for retrieving objects for manipulation in business logic. In the next section, we change our focus and discuss features of HQL that are used mainly for analysis and reporting functionality.

7.4. Writing report queries

Report queries take advantage of the database’s ability to perform efficient grouping and aggregation of data. They’re more relational in nature; they don’t always return entities. For example, instead of retrieving complete Item entities, a report query may only retrieve their names and prices. If this is the only information you need for a report screen, you don’t need transactional entities and can save the small overhead of automatic dirty checking and caching in the ISession.

Let’s consider the structure of an HQL query again:

[select ...] from ... [where ...]
[group by ... [having ...]] [order by ...]

The only mandatory clause of an HQL query is the from clause; all other clauses are optional. So far, we’ve discussed the from, where, and order by clauses. We also used the select clause to declare which entities should be returned in a join query.

In report queries, you use the select clause for projection and the group by and having clauses for aggregation. Let’s look at what we mean by projection.

7.4.1. Projection

The select clause performs projection. It lets you specify which objects or properties of objects you want in your query results. For example, as you’ve already seen, the following query returns ordered pairs of Items and Bids:

from Item item join item.Bids bid where bid.Amount > 100

If you only want the Items, you can use this query instead:

select item from Item item join item.Bids bid where bid.Amount > 100

Or, if you’re displaying a list page to the user, it may be adequate to retrieve a few properties of those objects needed for that page:

select item.id, item.Description, bid.Amount
from Item item join item.Bids bid
where bid.Amount > 100

This query returns an array of objects for each row. Because there are three items in the select clause, each object[] has 3 elements. Also, because it’s a report query, the objects in the result aren’t NHibernate entities and therefore aren’t transactional. Let’s execute the query with some code:

IList results = session.CreateQuery(
@"select item.id, item.Description, bid.Amount
from Item item join item.Bids bid
where bid.Amount > 100"
)
.List();
foreach( Object[] row in results ) {
long id = (long) row[0];
string description = (string) row[1];
double amount = (double) row[2];
// ... show values in a report screen
}

If you’re used to working with domain objects, this example will seem ugly and verbose. NHibernate gives you another approach: dynamic instantiation.

Using Dynamic Instantiation

If you find working with arrays of values a little cumbersome, NHibernate let’s you use dynamic instantiation and define a class to represent each row of results. You can do this using the HQL select new construct:

select new ItemRow( item.id, item.Description, bid.Amount )
from Item item join item.Bids bid
where bid.Amount > 100

The ItemRow class is one you’d write just for your report screen; note that you also have to give it an appropriate constructor. This query returns newly instantiated (but transient) instances of ItemRow, as you can see in the next example:

IList results = session.CreateQuery(
@"select new ItemRow( item.id, item.Description, bid.Amount )
from Item item join item.Bids bid
where bid.Amount > 100"
)
.List();
foreach( ItemRow row in results ) {
// Do something
}

The custom ItemRow class doesn’t have to be a persistent class that has its own mapping file. But in order for NHibernate to “see” it, you need to import it using

<hibernate-mapping>
<import class="ItemRow" />
</hibernate-mapping>

ItemRow is therefore only a data transfer class, useful in report generation.

Getting Distinct Results

When you use a select clause, the elements of the result are no longer guaranteed to be unique. For example, Item descriptions aren’t unique, so the following query may return the same description more than once:

select item.Description from Item item

It’s difficult to see how it can possibly be meaningful to have two identical rows in a query result, so if you think duplicates are likely, you should use the distinct keyword:

select distinct item.Description from Item item

This eliminates duplicates from the returned list of Item descriptions.

Calling SQL Functions

You may recall that you can call database-specific SQL functions in the where clause. It’s also possible, at least for some NHibernate SQL dialects, to call database-specific SQL functions from the select clause. For example, the following query retrieves the current date and time from the database server (SQL Server syntax), together with a property of Item:

select item.StartDate, getdate() from Item item

The technique of database functions in the select clause isn’t limited to database-dependent functions. You can use it with other, more generic (or standardized) SQL functions as well:

select item.StartDate, item.EndDate, upper(item.Name)
from Item item

This query returns an object[] with the starting and ending date of an item auction, and the name of the item all in uppercase.

Let’s now look at calling SQL aggregate functions.

7.4.2. Using aggregation

NHibernate recognizes the following aggregate functions: count(), min(), max(), sum(), and avg().

This query counts all the Items:

select count(*) from Item

The result is returned as an Integer:

int count =
(int) session.CreateQuery("select count(*) from Item")
.UniqueResult();

Notice how you use *, which has the same semantics as in SQL.

The next variation of the query counts all Items that have a successfulBid:

select count(item.SuccessfulBid) from Item item

This query calculates the total of all the successful Bids:

select sum(item.SuccessfulBid.Amount) from Item item

The query returns a value of the same type as the summed elements; in this case double. Notice the use of an implicit join in the select clause: you navigate the association (SuccessfulBid) from Item to Bid by referencing it with a dot.

The next query returns the minimum and maximum bid amounts for a particular Item:

select min(bid.Amount), max(bid.Amount)
from Bid bid where bid.Item.id = 1

The result is an ordered pair of doubles (two instances of double in an object[] array).

The special count(distinct) function ignores duplicates:

select count(distinct item.Description) from Item item

When you call an aggregate function in the select clause without specifying any grouping in a group by clause, you collapse the result down to a single row containing your aggregated value(s). This means (in the absence of a group by clause) that any select clause that contains an aggregate function must contain only aggregate functions.

For more advanced statistics and reporting, you’ll need to be able to perform grouping.

7.4.3. Grouping

Just like in SQL, any property or alias that appears in HQL outside of an aggregate function in the select clause must also appear in the group by clause.

Consider the next query, which counts the number of users with each particular last name:

select u.Lastname, count(u) from User u
group by u.Lastname

Now look at the generated SQL:

select U.LAST_NAME, count(U.USER_ID)
from USER U
group by U.LAST_NAME

In this example, the u.Lastname isn’t inside an aggregate function; you use it to group the result. You also don’t need to specify the property you’d like to count in HQL. The generated SQL will automatically use the primary key if you use an alias that has been set in the from clause.

The next query finds the average bid amount for each item:

select bid.Item.id, avg(bid.Amount) from Bid bid
group by bid.Item.id

This query returns ordered pairs of Item identifiers and average bid amount. Notice how you use the id special property to refer to the identifier of a persistent class no matter what the identifier’s real property name is.

The next query counts the number of bids and calculates the average bid per unsold item:

select bid.Item.id, count(bid), avg(bid.Amount)
from Bid bid
where bid.Item.SuccessfulBid is null
group by bid.Item.id

This query uses an implicit association join. For an explicit ordinary join in the from clause (not a fetch join), you can re-express it as follows:

select bidItem.id, count(bid), avg(bid.Amount)
from Bid bid
join bid.Item bidItem
where bidItem.SuccessfulBid is null
group by bidItem.id

To initialize the bids collection of the Items, you can use a fetch join and refer to the associations starting on the other side:

select item.id, count(bid), avg(bid.Amount)
from Item item
fetch join item.Bids bid
where item.SuccessfulBid is null
group by item.id

Sometimes, you’ll want to further restrict the result by selecting only particular values of a group.

7.4.4. Restricting groups with having

The where clause is used to perform the relational operation of restriction on rows. The having clause performs restriction on groups.

For example, the next query counts users with each last name that begins with K:

select user.Lastname, count(user)
from User user
group by user.Lastname
having user.Lastname like 'K%'

The same rules govern the select and having clauses: only grouped properties may appear outside an aggregate function. The next query counts the number of bids per unsold item, returning results only for those items that have more than 10 bids:

select item.id, count(bid), avg(bid.Amount)
from Item item
join item.Bids bid
where item.SuccessfulBid is null
group by item.id
having count(bid) > 10

Most report queries use a select clause to choose a list of projected or aggregated properties. You’ve seen that when more than one property or alias is listed in the select clause, NHibernate returns the query results as tuples: each row of the query result list is an instance of object[]. Tuples are inconvenient and non-typesafe, so NHibernate provides the select new constructor, as mentioned earlier. You can create new objects dynamically with this technique and also use it in combination with aggregation and grouping.

If you define a class called ItemBidSummary with a constructor that takes a long, a string, and an int, you can use the following query:

select new ItemBidSummary( bid.Item.id, count(bid), avg(bid.Amount) )
from Bid bid
where bid.item.SuccessfulBid is null
group by bid.Item.id

In the result of this query, each element is an instance of ItemBidSummary, which is a summary of an Item, the number of bids for that item, and the average bid amount. This approach is typesafe, and a data transfer class such as ItemBidSummary can easily be extended for special formatted printing of values in reports.

7.4.5. Improving performance with report queries

Report queries can have an impact on the performance of your application. Let’s explore this issue in more depth.

The only time we’ve seen any significant overhead in NHibernate code compared to direct ADO.NET queries—and then only for unrealistically simple test cases—is in the special case of read-only queries against a local database. It’s possible for a database to completely cache query results in memory and respond quickly, so benchmarks are generally useless if the dataset is small: plain SQL and ADO.NET are always the fastest option.

On the other hand, even with a small result set, NHibernate must still do the work of adding the resulting objects of a query to the ISession cache (perhaps also the second-level cache) and manage uniqueness, and so on. Report queries give you a way to avoid the overhead of managing the ISession cache. The overhead of an NHibernate report query compared to direct SQL/ADO.NET isn’t usually measurable, even in unrealistic extreme cases like loading one million objects from a local database without network latency.

Report queries using projection in HQL let you specify exactly which properties you wish to retrieve. For report queries, you aren’t selecting entities, but only properties or aggregated values:

select user.Lastname, count(user)
from User user
group by user.Lastname

This query doesn’t return a persistent entity, so NHibernate doesn’t add a transactional object to the ISession cache. Furthermore, NHibernate won’t track changes to these returned objects.

Reporting queries result in faster release of allocated memory, because objects aren’t kept in the ISession cache until the ISession is closed—they may be garbage collected as soon as they’re dereferenced by the application, after executing the report.

These considerations are almost always extremely minor, so don’t go out and rewrite all your read-only transactions to use report queries instead of transactional, cached, and monitored objects. Report queries are more verbose and (arguably) less object oriented. They also make less efficient use of NHibernate’s caches, which is much more important once you consider the overhead of remote communication with the database in production systems. We follow the “don’t optimize prematurely” wisdom, and we urge you to wait until you find an actual case where you have a real performance problem before using this optimization.

7.4.6. Obtaining DataSets

It may happen that you have to interact with a component using DataSets. Many report engines, like Crystal Reports, have limited support for POCO (but which may be enough). Their common data source is either the database directly or a DataSet. But because NHibernate doesn’t return DataSets, you have to find a solution.

The most common workaround is to directly use ADO.NET to get the DataSet. This solution may fit in many situations, but it doesn’t take advantage of NHibernate features and requires careful monitoring of possible changes because they can make NHibernate caches stale.

Another solution is to use NHibernate to query data and fill a DataSet with the result. This operation can be done manually by writing code similar to that required for DTOs, but it becomes tedious when you’re dealing with numerous entities. In this case, code generation can help greatly simplify the process.

Now, let’s get back to regular entity queries. There are still many NHibernate features waiting to be discovered.

7.5. Advanced query techniques

You’ll use advanced query techniques less frequently with NHibernate, but it will be helpful to know about them. In this section, we discuss programmatically building criteria with example objects, a topic we briefly introduced earlier.

Filtering collections is also a handy technique: you can use the database instead of filtering objects in memory. Subqueries and queries in native SQL will round out your knowledge of NHibernate query techniques.

7.5.1. Dynamic queries

It’s common for queries to be built programmatically by combining several optional query criteria depending on user input. For example, a system administrator may wish to search for users by any combination of first name or last name, and to retrieve the result ordered by username. Using HQL, you can build the query using string manipulations:

public IList<User> FindUsers(string firstname,
string lastname) {
StringBuilder queryString = new StringBuilder();
bool conditionFound = false;
if (firstname != null) {
queryString.Append("lower(u.Firstname) like :firstname ");
conditionFound=true;
}
if (lastname != null) {
if (conditionFound) queryString.Append("and ");
queryString.Append("lower(u.Lastname) like :lastname ");
conditionFound=true;
}
string fromClause = conditionFound ?
"from User u where " :
"from User u ";
queryString.Insert(0, fromClause).Append("order by u.username");
IQuery query = GetSession().CreateQuery( queryString.ToString() );
if (firstname != null)
query.SetString( "firstname",
'%' + firstname.ToLower() + '%' );
if (lastname != null)
query.SetString( "lastname",
'%' + lastname.ToLower() + '%' );
return query.List<User>();
}

This code is tedious and noisy, so let’s try a different approach. The ICriteria API looks promising:

public IList<User> FindUsers(string firstname,
string lastname) {
ICriteria crit = GetSession().CreateCriteria(typeof(User));
if (firstname != null) {
crit.Add( Expression.InsensitiveLike("Firstname",
firstname,
MatchMode.Anywhere) );
}
if (lastname != null) {
crit.Add( Expression.InsensitiveLike("Lastname",
lastname,
MatchMode.Anywhere) );
}
crit.AddOrder( Order.Asc("Username") );
return crit.List<User>();
}

This code is much shorter and more readable. Note that the InsensitiveLike() operator performs a case-insensitive match. There seems no doubt that this is a better approach. But for search screens with many optional search criteria, there is an even better way.

First, observe that as you add new search criteria, the parameter list of FindUsers() grows. It would be better to capture the searchable properties as an object. Because all the search properties belong to the User class, why not use an instance of User?

QBE (Query by Example) uses this idea; you provide an instance of the queried class with some properties initialized, and the query returns all persistent instances with matching property values. NHibernate implements QBE as part of the ICriteria query API:

public IList<User> FindUsers(User u) {
Example exampleUser =
Example.Create(u).IgnoreCase().EnableLike(MatchMode.Anywhere);
return GetSession().CreateCriteria(typeof(User))
.Add(exampleUser)
.List<User>();
}

The call to Create() returns a new instance of Example for the given instance of User. The IgnoreCase() method puts the example query into a case-insensitive mode for all string-valued properties. The call to EnableLike() specifies that the SQL like operator should be used for all string-valued properties and specifies a MatchMode.

You’ve significantly simplified the code again. The nicest thing about NHibernate Example queries is that an Example is just an ordinary ICriterion. You can freely mix and match QBE with QBC.

Let’s see how this works by further restricting the search results to users with unsold Items. For this purpose, you add an ICriteria to the example user, constraining the result using its Items collection of Items:

public IList<User> FindUsers(User u) {
Example exampleUser =
Example.Create(u).IgnoreCase().EnableLike(MatchMode.Anywhere);
return GetSession().CreateCriteria(typeof(User))
.Add( exampleUser )
.CreateCriteria("Items")
.Add( Expression.IsNull("SuccessfulBid") )
.List<User>();
}

Even better, you can combine User properties and Item properties in the same search:

public IList<User> FindUsers(User u, Item i) {
Example exampleUser =
Example.Create(u).IgnoreCase().EnableLike(MatchMode.Anywhere);
Example exampleItem =
Example.Create(i).IgnoreCase().EnableLike(MatchMode.Anywhere);
return GetSession().CreateCriteria(typeof(User))
.Add( exampleUser )
.CreateCriteria("Items")
.Add( exampleItem )
.List<User>();
}

At this point, we invite you to step back and consider how much code would be required to implement this search screen using hand-coded SQL/ADO.NET. It’s a lot: if we listed it here, it would stretch for pages.

7.5.2. Collection filters

You’ll commonly want to execute a query against all elements of a particular collection. For instance, you may have an Item and wish to retrieve all bids for that item, ordered by the amount of the bid. You already know one good way to write this query:

IList results =
session.CreateQuery(@"from Bid b where b.Item = :item
order by b.Amount asc")
.SetEntity("item", item)
.List();

This query works perfectly, because the association between bids and items is bidirectional and each Bid knows its Item. Imagine that this association was unidirectional: Item has a collection of Bids, but there is no inverse association from Bid to Item.

You can try the following query:

string query = @"select bid from Item item join item.Bids bid
where item = :item order by bid.Amount asc";
IList results = session.CreateQuery(query)
.SetEntity("item", item)
.List();

This query is inefficient—it uses an unnecessary join. A better, more elegant solution is to use a collection filter: a special query that can be applied to a persistent collection or array. It’s commonly used to further restrict or order a result. You use it on an already loaded Item and its collection of bids:

IList results = session.CreateFilter( item.Bids,
"order by this.Amount asc" )
.List();

This filter is equivalent to the first query shown earlier and results in identical SQL. Collection filters have an implicit from clause and an implicit where condition. The alias this refers implicitly to elements of the collection of bids.

NHibernate collection filters aren’t executed in memory. The collection of bids may be uninitialized when the filter is called and, if so, will remain uninitialized. Furthermore, filters don’t apply to transient collections or query results; they may only be applied to a persistent collection currently referenced by an object associated with the NHibernate session.

The only required clause of an HQL query is from. Because a collection filter has an implicit from clause, the following is a valid filter:

IList results = session.CreateFilter( item.Bids, "" ).List();

To the great surprise of everyone (including the designer of this feature), this trivial filter turns out to be useful! You can use it to paginate collection elements:

IList results = session.CreateFilter( item.Bids, "" )
.SetFirstResult(50)
.SetMaxResults(25)
.List();

But usually you’ll use an order by with paginated queries.

Even though you don’t need a from clause in a collection filter, you can include one if you like. A collection filter doesn’t even need to return elements of the collection being filtered. The next query returns any Category with the same name as a category in the given collection:

string filterString =
"select other from Category other where this.Name = other.Name";
IList results =
session.CreateFilter( cat.ChildCategories, filterString )
.List();

The following query returns a collection of Users who have bid on the item:

IList results =
session.CreateFilter( item.Bids,
"select this.Bidder" )
.List();

The next query returns all these users’ bids (including those for other items):

IList results = session.CreateFilter(
item.Bids,
"select elements(this.Bidder.Bids)" )
.List();

Note that the query uses the special HQL elements() function (explained later) to select all elements of a collection.

The most important reason for the existence of collection filters is to allow the application to retrieve some elements of a collection without initializing the entire collection. In the case of large collections, this is important to achieve acceptable performance. The following query retrieves all bids made by a user in the past week:

IList results =
session.CreateFilter( user.Bids,
"where this.Created > :oneWeekAgo" )
.SetDateTime("oneWeekAgo", DateTime.Now.AddDays(-7)
.List();

Again, this query doesn’t initialize the Bids collection of the User.

7.5.3. Subqueries

Subselects are an important and powerful feature of SQL. A subselect is a select query embedded in another query, usually in the select, from, or where clause.

HQL supports subqueries in the where clause. We can’t think of many good uses for subqueries in the from clause, although select clause subqueries may be a nice future extension. (You may remember from section 3.4.2 that a derived property mapping is a select clause subselect.) Note that some database platforms supported by NHibernate don’t implement subselects. If you desire portability among many different databases, you shouldn’t use this feature.

The result of a subquery may contain either a single row or multiple rows. Typically, subqueries that return single rows perform aggregation. The following subquery returns the total number of items sold by a user; the outer query returns all users who have sold more than 10 items:

from User u where 10 < (
select count(i) from u.Items i where i.SuccessfulBid is not null
)

This is a correlated subquery—it refers to an alias (u) from the outer query. The next subquery is an uncorrelated subquery:

from Bid bid where bid.Amount + 1 >= (
select max(b.Amount) from Bid b
)

The subquery in this example returns the maximum bid amount in the entire system; the outer query returns all bids whose amount is within one (dollar) of that amount.

Note that in both cases, the subquery is enclosed in parentheses. This is always required.

Uncorrelated subqueries are harmless; there is no reason not to use them when convenient, although they can always be rewritten as two queries (after all, they don’t reference each other). You should think more carefully about the performance impact of correlated subqueries. On a mature database, the performance cost of a simple correlated subquery is similar to the cost of a join. But it isn’t necessarily possible to rewrite a correlated subquery using several separate queries.

If a subquery returns multiple rows, it’s combined with quantification. ANSI SQL (and HQL) defines the following quantifiers:

  • any
  • all
  • some (a synonym for any)
  • in (a synonym for = any)

For example, the following query returns items where all bids are less than 100:

from Item item where 100 > all ( select b.Amount from item.Bids b )

The next query returns all items with bids greater than 100:

from Item item where 100 < any ( select b.Amount from item.Bids b )

This query returns items with a bid of exactly 100:

from Item item where 100 = some ( select b.Amount from item.Bids b )

So does this one:

from Item item where 100 in ( select b.Amount from item.Bids b )

HQL supports a shortcut syntax for subqueries that operate on elements or indices of a collection. The following query uses the special HQL elements() function:

IList list = session.CreateQuery(@"from Category c
where :item in elements(c.Items)")
.SetEntity("item", item)
.List();

The query returns all categories to which the item belongs and is equivalent to the following HQL, where the subquery is more explicit:

IList results = session.CreateQuery(@"from Category c
where :item in (from c.Items)")
.SetEntity("item", item)
.List();

Along with elements(), HQL provides indices(), maxelement(), minelement(), max-index(), minindex(), and size(), each of which is equivalent to a certain correlated subquery against the passed collection. Refer to the NHibernate documentation for more information about these special functions; they’re rarely used.

Subqueries are an advanced technique; you should question their frequent use, because queries with subqueries can often be rewritten using only joins and aggregation. But they’re powerful and useful from time to time.

By now, we hope you’re convinced that NHibernate’s query facilities are flexible, powerful, and easy to use. HQL provides almost all the functionality of ANSI standard SQL. Of course, on rare occasions you do need to resort to handcrafted SQL, especially when you wish to take advantage of database features that go beyond the functionality specified by the ANSI standard.

7.6. Native SQL

We can think of some good examples why you may use native SQL queries in NHibernate: HQL provides no mechanism for specifying SQL query hints; it also doesn’t support hierarchical queries (such as the Oracle CONNECT BY clause); and you may need to quickly port SQL code to your application. We suppose you’ll stumble on other examples.

In these relatively rare cases, you’re free to resort to using the ADO.NET API directly. But doing so means writing the tedious code by hand to transform the result of the query to an object graph. You can avoid all this work by using NHibernate’s built-in support for native SQL queries.

NHibernate lets you execute arbitrary SQL queries to retrieve scalar values or even entities. These queries can be written in your C# code or in your mapping files. In the latter case, it’s also possible to call stored procedures. You can even override the SQL commands that NHibernate generates for the CRUD operations. All these techniques will be covered in the following pages.

7.6.1. Using the ISQLQuery API

ISQLQuery instances are created by calling the method ISession.CreateSQLQuery(), passing in a SQL query string. Then you can use the methods of ISQLQuery to provide more details about your query.

A SQL query can return scalar values from individual columns, a complete entity (along with its associations and collections), or multiple entities. It also supports all the features of HQL queries, which means you can use parameters, paging, and so on.

Scalar and Entity Queries

The simplest native queries are scalar queries. Here’s an example:

Ilist results = session.CreateSQLQuery("SELECT * FROM ITEM")
.AddScalar("ITEM_ID", NHibernateUtil.Int64)
.AddScalar("NAME", NHibernateUtil.String)
.AddScalar("CREATED", NHibernateUtil.Date)
.List();

This query won’t return Item objects; instead, it returns the specified columns of all items as arrays of objects (object[]). These columns override the * in the SELECT.

If you want to retrieve entities, you can do this:

Ilist<Item> results = session.CreateSQLQuery("SELECT * FROM ITEM")
.AddEntity(typeof(Item))
.List<Item>();

You may also manage associations and collections by joining them. Let’s see how you can eagerly load the items with their sellers:

Ilist<Item> results = session.CreateSQLQuery(
@"SELECT ITEM_ID, NAME, CREATED, SELLER_ID, ...
USER_ID, FIRSTNAME, ...
FROM ITEM i, USER u,
WHERE i.SELLER_ID = u.USER_ID" )
.AddEntity("item", typeof(Item))
.AddJoin("item.Seller")
.List<Item>();

In this case, the query is more complex because you must specify the columns of both tables. You must also specify the alias "item" in AddEntity() in order to join its Seller. Here is how you can try to join the Bids collection of the items:

Ilist<Item> results = session.CreateSQLQuery(
@"SELECT i.ITEM_ID, NAME, CREATED, ...
BID_ID, CREATED, b.ITEM_ID, ...
FROM ITEM i, BID b,
WHERE i.ITEM_ID = b.ITEM_ID" )
.AddEntity("item", typeof(Item))
.AddJoin("item.Bids")
.List<Item>();

This query is similar to the previous one. A good knowledge of SQL is enough to write it and take care of the ambiguous column name ITEM_ID. But unfortunately, this won’t work in NHibernate; it doesn’t recognize i.ITEM_ID because it isn’t specified like that in the mapping file or attributes.

It’s time to introduce a new trick that addresses this problem. We demonstrate this in the next section, which explains how to retrieve many entity types in a single query.

Multiple-Entity Queries

Querying more than one entity increases the chance of having column-name duplications. Thankfully, this problem is simple to solve using placeholders.

Placeholders are necessary because a SQL query result may return the state of multiple entity instances in each row, or even the state of multiple instances of the same entity. You need a way to distinguish between the different entities. NHibernate uses its own naming scheme, where column aliases are placed in the SQL to correctly map column values to the properties of particular instances. But when you’re using your own SQL, you don’t want the user to have to understand all this. Instead, you can specify native SQL queries with placeholders for the column aliases, which are much simpler.

The following native SQL query shows what these placeholders—the names enclosed in braces—look like:

Ilist<Item> results = session.CreateSQLQuery(
@"SELECT i.ITEM_ID as {item.id},
i.NAME as {item.Name},
i.CREATED as {item.Created}, ...
FROM ITEM i" )
.AddEntity("item", typeof(Item))
.List<Item>();

Each placeholder specifies an HQL-style property name. And you must provide the entity class that is referred to by item in the placeholders.

Here’s a shortcut, if you don’t want to specify every column explicitly:

Ilist<Item> results = session.CreateSQLQuery(
@"SELECT {item.*}
FROM ITEM" )
.AddEntity("item", typeof(Item))
.List<Item>();

The {item.*} placeholder is replaced with a list of the mapped column names and correct column aliases for all properties of the Item entity.

Now, let’s see how you can return multiple entities:

Ilist results = session.CreateSQLQuery(
@"SELECT {item.*}, {user.*}
FROM ITEM i INNER JOIN USER u
ON i.SELLER_ID = u.USER_ID" )
.AddEntity("item", typeof(Item))
.AddEntity("user", typeof(User))
.List();

This query returns tuples of entities; as usual, NHibernate represents a tuple as an instance of object[].

As with HQL, it’s usually recommended that you keep these queries out of your code by writing them in your mappings, as discussed next.

7.6.2. Named SQL queries

Named SQL queries are queries defined in NHibernate mapping files. Here is how you can rewrite the previous example:

<sql-query name="FindItemsAndSellers">
<return alias="item" class="Item"/>
<return alias="user" class="User"/>
<![CDATA[
SELECT {item.*}, {user.*}
FROM ITEM i INNER JOIN USER u
ON i.SELLER_ID = u.USER_ID
]]>
</sql-query>

This named query can be executed from code as follows:

IList results = session.GetNamedQuery("FindItemAndSellers")
.List();

Comparing the named query to the inline version discussed previously, you can see that the <return> element replaces the method AddEntity(). <return-join> is used for associations and collections, and <return-scalar> returns scalar values. You may also load a collection only using <load-collection>.

If you frequently return the same information, you can externalize it using <resultset>. Here is a complete example:

<resultset name="FullItem">
<return alias="item" class="Item"/>
<return-join alias="user" property="item.Seller"/>
<return-join alias="bid" property="item.Bids"/>
<return-scalar column="diff" type="int"/>
</resultset>
<sql-query name="FindItemsWithSellersAndBids" resultset-ref="FullItem">
<![CDATA[
SELECT {item.*}, {user.*}, {bid.*},
i.RESERVE_PRICE–i.INITIAL_PRICE as diff
FROM ITEM i
INNER JOIN USER u ON i.SELLER_ID = u.USER_ID
LEFT OUTER JOIN BID b ON i.ITEM_ID = b.ITEM_ID
]]>
</sql-query>

When executed, this query returns tuples containing an item with its seller and bids and a computed scalar value (diff). Note that you can also refer to <resultset> elements in code using the method ISQLQuery.SetResultSetMapping().

Named SQL queries allow you to avoid the {} syntax and define your own column aliases. Here is a simple example:

<sql-query name="FindItems">
<return alias="item" class="Item">
<return-property name="Name" column="MY_NAME"/>
<return-property name="InitialPrice">
<return-column name="MY_ITEM_PRICE_VALUE"/>
<return-column name="MY_ITEM_PRICE_CURRENCY"/>
</return-property>
</return>
<![CDATA[
SELECT i.ITEM_ID as {item.id}, ...
i.NAME as MY_NAME,
i.INITIAL_PRICE as MY_ITEM_PRICE_VALUE,
i.INITIAL_PRICE_CURRENCY as MY_ITEM_PRICE_CURRENCY,
FROM ITEM i
]]>
</sql-query>

In this example, you also use the {} syntax for columns you don’t want to customize. In section 6.1.2, you defined an item’s initial price as a composite user type; this example shows how to load it.

Because the native SQL is tightly coupled to the actual mapped tables and columns, we strongly recommend that you define all native SQL queries in the mapping document instead of embedding them in the C# code.

Using Stored Procedures

NHibernate-named SQL queries can call stored procedures and functions since version 1.2.0. Suppose you create a stored procedure like this (using a SQL Server database):

CREATE PROCEDURE FindItems_SP AS
SELECT ITEM_ID, NAME, INITIAL_PRICE, INITIAL_PRICE_CURRENCY, ...
FROM ITEM

You can call it using this query:

<sql-query name="FindItems">
<return alias="item" class="Item">
<return-property name="id" column="ITEM_ID "/>
<return-property name="Name" column="NAME"/>
<return-property name="InitialPrice">
<return-column name="INITIAL_PRICE "/>
<return-column name="INITIAL_PRICE_CURRENCY"/>
</return-property>
...
</return>
exec FindItems_SP
</sql-query>

Unlike the previous example, you must map all properties here; this is obvious, because NHibernate can’t inject its own column aliases.

Note that the stored procedure must return a resultset to be able to work with NHibernate. Another limitation is that associations and collections aren’t supported; a SQL query calling a stored procedure can only return scalar values and entities.

Finally, stored procedures are database-dependent. Therefore, the mapping for a SQL Server database may not be the same as for an Oracle database.

If, in some special cases, you need even more control over the SQL that is executed, or if you want to call a stored procedure that isn’t supported, NHibernate offers you a way to get an ADO.NET connection. The property session.Connection returns the currently active ADO.NET IDbConnection from the ISession. It’s not your responsibility to close this connection, just to execute whatever SQL statements you like and then continue using the ISession (and finally, close the ISession). The same is true for transactions; you must not commit or roll back this connection yourself (unless you completely manage the connection for NHibernate).

7.6.3. Customizing create, retrieve, update, and delete commands

In most cases, the commands generated by NHibernate to save your entities are acceptable. But it may happen that you need to perform a specific operation and override NHibernate’s generated SQL. NHibernate lets you specify the SQL statements for create, retrieve, update, and delete operations.

The custom commands are written in the mapping of the concerned class. For example:

<class name="Item">
...
<sql-insert>
INSERT INTO ITEM (NAME, ..., ITEM_ID) VALUES (UPPER(?), ..., ?)
</sql-insert>
<sql-update>UPDATE ITEM SET NAME=UPPER(?), ...
WHERE ITEM_ID=?</sql-update>
<sql-delete>exec DeleteItem_SP ?</sql-delete>
</class>

In this example, <sql-insert> and <sql-update> respectively save and update an item with a custom logic (converting names to uppercase). And as you can see in <sql-delete>, these custom commands can also call stored procedures. In this last case, the order of the positional parameters must be respected (as you can see here, the identifier is generally the last parameter). The order is defined by NHibernate; you can read it by enabling debug logging and reading the static SQL commands that are generated by NHibernate (remember to do that before writing these custom commands).

Note that the custom <sql-insert> will be ignored if you use identity to generate identifier values for the class. Your custom commands are required to affect the same number of rows as NHibernate-generated SQL would. You can disable this verification by adding check="none" to your commands. Also, for NHibernate 1.2, it isn’t possible to supply named parameters such as ITEM_ID = :id for these insert/update SQL queries.

Retrieve commands are defined as named queries. For example, here is a query to load an item with a pessimistic lock:

<sql-query name="LoadItem">
<return alias="item" class="Item" lock-mode="upgrade"/>
SELECT {item.*}
FROM ITEM
WHERE ITEM_ID = ?
FOR UPDATE
</sql-query>

Then it must be referenced in the mapping:

<class name="Item">
...
<loader query-ref="LoadItem"/>
</class>

It’s also possible to customize how a collection should be loaded. In this case, the named query will use the <load-collection> tag. Here is an example for the Bids collection of Item:

<sql-query name="LoadItemBids">
<load-collection alias="bid" role="Item.Bids"/>
SELECT {bid.*}
FROM BID
WHERE ITEM_ID = :id
</sql-query>

Here is how it’s referenced:

<bag name="Bids" ...>
...
<loader query-ref="LoadItemBids"/>
</bag>

When you’re writing queries and testing them in your application, you may encounter one of the common performance issues with ORM. Fortunately, you know how to avoid (or, at least, limit) their impact. This process is called optimizing object retrieval. Let’s walk through the most common issues.

7.7. Optimizing object retrieval

Performance-tuning your application should first include the most obvious settings, such as the best fetching strategies and use of proxies, as shown in chapter 4. Note that we consider enabling the second-level cache to be the last optimization you should usually make.

The fetch joins, part of the runtime fetching strategies, as introduced in this chapter, deserve some extra attention. But some design issues can’t be resolved by tuning; they can only be avoided if possible.

7.7.1. Solving the n+1 selects problem

The biggest performance killer in applications that persist objects to SQL databases is the n+1 selects problem. When you tune the performance of an NHibernate application, this problem is the first thing you’ll usually need to address.

It’s normal (and recommended) to map almost all associations for lazy initialization. This means you generally set all collections to lazy="true" and change some of the one-to-one and many-to-one associations to not use outer joins by default. This is the only way to avoid retrieving all objects in the database in every transaction. Unfortunately, this decision exposes you to the n+1 selects problem. It’s easy to understand this problem by considering a simple query that retrieves all Items for a particular user:

IList<Item> results = session.CreateCriteria(typeof(Item))
.Add( Expression.Eq("item.Seller", user) )
.List<Item>();

This query returns a list of items, where each collection of bids is an uninitialized collection wrapper. Suppose you now wish to find the maximum bid for each item. The following code would be one way to do this:

IList maxAmounts = new ArrayList();
foreach( Item item in results ) {
double maxAmount = 0;
foreach ( Bid bid in item.Bids ) {
if( maxAmount < bid.Amount )
maxAmount = bid.Amount;
}
maxAmounts.Add( new MaxAmount( item.Id, maxAmount ) );
}

But there is a huge problem with this solution (aside from the fact that this would be much better executed in the database using aggregation functions): each time you access the collection of bids, NHibernate must fetch this lazy collection from the database for each item. If the initial query returns 20 items, the entire transaction requires 1 initial select that retrieves the items plus 20 additional selects to load the bids collections of each item. This may easily result in unacceptable latency in a system that accesses the database across a network. Usually, you don’t explicitly create such operations, because you should quickly see that doing so is suboptimal. But the n+1 selects problem is often hidden in more complex application logic, and you may not recognize it by looking at a single routine.

The first attempt to solve this problem may be to enable batch fetching. You change your mapping for the bids collection to look like this:

<set name="bids" lazy="true" inverse="true" batch-size="10">

With batch fetching enabled, NHibernate prefetches the next 10 items when the collection is first accessed. This reduces the problem from n+1 selects to n/10 + 1 selects. For many applications, this may be sufficient to achieve acceptable latency. On the other hand, it also means that in some other transactions, collections are fetched unnecessarily. It isn’t the best you can do in terms of reducing the number of round trips to the database.

A much, much better solution is to take advantage of HQL aggregation and perform the work of calculating the maximum bid on the database. Thus you avoid the problem:

string query = @"select new MaxAmount( item.id, max(bid.Amount) )
from Item item join item.Bids bid"
where item.Seller = :user group by item.id";
IList maxAmounts = session.CreateQuery(query)
.SetEntity("user", user)
.List();

Unfortunately, this isn’t a complete solution to the generic issue. In general, you may need to do more complex processing on the bids than merely calculating the maximum amount. It’s better to do this processing in the .NET application.

You can try enabling eager fetching at the level of the mapping document:

<set name="Bids" lazy="false" inverse="true" outer-join="true">

The outer-join attribute is available for collections and other associations. It forces NHibernate to load the association eagerly, using a SQL outer join. You may also use the fetch attribute; fetch="select" is equivalent to outer-join="false", and fetch="join" is equivalent to outer-join="true". (Note that, as previously mentioned, HQL queries ignore the outer-join attribute; but you may be using a criteria query.)

This mapping avoids the problem as far as this transaction is concerned; you’re now able to load all bids in the initial select. Unfortunately, any other transaction that retrieves items using Get(), Load(), or a criteria query will also retrieve all the bids at once. Retrieving unnecessary data imposes extra load on both the database server and the application server and may also reduce the concurrency of the system, creating too many unnecessary read locks at the database level.

Hence we consider eager fetching at the level of the mapping file to be almost always a bad approach. The outer-join attribute of collection mappings is arguably a misfeature of NHibernate (fortunately, it’s disabled by default). Occasionally, it makes sense to enable outer-join for a <many-to-one> or <one-to-one> association (the default is auto; see section 4.4.6), but you’d never do this in the case of a collection.

Our recommended solution for this problem is to take advantage of NHibernate’s support for runtime (code-level) declarations of association fetching strategies. The example can be implemented like this:

IList<Item> results = session.CreateCriteria(typeof(Item))
.Add( Expression.Eq("item.Seller", user) )
.SetFetchMode("Bids", FetchMode.Eager)
.List<Item>();
// Make results distinct
ISet<Item> distinctResults = new HashedSet<Item>(results);
IList maxAmounts = new ArrayList();
foreach ( Item item in distinctResults ) {
double maxAmount = 0;
foreach ( Bid bid in item.Bids ) {
if( maxAmount < bid.Amount )
maxAmount = bid.Amount;
}
maxAmounts.Add( new MaxAmount( item.Id, maxAmount ) );
}

You disabled batch fetching and eager fetching at the mapping level; the collection is lazy by default. Instead, you enable eager fetching for this query alone by calling SetFetchMode(). As discussed earlier in this chapter, this is equivalent to a fetch join in the from clause of an HQL query.

The previous code example has one extra complication: the result list returned by the NHibernate criteria query isn’t guaranteed to be distinct. In the case of a query that fetches a collection by outer join, it will contain duplicate items. It’s the application’s responsibility to make the results distinct if that is required. You implement this by adding the results to a HashedSet (from the library Iesi.Collections) and then iterating the set.

You’ve now established a general solution to the n+1 selects problem. Rather than retrieving just the top-level objects in the initial query and then fetching needed associations as the application navigates the object graph, you follow a two-step process:

1.  

Fetch all needed data in the initial query by specifying exactly which associations will be accessed in the following unit of work.

2.  

Navigate the object graph, which consists entirely of objects that have already been fetched from the database.

This is the only true solution to the mismatch between the object-oriented world, where data is accessed by navigation, and the relational world, where data is accessed by joining.

Another efficient solution, for deep graphs of objects, is to issue one query per level and let NHibernate resolve the references between the objects. For example, you can query categories, asking NHibernate to fetch their items. Then you can query these items, asking NHibernate to fetch their bids.

Finally, there is one further solution to the n+1 selects problem. For some classes or collections with a sufficiently small number of instances, it’s possible to keep all instances in the second-level cache, avoiding the need for database access. Obviously, this solution is preferred where and when it’s possible (it isn’t possible in the case of the Bids of an Item, because you wouldn’t enable caching for this kind of data).

The n+1 selects problem may appear whenever you use the List() method of IQuery to retrieve the result. As we mentioned earlier, this issue can be hidden in more complex logic; we highly recommend the optimization strategies mentioned in section 4.4.7 to find such scenarios. It’s also possible to generate too many selects by using Find(), the shortcut for queries on the ISession API, or Load() and Get().

Next, we examine a third query API method. It’s extremely important to understand when it’s applicable, because it produces n+1 selects!

7.7.2. Using Enumerable() queries

The Enumerable() method of the ISession and IQuery interfaces behaves differently than the Find() and List() methods. It’s provided specifically to let you take full advantage of the second-level cache.

Consider the following code:

IQuery categoryByName =
session.CreateQuery("from Category c where c.Name like :name");
categoryByName.SetString("name", categoryNamePattern);
IList categories = categoryByName.List();

This query results in execution of a SQL select, with all columns of the CATEGORY table included in the select clause:

select CATEGORY_ID, NAME, PARENT_ID from CATEGORY where NAME like ?

If you expect that categories are already cached in the session or second-level cache, then you only need the identifier value (the key to the cache). This will reduce the amount of data you have to fetch from the database. The following SQL would be slightly more efficient:

select CATEGORY_ID from CATEGORY where NAME like ?
You can use the Enumerable() method:
IQuery categoryByName =
session.CreateQuery("from Category c where c.Name like :name");
categoryByName.SetString("name", categoryNamePattern);
IEnumerable<Category> categories = categoryByName.Enumerable<Category>();

The initial query only retrieves the category primary key values. You then iterate through the result, and NHibernate looks up each Category in the current session and in the second-level cache. If a cache miss occurs, NHibernate executes an additional select, retrieving the category by its primary key from the database.

In most cases, this is a minor optimization. It’s usually much more important to minimize row reads than to minimize column reads. Still, if your object has large string fields, this technique may be useful to minimize data packets on the network and, therefore, latency.

Let’s talk about another optimization, which also isn’t applicable in every case. So far, we’ve only discussed caching the results of a lookup by identifier (including implicit lookups, such as loading a lazy association) in chapter 5. It’s also possible to cache the results of NHibernate queries.

7.7.3. Caching queries

For applications that perform many queries and few inserts, deletes, or updates, caching queries can have an impact on performance. But if the application performs many writes, the query cache won’t be utilized efficiently. NHibernate expires a cached query result set when there is any insert, update, or delete of any row of a table that appears in the query.

Just as not all classes or collections should be cached, not all queries should be cached or will benefit from caching. For example, if a search screen has many different search criteria, then it won’t happen often that the user chooses the same criterion many times. In this case, the cached query results will be underused, and you’d be better off not enabling caching for that query.

Note that the query cache doesn’t cache the entities returned in the query result set, just the identifier values. But NHibernate does fully cache the value-typed data returned by a projection query. For example, the projection query "select u, b.Created fromUser u, Bid b where b.Bidder = u" results in caching of the identifiers of the users and the date object when they made their bids. It’s the responsibility of the second-level cache (in conjunction with the session cache) to cache the actual state of entities. If the cached query you just saw is executed again, NHibernate will have the bid-creation dates in the query cache but perform a lookup in the session and second-level cache (or even execute SQL again) for each user in the result. This is similar to the lookup strategy of Enumerable(), as explained in the previous section.

The query cache must be enabled using, for example, the following:

<add key="hibernate.cache.use_query_cache" value="true" />

But this setting alone isn’t enough for NHibernate to cache query results. By default, NHibernate queries always ignore the cache. To enable query caching for a particular query (to allow its results to be added to the cache, and to allow it to draw its results from the cache), you use the IQuery interface:

IQuery categoryByName =
session.CreateQuery("from Category c where c.Name = :name");
categoryByName.SetString("name", categoryName);
categoryByName.SetCacheable(true);

But even this doesn’t give you sufficient granularity. Different queries may require different query-expiration policies. NHibernate allows you to specify a different named cache region for each query:

IQuery userByName =
session.CreateQuery("from User u where u.Username= :uname");
userByName.SetString("uname", username);
userByName.SetCacheable(true);
userByName.SetCacheRegion("UserQueries");

You can now configure the cache-expiration policies using the region name. When query caching is enabled, the cache regions are as follows:

  • The default query cache region, null
  • Each named region
  • The timestamp cache, NHibernate.Cache.UpdateTimestampsCache, which is a special region that holds timestamps of the most recent updates to each table

NHibernate uses the timestamp cache to decide whether a cached query result set is stale. NHibernate looks in the timestamp cache for the timestamp of the most recent insert, update, or delete made to the queried table. If it’s later than the timestamp of the cached query results, then the cached results are discarded and a new query is issued. For best results, you should configure the timestamp cache so that the update timestamp for a table doesn’t expire from the cache while queries against the table are still cached in one of the other regions. The easiest way is to turn off expiry for the timestamp cache.

Some final words about performance optimization: remember that issues like the n+1 selects problem can slow your application to unacceptable performance. Try to avoid the problem by using the best fetching strategy. Verify that your object-retrieval technique is the best for your use case before you look into caching anything.

From our point of view, caching at the second level is an important feature, but it isn’t the first option when optimizing performance. Errors in the design of queries or an unnecessarily complex part of your object model can’t be improved with a “cache it all” approach. If an application only performs at an acceptable level with a hot cache (a full cache) after several hours or days of runtime, you should check it for serious design mistakes, unperformant queries, and n+1 selects problems.

7.7.4. Using profilers and NHibernate Query Analyzer

In most cases, there are many ways to write a query; it may be hard to select the optimal approach. Profiler tools can help you test the performance of these options; use them as often as possible.

When you’re working with HQL queries, you may wonder whether the generated SQL is optimal. A tool called NHibernate Query Analyzer lets you dynamically write and execute queries on your domain model. It displays the generated SQL query in real time and displays the result of your query when you execute it. It’s also helpful when you’re learning HQL. For more details, refer to the documentation at http://www.ayende.com/projects/nhibernate-query-analyzer.aspx.

7.8. Summary

We don’t expect that you know everything about HQL and criteria after reading this chapter once. But the chapter will be useful as a reference in your daily work with NHibernate, and we encourage you to come back and reread sections whenever you need to.

The code examples in this chapter show the three basic NHibernate query techniques: HQL, Query by Criteria (including a Query by Example mechanism), and direct execution of database-specific SQL queries.

We consider HQL the most powerful method. HQL queries are easy to understand, and they use persistent class and property names instead of table and column names. HQL is polymorphic: you can retrieve all objects with a given interface by querying for that interface. With HQL, you have the full power of arbitrary restrictions and projection of results, with logical operators and function calls just as in SQL, but always on the object level using class and property names. You can use named parameters to bind query arguments in a secure and type-safe way. Report-style queries are also supported, and this is an important area where other ORM solutions usually lack features.

Most of this is also true for criteria-based queries; but instead of using a query string, you use a typesafe API to construct the query. So-called example objects can be combined with criteria—for example, to retrieve “all items that look like the given example.”

The most important part of object retrieval is the efficient loading of associated objects—that is, how you define the part of the object graph you’d like to load from the database in one operation. NHibernate provides lazy-, eager-, and batch-fetching strategies, in mapping metadata and dynamically at runtime. You can use association joins and result iteration to prevent common problems such as the n+1 selects problem. Your goal is to minimize database round trips with many small queries; but at the same time, you try to minimize the amount of data loaded in one query.

The best query and the ideal object-retrieval strategy depend on your use case, but you should be well prepared with the examples in this chapter and NHibernate’s excellent runtime fetching strategies.