Appendix A. SQL fundamentals – NHibernate in Action

Appendix A. SQL fundamentals

This book assumes that you have a basic understanding of relational databases and the Structured Query Language (SQL). It will be easier for you to learn some of the advanced features of NHibernate if you already have a sound knowledge of SQL. This appendix gives a brief overview of the fundamentals of SQL. We highly recommend that you find a book to learn more about it.

Tables

A table, with its rows and columns, is a familiar sight to anyone who has worked with an SQL database. Sometimes you’ll see tables referred to as relations, rows as tuples, and columns as attributes. This is the language of the relational data model, the mathematical model that SQL databases (imperfectly) implement.

Relational model

The relational model lets you define data structures and constraints that guarantee the integrity of your data (for example, by disallowing values that don’t accord with your business rules). The relational model also defines the relational operations of restriction, projection, Cartesian product, and relational join. These operations let you do useful things with your data, such as summarize or navigate it.

Each of the operations produces a new table from a given table or combination of tables. SQL is a language for expressing these operations in your application (therefore called a data language) and for defining the base tables on which the operations are performed.

DDL and DML

You write SQL Data Definition Language (DDL) statements to create and manage the tables. We say that DDL defines the database schema. Statements such as CREATE TABLE, ALTER TABLE, and CREATE SEQUENCE belong to DDL.

You write SQL Data Manipulation Language (DML) statements to work with your data at runtime. Let’s describe these DML operations in the context of tables from the CaveatEmptor application.

In CaveatEmptor, you naturally have entities like items, users, and bids. Assume that the SQL database schema for this application includes an ITEM table and a BID table. You can create the data types, tables, and constraints for this schema with SQL DDL (CREATE and ALTER operations).

Table operations

Insertion is the operation of creating a new table from an old table by adding a row. SQL databases perform this operation in place, so the new row is added to the existing table:

insert into ITEM values (4, 'Fum', 45.0)

An SQL update modifies an existing row:

update ITEM set INITIAL_PRICE = 47.0 where ITEM_ID = 4

A deletion removes a row:

delete from ITEM where ITEM_ID = 4

But the real power of SQL lies in querying data.

Queries

A single query may perform many relational operations on several tables. Let’s look at the basic operations.

Restriction is the operation of choosing rows of a table that match a particular criterion. In SQL, this criterion is the expression that occurs in the where clause:

select * from ITEM where NAME like 'F%'

Projection is the operation of choosing columns of a table and eliminating duplicate rows from the result. In SQL, the columns to be included are listed in the select clause. You can eliminate duplicate rows by specifying the distinct keyword:

select distinct NAME from ITEM

A Cartesian product (also called a cross join) produces a new table consisting of all possible combinations of rows from two existing tables. In SQL, you express a Cartesian product by listing tables in the from clause:

select * from ITEM i, BID b

A relational join produces a new table by combining the rows of two tables. For each pair of rows for which a join condition is true, the new table contains a row with all field values from both joined rows. In ANSI SQL, the join clause specifies a table join; the join condition follows the on keyword.

For example, to retrieve all items that have bids, you join the ITEM and BID tables on their common ITEM_ID attribute:

select * from ITEM i inner join BID b on i.ITEM_ID = b.ITEM_ID

A join is equivalent to a Cartesian product followed by a restriction. So, joins are often instead expressed in theta style, with a product in the from clause and the join condition in the where clause. This SQL theta-style join is equivalent to the previous ANSI-style join:

select * from ITEM i, BID b where i.ITEM_ID = b.ITEM_ID

Along with these basic operations, relational databases define operations for aggregating rows (GROUP BY) and ordering rows (ORDER BY):

select b.ITEM_ID, max(b.AMOUNT)
from BID b
group by b.ITEM_ID
having max(b.AMOUNT) > 15
order by b.ITEM_ID asc

SQL was called a structured query language in reference to a feature called subselects. Because each relational operation produces a new table from an existing table or tables, an SQL query can operate on the result table of a previous query. SQL lets you express this using a single query, by nesting the first query inside the second:

select *
from (
select b.ITEM_ID as ITEM, max(b.AMOUNT) as AMOUNT
from BID b
group by b.ITEM_ID
)
where AMOUNT > 15
order by ITEM asc

The result of this query is equivalent to the previous one.

A subselect can appear anywhere in an SQL statement. The case of a subselect in the where clause is the most interesting:

select * from BID b where b.AMOUNT >= (select max(c.AMOUNT) from BID c)

This query returns the largest bid in the database.

where clause subselects are often combined with quantification. The following query is equivalent:

select * from BID b where b.AMOUNT >= all(select c.AMOUNT from BID c)

An SQL restriction criterion is expressed in a sophisticated expression language that supports mathematical expressions, function calls, string matching, and even more sophisticated features such as full-text searches:

select * from ITEM i
where lower(i.NAME) like '%ba%'
or lower(i.NAME) like '%fo%'

SQL also includes many other operations that you’ll want to learn about as you become more experienced.