Hydrate Cookbook

Hydrate Primer

Introduction

Hydrate concerns itself with object data models and the process of populating those data models from various relational and hierarchical data sources, of saving those data models back to a relational database and of mapping the same schema to one or more XML representations.

Basic Usage - The Hello World Application

Since time immemorial, the humble 'Hello World' application has been used to demonstrate the simplest possible program in a programming language or application framework. The Hello World for the Java programming language is the iconic:

public class HelloWorld {
    public static void main(String args[]) {
        System.out.println("Hello World!");
    }
}

But what if you want to support a greeting that changes depending on the current locale? Suppose we have been given a database table listing greetings from different countries as follows:

+---------+---------------------+------------------------+
| country | language            | greeting               |
+---------+---------------------+------------------------+
| GB      | Simple English      | Hello World            |
| GB      | Pretentious English | Greetings Planet Earth |
| US      | American English    | Hello World!           |
| US      | Cowboy English      | Howdy Y'all            |
| AU      | Informal Aussie     | G'day Fellas           |
| FR      | French              | Bonjour Le Monde       |
| ES      | Spanish             | Hola El Mundo          |
+---------+---------------------+------------------------+

Note the clause 'we have been given'. It is the case more often than not in database design and integration that we have to work with existing data using a database schema that we may not have chosen ourselves. In this case, we might like to have more information about the country (a name for example), but we work with what we have.

We should like to have this information represented through an object view consisting of two objects, a country and a greeting, with a one-to-many relationship between them. We write down a UML-style class diagram showing the relationship between country and greeting:

In Hydrate, we use the toolset to generate two Java bean classes, Country and Greeting, that represent the above UML diagram (more on this later). We can then write in place of the main(...) method above:

 1: public static void main(String args[]) {
 2:     setUp();
 3:
 4:     PreparedStatement stmt = conn.prepareStatement(
 5:         "select country country_code, greeting text from Greetings");
 6:     QueryRunner runner = new QueryRunner(ctx);
 7:     runner.addAssembler(new GreetingAssembler(ctx, null, null));
 8:     runner.build(stmt, null);
 9:
10:     Locale loc = Locale.getDefault();
11:     Country country = new CountryKey(loc.getCountry()).findCountry(ctx);
12:
13:     for(Greeting greeting : country.getGreetings()) {
14:         System.out.println(greeting.getText());
15:     }
16: }

Let's look at what is happening here.

Line 2: The call to setUp() establishes a few services that will be needed to read data from the database and manage the Hydrate context. The code has been factored out of this listing because it is standard boilerplate and adds little to the understanding of what Hydrate is doing.

Lines 4-5: We create a prepared statement using JDBC. There is nothing special to Hydrate about this prepared statement - it is created using unadulterated java.sql classes. In fact, any prepared statement can be run and mapped to the objects that are in the object model. The names of the columns returned have been carefully chosen to match the names of the attributes of the 'Greeting' object. One of these attributes is actually a reference to the country object and the full name 'country_code' includes the name of the primary key of that object. This simple mapping will create greeting and country objects, automatically linking one to the other and eliminating duplicate country objects.

Lines 6-8: We use the helper class QueryRunner to execute the prepared statement that has been created. We add an 'Assembler' object to the query runner before telling it to run the query. The assembler in question is a 'GreetingAssembler', which tells Hydrate that it should be creating and resolving 'Greeting' objects from this query. The build method runs the query and adds any objects found to the Hydrate context. After the call to 'build' at line 8, the Hydrate framework will have run the query, built all the objects it can find from it, and stored them in the JVM's memory. In this case, it will have created five country objects (GB, US, AU, FR and ES) and linked each to a list of greetings: two greetings each for GB and US, and one each for the other countries.

Lines 10-11: We use a key to look up the 'Country' object for the default locale in the Hydrate context. We do this by creating an instance of a key object (the class for which has been generated) with the country code taken from the default locale as a parameter. This key can be resolved directly to a country object, assuming one has been loaded. This call looks only among the objects that have been loaded into memory by the previous query; it will never go out to the database itself.

Lines 13-15: We iterate through the greetings belonging to the default country and print each one to System.out.
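
To make the linking and de-duplication described for lines 4-8 more concrete, here is a minimal plain-Java sketch of the idea. It does not use Hydrate at all: the LinkingSketch class, its nested Country class and the build method are names invented for this illustration, and the rows are simply the (country, greeting) columns of the table above. Hydrate's own implementation may work quite differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: none of these types are Hydrate classes.
class LinkingSketch {

    static class Country {
        final String code;
        final List<String> greetings = new ArrayList<>();
        Country(String code) { this.code = code; }
    }

    // Builds one Country per distinct code and attaches each greeting to it.
    static Map<String, Country> build(String[][] rows) {
        Map<String, Country> byCode = new LinkedHashMap<>();
        for (String[] row : rows) {
            String countryCode = row[0];
            String text = row[1];
            // Reuse the country if its key has been seen before.
            Country country = byCode.computeIfAbsent(countryCode, Country::new);
            country.greetings.add(text);
        }
        return byCode;
    }

    // The (country, greeting) columns of the sample table.
    static String[][] sampleRows() {
        return new String[][] {
            {"GB", "Hello World"},
            {"GB", "Greetings Planet Earth"},
            {"US", "Hello World!"},
            {"US", "Howdy Y'all"},
            {"AU", "G'day Fellas"},
            {"FR", "Bonjour Le Monde"},
            {"ES", "Hola El Mundo"},
        };
    }

    public static void main(String[] args) {
        Map<String, Country> countries = build(sampleRows());
        for (Country c : countries.values()) {
            System.out.println(c.code + " -> " + c.greetings);
        }
    }
}
```

Seven rows yield five country entries, because the second GB row and the second US row resolve to countries that already exist.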

Summary

  • Hydrate permits you to lay an object model of your choice over existing relational data.

  • Classes are generated by the Hydrate framework to support the object model you have chosen.

  • You have full control over the SQL that runs against the database and how it is mapped.

  • Hydrate manages the mapping of data attributes and the resolution of relationships and duplicate objects as they are read from the database.

  • The data and relationships in the hydrated objects are accessed through regular javabean getter and setter methods.


UML and class diagrams

There is much polemic about the best form for describing an object data model: Java POJOs, Entity relationship diagrams, relational models and XML all have their strengths and weaknesses.

The Hydrate framework opts to use a limited version of the class diagram as defined in Unified Modelling Language (UML) as a starting point for its data modelling. From this, with one or two additional pieces of information, everything else flows through static code generation:

  • Java interfaces to be implemented by the Java bean objects that represent the model.

  • A basic Java implementation of the above interfaces suitable for most applications.

  • Classes to read and write the object graph from relational schemas.

  • Classes to read and write the object graph from declared XML schemas.

  • One possible database schema for the objects declared together with queries to read and write from it.

For the applications in which Hydrate excels, the quantity of handwritten code required is less than would be needed using POJOs. The code generation step is fast and has been streamlined through Ant, yet you retain great flexibility in how objects are populated with data, written to the database and generally manipulated.

Here is an example of a Hydrate definition file that defines two classes:

<Family name="HelloWorld" namespace="org.hydrateframework.sample.model">

  <Class name="Country" comment="A national identity">
    <Attribute name="code" type="String" size="2"
               comment="Short ISO code for country"/>
    <Attribute name="telCode" type="PositiveInteger"
               mandatory="false"
               comment="International dialling code for country"/>
    <Attribute name="name" type="String" size="30"/>
    <Key name="CountryKey" primary="true">
      <Member name="code"/>
    </Key>
    <Key name="CountryTelKey">
      <Member name="telCode"/>
    </Key>
  </Class>

  <Class name="Greeting" comment="A typical greeting in the given country">
    <Attribute name="text" type="String" size="80"
               comment="Greeting in a language of the country"/>
    <Attribute name="language" type="String" size="5"
               comment="Language or dialect of the greeting"/>
    <Key name="GreetingKey">
      <Member name="country"/>
      <Member name="language"/>
    </Key>
  </Class>

  <Relationship name="CountryGreeting">
    <Reference name="greetings" toObject="Greeting" multiplicity="0..*"
               navigable="true"/>
    <Reference name="country" toObject="Country" multiplicity="1" />
  </Relationship>

  <XMLSchema name="Greetings">
    <Element asMember="countries" class="Country" collection="true"
             failIfNotPopulated="+">
      <Element asMember="greetings" failIfNotPopulated="text"/>
    </Element>
  </XMLSchema>
</Family>

To generate the sample files, run the Ant build script in the HydrateSample directory. The code generation process creates a rather bewildering array of classes and interfaces, but for now it is worth concentrating on the Java bean interfaces that have been created.

For each object in the model, Hydrate writes a bean interface: a Java interface with the same name as the object it represents, which declares getter and setter methods for each of the attributes in your model. The type of each attribute matches the declared type, but note that native types are used where possible for performance and storage efficiency. References to other objects use the appropriate object class. Each Greeting object has one and only one Country in the model, so Greeting has a getter/setter pair that takes a Country reference. Meanwhile, the Country object, which refers to a list of Greetings, has a getter/setter pair for a collection of Greetings.

For the above model, two interfaces are created, one for the Country and one for the Greeting. The Country has getters and setters for its declared attributes:

    String getCode();
    void setCode(String x);

    String getTelCode();
    void setTelCode(String x);

    String getName();
    void setName(String x);

as well as a method for getting and setting the collection of Greetings:

    Collection<Greeting> getGreetings();
    void setGreetings(Collection<Greeting> x);

The Greeting object has similar getters and setters and a method for setting the country reference:

    String getLanguage();
    void setLanguage(String x);

    String getText();
    void setText(String x);

    Country getCountry();
    void setCountry(Country x);

This is pretty uncontroversial stuff.

If you are writing bean objects that are principally designed to store and provide access to data, there are not really many choices to make at the interface level. The Hydrate definition file is actually a pretty succinct way to represent this information and keep it up to date, so ceding control over the actual coding of your POJOs, or at least the interfaces that front them, is starting to pay dividends.

The interface also declares any other methods that you have declared in your model. These methods can be declared in the object definition file to have any specification, and this allows you to have your object support other interfaces that you have defined elsewhere to integrate better into your application.

You will also see that the object interfaces have an inner interface called Init. The framework uses this to distinguish between values being set by the framework and values being set by application code, so that it can mark objects as dirty and ensure that two-way relationships are maintained between objects if required.
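
One way to picture what maintaining a two-way relationship involves is a setter on the 'one' side that also updates the collection on the 'many' side. The sketch below is hand-written plain Java under that assumption; the TwoWaySketch class and its nested classes are invented for illustration and are not the classes or the Init interface that Hydrate generates.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Hand-written illustration; Hydrate's generated classes may look different.
class TwoWaySketch {

    static class Country {
        private final String code;
        private final Collection<Greeting> greetings = new ArrayList<>();
        Country(String code) { this.code = code; }
        String getCode() { return code; }
        // Read-only view: the collection is changed through Greeting.setCountry.
        Collection<Greeting> getGreetings() {
            return Collections.unmodifiableCollection(greetings);
        }
    }

    static class Greeting {
        private final String text;
        private Country country;
        Greeting(String text) { this.text = text; }
        String getText() { return text; }
        Country getCountry() { return country; }

        // Setting the 'one' side also updates the 'many' side.
        void setCountry(Country newCountry) {
            if (country != null) {
                country.greetings.remove(this);
            }
            country = newCountry;
            if (newCountry != null) {
                newCountry.greetings.add(this);
            }
        }
    }

    public static void main(String[] args) {
        Country gb = new Country("GB");
        Greeting hello = new Greeting("Hello World");
        hello.setCountry(gb);
        System.out.println(gb.getCode() + " has " + gb.getGreetings().size() + " greeting(s)");
    }
}
```

With this arrangement, application code only ever changes one end of the relationship and both ends stay consistent.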

Summary

  • Hydrate uses static code generation to represent an object model, which requires some initial buy-in but makes for fast, efficient code.

  • The starting point in Hydrate is a UML definition of the objects in the model.

  • It is not hard to write Java classes that support the generated interface files.


Visualization tool

Hydrate comes with a UML visualization tool to help understand the structure of your declared object model. Simply start the tool and open the object model you have defined. For example, here is the display for the above object model.

You get this by typing the following at the (linux) command prompt from the HydrateSample directory:

java -cp ../HydrateTools/bin:../Common/lib/commons-logging.jar \
org.hydrateframework.gui.HydrateUI src/test/HelloWorld.hyd

or for Windows:

java -cp ..\HydrateTools\bin;..\Common\lib\commons-logging.jar ^
org.hydrateframework.gui.HydrateUI src\test\HelloWorld.hyd

The tool also provides support for writing more complex mapping queries. There is more on this in the chapter on the Query Map Builder.

Summary

  • The Hydrate Visualization Tool is a convenient way of viewing a UML diagram relating to the object model you have specified in your XML declaration file.


Object Context, Request Context: Caches and Units of Work

Before using Hydrate objects, a few preparatory steps are necessary to set up an environment into which bean objects are placed and managed.

An ObjectContext can be thought of as a bucket into which objects managed by Hydrate are placed when created or read from the database. As such the ObjectContext provides methods to find objects by their keys and to iterate through collections of objects. The context is also the place which determines the factory that will be used by default to build each type of object and the queries that are used by default to read and write the objects from the database. Finally it provides a central point in which the modification state of the objects is held and a method 'saveAll' that allows all objects to be updated in the database.

An ObjectContext is often used to do one specific job that relates to a request for data or the amendment of a database. However, by leaving the ObjectContext in scope, you can use it as a cache for data over several operations.

The RequestContext is responsible for managing resources that are used within the scope of a small unit of work, such as servicing a request from a client. This object keeps track of and caches the queries that are used to read and write objects and the connections that are needed for access to the databases. Before using Hydrate in any way that implicitly needs any of these database resources, a request context must be set up and provided with any connections that it needs to get its job done.

By explicitly making the connections available we maintain maximum flexibility as to the source of the connections while avoiding deadlock conditions that can happen if more than one connection is needed to complete a request.

You must remember to clean up the request context when completing each request. This is typically done in a finally block, so that the cleanup is guaranteed no matter what the execution path of the request.

Typical usage pattern 1: simple standalone application. In a very simple standalone application which is performing a single function, you will probably want to set up the ObjectContext and a RequestContext at the point an application starts, then make them available to the rest of the application, perhaps even in the static scope.

public static ObjectContext ctx;

public static void main(String[] args) throws Exception {
    ctx = new ObjectContext(true);
    HelloWorld.configureContext(ctx);
    RequestContext rc = new RequestContext(ctx);

    // Create a connection and add it to the request context
    Class.forName("com.mysql.jdbc.Driver");
    Properties props = new java.util.Properties();
    Connection conn = java.sql.DriverManager.getConnection(
        "jdbc:mysql://localhost/school", props);
    rc.addConnection("jdbc/HelloWorld", conn);
    .
    .
    rc.releaseConnections();
}

Typical usage pattern 2: responding to server requests as a web application. In response to the request, set up and configure an ObjectContext, then a RequestContext, add all connections that will be used in the request to the RequestContext. Release connections in the RequestContext in a finally block after using Hydrate:

    ObjectContext ctx = new ObjectContext();
    MyFamily.configureContext(ctx);

    // Get a connection from JNDI - this code is for a tomcat server with a
    // configured connection pool jdbc/HelloWorld
    // (the JNDI context is named 'jndi' so that it does not clash with 'ctx')
    Context jndi = new InitialContext();
    DataSource ds = (DataSource)jndi.lookup("java:comp/env/jdbc/HelloWorld");
    Connection conn = ds.getConnection();

    // Create the request context, then make the connection available to Hydrate
    RequestContext rc = new RequestContext(ctx);
    rc.addConnection("jdbc/HelloWorld", conn);
    try {
        .
        .
    } finally {
        if (rc != null) {
            rc.releaseConnections();
        }
    }

Typical Usage Pattern 3: The above usage pattern can be amended by saving the reference to the ObjectContext between request calls. In this way, the ObjectContext is effectively being used as a cache since objects referenced by it will stay in memory. If you choose to use the ObjectContext as a cache, then you are responsible for maintaining cache concurrency (i.e. ensuring that your in-memory objects are in sync with the database).

Summary

  • There is a certain amount of context set-up that must be done before using Hydrate.

  • The Object Context is a bucket for Hydrate objects.

  • The Request Context manages per-request resources.


Populating objects, partial population and 'not set' values

Hydrate works most effectively when used as a window on your database. You describe a model through the UML definition language that is designed to overlay your data model. You then use the Hydrate infrastructure to pull into memory the cross-sections of this data that you need to answer a particular request. Suppose your database holds your entire inventory, amounting to several GB of information. You lay an object model over this data and then, to answer a particular request, you write a query or queries that pull into in-memory objects just those parts of the model that you need.

For example, suppose you need to look for all widgets with a fire safety code of 'AAA' or better that were delivered by ACME on a wet Wednesday last year. You can probably craft a query that filters on most of this information, but the 'wet Wednesday' clause could be tricky in SQL, particularly as weather information is not likely to be included in your stock database. You therefore write a query to pull back the deliveries made on each Wednesday last year. The query would include information about the supplier, the widget and other objects, but not all of the data for each object. You then perform the final validation check on the in-memory objects. You don't want to be forced to load all the attributes of the supplier just to check that the supplier was in fact 'ACME', nor all details of the delivery schedule just to check that a delivery was on a Wednesday.

In order to permit the partial population of data models, Hydrate supports the concept of 'not loaded' objects and attributes. An attribute or object reference that is tagged as 'not loaded' has simply not been read from the database (yet). Hydrate's ability to flag any attribute as 'not loaded' is a key feature of the architecture. It means that you can create fully legal representations of the object model in memory from a subset of the available data.

Hydrate also supports the concept of 'not set' values, and at this point it is worth highlighting the difference between these and 'not loaded' values. An attribute or reference is marked as 'not set' if it has been read from the database but semantically has no value in the data model. Your Widget object might have an attribute 'overseasReference', which is set where the widget has an alternate reference number but is 'not set' where it does not. This is generally represented as a null in a relational database. However, suppose you have populated your Widget object from the database using a query that does not request the 'overseasReference' field. That attribute would now be 'not loaded': it may actually be 'not set', but we don't know because we haven't read it from the database. Attributes can only be 'not set' if they are marked as optional (mandatory="false") in the UML definition.

This is a subtle but very important distinction and it is worth spending a couple of seconds considering the difference.

So what happens when you try to access an attribute that is not loaded? The behaviour is actually implementation-specific; that is, the implementation of the bean interface for each object can decide for itself how to handle this case. The recommended behaviour, and that taken by the default implementations of these interfaces, is to throw a 'NotLoadedException' as soon as any attempt is made to access a field that has not been loaded from the database.

Accessing a field that is 'not set' elicits different behaviour. No exception is thrown, but a 'special' value is returned in each case. This return value can be tested with the equality operator '==' against a static 'not set' value that is generated as part of each object's bean interface. For example, for the overseasReference attribute in the Widget object, you can set the value to, or test the returned value against, 'Widget.OVERSEASREFERENCE_NOTSET'.
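
The difference between the two states can be sketched in a few lines of plain Java. This is only an illustration of the behaviour described above, not Hydrate's generated code: the Widget class, the loadOverseasReference method and the local NotLoadedException are stand-ins written for this example, and the sentinel is assumed to be a distinct String instance so that '==' can recognise it.

```java
// Illustration only: these classes are stand-ins written for this example.
class AttributeStateSketch {

    static class NotLoadedException extends RuntimeException {
        NotLoadedException(String attribute) {
            super("Attribute not loaded: " + attribute);
        }
    }

    static class Widget {
        // Distinct String instance so that '==' can identify the sentinel.
        static final String OVERSEASREFERENCE_NOTSET = new String("<not set>");

        private boolean overseasReferenceLoaded = false;
        private String overseasReference;

        // Called when a query supplies the column; a database null means 'not set'.
        void loadOverseasReference(String valueFromDatabase) {
            overseasReference = (valueFromDatabase == null)
                    ? OVERSEASREFERENCE_NOTSET : valueFromDatabase;
            overseasReferenceLoaded = true;
        }

        String getOverseasReference() {
            if (!overseasReferenceLoaded) {
                throw new NotLoadedException("overseasReference");
            }
            return overseasReference;
        }
    }

    // Reports which of the three states the attribute is in.
    static String describe(Widget w) {
        try {
            String ref = w.getOverseasReference();
            return ref == Widget.OVERSEASREFERENCE_NOTSET ? "not set" : "set to " + ref;
        } catch (NotLoadedException e) {
            return "not loaded";
        }
    }

    public static void main(String[] args) {
        Widget w = new Widget();
        System.out.println(describe(w));   // never read from the database
        w.loadOverseasReference(null);
        System.out.println(describe(w));   // read, but the column was null
        w.loadOverseasReference("X-42");
        System.out.println(describe(w));   // read, with a value
    }
}
```

The three calls to describe report 'not loaded', 'not set' and 'set to X-42' in turn, which mirrors the distinction drawn above: 'not loaded' is about what has been read, 'not set' is about what the data means.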

Summary

  • Any attribute in Hydrate may have the value 'not loaded' indicating it has not been read from the database.

  • Any attribute in Hydrate that is not mandatory can have the value 'not set', indicating that semantically it has no value.


Mapping Queries - single object

The simplest case of mapping from a relational database to an object model is where only one object type is being populated from the underlying data. This simple case is actually quite useful in refreshing or fully populating the attributes for a single data object. The example also illustrates the steps that must be followed when doing any data population.

Mapping from a database query to an object is a three stage process:

  • Read the key of the object from the query results set. The fields that make up the chosen key are read from the results set row and a key constructed based on the information read.

  • Search for an existing object. Hydrate always searches the object context for an object with a matching key. It can optionally search and load a matching object from the default database. If no object is found, a new one is created and initialized with information from the key.

  • Read the remainder of the object from the current results set row. This step may be disabled if this is not a new object.

When reading attributes of the key or attributes of the object itself, Hydrate looks for columns in the returned result set that have names that match the corresponding field in the object. You can use the standard SQL syntax to rename columns in the query as a way of ensuring that they match the attributes in your object, or you can use a map to convert the names returned from the query to attribute names of your object.

The above three steps are repeated for each row returned in the query result set, and in this way an object is created for each row with a distinct key. If any of the keys in the returned result set are equal, the data from the rows with equal keys is assumed to refer to the same object and only one object is created by Hydrate.

Let's look at a code snippet that demonstrates the above to populate a Country object from a hypothetical database with a table containing countries. Country has three attributes, code (the ISO code of the country), telCode (the international telephone dialling code for the country), and name (the name of the country). Country has a primary key consisting of the attribute 'code'. Here's the Java code:

    // Prepare a statement to be run
    PreparedStatement stmt = conn.prepareStatement(
        "select isoCode code, name from Country");

    // Create an instance of the class that manages object creation
    // and tell it that we plan to create 'Country' objects.
    QueryRunner runner = new QueryRunner(ctx);
    runner.addAssembler(new CountryAssembler(ctx));

    // Run the statement and build objects.
    runner.build(stmt, null);

The above code demonstrates the population of a single object 'Country' from a database table containing the ISO code and name of the country. The ISO code attribute in the database table needed to be renamed so that it matches the name of the object attribute 'code'. Note also that this query is not actually requesting the 'telCode' attribute (perhaps it is not available in this database).

This code snippet will result in the creation of one Country object for each row in the returned results set with a unique isoCode value.

So if the following rows are returned:

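+------+--------------------------+
| code | name                     |
+------+--------------------------+
| GB   | United Kingdom           |
| US   | United States of America |
| ES   | Spain                    |
| FR   | France                   |
| US   | United States of America |
+------+--------------------------+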



Four objects would be created of type Country, one for GB, US, ES and FR. The second US row from the result set would not give rise to the creation of a new object, because the key matches an existing object. The telCode field of each of the created objects would be set to 'NotLoaded'. Any attempt to access this field would result in an exception being thrown, but the object is otherwise fully valid.

We now might run the following code snippet:

stmt = conn2.prepareStatement("select iso code, intTelDialCode telCode from DialingCodes");
runner = new QueryRunner(ctx);
runner.addAssembler(new CountryAssembler(ctx));
runner.build(stmt, null);

This might return the following rows:

+------+---------+
| code | telCode |
+------+---------+
| US   | 1       |
| GB   | 44      |
| IT   | 39      |
+------+---------+



We would now have a total of five objects in the context, all countries, with codes 'GB', 'US', 'ES', 'FR' and 'IT'. The first two have both name and telCode loaded with their correct values. The next two have name loaded, but telCode is not loaded (and would throw an exception if accessed). The last object's name attribute is not loaded, but the telCode attribute is available.
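
The three-stage process and the merge across two queries can be imitated in plain Java, which may help in following the example. In this sketch each object is just a map of attribute names to values, an absent attribute stands for 'not loaded', and the first column of every row is assumed to be the key 'code'. The MergeSketch and Context classes are invented for illustration; nothing here is Hydrate API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java imitation of the key / find-or-create / populate cycle.
class MergeSketch {

    // The "context": one attribute map per country code (the key).
    static final class Context {
        final Map<String, Map<String, Object>> countries = new LinkedHashMap<>();

        void build(String[] columns, Object[][] rows) {
            for (Object[] row : rows) {
                // Step 1: read the key; column 0 is assumed to be 'code'.
                String code = (String) row[0];
                // Step 2: find an existing object or create a new one.
                Map<String, Object> country =
                        countries.computeIfAbsent(code, k -> new LinkedHashMap<>());
                // Step 3: copy the columns of this row into the object.
                for (int i = 0; i < columns.length; i++) {
                    country.put(columns[i], row[i]);
                }
            }
        }
    }

    // Replays the two result sets shown in the tables above.
    static Context run() {
        Context ctx = new Context();
        ctx.build(new String[] {"code", "name"}, new Object[][] {
            {"GB", "United Kingdom"},
            {"US", "United States of America"},
            {"ES", "Spain"},
            {"FR", "France"},
            {"US", "United States of America"},
        });
        ctx.build(new String[] {"code", "telCode"}, new Object[][] {
            {"US", 1},
            {"GB", 44},
            {"IT", 39},
        });
        return ctx;
    }

    public static void main(String[] args) {
        for (Map<String, Object> country : run().countries.values()) {
            System.out.println(country);
        }
    }
}
```

Running it leaves five entries in the context. GB and US carry both name and telCode, ES and FR carry only name, and IT carries only telCode, matching the outcome described above.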

This example shows how you can use Hydrate to pull different pieces of information about the same object from different queries, and different data sources. This example is realistic in this usage of a second query to 'flesh out' additional information about an object already loaded. However, in the case of the original query, it is more normal to want to populate more than one object from each row of a query. Hydrate makes this very easy indeed and the ways to do it are discussed below in 'Mapping Queries - multiple object' and 'Mapping Queries - the query map builder'.

Summary

  • Hydrate builds objects using standard JDBC queries.

  • For each row in the query result, Hydrate builds the key, searches for an existing matching object and then populates the found object or a new object from the data in the result set row.

  • Hydrate knows which columns from the query results correspond to which attributes of the object by matching names.

  • You can build one object from multiple data sources, provided they share a common key.


More on Mapping Names

Sometimes, you just can't use SQL to do your name mapping for you. Examples of when this might occur are:

  • You are running stored procedures where you have no control over the names of the returned attributes.

  • You are using a dialect of SQL that does not support name mapping.

  • There are name length or identifier limitations in your SQL dialect that do not permit the correct name to be used.

  • There are name conflicts when creating more than one object type from a result set (see multiple objects, below).

The solution in this case is to use a column map for the query. A column map is simply an object supporting the java.util.Map interface that can map from the name of a column found in the query to the name of an attribute in your object.
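
As a rough illustration, a column map for the two country queries used earlier could be built as follows. Only the use of java.util.Map comes from the description above. The ColumnMapSketch class and the attributeFor helper are hypothetical, the pass-through of unmapped column names is an assumption, and how the map is handed to Hydrate is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of a column map; the helper method is hypothetical.
class ColumnMapSketch {

    // Column name as returned by the query -> attribute name in the object.
    static Map<String, String> countryColumnMap() {
        Map<String, String> map = new HashMap<>();
        map.put("isoCode", "code");
        map.put("intTelDialCode", "telCode");
        return map;
    }

    // Assumption for this sketch: unmapped columns keep their own name.
    static String attributeFor(Map<String, String> columnMap, String column) {
        return columnMap.getOrDefault(column, column);
    }

    public static void main(String[] args) {
        Map<String, String> map = countryColumnMap();
        for (String column : new String[] {"isoCode", "name", "intTelDialCode"}) {
            System.out.println(column + " -> " + attributeFor(map, column));
        }
    }
}
```

With such a map, the query can return its columns under their database names and no renaming in SQL is needed.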


How to Access the Objects Returned from the Query

So you've run your query and loaded objects into your ObjectContext bucket. You'll notice, however, that the method you've been calling, QueryRunner.build(...), doesn't return anything that would tell you which objects were created. Surely it should return an array of objects or something?

But what would it be an array of? All objects of all types that were created from that query? All objects from one of the types that were created from that query? All new objects (those that were not already in memory?)

In fact there are so many possibilities that we made the decision for it not to return anything. You have a few options:

Run the build method of the QueryRunner class and, when it has finished, call iterateObjects(...) on the ObjectContext to iterate over all objects of a particular type.

// Prepare a statement to be run
PreparedStatement stmt = conn.prepareStatement(
    "select isoCode code, name from Country");

// Create an instance of the class that manages object creation
// and tell it that we plan to create 'Country' objects.
QueryRunner runner = new QueryRunner(ctx);
runner.addAssembler(new CountryAssembler(ctx));

// Run the statement and build objects.
runner.build(stmt, null);

// Now iterate through the Country objects printing them out
for (Iterator iter = ctx.iterateObjects(
        Country.class, CountryKey.class); iter.hasNext(); ) {

    System.out.println("Country is: '" +
        iter.next().toString() + "'");
}

If you already have a reference to an object that is related to the objects you are interested in, you can call a method on that object to navigate to the object or objects you want. For example, suppose you have a Country object reference already and have just run a query that will load all Greeting objects for that Country. Assuming Country has been defined with a 'to-many' reference to Greeting, you can now call getGreetings() on Country and it will return a list of all the Greeting objects you have just loaded.

// Prepare a statement to be run
PreparedStatement stmt = conn.prepareStatement(
    "select country country_code, text from Greeting " +
    "where country=?");

// Create an instance of the class that manages object creation
// and tell it that we plan to create 'Greeting' objects.
QueryRunner runner = new QueryRunner(ctx);
runner.addAssembler(new GreetingAssembler(ctx, null, null));

// Run the statement and build objects - assume we have an
// object 'country' already set up.
stmt.setString(1, country.getCode());
runner.build(stmt, null);

for(Greeting greeting : country.getGreetings()) {
    System.out.println(greeting.getText());
}

You can pass in an Observer object when calling build. It will be called each time an object of any class is created or referenced from the query as it is running. This approach is particularly powerful as it permits you to carry out work, including accessing other databases, during the network latency implicit in your call to the database server. There is a useful Observer available as an inner class of the QueryRunner class that looks for and returns one and only one object from a query. An example is below. See the chapter on 'Doing Work as the Query is Running' for more information.

// Prepare a statement to be run. The filter uses the real column name,
// since a column alias cannot generally be used in a WHERE clause.
PreparedStatement stmt = conn.prepareStatement(
    "select isoCode code, name from Country where isoCode=?");

// Create an instance of the class that manages object creation
// and tell it that we plan to create 'Country' objects.
QueryRunner runner = new QueryRunner(ctx);
runner.addAssembler(new CountryAssembler(ctx));

// Run the statement and build objects.
stmt.setString(1, "US");
QueryRunner.GetSingleObject getCountry =
    new QueryRunner.GetSingleObject(Country.class);
runner.build(stmt, getCountry);

// Now print out the Country object.
System.out.println(getCountry.getObject().toString());

Summary

  • Created objects are not returned directly from the build method of QueryRunner

  • There are a number of different ways available to you to access the created objects.

  • You can start working on the created objects as soon as each one is read from the query results, by implementing an Observer.


The importance of Keys and object uniqueness. Object equality.

The discussion above touched on the importance of an object key in defining when an object is considered to be unique. Much of this discussion will be familiar to relational database designers, but let's back up a bit and define what we mean by uniqueness. An object is unique if one and only one instance of it can ever exist in a self-consistent representation of the model.

For example, in a database holding stock information, an instance of a Widget object should represent a Widget's stock level. It may have attributes such as name, description, quantity, purchase price, retail price, etc. The uniqueness criterion for a Widget in this case is that of a distinct kind of stock item that is held by the company. If we have 10 'Kerning Washers' in stock, there is one Widget object that represents them. The unique key for this object might be the stock code for the item (in fact stock codes are usually invented tags that have this purpose), though you could also use the name if it was always unique for each stock type.

In fact there are often cases where you have a number of alternate keys, either as a single attribute of the object or as multiple attributes. Hydrate fully supports the definition of multiple keys on each object and can read objects from the database using one key for one query and a different key for a different query against a different data source.

Consider now that the company is dealing in Widgets that have different conditions or qualities. In this case, there may be unique things about each individual stock item: its condition and its purchase and sale prices may all be different. We then want to extend the uniqueness criteria so that one Widget object relates to an individual physical object in the warehouse. The stock code is no longer sufficient as a uniqueness criterion; we need another field, such as a serial number or condition, to further refine the definition of uniqueness.

Note that in this case there may not be any field within the object that could be used to define uniqueness. If so, you can always 'invent' a new field that is assigned an arbitrary value each time a new object is created, and make the key equal to this generated attribute. Hydrate supports objects that have implicit generated keys and can defer to an underlying database to provide values for these keys.

Finally, consider that you may want to extend your data model further to include time series data on the retail prices of all your Widgets. Now your uniqueness criteria must also include a date or timestamp on which the sale price became valid. Since it is now valid for you to have two Widget objects in memory at the same time, each representing the same physical object, but at different points in time, you must further extend the key to include the date, or some version number, to take this into account.

Note that some of the attributes of keys may be references to other objects in the model. If you have designed your model such that a Widget object represents a single object in the warehouse, then you might also have a WidgetType object to record generic things about that type of Widget. The WidgetType could use the stock code as its key. The Widget object would have a mandatory reference to a WidgetType object and this same reference is actually part of the natural key of the Widget object.

The use of object references as part of the key represents a departure from relational modelling, since the relational model cannot contain references per se; rather, it would represent a reference to another object as the key fields for that object. In mapping to the relational model, Hydrate takes care of this mapping process.

In Hydrate, object keys are declared within the UML schema definition file. You may declare as many keys as you like, but all must agree on what the uniqueness criteria for the object are. For example, you could have one key that was based on the primary stock reference number, and another based on a supplier coupled with a supplier stock reference number. Yet another key could indicate an old stock reference number which is unique for each unique stock item, but may not always be set. One and only one of the keys you define may be designated the primary key.

Each key that you define for a Hydrate object must have a unique name and the generation process creates a Java class file representing each key you have defined in your model.

Having defined what we mean by uniqueness and looked at how defining the key goes hand in hand with defining the uniqueness criteria, we will now look at why this is important to Hydrate.

When Hydrate loads objects into memory from the database or from XML, it automatically detects duplicate objects and either discards them or refreshes the in-memory version of the object with the new details. It is not possible to have two objects loaded with the same key into the same object context, and therefore it is impossible to have two object instances representing the same unique object in the object model.

This 'canonicalization' of the object model is a key Hydrate feature and has a number of other ramifications:

Queries that return attributes from two different objects having a one-to-many relationship will repeat details of the object on the 'to-one' side of the relationship. Hydrate automatically resolves this and only creates one object.

Objects whose keys are equal will be identical, that is:

object1.getKey(null).equals(object2.getKey(null))

is equivalent to

object1 == object2
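The following sketch shows one common way to get this behaviour, usually called an identity map. It is plain Java written for illustration and is not Hydrate's implementation: Country and CountryContext are hypothetical classes, and the key is reduced to a single string. The rows imitate a one-to-many join in which the country repeats on every greeting row.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model class; not generated by Hydrate.
final class Country {
    final String code;

    Country(String code) {
        this.code = code;
    }
}

// A toy object context: at most one Country instance per key value.
final class CountryContext {
    private final Map<String, Country> byCode = new HashMap<>();

    // Find the object for this key, or create it if it is not yet loaded.
    Country findOrCreate(String code) {
        return byCode.computeIfAbsent(code, Country::new);
    }

    int size() {
        return byCode.size();
    }
}

final class CanonicalDemo {
    public static void main(String[] args) {
        CountryContext ctx = new CountryContext();
        // Rows as a one-to-many join would return them: the country repeats.
        String[][] rows = {
            {"GB", "Hello World"},
            {"GB", "Greetings Planet Earth"},
            {"FR", "Bonjour Le Monde"},
        };
        for (String[] row : rows) {
            ctx.findOrCreate(row[0]);
        }
        System.out.println(ctx.size());                                        // 2
        System.out.println(ctx.findOrCreate("GB") == ctx.findOrCreate("GB"));  // true
    }
}
```

Because every lookup for the same key returns the same instance, comparing keys and comparing references give the same answer within one context.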

Summary

  • Consideration of the uniqueness criteria of objects is an integral part of the object model design process

  • A key may consist of a single attribute, or multiple attributes. Attributes may be references to other objects.

  • An object may have more than one key provided all keys agree on the uniqueness criteria (or are null).

  • Hydrate resolves duplicate objects using the key.

  • You can always test for object equality with the '==' operator.


Object Keys and Assemblers

Object Keys are very important to assemblers. As we know from our discussion above, the object build process contains three steps: read the key from the result set row, find the object (or create it if not found), and populate the object with data from the result set row. We now know that an object may have more than one key, and this means that we can choose to build an object using something other than its primary key to define uniqueness.

This is a very powerful feature of Hydrate since it allows us to link disparate databases where a common key can be identified. Suppose we have two databases with Widget objects in them. One is an old database in which the Widgets have an old-style reference number, the other is a newer database with a new widget numbering scheme, but which also holds a reference to the old widget number for backward compatibility. We define the widget object as having two keys, one with the new reference, one with the old. <TODO> object example
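As a rough illustration of the two-key idea only, the sketch below indexes the same Widget instances under both references, so that a row from the old database and a row from the new database resolve to one object. It is plain Java, not Hydrate code: Widget, WidgetContext, the field names and the sample values are all hypothetical, and how Hydrate actually resolves alternate keys is not shown in this text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Widget with two alternate keys; either may be unset (null).
final class Widget {
    String newRef;
    String oldRef;
    String description;   // as read from the old database
    Integer quantity;     // as read from the new database
}

// Toy context that indexes the same Widget instances under both keys.
final class WidgetContext {
    private final Map<String, Widget> byNewRef = new HashMap<>();
    private final Map<String, Widget> byOldRef = new HashMap<>();

    // Build step for the old database: rows carry only the old reference.
    Widget fromOldDatabase(String oldRef, String description) {
        Widget w = byOldRef.computeIfAbsent(oldRef, k -> new Widget());
        w.oldRef = oldRef;
        w.description = description;
        return w;
    }

    // Build step for the new database: rows carry both references.
    Widget fromNewDatabase(String newRef, String oldRef, int quantity) {
        Widget w = byNewRef.get(newRef);
        if (w == null && oldRef != null) {
            w = byOldRef.get(oldRef);     // fall back to the alternate key
        }
        if (w == null) {
            w = new Widget();
        }
        w.newRef = newRef;
        w.quantity = quantity;
        byNewRef.put(newRef, w);
        if (oldRef != null) {
            w.oldRef = oldRef;
            byOldRef.put(oldRef, w);
        }
        return w;
    }
}

final class TwoKeyDemo {
    public static void main(String[] args) {
        WidgetContext ctx = new WidgetContext();
        Widget fromOld = ctx.fromOldDatabase("W-17", "Kerning Washer");
        Widget fromNew = ctx.fromNewDatabase("2001", "W-17", 10);
        System.out.println(fromOld == fromNew);   // true: one object, two keys
    }
}
```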


Mapping Queries - multiple objects

We've seen how to run a query to return a single type of object, but typically when you write queries to return data from a database, they refer to attributes that appear in many different kinds of object. In the simple example in the HelloWorld application, the query is returning two kinds of object, 'Country' and 'Greeting', but in real-world applications, you very often want to write queries that reference four or five different objects.

The point here is that you do not want to design your queries based on what your O/R mapping tool can handle. You will achieve far greater performance and flexibility if you write queries based on what works well at the database level, secure in the guarantee that you will be able to map the results to your chosen object graph. In fact, you may want to refactor your queries at a later stage and break one big query into a few smaller ones, or vice versa with a view to tuning performance or consolidating data access.

The Hydrate framework strives to meet these requirements by providing a flexible mapping strategy between a JDBC result set and an arbitrary object graph. It provides a set of tools for helping map the columns returned by a query to a set of objects and two complementary approaches to linking up objects built from a query.

Mapping data into more than one object from a single result set is basically very similar to the single object case. There are two aspects that make things a bit more complex.

There is a namespace conflict between attribute names in the several objects you are trying to build. In the case where two objects are expecting an attribute called 'code', how do you determine which column maps to which object's attribute in each case?

The objects that are built need to be linked up to the other objects that they are related to in the query. This needs to happen with as little fuss as possible.

Let's look at the HelloWorld example. Our query is returning a country code and a greeting text, and we'd like these fields to be mapped into the fields 'Country.code' and 'Greeting.text'. Furthermore, we'd like to ensure that the created instance of Country is linked to the created instance of Greeting according to their declared relationship.

First, let's look at the problem of mapping the attribute names. In this case, there is (fortunately) no overlap in the names of the attributes, so we can simply use SQL attribute renaming to ensure that the names line up:

// Prepare a statement to be run
PreparedStatement stmt = conn.prepareStatement(
"select country code, greeting text from Greetings");

See the chapter 'More on Mapping Names' for some tips and tools on mapping names from the query to the object attributes.

// Create an instance of the class that manages object creation
// and tell it that we plan to create 'Country' and 'Greeting' objects.
QueryRunner runner = new QueryRunner(ctx);
Assembler countryAssembler = new CountryAssembler(ctx);
Assembler greetingAssembler = new GreetingAssembler(ctx, null, null);
runner.addAssembler(countryAssembler);
runner.addAssembler(greetingAssembler);

We've added two assemblers to the query runner, one for each object type we expect to build. We still have not solved the second problem: the linking of objects. There are two complementary ways of doing this: implicit and explicit. The simplest is explicit. We link the two assemblers together with the addLink(...) method.

// establish the link between objects
countryAssembler.addLink("greetings", greetingAssembler);

// autoFixup (implicit) link resolution is the default, so switch it off here.
countryAssembler.setAutoFixup(false);
greetingAssembler.setAutoFixup(false);

We are then free to run our query as before. The resultant objects will be linked up when the query completes.

// Run the statement and build objects.
runner.build(stmt, null);

The explicit linking is more exact, and perhaps simpler to understand. As we shall see, it is also the only option for many-to-many relationships. However, implicit linkup may often prove more effective. In essence, implicit linkup works because the generated objects know that some of their attributes are actually references to other objects. The Greeting object has been generated to know that it has a link to a Country object, so if we ask it to populate the country, it will automatically look for (or create if not found) a Country object and link it to any Greeting object that maps this key.

Going back to our original query, we could also map the country code as the foreign key reference to country in the Greeting object. The name of any foreign key reference is the name of the reference ('country' in this case), concatenated with an underscore plus the name of the key field (or fields) of that object. If the key of the referenced object contains more than one field, there will be more than one field in the foreign key. If one or more of the key fields of the referenced object is itself an object, the rules apply recursively.
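The naming rule can be written down as a small recursive function. The sketch below is plain Java for illustration only: KeyField and ForeignKeyNames are hypothetical helpers, not Hydrate classes, and the recursive case (joining each level with a further underscore) is an assumed reading of "the rules apply recursively".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical description of one field of a key. If the field is itself a
// reference to another object, 'referencedKey' lists that object's key fields.
final class KeyField {
    final String name;
    final List<KeyField> referencedKey;

    KeyField(String name, KeyField... referencedKey) {
        this.name = name;
        this.referencedKey = Arrays.asList(referencedKey);
    }
}

final class ForeignKeyNames {
    // Column names for a reference: <reference>_<keyField>, applied recursively
    // when a key field is itself a reference (assumed interpretation).
    static List<String> columnsFor(String reference, List<KeyField> key) {
        List<String> out = new ArrayList<>();
        for (KeyField field : key) {
            String name = reference + "_" + field.name;
            if (field.referencedKey.isEmpty()) {
                out.add(name);
            } else {
                out.addAll(columnsFor(name, field.referencedKey));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Greeting has a reference 'country'; the key of Country is 'code'.
        List<KeyField> countryKey = Collections.singletonList(new KeyField("code"));
        System.out.println(columnsFor("country", countryKey));   // [country_code]
    }
}
```

A key with more than one field simply yields more than one column name, one per field.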

So in this case, we are looking for a field 'country_code' which is the foreign key of the country object in the greeting object. So we can now rewrite the query:

// Prepare a statement to be run
PreparedStatement stmt = conn.prepareStatement(
"select country country_code, greeting text from Greetings");

Now we only need to add the greeting assembler to the QueryRunner:

// Add assembler for greeting object
QueryRunner runner = new QueryRunner(ctx);
runner.addAssembler(new GreetingAssembler(ctx, null, null));

// Build objects
runner.build(stmt, null);

The greeting assembler sees that the foreign key for country has been included in the query and automatically looks for or creates a country object with just this key field populated. It then automatically links this object to the greeting object it just created.

The implicit object link-up is much more succinct, but needs some knowledge about the object references, the keys used in those references and some care about including the right attribute names. It is very useful for 'quick linkups' as in this example, but it can be difficult to write clear, transparent code using this approach. Also, you cannot use implicit linkup with many-to-many references, because the foreign keys are not part of objects that exist in the model.

In fact, the explicit link-up can also become quite opaque, as well as being verbose. For this reason, the Hydrate framework also supports the Query Map builder, which is part of the Hydrate GUI. This highly flexible and easy to use mapping tool is the subject of another chapter.

Summary

  • Mapping multiple object types in a query is an extension of mapping a single object type with the additional complexity of conflicting attribute namespaces and object linkup.

  • Use column maps to resolve conflicting attribute namespaces

  • You have two options for performing object linkup: implicit and explicit.

  • Implicit linkup uses a foreign key in one of the linked objects to find/create and link to the other object.

  • Explicit linkup uses declared relationships between assemblers to perform linkup.


Relationships - unidirectional and bidirectional.

When doing data modelling, the relationships between objects can be an unexpectedly hard thing to get your head around, because we generally think in very different ways about different kinds of relationships, that are represented in a very similar way in the data model.

For example, consider a person and their relationships. A person has a father, who is also a person. So a person also has a list of children. But is this one relationship or two? Certainly we may speak of these as different relationships, but they are not entirely independent in that, when a child establishes their fixed relationship with a parent (in a purely data modelling sense), the parent also acquires a child. One should not be allowed to happen without the other.

In Hydrate, this situation is represented as a relationship with two references (actually as we shall see all relationships have two references). One of the references is called 'father', references a person object and has a multiplicity of 1. The other reference is called children, references a person object and has a multiplicity of 0..*. Both references are navigable. Note it is more usual that a reference would be between two objects of different types, but this example serves to highlight why it is important to name the reference with something other than the name of the object it is referencing.

The effect of the relationship described above is to add additional attributes to the Person object. The Person object would acquire a 'father' attribute of type Person, so the interface would have the methods:

public Person getFather();
public void setFather(Person x);

The person object would also acquire a 'children' collection yielding the following methods:

public Collection<Person> getChildren();
public void setChildren(Collection<Person> x);

In Hydrate, the reference is declared outside of the objects which it connects. This is somewhat non-intuitive at first, and causes objects to acquire attributes that may be part of their key, but do not appear among the declared attributes. The approach was chosen since it reduces redundancy. When viewed in the graphical tool, the references are less confusing.

Now consider the case where a person has one of a dynamic enumerated set of 'status' types. This may take values such as Single, Married, Divorced, Widowed, etc. One way to represent this is by creating an object type that represents the status of a person, say LegalStatus, and associate it with the Person object. There is therefore a relationship between Person and LegalStatus where a person has one and only one LegalStatus, but more than one person could have the same Legal Status. Note that Hydrate also provides the concept of compiled enumerations for enumeration values that are likely to be reasonably static.

Is this one reference or two? In fact although the relationship more strongly belongs in the Person object, it is a two way relationship like the previous example. The Person has a Legal Status, but the Legal Status also has a list of people (who have that status). That said, it would be a rare application that actually needed to be able to enumerate the list of people with a single status in a highly efficient way.

In this case, we would probably not make the Person reference accessible from LegalStatus and this is achieved by not making the reference navigable. In cases like this, the reference to the enumerated value usually takes the name of the object it references, legalStatus in this case. The reverse reference name does not matter much, because it is not navigable, but the name chosen must be unique. You could call it revLegalStatus, if only Person objects have a legal status. The name peopleWithStatus, might be a better alternative to avoid name clashes. Hydrate would generate an attribute legalStatus in the Person object with getter and setter as follows:

public LegalStatus getLegalStatus();
public void setLegalStatus(LegalStatus x);

If you are wondering what harm there is in making the reference to the Person from LegalStatus navigable as well, there is a cost. Apart from adding confusing functionality to the public interface to the bean and requiring the generation of an additional collection class and object linkup code, making a 'to many' relationship navigable requires a set to be kept up to date for each such collection. The overhead of maintaining this, particularly for references where few objects map to many objects, is significant.

Note that although the reference is tied more strongly to Person, it is still declared outside of the Person object and even though the reverse reference is not navigable, it still needs to be declared as a reference.

A third example looks at impersonal collection references. Consider a Person object having a collection of Reminder objects. Each Reminder records something that the person needs to do and a due date. In the application that uses this data, reminders are only ever accessed through a person. In this case we have a one-way relationship to a collection. It is not hard to see that there is actually a relationship in the other direction, but we might not need to make it navigable. If the relationship to the Reminder object was called 'reminders' and had a multiplicity of 0..*, the implicit reverse reference might be known as 'person' and would have a multiplicity of 1. The first reference is navigable and the second is not. Hydrate would generate the following code:

public Collection<Reminder> getReminders();
public void setReminders(Collection<Reminder> x);

If you are wondering what harm there is in making the reference from Reminder to Person navigable as well, there is actually very little cost in doing so. This relationship will generally have to be maintained 'behind the scenes' anyway to ensure that database updates work, so the only overhead is potential 'pollution' of the public interface with a method that is never called.

Finally, it is worth saying that many-to-many relationships are also fully supported by the framework. As with all other relationships, they are declared outside of the objects they refer to and result in collection attributes being added to those objects. These relationships still have two references, both of which must be declared and named, but either of which may be non-navigable. As with any 'to many' relationship, there is overhead in making it navigable, so do not do so unless you plan to use the functionality in your application. Pure many-to-many relationships generally need additional thought when mapping to and from a relational model because they are typically represented in the relational world through a 'linking table'. Hydrate has the ability to map data both from and to such linking tables.

Summary

  • Relationships always have two references, although with some relationship types, it is sometimes quite hard to think what the reverse reference means (or think of a name for it).

  • Each reference may be either a mandatory single, optional single or multiple reference.

  • Relationships can usually be thought of as belonging more strongly to one or other of the objects involved in the relationship. This can be represented in the model through ownership.

  • Making a single-object reference navigable has no overhead in terms of data storage or performance. Making a collection navigable generally carries a significant overhead in that an index has to be maintained for each collection.


Relationships - Reverse Hook-up

In the previous chapter we saw that relationships are always bidirectional and often are navigable in both directions. The implication of this is that there is a connection between the two references of a relationship such that when one changes there is an implied change to the other. Consider our HelloWorld application. The Greeting object has a reference to a country but if the model is to remain self-consistent, this means that the country's list of Greeting objects must contain the Greeting that references it.

Now assume that we have three objects, two countries: france and germany, and a greeting: greeting. Assume that the greeting's country reference is initially empty.

We call: greeting.setCountry(france). In order to maintain consistency in the object model, the implementation of this method should implicitly add greeting to the list of greetings belonging to france. Following this reverse hook-up, the method france.getGreetings().contains(greeting) will return true.

Now, we call greeting.setCountry(germany). Again we need to retain consistency in the object model. This time, we need to un-hook the relationship to france, by removing greeting from france's list of greetings, and then adding greeting to germany's list of greetings. Following this reverse hook-up, the method france.getGreetings().contains(greeting) will return false, and germany.getGreetings().contains(greeting) will return true.

Of course we could have updated the relationship from the other direction. The call france.getGreetings().add(greeting) will add greeting to france's list of greetings, but implies that greeting.getCountry() should now return 'france'. Likewise, if we subsequently call germany.getGreetings().add(greeting), the reverse hook-up process will change the country reference on 'greeting' from france to germany.

The implementation of this reverse hook-up functionality is the responsibility of the bean interface implementor, and is included in the <Object>Impl implementation of the bean interfaces that are produced by the code generator. There are also Collection implementations that are generated as inner classes of the <Object>Helper class that are designed to perform the reverse hook-up for the collections that back the collection-based references.

Reverse relationship hook-up is quite a fiddly and error-prone undertaking in hand-coded classes. The exact sequence of actions is different for one-to-one, one-to-many and many-to-many relationships and the Hydrate framework takes away the chore and the risk associated with writing and thoroughly testing this code.
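To give a feel for what such code involves, here is a minimal hand-written sketch of the one-to-many case for Country and Greeting. It is not the code Hydrate generates: the classes are simplified stand-ins, and instead of a hook-up-aware collection behind getGreetings().add(...), this sketch exposes a read-only view and a hypothetical addGreeting method that routes through setCountry.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class Country {
    private final Set<Greeting> greetings = new LinkedHashSet<>();

    // Read-only view; changes go through addGreeting or Greeting.setCountry.
    Set<Greeting> getGreetings() {
        return Collections.unmodifiableSet(greetings);
    }

    // Updating the 'many' side delegates to the 'one' side, which does the work.
    void addGreeting(Greeting g) {
        g.setCountry(this);
    }

    // Package-private hooks intended only for Greeting.setCountry.
    void link(Greeting g) {
        greetings.add(g);
    }

    void unlink(Greeting g) {
        greetings.remove(g);
    }
}

final class Greeting {
    private Country country;

    Country getCountry() {
        return country;
    }

    void setCountry(Country newCountry) {
        if (country == newCountry) {
            return;                      // nothing to change
        }
        if (country != null) {
            country.unlink(this);        // un-hook from the old country
        }
        country = newCountry;
        if (newCountry != null) {
            newCountry.link(this);       // hook up to the new country
        }
    }
}

final class HookupDemo {
    public static void main(String[] args) {
        Country france = new Country();
        Country germany = new Country();
        Greeting greeting = new Greeting();

        greeting.setCountry(france);
        System.out.println(france.getGreetings().contains(greeting));   // true

        greeting.setCountry(germany);
        System.out.println(france.getGreetings().contains(greeting));   // false
        System.out.println(germany.getGreetings().contains(greeting));  // true
    }
}
```

Keeping all the bookkeeping in one method is one way to avoid the two sides drifting apart; one-to-one and many-to-many relationships need different sequences, as noted above.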

Summary

  • The existence of two-way relationships means that making a change to one side of the relationship may imply a change that has to be made to the other side in order to keep the model self consistent.

  • Hydrate handles this reverse hook-up functionality automatically in its generated implementations of the bean interfaces.


Mapping Queries - the query map builder

In previous chapters we have seen how to map simple queries to your object model. We looked at how we could map single or multiple JDBC queries to single or multiple objects. We saw that the mapping of result set columns to attributes of an object is determined by the name of the column and that the object references were linked up either implicitly or explicitly. The examples seen so far are fine for smaller applications and queries that map a few objects each with a few attributes, but this approach creates problems for a larger application with more objects and the need to better manage the queries used to access the database.

The JDBC approach demonstrated in previous chapters has a number of specific drawbacks:

  • It encourages the use of SQL embedded inline in the java application.

  • The mapping of the result set to the objects is determined in java code which makes assumptions about the queries being run.

  • For complex queries and mappings, it is not very clear from the code what is being mapped and how.

To address these issues, it is strongly recommended that each non-trivial mapping be embedded in its own class that encapsulates the SQL and the mapping code, creating a black box around the SQL with constraints as inputs and constructed objects as outputs. Hydrate provides a flexible tool designed to allow you to do this with queries declared in an XML meta-language.

The Query Map Builder is a tool that greatly simplifies the process of mapping complex queries. The tool permits a developer to write and execute a query, look at the results and, using a table made up of the actual results of running the query, map each column returned to an attribute in an object. It then allows the developer to determine which of the objects created from the query should be linked to which other of those objects.

The screenshot below shows the Query Map Builder screen having mapped our HelloWorld query into the now familiar Country and Greeting objects.

The SQL query has been typed in at the box at the top, after selecting the data source to use and a name that will be used to reference the query:

SELECT * from Greeting

Clicking on the 'View Results' button immediately runs the query and places the first 7 lines of the result in the table at the bottom of the screen.

The developer has then gone to the 'Object' column and selected the Greeting object from the list. The 'Key' column has been left blank indicating that we don't want to use a key for the greeting object and that all lines returned from the query should give us a new greeting object.

Ignoring (for now, but see Advanced usage - discriminators) the 'Condition' and 'Class' columns, the developer now has to map the fields that can be seen to have been returned from the live query to attributes belonging to the Greeting object. This is again done using a drop-down list to select one attribute at a time. Don't forget that you don't need to map all columns of the query nor provide mappings for all attributes of the object. Unmapped columns will be ignored, and unmapped attributes will be marked unloaded in the model, which is fine as long as you don't try to use them.

The developer has done the same for the 'Country' object, though the country object does have a key which determines its uniqueness: 'CountryKey'. The code field of country is mapped from the query.

Finally, the developer must provide the link between the two objects; the query builder always uses explicit links. At the end of the list of columns retrieved from the query there are three additional columns marked Link1 to Link3. These columns also contain drop-downs that select which object is to be linked to. You do not have to provide links in both directions. The object you choose to link from is up to you, but don't do both! Doing so will cause read performance to suffer as the link has to be resolved twice.

The developer has chosen to link Country to Greeting, by selecting Greeting at line 1 (greetings) from the drop-down. In other words, each Country object will be linked to the Greeting object at line 1 via the reference called 'greetings'.

You can save your query by clicking 'Ok' and then Save All from the File menu. The query is saved down as an XML file that stores the query to be run together with the column map you have just described. If the query was called GreetingQuery, it will be saved in a file called GreetingQuery.xml.

When you are ready to use the query in code, you load the query into your java application through its xml definition file and run it in the normal way:

// Load the query from its xml definition
QueryMap stmt = new QueryMap("GreetingQuery.xml");

// Run the query
QueryRunner runner = new QueryRunner(ctx);
runner.build(stmt, null);

Note that the query maps are picked up automatically from your definition file. There is no need to add any assemblers to the query runner, nor do anything about linking up objects. The code now makes no direct reference to database entities: neither table nor attribute names, nor how to map these into the objects in the model. All of this business logic is encapsulated in the XML definition file.

This approach, apart from permitting a more transparent mapping of columns to objects through the graphical display of the mapping, also creates an abstraction layer between database and code and provides a convenient point for regression testing all of the query logic in an application. In fact, it is envisaged that the majority of non-trivial applications will make exclusive use of the Query Map tool to map data into objects. We would expect embedded SQL to have applications only in very simple data retrieval situations in small or tactical applications, or in situations where a dynamic query, whose text and column maps are built in code, is required to meet performance requirements.

Having said that, a good deal of flexibility that is usually implemented through dynamic queries is available through parameterized where clauses in the Query Map. This is the subject of the next chapter.

Summary

  • Embedded JDBC queries have drawbacks from an application design and code transparency standpoint.

  • The Query Map Builder provides a more transparent and easy to use tool for mapping queries to objects and presents an important abstraction layer from the database.

  • Most non-trivial applications are likely to use Query Maps more heavily than the more direct mapping strategies described in earlier chapters.


Mapping Queries - writing parameterized queries

The example of a Query Map given in the previous chapter was for a query that returned all rows from the query i.e. that had no constraints. It is rarely the case that your queries will be of this form and unless the querying mechanism supports settable parameters, the strategy is likely to result in an explosion of queries performing very similar functions.

With prepared statements in JDBC, you have a limited ability to set parameters: '?' placeholders are replaced by values according to the index of the placeholder. This approach has its drawbacks, though, since the motivation for exposing this behaviour is really to give the developer API access through JDBC to the precompiled statements supported by many native database APIs.

The Query Map takes an approach that retains the performance of prepared statements, while providing a name-based interface to the settable constraints and permits any constraint to be excluded from the query where clause altogether, simply by excluding the parameter or parameters that are part of that constraint.

Query Map parameters are supported using a somewhat crude, but highly effective, where-expression constructor based on Boolean Reverse Polish Notation (RPN). Before looking at how parameters are established in the Query Map Builder, let's have a quick review of Reverse Polish Notation, specifically as applied to Boolean expressions. If RPN is entirely new to you, there is a fine article from HP at http://www.hp.com/calculators/articles/rpn.html describing its benefits.

RPN is a way of writing expressions without using parentheses or needing to know operator precedence rules. RPN expressions consist of values (or atomic expressions) and operators and are read in order (from the top down as shown in the Query Map display). The evaluator moves down the token list until it finds an operator (which in the case of a boolean expression is an AND, OR or NOT keyword). When found, it applies that keyword to the value (in the case of NOT) or two values (for AND and OR) that it previously passed over on the stack. The result of this operation is another value that is pushed back onto the stack. The evaluator continues with the next token until there is only one token left on the stack, which is the result of the calculation.

The principal reason that RPN is used in the Query Map is that you can evaluate the expression while ignoring any atomic expressions, simply by excluding them from the calculation. That is, whenever an atomic expression is encountered that has not been given a value, instead of performing the above work, the expression is excluded from the calculation. This simple device permits the same query to be used with a wide range of different filter criteria.
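The following is a minimal sketch of that idea, not Hydrate's actual implementation. The class RpnWhereSketch is hypothetical. It walks an RPN token list with a stack and drops any atomic expression whose parameters have not all been set. An atomic expression without parameters is always kept, and the exact parenthesization Hydrate produces may differ.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a guess at the technique, not Hydrate code.
class RpnWhereSketch {
    private static final Pattern MARKER = Pattern.compile("\\[(\\w+):[^\\]]+\\]");

    static String build(List<String> tokens, Set<String> setParams) {
        // An empty Optional on the stack stands for an excluded expression.
        Deque<Optional<String>> stack = new ArrayDeque<>();
        for (String token : tokens) {
            if (token.equals("AND") || token.equals("OR")) {
                Optional<String> right = stack.pop();
                Optional<String> left = stack.pop();
                if (left.isPresent() && right.isPresent()) {
                    stack.push(Optional.of(
                            "(" + left.get() + " " + token + " " + right.get() + ")"));
                } else {
                    // At most one side survived: pass it through without the operator.
                    stack.push(left.isPresent() ? left : right);
                }
            } else if (token.equals("NOT")) {
                stack.push(stack.pop().map(e -> "(NOT " + e + ")"));
            } else {
                stack.push(atom(token, setParams));
            }
        }
        return stack.pop().orElse("");
    }

    // Keeps an atomic expression only if every parameter in it has a value.
    private static Optional<String> atom(String expr, Set<String> setParams) {
        Matcher m = MARKER.matcher(expr);
        while (m.find()) {
            if (!setParams.contains(m.group(1))) {
                return Optional.empty();
            }
        }
        return Optional.of(MARKER.matcher(expr).replaceAll("?"));
    }

    public static void main(String[] args) {
        List<String> rpn = Arrays.asList(
                "country=[country:String]",
                "text like [greetingLike:String]",
                "AND",
                "text=[greeting:String]",
                "OR");
        Set<String> set = new HashSet<>(Arrays.asList("country", "greetingLike"));
        System.out.println(build(rpn, set)); // prints (country=? AND text like ?)
    }
}
```

Because each operator only ever looks at the two entries on top of the stack, a missing operand needs no special grammar: the operator passes the surviving side through unchanged.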

At runtime, the where expression is built dynamically based on the set of parameters that have been set on the query. This whole expression is substituted in the SQL expression in place of a placeholder which is always written as 'WHERE1'. This use of a text placeholder is admittedly crude, but works really well, especially where the where clause is in a subquery, or the query includes dialect-specific grammar that would be difficult to parse with a proper SQL parser.
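The text substitution itself can be sketched as a plain string replacement. The class WherePlaceholderSketch below is hypothetical. In particular, dropping the WHERE keyword altogether when no parameter is set is an assumption about how the framework behaves, not something taken from the Hydrate source.

```java
// Hypothetical helper, not a Hydrate class.
class WherePlaceholderSketch {
    static final String PLACEHOLDER = "WHERE1";

    // Replaces the placeholder with a complete where clause, or with nothing
    // when no expression survived (assumed behaviour).
    static String substitute(String sqlTemplate, String whereExpression) {
        String clause = whereExpression.isEmpty() ? "" : "WHERE " + whereExpression;
        return sqlTemplate.replace(PLACEHOLDER, clause).trim();
    }

    public static void main(String[] args) {
        System.out.println(substitute("SELECT * from Greeting WHERE1", "country=?"));
        // The placeholder may equally sit inside a subquery:
        System.out.println(substitute(
                "SELECT * from Country WHERE code in (SELECT country from Greeting WHERE1)",
                "text like ?"));
    }
}
```

Since nothing but the placeholder is touched, the rest of the SQL is never parsed and can use any dialect.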

For example, in our HelloWorld query, you might want to extend our simple query to include the ability to select by country, by country and approximate greeting text (using an SQL pattern) or by exact greeting text only. First you need to update the query to provide the placeholder where the where clause will be written:

SELECT * from Greeting WHERE1

Now you need to write the reverse polish notation expression. This might be written as:

Reverse Polish Expression

country=[country:String]

text like [greetingLike:String]

AND

text=[greeting:String]

OR


You might want to try this in the Hydrate UI, since the SQL that is executed depends on which of the above parameters has been set. The UI allows you to set any combination of the parameters and quickly see the resulting output.

For example, if you set just the country parameter, the SQL generated looks something like:

SELECT * from Greeting WHERE country=?

This is a prepared statement and the country parameter you have supplied will be provided as parameter #1 when the statement is run. Try different values of the parameter, clicking View Results each time, to see the effect.

If now, we also set the greetingLike parameter and click 'View Results', the SQL written transforms itself thus:

SELECT * from Greeting WHERE (country=? AND text like ?)

Note that in providing the parameter with a value, the expression appears, linked to the rest of the where clause with its attached operator. Now, try un-setting the country parameter: select all of the text in the country parameter input box and press delete. Then press 'View Results'. The query is now as follows:

SELECT * from Greeting WHERE text like ?

Any expression can be dropped out of the query and the where clause will be adjusted to only include the expressions that have values. Likewise all expressions can be included and the full where clause will appear, although sometimes this may not make much sense. In the above case, the full where clause would be:

SELECT * from Greeting WHERE ((country=? AND text like ?) OR text=?)

One apparent problem with this where clause substitution is that you can't put a normal where clause in the query that you write. As it turns out, this is not such an issue. There are two main cases where you would need to do this. The first is in writing join conditions for a query that accesses multiple tables. The best approach for join clauses is to use the ANSI-compliant join syntax, supported by most database engines, which expresses the join condition in the FROM clause:

SELECT c.name, o.orderQty from Customer c LEFT OUTER JOIN Order o 
ON c.id = o.custId WHERE1

The second case is where you truly have constant expressions in your where clause that need to be mixed with the dynamic parameterized expressions. You can always achieve the desired results in this case by mixing constant Boolean expressions with your parameterized expressions in the RPN where clause.

When you are ready to use your queries in your application, setting parameters couldn't be easier. Here is our previous example from the Hello World application with a parameter:

// Load the query from its xml definition and set a parameter
QueryMap stmt = new QueryMap("GreetingQuery.xml");
stmt.setString("country", "US");

// Run the query
QueryRunner runner = new QueryRunner(ctx);
runner.build(stmt, null);

Summary

  • The Query Map builder provides for parameterized where clauses that significantly improve the potential for query reuse.

  • The where clause set in the query map builder uses Boolean Reverse Polish Notation to specify the clause. Unused expressions are dropped from the where clause.

  • Parameters are set in code using an extension of the PreparedStatement interface that permits the setting of named parameters.

  • Constant expressions can be included in the where clause.


Parameters in Query Map Expressions

This chapter looks more closely at a couple of ways that parameters can be used in where clause expressions.

You will have seen that a parameter generally takes the form [name:type], where 'name' is the name of the parameter with which it is known in application code, and type is the type of the parameter. The following types are supported:

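+-----------+----------------+---------------------------+
| Type      | Set Param Call | String Format             |
+-----------+----------------+---------------------------+
| int       | setInt         | <integer>                 |
| double    | setDouble      | <floating point>          |
| String    | setString      | <any text>                |
| Date      | setObject      | yyyy-MM-dd                |
| Time      | setObject      | HH:mm:ss.SSS              |
| Timestamp | setObject      | yyyy-MM-dd'T'HH:mm:ss.SSS |
| byte[]    | setObject      | <hexadecimal>             |
+-----------+----------------+---------------------------+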



Arrays are also supported and are indicated by one of the above types suffixed by an open/close parenthesis e.g. for a parameter that is an array of strings [names:String()]. Array parameters are very useful in squeezing better performance out of select queries through the use of 'in' clauses. You could write your where clause expression as follows:

Reverse Polish Expression

id=[id:int]

id in [ids:int()]

AND



This allows you to specify either a single id in a select query by specifying the 'id' parameter, or multiple ids by specifying the 'ids' parameter and passing in an array of integers as the query parameter. It doesn't make any sense to provide values for both 'id' and 'ids'.
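A common way to run an 'in' clause through a plain JDBC prepared statement is to expand it into one '?' placeholder per array element. Whether Hydrate does exactly this internally is an assumption. The class InClauseSketch below is hypothetical and only illustrates the expansion.

```java
import java.util.Collections;

// Hypothetical helper, not a Hydrate class.
class InClauseSketch {
    // Builds "column in (?, ?, ...)" with one placeholder per value.
    static String expand(String column, int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("an 'in' clause needs at least one value");
        }
        String placeholders = String.join(", ", Collections.nCopies(valueCount, "?"));
        return column + " in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        int[] ids = {3, 5, 8};
        System.out.println(expand("id", ids.length)); // prints id in (?, ?, ?)
    }
}
```

Each array element would then be bound to its own placeholder in order, so a single round trip can fetch many rows.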

A single expression in the where clause can actually contain several clauses and several parameters. You could write a where clause that selects the Widgets in your warehouse by their received date:

Reverse Polish Expression

receivedDate=[receivedOnOrAfter:Date]

receivedDate >= [receivedOnOrAfter:Date] AND receivedDate < [receivedBefore:Date]

OR



In this case you would either provide a value for 'receivedOnOrAfter' to get a list of all Widgets received on a particular date, or you could set both 'receivedOnOrAfter' and 'receivedBefore' to get all widgets received in a range of dates. This example highlights the fact that if an expression contains two parameters, both must have a value for the expression to be included in the query. In the first case with only 'receivedOnOrAfter' set, the query clause would read:

WHERE receivedDate=?

In the second case, with both parameters set, it would read:

WHERE (receivedDate=? OR (receivedDate >= ? AND receivedDate < ?))

Summary

  • Parameters in a query are typed and each parameter can be used in multiple expressions.

  • Each expression can contain multiple parameters. It will only be included in the query where clause if all parameters are specified.


Mapping Queries advanced usage - discriminators, recursive references

We have seen how to load objects from an SQL query so that each row of the query is mapped to an object. However, there are times when you want to control this mapping so that some rows are not mapped to an object or one of a set of possible objects is chosen for instantiation. This behaviour is particularly important when you are mapping different members of a single object hierarchy from a single query and depending on a field or combination of fields in the query one and only one of a set of possible concrete classes should be created.


Extending Bean Interfaces - methods and interface extensions

So far we have looked in some detail into how we can control the attribute accessor methods of the bean interface, but what if you want functionality beyond the basic getters and setters afforded by the standard bean mapping? Hydrate also permits you to include any arbitrary method in the bean interface. These method declarations are defined in the XML definition file of the model, for example:

    <Class name="Country" comment="A national identity">
<Attribute name="code" type="String" size="2"
comment="Short ISO code for country"/>
<Attribute name="telCode" type="PositiveInteger" mandatory="false"
comment="International dialling code for country"/>
<Attribute name="name" type="String" size="30"/>
<Key name="CountryKey" primary="true">
<Member name="code"/>
</Key>
<Method name="getGreetingInLanguage" type="Greeting"
throws="org.hydrateframework.sample.NoSuchLanguageException">
<Parameter name="language" type="String"/>
</Method>
</Class>

This example shows how we can add a declared method to the Country object called getGreetingInLanguage which takes a single String parameter. The implementation might look for a greeting among the list of greetings belonging to this country with a language that matches the given language. We have declared that the method can throw an exception and this is reflected in the declaration of this method.

The code generator unfortunately cannot write the implementation of the method. This task is left to the developer. To ensure that the implementation is called correctly by the code-generated implementation as well as by any other implementations, it is placed in a static method with the same parameters plus a parameter giving the object reference. This static method can be found in the 'Util' class in the util package, whose name is the class name suffixed with 'Util'. Any code that you write into this method will be preserved between code regeneration steps.
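To make the pattern concrete, here is a rough sketch of what a hand-completed static method might look like for getGreetingInLanguage. The interfaces below are simplified stand-ins for the generated bean interfaces. The getLanguage accessor, the SketchGreeting class, the exception class body and the position of the object reference in the parameter list are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the generated bean interfaces (assumed shapes).
interface Greeting {
    String getLanguage();
    String getText();
}

interface Country {
    List<Greeting> getGreetings();
}

class NoSuchLanguageException extends Exception {
    NoSuchLanguageException(String language) {
        super("No greeting for language: " + language);
    }
}

// Trivial Greeting implementation used only to exercise the sketch.
class SketchGreeting implements Greeting {
    private final String language;
    private final String text;

    SketchGreeting(String language, String text) {
        this.language = language;
        this.text = text;
    }

    public String getLanguage() { return language; }
    public String getText() { return text; }
}

// The static 'Util' method: the declared parameters plus the object reference.
class CountryUtil {
    static Greeting getGreetingInLanguage(Country self, String language)
            throws NoSuchLanguageException {
        for (Greeting greeting : self.getGreetings()) {
            if (greeting.getLanguage().equals(language)) {
                return greeting;
            }
        }
        throw new NoSuchLanguageException(language);
    }

    public static void main(String[] args) throws NoSuchLanguageException {
        Greeting french = new SketchGreeting("French", "Bonjour Le Monde");
        Country fr = () -> Arrays.asList(french);
        System.out.println(getGreetingInLanguage(fr, "French").getText());
    }
}
```

Because the method is static and receives the object explicitly, any implementation of the bean interface can delegate to it, not only the generated one.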

Finally, you may need your generated class to support an interface that is part of your own application domain. The methods of the interface, including any getXxx or setXxx methods, must match the generated method names but, with a bit of practice, it is always possible to arrange this.

Summary

  • You can add any Java method to your objects and a method will be generated for you in the bean interface.

  • Any generated Java methods will have stub definitions created in a separate static 'util' method. Any code written inside this method will not be overwritten by the code generator.


Types: Native, Built-in, Extended and Enumerations

So far we have looked at some very simple examples of data models using only string types as attributes. For Hydrate to be an effective tool, it must support a wider range of types in a way that is both flexible and extensible. The following table lists the built-in types supported by Hydrate:

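+-----------------+----------------------------------------------------------------------+------------------+-----------------------+---------------+
| Name            | Description                                                          | Java Type        | XML Type              | Database Type |
+-----------------+----------------------------------------------------------------------+------------------+-----------------------+---------------+
| String          | Textual content                                                      | String           | xs:string             | VARCHAR(size) |
| PositiveInteger | Integer number (may not be negative)                                 | int              | xs:nonNegativeInteger | INTEGER       |
| Integer         | Integer (negative or positive)                                       | int              | xs:integer            | INTEGER       |
| PositiveDouble  | Floating point number (may not be negative)                          | double           | xs:decimal            | NUMERIC       |
| Real            | Floating point number (positive or negative)                         | double           | xs:decimal            | NUMERIC       |
| Date            | A date value (with no time)                                          | Date             | xs:date               | DATE          |
| Time            | A time value (with no date)                                          | Date             | xs:time               | TIME          |
| Timestamp       | A date/time combined value                                           | Date             | xs:dateTime           | DATETIME      |
| Boolean         | A yes/no value                                                       | boolean          | xs:boolean            | CHAR          |
| Blob            | Binary data                                                          | byte[]           | xs:hexBinary          | BLOB          |
| Void            | no data (used for method return)                                     | void             | n/a                   | n/a           |
| <enumeration>   | Enumerated values (see below)                                        | <type>           | <type>                | <type>        |
| <class>         | User supplied class                                                  | <class>          | xs:string             | VARCHAR(size) |
| <model object>  | Another class in the model (used for method/parameter return values) | <interface bean> | n/a                   | n/a           |
+-----------------+----------------------------------------------------------------------+------------------+-----------------------+---------------+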



To use any of the above types, simply supply its name in the 'type' attribute of the attribute definition in your UML class definition file. Note that some types (String and Blob) need an additional size parameter to be specified in the attribute definition to determine how big to make the database field. There may be some database-specific limitations on this size. For example:

<Class name="Widget">
<Attribute name="stockId" type="PositiveInteger" comment="unique stock id"/>
<Attribute name="name" type="String" size="255" comment="description"/>
<Attribute name="price" type="PositiveDouble" comment="sell price of item"/>
<Method name="isInStock" type="Boolean"/>
</Class>

The type shown in the table labelled '<enumeration>' represents a Hydrate feature that enables you to declare in your UML definition file an enumerated type. Suppose for example you are compiling a database of people, and you need to store the 'marital status' of the individual. Marital status can have values of 'Single', 'Married', 'Divorced', 'Widowed', and rather than storing the text values you would like to just store the first letter of each state. You could create a separate class in your model called MaritalStatus, thus:

<Class name="MaritalStatus">
<Attribute name="code" type="String" size="1"/>
<Attribute name="displayValue" type="String" size="50"/>
<Key name="MaritalStatusKey" primary="true">
<Member name="code"/>
</Key>
</Class>

Having done this, you can create a reference to the Person object thus:

<Relationship name="PersonStatus">
<Reference name="status" toObject="MaritalStatus"
multiplicity="1" navigable="true"/>
<Reference name="people" toObject="Person"
multiplicity="0..*" navigable="false"/>
</Relationship>

This is a perfectly good way of doing enumerations and the strength of doing things this way is that it permits you to add new enumerated types into your model without the need to do an application deploy. However, the downside is that it complicates the model and adds a bunch of extra classes into the generated class directory. In the case of marital status, we are unlikely to be able to add a new marital status type without wanting to make code changes to the application anyway, so is there a lighter-weight way of doing these enumerations?

Hydrate provides the concept of an enumeration which is defined in the UML definition file, thus:

<Enumeration name="MaritalStatus" type="String" size="1">
<Option code="S" display="Single"/>
<Option code="M" display="Married"/>
<Option code="D" display="Divorced"/>
<Option code="W" display="Widowed"/>
</Enumeration>

We can then reference this enumerated type directly by name in the attribute declaration:

<Attribute name="maritalStatus" type="MaritalStatus"/>

Behind the scenes, Hydrate generates a class file that relates to the enumerated type and creates static instances of that class that represent each of the possible values, as well as a special 'not set' value. In code, you can refer to these static values as members of the enumeration class whose names match the display name of each option, with illegal Java characters replaced by '_'.

For example:

person.setMaritalStatus(MaritalStatus.Single);
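The general shape of such a class can be sketched as follows. This is a guess at the pattern (a class with a private constructor and static instances), not the code Hydrate actually generates. The name MaritalStatusSketch, the fromCode lookup, the NotSet member name and the toIdentifier helper are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in; the real generated class may look different.
final class MaritalStatusSketch {
    // Declared first so it exists before the static instances register themselves.
    private static final Map<String, MaritalStatusSketch> BY_CODE = new LinkedHashMap<>();

    public static final MaritalStatusSketch NotSet = new MaritalStatusSketch(null, "");
    public static final MaritalStatusSketch Single = new MaritalStatusSketch("S", "Single");
    public static final MaritalStatusSketch Married = new MaritalStatusSketch("M", "Married");
    public static final MaritalStatusSketch Divorced = new MaritalStatusSketch("D", "Divorced");
    public static final MaritalStatusSketch Widowed = new MaritalStatusSketch("W", "Widowed");

    private final String code;
    private final String display;

    private MaritalStatusSketch(String code, String display) {
        this.code = code;
        this.display = display;
        if (code != null) {
            BY_CODE.put(code, this);
        }
    }

    public String getCode() { return code; }
    public String getDisplay() { return display; }

    // Maps a stored code back to its instance; unknown codes give NotSet.
    static MaritalStatusSketch fromCode(String code) {
        MaritalStatusSketch found = BY_CODE.get(code);
        return found != null ? found : NotSet;
    }

    // Derives a member name from a display name by replacing characters
    // that are not legal in a Java identifier with '_'.
    static String toIdentifier(String displayName) {
        StringBuilder sb = new StringBuilder();
        for (char c : displayName.toCharArray()) {
            sb.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromCode("M").getDisplay());
        System.out.println(toIdentifier("Civil Partnership"));
    }
}
```

Only the one-character code needs to be stored in the database column, while application code works with the named instances.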

In the default database schema, Hydrate will also write a table that represents this enumeration and populate it with the data you have described in the definition file.

It is also possible to use your own classes as attribute types for Hydrate. However, each class that you use must implement the toString method in an unambiguous way and provide a constructor taking a String parameter that parses the result of the toString method back into the object. In Java, you access such objects as instances of your own class. When read from, or written to, the database, they are read and written using the string representation. For example, if you have a Java class called 'DistanceType':

package com.mydomain.types;

public class DistanceType {

public double distance;

public DistanceType(String distance) {
try {
if (distance.endsWith("km")) {
this.distance = Double.parseDouble(
distance.substring(0, distance.length()-2));
return;

} else if (distance.endsWith("miles")) {
this.distance = Double.parseDouble(
distance.substring(0, distance.length()-5)) * 1.6093;
return;
}

} catch (Exception e) {
throw new IllegalArgumentException(e);
}
throw new IllegalArgumentException(distance);
}

public String toString() {
return Double.toString(distance) + "km";
}
}

in the XML definition file you could write:

<Attribute name="distance" type="com.mydomain.types.DistanceType"/>

This feature gives you some ability to easily extend the range of types supported by the bean methods into your own application's domain. However, the database representation of this data is less than ideal, so use this feature sparingly.

You can also provide full support for any user defined type that can be represented in a single database column, by implementing the org.hydrateframework.gen.schema.type interface. This is an advanced feature, and you should consult the javadoc for information on how to override this interface. Once done, you specify the full class name of your 'Type' implementation as the type of the attribute in your class definition.

This feature is probably most useful for overriding some of the behaviour of an existing type, for example changing the decision about the database representation of a type. To do this, override the inner class: org.hydrateframework.gen.schema.Types.<type name>

Summary

  • Hydrate provides a selection of built-in types that can cover the requirements of most applications.

  • Enumerated types are useful as an efficient and easy-to-use way of declaring enumerations that are quite static.

  • You can use your own java classes as types in Hydrate (with certain limitations).

  • You can extend the possible built in types by implementing an interface or extending from an existing class.


Object factories

So far in the explanation of Hydrate operation we've glossed over the subject of concrete classes. Every time we've created an object, we've referenced it in terms of its interface bean. From the Hello World application:

Locale loc = Locale.getDefault();
Country country = new CountryKey(loc.getCountry()).findCountry(ctx);

for (Greeting greeting : country.getGreetings()) {
System.out.println(greeting.getText());
}

The above code makes no reference to concrete classes and why should it - since the interface beans provide all the functionality that is needed from the object model. However, you may want more control over the concrete classes that are instantiated and the queries that are used by default to read and write them from the database. This is where the object factories come in.

An object factory encapsulates a number of decisions involved in the object reading/writing and creating process:

  • The concrete implementation of a bean interface to create when reading objects from the database.

  • The name of the default connection as registered with the RequestContext to use when reading and writing to the default database.

  • The object assembler to use by default when copying information into the object from the database and vice versa.

  • The NamedParameterStatement query to use by default when reading and writing from and to the database.

  • A proxy class to use when concrete classes do not implement HydrateObject. (no reason to change from default).

An object factory for each object that you have declared in your model is automatically registered with the object context (when you call MyModel.prepareContext(ctx)). You can choose to override the decision made by Hydrate by setting your own factory implementation on the context. Here is an example for the Hello World application. You might provide an implementation for the Country bean interface like this:

package com.mydomain.example;

public class MyCountryImpl implements Country {

// implementations must have a constructor that takes an ObjectContext
public MyCountryImpl(ObjectContext ctx) {
}

public String getCode() {
.
.
}

public void setCode(String x) {
.
.
}
.
.
}

You then need to tell the object context to use this class instead of the generated class:

ctx.setFactory(Country.class, new ObjectFactory(ctx,Country.class,
com.mydomain.example.MyCountryImpl.class, CountryAssembler.class,
MyModel.CountryQuery.class, "jdbc/MyModel", CountryHelper.Proxy.class));

This line is in effect telling Hydrate that whenever it needs to create a concrete instance of the Country bean interface, it should use the concrete class com.mydomain.example.MyCountryImpl.
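The mechanism behind this kind of override can be illustrated with a much-simplified registry. None of the names below are Hydrate API: SketchContext, setImplementation and create are hypothetical, and the real ObjectFactory carries more information (assembler, query, connection name, proxy). The sketch only shows how a context can map a bean interface to a concrete class whose constructor takes the context.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, much-simplified stand-in for an object context.
class SketchContext {
    private final Map<Class<?>, Class<?>> implementations = new HashMap<>();

    // Registers (or replaces) the concrete class used for a bean interface.
    <T> void setImplementation(Class<T> beanInterface, Class<? extends T> impl) {
        implementations.put(beanInterface, impl);
    }

    // Creates an instance through the constructor that takes the context.
    <T> T create(Class<T> beanInterface) {
        Class<?> impl = implementations.get(beanInterface);
        if (impl == null) {
            throw new IllegalStateException(
                    "No implementation registered for " + beanInterface.getName());
        }
        try {
            Object instance =
                    impl.getDeclaredConstructor(SketchContext.class).newInstance(this);
            return beanInterface.cast(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + impl.getName(), e);
        }
    }
}

// Simplified stand-in for the Country bean interface.
interface Country {
    String getCode();
    void setCode(String code);
}

// Plays the role of the generated default implementation.
class GeneratedCountry implements Country {
    private String code;
    public GeneratedCountry(SketchContext ctx) { }
    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }
}

// Plays the role of the developer-supplied implementation.
class MyCountryImpl implements Country {
    private String code;
    public MyCountryImpl(SketchContext ctx) { }
    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }
}

class FactoryOverrideDemo {
    public static void main(String[] args) {
        SketchContext ctx = new SketchContext();
        ctx.setImplementation(Country.class, GeneratedCountry.class); // default
        ctx.setImplementation(Country.class, MyCountryImpl.class);    // override
        System.out.println(ctx.create(Country.class).getClass().getSimpleName());
    }
}
```

Code that asks the context for a Country never names the concrete class, which is why the override needs no changes elsewhere in the application.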

The ObjectFactory also determines a number of things about where and how information is read and written from the default database. Don't forget that here we are talking about the default database that is read from and written to implicitly. You have independent control over which database or databases information is read from to populate your object model, since this is controlled on a query-by-query basis. In other words, you write your query to go against a particular connection or data source, which does not have to be the same as the default data source. In this way, you also control how objects are populated by choosing the individual SQL queries that will run and how the results will map back to your object graph.

The default database and query referred to by the object factory are those used when you write information back to the database and when loading an object implicitly from the database using the object factory methods such as loadObject(...).

You might want to change the default database read/write strategy so that you read and write different parts of your model to different databases. Also, if you want to support updates to any database schema other than the one generated for you as part of the code generation process, you'll need to provide custom queries for read/write to the default database. Suppose you want to write changed greetings in the Hello World application back to the same database schema you read them from. You would need to create a new version of the read/write query for this object and establish it as the default query by replacing the default object factory:

ctx.setFactory(Country.class, new ObjectFactory(ctx,Country.class,
CountryImpl.class, CountryAssembler.class,
MyCountryQuery.class, "jdbc/MyModel", CountryHelper.Proxy.class));

The topic of how Hydrate writes to the database and how to customize what it is doing is covered in more detail in the chapter on writing to the database.

Summary

  • The object factory controls default behaviour of the framework, including which concrete classes to use and which database to access when reading and writing objects from the database and when creating new objects.

  • You can override any of these default behaviours to gain programmatic control over these decisions.


The generated implementations - why you would want to use them and how to extend them.

The generated implementations of the interface bean classes have already been mentioned quite a few times. These provide a minimal, but fully functional set of concrete classes that shadow the interfaces that make up the model. The context is initialized with factories that specify these concrete classes as the classes to use in implementing the model.

So what goes into these implementations? Here is a quick list:

  • Implementation of the main <ClassName> interface: Implementation of getters and setters of <ClassName> that store state in internal variables typed appropriately (using native types where possible) to the declaration of the attribute. This is compulsory for all valid implementations of the <ClassName> bean interface.

  • Key rehashing: any time any attribute that makes up part of any key changes, you need to recalculate the key and then adjust the object's position in that key's index. This is compulsory for all valid implementations that will read from or write updates to the database.

  • Reverse reference hook-up: code to ensure that when references are updated the reverse reference is updated appropriately. This applies both to single object references and collections. This is highly advisable for all implementations that will read from, and compulsory for all that write to a database.

  • Keeping track of object dirty state: any time you change an attribute or a reference, you need to keep track of the object's dirty state. This is compulsory if you are doing database updates.

  • Implementation of the <ClassName>.Init interface: Implementation of these methods to permit the internal state to be updated without keeping track of the dirty state of the object or doing reverse reference hook-up. You really need to implement this interface if you are going to do any database updates.

  • The generated class supports the HydrateObject interface and supports: save, writeAttributesXML, attributeType, isFieldPopulated, getImplementedIfc, forget and getKey. You don't need to implement this interface, but your code will be more efficient for reading from and writing to the database and XML if you do.
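Three of the responsibilities listed above (key rehashing, dirty-state tracking and the separate 'Init' style of update) can be sketched in a small hand-written class. This is not generated Hydrate code: TrackedCountrySketch, its method names and the plain Map used as a key index are hypothetical. Keeping the key index up to date during an init call is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration, not a generated Hydrate class.
class TrackedCountrySketch {
    private final Map<String, TrackedCountrySketch> keyIndex; // stand-in for a key index
    private String code; // key attribute
    private String name;
    private boolean dirty;

    TrackedCountrySketch(Map<String, TrackedCountrySketch> keyIndex) {
        this.keyIndex = keyIndex;
    }

    // Application-facing setter for a key attribute: rehash and mark dirty.
    void setCode(String newCode) {
        if (Objects.equals(code, newCode)) {
            return;
        }
        rehash(newCode);
        dirty = true;
    }

    // Application-facing setter for a plain attribute: mark dirty on change.
    void setName(String newName) {
        if (Objects.equals(name, newName)) {
            return;
        }
        name = newName;
        dirty = true;
    }

    // 'Init' style updates, as used while loading: state changes, dirty flag does not.
    void initCode(String loadedCode) { rehash(loadedCode); }
    void initName(String loadedName) { name = loadedName; }

    // Moves this object to its new position in the key index.
    private void rehash(String newCode) {
        if (code != null) {
            keyIndex.remove(code);
        }
        code = newCode;
        if (newCode != null) {
            keyIndex.put(newCode, this);
        }
    }

    String getCode() { return code; }
    String getName() { return name; }
    boolean isDirty() { return dirty; }
    void markClean() { dirty = false; }

    public static void main(String[] args) {
        Map<String, TrackedCountrySketch> index = new HashMap<>();
        TrackedCountrySketch country = new TrackedCountrySketch(index);
        country.initCode("GB");               // loaded from the database: not dirty
        country.setName("United Kingdom");    // changed by the application: dirty
        System.out.println(country.isDirty());
    }
}
```

A hand-written implementation that skips any of these steps still satisfies the bean interface, but saving and key-based lookup can no longer rely on it.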

As you can see, pretty much all of the work done by the generated code is quite important for one or other of the features supported by Hydrate. You are free to provide your own implementation that just does the basics, but many of the features of the Hydrate framework may be compromised.

One of the more controversial aspects of the generated objects is that they derive from an object (specifically AbstractHydrateObject) that is part of the HydrateFramework. There is understandable resistance to this from experienced developers because Java permits only one class inheritance hierarchy and it seems a bit greedy that Hydrate should grab that privilege for itself. It also creates a hard dependency between the Java objects that make up the model and the Hydrate framework. Some work has already gone into relaxing this restriction and allowing objects that do not implement the HydrateObject interface to be used and for developers to specify the base class of these implementations, although this is not yet fully part of the product. That said, only hard-core developers need ever be aware of which concrete classes are used to implement the bean interfaces: most developers will only interact with the model through interfaces which do not have any Hydrate dependencies.

Method implementations present another difficulty for the code generation approach. The code generator clearly cannot write the method implementations itself, but any changes the developer makes to the generated files will be overwritten the next time the code generation step is run. This problem has been addressed by creating a 'Util' class (in a util sub-package) for each bean interface that contains static methods that provide implementations for each of the methods that have been declared as part of the public interface of the object. So if you've declared a method 'double getProfitMargin()' in object Widget, the ...util.WidgetUtil class will have a static method: double getProfitMargin(Widget o) { ... }.

The implementation of these static methods will initially be empty and it is the responsibility of the developer to fill them out, but having done so, that code is protected from being overwritten on subsequent code generation steps. This is because the code generator for the Util classes has been specially coded so that it does not overwrite the internals of any of these static methods that have been coded.

These Util classes are also different because, unlike any of the other generated classes, you'll probably want to check them into source control, and write tests around them. For this reason, you would normally tell the code generator to write these classes to a different