Enterprise Ehcache for Hibernate

Introduction

Enterprise Ehcache for Hibernate provides a flexible and powerful second-level cache solution for boosting the performance of Hibernate applications.

Installing Enterprise Ehcache for Hibernate

Step 1: Requirements

  • JDK 1.6 or greater
  • Hibernate 3.2.5, 3.2.6, 3.2.7, 3.3.1, or 3.3.2. Use the same version of Hibernate throughout the cluster. Sharing Hibernate regions between different versions of Hibernate is not supported.
  • Terracotta 4.2 package

Step 2: Install and Update the JAR files

For guaranteed compatibility, use the JAR files included with the Terracotta kit you are installing. Mixing with older components may cause errors or unexpected behavior.

To install the distributed cache in your application, add the following JAR files to your application's classpath:

  • ${TERRACOTTA_HOME}/ehcache/lib/ehcache-terracotta-ee-<version>.jar

    <version> is the current version of the Ehcache-Terracotta JAR.

  • ${TERRACOTTA_HOME}/ehcache/lib/ehcache-core-ee-<ehcache-version>.jar

    The Ehcache core libraries, where <ehcache-version> is the current version of Ehcache (2.4.3 or higher).

  • ${TERRACOTTA_HOME}/ehcache/lib/slf4j-<slf4j-version>.jar

    The SLF4J logging facade allows Ehcache to bind to any supported logger used by your application. Binding JARs for popular logging options are available from the SLF4J project. For convenience, the binding JAR for java.util.logging is provided in ${TERRACOTTA_HOME}/ehcache (see below).

  • ${TERRACOTTA_HOME}/ehcache/lib/slf4j-jdk14-<slf4j-version>.jar

    An SLF4J binding JAR for use with the standard java.util.logging.

  • ${TERRACOTTA_HOME}/common/terracotta-toolkit-<API>-runtime-ee-<version>.jar

    The Terracotta Toolkit JAR contains the Terracotta client libraries. <API> refers to the Terracotta Toolkit API version. <version> is the current version of the Terracotta Toolkit JAR.

If you are using the open-source edition of the Terracotta kit, no JAR files will have "-ee-" as part of their name.

If you are using a WAR file, add these JAR files to the WEB-INF/lib directory.

NOTE: Application Servers
Most application servers (or web containers) should work with this installation of the Terracotta Distributed Cache. However, note the following:

  • GlassFish – You must add the following to `domains.xml`: `-Dcom.sun.enterprise.server.ss.ASQuickStartup=false`
  • WebLogic – You must use the supported version of WebLogic. If using version 10.3, you must remove the xml-apis from `WEB-INF/lib` and add the following to `WEB-INF/weblogic.xml`:

    <weblogic-web-app>
     <container-descriptor>
       <prefer-web-inf-classes>true</prefer-web-inf-classes>
     </container-descriptor>
    </weblogic-web-app>

  • JBoss 5.x – PermGen memory must be at least 128MB and can be set using the switch `-XX:MaxPermSize=128m`.

Step 3: Prepare Your Application for Caching

Hibernate entities that should be cached must be marked in one of the following ways:

  • Using the @Cache annotation.
  • Using the <cache> element of a class or collection mapping file (hbm.xml file).
  • Using the <class-cache> (or <collection-cache>) element in the Hibernate XML configuration file (hibernate.cfg.xml by default).

For more information on configuring Hibernate, including configuring collections for caching, see the Hibernate documentation.

In addition, you must specify a concurrency strategy for each cached entity. The following cache concurrency strategies are supported:

  • READ_ONLY
  • READ_WRITE
  • NONSTRICT_READ_WRITE
  • TRANSACTIONAL

    Transactional caches are supported with Ehcache 2.0 or later. See Setting Up Transactional Caches for more information on configuring a transactional cache.

See Cache Concurrency Strategies for more information on selecting a cache concurrency strategy.

Using @Cache

Add the @Cache annotation to all entities in your application code that should be cached:

@Cache(usage=CacheConcurrencyStrategy.READ_WRITE)
public class Foo {...}

@Cache must set the cache concurrency strategy for the entity, which in the example above is READ_WRITE.

Using the <cache> Element

In the Hibernate mapping file (hbm.xml file) for the target entity, set caching for the entity using the <cache> element:

<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
       "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
       "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping package="com.my.package">
 <class name="Foo" table="BAR">
   <cache usage="read-write"/>
   <id name="id" column="BAR_ID">
     <generator class="native"/>
   </id>
   <!-- Some properties go here. -->
 </class>
</hibernate-mapping>

Use the usage attribute to specify the concurrency strategy.

Using the <class-cache> Element

In hibernate.cfg.xml, set caching for an entity by using <class-cache>, a subelement of the <session-factory> element:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
   "-//Hibernate/Hibernate Configuration DTD//EN"
   "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">

<hibernate-configuration>

 <session-factory name="java:some/name">
   <!-- Properties go here. -->

   <!-- mapping files -->
   <mapping resource="com/my/package/Foo.hbm.xml"/>

   <!-- cache settings -->
   <class-cache class="com.my.package.Foo" usage="read-write"/>
 </session-factory>
</hibernate-configuration>

Use the usage attribute to specify the concurrency strategy.

Step 4: Edit Configuration Files

You must edit the Hibernate configuration file to enable and specify the second-level cache provider. You must also edit the Enterprise Ehcache for Hibernate configuration file to configure caching for the Hibernate entities that will be cached and to enable Terracotta clustering.

Hibernate Configuration File

For Hibernate 4.x, add the following to your hibernate.cfg.xml:

<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.region.factory_class">
   org.hibernate.cache.ehcache.EhCacheRegionFactory </property>

For Hibernate 3.3, you can improve performance by substituting a factory class for the provider class used in previous versions of Hibernate. Add the following to your hibernate.cfg.xml file:

<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.region.factory_class">
   net.sf.ehcache.hibernate.EhCacheRegionFactory</property>

For Hibernate 3.2, which cannot use the factory class, add the following to your hibernate.cfg.xml file:

<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.provider_class">
   net.sf.ehcache.hibernate.EhCacheProvider</property>

TIP: Singletons
To use a singleton version of the provider or factory class, substitute net.sf.ehcache.hibernate.SingletonEhCacheProvider or net.sf.ehcache.hibernate.SingletonEhCacheRegionFactory. Singleton CacheManagers are simpler to access and use, and can be helpful in less complex setups where only one configuration is required. Note that a singleton CacheManager should not be used in setups requiring multiple configuration resources or involving multiple instances of Hibernate.

TIP: Spring Users
If you are configuring Hibernate using a Spring context file, you can enable and set the second-level cache provider using values in the hibernateProperties property in the bean definition for the session factory:

    hibernate.cache.use_second_level_cache=true
    hibernate.cache.region.factory_class=net.sf.ehcache.hibernate.EhCacheRegionFactory

For Hibernate 4, use org.hibernate.cache.ehcache.EhCacheRegionFactory instead of net.sf.ehcache.hibernate.EhCacheRegionFactory.

Enterprise Ehcache Configuration File

Create a basic Ehcache configuration file, ehcache.xml by default:

<?xml version="1.0" encoding="UTF-8"?>
<ehcache name="Foo"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="ehcache.xsd">
     <defaultCache
       maxElementsInMemory="0"
       eternal="false"
       timeToIdleSeconds="1200"
       timeToLiveSeconds="1200">
         <terracotta />
     </defaultCache>
     <terracottaConfig url="localhost:9510" />
</ehcache>

This defaultCache configuration includes Terracotta clustering. The Terracotta client must load its configuration from a file or from a Terracotta server. The value of the <terracottaConfig /> element's url attribute should contain either a path to the file or the address and DSO port (9510 by default) of a server. In the example value, "localhost:9510" means that the Terracotta server is on the local host. If the Terracotta configuration source changes later, the url value must be updated to match.

TIP: Terracotta Clients and Servers
In a Terracotta cluster, the application server is also known as the client.

ehcache.xml must be on your application's classpath. If you are using a WAR file, add the Ehcache configuration file to WEB-INF/classes or to a JAR file that is included in WEB-INF/lib.

Specifying Caches for Hibernate Entities

Using an Ehcache configuration file with only a defaultCache configuration means that every cached Hibernate entity is cached with the settings of that defaultCache. You can create specific cache configurations for Hibernate entities using <cache> elements.

For example, add the following <cache> block to ehcache.xml to cache a Hibernate entity that has been configured for caching (see Step 3: Prepare Your Application for Caching):

<cache name="com.my.package.Foo" maxElementsInMemory="1000"
      maxElementsOnDisk="10000" eternal="false" timeToIdleSeconds="3600"
      timeToLiveSeconds="0" memoryStoreEvictionPolicy="LFU">
 <!-- Adding the element <terracotta /> turns on Terracotta clustering for the cache Foo. -->
 <terracotta />
</cache>

Expiration Settings

You can edit the expiration settings in the defaultCache and any other caches that you configure in ehcache.xml to better fit your application's requirements. See Expiration Parameters for more information.

Step 5: Start Your Application with the Cache

You must start both your application and a Terracotta server.

  1. Start the Terracotta server with the following command:

    UNIX/Linux

    [PROMPT] ${TERRACOTTA_HOME}/bin/start-tc-server.sh &
    

    Microsoft Windows

    [PROMPT] ${TERRACOTTA_HOME}\bin\start-tc-server.bat
    
  2. Start your application.

    Your application should now be running with the Terracotta second-level cache.

  3. Start the Terracotta Developer Console. To view the cluster along with the cache, run the following command:

    UNIX/Linux

    [PROMPT] ${TERRACOTTA_HOME}/bin/dev-console.sh &
    

    Microsoft Windows

    [PROMPT] ${TERRACOTTA_HOME}\bin\dev-console.bat
    
  4. On the console's initial panel, click Connect....
  5. In the cluster navigation tree, navigate to Terracotta cluster > My application > Hibernate.

    Hibernate and second-level cache statistics, as well as other visibility and control panels, should be available.

Step 6: Edit the Terracotta Configuration

This step shows you how to run clients and servers on separate machines and add failover (High Availability). You will expand the Terracotta cluster and add High Availability by doing the following:

  • Moving the Terracotta server to its own machine
  • Creating a cluster with multiple Terracotta servers
  • Creating multiple application nodes

These tasks bring your cluster closer to a production architecture.

Procedure:

  1. Shut down the Terracotta cluster.
  2. Create a Terracotta configuration file called tc-config.xml with contents similar to the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- All content copyright Terracotta, Inc., unless otherwise indicated. 
     All rights reserved. -->
    <tc:tc-config xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-4.xsd"
              xmlns:tc="http://www.terracotta.org/config" 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    
     <servers>
       <!-- Sets where the Terracotta server can be found. Replace the value of
            host with the server's IP address. -->
       <server host="server.1.ip.address" name="Server1">
         <data>%(user.home)/terracotta/server-data</data>
         <logs>%(user.home)/terracotta/server-logs</logs>
       </server>
       <!-- If using a standby Terracotta server, also referred to as an ACTIVE-
            PASSIVE configuration, add the second server here. -->
       <server host="server.2.ip.address" name="Server2">
         <data>%(user.home)/terracotta/server-data</data>
         <logs>%(user.home)/terracotta/server-logs</logs>
       </server>
       <ha>
          <mode>networked-active-passive</mode>
          <networked-active-passive>
             <election-time>5</election-time>
          </networked-active-passive>
       </ha>
     </servers>
     <!-- Sets where the generated client logs are saved on clients. -->
     <clients>
       <logs>%(user.home)/terracotta/client-logs</logs>
     </clients>
    </tc:tc-config>
    
  3. Install Terracotta 4.2 on a separate machine for each server you configure in tc-config.xml.

  4. Copy the tc-config.xml to a location accessible to the Terracotta servers.
  5. Perform Step 2: Install and Update the JAR files and Step 4: Edit Configuration Files on each application node you want to run in the cluster. Be sure to install your application and any application servers on each node.

  6. Edit the <terracottaConfig> element in the Enterprise Ehcache for Hibernate configuration file, ehcache.xml, that you created above:

    <!-- Add the servers that are configured in tc-config.xml. -->
    <terracottaConfig url="server.1.ip.address:9510,server.2.ip.address:9510" />
    

    Later in this procedure, you will see where to get more information on editing the settings in the configuration file.

  7. Copy ehcache.xml to each application node and ensure that it is on your application's classpath (or in WEB-INF/classes for web applications).
  8. Start the Terracotta server in the following way, replacing "Server1" with the name you gave your server in tc-config.xml:

    UNIX/Linux

    [PROMPT] ${TERRACOTTA_HOME}/bin/start-tc-server.sh -f <path/to/tc-config.xml> \
         -n Server1 &
    

    Microsoft Windows

    [PROMPT] ${TERRACOTTA_HOME}\bin\start-tc-server.bat -f <path\to\tc-config.xml> ^
         -n Server1
    

    If you configured a second server, start that server in the same way on its machine, entering its name after the -n flag. The second server to start up becomes the "hot" standby, or PASSIVE. Any other servers you configured will also start up as standby servers.

  9. Start all application servers.
  10. Start the Terracotta Developer Console and view the cluster.

Step 7: Learn More

To learn more about working with a Terracotta cluster, see the following documents:

Testing and Tuning Enterprise Ehcache for Hibernate

This document shows you how to test and tune Enterprise Ehcache for Hibernate.

TIP: Top Tuning Tips
  • Set Expiration Parameters
  • Turn Off Query Cache
  • Reduce Unnecessary Database Connections
  • Configure Database Connection Pools
  • Turn Off Unnecessary Statistics Gathering

Testing the Cache

The main benefit of a Hibernate second-level cache is raising performance by decreasing the number of times an application accesses the database. To gauge the level of database offloading provided by the Enterprise Ehcache for Hibernate second-level cache, look for these benefits:

  • Server CPU offload – The CPU load on the database server should decrease.
  • Lower latency – The latency for returning data should decrease.
  • Higher Transactions per second (TPS) – The TPS rate should increase.
  • More concurrency – The number of threads that can access data should increase.

The number of threads that can simultaneously access the distributed second-level cache can be scaled up more easily and efficiently than database connections, which generally are limited by the size of the connection pool.

You should record measurements for all of these factors before enabling the Enterprise Ehcache for Hibernate second-level cache to create a benchmark against which you can assess the impact of using the cache. You should also record measurements for all of these factors before tuning the cache to gauge the impact of any tuning changes you make.
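As a rough illustration of recording such a baseline, the sketch below times a repeated operation and reports average latency and throughput. It is a minimal single-threaded harness under stated assumptions: the class and method names (BaselineSketch, measure) are hypothetical, the placeholder workload stands in for a representative data-access call in your application, and database CPU load and concurrency must still be measured separately.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical timing harness for recording a baseline; not part of Ehcache or Hibernate. */
final class BaselineSketch {

    /** Summary of one measurement run. */
    static final class Result {
        final int operations;
        final double avgLatencyMillis;
        final double opsPerSecond;

        Result(int operations, long elapsedNanos) {
            this.operations = operations;
            double elapsedMillis = elapsedNanos / 1000000.0;
            this.avgLatencyMillis = operations == 0 ? 0.0 : elapsedMillis / operations;
            this.opsPerSecond = elapsedNanos <= 0 ? 0.0
                    : operations * (double) TimeUnit.SECONDS.toNanos(1) / elapsedNanos;
        }
    }

    /** Runs the operation the given number of times on the calling thread and times the whole run. */
    static Result measure(Runnable operation, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            operation.run();
        }
        return new Result(iterations, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with a representative data-access call.
        Runnable placeholder = new Runnable() {
            public void run() {
                Math.sqrt(42.0);
            }
        };
        Result r = measure(placeholder, 10000);
        System.out.printf("ops=%d avgLatencyMs=%.6f ops/s=%.1f%n",
                r.operations, r.avgLatencyMillis, r.opsPerSecond);
    }
}
```

Running the same harness before enabling the cache, after enabling it, and after each tuning change gives comparable numbers, provided the workload and iteration count stay the same.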

Another important test in addition to performance testing is verifying that the expected data is being loaded. For example, loading one entity can result in multiple cache entries. One approach to tracking cache operations is to set Hibernate cache logging to "debug" in log4j.properties:

log4j.logger.org.hibernate.cache=debug

This level of logging should not be used during performance testing.

NOTE: Optimizing Cache Performance
Before doing performance testing, you should read through the rest of this document to learn about optimizing cache performance. Some performance optimization can be done ahead of time, while some may require testing to reveal its applicability.

When using a testing framework, ensure that the framework does not cause a performance bottleneck and skew results.

Optimizing the Cache Size

Caches that get too large may become inefficient and suffer from performance degradation. A growing rate of flushing and faulting is an indication of a cache that's become too large and should be pruned.

Explicit sizing of caches is discussed in the Ehcache documentation.

Expiration Parameters

Expiration is important because it forces unneeded data to be automatically evicted when accessed or when constraints require resources to be freed. The most important parameters for tuning cache size and cache performance in general are the following:

  • timeToIdle (TTI) – This parameter controls how long an entity can remain in the cache without being accessed at least once. TTI is reset each time an entity is accessed. Use TTI to evict little-used entities to shrink the cache or make room for more frequently used entities. Adjust the TTI up if the faulting rate (data faulted in from the database) seems too high, and lower it if flushing (data cleared from the cache) seems too high.
  • timeToLive (TTL) – This parameter controls how long an entity can remain in the cache, regardless of how often it is used (it is never overridden by TTI). Use TTL to prevent the cache from holding stale data. As entities are evicted by TTL, fresh versions are cached the next time they are accessed.

TTI and TTL are set in seconds. Other options for limiting the life of data and the size of caches are discussed in the Ehcache documentation.
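To make the interaction between TTI and TTL concrete, the following sketch models the rule described above in plain Java. It is a simplified model for illustration, not Ehcache's implementation. The names ExpirySketch and isExpired are hypothetical, and a value of 0 is assumed to mean "no limit", as in the configuration examples in this document.

```java
/** Simplified model of TTI/TTL expiration; not Ehcache's actual code. */
final class ExpirySketch {

    /**
     * Returns true if an entry should be treated as expired at time nowSec.
     * All times are in seconds. A TTI or TTL of 0 means "no limit" (assumption).
     */
    static boolean isExpired(long createdSec, long lastAccessSec, long nowSec,
                             long ttiSec, long ttlSec) {
        // TTL counts from creation and is never extended by access.
        boolean ttlExceeded = ttlSec > 0 && nowSec - createdSec >= ttlSec;
        // TTI counts from the most recent access, or from creation if never accessed.
        long lastUsedSec = Math.max(createdSec, lastAccessSec);
        boolean ttiExceeded = ttiSec > 0 && nowSec - lastUsedSec >= ttiSec;
        return ttlExceeded || ttiExceeded;
    }

    public static void main(String[] args) {
        // Settings used for illustration: TTI = 3600, TTL = 86400.
        // Accessed 3000 seconds ago: still within TTI and TTL.
        System.out.println(isExpired(0, 3000, 6000, 3600, 86400));
        // Never accessed and 4000 seconds old: idle longer than TTI.
        System.out.println(isExpired(0, 0, 4000, 3600, 86400));
        // Accessed recently, but TTL has been reached.
        System.out.println(isExpired(0, 86000, 86400, 3600, 86400));
    }
}
```

The third case shows why TTL bounds staleness: frequent access keeps resetting TTI, but it cannot extend the entry's life past TTL.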

How to Set Expiration Parameters

You can set expiration parameters in these ways:

  • In ehcache.xml – Configuration file for Enterprise Ehcache for Hibernate with properties for controlling expiration on a per-cache basis. See the Ehcache documentation for more information.
  • In the Terracotta Developer Console – The GUI for Hibernate second-level cache allows you to apply real-time values to expiration parameters and export a configuration file. For more information, see Enterprise Ehcache for Hibernate Applications.
  • Programmatically – When creating caches programmatically.

After setting expiration parameters, be sure to test the effect on performance (see Testing the Cache).

Reducing the Cache Miss Rate

The cache miss rate is a measure of requests that the cache could not meet. Each miss can lead to a fault which requires a database query. (However, misses and faults are not one-to-one since a query can return results that satisfy more than one miss.) A high or growing cache miss rate indicates the cache should be optimized.
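One common way to express the miss rate is misses divided by total lookups. The sketch below assumes you already have hit and miss counts from your monitoring source (for example, second-level cache statistics). The class and method names are hypothetical, and the tolerance used to decide that the rate is "growing" is an arbitrary illustrative value.

```java
/** Hypothetical helper for tracking the cache miss rate between two samples. */
final class MissRateSketch {

    /** Fraction of lookups that missed, from 0.0 to 1.0; 0.0 when there were no lookups. */
    static double missRate(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) misses / lookups;
    }

    /** True if the miss rate rose by more than the given tolerance between two samples. */
    static boolean isGrowing(double previousRate, double currentRate, double tolerance) {
        return currentRate - previousRate > tolerance;
    }

    public static void main(String[] args) {
        double before = missRate(900, 100);
        double after = missRate(800, 200);
        System.out.printf("before=%.2f after=%.2f growing=%b%n",
                before, after, isGrowing(before, after, 0.05));
    }
}
```

Sampling the counts at fixed intervals and comparing consecutive rates makes a rising trend visible before it shows up as database load.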

To lower the miss rate, adjust for regions containing entities with high access rates to evict less frequently. This keeps popular entities in the cache for longer periods of time. You should adjust expiration parameter values incrementally and carefully observe the effect on the cache miss rate. For example, TTI and TTL that are set too high can introduce other drawbacks, such as stale data or overly large caches.

Expiration Example

Applications that use Enterprise Ehcache for Hibernate to implement the second-level cache should have TTI and TTL properly tuned to prevent unnecessarily large data caches and stale values.

The following sections detail how certain cached data is configured for second-level caching in a test-taking web-based application that can handle thousands of concurrent users. Included are snippets from the Enterprise Ehcache for Hibernate configuration file (see Cache Configuration File).

User Roles

The data defining user roles has the following characteristics:

  • Never changes – User roles are fixed (read only).
  • Accessed frequently – Each user session must have a user role.

Therefore, user roles are cached and never evicted (TTI=0, TTL=0). In general, read-only data that is used frequently and never grows stale should be cached continuously.

<cache name="org.terracotta.reference.exam.domain.UserRole"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="0"
   timeToLiveSeconds="0">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>

User Data

User data, which includes the user entity and its role, is useful only while the user is active. This data has the following characteristics:

  • Access is unpredictable – User interaction with the application is unpredictable and can be sporadic.
  • Lifetime is unpredictable – The data is useful as long as the user session has activity. Only when the user becomes inactive are the associated entities idle.

Therefore, these entities should have a short idle time of two minutes (TTI=120) to allow data associated with inactive user sessions to be evicted. However, they should never be evicted based on a hard lifetime (TTL=0), which allows the associated entities to stay cached indefinitely as long as TTI is reset by activity.

<cache name="org.terracotta.reference.exam.domain.User"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="120"
   timeToLiveSeconds="0">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.User.roles"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="120"
   timeToLiveSeconds="0">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>

Exam Data

Exam data includes the actual exams being taken by users. It has the following characteristics:

  • Rarely changes – There is the potential for exam questions to be changed in the database, but this happens infrequently.
  • Data set is large – There can be any number of exams, and not all of them can be cached due to limitations on the size of the cache.

Since there can be many different exams, and the potential exists for a cached exam to become stale, cached exams should be periodically evicted based on lack of access (TTI=3600) and to ensure they are up-to-date (TTL=86400).

<cache name="org.terracotta.reference.exam.domain.Exam"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section.questions"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Section.sections"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Question"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Question.choices"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>
<cache name="org.terracotta.reference.exam.domain.Choice"
   maxEntriesLocalHeap="1000"
   eternal="false"
   timeToIdleSeconds="3600"
   timeToLiveSeconds="86400">
   <persistence strategy="distributed"/>
   <terracotta/>
</cache>

Optimizing for Read-Only Data

If your application caches read-only data, the following may improve performance:

Reducing Unnecessary Database Connections

The JDBC mode Autocommit automatically writes changes to the database, making it unnecessary for an application to do so explicitly. However, unnecessary database connections can result from Autocommit because of the way JDBC drivers are designed. For example, transactional read-only operations in Hibernate, even those that are resolved in the second-level cache, still generate "empty" database connections. This situation, which can be tracked in database logs, can quickly have a detrimental effect on performance.

Turning off Autocommit should prevent empty database connections, but may not work in all cases. Lazily fetching JDBC connections resolves the issue by preventing JDBC calls until a connection to the database is actually needed.

NOTE: Autocommit
While Autocommit should be turned off to reduce unnecessary database connections for applications that create their own transaction boundaries, it may be useful for applications with on-demand (lazy) loading of data. You should investigate Autocommit with your application to discover its effect.

Two options are provided for implementing lazy fetching of database connections:

Lazy Fetching with Spring-Managed Transactions

If your application is based on the Spring framework, turning off Autocommit may not be enough to reduce unnecessary database connections for transactional read operations. You can prevent these empty database connections from occurring by using the Spring LazyConnectionDataSourceProxy proxy definition. The proxy holds unnecessary JDBC calls until a connection to the database is actually required, at which time the held calls are applied.

To implement the proxy, create a target DataSource definition (or rename your existing target DataSource) and a LazyConnectionDataSourceProxy proxy definition in the Spring application context file:

<!-- Renamed the existing target DataSource to 'dataSourceTarget' 
     which will be used by the proxy. -->
<bean id="dataSourceTarget"
class="org.apache.commons.dbcp.BasicDataSource"
    destroy-method="close">
  <property name="driverClassName"><value>com.mysql.jdbc.Driver</value></property>
  <property name="url"><value>jdbc:mysql://localhost:3306/imagedb</value></property>
  <property name="username"><value>admin</value></property>
  <property name="password"><value></value></property>
  <!-- other datasource configuration properties -->
</bean>
<!-- This is the lazy DataSource proxy that interacts with the target 
     DataSource once a real statement is sent to the database. 
 Users use this DataSource to set up their Hibernate session factory, 
 which in turn forces the Hibernate second-level cache and also 
 everything that interacts with that Hibernate session factory to use it. -->
<bean id="dataSource"
class="org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy">
  <property name="targetDataSource"><ref local="dataSourceTarget"/></property>
</bean>

Your application's SessionFactory, transaction manager, and all DAOs should access the proxy. Since the proxy implements the DataSource interface too, it can simply be passed in instead of the target DataSource.

See the Spring documentation for more information.

Lazy Fetching for Non-Spring Applications

By implementing a custom Hibernate connection provider, you can use the LazyConnectionDataSourceProxy in a non-Spring based application:

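import java.sql.Connection;
import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;

import org.apache.commons.dbcp.BasicDataSource;
import org.apache.commons.dbcp.BasicDataSourceFactory;
import org.hibernate.HibernateException;
import org.hibernate.connection.ConnectionProvider;
import org.springframework.jdbc.datasource.LazyConnectionDataSourceProxy;

public class LazyDBCPConnectionProvider implements ConnectionProvider {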
   private DataSource ds;
   private BasicDataSource basicDs;
   public void configure(Properties props) throws HibernateException {
       // DBCP properties used to create the BasicDataSource
       Properties dbcpProperties = new Properties();
       // set some DBCP properties or implement logic to get them from
       // the Hibernate config
       try {
           // Let the factory create the pool
           basicDs = (BasicDataSource)BasicDataSourceFactory.createDataSource(dbcpProperties);
           ds = new LazyConnectionDataSourceProxy(basicDs);
           // The BasicDataSource has lazy initialization
           // borrowing a connection will start the DataSource
           // and make sure it is configured correctly.
           Connection conn = ds.getConnection();
           conn.close();
       } catch (Exception e) {
           String message = "Could not create a DBCP pool";
           if (basicDs != null) {
               try {
                   basicDs.close();
               } catch (Exception e2) {
                   // ignore
               }
               ds = null;
               basicDs = null;
           }
           throw new HibernateException(message, e);
       }
   }
   public Connection getConnection() throws SQLException {
       return ds.getConnection();
   }
   public void closeConnection(Connection conn) throws SQLException {
       conn.close();
   }
   public void close() throws HibernateException {
       try {
           if (basicDs != null) {
               basicDs.close();
               ds = null;
               basicDs = null;
           }
       } catch (Exception e) {
           throw new HibernateException("Could not close DBCP pool", e);
       }
   }
   public boolean supportsAggressiveRelease() {
       return false;
   }
}

To use the custom connection provider, update hibernate.cfg.xml with the following property:

<property name="connection.provider_class">LazyDBCPConnectionProvider</property>

Reducing Memory Usage with Batch Processing

If your application must perform a large number of insertions or updates with Hibernate, a potential antipattern can emerge from the fact that all transactional insertions or updates in a session are stored in the first-level cache until flushed. Therefore, waiting to flush until the transaction is committed can result in an OutOfMemoryError (OOME) during large operations of this type.

You can prevent OOMEs in this case by processing the insertions or updates in batches, flushing after each batch. The Hibernate core documentation gives the following example for inserts:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
   Customer customer = new Customer(.....);
   session.save(customer);
   if ( i % 20 == 0 ) { //20, same as the JDBC batch size
       //flush a batch of inserts and release memory:
       session.flush();
       session.clear();
   }
}

tx.commit();
session.close();

TIP: session.clear()
The performance of session.clear() has been improved in Hibernate 3.3.2.

Updates can be batched similarly. The JDBC batch size referred to in the comment above is set in the Hibernate configuration property hibernate.jdbc.batch_size. For more information, see "Batch processing" in the Hibernate core documentation.
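The batching pattern itself can be sketched independently of Hibernate. In the following plain-Java sketch, the UnitOfWork interface is a hypothetical stand-in for the session calls in the example above (save, then flush and clear). The sketch counts pending items rather than testing i % 20 == 0, which avoids a flush right after the first item, and it flushes once more for a final partial batch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of flushing in fixed-size batches; not Hibernate code. */
final class BatchFlushSketch {

    /** Stand-in for the session operations used when batching. */
    interface UnitOfWork<T> {
        void save(T item);      // corresponds to session.save(item)
        void flushAndClear();   // corresponds to session.flush(); session.clear();
    }

    /**
     * Saves all items, flushing after every full batch and once more for any
     * remainder. Returns the number of flushes performed.
     */
    static <T> int saveInBatches(List<T> items, int batchSize, UnitOfWork<T> work) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int flushes = 0;
        for (T item : items) {
            work.save(item);
            pending++;
            if (pending == batchSize) {
                work.flushAndClear();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            work.flushAndClear();
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        List<String> customers = new ArrayList<String>();
        for (int i = 0; i < 45; i++) {
            customers.add("customer-" + i);
        }
        int flushes = saveInBatches(customers, 20, new UnitOfWork<String>() {
            public void save(String item) { /* persist the item here */ }
            public void flushAndClear() { /* release first-level cache memory here */ }
        });
        System.out.println("flushes=" + flushes);
    }
}
```

Keeping the batch size equal to hibernate.jdbc.batch_size, as the comment in the example above suggests, keeps the flush boundaries aligned with the JDBC batches.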

Other Important Tuning Factors

The following factors could affect the performance of your second-level cache.

Query Cache

This Hibernate feature creates overhead regardless of how many queries are actually cached. For example, it records timestamps for entities even if not caching the related queries. Query cache is on if the following element is set in hibernate.cfg.xml:

<property name="hibernate.cache.use_query_cache">true</property>

If query cache is turned on, two specially-named cache regions appear in the Terracotta Developer Console cache-regions list. The two regions are the query cache and the timestamp cache.

Unless you are certain that the query cache benefits your application, it is recommended that you turn it off (set hibernate.cache.use_query_cache to "false").

Connection Pools

If your installation of Hibernate uses JDBC directly, you use a connection pool to create and manage the JDBC connections to a database. Hibernate provides a default connection pool and supports a number of different connection pools. The low-performance default connection pool is inadequate for more than just initial development and testing. Use one of the supported connection pools, such as C3P0 or DBCP, and be sure to set the number of connections to an optimal amount for your application.
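
As an illustration, pool sizing for C3P0 can be expressed as Hibernate properties and checked against the database's connection limit. This is a minimal sketch, not a recommended configuration: hibernate.c3p0.min_size and hibernate.c3p0.max_size are Hibernate's C3P0 settings, while the class name, the helper methods, and every numeric value are assumptions chosen for the example.

```java
import java.util.Properties;

class PoolSizingSketch {

    /** Builds C3P0 pool-size settings; the sizes passed in are illustrative only. */
    static Properties c3p0Settings(int minSize, int maxSize) {
        Properties props = new Properties();
        props.setProperty("hibernate.c3p0.min_size", String.valueOf(minSize));
        props.setProperty("hibernate.c3p0.max_size", String.valueOf(maxSize));
        return props;
    }

    /**
     * Returns true if the database connection limit covers every connection
     * that the pools on all application nodes could open at the same time.
     */
    static boolean databaseCanServe(int maxPoolSize, int applicationNodes,
                                    int databaseMaxConnections) {
        return (long) maxPoolSize * applicationNodes <= databaseMaxConnections;
    }

    public static void main(String[] args) {
        Properties props = c3p0Settings(5, 20);
        System.out.println(props.getProperty("hibernate.c3p0.max_size"));
        // Four nodes with 20 connections each against a limit of 100 connections.
        System.out.println(databaseCanServe(20, 4, 100));
    }
}
```

The second helper reflects the point made under Database Tuning: the database should accept at least as many connections as the pools can open.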

Local Key Cache

Enterprise Ehcache for Hibernate can cache a "hotset" of keys on clients to add locality-of-reference, a feature suitable for read-only cases. Note that the set of keys must be small enough for available memory.

See Terracotta Clustering Configuration Elements for more information on configuring a local key cache.

Hibernate CacheMode

CacheMode is the Hibernate class that controls how a session interacts with second-level and query caches.

If your application explicitly warms the cache (reloads entities), CacheMode should be set to REFRESH to prevent unnecessary reads and null checks.

Cache Concurrency Strategy

If your application can tolerate somewhat inconsistent views of data, and the data does not change frequently, consider changing the cache concurrency strategy from READ_WRITE to NONSTRICT_READ_WRITE to boost performance. See Cache Concurrency Strategies for more information on cache concurrency strategies.

Terracotta Server Optimization

You can optimize the Terracotta servers in your cluster to improve cluster performance with a second-level cache. Some server optimization requires editing the Terracotta configuration file. For more information on the Terracotta configuration file, see the Terracotta Configuration Guide and Reference.

Test the following recommendations to gauge their impact on performance.

Less Aggressive Memory Management

By default, Terracotta servers clear a certain amount of heap memory based on the percentage of memory used. You can configure a Terracotta server to be less aggressive in clearing heap memory by raising the threshold that triggers this action. Allowing more data to remain in memory makes larger caches more efficient by reducing the server's swap-to-disk dependence. Be sure to test any changes to the threshold to confirm that the server doesn't suffer an OOME by failing to effectively manage memory at the new threshold level.

The default threshold is 70 (70 percent of heap memory used). Raise the threshold by setting a higher value for the Terracotta property l2.cachemanager.threshold in one of the following ways.

Create a Java Property

To set the threshold at 90, add the following option to $JAVA_OPTS before starting the Terracotta server:

-Dcom.tc.l2.cachemanager.threshold=90

Be sure to export JAVA_OPTS. If you adjust the threshold value after the server is running, you must restart the Terracotta server for the new value to take effect.

Add to Terracotta Configuration

Add the following configuration to the top of the Terracotta configuration file (tc-config.xml by default) before starting the Terracotta server:

<tc-properties>
    <property name="l2.cachemanager.threshold" value="90" />
</tc-properties>

You must start the Terracotta server with the configuration file you've updated:

start-tc-server.sh -f <path_to_configuration_file>

Use start-tc-server.bat in Microsoft Windows.

Run in Non-Persistent Mode

If your data is backed by a database, and no critical data exists only in memory, you can run the Terracotta server in non-persistent mode (temporary-swap-only mode). By default, Terracotta servers are set to non-persistent mode. For more information on persistence, see the Terracotta Configuration Guide and Reference.

Reduce the Berkeley DB Memory Footprint

Terracotta allots a certain percentage of memory to Berkeley DB, the database application used to manage the disk store. The default is 25 percent. Under the following circumstances, this percentage can be reduced:

  • Running in temporary-swap-only mode (see Run in Non-Persistent Mode) requires less memory for Berkeley DB since it is managing less data.
  • Running with a large heap size may require a smaller percentage of memory for Berkeley DB.

For example, if Berkeley DB has a fixed requirement of 300–400 MB of memory, and the heap size is set to 6 GB, Berkeley DB can be allotted eight percent. You can set the percentage using the Terracotta property l2.berkeleydb.je.maxMemoryPercent in one of the following ways.

Create a Java Property

To set the percentage at 8, add the following option to $JAVA_OPTS (or $JAVA_OPTIONS) before starting the Terracotta server:

-Dcom.tc.l2.berkeleydb.je.maxMemoryPercent=8

Be sure to export JAVA_OPTS (or JAVA_OPTIONS). If you adjust the percentage value after the server is running, you must restart the Terracotta server for the new value to take effect.

Add to Terracotta Configuration

Add the following configuration to the top of the Terracotta configuration file (tc-config.xml by default) before starting the Terracotta server:

<tc-properties>
    <property name="l2.berkeleydb.je.maxMemoryPercent" value="8" />
</tc-properties>

You must start the Terracotta server with the configuration file you've updated:

start-tc-server.sh -f <path_to_configuration_file>

Use start-tc-server.bat in Microsoft Windows.

If you lower the value of l2.berkeleydb.je.maxMemoryPercent, be sure to test the new value's effectiveness by noting the amount of flushing to disk that occurs in the Terracotta server. If flushing rises to a level that impacts performance, increase the value of l2.berkeleydb.je.maxMemoryPercent incrementally until an optimal level is observed.
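
The arithmetic behind the eight-percent example can be sketched as follows. This is a rough starting point only, assuming the fixed memory requirement of Berkeley DB is already known. The class and method names are hypothetical, and the result is no substitute for observing the amount of flushing to disk as described above.

```java
class BerkeleyDbPercentSketch {

    /** Smallest whole percentage of the heap that covers the given requirement. */
    static int minimumPercent(long requiredMegabytes, long heapMegabytes) {
        if (requiredMegabytes < 0 || heapMegabytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Integer ceiling of (required / heap) * 100.
        return (int) ((requiredMegabytes * 100 + heapMegabytes - 1) / heapMegabytes);
    }

    public static void main(String[] args) {
        // Upper end of the 300-400 MB requirement on a 6 GB (6144 MB) heap.
        System.out.println(minimumPercent(400, 6144));
    }
}
```

For a 400 MB requirement on a 6 GB heap the minimum is 7 percent, so a setting of 8 leaves a small amount of headroom.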

Statistics Gathering

Each time you connect to the Terracotta cluster with the Developer Console and go to the second-level cache node, Hibernate and cache statistics gathering is automatically started. Since this may have a negative impact on performance, consider disabling statistics gathering during performance tests and in production if you continue to use the Developer Console. To disable statistics gathering, navigate to the Overview panel in the Hibernate view, then click Disable Statistics.

Logging

Logging has a negative impact on performance when it is enabled. Consider disabling statistics logging during performance tests and in production.

To disable statistics logging in the Terracotta Developer Console, navigate to the Configuration panel in the Hibernate view, then select the target regions in the list and clear Logging enabled if it is set.

To disable debug logging for Enterprise Ehcache, set the logging level for the clustered store to be less granular than FINE.

Java Garbage Collection

Garbage Collection (GC) should be aggressive. Consider using the -server Java option on all application servers to force a "server" GC strategy.

Database Tuning

A well-tuned database reduces latency and improves performance:

  • Indexes should be optimized for your application. Databases should be indexed to load data quickly, based on the types of queries your application performs (type of key used, for example).
  • Database tables should be of a format that is optimized for your application. In MySQL, for example, the InnoDB format provides better performance than the default MyISAM (or the older ISAM) format if your application performs many transactions and uses foreign keys.
  • Ensure that the database is set to accept at least as many connections as the connection pool can open. See Connection Pools for more information.

The following are issues that could affect the functioning of Enterprise Ehcache for Hibernate.

Unwanted Synchronization with Hibernate Direct Field Access

When direct field access is used, Hibernate uses reflection to access fields, triggering unwanted synchronization that can degrade performance across a cluster. See this JIRA issue for more information.

Hibernate Exception Thrown With Cascade Option

Under certain circumstances, using cascade="all-delete-orphan" can throw a Hibernate exception. This happens if you load an object with a cascade="all-delete-orphan" collection and then remove the reference to the collection. Instead of replacing the collection, use clear() so that the orphan-deletion algorithm can detect your change. See the Hibernate troubleshooting issues for more information.

Cacheable Entities and Collections Not Cached

Certain data that should be in the second-level cache may not have been configured for caching (or may not have been configured correctly). This oversight may not cause an error, but may impact performance. See Finding Cacheable Entities and Collections for more information.

Enterprise Ehcache for Hibernate Reference

This document contains technical reference information for Enterprise Ehcache for Hibernate.

Cache Configuration File

Note the following about ehcache.xml in a Terracotta cluster:

  • The copy on disk is loaded into memory from the first Terracotta client (also called application server or node) to join the cluster.
  • Once loaded, the configuration is persisted in memory by the Terracotta servers in the cluster and survives client restarts.
  • In-memory configuration can be edited in the Terracotta Developer Console. Changes take effect immediately but are not written to the original on-disk copy of ehcache.xml.
  • The in-memory cache configuration is removed with server restarts if the servers are in non-persistent mode, which is the default. The original (on-disk) ehcache.xml is loaded.
  • The in-memory cache configuration survives server restarts if the servers are in persistent mode (default is non-persistent). If you are using the Terracotta servers with persistence of shared data, and you want the cluster to load the original (on-disk) ehcache.xml, the servers' database must be wiped by removing the data files from the servers' server-data directory. This directory is specified in the Terracotta configuration file in effect (tc-config.xml by default). Wiping the database causes all persisted shared data to be lost.

Setting Cache Eviction

Cache eviction removes elements from the cache based on parameters with configurable values. Having an optimal eviction configuration is critical to maintaining cache performance. For more information on cache eviction, see Setting Cache Eviction.

See How Configuration Affects Element Eviction for more information on how configuration can impact eviction. See Terracotta Clustering Configuration Elements for definitions of other available configuration properties.

Cache-Configuration File Properties

See Terracotta Clustering Configuration Elements for more information.

Exporting Configuration from the Developer Console

To create or edit a cache configuration in a live cluster, see Editing Cache Configuration.

To persist custom cache configuration values, create a cache configuration file by exporting customized configuration from the Terracotta Developer Console or create a file that conforms to the required format. This file must take the place of any configuration file used when the cluster was last started.

Migrating From an Existing Second-Level Cache

If you are migrating from another second-level cache provider, recreate the structure and values of your cache configuration in ehcache.xml. Then simply follow the directions for installing and configuring Enterprise Ehcache for Hibernate in Enterprise Ehcache for Hibernate Express Installation.

Cache Concurrency Strategies

A cache concurrency strategy controls how the second-level cache is updated based on how often data is likely to change. Cache concurrency is set using the usage attribute in one of the following ways:

  • With the @Cache annotation:

    @Cache(usage=CacheConcurrencyStrategy.READ_WRITE)

  • In the cache-mapping configuration entry in the Hibernate configuration file hibernate.cfg.xml.

  • In the <cache> property of a class or collection mapping file (hbm file).

Supported cache concurrency strategies are described in the following sections.

READ_ONLY

The READ_ONLY strategy works well for unchanging reference data. It can also work in use cases where the cache is periodically invalidated by an external event. That event can flush the cache, then allow it to repopulate.

WARNING: Using this strategy with transactional caches can cause unpredictable results.

READ_WRITE

The READ_WRITE strategy works well for data that changes and must be committed. READ_WRITE guarantees correct data at all times by using locks to ensure that transactions are not open to more than one thread.

WARNING: To avoid errors and unpredictable behavior, use this strategy only with caches that have strong consistency and do not have nonstop mode enabled.

If a cached element is created or changed in the database, READ_WRITE updates the cache after the transaction completes. A check is done for the element's existence and (if the element exists) for an existing lock. The cached element is guaranteed to be the same version as the one in the database.

Note, however, that Hibernate may lock elements before a transaction (update or delete) completes to the database. In this case, other transactions attempting to access those elements will miss and be forced to retrieve the data from the database.

Cache loading is done with checks for existence and version (existing elements that are newer are not replaced).
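
The behavior described above can be illustrated with a toy model. This is a simplified sketch for reasoning about READ_WRITE semantics, not the Ehcache or Hibernate implementation: the class, its methods, and the use of a numeric version are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReadWriteModel {

    /** A cached value together with the version it was stored at. */
    static final class Entry {
        final String value;
        final long version;

        Entry(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Entry> cache = new HashMap<String, Entry>();
    private final Set<String> locked = new HashSet<String>();

    /** Models Hibernate locking an element before an update or delete completes. */
    void lock(String key) {
        locked.add(key);
    }

    /** Models the end of the transaction: the cache is updated and the lock released. */
    void completeUpdate(String key, String value, long version) {
        cache.put(key, new Entry(value, version));
        locked.remove(key);
    }

    /** Returns the cached value, or null (a miss) if the element is absent or locked. */
    String get(String key) {
        if (locked.contains(key)) {
            return null; // the caller must read from the database instead
        }
        Entry entry = cache.get(key);
        return entry == null ? null : entry.value;
    }

    /** Loads a value read from the database; an existing newer element is kept. */
    boolean putFromLoad(String key, String value, long version) {
        Entry existing = cache.get(key);
        if (existing != null && existing.version > version) {
            return false;
        }
        cache.put(key, new Entry(value, version));
        return true;
    }
}
```

In this model a read during an in-flight update misses and falls through to the database, and a load of an older version never overwrites a newer cached element.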

Enterprise Ehcache for Hibernate is designed to maximize performance with READ_WRITE strategies when the data involved is partitioned by your application (using sticky sessions, for example). However, caching needs are application-dependent and should be investigated on a case-by-case basis.

NONSTRICT_READ_WRITE

The NONSTRICT_READ_WRITE strategy is similar to READ_WRITE, but may provide better performance. NONSTRICT_READ_WRITE works well for data that changes and must be committed, but it does not guarantee exclusivity or consistency (and so avoids the associated performance costs). This strategy allows more than one transaction to simultaneously write to the same entity, and is intended for applications able to tolerate caches that may at times be out of sync with the database.

This strategy works best with caches that have strong consistency. It will show some degradation with caches that have eventual consistency instead. It may be able to continue functioning with nonstop mode, though will do better without it.

WARNING: Using this strategy with transactional caches can cause unpredictable results.

Because it does not guarantee the stability of data as it is changed in the database, NONSTRICT_READ_WRITE does not update the cache when an element is created or changed in the database. However, elements that are updated in the database, whether or not the transaction completes, are removed from the cache.

Cache loading is done with no checks, and get() operations return null for nonexistent elements.
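
A toy model makes the contrast with READ_WRITE concrete. As before, this is a simplified sketch for illustration only, not the Ehcache or Hibernate implementation, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class NonstrictReadWriteModel {

    private final Map<String, String> cache = new HashMap<String, String>();

    /** Loads a value with no existence or version checks. */
    void putFromLoad(String key, String value) {
        cache.put(key, value);
    }

    /** Models an update in the database: the cached element is removed, not refreshed. */
    void onDatabaseUpdate(String key) {
        cache.remove(key);
    }

    /** Returns the cached value, or null for a nonexistent element. */
    String get(String key) {
        return cache.get(key);
    }
}
```

Because an update only removes the element, the next read misses and repopulates the cache from the database. No lock is taken at any point.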

TRANSACTIONAL

The TRANSACTIONAL strategy is intended for use in an environment utilizing the Java Transaction API (JTA) to manage transactions across a number of XA resources. This strategy guarantees that a cache remains in sync with other resources, such as databases and queues. Hibernate does not use locking for any type of access, but relies instead on a properly configured transactional cache to handle transaction isolation and data integrity.

WARNING: Use this strategy with transactional caches only. Using with other types of caches will cause errors.

The TRANSACTIONAL strategy is supported in Ehcache 2.0 and higher. For more information on how to set up a second-level cache with transactional caches, see Setting Up Transactional Caches.

How Entitymanagers Choose the Data Source

Entitymanagers can read data from the cache, or from the database. Which source the entitymanager selects depends on the cache concurrency strategy chosen.

With NONSTRICT_READ_WRITE, it is possible that an entitymanager will query the database more often if frequent updates are invalidating target cache entries.

With READ_WRITE, updates do not invalidate the cache, and so an entitymanager may read from the cache (same as TRANSACTIONAL). However, under READ_WRITE, an entitymanager will have to read from the database if the target cache entry is under a lock at the time the read attempt is made. NONSTRICT_READ_WRITE, on the other hand, may read a stale value from the cache if the read attempt is made before the transaction completes.

READ_WRITE also forces use of the entitymanager's timestamp for comparison purposes when evaluating the freshness of data, which again can lead to more database access operations.

Setting Up Transactional Caches

If your application is using JTA, you can set up transactional caches in a second-level cache with Enterprise Ehcache for Hibernate. To do so, ensure the following:

Ehcache
  • You are using Ehcache 2.1.0 or higher.
  • The attribute transactionalMode is set to "xa" or "xa-strict".
  • The cache is clustered (the <cache> element has the subelement <terracotta clustered="true">). For example, the following cache is configured to be transactional:

    <cache name="com.my.package.Foo"
         ... 
         transactionalMode="xa">
       <terracotta clustered="true"/>
    </cache>
    
  • The cache UpdateTimestampsCache is not configured to be transactional. Hibernate updates org.hibernate.cache.UpdateTimestampsCache in a way that prevents it from participating in XA transactions.

Hibernate
  • You are using Hibernate 3.3.
  • The factory class used for the second-level cache is net.sf.ehcache.hibernate.EhCacheRegionFactory.
  • Query cache is turned off.
  • The value of current_session_context_class is jta.
  • The value of transaction.manager_lookup_class is the name of a TransactionManagerLookup class (see your Transaction Manager).
  • The value of transaction.factory_class is the name of a TransactionFactory class to use with the Hibernate Transaction API.
  • The cache concurrency strategy is set to TRANSACTIONAL. For example, to set the cache concurrency strategy for com.my.package.Foo in hibernate.cfg.xml:

    <class-cache class="com.my.package.Foo" usage="transactional"/>
    

    Or in a Hibernate mapping file (hbm file):

    <cache usage="transactional"/>
    

    Or using annotations:

    @Cache(usage=CacheConcurrencyStrategy.TRANSACTIONAL)
    public class Foo {...}
    

    For more on cache concurrency strategies, see Cache Concurrency Strategies.
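
The Hibernate items in this checklist can be verified mechanically before the application starts. The following is a minimal sketch using only java.util.Properties. It assumes the settings are supplied as properties whose keys carry the hibernate. prefix, and the class name, method names, and messages are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class TransactionalSetupCheck {

    private static boolean isBlank(String value) {
        return value == null || value.trim().length() == 0;
    }

    /** Returns one message for each checklist item that the given properties fail. */
    static List<String> problems(Properties props) {
        List<String> problems = new ArrayList<String>();
        if (!"net.sf.ehcache.hibernate.EhCacheRegionFactory"
                .equals(props.getProperty("hibernate.cache.region.factory_class"))) {
            problems.add("region factory is not EhCacheRegionFactory");
        }
        if ("true".equalsIgnoreCase(props.getProperty("hibernate.cache.use_query_cache"))) {
            problems.add("query cache is turned on");
        }
        if (!"jta".equals(props.getProperty("hibernate.current_session_context_class"))) {
            problems.add("current_session_context_class is not jta");
        }
        if (isBlank(props.getProperty("hibernate.transaction.manager_lookup_class"))) {
            problems.add("transaction.manager_lookup_class is not set");
        }
        if (isBlank(props.getProperty("hibernate.transaction.factory_class"))) {
            problems.add("transaction.factory_class is not set");
        }
        return problems;
    }
}
```

The check covers only the property-based items. The Hibernate version, the per-class cache concurrency strategy, and the Ehcache settings still need to be confirmed separately.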

Configuring Multiple Hibernate Applications

If you are using more than one Hibernate web application with the Terracotta second-level cache, additional configuration is needed to allow for multiple classloaders. For more information, see the section on configuring an application group (app-groups) in the Configuration Guide and Reference.

Finding Cacheable Entities and Collections

Certain data that should be in the second-level cache may not have been configured for caching. This oversight may not cause an error, but may impact performance.

Using the Terracotta Developer Console, you can compare the set of cached regions with the set of all Hibernate entities and collections. Note any items, such as collections containing fixed or slow-changing data, that appear as Hibernate entities but do not have corresponding cache regions.

Cache Regions in the Object Browser

If the Enterprise Ehcache for Hibernate second-level cache is being clustered correctly, a Terracotta root representing the second-level cache appears in the Terracotta Developer Console's object browser. Under this root, which exists in every client (application server), are the cached regions and their children.

You can use this root to verify that the second-level cache is running and is clustered with Terracotta:

  1. Start the Terracotta server:

    UNIX/Linux

    [PROMPT] ${TERRACOTTA_HOME}/bin/start-tc-server.sh -f <path_to_tc-config.xml> &
    

    Microsoft Windows

    [PROMPT] ${TERRACOTTA_HOME}\bin\start-tc-server.bat -f <path_to_tc-config.xml>
    
  2. Start your application. You can start more than one instance of your application.
  3. Start the Terracotta Developer Console:

    UNIX/Linux

    [PROMPT] ${TERRACOTTA_HOME}/bin/dev-console.sh &
    

    Microsoft Windows

    [PROMPT] ${TERRACOTTA_HOME}\bin\dev-console.bat
    

Using the Terracotta Developer Console, verify that there is a root named default:terracottaHibernateCaches. For each Terracotta client (application server), the caches should appear as MapEntry objects under this root, one per cache region. The data itself is found inside these cache-region entries.

Hibernate Statistics Sampling Rate

The second-level cache runtime statistics are pulled from Hibernate statistics, which have a fixed sampling rate of one second (sample once per second). The Terracotta Developer Console's sampling rate for display purposes, however, is adjustable.

To display all of the Hibernate statistical counts, set the Terracotta Developer Console's sampling rate to one second. To set the sampling rate, choose Options... from the Developer Console's Tools menu, then set Poll period seconds to "1".

For example, if the sampled Hibernate statistics record the Cache Miss Count values "15, 25, 62, 10, 12, 43," and the Terracotta Developer Console's sampling rate is set to one second, then all of these values are graphed. However, if the Terracotta Developer Console's sampling rate is set to three seconds, then only every third value is graphed, "15, 10" (assuming that the first poll period coincides with the first value recorded).

Is a Cache Appropriate for Your Use Case?

Some use cases may present hurdles to realizing benefits from a second-level Hibernate cache implementation.

Frequent Updates of Database

Volatile data requires frequent cache invalidation, which increases the overhead of maintaining the cache. At some point this overhead impacts performance at a cost too high to make the cache favorable. Identifying "hotsets" of data can mitigate this situation by limiting the amount of data that requires reloading. Another solution is scaling your cluster to keep more data in memory (see Terracotta Server Arrays).

Very Large Data Sets

Huge data sets that are queried randomly (or across the set with no clear pattern or "hotsets") are difficult to cache because of the impact on memory of attempting to load that set or having to evict and load elements at a very high rate. Solutions include scaling the cluster to allow more data into memory (see Terracotta Server Arrays), adding storage to allow Terracotta to spill more data to disk, and using partitioning strategies to prevent any one node from loading too much data.

Frequent Updates of In-Memory Data

As the rate of updating cached data goes up, application performance goes down as Hibernate attempts to manage and persist the changes. An asynchronous approach to writing the data may be a good solution for this issue.

Low Frequency of Cached Data Queries

The benefits of caching are maximized when cached data is queried multiple times before expiring. If cached data is infrequently accessed, or often expires before it is used, the benefits of caching may be lost. Solutions to this situation include invalidating data in the cache more often to force updates, and refactoring your application to cache more frequently queried data and avoid caching data that tends to expire unused.

Requirements of Critical Data

Cached data cannot be guaranteed to be consistent at all times with the data in a database. In situations where this must be guaranteed, such as when an application requires auditing, access to the data must be through the System of Record (SoR). Financial applications, for example, require auditing, and for this the database must be accessed directly. If critical data is changed in a cache, however, the data obtained from the database could be erroneous.

Database Modified by Other Applications

If data in the database can be modified by applications outside of your application with Hibernate, and that same data is eligible for the second-level cache, unpredictable results could occur. One solution is a redesign to prevent data that can end up in the cache from being modified by applications outside of the scope of your Hibernate application.