Open Source

Storage Tiers Basics

Introduction
Memory Store
Off-Heap Store
- Allocating Direct Memory in the JVM
- Tuning Off-Heap Store Performance
Disk Store
- Serialization
- Storage Options

Introduction

BigMemory Go has three storage tiers, summarized here:

Memory store – Heap memory that holds a copy of the hottest subset of data from the off-heap store. Subject to Java GC.
Off-heap store – Limited in size only by available RAM. Not subject to Java GC. Can store serialized data only. Provides overflow capacity to the memory store.
Disk store – Backs up in-memory data and provides overflow capacity to the other tiers. Can store serialized data only.

This document defines these storage tiers and details the suitable element types for each storage tier.

Before running in production, it is strongly recommended that you test the BigMemory Go tiers with the actual amount of data you expect to use in production. For information about sizing the tiers, refer to Cache Configuration Sizing Attributes.

Memory Store

The memory store is always enabled and exists in heap memory. For the best performance, allot as much heap memory as possible without triggering GC pauses, and use the off-heap store to hold the data that cannot fit in heap (without causing GC pauses).

The memory store has the following characteristics:

Accepts all data, whether serializable or not
Fastest storage option
Thread safe for use by multiple concurrent threads
Backed By JDK LinkedHashMap

Off-Heap Store

The off-heap store extends the in-memory store to memory outside the of the object heap. This store, which is not subject to Java GC, is limited only by the amount of RAM available.

Because off-heap data is stored in bytes, only data that is Serializable is suitable for the off-heap store. Any non serializable data overflowing to the OffHeapMemoryStore is simply removed, and a WARNING level log message emitted.

Since serialization and deserialization take place on putting and getting from the off-heap store, it is theoretically slower than the memory store. This difference, however, is mitigated when GC involved with larger heaps is taken into account.

Allocating Direct Memory in the JVM

The off-heap store uses the direct-memory portion of the JVM. You must allocate sufficient direct memory for the off-heap store by using the JVM property MaxDirectMemorySize.

For example, to allocate 2GB of direct memory in the JVM:

java -XX:MaxDirectMemorySize=2G ...

Since direct memory may be shared with other processes, allocate at least 256MB (or preferably 1GB) more to direct memory than will be allocated to the off-heap store.

Note the following about allocating direct memory:

If you configure off-heap memory but do not allocate direct memory with -XX:MaxDirectMemorySize, the default value for direct memory depends on your version of your JVM. Oracle HotSpot has a default equal to maximum heap size (-Xmx value), although some early versions may default to a particular value.
MaxDirectMemorySize must be added to the local node's startup environment.
Direct memory, which is part of the Java process heap, is separate from the object heap allocated by -Xmx. The value allocated by MaxDirectMemorySize must not exceed physical RAM, and is likely to be less than total available RAM due to other memory requirements.
The amount of direct memory allocated must be within the constraints of available system memory and configured off-heap memory.
The maximum amount of direct memory space you can use depends on the process data model (32-bit or 64-bit) and the associated operating system limitations, the amount of virtual memory available on the system, and the amount of physical memory available on the system.

Using Off-Heap Store with 32-bit JVMs

The amount of heap-offload you can achieve is limited by addressable memory. 64-bit systems can allow as much memory as the hardware and operating system can handle, while 32-bit systems have strict limitations on the amount of memory that can be effectively managed.

For a 32-bit process model, the maximum virtual address size of the process is typically 4GB, though most 32-bit operating systems have a 2GB limit. The maximum heap size available to Java is lower still due to particular operating-system (OS) limitations, other operations that may run on the machine (such as mmap operations used by certain APIs), and various JVM requirements for loading shared libraries and other code. A useful rule to observe is to allocate no more to off-heap memory than what is left over after -Xmx is set. For example, if you set -Xmx3G, then off-heap should be no more than 1GB. Breaking this rule may not cause an OutOfMemoryError on startup, but one is likely to occur at some point during the JVM's life.

If Java GC issues are afflicting a 32-bit JVM, then off-heap store can help. However, note the following:

Everything has to fit in 4GB of addressable space. If 2GB of heap is allocated (with -Xmx2g) then at most are are 2GB left for off-heap data.
The JVM process requires some of the 4GB of addressable space for its code and shared libraries plus any extra Operating System overhead.
Allocating a 3GB heap with -Xmx, as well as 2047MB of off-heap memory, will not cause an error at startup, but when it's time to grow the heap an OutOfMemoryError is likely.
If both -Xms3G and -Xmx3G are used with 2047MB of off-heap memory, the virtual machine will start but then complain as soon as the off-heap store tries to allocate the off-heap buffers.
Some APIs, such as java.util.zip.ZipFile on a 1.5 JVM, may <mmap> files in memory. This will also use up process space and may trigger an OutOfMemoryError.

Tuning Off-Heap Store Performance

Memory-related or performance issues that arise during operations can be related to improper allocation of memory to the off-heap store. If performance or functional issues arise that can be traced back to the off-heap store, see the suggested tuning tips in this section.

General Memory allocation

Committing too much of a system's physical memory is likely to result in paging of virtual memory to disk, quite likely during garbage-collection operations, leading to significant performance issues. On systems with multiple Java processes, or multiple processes in general, the sum of the Java heaps and off-heap stores for those processes should also not exceed the size of the physical RAM in the system. Besides memory allocated to the heap, Java processes require memory for other items, such as code (classes), stacks, and PermGen.

Note that MaxDirectMemorySize sets an upper limit for the JVM to enforce, but does not actually allocate the specified memory. Overallocation of direct memory (or buffer) space is therefore possible, and could lead to paging or even memory-related errors. The limit on direct buffer space set by MaxDirectMemorySize should take into account the total physical memory available, the amount of memory that is allotted to the JVM object heap, and the portion of direct buffer space that other Java processes may consume.

In addition, be sure to allocate at least 15 percent more off-heap memory than the size of your data set. To maximize performance, a portion of off-heap memory is reserved for meta-data and other purposes.

Note also that there could be other users of direct buffers (such as NIO and certain frameworks and containers). Consider allocating additional direct buffer memory to account for that additional usage.

Compressed References

For 64-bit JVMs running Java 6 Update 14 or higher, consider enabling compressed references to improve overall performance. For heaps up to 32GB, this feature causes references to be stored at half the size, as if the JVM is running in 32-bit mode, freeing substantial amounts of heap for memory-intensive applications. The JVM, however, remains in 64-bit mode, retaining the advantages of that mode.

For the Oracle HotSpot, compressed references are enabled using the option -XX:+UseCompressedOops. For IBM JVMs, use -Xcompressedrefs.

Slow Off-Heap Allocation

Based configuration, usage, and memory requirements, BigMemory Go could allocate off-heap memory multiple times. If off-heap memory comes under pressure due to over-allocation, the host OS may begin paging to disk, thus slowing down allocation operations. As the situation worsens, an off-heap buffer too large to fit in memory can quickly deplete critical system resources such as RAM and swap space and crash the host OS.

To stop this situation from degrading, off-heap allocation time is measured to avoid allocating buffers too large to fit in memory. If it takes more than 1.5 seconds to allocate a buffer, a warning is issued. If it takes more than 15 seconds, the JVM is halted with System.exit() (or a different method if the Security Manager prevents this).

To prevent a JVM shutdown after a 15-second delay has occurred, set the net.sf.ehcache.offheap.DoNotHaltOnCriticalAllocationDelay system property to true. In this case, an error is logged instead.

Swapiness and Huge Pages

An OS could swap data from memory to disk even if memory is not running low. For the purpose of optimization, data that appears to be unused may be a target for swapping. Because BigMemory Go can store substantial amounts of data in RAM, its data may be swapped by the OS. But swapping can degrade overall cluster performance by introducing thrashing, the condition where data is frequently moved forth and back between memory and disk.

To make heap memory use more efficient, Linux, Microsoft Windows, and Oracle Solaris users should review their configuration and usage of swappiness as well as the size of the swapped memory pages. In general, BigMemory Go benefits from lowered swappiness and the use of huge pages (also known as big pages, large pages, and superpages).

Settings for these behaviors vary by OS and JVM. For Oracle HotSpot, -XX:+UseLargePages and -XX:LargePageSizeInBytes=<size> (where <size> is a value allowed by the OS for specific CPUs) can be used to control page size. However, note that this setting does not affect how off-heap memory is allocated. Over-allocating huge pages while also configuring substantial off-heap memory can starve off-heap allocation and lead to memory and performance problems.

Maximum Serialized Size of an Element

This section applies when using BigMemory through the Ehcache API.

Unlike the memory and the disk stores, by default the off-heap store has a 4MB limit for classes with high quality hashcodes, and 256KB limit for those with pathologically bad hashcodes. The built-in classes such as String and the java.lang.Number subclasses Long and Integer have high quality hashcodes. This can issues when objects are expected to be larger than the default limits.

To override the default size limits, set the system property net.sf.ehcache.offheap.cache_name.config.idealMaxSegmentSize to the size you require.

For example,

net.sf.ehcache.offheap.com.company.domain.State.config.idealMaxSegmentSize=30M

Reducing Faulting

While the memory store holds a hotset (a subset) of the entire data set, the off-heap store should be large enough to hold the entire data set. The frequency of misses (get operations that fail to find the data in memory) begins to rise when the data is too large to fit into off-heap memory, forcing gets to fetch data from the disk store (called faulting). More misses in turn raise latency and lower performance.

For example, tests with a 4GB data set and a 5GB off-heap store recorded no misses. With the off-heap store reduced to 4GB, 1.7 percent of cache operations resulted in misses. With the off-heap store at 3GB, misses reached 15 percent.

Disk Store

The disk store provides a thread-safe disk-spooling facility that can be used for additional storage and for persisting data through system restarts.

For more information about data persistence on disk, refer to the Fast Restartability page.

Serialization

Only data that is Serializable can be placed in the disk store. Writes to and from the disk use ObjectInputStream and the Java serialization mechanism. Any non-serializable data overflowing to the disk store is removed and a NotSerializableException is thrown.

Serialization speed is affected by the size of the objects being serialized and their type. It has been found that:

The serialization time for a Java object consisting of a large Map of String arrays was 126ms, where the serialized size was 349,225 bytes.
The serialization time for a byte[] was 7ms, where the serialized size was 310,232 bytes.

Byte arrays are 20 times faster to serialize, making them a better choice for increasing disk-store performance.

Storage Options

Two disk-store options are available:

Temporary swap – Allows the memory and off-heap stores to overflow to disk when they become full. This makes the disk a temporary store because overflow data does not survive restarts or failures. When the node is restarted, any existing data on disk is cleared because it is not designed to be reloaded.
Restartable store – Mirrors the data in memory and provides failure recovery of data. When the node is restarted, the data set is reloaded from disk to the in-memory stores.