Motivation
Sometimes it's important to have only one instance of a class. For example, a system should have only one window manager (or only one file system, or only one print spooler). Singletons are usually used for centralized management of internal or external resources, and they provide a global point of access to themselves.
The Singleton pattern is one of the simplest design patterns. It involves a single class which is responsible for instantiating itself, making sure it creates only one instance, and at the same time providing a global point of access to that instance. The instance can then be used from anywhere, without calling the constructor directly each time.
Intent
Ensure that only one instance of a class is created.
Provide a global point of access to the object.
Implementation
The implementation involves a static member in the Singleton class, a private constructor, and a static public method that returns a reference to the static member.
The Singleton pattern defines a getInstance() operation which exposes the unique instance accessed by the clients. getInstance() is a class operation and is responsible for creating its own unique instance if it has not been created yet.
class Singleton {
    private static Singleton m_instance;

    private Singleton() {
        ...
    }

    public static synchronized Singleton getInstance() {
        if (m_instance == null)
            m_instance = new Singleton();
        return m_instance;
    }

    ...

    public void doSomething() {
        ...
    }
}
In the code above it can be seen that the getInstance() method ensures that only one instance of the class is created. The constructor should not be accessible from outside the class, so that the only way of instantiating the class is through the getInstance() method.
The getInstance() method is also used to provide a global point of access to the object, and it can be used like this: Singleton.getInstance().doSomething();
Applicability & Examples
According to the definition, the Singleton pattern should be used when there must be exactly one instance of a class, and when it must be accessible to clients from a global access point. Here are some real situations where the singleton is used:
Example 1 - Logger Classes
The Singleton pattern is used in the design of logger classes. These classes are usually implemented as singletons and provide a global logging access point to all the application components, without it being necessary to create an object each time a logging operation is performed.
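As a minimal sketch of this idea (the Logger class, its log() method and the output format are illustrative, not a real logging library):

```java
// Hypothetical logger implemented as an eagerly created singleton.
class Logger {
    private static final Logger instance = new Logger();

    private Logger() { }

    public static Logger getInstance() {
        return instance;
    }

    public void log(String message) {
        // A real logger would write to a file; this sketch just prints.
        System.out.println("[LOG] " + message);
    }
}
```

Any component can then write a log entry with Logger.getInstance().log("Application started"); without creating a logger object of its own.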
Example 2 - Configuration Classes
The Singleton pattern is used to design the classes that provide the configuration settings for an application. By implementing configuration classes as singletons we not only provide a global access point, but also keep the instance we use as a cache object. When the class is instantiated (or when a value is first read), the singleton keeps the values in its internal structure. If the values are read from a database or from files, this avoids reloading them each time the configuration parameters are used.
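A sketch of such a cached configuration singleton follows; the property names and hard-coded values stand in for settings that would really come from a file or database:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical configuration singleton; the map acts as the cache described above.
class Configuration {
    private static final Configuration instance = new Configuration();
    private final Map<String, String> settings = new HashMap<>();

    private Configuration() {
        // Loaded once when the singleton is created; later reads hit the cache
        // instead of going back to the file or database. Values are illustrative.
        settings.put("db.url", "jdbc:example://localhost/app");
        settings.put("log.level", "INFO");
    }

    public static Configuration getInstance() {
        return instance;
    }

    public String get(String key) {
        return settings.get(key);
    }
}
```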
Example 3 - Accessing resources in shared mode
It can be used in the design of an application that needs to work with the serial port. Let's say there are many classes in the application, working in a multithreading environment, that need to perform actions on the serial port. In this case a singleton with synchronized methods can be used to manage all the operations on the serial port.
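The idea can be sketched like this; SerialPortManager is a hypothetical class, and a StringBuilder stands in for the real device so the example is self-contained:

```java
// Hypothetical manager serializing access to a shared resource.
class SerialPortManager {
    private static SerialPortManager instance;
    private final StringBuilder port = new StringBuilder(); // stand-in for the real serial port

    private SerialPortManager() { }

    // synchronized getInstance(): safe when called from several threads
    public static synchronized SerialPortManager getInstance() {
        if (instance == null) {
            instance = new SerialPortManager();
        }
        return instance;
    }

    // synchronized methods: two threads cannot interleave their writes
    public synchronized void send(String data) {
        port.append(data);
    }

    public synchronized String sentSoFar() {
        return port.toString();
    }
}
```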
Example 4 - Factories implemented as Singletons
Let's assume that we design an application with a factory to generate new objects (Account, Customer, Site, Address objects) with their ids, in a multithreading environment. If the factory is instantiated twice in two different threads, it is possible to get two overlapping ids for two different objects. If we implement the factory as a singleton, we avoid this problem. Combining the Abstract Factory or Factory Method and Singleton design patterns is a common practice.
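A sketch of such a factory (Account is an illustrative product class, and AtomicLong is one simple way to hand out unique ids across threads):

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative product class.
class Account {
    final long id;
    Account(long id) { this.id = id; }
}

// Hypothetical factory implemented as a singleton: one shared counter,
// so two threads can never receive overlapping ids.
class AccountFactory {
    private static final AccountFactory instance = new AccountFactory();
    private final AtomicLong nextId = new AtomicLong(1);

    private AccountFactory() { }

    public static AccountFactory getInstance() {
        return instance;
    }

    public Account create() {
        return new Account(nextId.getAndIncrement());
    }
}
```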
Specific problems and implementation
Thread-safe implementation for multithreading use.
Lazy instantiation using the double-checked locking mechanism.
The standard implementation shown in the code above is thread safe, but it is not the best thread-safe implementation, because synchronization is very expensive where performance is concerned. Note that the synchronized getInstance() method does not need to keep synchronizing after the object has been initialized: once the singleton object exists, we can simply return it without entering any synchronized block. The optimization consists in checking, in an unsynchronized block, whether the object is null, and only if it is, checking again and creating it inside a synchronized block. This is called the double-checked locking mechanism.
In this case the singleton instance is created when the getInstance() method is called for the first time. This is called lazy instantiation, and it ensures that the singleton instance is created only when it is needed.
//Lazy instantiation using the double-checked locking mechanism.
class Singleton {
    // volatile is required for double-checked locking to be safe
    // under the Java memory model (see the article linked below)
    private static volatile Singleton m_instance;

    private Singleton() {
        System.out.println("Singleton(): Initializing Instance");
    }

    public static Singleton getInstance() {
        if (m_instance == null) {
            synchronized (Singleton.class) {
                if (m_instance == null) {
                    System.out.println("getInstance(): First time getInstance was invoked!");
                    m_instance = new Singleton();
                }
            }
        }
        return m_instance;
    }

    public void doSomething() {
        System.out.println("doSomething(): Singleton does something!");
    }
}
A detailed discussion of the double-checked locking mechanism can be found at http://www-128.ibm.com/developerworks/java/library/j-dcl.html?loc=j
Early instantiation using implementation with static field
In the following implementation the singleton object is instantiated when the class is loaded, not when it is first used, because the m_instance member is declared static and initialized in place. This is why in this implementation we don't need to synchronize any portion of the code: the class is loaded only once, which guarantees the uniqueness of the object.
//Early instantiation using implementation with static field.
class Singleton {
    private static Singleton m_instance = new Singleton();

    private Singleton() {
        System.out.println("Singleton(): Initializing Instance");
    }

    public static Singleton getInstance() {
        return m_instance;
    }

    public void doSomething() {
        System.out.println("doSomething(): Singleton does something!");
    }
}
Protected constructor
It is possible to use a protected constructor in order to permit subclassing of the singleton. This technique has two drawbacks, and this is why Singleton is not usually used as a base class:
First of all, if the constructor is protected, the class can be instantiated by calling the constructor from another class in the same package. A possible solution is to put the singleton in a separate package.
Second of all, in order to use the derived class, all the getInstance() calls in the existing code would have to be changed from Singleton.getInstance() to NewSingleton.getInstance().
Multiple singleton instances if classes loaded by different classloaders access a singleton.
If a class (same name, same package) is loaded by two different classloaders, the two copies represent two different classes in memory, each with its own static instance.
Serialization
If the Singleton class implements the java.io.Serializable interface, then when a singleton is serialized and deserialized more than once, multiple instances of Singleton will be created. In order to avoid this, the readResolve method should be implemented. See Serializable and the readResolve method in the javadocs.
public class Singleton implements Serializable {
    ...

    // This method is called immediately after an object of this class is deserialized.
    // This method returns the singleton instance.
    protected Object readResolve() {
        return getInstance();
    }
}
Abstract Factory and Factory Methods implemented as singletons.
There are certain situations when a factory should be unique. Having two factories might have undesired effects when objects are created. To ensure that a factory is unique, it should be implemented as a singleton. By doing so we also avoid instantiating the class before it is used.
Hot Spot:
Multithreading - Special care should be taken when a singleton has to be used in a multithreading application.
Serialization - When singletons implement the Serializable interface, they have to implement the readResolve method in order to avoid having two different objects.
Classloaders - If the Singleton class is loaded by two different class loaders, we'll have two different classes, one for each class loader.
Global access point represented by the class name - The singleton instance is obtained using the class name. At first view this is an easy way to access it, but it is not very flexible. If we need to replace the Singleton class, all the references in the code should be changed accordingly.
Monday, March 7, 2011
Wednesday, March 2, 2011
What is a Java Thread and How does it work?
A Java thread is an execution context or a lightweight process. It is a single sequential flow of control within a program. Programmers may use the Java thread mechanism to execute multiple tasks at the same time.
Thread class and run() Method
* Basic support for threads is in the java.lang.Thread class. It provides a thread API and all the generic behavior for threads. These behaviors include starting, sleeping, running, yielding, and having a priority.
* The run() method gives a thread something to do. Its code should implement the thread's running behavior.
There are two ways of creating a customized thread:
o Subclassing java.lang.Thread and overriding the run() method.
o Implementing the java.lang.Runnable interface.
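The two approaches from the list above can be sketched as follows (Worker and Task are illustrative names):

```java
// 1. Subclassing java.lang.Thread and overriding run()
class Worker extends Thread {
    @Override
    public void run() {
        System.out.println("Worker thread running");
    }
}

// 2. Implementing java.lang.Runnable and handing the object to a Thread
class Task implements Runnable {
    @Override
    public void run() {
        System.out.println("Runnable task running");
    }
}

class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread a = new Worker();
        Thread b = new Thread(new Task());
        a.start();   // start() schedules the thread; it does not call run() directly
        b.start();
        a.join();    // wait for both threads to die
        b.join();
    }
}
```

Implementing Runnable is usually preferred, since the class then remains free to extend something else.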
Thread Scheduling
* When we say that threads are running concurrently, in practice it may not be so. On a computer with single CPU, threads actually run one at a time giving an illusion of concurrency.
* The execution of multiple threads on a single CPU based on some algorithm is called thread scheduling.
* The thread scheduler maintains a pool of all the ready-to-run threads. Based on a fixed-priority algorithm, it allocates the free CPU to one of these threads.
The Life Cycle of a Thread
The following diagram illustrates the various states that a Java thread can be in at any point during its life and which method calls cause a transition to another state.
Thread life cycle
* Ready-to-run
A thread starts its life cycle with a call to start(). For example
MyThread aThread = new MyThread();
aThread.start();
A call to start() will not immediately start the thread's execution but rather will move it to the pool of threads waiting for their turn to be picked for execution. The thread scheduler picks one of the ready-to-run threads based on thread priorities.
* Running
The thread code is being actively executed by the processor. It runs until it is swapped out, becomes blocked, or voluntarily gives up its turn with this static method:
Thread.yield();
Please note that yield() is a static method. Even if it is called on any thread object, it causes the currently executing thread to give up the CPU.
* Waiting
A call to java.lang.Object's wait() method causes the current thread to wait. The thread remains in the "Waiting" state until some other thread invokes the notify() or notifyAll() method of this object. The current thread must own this object's monitor in order to call wait().
* Sleeping
A Java thread may be forced to sleep (be suspended) for some predefined time.
Thread.sleep(milliseconds);
Thread.sleep(milliseconds, nanoseconds);
Please note that the static sleep() method only guarantees that the thread will sleep for the predefined time and will be running some time after the predefined time has elapsed.
For example, a call to sleep(60) will cause the currently executing thread to sleep for 60 milliseconds. The thread will be in the ready-to-run state after that, and will be in the "Running" state only when the scheduler picks it for execution. Thus we can only say that the thread will run some time after 60 milliseconds.
* Blocked on I/O.
A Java thread may enter this state while waiting for data from an I/O device. The thread will move back to Ready-to-Run after the I/O condition changes (such as a byte of data becoming available).
* Blocked on Synchronization.
A Java thread may enter this state while waiting for an object lock. The thread will move back to Ready-to-Run when the lock is acquired.
* Dead
A Java thread enters this state when it has finished working. It may also enter this state if it is terminated by an unrecoverable error condition.
Thread Synchronization
Problems may occur when two threads are trying to access/modify the same object. To prevent such problems, Java uses monitors and the synchronized keyword to control access to an object by a thread.
* Monitor
o A monitor is any class with synchronized code in it.
o A monitor controls its client threads using the wait() and notify() (or notifyAll()) methods.
o wait() and notify() must be called from synchronized code.
o A monitor asks client threads to wait if it is unavailable.
o Normally a call to wait() is placed in a while loop whose condition tests the availability of the monitor. After waiting, the thread resumes execution from the point where it left off.
* Synchronized code and Locks
o Object lock
Each object has a lock, which can be held by at most one thread at a time. The lock controls access to the synchronized code.
o When an executing thread encounters a synchronized statement, it goes into the blocked state and waits until it acquires the object lock. After that, it executes the code block and then releases the lock. While the executing thread owns the lock, no other thread can acquire it. Thus locks and the synchronization mechanism ensure proper execution of code with multiple threads.
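The wait-in-a-while-loop idiom described above can be sketched as follows (the Gate class and its method names are illustrative):

```java
// One thread waits until another thread opens the gate.
class Gate {
    private boolean isOpen = false;

    public synchronized void waitForOpen() throws InterruptedException {
        while (!isOpen) {   // while loop guards against spurious wakeups
            wait();         // releases this object's monitor while waiting
        }
    }

    public synchronized void open() {
        isOpen = true;
        notifyAll();        // wake every thread waiting on this monitor
    }
}
```

A thread that calls waitForOpen() blocks inside wait(), giving up the monitor, until some other thread calls open().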
Thread Priority
A thread's priority is specified with an integer from 1 (the lowest) to 10 (the highest); the constants Thread.MIN_PRIORITY and Thread.MAX_PRIORITY can also be used. By default a thread's priority is 5, which is Thread.NORM_PRIORITY; it can be changed with the setPriority() method.
Thread aThread = Thread.currentThread();
int currentPriority;
currentPriority = aThread.getPriority();
aThread.setPriority( currentPriority + 1 );
Setting priorities may not always have the desired effect, because prioritization schemes may be implemented differently on different platforms. However, if you cannot resist messing with priorities, use higher priorities for threads that frequently block (sleeping or waiting for I/O), and use medium to low priorities for CPU-intensive threads to avoid hogging the processor.
Thread Deadlock
In multithreaded programs, the following problems may occur.
* Deadlock (or deadly embrace) occurs when two or more threads are trying to gain control of the same objects, and each one holds a lock on another resource that the others need in order to proceed.
* For example, when thread A is waiting for the lock on object P while holding the lock on object Q, and at the same time thread B is holding the lock on object P and waiting for the lock on object Q, a deadlock occurs.
* Please note that if a thread holds a lock and goes to sleep, it does not lose the lock. Likewise, a thread blocked on a synchronized statement does not release the locks it already holds; only wait() releases the object's monitor. This is why deadlocks remain possible.
* Java does not provide any mechanisms for detection or control of deadlock situations, so the programmer is responsible for avoiding them.
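One standard way to avoid the A/B scenario above is to make every thread acquire the locks in the same order, so the circular wait can never form. A self-contained sketch (the lock objects and counter are illustrative):

```java
// Both threads take lockP before lockQ; with a consistent ordering the
// circular wait that causes deadlock cannot occur.
class LockOrderingDemo {
    private static final Object lockP = new Object();
    private static final Object lockQ = new Object();
    static int shared = 0;

    static void update() {
        synchronized (lockP) {
            synchronized (lockQ) {
                shared++;   // protected by both locks
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) update(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 1000; i++) update(); });
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(shared);   // 2000: no lost updates and no deadlock
    }
}
```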
Thursday, February 24, 2011
What is the difference between JRE,JVM and JDK?
JDK (Java Development Kit)
The Java Development Kit contains the tools needed to develop Java programs, plus a JRE to run the programs. The tools include the compiler (javac.exe), the Java application launcher (java.exe), Appletviewer, etc.
The compiler converts Java code into bytecode. The Java application launcher opens a JRE, loads the class, and invokes its main method.
You need the JDK if you want to write your own programs and compile them. For running Java programs, the JRE is sufficient.
The JRE is targeted at the execution of Java files,
i.e. JRE = JVM + Java package classes (like util, math, lang, awt, swing, etc.) + runtime libraries.
The JDK is mainly targeted at Java development, i.e. you can create a Java file (with the help of the Java packages), compile a Java file, and run a Java file.
JRE (Java Runtime Environment)
The Java Runtime Environment contains the JVM, class libraries, and other supporting files. It does not contain any development tools such as a compiler or debugger. The JVM actually runs the program, using the class libraries and other supporting files provided in the JRE. If you want to run any Java program, you need to have the JRE installed on the system.
The Java Virtual Machine provides a platform-independent way of executing code; programmers can concentrate on writing software, without having to be concerned with how or where it will run.
If you just want to run applets (e.g. online Yahoo games or puzzles), the JRE needs to be installed on the machine.
JVM (Java Virtual Machine)
As we are all aware, when we compile a Java file the output is not an 'exe' but a '.class' file, which consists of Java bytecodes that are understandable by the JVM. The Java Virtual Machine interprets the bytecode into machine code depending on the underlying operating system and hardware combination. It is responsible for things like garbage collection, array bounds checking, etc. The JVM itself is platform dependent.
The JVM is called "virtual" because it provides a machine interface that does not depend on the underlying operating system and machine hardware architecture. This independence from hardware and operating system is a cornerstone of the write-once run-anywhere value of Java programs.
There are different JVM implementations, which may differ in things like performance, reliability, speed, etc. They differ in areas where the Java specification doesn't mandate how to implement a feature; for example, how the garbage collection process works is JVM dependent, because the Java spec doesn't define any specific way to do it.
How To Manage Memory With Java
Java provides automatic garbage collection, but sometimes you will want to know how large the object heap is and how much of it is left. You can use this information to gauge the efficiency of your program, i.e. you can get an idea of how many more objects you can instantiate. To obtain these values, use the totalMemory() and freeMemory() methods.
The Java garbage collector runs periodically to find unreachable objects and recycle their memory. It runs at unpredictable times: you never know when its next run will be. However, you can also request a collection whenever you want: if you think you are running out of memory, you can ask the garbage collector to sweep unused memory by calling the gc() method (for example as System.gc()). A good practice is to first call gc() and then call freeMemory() to get the base memory usage; next execute your code, and then see how much memory it is occupying by calling freeMemory() again.
NOTE: The methods gc(), totalMemory() and freeMemory() are part of the Runtime class (for more on Runtime, refer to its API documentation). A gc() method is also available in the System class, where it is static.
Let's understand this with an example:
public class Memoryusage {
    public static void main(String a[]) {
        Runtime rt = Runtime.getRuntime();
        long mem1, mem2;
        String toomuch[] = new String[20000];
        System.out.println("Total memory is : " + rt.totalMemory());
        mem1 = rt.freeMemory();
        System.out.println("Initial Free Memory : " + mem1);
        rt.gc();
        mem1 = rt.freeMemory();
        System.out.println("Memory after Garbage Collection : " + mem1);
        for (int i = 0; i < 20000; i++)
            toomuch[i] = new String("String Array");
        mem2 = rt.freeMemory();
        System.out.println("Memory after allocation : " + mem2);
        System.out.println("Memory used by allocation : " + (mem1 - mem2));
        for (int i = 0; i < 1000; i++)
            toomuch[i] = null;
        rt.gc();
        mem2 = rt.freeMemory();
        System.out.println("Memory after deallocating memory : " + mem2);
    }
}
Output:-
Total memory is : 5177344
Initial Free Memory : 4898608
Memory after Garbage Collection :4986512
Memory after allocation :4505976
Memory used by allocation : 480536
Memory after deallocating memory : 4530512
Output is Machine specific and may vary on your machine.
Now you can see how the gc(), freeMemory() and totalMemory() methods work. The Runtime class cannot be instantiated directly, but we can obtain the current runtime with the static getRuntime() method, as we did here, and then call these methods on it.
Wednesday, February 23, 2011
http://www.javadev.org/files/Hibernate%20Performance%20Tuning.pdf
Hibernate Performance Tuning
Fetching strategies
A fetching strategy is the strategy Hibernate will use for retrieving associated objects if the application needs to navigate the association. Fetch strategies may be declared in the O/R mapping metadata, or overridden by a particular HQL or Criteria query.
Hibernate3 defines the following fetching strategies:
• Join fetching ‐ Hibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
• Select fetching ‐ a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
• Subselect fetching ‐ a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
• Batch fetching ‐ an optimization strategy for select fetching ‐ Hibernate retrieves a batch of entity instances or collections in a single SELECT, by specifying a list of primary keys or foreign keys.
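In the mapping metadata, these strategies are chosen per association. As a hedged sketch (the Item/Bid names and the ITEM_ID column are assumptions borrowed from the example later in this document, not Hibernate requirements), a collection could declare select fetching like this:

```xml
<set name="bids" lazy="true" fetch="select">
    <key column="ITEM_ID"/>
    <one-to-many class="Bid"/>
</set>
```

Changing fetch to "join" would instead pull the bids in the same SELECT as their owning Item.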
Hibernate also distinguishes between:
• Immediate fetching ‐ an association, collection or attribute is fetched immediately, when the owner is
loaded.
• Lazy collection fetching ‐ a collection is fetched when the application invokes an operation upon that collection. (This is the default for collections.)
• "Extra‐lazy" collection fetching ‐ individual elements of the collection are accessed from the database as needed. Hibernate tries not to fetch the whole collection into memory unless absolutely needed (suitable for very large collections)
• Proxy fetching ‐ a single‐valued association is fetched when a method other than the identifier getter is invoked upon the associated object.
• "No‐proxy" fetching ‐ a single‐valued association is fetched when the instance variable is accessed. Compared to proxy fetching, this approach is less lazy (the association is fetched even when only the identifier is accessed) but more transparent, since no proxy is visible to the application. This approach requires buildtime bytecode instrumentation and is rarely necessary.
• Lazy attribute fetching ‐ an attribute or single valued association is fetched when the instance variable is accessed. This approach requires buildtime bytecode instrumentation and is rarely necessary.
Solving the n+1 selects problem
The biggest performance killer in applications that persist objects to SQL databases is the n+1 selects problem. When you tune the performance of a Hibernate application, this problem is usually the first thing you'll need to address. It's normal (and recommended) to map almost all associations for lazy initialization.
This means you generally set all collections to lazy="true" and even change some of the one‐to‐one and many‐to‐one associations to not use outer joins by default. This is the only way to avoid retrieving all objects in the database in every transaction. Unfortunately, this decision exposes you to the n+1 selects problem.
It’s easy to understand this problem by considering a simple query that retrieves all Items for a particular user:
Iterator items = session.createCriteria(Item.class)
.add( Expression.eq("item.seller", user) )
.list()
.iterator();
This query returns a list of items, where each collection of bids is an uninitialized collection wrapper. Suppose that we now wish to find the maximum bid for each item. The following code would be one way to do this:
List maxAmounts = new ArrayList();
while (items.hasNext()) {
Item item = (Item) items.next();
BigDecimal maxAmount = new BigDecimal("0");
for ( Iterator b = item.getBids().iterator(); b.hasNext(); ) {
Bid bid = (Bid) b.next();
if ( bid.getAmount().compareTo(maxAmount) == 1 )
maxAmount = bid.getAmount();
}
maxAmounts.add( new MaxAmount( item.getId(), maxAmount ) );
}
But there is a huge problem with this solution (aside from the fact that this would be much better executed in the database using aggregation functions): Each time we access the collection of bids, Hibernate must fetch this lazy collection from the database for each item. If the initial query returns 20 items, the entire transaction requires 1 initial select that retrieves the items plus 20 additional selects to load the bids collections of each item. This might easily result in unacceptable latency in a system that accesses the database across a network. You usually won't write such a loop deliberately, because the inefficiency is easy to spot when it sits in one routine.
However, the n+1 selects problem is often hidden in more complex application logic, and you may not recognize it by looking at a single routine.
Batch fetching
The first attempt to solve this problem might be to enable batch fetching. We change our mapping for the bids collection to look like this:
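The mapping fragment did not survive extraction in this copy. A typical batch-fetching declaration consistent with the surrounding text (batch-size="10" matches the "next 10 collections" described below; the element and column names are assumptions carried over from the running Item/Bid example) might be:

```xml
<set name="bids" lazy="true" batch-size="10">
    <key column="ITEM_ID"/>
    <one-to-many class="Bid"/>
</set>
```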
With batch fetching enabled, Hibernate pre‐fetches the next 10 collections when the first collection is accessed. This reduces the problem from n+1 selects to n/10 + 1 selects. For many applications, this may be sufficient to achieve acceptable latency. On the other hand, it also means that in some other transactions, collections are fetched unnecessarily. It isn’t the best we can do in terms of reducing the number of round trips to the database.
HQL aggregation
A much, much better solution is to take advantage of HQL aggregation and perform the work of calculating the maximum bid in the database. Thus we avoid the problem:
String query = "select new MaxAmount( item.id, max(bid.amount) )"
             + " from Item item join item.bids bid"
             + " where item.seller = :user group by item.id";
List maxAmounts = session.createQuery(query).setEntity("user", user).list();
Unfortunately, this isn’t a complete solution to the generic issue. In general, we may need to do more complex processing on the bids than merely calculating the maximum amount. We’d prefer to do this processing in the Java application.
Eager fetching
We can try enabling eager fetching at the level of the mapping document:
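The mapping fragment is likewise missing from this copy. An eager-fetching declaration using the outer-join attribute that the following paragraph discusses (element and column names are again assumptions from the running example) would be roughly:

```xml
<set name="bids" outer-join="true">
    <key column="ITEM_ID"/>
    <one-to-many class="Bid"/>
</set>
```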
The outer-join attribute is available for collections and other associations. It forces Hibernate to load the association eagerly, using an SQL outer join. Note that, as previously mentioned, HQL queries ignore the outer-join attribute; but we might be using a criteria query. This mapping avoids the problem as far as this transaction is concerned; we're now able to load all bids in the initial select. Unfortunately, any other transaction that retrieves items using get(), load(), or a criteria query will also retrieve all the bids at once. Retrieving unnecessary data imposes extra load on both the database server and the application server and may also reduce the concurrency of the system, creating too many unnecessary read locks at the database level. Hence we consider eager fetching at the level of the mapping file to be almost always a bad approach. The outer-join attribute of collection mappings is arguably a misfeature of Hibernate (fortunately, it's disabled by default). Occasionally it makes sense to enable outer-join for a <many-to-one> or <one-to-one> association (the default is auto), but we'd never do this in the case of a collection.
Runtime (code-level) declarations
The recommended solution for this problem is to take advantage of Hibernate’s support for runtime (code‐level) declarations of association fetching strategies. The example can be implemented like this:
List results = session.createCriteria(Item.class)
.add( Expression.eq("item.seller", user) )
.setFetchMode("bids", FetchMode.EAGER)
.list();
// Make results distinct
Iterator items = new HashSet(results).iterator();
List maxAmounts = new ArrayList();
for ( ; items.hasNext(); ) {
Item item = (Item) items.next();
BigDecimal maxAmount = new BigDecimal("0");
for ( Iterator b = item.getBids().iterator(); b.hasNext(); ) {
Bid bid = (Bid) b.next();
if ( bid.getAmount().compareTo(maxAmount) == 1 )
maxAmount = bid.getAmount();
}
maxAmounts.add( new MaxAmount( item.getId(), maxAmount ) );
}
We disabled batch fetching and eager fetching at the mapping level; the collection is lazy by default. Instead, we enable eager fetching for this query alone by calling setFetchMode(). As discussed earlier in this chapter, this is equivalent to a fetch join in the from clause of an HQL query. The previous code example has one extra complication: The result list returned by the Hibernate criteria query isn’t guaranteed to be distinct. In the case of a query that fetches a collection by outer join, it will contain duplicate items. It’s the application’s responsibility to make the results distinct if that is required. We implement this by adding the results to a HashSet and then iterating the set.
So, we have established a general solution to the n+1 selects problem. Rather than retrieving just the top‐level objects in the initial query and then fetching needed associations as the application navigates the object graph, we follow a two step process:
1 Fetch all needed data in the initial query by specifying exactly which associations will be accessed in the following unit of work.
2 Navigate the object graph, which will consist entirely of objects that have already been fetched from the database.
This is the only true solution to the mismatch between the object‐oriented world, where data is accessed by navigation, and the relational world, where data is accessed by joining.
Lazy fetching
By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single‐valued associations. These defaults make sense for almost all associations in almost all applications. However, lazy fetching poses one problem that you must be aware of. Access to a lazy association outside of the context of an open Hibernate session will result in an exception. Since the lazy collection was not initialized when the Session was closed, the collection will not be able to load its state. Hibernate does not support lazy initialization for detached objects. The fix is to move the code that reads from the collection to just before the transaction is committed. Alternatively, we
could use a non‐lazy collection or association, by specifying lazy="false" for the association mapping. However, it is intended that lazy initialization be used for almost all collections and associations. If you define too many non‐lazy associations in your object model, Hibernate will end up needing to fetch the entire database into memory in every transaction!
Initializing collections and proxies
A LazyInitializationException will be thrown by Hibernate if an uninitialized collection or proxy is accessed outside of the scope of the Session, ie. when the entity owning the collection or having the reference to the proxy is in the detached state.
Sometimes we need to ensure that a proxy or collection is initialized before closing the Session. Of course, we can always force initialization by calling item.getBids() or item.getBids().size(), for example. But that is confusing to readers of the code and is not convenient for generic code. The static methods Hibernate.initialize() and Hibernate.isInitialized() provide the application with a convenient way of working with lazily initialized collections or proxies. Hibernate.initialize(item) will force the initialization of a proxy, item, as long as its Session is still open. Hibernate.initialize(item.getBids() ) has a similar effect for the collection of bids. Another option is to keep the Session open until all needed collections and proxies have been loaded. In some application architectures, particularly where the code that accesses data using Hibernate, and the code that uses it are in different application layers or different physical processes, it can be a problem to ensure that the Session is open when a collection is initialized. There are two basic ways to deal with this issue:
• In a web-based application, a servlet filter can be used to close the Session only at the very end of a user request, once the rendering of the view is complete (the Open Session in View pattern). Of course, this places heavy demands on the correctness of the exception handling of your application infrastructure. It is vitally important that the Session is closed and the transaction ended before returning to the user, even when an exception occurs during rendering of the view.
• In an application with a separate business tier, the business logic must "prepare" all collections that will be needed by the web tier before returning. This means that the business tier should load all the data required for a particular use case and return it, already initialized, to the presentation/web tier. Usually, the application calls Hibernate.initialize() for each collection that will be needed in the web tier (this call must occur before the session is closed) or retrieves the collection eagerly using a Hibernate query with a FETCH clause or a FetchMode.JOIN in Criteria. This is usually easier if you adopt the Command pattern instead of a Session Facade.
• You may also attach a previously loaded object to a new Session with merge() or lock() before accessing uninitialized collections (or other proxies). No, Hibernate does not, and certainly should not, do this automatically, since it would introduce ad hoc transaction semantics!
The Second Level Cache
A Hibernate Session is a transaction-level cache of persistent data. It is possible to configure a cluster-wide or JVM-level (SessionFactory-level) cache on a class-by-class and collection-by-collection basis. You may even plug in a clustered cache. Be careful: caches are never aware of changes made to the persistent store by another application (though they may be configured to regularly expire cached data).
You have the option to tell Hibernate which caching implementation to use by specifying the name of a class that implements org.hibernate.cache.CacheProvider using the property hibernate.cache.provider_class. Hibernate comes bundled with a number of built‐in integrations with open‐source cache providers (listed below); additionally, you could implement your own and plug it in as outlined above.
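For instance, selecting the bundled EHCache integration is a single property, shown here in hibernate.properties form (the same key can equally be set as a <property> element in hibernate.cfg.xml):

```
hibernate.cache.provider_class=org.hibernate.cache.EhCacheProvider
```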
Cache Providers
Cache                                       | Provider class                             | Type                                    | Cluster Safe                 | Query Cache Supported
Hashtable (not intended for production use) | org.hibernate.cache.HashtableCacheProvider | memory                                  |                              | yes
EHCache                                     | org.hibernate.cache.EhCacheProvider        | memory, disk                            |                              | yes
OSCache                                     | org.hibernate.cache.OSCacheProvider        | memory, disk                            |                              | yes
SwarmCache                                  | org.hibernate.cache.SwarmCacheProvider     | clustered (ip multicast)                | yes (clustered invalidation) |
JBoss TreeCache                             | org.hibernate.cache.TreeCacheProvider      | clustered (ip multicast), transactional | yes (replication)            | yes (clock sync req.)
Cache mappings
The <cache> element of a class or collection mapping has the following form:
<cache
    usage="transactional|read-write|nonstrict-read-write|read-only"    (1)
    region="RegionName"                                                (2)
    include="all|non-lazy"                                             (3)
/>
(1) usage (required) specifies the caching strategy: transactional, read-write, nonstrict-read-write or read-only
(2) region (optional, defaults to the class or collection role name) specifies the name of the second-level cache region
(3) include (optional, defaults to all) non-lazy specifies that properties of the entity mapped with lazy="true" may not be cached when attribute-level lazy fetching is enabled
Alternatively (and perhaps preferably), you may specify <class-cache> and <collection-cache> elements in hibernate.cfg.xml.
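A sketch of that alternative (the auction package and entity names here are our own assumptions, not taken from this document):

```xml
<class-cache class="auction.Item" usage="read-write"/>
<collection-cache collection="auction.Item.bids" usage="read-write"/>
```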
Strategy: read only
If your application needs to read but never modify instances of a persistent class, a read‐only cache may be used. This is the simplest and best performing strategy. It's even perfectly safe for use in a cluster.
Strategy: read/write
If the application needs to update data, a read‐write cache might be appropriate. This cache strategy should never be used if serializable transaction isolation level is required. If the cache is used in a JTA environment, you must specify the property hibernate.transaction.manager_lookup_class, naming a strategy for obtaining the JTA TransactionManager. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called. If you wish to use this strategy in a cluster, you should ensure that the underlying cache implementation supports locking. The built‐in cache providers do not.
Strategy: nonstrict read/write
If the application only occasionally needs to update data (i.e. if it is extremely unlikely that two transactions would try to update the same item simultaneously) and strict transaction isolation is not required, a nonstrict-read-write cache might be appropriate. If the cache is used in a JTA environment, you must specify hibernate.transaction.manager_lookup_class. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called.
Strategy: transactional
The transactional cache strategy provides support for fully transactional cache providers such as JBoss TreeCache. Such a cache may only be used in a JTA environment and you must specify hibernate.transaction.manager_lookup_class.
Cache Concurrency Strategy Support
Cache                                       | read-only | nonstrict-read-write | read-write | transactional
Hashtable (not intended for production use) | yes       | yes                  | yes        |
EHCache                                     | yes       | yes                  | yes        |
OSCache                                     | yes       | yes                  | yes        |
SwarmCache                                  | yes       | yes                  |            |
JBoss TreeCache                             | yes       |                      |            | yes
Managing the caches
Whenever you pass an object to save(), update() or saveOrUpdate(), and whenever you retrieve an object using load(), get(), list(), iterate() or scroll(), that object is added to the internal cache of the Session. When flush() is subsequently called, the state of that object will be synchronized with the database. If you do not want this synchronization to occur, or if you are processing a huge number of objects and need to manage memory efficiently, the evict() method may be used to remove the object and its collections from the first-level cache.
ScrollableResult items = sess.createQuery("from Item as item").scroll(); //a huge result set
while ( items.next() ) {
Item item = (Item) items.get(0);
doSomethingWithAItem(item);
sess.evict(item);
}
The Session also provides a contains() method to determine if an instance belongs to the session cache. To completely evict all objects from the session cache, call Session.clear() For the second‐level cache, there are methods defined on SessionFactory for evicting the cached state of an instance, entire class, collection instance or entire collection role.
sessionFactory.evict(Item.class, itemId); //evict a particular Item
sessionFactory.evict(Item.class); //evict all Items
sessionFactory.evictCollection("Item.bids", itemId); //evict a particular collection of bids
sessionFactory.evictCollection("Item.bids"); //evict all bid collections
The CacheMode controls how a particular session interacts with the second‐level cache.
• CacheMode.NORMAL - read items from and write items to the second-level cache
• CacheMode.GET - read items from the second-level cache, but don't write to the second-level cache except when updating data
• CacheMode.PUT - write items to the second-level cache, but don't read from the second-level cache
• CacheMode.REFRESH - write items to the second-level cache, but don't read from the second-level cache; bypass the effect of hibernate.cache.use_minimal_puts, forcing a refresh of the second-level cache for all items read from the database
To browse the contents of a second‐level or query cache region, use the Statistics API:
Map cacheEntries = sessionFactory.getStatistics()
.getSecondLevelCacheStatistics(regionName)
.getEntries();
You'll need to enable statistics and, optionally, force Hibernate to keep the cache entries in a more human-readable format:
hibernate.generate_statistics true
hibernate.cache.use_structured_entries true
The Query Cache
Query result sets may also be cached. This is only useful for queries that are run frequently with the same parameters. To use the query cache you must first enable it:
hibernate.cache.use_query_cache true
This setting causes the creation of two new cache regions ‐ one holding cached query result sets
(org.hibernate.cache.StandardQueryCache), the other holding timestamps of the most recent updates to queryable tables (org.hibernate.cache.UpdateTimestampsCache). Note that the query cache does not cache the state of the actual entities in the result set; it caches only identifier values and results of value type. So the query cache should always be used in conjunction with the second‐level cache.
Most queries do not benefit from caching, so by default queries are not cached. To enable caching, call
Query.setCacheable(true). This call allows the query to look for existing cache results or add its results to the cache when it is executed.
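As a sketch (assuming an open Session and the Item/user objects from the earlier examples; this fragment is not runnable on its own), a cached query looks like this:

```java
Query query = session.createQuery("from Item item where item.seller = :user");
query.setEntity("user", user);
query.setCacheable(true);   // check the query cache first, store results on a miss
List items = query.list();
```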
By Nima Goudarzi ‐ July, 2007
References:
‐ Hibernate3 reference
‐ Hibernate in Action (Manning 2005)
Fetching strategies
A fetching strategy is the strategy Hibernate will use for retrieving associated objects if the application needs to navigate the association. Fetch strategies may be declared in the O/R mapping metadata, or over‐ridden by a particular HQL or Criteria query.
Hibernate3 defines the following fetching strategies:
• Join fetching ‐ Hibernate retrieves the associated instance or collection in the same SELECT, using an OUTER JOIN.
• Select fetching ‐ a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
• Subselect fetching ‐ a second SELECT is used to retrieve the associated collections for all entities retrieved in a previous query or fetch. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you actually access the association.
• Batch fetching ‐ an optimization strategy for select fetching ‐ Hibernate retrieves a batch of entity instances or collections in a single SELECT, by specifying a list of primary keys or foreign keys.
Hibernate also distinguishes between:
• Immediate fetching ‐ an association, collection or attribute is fetched immediately, when the owner is
loaded.
• Lazy collection fetching ‐ a collection is fetched when the application invokes an operation upon that collection. (This is the default for collections.)
• "Extra‐lazy" collection fetching ‐ individual elements of the collection are accessed from the database as needed. Hibernate tries not to fetch the whole collection into memory unless absolutely needed (suitable for very large collections)
• Proxy fetching ‐ a single‐valued association is fetched when a method other than the identifier getter is invoked upon the associated object.
• "No‐proxy" fetching ‐ a single‐valued association is fetched when the instance variable is accessed. Compared to proxy fetching, this approach is less lazy (the association is fetched even when only the identifier is accessed) but more transparent, since no proxy is visible to the application. This approach requires buildtime bytecode instrumentation and is rarely necessary.
• Lazy attribute fetching ‐ an attribute or single valued association is fetched when the instance variable is accessed. This approach requires buildtime bytecode instrumentation and is rarely necessary.
Solving the n+1 selects problem
The biggest performance killer in applications that persist objects to SQL databases is the n+1 selects problem. When you tune the performance of a Hibernate application, this problem is the first thing you’ll usually need to address. Its normal (and recommended) to map almost all associations for lazy initialization.
This means you generally set all collections to lazy="true" and even change some of the one‐to‐one and many‐to‐one associations to not use outer joins by default. This is the only way to avoid retrieving all objects in the database in every transaction. Unfortunately, this decision exposes you to the n+1 selects problem.
It’s easy to understand this problem by considering a simple query that retrieves all Items for a particular user:
Iterator items = session.createCriteria(Item.class)
.add( Expression.eq("item.seller", user) )
.list()
.iterator();
This query returns a list of items, where each collection of bids is an uninitialized collection wrapper. Suppose that we now wish to find the maximum bid for each item. The following code would be one way to do this:
List maxAmounts = new ArrayList();
while (items.hasNext()) {
Item item = (Item) items.next();
BigDecimal maxAmount = new BigDecimal("0");
for ( Iterator b = item.getBids().iterator(); b.hasNext(); ) {
Bid bid = (Bid) b.next();
if ( bid.getAmount().compareTo(maxAmount) == 1 )
maxAmount = bid.getAmount();
}
maxAmounts.add( new MaxAmount( item.getId(), maxAmount ) );
}
But there is a huge problem with this solution (aside from the fact that this would be much better executed in the database using aggregation functions): Each time we access the collection of bids, Hibernate must fetch this lazy collection from the database for each item. If the initial query returns 20 items, the entire transaction requires 1 initial select that retrieves the items plus 20 additional selects to load the bids collections of each item. This might easily result in unacceptable latency in a system that accesses the database across a network. Usually you don't explicitly create such operations, because you should quickly see doing so is suboptimal.
However, the n+1 selects problem is often hidden in more complex application logic, and you may not recognize it by looking at a single routine.
Batch fetching
The first attempt to solve this problem might be to enable batch fetching. We change our mapping for the bids collection to look like this:
With batch fetching enabled, Hibernate pre‐fetches the next 10 collections when the first collection is accessed. This reduces the problem from n+1 selects to n/10 + 1 selects. For many applications, this may be sufficient to achieve acceptable latency. On the other hand, it also means that in some other transactions, collections are fetched unnecessarily. It isn’t the best we can do in terms of reducing the number of round trips to the database.
HQL aggregation A much, much better solution is to take advantage of HQL aggregation and perform the work of calculating the maximum bid on the database. Thus we avoid the problem:
String query = "select MaxAmount( item.id, max(bid.amount) )" + " from Item item join item.bids bid" + " where item.seller = :user group by item.id"; List maxAmounts = session.createQuery(query).setEntity("user", user).list();
Unfortunately, this isn’t a complete solution to the generic issue. In general, we may need to do more complex processing on the bids than merely calculating the maximum amount. We’d prefer to do this processing in the Java application.
Eager fetching
We can try enabling eager fetching at the level of the mapping document:
The outer‐join attribute is available for collections and other associations. It forces Hibernate to load the association eagerly, using an SQL outer join. Note that, as previously mentioned, HQL queries ignore the outer‐join attribute; but we might be using a criteria query. This mapping avoids the problem as far as this transaction is concerned; we’re now able to load all bids in the initial select. Unfortunately, any other transaction that retrieves items using get(), load(), or a criteria query will also retrieve all the bids at once. Retrieving unnecessary data imposes extra load on both the database server and the application server and may also reduce the concurrency of the system, creating too many unnecessary read locks at the database level. Hence we consider eager fetching at the level of the mapping file to be almost always a bad approach. The outer‐join attribute of collection mappings is arguably a misfeature of Hibernate (fortunately, it’s disabled by default). Occasionally it makes sense to enable outer‐join for a
Runtime (codelevel) declarations
The recommended solution for this problem is to take advantage of Hibernate’s support for runtime (code‐level) declarations of association fetching strategies. The example can be implemented like this:
List results = session.createCriteria(Item.class)
.add( Expression.eq("item.seller", user) )
.setFetchMode("bids", FetchMode.EAGER)
.list();
// Make results distinct
Iterator items = new HashSet(results).iterator();
List maxAmounts = new ArrayList();
for ( ; items.hasNext(); ) {
Item item = (Item) items.next();
BigDecimal maxAmount = new BigDecimal("0");
for ( Iterator b = item.getBids().iterator(); b.hasNext(); ) {
Bid bid = (Bid) b.next();
if ( bid.getAmount().compareTo(maxAmount) == 1 )
maxAmount = bid.getAmount();
}
maxAmounts.add( new MaxAmount( item.getId(), maxAmount ) );
}
We disabled batch fetching and eager fetching at the mapping level; the collection is lazy by default. Instead, we enable eager fetching for this query alone by calling setFetchMode(). As discussed earlier in this chapter, this is equivalent to a fetch join in the from clause of an HQL query. The previous code example has one extra complication: The result list returned by the Hibernate criteria query isn’t guaranteed to be distinct. In the case of a query that fetches a collection by outer join, it will contain duplicate items. It’s the application’s responsibility to make the results distinct if that is required. We implement this by adding the results to a HashSet and then iterating the set.
So, we have established a general solution to the n+1 selects problem. Rather than retrieving just the top‐level objects in the initial query and then fetching needed associations as the application navigates the object graph, we follow a two step process:
1 Fetch all needed data in the initial query by specifying exactly which associations will be accessed in the following unit of work.
2 Navigate the object graph, which will consist entirely of objects that have already been fetched from the database.
This is the only true solution to the mismatch between the object‐oriented world, where data is accessed by navigation, and the relational world, where data is accessed by joining.
Lazy fetching
By default, Hibernate3 uses lazy select fetching for collections and lazy proxy fetching for single‐valued associations. These defaults make sense for almost all associations in almost all applications. However, lazy fetching poses one problem that you must be aware of. Access to a lazy association outside of the context of an open Hibernate session will result in an exception. Since the lazy collection was not initialized when the Session was closed, the collection will not be able to load its state. Hibernate does not support lazy initialization for detached objects. The fix is to move the code that reads from the collection to just before the transaction is committed. Alternatively, we
could use a non‐lazy collection or association, by specifying lazy="false" for the association mapping. However, it is intended that lazy initialization be used for almost all collections and associations. If you define too many non‐lazy associations in your object model, Hibernate will end up needing to fetch the entire database into memory in every transaction!
Initializing collections and proxies
A LazyInitializationException will be thrown by Hibernate if an uninitialized collection or proxy is accessed outside of the scope of the Session, ie. when the entity owning the collection or having the reference to the proxy is in the detached state.
Sometimes we need to ensure that a proxy or collection is initialized before closing the Session. Of course, we can always force initialization by calling item.getBids() or item.getBids().size(), for example. But that is confusing to readers of the code and is not convenient for generic code. The static methods Hibernate.initialize() and Hibernate.isInitialized() provide the application with a convenient way of working with lazily initialized collections or proxies. Hibernate.initialize(item) will force the initialization of a proxy, item, as long as its Session is still open. Hibernate.initialize(item.getBids() ) has a similar effect for the collection of bids. Another option is to keep the Session open until all needed collections and proxies have been loaded. In some application architectures, particularly where the code that accesses data using Hibernate, and the code that uses it are in different application layers or different physical processes, it can be a problem to ensure that the Session is open when a collection is initialized. There are two basic ways to deal with this issue:
•
In a web‐based application, a servlet filter can be used to close the Session only at the very end of a user request, once the rendering of the view is complete (the Open Session in View pattern). Of course, this places heavy demands on the correctness of the exception handling of your application infrastructure. It is vitally important that the Session is closed and the transaction ended before returning to the user, even when an exception occurs during rendering of the view.
•
In an application with a separate business tier, the business logic must "prepare" all collections that will be needed by the web tier before returning. This means that the business tier should load all the data and return all the data already initialized to the presentation/web tier that is required for a particular use case. Usually, the application calls Hibernate.initialize() for each collection that will be needed in the web tier (this call must occur before the session is closed) or retrieves the collection eagerly using a Hibernate query with a FETCH clause or a FetchMode.JOIN in Criteria. This is usually easier if you adopt the Command pattern instead of a Session Facade.
•
You may also attach a previously loaded object to a new Session with merge() or lock() before accessing uninitialized collections (or other proxies). No, Hibernate does not, and certainly should not do this automatically, since it would introduce ad hoc transaction semantics!
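As a minimal sketch of the second approach, a business-tier method might prepare a collection before the Session is closed. The names ItemDao, Item, itemId and sessionFactory below are assumptions for illustration, not from this article:

```java
// Illustrative sketch: return an Item with its bids collection already
// initialized, so the web tier can safely read it after detachment.
public Item loadItemWithBids(Long itemId) {
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    try {
        Item item = (Item) session.get(Item.class, itemId);
        // Force initialization while the Session is still open
        Hibernate.initialize(item.getBids());
        tx.commit();
        return item; // safe to use in the web tier, even when detached
    } catch (RuntimeException e) {
        tx.rollback();
        throw e;
    } finally {
        session.close();
    }
}
```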
The Second Level Cache
A Hibernate Session is a transaction-level cache of persistent data. It is possible to configure a cluster-wide or JVM-level (SessionFactory-level) cache on a class-by-class and collection-by-collection basis. You may even plug in a clustered cache. Be careful: caches are never aware of changes made to the persistent store by another application (though they may be configured to regularly expire cached data).
You have the option to tell Hibernate which caching implementation to use by specifying the name of a class that implements org.hibernate.cache.CacheProvider using the property hibernate.cache.provider_class. Hibernate comes bundled with a number of built‐in integrations with open‐source cache providers (listed below); additionally, you could implement your own and plug it in as outlined above.
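For example, assuming EHCache is chosen as the provider, a hibernate.properties fragment might look like the following (the hibernate.cache.use_second_level_cache setting is an assumption about a typical configuration, not from this article):

```
# Illustrative hibernate.properties fragment
hibernate.cache.provider_class org.hibernate.cache.EhCacheProvider
hibernate.cache.use_second_level_cache true
```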
Cache Providers
| Cache | Provider class | Type | Cluster Safe | Query Cache Supported |
|---|---|---|---|---|
| Hashtable (not intended for production use) | org.hibernate.cache.HashtableCacheProvider | memory | | yes |
| EHCache | org.hibernate.cache.EhCacheProvider | memory, disk | | yes |
| OSCache | org.hibernate.cache.OSCacheProvider | memory, disk | | yes |
| SwarmCache | org.hibernate.cache.SwarmCacheProvider | clustered (ip multicast) | yes (clustered invalidation) | |
| JBoss TreeCache | org.hibernate.cache.TreeCacheProvider | clustered (ip multicast), transactional | yes (replication) | yes (clock sync req.) |
Cache mappings
The <cache> element of a class or collection mapping has the following form:

<cache
    usage="transactional|read-write|nonstrict-read-write|read-only"  (1)
    region="RegionName"                                              (2)
    include="all|non-lazy"                                           (3)
/>

(1) usage (required) specifies the caching strategy: transactional, read-write, nonstrict-read-write or read-only
(2) region (optional, defaults to the class or collection role name) specifies the name of the second level cache region
(3) include (optional, defaults to all) non-lazy specifies that properties of the entity mapped with lazy="true" may not be cached when attribute-level lazy fetching is enabled

Alternatively (and arguably preferably), you may specify <class-cache> and <collection-cache> elements in hibernate.cfg.xml.
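Cache settings can also be declared centrally in hibernate.cfg.xml with <class-cache> and <collection-cache> elements. An illustrative fragment (the class and collection names are assumed, not from this article):

```xml
<!-- Illustrative hibernate.cfg.xml fragment; class names are assumed -->
<class-cache class="com.example.Item" usage="read-write"/>
<collection-cache collection="com.example.Item.bids" usage="read-write"/>
```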
Strategy: read only
If your application needs to read but never modify instances of a persistent class, a read‐only cache may be used. This is the simplest and best performing strategy. It's even perfectly safe for use in a cluster.
Strategy: read/write
If the application needs to update data, a read‐write cache might be appropriate. This cache strategy should never be used if serializable transaction isolation level is required. If the cache is used in a JTA environment, you must specify the property hibernate.transaction.manager_lookup_class, naming a strategy for obtaining the JTA TransactionManager. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called. If you wish to use this strategy in a cluster, you should ensure that the underlying cache implementation supports locking. The built‐in cache providers do not.
Strategy: nonstrict read/write
If the application only occasionally needs to update data (i.e., if it is extremely unlikely that two transactions would try to update the same item simultaneously) and strict transaction isolation is not required, a nonstrict-read-write cache might be appropriate. If the cache is used in a JTA environment, you must specify hibernate.transaction.manager_lookup_class. In other environments, you should ensure that the transaction is completed when Session.close() or Session.disconnect() is called.
Strategy: transactional
The transactional cache strategy provides support for fully transactional cache providers such as JBoss TreeCache. Such a cache may only be used in a JTA environment and you must specify hibernate.transaction.manager_lookup_class.
Cache Concurrency Strategy Support
| Cache | read-only | nonstrict-read-write | read-write | transactional |
|---|---|---|---|---|
| Hashtable (not intended for production use) | yes | yes | yes | |
| EHCache | yes | yes | yes | |
| OSCache | yes | yes | yes | |
| SwarmCache | yes | yes | | |
| JBoss TreeCache | yes | | | yes |
Managing the caches
Whenever you pass an object to save(), update() or saveOrUpdate(), and whenever you retrieve an object using load(), get(), list(), iterate() or scroll(), that object is added to the internal cache of the Session. When flush() is subsequently called, the state of that object will be synchronized with the database. If you do not want this synchronization to occur, or if you are processing a huge number of objects and need to manage memory efficiently, the evict() method may be used to remove the object and its collections from the first-level cache.
ScrollableResults items = sess.createQuery("from Item as item").scroll(); // a huge result set
while ( items.next() ) {
    Item item = (Item) items.get(0);
    doSomethingWithAnItem(item);
    sess.evict(item);
}
The Session also provides a contains() method to determine if an instance belongs to the session cache. To completely evict all objects from the session cache, call Session.clear(). For the second-level cache, there are methods defined on SessionFactory for evicting the cached state of an instance, an entire class, a collection instance or an entire collection role.
sessionFactory.evict(Item.class, itemId); //evict a particular Item
sessionFactory.evict(Item.class); //evict all Items
sessionFactory.evictCollection("Item.bids", itemId); //evict a particular collection of bids
sessionFactory.evictCollection("Item.bids"); //evict all bid collections
The CacheMode controls how a particular session interacts with the second‐level cache.
• CacheMode.NORMAL - read items from and write items to the second-level cache
• CacheMode.GET - read items from the second-level cache, but don't write to the second-level cache except when updating data
• CacheMode.PUT - write items to the second-level cache, but don't read from the second-level cache
• CacheMode.REFRESH - write items to the second-level cache, but don't read from the second-level cache; bypasses the effect of hibernate.cache.use_minimal_puts, forcing a refresh of the second-level cache for all items read from the database
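As a small illustrative sketch (Item, itemId and sessionFactory are assumed names, not from this article), a session that reads from the second-level cache without writing to it might be set up like this:

```java
// Sketch: open a Session that reads from the second-level cache but does
// not add newly loaded entities to it (useful for bulk read operations).
Session session = sessionFactory.openSession();
session.setCacheMode(CacheMode.GET);
try {
    Item item = (Item) session.get(Item.class, itemId);
    // ... work with item ...
} finally {
    session.close();
}
```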
To browse the contents of a second‐level or query cache region, use the Statistics API:
Map cacheEntries = sessionFactory.getStatistics()
.getSecondLevelCacheStatistics(regionName)
.getEntries();
You'll need to enable statistics and, optionally, force Hibernate to keep the cache entries in a more human-understandable format:
hibernate.generate_statistics true
hibernate.cache.use_structured_entries true
The Query Cache
Query result sets may also be cached. This is only useful for queries that are run frequently with the same parameters. To use the query cache you must first enable it:
hibernate.cache.use_query_cache true
This setting causes the creation of two new cache regions ‐ one holding cached query result sets
(org.hibernate.cache.StandardQueryCache), the other holding timestamps of the most recent updates to queryable tables (org.hibernate.cache.UpdateTimestampsCache). Note that the query cache does not cache the state of the actual entities in the result set; it caches only identifier values and results of value type. So the query cache should always be used in conjunction with the second‐level cache.
Most queries do not benefit from caching, so by default queries are not cached. To enable caching, call
Query.setCacheable(true). This call allows the query to look for existing cache results or add its results to the cache when it is executed.
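A hedged sketch of a cacheable query follows; the HQL, the parameter and the region name are illustrative assumptions, not from this article:

```java
// Sketch: mark a query as cacheable and give it its own cache region so
// its results can be evicted independently of other cached queries.
List results = session.createQuery("from Item item where item.price > :minPrice")
        .setParameter("minPrice", minPrice)
        .setCacheable(true)
        .setCacheRegion("query.itemsByPrice")
        .list();
```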
By Nima Goudarzi ‐ July, 2007
References:
‐ Hibernate3 reference
‐ Hibernate in Action (Manning 2005)