I installed a couple of 11g SOA Suite instances and handed them over to our
development team to work on. For a week the servers seemed to be working fine,
but then my nightmare started: the servers became unstable.
The issues were as follows:
1. The servers became too slow, especially the EM console, making it difficult
for developers to continue working.
2. The EM login page got stuck at user authentication.
3. Data sources went into a suspended state, bringing the EM application to a
halt.
4. Out-of-memory errors.
This is when I took a look at the JVM settings for the SOA application. Until
then I had been running with the default JVM settings provided by Oracle. I
generated some GC logs and used a GC analyzer to view them. They showed
frequent GCs with unacceptably high pause times. I clearly needed to tune my
JVM settings.
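As a rough illustration of what a GC analyzer pulls out of such a log, the sketch below summarizes pause times per collection type. It assumes the simple one-line format that -verbose:gc with -Xloggc writes on JDK 6 (timestamp, GC or Full GC, before->after(total) sizes, pause in seconds); logs written with extra print flags look different and would need another pattern. The function name and the sample lines are made up for the example.

```python
import re

# Matches lines such as:
#   12.345: [GC 123456K->65432K(4194304K), 0.0456780 secs]
#   98.765: [Full GC 3500000K->900000K(4194304K), 2.3456789 secs]
# Assumption: plain -verbose:gc output with no extra detail flags.
GC_LINE = re.compile(
    r"^(?P<ts>\d+\.\d+): \[(?P<kind>Full GC|GC) "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def summarize_gc_log(lines):
    """Return count, total pause and longest pause per collection kind."""
    summary = {}
    for line in lines:
        match = GC_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a GC event
        kind = match.group("kind")
        pause = float(match.group("pause"))
        stats = summary.setdefault(
            kind, {"count": 0, "total_pause": 0.0, "max_pause": 0.0}
        )
        stats["count"] += 1
        stats["total_pause"] += pause
        stats["max_pause"] = max(stats["max_pause"], pause)
    return summary


# Hypothetical sample lines, not taken from the servers described here.
sample = [
    "12.345: [GC 123456K->65432K(4194304K), 0.0456780 secs]",
    "20.001: [GC 223456K->75432K(4194304K), 0.0612340 secs]",
    "98.765: [Full GC 3500000K->900000K(4194304K), 2.3456789 secs]",
]

for kind, stats in summarize_gc_log(sample).items():
    print(kind, stats["count"], round(stats["total_pause"], 3), stats["max_pause"])
```

Frequent Full GC entries with pauses of a second or more are the kind of pattern that pointed me towards tuning.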
With the release of the Oracle Fusion Middleware 11g products (SOA Suite, BAM,
OER, OSB and so on), a lot has changed in the way these products are built and
work. Let's focus on SOA Suite. Unlike 10g, the 11g SOA Suite runs on Oracle
WebLogic Server. The SOA Suite application has also grown bigger with the
addition of applications such as B2B and BAM, which were separate installations
in the 10g releases. On top of this, the product needs two management consoles
to function: the WebLogic Admin Console and the Enterprise Manager FMW console.
So the 11g SOA Suite is not only new but also big, and the application server
(WebLogic) has to be tuned appropriately to ensure a healthy SOA instance.
Below is my environment info:
Application and Database Server Hardware Info
The SOA application and the database servers were installed on separate
physical boxes. The specifications of the boxes are listed below.
Server Hardware: Sun T5240
Operating System: Solaris 10
Architecture: Sun SPARC 64-bit
Number of CPUs: 10
Available Memory: 13.6 GB
Applications Installed and Versions:
Application Server: Oracle WebLogic Server (Version 10.3.4)
FMW Product: Oracle SOA Suite (Version 11.1.1.4)
JVM Used: Sun JDK 1.6 Update 23 (the latest Sun JDK at the time; this version
boasts a performance boost on Solaris servers)
Application Install Architecture: Standalone install
Below are my JVM setting recommendations. Please note that this tuning is a
good starting point; as the number of concurrent users, deployed applications
and load increase, the parameters below may need to change.
SOA Suite 11g typically uses two JVMs to function.
1. Admin Server JVM: This is the WebLogic server (JVM) on which the WebLogic
Admin Console and EM Fusion Middleware Control are deployed. The WebLogic Admin
Console is used to manage and control WebLogic resources. EM Fusion Middleware
Control is mainly used to work with SOA Suite: it handles application
deployment, application monitoring and so on.
2. SOA Managed Server: This is the WebLogic server (JVM) on which the entire
SOA Suite and B2B product stack is deployed, so expect it to be heavier than
the Admin Server.
Below are the JVM settings I recommend:
JVM Heap Recommendations for Development Managed Servers
-server -d64 -Xss256k -Xms4g -Xmx4g -XX:NewRatio=2 -XX:+AggressiveOpts
-XX:PermSize=1g -XX:MaxPermSize=1g -XX:+UseParallelGC -XX:+UseParallelOldGC
-XX:ParallelGCThreads=8 -XX:InitialSurvivorRatio=10 -XX:SurvivorRatio=10
-XX:LargePageSizeInBytes=4m -Dweblogic.management.discover=false
-Dweblogic.StuckThreadMaxTime=900 -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/java.hprof -verbose:gc -Xloggc:/tmp/gc.log -Xnoclassgc
-XX:TargetSurvivorRatio=90 -XX:ReservedCodeCacheSize=64m -XX:CICompilerCount=8
-XX:+AlwaysPreTouch -XX:+PrintReferenceGC -XX:+ParallelRefProcEnabled
-XX:-UseAdaptiveSizePolicy -XX:+PrintAdaptiveSizePolicy -XX:+DisableExplicitGC
JVM Heap Recommendations for Production Managed Servers
-server -d64 -Xss256k -Xms6g -Xmx8g -XX:NewRatio=2 -XX:+AggressiveOpts
-XX:PermSize=2g -XX:MaxPermSize=2g -XX:+UseParallelGC -XX:+UseParallelOldGC
-XX:ParallelGCThreads=16 -XX:LargePageSizeInBytes=4m
-XX:InitialSurvivorRatio=10 -XX:SurvivorRatio=10 -XX:-UseTLAB
-Dweblogic.management.discover=false -Dweblogic.StuckThreadMaxTime=900
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/java.hprof -verbose:gc
-Xloggc:/tmp/gc.log -Xnoclassgc -XX:TargetSurvivorRatio=90
-XX:ReservedCodeCacheSize=64m -XX:CICompilerCount=8 -XX:+AlwaysPreTouch
-XX:+PrintReferenceGC -XX:+ParallelRefProcEnabled -XX:-UseAdaptiveSizePolicy
-XX:+PrintAdaptiveSizePolicy -XX:+DisableExplicitGC
JVM Heap Recommendations for AdminServer
Modify the AdminServer JVM settings as well, since it runs both the WebLogic
Console administration application and Enterprise Manager Fusion Middleware
Control:
-server -Xms2g -Xmx2g -XX:NewRatio=3 -XX:+AggressiveOpts -XX:PermSize=512m
-XX:MaxPermSize=512m -XX:+UseParallelGC -XX:+UseParallelOldGC
-XX:ParallelGCThreads=16 -XX:InitialSurvivorRatio=10 -XX:SurvivorRatio=10
-Dweblogic.StuckThreadMaxTime=900 -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/java.hprof -verbose:gc -Xloggc:/tmp/gc.log -Xnoclassgc
-XX:TargetSurvivorRatio=90 -XX:ReservedCodeCacheSize=64m -XX:CICompilerCount=8
-XX:+AlwaysPreTouch -XX:+PrintReferenceGC -XX:+ParallelRefProcEnabled
-XX:-UseAdaptiveSizePolicy -XX:+PrintAdaptiveSizePolicy -XX:+DisableExplicitGC
Tuning Explanations:
The production recommendation differs from the development one mainly in the
size of the heap. Since the production environment will host significantly more
traffic, it needs the additional heap space to grow and handle those requests.
Further tuning will be required, because the production SOA applications may
need more or less heap space depending on the following factors:
- Size of the SOA interfaces deployed
- Frequency of SOA interface usage, or number of instances per minute
- Length of time each instance takes to execute
Tuning the environment should be measured, not speculative. Using memory and GC
analysis tools such as Oracle Enterprise Manager Grid Control in conjunction
with performance load testing, you will be able to tune your production
environment adequately to prevent load-related outages. Keep monitoring garbage
collection times on the JVM, since further tuning may be needed. If the time
taken by partial or full GCs increases significantly, increase the number of
ParallelGCThreads. By default ParallelGCThreads is derived from what is
available at the system level. For example, on a 2 x UltraSPARC T2+ [T5240]
that is 128, which is too high and can cause heap fragmentation.
Since garbage collection in the old (tenured) space can be costly, requiring
more pause time and CPU time to complete a full GC, you may need to size up the
new (nursery) space. This is controlled with the NewRatio=n directive, which
sets the young generation (Eden plus the survivor spaces) to 1 / (n + 1) of the
max heap size. If you find that the majority of objects are short lived,
meaning that the heap grows to a high level under heavy load but then returns
to a lower level, you may benefit from a larger young generation. This may
require a different directive than NewRatio. You may need to size the young
generation to 50 to 60% of the total heap size.
Try -XX:NewSize=5g -XX:MaxNewSize=5g where -Xmx8g.
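To make the NewRatio arithmetic concrete, here is a small sketch that estimates how a fixed-size heap is split for given NewRatio and SurvivorRatio values. It assumes the usual HotSpot meaning of the two flags (old = NewRatio x young, Eden = SurvivorRatio x one survivor space) and ignores the alignment rounding the JVM applies, so the figures are estimates. The function name is made up for the example.

```python
def young_generation_layout(heap_mb, new_ratio, survivor_ratio):
    """Approximate young/old/eden/survivor sizes in MB for a fixed-size heap.

    NewRatio=n      -> old generation is n times the young generation,
                       so young = heap / (n + 1).
    SurvivorRatio=s -> Eden is s times one survivor space, and there are
                       two survivor spaces, so survivor = young / (s + 2).
    The real JVM rounds these to its own alignment.
    """
    young = heap_mb / (new_ratio + 1)
    survivor = young / (survivor_ratio + 2)
    return {
        "young": young,
        "old": heap_mb - young,
        "eden": young - 2 * survivor,
        "survivor_each": survivor,
    }


# Heap sizes and ratios taken from the development and AdminServer settings.
for name, heap_mb, new_ratio in [
    ("managed server (dev)", 4096, 2),
    ("admin server", 2048, 3),
]:
    layout = young_generation_layout(heap_mb, new_ratio, survivor_ratio=10)
    print(name, {key: round(value) for key, value in layout.items()})
```

With NewRatio=2 only about a third of the heap goes to the young generation, which is why reaching 50 to 60% needs explicit NewSize and MaxNewSize values instead.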
The default thread stack size for the 64-bit VM on SPARCv9 is 1024k. When
selecting the 64-bit model [-d64], be sure to size down the thread stack, which
by default is larger than most applications need; the 32-bit model defaults to
512k on SPARCv9, and on Linux x86 the 32-bit model defaults to 320k. Some
WebLogic performance benchmarks published on spec.org set the thread stack size
to 128k on Sun T-series servers. A large thread stack size can waste a
significant amount of memory when many threads are running. Thread stacks are
allocated outside the Java heap, so the saving is in the native memory of the
process rather than in the heap itself. Consider setting -Xss128k or -Xss256k
to reduce the overall memory the application needs under load.
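A quick back-of-the-envelope calculation shows why the stack size matters once thread counts grow. The sketch below multiplies a thread count by the -Xss value; the thread count of 1000 is a hypothetical figure, and the result is an upper bound on reserved stack memory, since the operating system typically commits stack pages only as they are touched.

```python
def stack_reservation_mb(thread_count, stack_kb):
    """Upper bound on memory reserved for thread stacks, in MB."""
    return thread_count * stack_kb / 1024


thread_count = 1000  # hypothetical number of live threads under load
for stack_kb in (1024, 512, 256, 128):
    reserved = stack_reservation_mb(thread_count, stack_kb)
    print(f"-Xss{stack_kb}k x {thread_count} threads: {reserved:.0f} MB")
```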
Thread-local allocation buffers (TLABs) are portions of the young generation
reserved for individual threads, so that each thread can allocate objects
without contending with the others. They act as a per-thread cache and can
offer "excellent speedups on smaller numbers of threads (100s)". However, they
can become a burden on the JVM, costing more GC time, when the number of
threads is in the thousands. On the Solaris SPARC platform the directive
-XX:+UseTLAB is on by default. When testing an application under heavy load
with thousands of threads and experiencing excessive GC, consider turning off
TLABs: -XX:-UseTLAB
When sizing the JVM heap or its internal spaces, set the min and max to the
same size. This avoids the latency incurred while the JVM grows or shrinks
those spaces. Examples: NewSize and MaxNewSize, or PermSize and MaxPermSize.
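A check like this is easy to automate. The sketch below scans a string of JVM options and reports any min/max pair whose two values differ. The helper names are made up for the example, and only the three pairs named here are checked.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

# Matches -Xms4g, -Xmx4g, -XX:NewSize=5g, -XX:MaxPermSize=512m and similar.
_SIZE_FLAG = re.compile(
    r"-(?:XX:)?(Xms|Xmx|NewSize|MaxNewSize|PermSize|MaxPermSize)=?(\d+)([kKmMgG]?)"
)

# (min flag, max flag) pairs that should carry the same value.
_PAIRS = [("Xms", "Xmx"), ("NewSize", "MaxNewSize"), ("PermSize", "MaxPermSize")]


def parse_sizes(flags):
    """Map each recognised size flag in the option string to a byte count."""
    sizes = {}
    for token in flags.split():
        match = _SIZE_FLAG.fullmatch(token)
        if match is not None:
            name, number, unit = match.groups()
            sizes[name] = int(number) * _UNITS[unit.lower()]
    return sizes


def mismatched_pairs(flags):
    """Return the min/max pairs that are both present but not equal."""
    sizes = parse_sizes(flags)
    return [
        (low, high)
        for low, high in _PAIRS
        if low in sizes and high in sizes and sizes[low] != sizes[high]
    ]


# Illustrative option string using a few of the flags discussed in this post.
print(mismatched_pairs("-Xms6g -Xmx8g -XX:PermSize=2g -XX:MaxPermSize=2g"))
```

Sizes are compared in bytes, so -Xms2048m and -Xmx2g count as equal.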
-XX:+HeapDumpOnOutOfMemoryError : This directive creates a heap dump when an
out-of-memory error occurs in the JVM, which allows you to diagnose the root
cause of the memory leak.
-XX:HeapDumpPath=/tmp/java.hprof : This creates the heap dump in the specified
location.
-verbose:gc -Xloggc:/tmp/gc.log : These options enable GC logging and specify
the GC log file location.
The tuning recommendations above can be applied to other FMW products as well.
Again, they should give you a decent start; as load and usage grow, you may
need to revisit the tuning.