Auto detect java.lang.OutOfMemoryError
While running WebLogic managed servers we kept hitting OOM errors, and sometimes the process went down in the middle of the night when no one was monitoring. This severely hurt our productivity. To get rid of the problem we wrote the script below. It runs as a cron job every 10 minutes, checks for out-of-memory errors or a dead server process, and if it finds either, backs up the error logs for later analysis and bounces the managed server automatically.
You can schedule the autocheck.sh script as a cron job, as shown below, so that it checks for OOM errors or process failures every 10 minutes.
-bash-3.2$ crontab -e
*/10 * * * * /opt/apps/mgr/autocheck.sh
:wq
-bash-3.2$ crontab -l
*/10 * * * * /opt/apps/mgr/autocheck.sh
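When the script runs under cron, the messages it echoes do not appear on a terminal; cron typically mails them to the job owner or discards them. A minimal variation of the entry above, assuming you want a local record of each run, appends the output to a file. The log path /opt/apps/mgr/autocheck.log is a hypothetical choice and not part of the original setup.

```
# Hypothetical log path; adjust to your environment.
*/10 * * * * /opt/apps/mgr/autocheck.sh >> /opt/apps/mgr/autocheck.log 2>&1
```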
Below is the script for autocheck.sh
Change ENVNAME, WORKDIR, msName, Log_LOC, Tmp_LOC and Cac_LOC to match your environment, and set the email addresses. Check StartScritDir and the LogFiles list as well if your domain layout or log file names differ.
#!/bin/bash
##########
ENVNAME="Production Env:"
COUNTER="0"
WORKDIR="/opt/apps/mgr"
msName="myprodserver1"
Log_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/logs"
Tmp_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/tmp"
Cac_LOC="$WORKDIR/WLS/user_projects/domains/wls_mydomain/servers/AdminServer/cache"
StartScritDir="$WORKDIR/WLS/user_projects/domains/wls_mydomain/bin"
##############################################
## You can add any number of log files to monitor
## for the specified out of memory errors
##############################################
LogFiles=(
AdminServer.log
AdminServer-diagnostic.log
wls_app1266.log
)
##########
###NOTIFICATIONS###
# To email address where all the notifications will be sent via mail
EMAIL="notification_list(at)techpaste.com"
# CC list in the notification mail
CCList="mymail(at)techpaste.com"
# From email address in the notification Email
FromAdd="mymail(at)techpaste.com"
##########
######End to be modified######
######Do not make modifications below######
###########################
#Functions
##########################
# Count OutOfMemoryError lines across all monitored log files.
OOMCheck()
{
for logfile in "${LogFiles[@]}" ;do
    OOMCount=`grep "java.lang.OutOfMemoryError" "$Log_LOC/$logfile" | wc -l`
    COUNTER=$((COUNTER + OOMCount))
    export COUNTER
done
}
# Empty the server's tmp and cache folders.
Clean()
{
echo "`date` :Clearing Temp and Cache Folders...";
if [ -d "$Tmp_LOC" ]; then
    rm -rf "$Tmp_LOC"/*
fi
if [ -d "$Cac_LOC" ]; then
    rm -rf "$Cac_LOC"/*
fi
}
# Archive each log, then delete the original so old errors are not counted again.
CleanLogs()
{
for logfile in "${LogFiles[@]}" ;do
    if [ -f "$Log_LOC/$logfile.tar.gz" ]; then
        rm -f "$Log_LOC/$logfile.tar.gz"
    fi
    if [ -f "$Log_LOC/$logfile" ]; then
        tar -czf "$Log_LOC/$logfile.tar.gz" "$Log_LOC/$logfile"
        rm -f "$Log_LOC/$logfile"
    fi
done
}
# Archive each log without deleting it.
BackupLogs()
{
for logfile in "${LogFiles[@]}" ;do
    if [ -f "$Log_LOC/$logfile" ]; then
        tar -czf "$Log_LOC/$logfile.tar.gz" "$Log_LOC/$logfile"
    fi
done
}
# Find the PID of the java process for $msName (empty if it is not running).
ProcCheck()
{
PID=`ps -eaf | grep -v grep | grep java | grep "$msName" | grep -v "<defunct>" | awk '{ print $2 }'`
export PID
}
KillProc()
{
ProcCheck
# Only call kill when a process was found; kill with no PID just prints an error.
if [ -n "$PID" ]; then
    kill -9 $PID
fi
}
StartServ()
{
cd "$StartScritDir"
./startWeblogic.sh "$msName"
}
Main()
{
OOMCheck
ProcCheck
if [[ "$COUNTER" != "0" || "$PID" == "" ]] ; then
    echo "`date` :Out Of Memory Condition Detected or Server Process is Down..."
    echo "`date` :Bouncing the Env..";
    echo "`date` :Backing up logs for future reference.."
    BackupLogs
    echo "$ENVNAME Seems To Be Down.Auto Restarting the server..." | mail -s "$(echo -e "Auto-Msg: $ENVNAME :Java Process Is Down Or OutOfMemory Error\nContent-Type: text/html")" $EMAIL -c $CCList -- -f $FromAdd
    KillProc
    Clean
    CleanLogs
    StartServ
    echo "$ENVNAME : Server restarted and env back online." | mail -s "$(echo -e "Auto-Msg:[RESTART-COMPLETED] $ENVNAME :Java Process Is Down Or OutOfMemory Error\nContent-Type: text/html")" $EMAIL -c $CCList -- -f $FromAdd
else
    echo "`date` :Exiting, No OOM or Server Process downtime found...";
    exit 0
fi
}
Main
Sample Output:
-bash-3.2$ ./autocheck.sh
Wed Aug 28 08:44:10 PDT 2013 :Exiting, No OOM or Server Process downtime found...
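If you want to try the log check on its own before scheduling the full script, the counting step can be sketched in isolation. This is only an illustration, not the script's actual code: count_oom_lines is a hypothetical helper name, and the demo writes throwaway files to a temporary directory instead of touching real WebLogic logs. Unlike the original OOMCheck, it skips log files that do not exist.

```shell
#!/bin/bash
# Sketch: count OutOfMemoryError lines across a set of log files.
# count_oom_lines is a hypothetical helper, not part of autocheck.sh.
count_oom_lines() {
    local log_dir="$1"; shift
    local total=0 count logfile
    for logfile in "$@"; do
        # Skip missing files so grep does not print errors.
        [ -f "$log_dir/$logfile" ] || continue
        # -c prints the number of matching lines; -F matches a fixed string.
        count=$(grep -c -F "java.lang.OutOfMemoryError" "$log_dir/$logfile")
        total=$((total + count))
    done
    echo "$total"
}

# Demo against throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' "INFO server started" \
    "java.lang.OutOfMemoryError: Java heap space" > "$demo_dir/AdminServer.log"
printf '%s\n' "INFO nothing to report" > "$demo_dir/AdminServer-diagnostic.log"
# wls_app1266.log is deliberately absent to show that missing files are skipped.
count_oom_lines "$demo_dir" AdminServer.log AdminServer-diagnostic.log wls_app1266.log  # prints 1
rm -rf "$demo_dir"
```

A non-zero result is the condition under which the script above would back up the logs and bounce the server.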
Please let me know in the comments if you have any doubts about how to execute it. I will try to reply to all your queries as soon as I get to them.
Hello,
Stumbled across your site while searching for something similar.
May I use your codes and make changes as necessary?
Regards
Ahmad
Yes, please. You are free to use or modify anything from this site as long as you give credit to the original poster. 🙂
Good day Ramakanta,
Thank you for allowing me to use and modify.
Credit and a link will be given to you.
Thank you again from a sysadmin in Malaysia.
Most welcome.. 🙂