Tomcat logging

Tomcat's log files fill up quite quickly, which is a problem since the cloud instance has only 20G of disk space.

Go to /var/www/liferay-portal-6.1.1-ce-ga2/tomcat-7.0.27/bin and edit catalina.sh.

Set CATALINA_OUT to a location under /mnt.
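A minimal sketch of the change, assuming /mnt/catalina.out as the target (any writable path on /mnt works):

CATALINA_OUT=/mnt/catalina.out

Restart Tomcat afterwards so logging switches to the new location.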

That will solve the problem.

Demo

It was a disaster :(. Lack of preparation. Need to be more prepared next time.

OpenSSL self-signed certificate

# Generate a passphrase-protected 2048-bit RSA key
openssl genrsa -des3 -out server.key 2048
# Strip the passphrase so the server can start without prompting
openssl rsa -in server.key -out server.key.insecure
mv server.key server.key.secure
mv server.key.insecure server.key
# Create a certificate signing request for the key
openssl req -new -key server.key -out server.csr
# Self-sign the CSR; the certificate is valid for 365 days
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
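A quick way to inspect the resulting certificate and confirm the details:

openssl x509 -in server.crt -noout -text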

Got a brick on my head…

Actually 2. Ouch.

About retrying in Nimrod/K

Looks like there is no mechanism for retrying in Nimrod/K.

Nimrod itself has options that let users handle a failed job, via the onerror command.

Options available:

  • fail
  • ignore
  • restart
  • repeat

We can put onerror repeat (or onerror restart) before the actual command, as sketched below.
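A rough sketch of where the directive might sit in a plan file; the task structure and the script name here are placeholders, not taken from the actual experiment (the exec command form is as it appears in the error log below):

task main
    onerror restart
    exec ./run-lbm-job.sh
endtask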

 

LBM error when running MPI on East with Nimrod

The experiment stops when ~90 jobs are done, with 3 optimal points found.

From the error, it seems that it is a communication error from the cluster.

Error log from Kepler

ptolemy.kernel.util.IllegalActionException: Nimrod job failed. Experiment name: mpi Jobname 1011
Error: Failed plan file line: - 5 - exec /bin/sh -c /bin/mkdir /home/hoang/mdo-flow-compliance/step1/9.4269441175058_29.0_139.8662186080265; /bin/pwd > /home/hoang/mdo-flow-compliance/step1/9.4269441175058_29.0_139.8662186080265/nimroddir ;cd /home/hoang/md… Traceback (most recent call last):
File "/home/ngdev/src/nimrod-trunk/level2/agent/Job.py", line 1977, in Run
File "/home/ngdev/src/nimrod-trunk/level2/agent/Job.py", line 1394, in ExecCmd
RuntimeError: 'mkdir' terminated with code 1
in .LBP_Topology_Optimization.LBM_C12_TCA.LBM_C12
with tag colour {sequenceID=0, metadata={Optimizer=object(org.monash.nimrod.optim.SimplexAlgorithm@6765f707), pointIndex=3, creator=object(org.monash.nimrod.optim.SimplexOptimActor {.LBP_Topology_Optimization.Simplex Optim Actor})}, parameters={r=9.4269441175058, re=139.8662186080265, s=29.0}, hashcode=-2058481515}
in .LBP_Topology_Optimization.LBM_C12_TCA.LBM_C12
at org.monash.nimrod.NimrodActor.NimrodGCommonFunctions.startAndWait(NimrodGCommonFunctions.java:208)
at org.monash.nimrod.GridJob.fire(GridJob.java:354)
at org.monash.nimrod.NimrodDirector.NimrodProcessThread.run(NimrodProcessThread.java:448)

Error log from Nimrod

[EAST-05:07832] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/1088260.1.all.q) of (/scratch/1088260.1.all.q/openmpi-sessions-hoang@EAST-05_0/11376/0/0), mkdir failed [1]
[EAST-05:07832] [[11376,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[EAST-05:07832] [[11376,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[EAST-05:07832] [[11376,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 320
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

orte_session_dir failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[EAST-05:07832] [[11376,0],0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here’s some additional information (which may only be relevant to an
Open MPI developer):

orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[EAST-05:07832] [[11376,0],0] ORTE_ERROR_LOG: Error in file orterun.c at line 694
rm: cannot remove `Vortex00400000*': No such file or directory
rm: cannot remove `3D_Trns00400000*': No such file or directory
cp: cannot stat `lbm-data.tec': No such file or directory
cp: cannot stat `lbm-output': No such file or directory

Logged in to node 05; it looks like there is an I/O error, as even ls on /scratch fails.
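Some quick checks from the node itself (a hypothetical sequence; the hostname and mount point are taken from the log above):

ssh EAST-05
ls /scratch        # errors out or hangs if the filesystem is broken
df -h /scratch     # is the mount present, and is it full?
dmesg | tail -20   # look for disk or NFS errors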

Configure Apache for cross-site XMLHttpRequests for ParaViewWeb

ParaViewWeb has Apache installed in /opt/apache-version.

To allow cross-site XMLHttpRequests, add

Header set Access-Control-Allow-Origin *

inside <VirtualHost *:80> and before the <Directory> … </Directory> block in conf/extra/httpd-vhosts.conf, roughly as below.
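A sketch of the placement; the ServerName and directory path are placeholders rather than the real config, the access rules are Apache 2.2-style, and the Header directive requires mod_headers to be enabled:

<VirtualHost *:80>
    ServerName paraviewweb.example.org
    Header set Access-Control-Allow-Origin *
    <Directory "/opt/apache-version/htdocs">
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>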

The other way is to create a proxy server.
