Tuesday, October 5, 2010

Failed to initialize the application bea_wls_internal – Weblogic Cluster


Background:
Two-node cluster with two managed servers (node 1: Admin + Managed; node 2: Managed).
Weblogic Server 10.3.2
Weblogic Portal 10.3.2
The Admin Server and managed server directories are different (high availability setup: the Admin Server is on a shared location, the managed servers are on local disk).

The managed servers fail to start. The output log file shows the following errors:

Error Security BEA-000000 [Security:090836]The Keystore provider configured for PKICredential Mapper does not exist at location wsrpKeystore.jks.
Error Deployer BEA-149205 Failed to initialize the application 'bea_wls_internal' due to error weblogic.application.ModuleException: Failed to load webapp: 'bea_wls_internal.war'.
weblogic.application.ModuleException: Failed to load webapp: 'bea_wls_internal.war'
        at weblogic.servlet.internal.WebAppModule.prepare(WebAppModule.java:387)
        at weblogic.application.internal.flow.ScopedModuleDriver.prepare(ScopedModuleDriver.java:180)
        at weblogic.application.internal.flow.ModuleListenerInvoker.prepare(ModuleListenerInvoker.java:93)
        at weblogic.application.internal.flow.DeploymentCallbackFlow$1.next(DeploymentCallbackFlow.java:388)
        at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:37)
        Truncated. see log file for complete stacktrace
javax.servlet.ServletException: [Security:090820]The internal variable ServletInfoSpi is null and it should not be.
        at weblogic.security.providers.saml.SAMLServletAuthenticationFilter.init(SAMLServletAuthenticationFilter.java:52)
        at weblogic.security.service.internal.ServletAuthenticationFilterServiceImpl$ServiceImpl.getServletAuthenticationFilters(Unknown Source)
        at weblogic.security.service.PrincipalAuthenticator.getServletAuthenticationFilters(Unknown Source)
        at weblogic.servlet.security.internal.WebAppSecurity.init(WebAppSecurity.java:80)
        at weblogic.servlet.security.internal.WebAppSecurityWLS.init(WebAppSecurityWLS.java:66)
        Truncated. see log file for complete stacktrace

Cause.
The managed server fails to load its security configuration because the keystore files are missing from its local domain directory. We had moved the managed server from the shared storage to a local directory with the pack/unpack utility, following the standard guidelines for a high availability setup, and the utility did not copy these files.

Fix.
Copy the security files from the shared location to the local disk of each managed server.
bash-3.00$ cd /admin_server/admin/domains/portal_domain/
bash-3.00$ cp wsrpKeystore.jks DemoTrust.jks DemoIdentity.jks /opt/oracle/product/middleware/user_projects/domains/portal_domain
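
The same copy can be sketched as a small re-runnable function. This is only an illustration: the function name copy_missing_keystores and its two arguments are hypothetical, and the file list is simply the three keystores named in the fix above. It copies only files that are absent in the target directory.

```shell
#!/bin/sh
# Hypothetical helper: copy keystore files that pack/unpack left behind,
# skipping any that already exist in the managed server's domain directory.
# Usage: copy_missing_keystores <admin_domain_dir> <managed_domain_dir>
copy_missing_keystores() {
    src=$1
    dst=$2
    for f in wsrpKeystore.jks DemoTrust.jks DemoIdentity.jks; do
        if [ ! -f "$src/$f" ]; then
            # Nothing to copy from the shared location
            echo "skip: $f not found in $src" >&2
        elif [ -f "$dst/$f" ]; then
            # Never overwrite a keystore that is already in place
            echo "ok: $f already present in $dst"
        else
            cp -p "$src/$f" "$dst/$f" && echo "copied: $f"
        fi
    done
}
```

With the paths from this post it would be called as: copy_missing_keystores /admin_server/admin/domains/portal_domain /opt/oracle/product/middleware/user_projects/domains/portal_domain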

Invalid xsi:type qname: wsrp:wsrp-identity-asserterType – Weblogic Portal


While starting the Admin server using WLST in a domain which supports Weblogic Portal, the Admin server does not start and fails with the following error in the WLST console:

Error Starting server AdminServer: weblogic.nodemanager.NMException: Exception while starting server 'AdminServer': java.io.IOException: Server failed to start up. See server output log for more details.

The output log file says this:

BEA-141244 Schema validation errors while parsing /admin_server/admin/domains/portal_domain/config/config.xml - Invalid xsi:type qname: 'wsrp:wsrp-identity-asserterType' in element realm@http://www.bea.com/ns/weblogic/920/domain
Error Management BEA-141244 Schema validation errors while parsing /admin_server/admin/domains/portal_domain/config/config.xml - /admin_server/admin/domains/portal_domain/unknown:13:9: error: failed to load java type corresponding to t=wsrp-identity-asserterType@http://www.bea.com/ns/wlp/90/security/wsrp
Critical WebLogicServer BEA-000362 Server failed. Reason: [Management:141245]Schema Validation Error in /admin_server/admin/domains/portal_domain/config/config.xml see log for details. Schema validation can be disabled by starting the server with the command line option: -Dweblogic.configuration.schemaValidationEnabled=false

Cause.
The domain we created requires the portal libraries to be loaded when any of its servers starts. This can be verified from the config.xml file, in which all libraries are targeted to the Admin server and the cluster. When a server is started using WLST, the node manager does not read the setDomainEnv.sh file associated with that domain, so the server is unaware of the path of the library files.

Fix.
Start the Admin server using the startWebLogic.sh script. Navigate to the “Servers → Admin Server → Server Start” tab and add the following parameters to the “Arguments” section.
-d64 -Dweblogic.ext.dirs=/opt/oracle/product/middleware/patch_wlw1030/profiles/default/sysext_manifest_classpath:/opt/oracle/product/middleware/patch_wls1030/profiles/default/sysext_manifest_classpath:/opt/oracle/product/middleware/patch_wlp1030/profiles/default/sysext_manifest_classpath:/opt/oracle/product/middleware/patch_cie670/profiles/default/sysext_manifest_classpath:/opt/oracle/product/middleware/wlportal_10.3/p13n/lib/system:/opt/oracle/product/middleware/wlportal_10.3/light-portal/lib/system:/opt/oracle/product/middleware/wlportal_10.3/portal/lib/system:/opt/oracle/product/middleware/wlportal_10.3/info-mgmt/lib/system:/opt/oracle/product/middleware/wlportal_10.3/analytics/lib/system:/opt/oracle/product/middleware/wlportal_10.3/apps/lib/system:/opt/oracle/product/middleware/wlportal_10.3/info-mgmt/deprecated/lib/system:/opt/oracle/product/middleware/wlportal_10.3/content-mgmt/lib/system -Dweblogic.alternateTypesDirectory=/opt/oracle/product/middleware/wlportal_10.3/portal/lib/security

*assuming that the Middleware home is /opt/oracle/product/middleware
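
Because the argument string is long, a mistyped directory is easy to miss. Here is a minimal sketch of a check that every entry of a colon-separated list (such as the value given to -Dweblogic.ext.dirs) exists on disk. The function name check_ext_dirs is hypothetical, and it assumes a POSIX-style shell that splits unquoted variables on IFS.

```shell
#!/bin/sh
# Hypothetical helper: print every entry of a colon-separated directory
# list that does not exist. Returns 0 when all entries exist, 1 otherwise.
# Usage: check_ext_dirs "<dir1>:<dir2>:..."
check_ext_dirs() {
    missing=0
    old_ifs=$IFS
    IFS=:
    # $1 is intentionally unquoted so the shell splits it on ':'
    for d in $1; do
        if [ ! -d "$d" ]; then
            echo "missing: $d"
            missing=1
        fi
    done
    IFS=$old_ifs
    return $missing
}
```

Pasting the value that follows -Dweblogic.ext.dirs= as the single quoted argument lists any directory that is absent on that machine.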

Native version is enabled but node manager native library could not be loaded


The biggest challenge when working with 64 bit installations is to ensure that all components pick up the correct libraries. While doing a 64 bit installation of Weblogic Portal recently, my nodemanager refused to start.

Starting the node manager throws the following error:
weblogic.nodemanager.common.ConfigException: Native version is enabled but node manager native library could not be loaded
        at weblogic.nodemanager.server.NMServerConfig.initProcessControl(NMServerConfig.java:239)
        at weblogic.nodemanager.server.NMServerConfig.<init>(NMServerConfig.java:179)
        at weblogic.nodemanager.server.NMServer.init(NMServer.java:176)
        at weblogic.nodemanager.server.NMServer.<init>(NMServer.java:141)
        at weblogic.nodemanager.server.NMServer.main(NMServer.java:337)
        at weblogic.NodeManager.main(NodeManager.java:31)
Caused by: java.lang.UnsatisfiedLinkError: /opt/oracle/product/middleware/wlserver_10.3/server/native/solaris/sparc64/libnodemanager.so: ld.so.1: java: fatal: /opt/oracle/product/middleware/wlserver_10.3/server/native/solaris/sparc64/libnodemanager.so: wrong ELF class: ELFCLASS64 (Possible cause: architecture word width mismatch)
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1803)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1728)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1028)
        at weblogic.nodemanager.util.UnixProcessControl.<init>(UnixProcessControl.java:24)
        at weblogic.nodemanager.util.Platform.getProcessControl(Platform.java:114)
        at weblogic.nodemanager.server.NMServerConfig.initProcessControl(NMServerConfig.java:237)

Cause.
This error is caused by a word-width mismatch between the JVM and the native library loaded at runtime. The installation is 64 bit, so libnodemanager.so is a 64 bit library (ELFCLASS64) and the node manager JVM must also run in 64 bit mode. For some reason, the startNodeManager.sh script did not have the “-d64” flag in the Java options, so the JVM started in 32 bit mode and could not load the library.
               
Fix.
Edit the startNodeManager.sh script and add the “-d64” option to the JAVA_VM variable:
JAVA_VM="-d64 ${JAVA_VM}"
Restart the node manager and it should start normally.
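
To confirm which side of the mismatch a library is on, its ELF header can be read directly. The sketch below is an illustration, not part of the original fix: the function name elf_class is hypothetical, and it assumes an od that supports the POSIX -A, -t, -j and -N options. It relies on the ELF convention that byte 4 of the header (EI_CLASS) is 1 for 32 bit and 2 for 64 bit objects.

```shell
#!/bin/sh
# Hypothetical helper: report the word width recorded in an ELF header.
# Usage: elf_class <file>
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F';
    # od -c prints the 0x7f byte as octal 177.
    magic=$(od -An -c -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "177ELF" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case $class in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

Running it against the libnodemanager.so path from the stack trace and comparing the result with the VM name printed by "java -version" shows whether the two agree.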

Wednesday, July 21, 2010

Application Server 10g Log Locations

These are the common log locations for various components under Application Server 10g.

| Sr. | Component | Log Location | Comments |
|---|---|---|---|
| 1 | HTTP Server | $OH/Apache/Apache/logs | Contains access and error logs. Logging can be controlled using the LogLevel parameter in the httpd.conf file under $OH/Apache/Apache/conf. |
| 2 | OPMN | $OH/opmn/logs | Has start/stop logs for all installed components, e.g. HTTP, OID, SOA. |
| 3 | SSO Server | $OH/sso/log/ssoServer.log | Has detailed logs of the SSO server, including information on user authentication failures. The log level can be configured in policy.properties under $OH/sso/conf. |
| 4 | LDAP Server (OID) | $OH/ldap/log | Contains logging information about the ldap processes, the synchronization processes (odisrv) and the monitor process (oidmon). The logs for LDAP and ODISRV have their PIDs suffixed to their names (oidldapd01s966746.log). |
| 5.1 | OC4J | $OH/j2ee/<instance_name>/log/oc4j/log.xml | Contains all generic information about the specific J2EE instance. Common to all J2EE versions (10.1.2.x, 10.1.4.x and 10.1.3.x). Particularly useful in debugging ESB applications. |
| 5.2 | OC4J | $OH/j2ee/<instance_name>/log | Holds several different logs (jms, rmi, default-web-access, global-application etc). Dig under this folder for information on jms/rmi issues. For example, if two systems have trouble interacting with each other over the rmi protocol (which is mostly the case), check rmi.log. Also common to all versions of Application Server. |
| 5.3 | OC4J | $OH/j2ee/<instance_name>/application-deployments/<application>/<instance_name_default_island_num> | An instance has many applications deployed to it, viz. sso, oiddas, esb-dt, bpel, ccore (wsm). The application-specific logs are generated in this folder. For versions 10.1.2.x and 10.1.4.x an example location is "$OH/j2ee/OC4J_SECURITY/application-deployments/oiddas/OC4J_SECURITY_default_island_1" (for the OIDDAS application); for version 10.1.3.x it would be "$OH/j2ee/oc4j_esbdt/application-deployments/esb-dt/oc4j_esbdt_SOA_ESBDT_1" (for the ESB DT application). |
| 5.3 | OC4J | $OH/j2ee/<instance_name>/application-deployments/<application>/<instance_name_group_name> | Form of the path used by version 10.1.3.x (see the previous row). |
| 6 | DCM Server | $OH/dcm/logs | Contains logs for dcmctl commands. |
| 7 | IAS Console (10.1.2.x & 10.1.4.x) | $OH/sysman/log | Contains logging information about the iascontrol application used in versions 10.1.2.x and 10.1.4.x. Version 10.1.3.x has no iascontrol; it has the em console instead (the application name is ascontrol). Its log can be found as described in point 5.3. |
| 8 | Webcache Server | $OH/webcache/logs | Has access and error logs for webcache. |
| 9 | Report Server | $OH/reports/logs/<report_server_name>/rwserver.log | All startup/shutdown information of the reports server and its engines is logged here, along with any other error messages. |
| 10 | Forms Server | $OH/forms/trace | If trace is enabled for forms, this directory contains all the trace dumps. You can get the name of the trace file from the iasconsole when you enable the trace for a form session. For information on the formapps application, check the location at 5.3 (substitute formapps for application). |
| 11 | Portal Server | $OH/portal/logs | Contains only logging information from the ptlconfig file. For a detailed portal application log, check point 5.3. |
| 12.1 | BPEL Server | $OH/bpel/system/logs | The base directory for all BPEL-specific logs. It holds the system-wide logs. |
| 12.2 | BPEL Server | $OH/bpel/domains/<domain_name>/logs | Contains only domain-specific logs. |
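
When a component misbehaves, the file worth reading is usually the one written most recently. A minimal sketch of picking it out of any of the directories listed above follows. The function name latest_log is hypothetical, and $OH is assumed to be the shorthand for the Oracle home used throughout this post.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified entry in a
# log directory (ls -t sorts newest first).
# Usage: latest_log <log_dir>
latest_log() {
    dir=$1
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    # Print nothing and return non-zero if the directory is empty or missing
    [ -n "$newest" ] && echo "$dir/$newest"
}
```

For example, tail -f "$(latest_log $OH/opmn/logs)" follows whichever OPMN log was written last.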

Tuesday, July 20, 2010

SSO server version v3.0 is not supported

While enabling SSO for SOA 10g, I stumbled upon this error while starting my HTTP Server process.
The HTTP Server logs under opmn showed the following:
$ cat $OH/opmn/logs/HTTP_Server*

--------
10/06/25 07:17:30 Start process
--------
/u01/app/oracle/product/10.1.3/OracleAS_1/Apache/Apache/bin/apachectl startssl: execing httpd
Syntax error on line 6 of /u01/app/oracle/product/10.1.3/OracleAS_1/Apache/Apache/conf/mod_osso.conf:
SSO server version v3.0 is not supported.

Now this happened because of a version mismatch in the SSO conf file I generated from my SSO box. The SSO version was 10.1.4.3 and I was using the generated conf file for SOA 10.1.3.4. The syntax I used for generating the file was this:

$OH/sso/bin/ssoreg.sh -oracle_home_path /u03/app/OAS/product/10.1.4/Infra/sso \
 -site_name xyz.com -config_mod_osso TRUE \
 -mod_osso_url http://xyz.com:7777 -remote_midtier \
 -config_file /u03/app/OAS/product/10.1.4/Infra/sso/Apache/Apache/conf/osso/osso_soa.conf

By default this creates an SSO configuration file of version 3, which is incompatible with SOA 10.1.3.x installations. So we need to add a parameter to the above command to tell SSO that we need a file of the older version. Run the same command with the following syntax:

$OH/sso/bin/ssoreg.sh -oracle_home_path /u03/app/OAS/product/10.1.4/Infra/sso \
-site_name xyz.com -config_mod_osso TRUE \
-mod_osso_url http://xyz.com:7777 -remote_midtier \
-config_file /u03/app/OAS/product/10.1.4/Infra/sso/Apache/Apache/conf/osso/osso_soa.conf \
-sso_partner_version v1.4

Transfer the generated SSO configuration file to the middle tier and run the osso1013 command again. Then start the HTTP Server; it should work now.

References: Metalink note 809743.1

Thursday, July 1, 2010

Can't locate File/Compare.pm in @INC on AIX

While integrating SOA Suite 10g with Oracle Identity Management 10g, I came across some strange Perl errors, which probably pointed to the fact that some libraries were missing.

While running the osso1013 script, I stumbled upon these errors:

Can't locate File/Compare.pm in @INC (@INC contains: /project/as10g/src/shvaramb/perl58/perl58/bin/AIX/Opt/lib/5.8.3/aix-thread-multi /project/as10g/src/shvaramb/perl58/perl58/bin/AIX/Opt/lib/5.8.3 /project/as10g/src/shvaramb/perl58/perl58/bin/AIX/Opt/lib/site_perl/5.8.3/aix-thread-multi /project/as10g/src/shvaramb/perl58/perl58/bin/AIX/Opt/lib/site_perl/5.8.3 /project/as10g/src/shvaramb/perl58/perl58/bin/AIX/Opt/lib/site_perl . /u01/app/oracle/product/10.1.3/OracleAS_1/perl/site/5.8.3/lib /u01/app/oracle/product/10.1.3/OracleAS_1/perl/5.8.3/lib) at ./osso1013 line 67.
BEGIN failed--compilation aborted at ./osso1013 line 67.

The file (Compare.pm) in this case could be any other file. A quick cross check on the perl version tells me I'm using the latest one.

$ perl -version
This is perl, v5.8.2 built for aix-thread-multi
$ which perl
/usr/bin/perl

But, wait a second. Looks like I'm using the default perl supplied by the OS. Now, this is not supported by Oracle. According to the documentation, every perl script you run within the Application Server stack must use the perl shipped with Application Server, i.e. I should be using $OH/perl/bin/perl and not /usr/bin/perl.

Time for corrections then,

$ export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/opmn/bin:$ORACLE_HOME/perl/bin:$PATH
$ export LD_LIBRARY_PATH=$ORACLE_HOME/lib
$ which perl
/u01/app/oracle/product/10.1.3/OracleAS_1/perl/bin/perl

I retried, but it failed again with the same errors! I could see the Compare.pm file somewhere under $OH/perl/lib but somehow the perl interpreter doesn't seem to pick it up. You can check the paths perl is picking up with "perl -V" command. (Remember the capital "V")

After a bit of googling and metalinking, I found the solution. This machine is an AIX box, and on AIX the environment variable PERL5LIB has to be set.

Setting the variable PERL5LIB:

$ export PERL5LIB=$ORACLE_HOME/perl/lib/site_perl/5.8.3:$ORACLE_HOME/perl/lib/5.8.3

Note that the "5.8.3" could be any version (5.6.1 or something else).

That did the trick and I could register SSO successfully!
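
Since the version directory differs between installations, the PERL5LIB value can be derived instead of hard-coded. The sketch below assumes the layout used in the export above ($ORACLE_HOME/perl/lib/<version> and $ORACLE_HOME/perl/lib/site_perl/<version>); the function name perl5lib_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a PERL5LIB value from the versioned directory
# found under <oracle_home>/perl/lib, instead of hard-coding "5.8.3".
# Usage: perl5lib_for <oracle_home>
perl5lib_for() {
    oh=$1
    # Match directories named like 5.8.3 or 5.6.1; site_perl does not match
    for d in "$oh"/perl/lib/[0-9]*.[0-9]*.[0-9]*; do
        if [ -d "$d" ]; then
            ver=${d##*/}
            echo "$oh/perl/lib/site_perl/$ver:$oh/perl/lib/$ver"
            return 0
        fi
    done
    echo "no versioned perl lib directory under $oh/perl/lib" >&2
    return 1
}
```

It could then be used as: export PERL5LIB=$(perl5lib_for "$ORACLE_HOME"). If several version directories exist, the sketch simply takes the first one the shell lists.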

Saturday, June 5, 2010

Changing hostname/IP for Weblogic 11g


Recently I had to migrate two VMs, hosting SOA 11g and WebCenter 11g respectively, from our US centers to our India centers. This involved a change in the hostname/IP of the two servers. Now the process is already documented at
http://download.oracle.com/docs/cd/E12839_01/core.1111/e10105/host.htm#CHDGEDCF.
But as is with all documents, not everything is documented!

The startManagedWebLogic.sh script has the URL of the AdminServer hard-coded, so you have to change that too. Having done this, start the nodemanager.
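
Hard-coded references like this one are easy to overlook. A minimal sketch for listing every file under a directory that still mentions the old hostname is shown here. The function name find_host_refs is hypothetical, and the directory to scan (for example the domain's bin directory) is up to you.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that still contain
# the old hostname. The hostname is matched as a fixed string (-F), so
# the dots in it are not treated as regex wildcards.
# Usage: find_host_refs <dir> <old_hostname>
find_host_refs() {
    find "$1" -type f -exec grep -lF -- "$2" {} +
}
```

An empty result means no file under that directory references the old name any more.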
When you start your nodemanager, you might encounter one of the errors below (in the AdminServer logfile):

BEA-090504 - Certificate chain received from localhost - 127.0.0.1 failed hostname verification check. Certificate contained xyz.abc.com but check expected localhost
OR
BEA-090482 - BAD_CERTIFICATE alert was received from localhost.localdomain - 127.0.0.1. Check the peer to determine why it rejected the certificate chain (trusted CA configuration, hostname verification). SSL debug tracing may be required to determine the exact reason the certificate was rejected.

There are two ways to solve it.

1. Disable Flags - Jugaad way ;)
Put the following flags at the right places.
Node Manager: -Dweblogic.nodemanager.sslHostNameVerificationEnabled=false
Admin Server: -Dweblogic.security.SSL.ignoreHostnameVerification=true

2. Recreate the Certificates - The recommended way.
Node manager by default uses the WebLogic demo identity keystore. The keystore is generated at install time using the CertGen utility. The generated private key uses the common name (cn) resolved by Java.

2.1 Set the PATH
. $WL_HOME/server/bin/setWLSEnv.sh

2.2 Backup DemoIdentity.jks under $WL_HOME/server/lib


2.3 Generate the private key.

java utils.CertGen -cn <new_hostname> -keyfilepass DemoIdentityPassPhrase -certfile newcert -keyfile newkey

2.4 Import the key generated above to the keystore.

java utils.ImportPrivateKey -keystore DemoIdentity.jks -storepass DemoIdentityKeyStorePassPhrase -keyfile newkey.pem -keyfilepass DemoIdentityPassPhrase -certfile newcert.pem -alias demoidentity

2.5 Copy DemoIdentity.jks to $WL_HOME/server/lib


2.6 Restart your nodemanager.

That's it!
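
Steps 2.3 and 2.4 can be sketched as one function that takes the new hostname as its argument. This is only an illustration built from the two commands shown above; the function names and the DRY_RUN variable are hypothetical. By default it only prints the commands, and it assumes setWLSEnv.sh has already been sourced (step 2.1) when run for real.

```shell
#!/bin/sh
# Hypothetical helper: echo the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run_or_echo() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: regenerate the demo identity for a new hostname,
# using the CertGen and ImportPrivateKey invocations from steps 2.3 and 2.4.
# Usage: regen_demo_identity <new_hostname>
regen_demo_identity() {
    host=$1
    if [ -z "$host" ]; then
        echo "usage: regen_demo_identity <new_hostname>" >&2
        return 2
    fi
    run_or_echo java utils.CertGen -cn "$host" \
        -keyfilepass DemoIdentityPassPhrase \
        -certfile newcert -keyfile newkey &&
    run_or_echo java utils.ImportPrivateKey -keystore DemoIdentity.jks \
        -storepass DemoIdentityKeyStorePassPhrase \
        -keyfile newkey.pem -keyfilepass DemoIdentityPassPhrase \
        -certfile newcert.pem -alias demoidentity
}
```

Calling regen_demo_identity with the new hostname prints both commands for review; setting DRY_RUN=0 in the environment runs them instead. Backing up and copying DemoIdentity.jks (steps 2.2 and 2.5) is left as a manual step.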