I do need to write a long-form article about this, but I've been on a voyage of discovery configuring AND testing WAS transaction recovery, with the transaction/compensation/partner logs stored in an Oracle 12c database.
This is in the context of an IBM Business Process Manager Advanced environment.
During the process, I saw this in the SupCluster logs (specifically those of the second cluster member): -
SupClusterMember2/SystemOut.log:[16/07/17 11:53:47:332 BST] 00000001 WASSessionCor I SessionProperties shouldSetAndDoLogging SESN0169I: Session Manager found the custom property InvalidateOnUnauthorizedSessionRequestException with value true.
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:316 BST] 0000004d SQLMultiScope I CWRLS0009E: Details of recovery log failure: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:319 BST] 0000004d SQLMultiScope E CWRLS0024E: Exception caught during recovery! Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:324 BST] 0000004d SQLMultiScope A WTRN0107W: Caught non-SQLException Throwable when forcing SQL RecoveryLog tranlog for server PSCell1\Node2\SupClusterMember2 Throwable: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
SupClusterMember2/SystemOut.log:[16/07/17 11:59:40:333 BST] 0000004d SQLMultiScope A WTRN0100E: Cannot recover from SQLException when forcing SQL RecoveryLog tranlog for server PSCell1\Node2\SupClusterMember2 Exception: Another server has locked the HA lock row, com.ibm.ws.recoverylog.spi.InternalLogException: Another server has locked the HA lock row
The problem was a PEBCAK, in that I'd obviously misconfigured things.
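When a log is as noisy as the one above, it can help to tally the recovery-related message IDs before reading individual lines. The following is only a sketch: the function name count_msg_ids is made up for illustration, and it assumes the IDs of interest start with CWRLS or WTRN, as they do in the output above.

```shell
#!/bin/sh
# Sketch: count how often each CWRLS/WTRN message ID appears in the
# log files passed as arguments. Prints "ID count", one per line, sorted by ID.
count_msg_ids() {
    awk '
        {
            line = $0
            # Pull every message ID out of the line, not just the first one.
            while (match(line, /(CWRLS|WTRN)[0-9][0-9][0-9][0-9][A-Z]/)) {
                n[substr(line, RSTART, RLENGTH)]++
                line = substr(line, RSTART + RLENGTH)
            }
        }
        END { for (id in n) print id, n[id] }
    ' "$@" | sort
}

# Hypothetical usage, run from the directory holding the member log folders:
#   count_msg_ids SupClusterMember2/SystemOut.log
```

A sudden run of CWRLS0024E / WTRN0100E pairs on one member, as here, points at the recovery log configuration rather than at the application.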
I validated my WAS configuration: -
cat /opt/ibm/WebSphereProfiles/Dmgr01/config/cells/PSCell1/nodes/Node1/serverindex.xml | grep -i recoveryLog
<recoveryLog xmi:id="RecoveryLog_1500062274540" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App1" compensationLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App1" compensationLogFileSize="5"/>
<recoveryLog xmi:id="RecoveryLog_1500062279192" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/SupCluster_Tranlogs,tablesuffix=Sup1"/>
cat /opt/ibm/WebSphereProfiles/Dmgr01/config/cells/PSCell1/nodes/Node2/serverindex.xml | grep -i recoveryLog
<recoveryLog xmi:id="RecoveryLog_1500062274432" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App2" compensationLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/AppCluster_Tranlogs,tablesuffix=App2" compensationLogFileSize="5"/>
<recoveryLog xmi:id="RecoveryLog_1500062279152" transactionLogDirectory="custom://com.ibm.rls.jdbc.SQLRecoveryLog?datasource=jdbc/SupCluster_Tranlogs,tablesuffix=Sup2"/>
to ensure that I: -
(a) was using the right datasource for each cluster (AppCluster has both Transaction and Compensation logs, whereas SupCluster only has Transaction logs)
(b) had given each cluster member its own table suffix: App1 or Sup1 for member 1, App2 or Sup2 for member 2
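Check (b) is easy to get wrong by eye, so here is a minimal sketch of how it might be automated. It assumes each recoveryLog element sits on a single line, as in the grep output above; the function name find_shared_suffixes is hypothetical. It reports any datasource/tablesuffix pair that is used by more than one recoveryLog element, while allowing the same pair to appear twice within one element (transaction and compensation logs for the same server).

```shell
#!/bin/sh
# Sketch: print every "datasource=...,tablesuffix=..." pair that is used by
# more than one <recoveryLog> line across the given serverindex.xml files.
# Returns 1 if any shared pair is found, 0 otherwise.
find_shared_suffixes() {
    awk '
        tolower($0) ~ /<recoverylog/ {
            split("", seen)          # reset the per-line set of pairs
            line = $0
            while (match(line, /datasource=[^"]*/)) {
                pair = substr(line, RSTART, RLENGTH)
                if (!(pair in seen)) {
                    seen[pair] = 1
                    servers[pair]++  # count each pair once per line
                }
                line = substr(line, RSTART + RLENGTH)
            }
        }
        END {
            for (pair in servers) {
                if (servers[pair] > 1) {
                    print pair
                    shared = 1
                }
            }
            exit (shared ? 1 : 0)
        }
    ' "$@"
}

# Hypothetical usage against the two node files shown above:
#   cd /opt/ibm/WebSphereProfiles/Dmgr01/config/cells/PSCell1/nodes
#   find_shared_suffixes Node1/serverindex.xml Node2/serverindex.xml
```

No output and a zero exit status would mean every recoveryLog element has its own datasource/suffix pair. The comparison is an exact string match, so it would not catch suffixes that differ only in case.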
Finally, as this was a TEST environment, I dropped the tables: -
DROP TABLE CMNUSER.WAS_TRAN_LOGAPP1;
DROP TABLE CMNUSER.WAS_PARTNER_LOGAPP1;
DROP TABLE CMNUSER.WAS_COMP_LOGAPP1;
DROP TABLE CMNUSER.WAS_TRAN_LOGAPP2;
DROP TABLE CMNUSER.WAS_PARTNER_LOGAPP2;
DROP TABLE CMNUSER.WAS_COMP_LOGAPP2;
DROP TABLE CMNUSER.WAS_TRAN_LOGSUP1;
DROP TABLE CMNUSER.WAS_PARTNER_LOGSUP1;
DROP TABLE CMNUSER.WAS_COMP_LOGSUP1;
DROP TABLE CMNUSER.WAS_TRAN_LOGSUP2;
DROP TABLE CMNUSER.WAS_PARTNER_LOGSUP2;
DROP TABLE CMNUSER.WAS_COMP_LOGSUP2;
and recreated them: -
CREATE TABLE CMNUSER.WAS_TRAN_LOGAPP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_PARTNER_LOGAPP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_COMP_LOGAPP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_TRAN_LOGAPP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_PARTNER_LOGAPP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_COMP_LOGAPP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_TRAN_LOGSUP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_PARTNER_LOGSUP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_COMP_LOGSUP1(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_TRAN_LOGSUP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_PARTNER_LOGSUP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
CREATE TABLE CMNUSER.WAS_COMP_LOGSUP2(
SERVER_NAME VARCHAR(128),
SERVICE_ID SMALLINT,
RU_ID NUMBER(19),
RUSECTION_ID NUMBER(19),
RUSECTION_DATA_INDEX SMALLINT,
DATA BLOB);
And all is well.
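As an aside, the DROP and CREATE statements above all follow one pattern (three tables per suffix, identical columns), so they could be generated rather than typed. This is a sketch only: gen_recovery_ddl is a made-up helper name, and unlike the listing above it emits each DROP immediately before the matching CREATE.

```shell
#!/bin/sh
# Sketch: emit DROP + CREATE statements for the TRAN, PARTNER and COMP
# recovery-log tables of each suffix. First argument is the schema,
# remaining arguments are the table suffixes.
gen_recovery_ddl() {
    schema=$1
    shift
    for suffix in "$@"; do
        for kind in TRAN PARTNER COMP; do
            table="${schema}.WAS_${kind}_LOG${suffix}"
            printf 'DROP TABLE %s;\n' "$table"
            printf 'CREATE TABLE %s(\n' "$table"
            # Column list copied from the CREATE statements above.
            printf '%s\n' \
                'SERVER_NAME VARCHAR(128),' \
                'SERVICE_ID SMALLINT,' \
                'RU_ID NUMBER(19),' \
                'RUSECTION_ID NUMBER(19),' \
                'RUSECTION_DATA_INDEX SMALLINT,' \
                'DATA BLOB);'
        done
    done
}

# Hypothetical usage: write the DDL for all four cluster members to a file.
#   gen_recovery_ddl CMNUSER APP1 APP2 SUP1 SUP2 > recreate_tranlogs.sql
```

The output is plain SQL text, so it can be reviewed before being run against the database with whatever client is to hand.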
From a testing perspective, I've created an SCA module which uses a JDBC Resource Adapter to create/update/read data in an Oracle database table.
Again, that's for a future long-form article…