Tuesday, March 2, 2010

Troubleshooting AR UPDATE Process

Instructions for the Reader
A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.



Overview
The recovery from an ARUPDATE abend depends on WHERE the abend occurred in the lifecycle of the ARUPDATE multiprocess job. Hence, it is important to appreciate the 'structure' of the ARUPDATE job and the processes, executed steps and their sequence.

The Receivable Update multi-process job (ARUPDATE) consists of up to eight steps, but these 4 are always executed:

1. AR_UPDATE AE process partitions the data and, if needed, calls AE AR_POSTING to create Payment, Maintenance, Transfer, Direct Debit and/or Draft Worksheets GROUP_CONTROL and PENDING_ITEM records.

2. AR_PGG job contains AE process AR_PGG1 that calls AE process AR_PGG_SERV to generate PENDING_DST accounting entries and, if needed, VAT and IU entries for any groups that are set to post.

3. AR_POST job contains AE process AR_POST1 that calls AE process AR_POSTING to post the transactions in each selected group.

4. AR_UPDATE2 AE process performs cleanup tasks.

The AR_UPDATE, AR_PGG1, AR_POST1 and AR_UPDATE2 will generate Message Log and Application Engine (AE) trace files. Hence, when ARUPDATE is run, you can examine the following traces (produced in HTM but you can also save as TXT files) but please note the sequence of allocating Process Instances for each AE is not sequential. :

AE_AR_UPDATE__AET, AE_AR_PGG1__AET, AE_AR_POST1__AET & AE_AR_UPDATE2__AET.

ARUPDATE Main Processes
1. AR_UPDATE
1. Builds groups of pending items (GROUP_CONTROL, PENDING_ITEM) from Payment, Maintenance, Transfer, Direct Debit and Drafts worksheets and
2. Executes the parallel preprocessor to add group records to RP_PARALLEL and partition all groups set to post among the specified parallel processes.

2. AR_PGG
The AR_PGG parallel job calls the AR_PGG1 AE Process (As delivered in DEMO. At customer sites you may also have AR_PGG2, AR_PGG3, etc.). The AR_PGG1 AE Process sets the partition to #1 and then calls AR_PGG_SERV to actually process the data; it is only 1 Step. The AR_PGG_SERV main operations are:

1. Copies group records from RP_PARALLEL to a Temp table
2. Locks the PENDING_ITEM by setting the records PROCESS_INSTANCE number and updates DST_BAL_ITEM =’O’
3. Locks GROUP_CONTROL records for each group and updates DST_BAL_STATUS=’O’
4. Creates and inserts the new User then System accounting entry rows in a Temp table PS_PG_PENDST_TAOn
5. Retrieves Customer records for each pending item
6. Starts the IU processor, if needed, to add Inter/Intra rows to PENDING_DST
7. Inserts into PENDING_DST from PS_PG_PENDST_TAOn
8. Updates GROUP_CONTROL DST_BAL_STATUS=’I’ and unlocks group records
9. Updates PENDING_ITEM DST_BAL_ITEM=’I’ and unlocks pending item records


3. AR_POST
The AR_POST parallel job calls AR_POST1. The AR_POST1 AE Process contains a MAIN Section with only 1 POSTING Step and 2 Actions that sets the partition and RP_ACTION to ‘P’ and then calls the AR_POSTING AE Process. The AR_POSTING AE is called twice:

1. The 1st time from AR_UPDATE to generate any Payment, Maintenance, Transfer, Draft or Direct Debit Groups

2. The 2nd time from AR_POST1 to post the generated group pending items

The AR_POSTING logic executed depends on RP_ACTION when the program is called. If the RP_ACTION = ‘W’ then AR_POSTING will generate Worksheet groups and pending items. This was covered in the AR_UPDATE analysis section. If the RP_ACTION =’P’ then AR_POSTING will post the generated groups pending items.


4. AR_UPDATE2
The AR_UPDATE2 AE consists of a MAIN calling the UPDATE2 Section. The UPDATE2 Section calls AR_POSTING, AR_REV_EST and AR_JGEN as needed. The HISTORY Step calls AR_POSTING.UPDATE2 Section if the Run Control Parameters include any History flags. The ARREVEST Step calls AR_REV_EST.MAIN Section if Commitment Control is switched ON for AR. The FS_JGEN Step calls AR_JGEN.MAIN Section if the Run Control Process to GL flag is ‘Y’.


Troubleshooting Steps
Step 1: Verify If Online Recovery is Feasible
Navigate to: Accounts Receivable > Receivables Update > Reset Process page.

If the process does not appear on the 'Process Reset' page, click 'Refresh' or Search (magnifying glass) to ensure the page has the latest updates. If the process appears under the 'Not Successful Process Requests' list, then either or both of the 'Reset' and 'Restart' (yellow) buttons will be highlighted. This verifies whether the process can be reset or restarted online (hence avoiding external SQL updates and ensuring data integrity). If there is a need to log a GCS SR later on, ensure this page is attached to SR.

Step 2: Verify Where the Abend Occured
1.1 Which process abended: AR_UPDATE, AR_PGG [AR_PGG_SERV], AR_POST [AR_POSTING] or AR_UPDATE2?

1.1.1 If the Abend occured in AR_UPDATE (1st process) then the recovery is relatively simple as no accounting entries were generated (AR_PGG) and no posting/table updating occured (AR_POST). Once the reason for the abend is resolved (which naturally varies according to the Abend message), the job can be Restarted or Reset online (if feasible).

1.1.2 If the Abend occcured in AR_PGG/AR_PGG_SERV, then application data records have been 'locked' by the abending process (all relevant table records PROCESS_INSTANCE field updated with the running/abended Process Instance number) and will need to be unlocked (otherwise, they cannot be used by any other process). This 'unlocking' is part of the 'Reset' online process (if feasible) and effectively does 2 main updates:

1. Updates PROCESS_INSTANCE field in all relevant data records to Zero (the value '0', to allow data records to be selected again)

2. Deletes all records from RP_PARALLEL (to allow groups to be re-selected again)

The exact SQLs used by the online 'Reset' process can be found by executing this SQL:

select * from PS_AR_RESETSQL_TBL where PRCSNAME='ARUPDATE'

1.1.3 If the Abend occcured in AR_POST/AR_POSTING, then application data records could have been updated (e.g. new ITEMs inserted, new ITEM_DST accounting records added, CUST_DATA customer balances updated, etc.). If the database records updates were not Rolled Back (reveresed or not Committed) then this would constitute a major effort as it could mean data corruption (e.g. Related parent-child tables are out-of-synch, totals incorrect, etc.). The determination of the extent of the data corruption, if any, would be the first task in this (thankfully rare) situation.

1.1.4 If the Abend occured in AR_UPDATE2, then it would mean that database application records have bee updated but Customer History may have not. Depending on the Abend, this could be recovered in the next run of ARUDPATE (as Customer History updates are Run Time Options driven) but it would need in-depth investigation.

Step 3: Verify Data Status after ARUPDATE Abend
1. Verify any existing records locked by Process Instance = Abending Process Instance PI# .

(some of the tables below, e.g VAT, SUBCUST, DRAFT may not be relevant to your site):

SELECT * FROM PS_GROUP_CONTROL WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_PENDING_ITEM WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_PENDING_DST WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_PENDING_VAT WHERE PROCESS_INSTANCE =’PI#’

SELECT * FROM PS_ITEM WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_ITEM_ACTIVITY WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_ITEM_ACT_VAT WHERE PROCESS_INSTANCE =’PI#’

SELECT * FROM PS_CUST_DATA WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_CUST_HISTORY WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_SUBCUST_DATA WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_SUBCUST_HISTORY WHERE PROCESS_INSTANCE =’PI#’

SELECT * FROM PS_PAYMENT WHERE PROCESS_INSTANCE =’PI#’
SELECT * FROM PS_DRAFT_CONTROL WHERE PROCESS_INSTANCE =’PI#’

2. Verify the Abended Process selected Groups:

SELECT * FROM PS_RP_PARALLEL WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’

3. Verify the Abended PI# Run Control ID and TAO status and data:

(where '%' is your actual Run Control ID or Operator ID)

SELECT * FROM PS_AERUNCONTROL WHERE PROCESS_INSTANCE=’PI#’ AND RUN_CNTL_ID =’%’ and OPRID = ‘%’
SELECT * FROM PS_AETEMPTBLMGR WHERE PROCESS_INSTANCE=’PI#’ AND RUN_CNTL_ID =’%’ and OPRID = ‘%’

SELECT * FROM PS_PGG_SERVICE_AET WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’
SELECT * FROM PS_RP_POSTING_AET WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’'
SELECT * FROM PS_AR_UPDATE_AET WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’
SELECT * FROM PS_AE_REQUEST WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’'
SELECT * FROM PS_AE_REQUEST_OPT WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’
Step 4: Evaluate Recovery Options
Recovery Option 1 [Online Data Correction/Isolation]

If the 'offending' source of data (e.g. Online Group, External Billing Group, Maintenance Worksheet, etc) has been identfied, then this specific data source could either be corrected or 'isolated' using online pages. This allows further processing of ARUDPATE (Restart or Reset, below), especially if this Abend occurs in a Production environment with tight nightly schedules.

Example 1 : Online Group 123 is identified as containing data corruption, hence, if feasbile using online navigation path, it's Posting Status would be set to Do Not Post so it will be bypassed by the next run of ARUPDATE or it could just be completely deleted online and re-added again.

Example 2: If a missing setup is causing the abend (e.g. missing currency rate, missing Entry Type, missing Payment Terms code to calculate Due Date, etc.) then the missing setup data could be added/updated online before the Reset or Restart of the abended process.

In summary: If the identified data causing the abend can be corrected online then this would always be the preferred approach, otherwise external SQL updates may be necessary (see below).

Recovery Option 2 [Online Restart]

If the online page Accounts Receivable > Receivables Update > Reset Process shows the abended process and allows (highlighted yellow button) online Restart, and the reason for the abend is known and is transient (not permanent) and fixed then the ARUPDATE job can be Restarted (resumed) online. This will execute ARUPDATE again starting from the abended step, but this time, as the cause of the abend has been resolved externally, the step will hence execute correctly. Naturally, this assumes that the Abend cause has been identified and resolved (setup changed, currency rate added, etc.). If the Abend cause has not been identified, then Restart is not an effective action as the job will simply abend again on the exact same step.

Recovery Option 3 [Online Reset]

If the online page Accounts Receivable > Receivables Update > Reset Process shows the abended process and allows (highlighted yellow button) online Reset, the Abend cause needs more in-depth further investigation or does not allow Restart (missing setup that will adversely impact futher processing if job allowed to resume), and the database records need to be 'unlocked' then the online Reset will execute the SQL Updates of PS_AR_RESETSQL_TBL and 'release' all relevant locked records. This implies the next run of ARUPDATE will start from a 'clean slate' and all relevant records (online groups, worksheets, etc.), as per Run Time Options, will be re-selected for processing.

References to the above Restart/Reset processes are as follows: (different release levels may have different chapter numbers):

PeopleSoft Enterprise Receivables 9.0 PeopleBook > Chapter 29 Posting and Unposting Groups > Section Understanding How to Restart or Reset Receivable Update. Here is the text:
"Accounts Receivable, Receivables Update, Reset Process, Process Reset
Access the Process Reset page. Enter your search criteria for the process instance and click Refresh.
In the Not Successful Process Requests grid, select the process instance that you want"


Recovery Option 4 [Offline SQL to Unlock Records]

Usually, this Option should not be attempted if any of the Online Options are feasible. However, in some situations this could be the only feasible option. In most instances, but NOT all, an AR_PGG_SERV abend could be recovered by unlocking the records locked by the abended process (or, in other scenarios, another process impacting the abended process).

Locking simply means the records PROCESS_INSTANCE field has been set to the abended Process Instance Number. Step 3: Verify Data Status after ARUPDATE Abend above would have identified the locked records and hence the SQL updates to unlock them would take the form of (using Group_Control as an example, but you will need to repeat for all tables):

UPDATE PS_GROUP_CONTROL SET PROCESS_INSTANCE = '0' WHERE GROUP_BU ='Group Business U:nit' AND GROUP_ID ='Group ID' AND PROCESS_INSTANCE =’PI#’

After all relevant tables (see Step 3 Verify above) has been unlocked, the RP_PARALLEL table (where selected Groups are stored during each process run) will need to be cleared, otherwise any group whose record still exists on this table will NOT be selected in the next run of ARUPDATE. The SQL code will take this form:

DELETE FROM PS_RP_PARALLEL WHERE RUN_CNTL_ID =’%’ and OPRID = ‘%’
Recovery Option 5 [Offline SQL to Correct Data]

There could be instances where SQL updates to data tables (whether temporary _TAO or application) could resolve the abend and allow a Restart (resume processing) from the Abended step. This scenario typically occurs in duplicate INSERT abends (where a record with an identical key to the Source table record already exists on the Target table). The approach could be to remove the existing record from the Target table or the new record from the Source table, depending which record is erroneous.

Naturally, the offline SQLs would depend on which tables are impacted and after careful examination of subsequent impact on processing. Hence, any such SQLs must be tested on a COPY of Production database first (this, as previously stated, is advised for all external SQLs).

Example 1: An abend at AR_POSTING.DISTRIB.INSERT where INSERT INTO PS_ITEM_DST abends due to duplicate records on the PS_RP_PNDDST_TAO4 source temporary table.

1. Examine if the PS_RP_PNDDST_TAO4 source table contained multiple duplicate records. A sample SQL would be to select rows with same KEY field values:

SELECT DISTINCT BUSINESS_UNIT, CUST_ID, ITEM, ITEM_LINE,
ITEM_SEQ_NUM, LEDGER_GROUP, LEDGER, DST_SEQ_NUM
FROM PS_RP_PNDDST_TAO4 WHERE PROCESS_INSTANCE = 'PI#'

2. Examine if ITEM_DST already has records that are similar (same Key fields) as the incoming records from PS_RP_PNDDST_TAO4. A sample SQL would be to select from both tables the records that share the same KEY field values:

SELECT A.BUSINESS_UNIT, A.CUST_ID, A.GROUP_ID, A.GROUP_BU,
A.ITEM, A.ITEM_LINE, A.ITEM_SEQ_NUM, A.LEDGER_GROUP, A.LEDGER, A.DST_SEQ_NUM
FROM PS_RP_PNDDST_TAO4 A, PS_ITEM_DST B
WHERE A.PROCESS_INSTANCE = 'PI#'
AND A.BUSINESS_UNIT=B.BUSINESS_UNIT
AND A.CUST_ID=B.CUST_ID
AND A.ITEM=B.ITEM
AND A.ITEM_LINE=B.ITEM_LINE
AND A.ITEM_SEQ_NUM=B.ITEM_SEQ_NUM
AND A.LEDGER=B.LEDGER
AND A.LEDGER_GROUP=B.LEDGER_GROUP
AND A.DST_SEQ_NUM=B.DST_SEQ_NUM

Based on the results above, and approach could be to Update the PROCESS_INSTANCE = '0' on PS_RP_PNDDST_TAO4 or Delete the (duplicate) rows from PS_RP_PNDDST_TAO4 table and then Restart the process.

Example 2: An abend at AR_PGG_SERV.MNT_SUSP.INSCFBAL with error message "ORA-00947: not enough values" is traced to the online ChartField Inheritance page for SetID+Receivables (SHARE + AR_BI) not displaying the correct number of ChartFields (say displaying 4 instead of the defined 5). This is maniftested itself in the INSERT and SELECT not having corresponding number of fields ("not enough values"). The resolution would be to SQL Delete the 'SHARE'+'AR_BI' records in PS_CF_INHERIT_TBL (the table 'behind' the online page) and then re-creating the records (all 5 of them, in this example) again using the online ChartField Inheritance page.

Example 3: Abends can sometimes occur due to incorrect data values in pre-defined fields, such as special characters in Customer Names or lower-case letters in Item IDs. This is usually shown by such error messages as "ORA-01756: quoted string not properly terminated" (or similar). The recovery is to update the data with the correct values that correspond to the Field Type (e.g UPPERCASE).

Recovery Option 6 [External Table, Program or Run Control Fixes]

In some cases, the resolution could be by external actions on impacted objects (tables, programs, etc.) that does NOT involved SQL updates to data. This would include re-building tables, re-compiling code, etc.

Example 1: A rare abend at AR_POSTING.ITEM_UYT.PNDRULES with error message "ORA-00600: internal error code" (which does not have any prior references on the GCS database) was due to corrupt temporary table PS_RP_TERMS_TAO. Rebuilding the table resolved the issue, without any further SQL data manipulation.

Example 2: Abends referring to COBOL calls/references could be resolved by ensuring the COBOL directory is correct and/or re-compiling COBOL source objects.

Example 3: A one-time event (e.g. 1st run after an upgrade) caused an abend due to execution of a process step that should not be executed for upgraded data missing information needed by new setup. A possible course of action would be to inactivate the Step, upgrade, then re-activate the Step for new data.

Example 4: ARUPDATE abended due to Deadlock, hence the process can be Restarted (once the Deadlock is resolved). However, this would need manipulation of the Run Control status of the Parent and Child processes. For a detailed example refer to KM Doc ID: EAR8.8SP1+: Abended ARUPDATE Job Stays in QUEUED Status Even After Pressing Restart Button. (Doc ID 865335.1) Note 865335.1

Recovery Option 7 [Add/Fix Run Control]

Sometimes running ARUPDATE with a New Run Control record (different ID/Name to the abended process Run Control) will resolve the issue if the existing Run Control records have been corrupted in any way. However, if the existing abended job Run Control ID needs to be used again it would mean deleting its records from the control tables. This is the list of tbles where the Run Control may exist and needs to be deleted (verify first then delete if/as needed):


DELETE FROM PS_AERUNCONTROL WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
DELETE FROM PS_AETEMPTBLMGR WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'

DELETE FROM PS_AE_REQUEST WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
DELETE FROM PS_AE_REQUEST_OPT WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
DELETE FROM PS_PGG_SERVICE_AET WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
DELETE FROM PS_RP_POSTING_AET WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'

(+)DELETE FROM PS_PRCSRUNCNTL WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
(+)DELETE FROM PS_POSTING_REQUEST WHERE OPRID = '<>' AND RUN_CNTL_ID = '<>'
('+' exclude if you wish to retain Run Control, but reset AR Update)
(* where '<>' is your actual OPRID and actual Run Control ID)

Recovery Option 8 [Cleanup of Temporary Tables]

[Optional] Clean up the abended Process Instance TAO tables. The complete list exists on PS_AETEMPTBLMGR keyed by the PROCESS_INSTANCE and RUN_CNTL_ID. Delete from all _TAO tables EXCEPT for PS_PGG_SERVICE_TAO, which should NEVER be deleted. Refer to above Step 3 '3. Verify the Abended PI# Run Control ID and TAO'

SELECT * FROM PS_AETEMPTBLMGR

Recovery Option 9 [Apply Known Resolution]

The abend could be the result of a code error, hence no other fix would ultimately work in preventing it in the future. In this situation the abended process will be recovered by the relevant Option above, any corresponding changes to setup or data if/as needed, the Resolution (whether standalone fix or delivered in a Bundle) applied and then the process re-run.

7 comments:

  1. Very informative. Thank you for posting.

    ReplyDelete
  2. This was awesome..thanks so much

    ReplyDelete
  3. Oh wow....this info is fantastic. You have not only given a greater understanding of the process but you also offered several valuable solutions. Thanks a million and this is greatly appreciated.

    ReplyDelete
  4. Valueble Information on AR. Keep posting some more processes. Thanks a lot.

    ReplyDelete
  5. Awsm information ...thanks a ton for this . ..By Sachin

    ReplyDelete
  6. Tremendous effort and thanks a lot for sharing, SK

    ReplyDelete
  7. Very good article..Phew..so much effort has been put into this.

    ReplyDelete