Projects/UUIDs Usage/Technical Documentation

From InfiniteERP Wiki
Jump to: navigation, search

UUID Identifiers - Technical Documentation

General Considerations

Legacy data

Although the new format for primary must be able to store UUIDs, the changes will not be made in the legacy data primary keys. This will cause the coexistence of the two models in the tables, there will be some records with the UUID primary and other ones with the old system adapted to the new type.


DBSourceManager export

When DBSoruceManager exports database into files, records are ordered them by their primary key. This caused to have in XML files records ordered by development environment and the new records after the older ones. As consequence of this DBSourceManager was able to distinguish between core and custom code and so was indicated in the file.

UUIDs usage will cause new records to be unpredictably located in the file. This will make DBSourceManager to lose the ability of distinguishing core and custom ranges.

Note that this is not a problem in case the Modularity project is implemented because customizations will be managed as modules but it is if not.


Database datatype

UUIDs can be represented as a varchar2(32) as an hexadecimal value without scores (-) for group separator or as varchar2(36) with separators.

It also can be represented as a numeric value. The main advantage of this approach is that the current type for primary keys is a number(10) so the modification in the model would be smaller and in manual code almost none, on the other hand, varchar(32) is a more standard way to store UUID values than number, so although the higher implementation effort this should be the way to implement.

PostgreSQL 8.3 includes a native UUID type but this type will not be used because of compatibility reasons with Oracle.


UUID generation

Currently primary key values are obtained using the ad_sequence_next db store procedure. This is so because PKs were strongly related with the database information (which was the latest assigned PK value).

Now this relation is weak because PK values are generated regardless the already existent ones. This allows to generate them in the Java code before insertion (this is specially applicable to WAD generated windows). Anyway, it is still required to have a PK value generator within the database to be used by PL/SQL code.


Java

Java util package includes the UUID class which allows to generate UUIDs in a easy way. Example:

package testing;

import java.math.BigInteger;
import java.util.UUID;

public class test {
  public static void main(String[] args) {
    UUID u = null;
    String s = null;
    BigInteger b = null;
    for (int i=0; i<50; i++) {
      u = UUID.randomUUID();
      s = u.toString().replace("-", "");
      System.out.println(u+ "    " + s );
    }
  }
}


Oracle

Oracle implements sys_guid() function which returns an UUID as raw type, this can be transformed in string.

Example:

select rawtohex(sys_guid()), to_number(rawtohex(sys_guid()),'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') from dual;


PostgreSQL

Although PostgreSQL 8.3 provides storage and comparison functions for UUIDs it does not include any built-in function for generation. The contrib module contrib/uuid-ossp [1] provides them.

Anyway this module seems not to be 8.2 compatible.


PostgreSQL 8.3

Install database

apt-get install postgresql-8.3

Install contrib package

apt-get install postgresql-contrib-8.3

Important Note for PostgreSQL on Windows Users

As reported there is an issue with PostgreSQL 8.3 on Windows so UUIDs are not unique. You can check reported issue at our bugtracker for details.

There is a workaround to this issue, that is replacing the current uuid-ossp.dll with this one created by Hiroshi Saito (PostgreSQL Developer).


AD_Language Table

AD_Language table's PK does not follow the standard approach as the rest of tables do, its PK consists in a varchar2(6) instead of number(10). A new numeric column is required and its population with values taking into account all the tables referring to existent values in this one (specially *_TRL tables).


UUID identifiers implementation

These are the steps to be performed for the UUID implementation:

  • Change all the primary key column from number(10) to varchar2(32) to be able to store UUIDs.
    • Apendix A
  • Do the same for all the foreing key columns.
    • Apendix B
  • Look for non-db-FK columns that actually store references to other PKs (for example ad_tree_node.node_id, fact_acct.record_id, fact_acct.record_id2, etc...) and change its data type.
    • The list of these columns in in Appendix C.
  • Create UUID generator in Java and DB.
  • Adapt PL/SQL code to the new data types.
    • Types for PK and FK types must be transformed to varchar2(32)
  • Adapt WAD code to the new data types.
  • Adapt manual code to the new data types.
    • xsql file needn't explicitly cast char as they are doing with numbers so castings for these fields must be removed.


Double PK migration

Current model contains a number of tables with two columns as primary key, the tables with this approach are the translation ones (*_TRL) and the accounting onews (*_ACCT). These tables should be transformed into single UUID primary key. The steps to do this are:

  • Look for all the double PK tables
    • Apendix D
  • Create a new UUID type column for all of them
  • Populate with UUID values this new columns
  • Check for FKs to these tables
  • Set these columns as PK in database
  • Add the new column to the application dictionary
  • Change current references to fit the new model


Existent installations upgrade

When existent installations are going to be updated all these modifications will have to be applied to all their customizations.

Here are listed the main considerations to take into account while doing this migration.

These modifications will have to be done in all the custom code.

In the list below some modifications are proposed to be done using regular expressions, when they are for database model (PL/SQL, triggers...) they can be directly applied in the XML files and after this update the database structure from XML. This expressions should be applied in insensitive case.


UUID usage

Structure type modification

All the columns that are primary keys or foreign keys must moved from number(10) to varchar2(32).

The easiest way to do this is to edit the xml files that define the database structure and to update database structure afterwards (ant update.database.structure).

Scripts in apendixes A and B might be useful to detect the column that don't fit the new types.


PL/SQL code

The problem with PL/SQL code is that although in Oracle it is compiled and in case it has errors they are raised in compilation time, Oracle tries to do automatic casting, so for this modification it will not crash until execution time. For PostgreSQL >8.3 this casting is not performed but compilation is not done so it also crashes at execution. So even though all the procedures might be in a correct status they could crash when they are executed that's why this review should be done carefully.

  • Functions, procedures parameters and return type. For all the custom PL/SQL code it is necessary to review the types for the parameters they receive and for the return in the functions. When the parameter or the return type is a reference (FK) they must be changed to VARCHAR2.
  • Procedures with AD_Process_Para. In Openbravo the standard way for PL procedures to receive parameters is using the AD_PInstancePara table which is automatically populated by the application depending on the defined parameters for the procedure. Now the references to ID columns are stored as strings, while before this project were numbers, so they must be modified. The piece of code that looks for parameters looks like:
   FOR Cur_Parameter IN
     (SELECT i.Record_ID, p.ParameterName, p.P_String, p.P_Number, p.P_Date
     FROM AD_PInstance i
     LEFT JOIN AD_PInstance_Para p
       ON i.AD_PInstance_ID=p.AD_PInstance_ID
     WHERE i.AD_PInstance_ID=p_PInstance_ID
     ORDER BY p.SeqNo
     )
   LOOP
     v_Record_ID:=Cur_Parameter.Record_ID;
     IF(Cur_Parameter.ParameterName='Parameter_Name') THEN
       v_Parameter_Variable:=Cur_Parameter.P_Number;
       DBMS_OUTPUT.PUT_LINE('  AD_Tab_ID=' || v_AD_Tab_ID) ;
     ELSE
       DBMS_OUTPUT.PUT_LINE('*** Unknown Parameter=' || Cur_Parameter.ParameterName) ;
     END IF;
   END LOOP; -- Get Parameter

Here the part that must be modified, in case the parameter is a reference to a PK is:

     IF(Cur_Parameter.ParameterName='Parameter_Name') THEN
       v_Parameter_Variable:=Cur_Parameter.P_Number;
       DBMS_OUTPUT.PUT_LINE('  AD_Tab_ID=' || v_AD_Tab_ID) ;
     ELSE

Which should be:

     IF(Cur_Parameter.ParameterName='Parameter_Name') THEN
       v_Parameter_Variable:=Cur_Parameter.P_String;
       DBMS_OUTPUT.PUT_LINE('  AD_Tab_ID=' || v_AD_Tab_ID) ;
     ELSE

Note that to make this change possible it's also needed to modify the type for v_Parameter_Variable variable from NUMBER to VARCHAR(32).

  • Variables types All variables used to store PK or FK values must be set to VARACHAR2(32). This work cannot be completely automatized, so all the variables defined as NUMBER should be checked to ensure they are correct. Anyway, some massive replacements can be performed:
    • In most cases variables used to store PK and FK are named like VaribleName_ID so this replacement can be done: (_ID/s*)NUMBER/s*\(10\) by \1VARCHAR2(32) and afterwards (_ID/s*)NUMBER by \1VARCHAR2(32)
    • Another very tipycally used variable name is v_NextNo which stores the next PK obtained for an insertion so replacement would be: (v_NextNo\s*)NUMBER by \1VARCHAR2(32)


XSQL files

Currently xsql files have explicit casting (TO_NUMBER(?)) for PostgreSQL 8.3 compatibility in all the PK and FK parameters. These castings must be removed.

  • In select and update statements the most common syntax is
select columns
  from tables
 where column_fk_ID = to_number(?)

For these clauses the replacement to be done is: (.*_ID\s*=\s*)TO_NUMBER\(\?\) by \1'\?'. This will replace the to_number(?) by ? for all the PK and FK columns named following the standard name convention (tablename+_ID)

  • Insert statements The previous rule must be applied also to insert statements, unfortunately in this case there isn't a simple expression to do the substitutions, so they will have to be manually done. An insert statement has the following look:
 insert into table
   (column_char1, column_number2, column_date3...)
 values
   (?, to_number(?), to_date(?)...)

Depending on the column type it will be necessary to replace to_number(?) by ?


Both: XSQL and PL/SQL
  • ID comparison with constant, in some queries ID columns are compared with constant values instead of with other columns, variables or parameters. The replacement would be: (_id\s*=\s*)([0-9]+) by \1'\2'
  • Coalesce with constant, the previous rule is also applicable to coalesce: (coalesce\(.*_id\s*,)([0-9]+)\) by \1'\2'\)
Jasper reports
  • ID fields to String: Fields used to store IDs must be String (luckily these ones are not usually displayed to user), the regular expression is: sed -i 's/(<field name=".*_ID" class=")java.math.BigDecimal/\\1java.lang.String/' `find -type f -name *jrxml`
  • ID parameters to String: The same for parameters: sed 's/\(<parameter name=".*_ID".*class="\)java.math.BigDecimal/\1java.lang.String' `find -type f -name *jrxml
  • After these modifications it is necessary to recompile the modified jrxml files used as subreports. They are the ones that have a .jasper file.
Java reports

Little changes are required in java, just in case any ID value was converted into a numeric data type now it must be in String. Additionally a new static method (String Utility.stringList(String)) has been added. This method received an string that can contain a single value or a list of comma separated values and returns the same list with values between quotes ('), the aim of this method is to be used to pass values directly to xsql IN clauses.

Double PK substitution

Although to make the application work it's not strictly necessary to change multiple PK tables to single PK, it's recommended in order to follow the standards.

These are the guidelines to make it:

  • Modify table structure. This step will convert the tables with more than one column in the PK to a single PK. To do this a new column must be created for the PK and the existent ones removed from PK. The easiest way to do so is to edit the XML structure files and update database structure afterwards.
    • Locate custom tables with multiple PK executing the script in Apendix D.
    • In the XML file create a new column named TableName_ID and set it as PK:
 <?xml version="1.0"?>
  <database name="TABLE TABLE_NAME">
   <table name="TABLE_NAME" primaryKey="TABLE_NAME_KEY">
    <column name="TABLE_NAME_ID" primaryKey="true" required="true" type="VARCHAR" size="32" autoIncrement="false">
      <default/>
       <onCreateDefault><![CDATA[get_uuid()]]></onCreateDefault>
    </column>
    • Remove all the columns that were part of the PK primaryKey="true" to primaryKey="false".
    • Create uniqueness for the columns that previously were part of the primary key:
      <unique name="TABLE_COLUMN1_COLUMN2_UN">
        <unique-column name="COLUMN1"/>
        <unique-column name="COLUMN2"/>
      </unique>
  • Create in the application dictionary the columns and fields for the new PK columns. To do this execute the script in Apendix D.


Apendixes

A. PK script

This Oracle script looks for PK with NUMBER(10) type.

select t.tablename, c.columnname
  from ad_column c, 
       ad_table t, 
       user_tab_columns u
where c.ad_table_id = t.ad_table_id
  and upper(t.tablename) = u.table_name
  and upper(c.columnname) = u.column_name
  and ad_reference_id in (select ad_reference_id from ad_reference where name in ('ID'))
  and u.data_precision = 10
  and u.data_type='NUMBER'
  and not exists (select 1 from user_views where view_name = u.table_name);


B. FK script

This Oracle script obtains all the FK columns that are NUMBER(10):

select t.tablename, c.columnname
  from ad_column c, 
       ad_table t, 
      user_tab_columns u
where c.ad_table_id = t.ad_table_id
  and upper(t.tablename) = u.table_name
  and upper(c.columnname) = u.column_name
  and data_type ='NUMBER'
  and data_precision = 10
  and ad_reference_id in (select ad_reference_id from ad_reference where name in ('Table', 'TableDir', 'Search', 'Location', 'PAttribute', 'Image'))
  and not exists (select 1 from user_views where view_name = u.table_name);


C. other FK columns

This table lists the columns that are not set as FK in database but actually contain references to PKs. They do not have referential integrity because the information they contain can be references to several tables.

Table Column Comments
AD_TREENODE

AD_TREENODEBP

AD_TREENODEMM

AD_TREENODEPR

NODE_ID
AD_ATTACHMENT

AD_CHANGELOG

AD_NOTE

AD_PINSTANCE

AD_RECORD_ACCESS

FACT_ACCT

RECORD_ID
FACT_ACCT RECORD_ID2
A_AMORTIZATION

C_BUDGETLINE

C_CASH

C_DP_MANAGEMENT

C_INVOICE

C_INVOICELINE_ACCTDIMENSION

C_ORDER

C_SETTLEMENT

C_VALIDCOMBINATION

FACT_ACCT

I_GLJOURNAL

M_INOUT

M_INVENTORY

M_MOVEMENT

M_PRODUCTION

USER1

USER2

GL_JOURNALLINE C_DEBT_PAYMENT_ID These columns are in database catalog but not in application dictionary.
M_PRODUCT MRP_PLANNINGMETHOD_ID
M_TRANSACTION M_INTERNAL_CONSUMPTIONLINE_ID
C_PERIOD DIVIDEUP_FACT_ACCT_GROUP_ID
C_PERIOD CLOSE_FACT_ACCT_GROUP_ID
FACT_ACCT FACT_ACCT_GROUP_ID
C_PERIOD REG_FACT_ACCT_GROUP_ID
M_PRODUCT MRP_PLANNER_ID
S_TIMEEXPENSE M_WAREHOUSE_ID
S_TIMEEXPENSE M_PRICELIST_ID


D. Tables with multiple PK

This Oracle script obtains all the tables with more than one column as PK

 select c.table_name, count(*)
   from user_constraints c, user_ind_columns i
 where c.constraint_type='P'
  and c.index_name = i.index_name
 group by c.table_name
 having count(*)>1
 order by 2 desc,1;

There are 73 tables with more than one column as PK, note that 6 of them have 3 PK columns.

Table PK cols
AD_RECORD_ACCESS 3
AD_TREEBAR 3
C_BP_WITHHOLDING 3
C_INTERORG_ACCT 3
T_INVENTORYVALUE 3
T_REPLENISH 3
AD_ALERTRULE_TRL 2
AD_COLUMN_ACCESS 2
AD_ELEMENT_TRL 2
AD_FIELDGROUP_TRL 2
AD_FIELD_TRL 2
AD_FORM_ACCESS 2
AD_FORM_TRL 2
AD_MENU_TRL 2
AD_MESSAGE_TRL 2
AD_PINSTANCE_PARA 2
AD_PROCESS_ACCESS 2
AD_PROCESS_PARA_TRL 2
AD_PROCESS_TRL 2
AD_REFERENCE_TRL 2
AD_REF_LIST_TRL 2
AD_ROLE_ORGACCESS 2
AD_TABLE_ACCESS 2
AD_TAB_TRL 2
AD_TASK_ACCESS 2
AD_TASK_TRL 2
AD_TEXTINTERFACES_TRL 2
AD_TREENODE 2
AD_TREENODEBP 2
AD_TREENODEMM 2
AD_TREENODEPR 2
AD_USER_ROLES 2
AD_WF_NODENEXT 2
AD_WF_NODE_TRL 2
AD_WINDOW_ACCESS 2
AD_WINDOW_TRL 2
AD_WORKFLOW_ACCESS 2
AD_WORKFLOW_TRL 2
AT_COMMAND_TRL 2
A_ASSET_ACCT 2
A_ASSET_CHANGE_AMT 2
A_ASSET_GROUP_ACCT 2
C_BANKACCOUNT_ACCT 2
C_BP_EMPLOYEE_ACCT 2
C_CASHBOOK_ACCT 2
C_CHARGE_ACCT 2
C_COUNTRY_TRL 2
C_CURRENCY_TRL 2
C_DOCTYPE_TRL 2
C_ELEMENTVALUE_TRL 2
C_GLITEM_ACCT 2
C_GREETING_TRL 2
C_ORDERTAX 2
C_PAYMENTTERM_TRL 2
C_PROJECT_ACCT 2
C_TAXCATEGORY_TRL 2
C_TAX_ACCT 2
C_TAX_TRL 2
C_UOM_TRL 2
C_WITHHOLDING_ACCT 2
M_ATTRIBUTEINSTANCE 2
M_ATTRIBUTEUSE 2
M_PRODUCTPRICE 2
M_PRODUCT_ACCT 2
M_PRODUCT_CATEGORY_ACCT 2
M_PRODUCT_PO 2
M_PRODUCT_TRL 2
M_REPLENISH 2
M_SUBSTITUTE 2
M_WAREHOUSE_ACCT 2
T_REPORTSTATEMENT 2
T_SELECTION2 2
T_SPOOL 2


E. Multiple PK: columns and fields insertions

This Oracle script inserts the new PK columns and fields. This script must be run in the db user after applying the structure modifications. Another user previous to these modifications in required in the same DB in order to make the comparisons, 'OLD_USER' in the second line must be set with this user name.

<source lang="sql">

declare
  v_OldUser varchar2(30) := 'OLD_USER';
  colid varchar2(32);
  total number;
begin
  for i in (
     select t.tablename, t.ad_table_id
       from dba_constraints c, dba_ind_columns i, ad_table t
     where c.constraint_type='P'
      and c.index_name = i.index_name
      and c.owner=v_OldUser
      and i.index_owner=v_OldUser
      and upper(t.tablename) = c.table_name
     group by t.tablename,  t.ad_table_id
     having count(*)>1) loop
   
    --for existing columns remove issecondarykey values
    update ad_column
      set isSecondaryKey='N'
    where ad_table_id = i.ad_table_id
      and isSecondaryKey='Y';
    
    
    select count(*)
      into total
      from ad_column
     where ad_table_id = i.ad_table_id
       and columnname =  i.tablename||'_ID';
       
    if total = 0 then    
      --ad new column
      colid := get_uuid();
      insert into ad_column
        (ad_column_id, ad_client_id, ad_org_id, isactive,
         created, createdby, updated, updatedby,
         name, columnname, ad_table_id, ad_reference_id,
         fieldlength, iskey, ismandatory)
      values
        (colid, '0', '0', 'Y',
         now(), '0', now(), '0',
         i.tablename||'_ID', i.tablename||'_ID', i.ad_table_id, '13',
         32, 'Y', 'Y');
      
      --add fields in the tabs
      for j in (select *
                  from ad_tab
                 where ad_table_id = i.ad_table_id) loop
        insert into ad_field
          (ad_field_id, ad_client_id, ad_org_id, isactive,
           created, createdby, updated, updatedby,
           name, ad_tab_id, ad_column_id, isdisplayed, displaylength,
           seqno)
        values
          (get_uuid(), '0', '0', 'Y',
           now(), '0', now(), '0',
            i.tablename||'_ID', j.ad_tab_id, colid, 'N', 32,
            0);          
      end loop;
    end if;
  end loop;
end;

</source>