The Hash Lookup Function obtains the value for a destination column from a lookup table, according to a hashed value derived from a source column. The Hash Lookup Function allows you to consistently mask data when you use the same source and lookup tables in any environment.
The source column that is hashed does not need to be a column that will be replaced by lookup table values.
The Hash Lookup Function is case-sensitive. For example, the source values John and JOHN will be hashed to different values. You can use the TRIM parameter to convert the source value to uppercase before it is hashed.
There are two forms of the Hash Lookup Function, single column and multiple column. The single column form inserts a value into a single destination column. The multiple column form inserts values from multiple lookup table columns into corresponding destination columns, based on a single hash value from a source column.
You can enter the multiple column Hash Lookup Function for any source column that will be replaced by lookup table values, but you must edit the column map to remove the names of remaining source columns that will also be replaced.
The lookup table must include a key column that contains sequential number values without any gaps, and the remaining columns contain replacement values. The key column must be a numeric data type. The lookup table is typically indexed. The function hashes a source column to derive sequential numbers from 1 to the maximum value in the key column of the lookup table. The hashed value from the source table is matched with the sequential numbers in the lookup table, and values from the corresponding lookup table row are inserted at the destination.
If the source column used to derive the hashed value contains certain values (NULL, spaces (for CHAR columns), zero-length VARCHAR), the value is not hashed and the following reserved values are used as keys to the lookup table:
Source Value | Lookup Table Key |
---|---|
NULL | -1 |
spaces (CHAR or VARCHAR) | -2 |
zero-length VARCHAR | -3 |
The lookup table should include a row for each of these numbers, allowing you to insert a lookup value for each of these source values. If one of these source values is found and a corresponding number is not in the lookup table, a conversion error is reported.
The ignore parameter allows you to ignore the lookup table and use a source value when a row in a specified source column contains a specified value (NULL, SPACES (for CHAR columns), or zero-length VARCHAR).
You can use the preserve parameter to ignore the lookup table and use a source value when a row in a specified source column contains a specified value (NULL, SPACES (for CHAR columns), or zero-length VARCHAR). preserve can also be used to ignore the lookup table if a source column does not contain a value.
The trim parameter allows you to specify characters that will be trimmed from the source value before it is hashed. For example, if you choose to trim commas from a source value, the values Smith, John, and Smith John will each be hashed to the same value. You can also use this parameter to convert the source value to uppercase before it is hashed.
If the source value is converted to uppercase, the trim characters are also converted to uppercase.
You can use the seed parameter to vary the calculation performed by the hashing algorithm. The hashed value from the source column and the seed value are matched with a sequential number from the lookup table to obtain the replacement value for the destination column.
The syntax is:
HASH_LOOKUP( [sourcecol,] [trim=([char1char2 ] [\u]),]
dest=(col1, coln), lktablename (search,
{ value | values=(col1, coln) } ) [ ,cache | ,nocache ]
[,ignore=(col (spaces, null, zero_len ), )
| PRESERVE=( [ NOT_FOUND, ] colname (spaces, null, zero_len), ) ] )[,seed=n])
To specify a backslash “\” or a right parentheses “)”, you must precede the character with a backslash escape character. For example, to specify a right parentheses, enter: trim=(\)).
You can only use the escape character with a backslash, a right parentheses, or as part of the uppercase indicator.
For single column lookup, enter one column name only.
For multiple column lookup, the order of the column names must correspond to the destination table columns in the dest= parameter. The number of columns must equal the columns in the dest= parameter, and at least one column must include values. To not specify values for a column, do not enter a value. For example, coln().
preserve= and ignore= are mutually exclusive. ignore= will be deprecated in a future release.
Thecol, null, spaces, and zero_len operands have the same effect when used with eitherpreserve= or ignore=.
Use the Hash Lookup Function to insert values from a column in a lookup table into a destination table column, based on a value hashed from a source column.
For example, assume the source column, FIRST_NAME, contains first names and the destination column will include replacement first names from the lookup table. A lookup table, NAME_LOOKUP, contains a column (FIRST) with first names and a column (SEQ) containing sequential values.
To obtain values for the destination column using the NAME_LOOKUP table, specify:
HASH_LOOKUP(FIRST_NAME,NAME_LOOKUP(SEQ, FIRST))
The Hash Lookup Function matches the hash values from the source column with values in the SEQ column of the NAME_LOOKUP table. When a match is found, the function inserts the corresponding value from the FIRST column into the destination column.
Use the Hash Lookup Function to insert values from columns in a lookup table row into columns in a destination table row, based on a value hashed from a source column.
For example, based on values hashed from a source column (FIRST_NAME) that contains first names, you can replace values in destination columns (FIRST and LAST) with first and last names from a lookup table. A lookup table named NAME_LOOKUP contains a column (SEQ) with sequential values as well as columns (FIRST_MASK and LAST_MASK) to mask values in the destination.
To replace names in the destination table based on values hashed from a source column, specify:
HASH_LOOKUP(FIRST_NAME,DEST=(FIRST,LAST), NAME_LOOKUP(SEQ,VALUES=(FIRST_MASK, LAST_MASK)))
The Hash Lookup Function matches the hash values from the source FIRST_NAME column with values in the SEQ column of the NAME_LOOKUP table. When a match is found, the function inserts the corresponding values from the lookup table FIRST_MASK and LAST_MASK columns into the corresponding destination columns.
Use the following statement to extend the single column example, where you want to use the source NULL and SPACES values instead of lookup table values:
HASH_LOOKUP(FIRST_NAME,NAME_LOOKUP(SEQ, FIRST),IGNORE=(FIRST_NAME(NULL,SPACES)))
Use the following statement to extend the single column example, where you do not want to maintain a table of found lookup values in memory:
HASH_LOOKUP(FIRST_NAME,NAME_LOOKUP(SEQ, FIRST),NOCACHE)
Use the following statement to extend the single column example, where you want to trim spaces and commas from the source value as well as convert the source value to uppercase before it is hashed:
HASH_LOOKUP(FIRST_NAME, TRIM=( ,\u),NAME_LOOKUP(SEQ,FIRST))