Define the data to archive

After specifying the archive process parameters, you must provide an access definition.

The access definition references the tables that contain the data, provides selection criteria for the data, prescribes the way that relationships between tables are traversed, and provides other specifications such as index parameters and actions (user-defined SQL statements, including calls to stored procedures) to be executed during processing. In other words, an access definition defines the data you want to archive and provides table-specific parameters for processing.

For this scenario, you will create an access definition that is local, or exclusive to the current archive request by:

Open the Access Definition Editor

The first step in the scenario is to select the tables that contain the data that you want to archive. You can select these tables by using the Access Definition Editor. To open the Access Definition Editor, click Tools > Edit Access Definition in the Archive Request Editor. The Access Definition Editor is displayed with Tables as the active tab.

Access Definition Editor with Tables as the active tab

You use the Access Definition Editor to create and edit access definitions. In the Description box, you can enter the purpose or function of the access definition. For this scenario, type Archive orders older than two years.

Select the tables

After you open the Access Definition Editor, the next step in the scenario is to select the tables that contain data that you want to archive. Use the Tables tab to list the names of tables that contain the desired data and to define specifications for the data in the listed tables. In this scenario, you are archiving information for orders more than two years old. The ORDERS, CUSTOMERS, DETAILS, and ITEMS tables contain the desired data. Use the options on the Tables tab to select the tables.

Default qualifier

The default qualifier saves time when you enter table names in the grid, or table list. A fully qualified table name has three parts: Dbalias.Creatorid.Tablename. Dbalias is a user-defined name that provides parameters needed to connect to the database. Creatorid qualifies a table name and often reflects the user account used to create it. Tablename is the base table name.

You can use the default qualifier to provide one or both parts for you — generally, use a default qualifier that applies to most or all tables in the table list. To prefix the table names in the access definition in this scenario, enter the Dbalias and Creatorid that correspond to the sample database tables.

Start table

Each archive process begins with a start table — the table from which data is first archived — and proceeds to obtain related data from the remaining tables in the table list. Because you want to archive a specific set of data related to orders more than two years old, as determined by dates in a column in the ORDERS table, you must specify ORDERS as the start table, automatically placing ‘ORDERS' in the first line of the table list.

If you do not specify a start table, the first name entered in the table list is the assumed start table. You can change the start table by typing a new name, selecting from the Start Table list, or right-clicking a name on the tablelist and selecting Set as Start.

Table list

The table list references tables from which data is archived. When you add a table name to the table list, the type of object referenced by the table name and the name of the DBMS for the table are automatically displayed.

You can use two methods to add table names to the list. One method is to type names in the Table/View column. An easier method, however, may be to find the tables that contain related data automatically. Using the Select Table(s) dialog, you can list tables related to the start table. You can then review the list and select tables you want to include in the process.

Select Tables dialog

To display the Select Table(s) dialog, right-click the ORDERS table name in the table list and select Add Tables from the shortcut menu. The dialog provides a list of tables, based on the default qualifier.

To limit the list to tables related to the ORDERS table, use the options in the Find Tables Related to Table area. Select Find Tables Related to Table, ensure default settings Both and All Levels are selected, and click Display. The default settings ensure that all related tables (children, parents, grandparents, grandchildren, and so on) are retrieved for the entire data model.

Select Table(s) dialog

Locate the following tables in the list: CUSTOMERS, DETAILS, and ITEMS. Select each table while pressing the Ctrl key and click Select to add the tables to the Table List in the Access Definition.

Choose the data to delete

In the archive process, you copy selected data to an archive file. You can delete archived data from all or selected database tables during the archive process or you can defer the delete. After you select the tables that you want to archive, the next step in the scenario is to specify whether you want to delete rows that are archived.

Deferring the deletion of archived data gives you the opportunity to verify the data after the process is complete. For example, you can browse the archive file to ensure the correct data is archived before using the delete process to delete the archived data from the database. Regardless of whether you delete during the archive process or defer the delete, you must indicate the data intended for deletion. That is, the database tables from which archived rows are to be deleted. You can identify the tables in the access definition or in the archive delete list on the Delete After Archive Specifications dialog.

For this scenario, you are streamlining your database by deleting archived rows from the ORDERS and DETAILS tables. You intend to retain the archived rows in the CUSTOMERS and ITEMS tables, because this data (names, addresses, and information about each product in inventory) is pertinent to recent and future orders. To delete the selected rows from each table, select the Delete Rows After Archive check boxes for ORDERS and DETAILS.

Database Table grid on Access Definition Editor. The Delete Rows After Archive check boxes are selected for the ORDERS and DETAILS tables.

When defining the process parameters earlier in this scenario, you chose to delete rows as part of the process. After you run the archive data, archived rows are removed from the ORDERS and DETAILS tables.

Understand table specifications

Table specifications provide display and criteria parameters for data. You can provide specifications for any table in the table list by right‑clicking the table name and selecting a Table Specifications option from the shortcut menu.

A brief description of each option follows:

Columns
Select columns to be included and excluded from a point and shoot display. You can use the point and shoot facility to visually select specific rows from the start table.
Selection Criteria
Specify criteria to select rows according to values in one or more columns in a table.
SQL
Create an SQL Where clause for complex selection criteria, for example, criteria that requires a combination of OR and AND logical operators.
Sort
Arrange start table rows in a point and shoot display according to values in one or more columns.
Actions
Define user-written SQL statements, including calls to stored procedures, to be executed at predefined points in an archive or restore process.
Archive Index
Establish index parameters for a column (or set of columns).
File Attachments
Provide parameters needed to archive a file referenced within or associated with a row of extracted data and include it in an archive file.

In this scenario, you use table specifications to provide selection criteria for the data you want to archive and to establish an archive index. The following sections demonstrate how to accomplish these tasks.

Specify selection criteria for the data

The next step in the scenario is to select the data that you want to archive using selection criteria.

Selection criteria allow you to pinpoint the data you want to archive. You can select data according to values in one or more columns. Selection criteria must conform to SQL syntax and include relational or logical operators.

For example, you can select data for archiving by:

  • Age — determined by values in one or more DATE columns.
  • Values in one or more columns — such as an ‘inactive' indicator.
  • Manually selecting rows in the Start Table using Point and Shoot.

In the following examples, the operators and syntax might not be valid for all database management systems. However, the functions demonstrated are universal.

Example 1

To archive data for all customers with names beginning with the letter A, from Pennsylvania, you can specify:

CUSTNAME
LIKE ‘A%'
STATE
= ‘PA'

Example 2

To archive data for an item that is no longer sold to customers (for example, The Man Who Would be King), you can specify:

ITEM_ID
= ‘AD013'

Select rows on the basis of date

In this scenario, you are archiving information for orders more than two years old. You can define selection criteria for the data based on values in a DATE column — particularly, the value in the ORDER_DATE column in the ORDERS table. To accomplish the task, you can use a unique operator.

The syntax for the operator is: BEFORE (nD, nW, nM, nY)

The D, W, M, and Y arguments, in any combination, indicate the number of days, weeks, months, or years subtracted from the current date at runtime. Rows with a date older than the calculated date are archived.

Use the BEFORE operator to specify generic criteria for an archive request, which can be scheduled to run repeatedly. For example, if you wish to retain data for no more than 5 years, you can specify:

BEFORE(5Y)

as selection criteria for an archive request and schedule the process to run automatically every quarter, without having to change the selection criteria for each run.

In the table list, right-click the ORDERS table name and click Table Specifications > Selection Criteria. The Table Specifications dialog is displayed with Selection Criteria as the active tab.Table Specifications dialog with Selection Criteria as the active tab

In the Selection Criteria column, type BEFORE(2Y) as selection criteria for ORDER_DATE. To return to the table list, click File > Close or click the Close button in the title bar. Note the presence of the selection criteria icon in the table list.

Define an archive index

After you select the data that you want to archive using selection criteria, the next step in the scenario is to establish indexes for your archived tables.

You can establish one or more indexes for each archived table. An archive index consists of values in one or more columns. The index information is stored in an archive index file, which can be searched more quickly than an entire archive file can be searched. When you browse or restore data, an index can expedite the search.

Using the Archive Index tab in the Table Specifications dialog, you can create as many as 64 indexes for each table in the archive file.
Note: A large number of indexes may increase processing time.
In this scenario, you are creating one index of values for CUST_ID in the CUSTOMERS table, because you anticipate searching archive files based on values for customer ID.

In the table list, right-click the CUSTOMERS table name. On the shortcut menu, click Table Specifications > Archive Index to open the Table Specifications dialog with Archive Index as the active tab. Initially, the name of the table for which you opened the shortcut menu is displayed. To edit specifications for another table in the table list, you can select the name from the Table drop-down list.

Table Specifications dialog with Archive Index as the active tab

By default, index names are generated in the form Indexn, where n is a sequential number that provides a unique name for each index. For this scenario, use the default index name.

Available Columns lists columns in the selected table that you can use as index columns. Double-click CUST_ID to add the column name to the Index Columns list. The column is defined as an Index Column. You can also drag a column name from Available Columns to Index Columns. If the index consists of more than one column, you can change the order of the columns in the index by dragging a column name to the desired position. The order determines the sequence in which the column values are searched.

When finished, click File > Close to save the index specifications and return to the Access Definition Editor. Note the presence of the archive index icon in the table list. The index created in the archive process is used later in this scenario.

Review table specifications

After you establish indexes for your archived tables, the next step in the scenario is to review the table specifications to ensure that they are correct.

Icons on the table list indicate the table specifications used in this scenario.

Access Definition Editor with Tables as the active tab. Icons for the ORDERS and CUSTOMERS tables are displayed in the grid under Table Specifications.

Selection Criteria icon The selection criteria icon indicates that selection criteria is specified for the ORDERS table.

Archive Index icon The archive index icon indicates that an index is defined for the CUSTOMERS table.

You can click an icon to quickly display the table specifications on the corresponding tab in the Table Specifications dialog.

Define the traversal path

After you review the table specifications, the next step in the scenario is to define the traversal path. That is, you select the relationships to be used and the direction in which the relationships are traversed — from parent to child, from child to parent, or in both directions — during the archive process.

The traversal path determines the sequence in which data is selected from tables referenced in the access definition. Consider the database structure of the tables you selected for archiving.

Diagram of traversal path among the CUSTOMERS, ORDERS, DETAILS, and ITEMS tables. The relationships are described as follows.

In the diagram and in the following discussion, the relationships between tables are indicated by a three-letter code: the letter "R", the first letter in the name of the parent table, and the first letter in the name of the child table.

ORDERS is the start table, because you want to archive a specific set of data related to ORDERS more than two years old. Based on the relationships between each pair of tables:

  • CUSTOMERS is a parent of ORDERS (relationship RCO).
  • ORDERS and ITEMS are parents of DETAILS (relationships ROD and RID).

To obtain the desired set of data for this scenario, you must include relationships RCO, ROD, and RID in the archive process and specify the direction in which they are traversed in processing.

Use the options on the Relationships tab to define the traversal path for selecting data from the tables referenced in the Access Definition.

Access Definition Editor with Relationships as the active tab

All relationships between pairs of tables in the table list are displayed. The parent table and the corresponding child table in each relationship is provided.

Review Status

Each listed relationship has a status, which indicates:

New
Relationship is listed for the first time. Once you have viewed the relationship, the status is no longer new, which is indicated by a blank status the next time you display the Relationships tab.The status of a relationship is New the first time the list is displayed after:
  • A new relationship between tables on the list is defined.
  • A table is added to the list.

When you run the process, a warning message advises you of any new relationships. Although you can proceed despite the warning, it might be prudent to review the relationship usage list, which is the point of the warning.

blank
Relationship is defined and is not new to the list.
Ref
Relationship is ignored because the parent or child table is a reference or lookup table, as specified on the Tables tab.
Unknown
Relationship does not exist. This condition can occur when a change in default qualifier causes a new set of tables and relationships to be referenced in the access definition. Relationships with unknown status are ignored during a process. To remove a relationship in unknown status, right-click to open the shortcut menu and select Remove or Remove All Unknown.

The Use new relationships check box is selected by default to include all relationships with a new status in the process. If you clear this check box, you must manually select new relationships to include them in the process. Clearing the check box ensures that a completed access definition is not affected by the addition of a new relationship.

Select Relationships

The Select column determines whether a relationship is used in the process. For this scenario, verify that check boxes for relationships ROD, RCO, and RID are selected.

Specify Options 1 and 2

By default, the process begins at the start table and proceeds through the data model, traversing selected relationships from parent to child — ORDERS to DETAILS. Specifically, the process obtains rows from ORDERS that match your selection criteria and then traverses the relationship ROD to obtain related DETAILS rows, the children of ORDERS.

However, to select related data from ITEMS and CUSTOMERS, the remaining relationships from child to parent must be traversed. For example, to obtain CUSTOMERS rows related to ORDERS and maintain referential integrity, the relationship RCO must be traversed from child to parent. Similarly, to obtain ITEMS rows related to DETAILS, the relationship RID must be traversed from child to parent.

You can use the Options check boxes to extend the traversal path:

  • Option (1) determines whether a relationship from child to parent is traversed to archive a parent row for each selected child row. By default, Option (1) is selected. As a result, this setting maintains the relational integrity of the data.
  • Option (2) determines whether additional child rows are archived when a parent row is archived because of Option (1). In other words, if a relationship has been traversed from child to parent to archive a parent row, all children of that parent are archived when Option (2) is selected. By default, Option (2) is cleared.

Examples

The Option (1) and Option (2) specifications are relevant when:

  • The start table is a child table. For example, if the ORDERS table is the start table, the Option (1) setting determines whether the related CUSTOMERS rows are selected and, if so, the Option (2) setting determines if additional ORDERS rows, related to the CUSTOMERS rows, are selected.
  • A table has more than one parent table. For example, if the ITEMS table is the Start Table, related rows are extracted from the DETAILS table because it is a child of the ITEMS table. However, the ORDERS table is also a parent of the DETAILS table. The Option (1) setting determines whether a traversal is made from child to parent to select the ORDERS rows related to the DETAILS rows and, if so, the Option (2) setting determines if additional DETAILS rows, related to the ORDERS rows, are selected.

In this scenario, the default settings for each relationship direct processing to:

  • Select data from rows in the ORDERS table that meet the selection criteria BEFORE(2Y) for ORDER_DATE.
  • Select parent rows from the CUSTOMERS table because Option (1) is selected for the relationship RCO between CUSTOMERS and ORDERS.
  • Select child rows from the DETAILS table to follow the basic traversal path from parent to child for the relationship ROD between ORDERS and DETAILS.
  • Select parent rows from the ITEMS table because Option (1) is selected for the relationship RID between DETAILS and ITEMS.

Review the traversal path

After defining the traversal path but prior to running the process, verify that the appropriate data will be archived.

Use the Show Steps option to display the steps to be performed in the process. Click Tools > Show Steps to display a narrative that describes the traversal path.

Show Steps for Extract dialog. This dialog lists the steps in the traversal path.

To return to the Relationships tab, click File > Close. Then click File > Update and Return to update the access definition and return to the Archive Request Editor.



Feedback