iCan Blog Archive

Now that the 7.1 XML enhancements for Db2 for i have been announced, it’s a great time to mention the XML search support that is available in OmniFind V1R2.  

When the text search index is built over an XML column, OmniFind will, by default, include the XML markup when indexing and searching documents. XML search allows a search to be scoped to a specific element or attribute, rather than the entire document. 

In addition, the search syntax allows comparing an element or attribute value to a numeric, ISO date or ISO dateTime value during the search.

For example, let’s say I insert the document below into an XML column PRODUCT_ORDER of a table ORDERS:

> INSERT INTO ORDERS (PRODUCT_ORDER) VALUES(

'George Jetson
Referential Universal Digital Indexer (RUDI)
999.99 
2010-04-20T13:00:00 
2010-04-22 ');

Now after my XML data has been indexed using OmniFind, I can perform searches that include the XML markup in the search.

Maybe I want to search for orders where the customer’s title is a variation of “digital indexing operator”.  (Notice that “indexing” is a different linguistic variation of the word “index” that exists in the /ORDER/CUSTOMER/TITLE element of my document that I hope to find.)

I can find my document with this query:

> SELECT PRODUCT_ORDER FROM ORDERS WHERE CONTAINS(PRODUCT_ORDER, '@xmlxp:''/ORDER/CUSTOMER/TITLE[. contains("digital indexing operator")] '' ' ) = 1;

Those familiar with the popular W3 XPath syntax will notice the XML search syntax used by OmniFind is similar to that standard. 

The capability to include numeric, xs:date and xs:dateTime values in the search adds more potential.

Let’s say I only want rows for orders from George that were ordered this year.

  > SELECT PRODUCT_ORDER FROM ORDERS WHERE CONTAINS(PRODUCT_ORDER, '@xmlxp:''/ORDER[CUSTOMER/NAME contains("George Jetson") and TIME_ORDERED >= xs:dateTime("2010-01-01T00:00:00")]'' ') = 1;

The same concept applies to the xs:date data type. I can search for orders delivered to George on or before 2010-12-31.

>   SELECT PRODUCT_ORDER FROM ORDERS WHERE CONTAINS(PRODUCT_ORDER, '@xmlxp:''/ORDER[CUSTOMER/NAME contains("George Jetson") and DATE_DELIVERED <= xs:date("2010-12-31")]'' ') = 1;

OmniFind is optimized for searches that involve text, but there’s no requirement that the search include a full text search ‘contains’ in the xmlxp expression.

>  SELECT PRODUCT_ORDER FROM ORDERS WHERE CONTAINS(product_order, '@xmlxp:''/ORDER/COST[.  > 999] '' ') = 1;

Returns all rows for orders where cost is greater than 999.

Searches can include attributes in addition to elements:

>  SELECT PRODUCT_ORDER FROM ORDERS WHERE CONTAINS(PRODUCT_ORDER, '@xmlxp:''/ORDER/COST[.  > 999 and @unit = "space tokens"] '' ') = 1;

Returns all rows for orders where the cost is greater than 999 and the “unit” attribute of COST is “space tokens”.

For the advanced XML user, the XML search syntax also supports XML namespaces on element and tag names.

In addition to XML support, there are a number of other improvements that were added in the new release. 

  •  A text Search index’s structure can be backed up and restored with traditional IBM i Save and Restore interfaces.
  •  Move and Rename of a text search index and it’s based on table are now supported.
  • The Generate SQL Data Definition Language (QSQGNDDL) API is supported for text search indexes
  • IBM Navigator for i supports many common administrative tasks for Text Search indexes.
  • The text search server can have a unique alias name assigned to it, in addition to the numeric identifier assigned to it by the system.

These are some exciting capabilities for search and XML, in this new release…worth keeping an eye on.

Nick Lawrence wrote this week’s blog as a followup to his recent “Find a Needle in a Haystack using OmniFind Text Search Server for DB2 for i.” Nick works for IBM in Rochester, Minn., on Db2 for IBM i. His responsibilities include catalog support, Db2 object management and full-text search. Thanks, Nick!

This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.