System iNetwork


This page is a local copy of http://systeminetwork.com/article/build-new-interface-legacy-applications-sql-udtfs

Build a New Interface to Legacy Applications with SQL UDTFs

by Jean-Paul Lamontre
System iNetwork Programming Tips Contributor


February 05, 2007 —  

With a good method and a good tool, writing a modern application is easy. But what can you do when you need to work with a piece of "untouchable" legacy code? After all, many companies stick to the old mantra of "if it ain't broke, don't fix it!"

For example, you want to show some data, but the data doesn't come from a true database table. Instead it's calculated by a piece of code. And that code runs well, but it's undocumented and nearly unmaintainable — a complex module that shows items' availability depending on your complex internal business rules. In such a case, using the same code for current green-screen users and for new web users is important. It's not always feasible to rewrite the code in Java or to maintain two different programs that do the same thing! Or perhaps you have data too volatile to store in a table, so it has to be calculated each time you need it.

This article demonstrates how an SQL User-Defined Table Function (UDTF) can help you keep your existing business code, with minimal changes, while still letting you access the data from modern tools.

For example, let's say that you are debugging a web application and you need to know the attributes of the server job. Because you need many attributes, building a normal SQL user-defined function (UDF) is insufficient. A normal UDF can return only one parameter, so returning a list of fields is not pretty.

Here's some sample code that demonstrates what I mean:

  RTVJOBA    JOB(&JOB) USER(&USER) NBR(&NBR) +
               INQMSGRPY(&INQMSGRPY) USRLIBL(&USRLIBL) +
               CURLIB(&CURLIB) SYSLIBL(&SYSLIBL) +
               CURUSER(&CURUSER) LANGID(&LANGID) +
               CNTRYID(&CNTRYID) CCSID(&CCSID) +
               DFTCCSID(&DFTCCSID) DECFMT(&DECFMT)

We'd need a miracle to encapsulate all these fields in a UDF. What sort of miracle? An SQL UDTF.

A UDTF is a user-defined function that returns a virtual table instead of a single value. It can return many rows, and each row can have many columns. Consider the following SQL statement:

SELECT * FROM table(jpltools1.udtffunc4()) as t

When I run this statement, it returns the following:

In the preceding example, UDTFFUNC4 is the name of a function that I wrote to retrieve the job attributes. Because the call to that program is wrapped by the TABLE() keyword, SQL views the results as a table, and I can issue a SELECT statement on it, just as I would any other table.

The link between this SELECT and my ILE CL program is called an "external table function." External means that it's not written in SQL. To install this function on my system, I run the following SQL statement to tell SQL about my function and list the columns of the virtual table that it returns:

CREATE FUNCTION  JPLTOOLS1/UDTFFUNC4 ()
RETURNS TABLE (
  JOB       CHAR   (10)
, USER      CHAR   (10)
, NBR       CHAR    (6)
, INQMSGRPY CHAR   (10)
, USRLIBL   CHAR (2750)
, CURLIB    CHAR   (10)
, SYSLIBL   CHAR  (165)
, CURUSER   CHAR   (10)
, LANGID    CHAR    (3)
, CNTRYID   CHAR    (2)
, CCSID     DECIMAL (5, 0)
, DFTCCSID  DECIMAL (5, 0)
, DECFMT    CHAR    (1)
)
LANGUAGE CLLE
PARAMETER STYLE DB2SQL
NOT DETERMINISTIC
NO SQL
CALLED ON NULL INPUT
NO DBINFO
NO EXTERNAL ACTION
NOT FENCED
NO FINAL CALL
DISALLOW PARALLEL
SCRATCHPAD
EXTERNAL NAME JPLTOOLS1/UDTFC4
CARDINALITY 1

In contrast to a typical stored procedure that returns a result set, a UDTF has its columns defined ahead of time by a CREATE FUNCTION statement, as I demonstrated in the preceding example.

The EXTERNAL NAME keyword specifies the ILE CL program that the UDTFFUNC4 function calls. In this example, the name of that program is UDTFC4, which happens to be the ILE CL program that returns the job information.

You can control how parameters are exchanged between SQL and ILE CL (or whichever language you've chosen) by specifying the following options when you run the CREATE FUNCTION statement:

PARAMETER STYLE DB2SQL mandatory — no choice, no comment
NOT DETERMINISTIC two successive calls won't always return the same values: SQL cannot cache the returned values when optimizing
NO SQL there is no SQL package associated; save & restore operation on *PGM cannot restore attached SQL function
CALLED ON NULL INPUT SQL cannot return NULLs directly without calling underlying code
NO DBINFO the function does not need the database information passed as an additional parameter; details in rbafzmst.pdf (DB2 Universal Database for iSeries SQL Reference)
NO EXTERNAL ACTION no call outside SQL scope, some optimizations are allowed
NOT FENCED code doesn't need to run in a dedicated thread, some optimizations are allowed
NO FINAL CALL function doesn't need additional calls to reserve resources or free them up; it needs to be called only for open, fetch, and close steps
DISALLOW PARALLEL function is not thread safe
SCRATCHPAD function needs a working variable space attached to the thread, totally under control of function code; it's an integer field that specifies the length, followed by the data; default length is 100 bytes, so an ILE CL program would receive a 104 byte parameter
CARDINALITY 1 average number of rows that the function returns

The source code for the UDTFC4 program is included in the code bundle for this article. Use the link at the end of the article to download it.

The UDTFC4 program demonstrates how SQL works with an external program for a table function. Here's what it does:

  • The first time SQL needs the function, it calls the underlying code with the CallType parameter set to -2 (which means "FIRST CALL:). (This step is skipped if you specify NO FINAL CALL on the CREATE FUNCTION statement.)

  • Each time a SELECT statement calls the function, SQL prepares a new SCRATCHPAD, then calls the program once more, this time specifying -1 for the CallType parameter to indicate that it's an "open call" (this is where you would open files if you needed to use them in the function).

  • SQL then calls the function repeatedly (in a loop) with the CallType parameter set to 0 (which means "FETCH CALL:). It continues to do this until you send back SQLSTATE='02000' (= normal end of data) in the SQLSTATE parameter.

  • SQL then calls it once with CallType=1 (=CLOSECALL).

  • The last time SQL needs the function (e.g., when executing a COMMIT), it calls with CallType = 2 (=FINALCALL). This is done only if "NO FINAL CALL" is not specified on the CREATE FUNCTION statement. You should avoid performing a COMMIT during the FINAL CALL, because it's too late for SQL at that point.

  • During any of these calls, you can set SQLSTATE to an error code other than '00000' and '02000' to indicate an error. This might change the sequence in which SQL calls the function.

How can you see the code running? Start a service job on the job that runs the call to the UDTF, then start debug on the program. I suggest using iSeries Navigator's Run Sql Script option to test your UDTF, because it shows itself the job name that you need for the service job.

After these introductory games, I've provided another sample function coded in RPG. Instead of demonstrating some complex legacy code (which wouldn't be very demonstrative, because everyone's legacy code is different), I think it's better to show a function that returns data from the QIWS/QCUSTCDT file, which comes with every i5/OS installation.

My sample program is called UDTFR3, and it's the external program for the UDTFFUNC3 function. It's included with the code download for this article, so you can see how it's written.

Note: I've tried very hard to protect the code against an abnormal end by coding many MONITOR and (E) extenders, because SQL does not understand classic "abnormal ends" of programs. Instead, an error should be signaled to SQL using the SQLSTATE parameter.

To test it normally, have the QCUSTCDT file online (either add QIWS to your library list or duplicate the file into your test library). To test an error case, remove the file from your library list. In either case, run the following SQL statement:

SELECT * FROM table(jpltools1.udtffunc3()) as t

It should return the following:

The SQL table function calls the RPG program as many times as necessary to build the virtual table, then processes it with the global SELECT statement to compute the results. It can be a completely different table — it doesn't matter.

Conclusion

You can see that building a UDTF is a little complex. If you have ever coded a user-defined result set procedure, you'll notice that an RS Stored Proc is a little simpler, because you don't have to define returned data: The returned data is dynamic, which is the opposite of "strongly typed."

It's the price you pay for the integrity of your data and is the same idea as the format level identifier of a file. With a strongly typed SQL statement (i.e., a table function), the client, which might be developed by another team (e.g., the web team), knows exactly which data will be returned and can receive only that data. It's the same as the way a program cannot run with a file when the format level identifier has changed.

Also, because the column layout is known in advance, it's possible to call a UDTF from a language that uses an SQL precompiler, such as SQLRPGLE. This is impossible with stored procedures, because the fields in the result set are unknown when the precompiler is run.

A smart development tool can see a change in the function definition. This is impossible with a result set procedure, because the result set structure is known by SQL only after executing the procedure.

A table function provides protection, but it's weaker than a format level identifier (FLI). With a changed FLI, a program cannot run. A modified table function continues to be callable, and code continues to work, not being sensitive to change. The development team must to take care of the changes. A result set, by its construction, is loaded to the client, then used. It can be a heavy network load!

A table function, because of its destination (to be incorporated in a SELECT), participates in more global work: Build an answer to the client, including the WHERE clause on the temporary table, and all the work is done on the server side.

Code Download

You can download the sample UDTF programs from the following link:
http://www.pentontech.com/IBMContent/Documents/article/54008_162_UDTF.zip








Penton Technology Media
Connected Home | Group Computing | IT Buyer's Network | SQL Server Magazine | WebSphere Professional | Windows 2000 Magazine
Report Bugs | Contact Us | Comments/Suggestions | Site Use Agreement | Privacy Policy
Copyright © Penton Technology Media
Penton Media
System i and iSeries are trademarks of International Business Machines Corporation and are used by Penton Media, Inc., under license. systeminetwork.com is published independently of International Business Machines Corporation, which is not responsible in any way for the content. Penton Media, Inc., is solely responsible for the editorial content and control of the iSeries Network.