Build a New Interface to Legacy Applications with SQL UDTFs
by Jean-Paul Lamontre
February 05, 2007 — With a good method and a good tool, writing a modern application is easy. But what can you do when you need to work with a piece of "untouchable" legacy code? After all, many companies stick to the old mantra of "if it ain't broke, don't fix it!" For example, you want to show some data, but the data doesn't come from a true database table. Instead it's calculated by a piece of code. And that code runs well, but it's undocumented and nearly unmaintainable — a complex module that shows items' availability depending on your complex internal business rules. In such a case, using the same code for current green-screen users and for new web users is important. It's not always feasible to rewrite the code in Java or to maintain two different programs that do the same thing! Or perhaps you have data too volatile to store in a table, so it has to be calculated each time you need it. This article demonstrates how an SQL User-Defined Table Function (UDTF) can help you keep your existing business code, with minimal changes, while still letting you access the data from modern tools. For example, let's say that you are debugging a web application and you need to know the attributes of the server job. Because you need many attributes, building a normal SQL user-defined function (UDF) is insufficient. A normal UDF can return only one parameter, so returning a list of fields is not pretty. Here's some sample code that demonstrates what I mean: RTVJOBA JOB(&JOB) USER(&USER) NBR(&NBR) + INQMSGRPY(&INQMSGRPY) USRLIBL(&USRLIBL) + CURLIB(&CURLIB) SYSLIBL(&SYSLIBL) + CURUSER(&CURUSER) LANGID(&LANGID) + CNTRYID(&CNTRYID) CCSID(&CCSID) + DFTCCSID(&DFTCCSID) DECFMT(&DECFMT) We'd need a miracle to encapsulate all these fields in a UDF. What sort of miracle? An SQL UDTF. A UDTF is a user-defined function that returns a virtual table instead of a single value. It can return many rows, and each row can have many columns. Consider the following SQL statement: SELECT * FROM table(jpltools1.udtffunc4()) as t When I run this statement, it returns the following: In the preceding example, UDTFFUNC4 is the name of a function that I wrote to retrieve the job attributes. Because the call to that program is wrapped by the TABLE() keyword, SQL views the results as a table, and I can issue a SELECT statement on it, just as I would any other table. The link between this SELECT and my ILE CL program is called an "external table function." External means that it's not written in SQL. To install this function on my system, I run the following SQL statement to tell SQL about my function and list the columns of the virtual table that it returns: CREATE FUNCTION JPLTOOLS1/UDTFFUNC4 () RETURNS TABLE ( JOB CHAR (10) , USER CHAR (10) , NBR CHAR (6) , INQMSGRPY CHAR (10) , USRLIBL CHAR (2750) , CURLIB CHAR (10) , SYSLIBL CHAR (165) , CURUSER CHAR (10) , LANGID CHAR (3) , CNTRYID CHAR (2) , CCSID DECIMAL (5, 0) , DFTCCSID DECIMAL (5, 0) , DECFMT CHAR (1) ) LANGUAGE CLLE PARAMETER STYLE DB2SQL NOT DETERMINISTIC NO SQL CALLED ON NULL INPUT NO DBINFO NO EXTERNAL ACTION NOT FENCED NO FINAL CALL DISALLOW PARALLEL SCRATCHPAD EXTERNAL NAME JPLTOOLS1/UDTFC4 CARDINALITY 1 In contrast to a typical stored procedure that returns a result set, a UDTF has its columns defined ahead of time by a CREATE FUNCTION statement, as I demonstrated in the preceding example. The EXTERNAL NAME keyword specifies the ILE CL program that the UDTFFUNC4 function calls. In this example, the name of that program is UDTFC4, which happens to be the ILE CL program that returns the job information. You can control how parameters are exchanged between SQL and ILE CL (or whichever language you've chosen) by specifying the following options when you run the CREATE FUNCTION statement:
The source code for the UDTFC4 program is included in the code bundle for this article. Use the link at the end of the article to download it. The UDTFC4 program demonstrates how SQL works with an external program for a table function. Here's what it does:
How can you see the code running? Start a service job on the job that runs the call to the UDTF, then start debug on the program. I suggest using iSeries Navigator's Run Sql Script option to test your UDTF, because it shows itself the job name that you need for the service job. After these introductory games, I've provided another sample function coded in RPG. Instead of demonstrating some complex legacy code (which wouldn't be very demonstrative, because everyone's legacy code is different), I think it's better to show a function that returns data from the QIWS/QCUSTCDT file, which comes with every i5/OS installation. My sample program is called UDTFR3, and it's the external program for the UDTFFUNC3 function. It's included with the code download for this article, so you can see how it's written. Note: I've tried very hard to protect the code against an abnormal end by coding many MONITOR and (E) extenders, because SQL does not understand classic "abnormal ends" of programs. Instead, an error should be signaled to SQL using the SQLSTATE parameter. To test it normally, have the QCUSTCDT file online (either add QIWS to your library list or duplicate the file into your test library). To test an error case, remove the file from your library list. In either case, run the following SQL statement: SELECT * FROM table(jpltools1.udtffunc3()) as t It should return the following: The SQL table function calls the RPG program as many times as necessary to build the virtual table, then processes it with the global SELECT statement to compute the results. It can be a completely different table — it doesn't matter. ConclusionYou can see that building a UDTF is a little complex. If you have ever coded a user-defined result set procedure, you'll notice that an RS Stored Proc is a little simpler, because you don't have to define returned data: The returned data is dynamic, which is the opposite of "strongly typed." It's the price you pay for the integrity of your data and is the same idea as the format level identifier of a file. With a strongly typed SQL statement (i.e., a table function), the client, which might be developed by another team (e.g., the web team), knows exactly which data will be returned and can receive only that data. It's the same as the way a program cannot run with a file when the format level identifier has changed. Also, because the column layout is known in advance, it's possible to call a UDTF from a language that uses an SQL precompiler, such as SQLRPGLE. This is impossible with stored procedures, because the fields in the result set are unknown when the precompiler is run. A smart development tool can see a change in the function definition. This is impossible with a result set procedure, because the result set structure is known by SQL only after executing the procedure. A table function provides protection, but it's weaker than a format level identifier (FLI). With a changed FLI, a program cannot run. A modified table function continues to be callable, and code continues to work, not being sensitive to change. The development team must to take care of the changes. A result set, by its construction, is loaded to the client, then used. It can be a heavy network load! A table function, because of its destination (to be incorporated in a SELECT), participates in more global work: Build an answer to the client, including the WHERE clause on the temporary table, and all the work is done on the server side. Code DownloadYou can download the sample UDTF programs from the following link: |
|