Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Optimized Conversion of SQL Identifiers between EBCDIC 500 and UTF-8 in a Distributed Environment

IP.com Disclosure Number: IPCOM000021750D
Original Publication Date: 2004-Feb-05
Included in the Prior Art Database: 2004-Feb-05
Document File: 3 page(s) / 38K

Publishing Venue

IBM

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 37% of the total text.

Page 1 of 3

Optimized Conversion of SQL Identifiers between EBCDIC 500 and UTF-8 in a Distributed Environment

Optimizing a conversion algorithm is helpful in improving the performance in all computing environments. The following description details optimizing conversion between EBCDIC 500 code page and UTF-8.

Currently, in a distributed environment, a general purpose conversion routine is used to do the conversion for most conversion needs from EBCDIC 500 to UTF-8 and vice versa. The problem with using the general purpose conversion routine indiscriminately is that the cost can be high : the machine instructions needed can range from 700 to 3000 per call. The problem is especially bad when it involves columns, because a table can have many hundreds of columns; multiply that with any number between 700 and 3000, and the problem is obvious.

In a distributed environment, the situation worsens because the conversion is done once at the server reply object generation and again at the requester reply object parsing . This problem can be alleviated by doing some optimization before calling the general purpose conversion routine . Its application is not limited to only the conversion between EBCDIC 500 and UTF-8. Any conversion can benefit from some optimization before calling the general purpose conversion routine.

Using optimization, the number of instructions required to the conversion can be drastically reduced. With optimization, it will take about 10 machine instructions to complete the conversion versus the possible 700-3000 instructions. Actually The algorithm can be applied to non-distributed environment as well.

The optimization algorithm is as follows,

1.Rule out characters that cause expansion or contraction, such as the "not" sign, etc,

2.Rule out delimited SQL identifiers, indicated by either a single quote or double quote around the data string, 3.Rule out SQL identifiers consisting of mixed data string containing the shift-out ('0E'X) and the shift-in('0F'X) characters.

The above 3 items can be done by using the function VERIFY to ensure each character in the SQL identifier data string is of the char A-Z, 0-9, $,# and @. It also ensures that there are no translation errors.

4. If the VERIFY function indicates that each character in the SQL identifier data string is A -Z, 0-9, $,# and @, use a TRanslate instruction with a translate table to do the conversion . If the VERIFY function indicates that any character of the SQL identifier is not A-Z, 0-9,$,# or @, use the general purpose conversion routine.

The following is the pseudo code for converting from EBCDIC 500 to UTF-8:

Declare I fixed bin(15);

Declare allowed_chars CHAR(39) CONTANST('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789$#@'); I = 0; ! initialize

I = Verify(input_sql_identifier,allowed_chars);

IF I > 0 Then

Call the general purpose conversion routine

1

Page 2 of 3

ELSE

use TR(input_sql_identifier,DSNZE2U);

The Translate table for EBCDIC 500 to UTF-8 is as follows,

DCL

1 DSNZE2U CHA...