Bridging raw TCP binary data to HTTP
Publication Date: 2010-Jul-23
The IP.com Prior Art Database
DataPower Appliances are able to deal with XML as well as binary data received through many protocols (HTTP, MQ, FTP, ...). They are also able to deal with XML received over a raw TCP connection without any protocol. Many specific binary data formats are transmitted over raw TCP, for example SML (Smart Metering Language, Energy and Utilities) and ISO 8583. There are even combined formats, like Microsoft Exchange, which transfer XML data padded with binary data. The binary data prevents "normal" XML processing. While it is easy to process this binary data on DataPower if it is received over a protocol (HTTP, MQ, ...), it is not possible to process it when it is sent over raw TCP. A TCP proxy receives raw TCP data on the frontside and just forwards it to a backend. Responses received from the backend are forwarded to the frontend.
Detailed description of the modified TCP proxy steps:
This is the dummy header sent to the backend on connection creation: "POST / HTTP/1.1\r\nHost: 184.108.40.206\r\nUser-Agent: none\r\nTransfer-Encoding: chunked\r\n\r\n". The POST request line specifies the HTTP/1.1 protocol, which provides chunked transfer encoding. The Host entry is necessary but not used, therefore the IP address is set to 220.127.116.11. Alternatively, the more correct IP address of the backend could be supplied. The User-Agent entry is necessary but unimportant. It could be any string, but "none" best describes the situation since there is no user agent on the frontside. Last but not least, the Transfer-Encoding entry specifies chunked transfer, which is necessary for this solution.
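The header construction described above can be sketched in a few lines of Python. This is only an illustration: the function name build_dummy_header and its host parameter are hypothetical, and the default host value is simply the one shown in the header string above.

```python
def build_dummy_header(host: str = "184.108.40.206") -> bytes:
    """Return the artificial HTTP request header sent once per backend connection.

    The function name and the `host` parameter are illustrative assumptions;
    the header fields themselves follow the description in the text.
    """
    lines = [
        "POST / HTTP/1.1",             # HTTP/1.1 is required for chunked transfer
        f"Host: {host}",               # necessary but not used by the solution
        "User-Agent: none",            # no real user agent on the frontside
        "Transfer-Encoding: chunked",  # the entry this solution depends on
    ]
    # Header lines are separated by CRLF; an empty line terminates the header.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

Passing the real backend address as host would give the "more correct" variant mentioned above.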
Any data packet received from the frontend is simply wrapped in a few bytes that make it a chunked data packet for submission to the backend. The TCP proxy that forms the basis of this solution typically has two buffers, in which data received from the frontside and the backside is held before being sent to the backside and the frontside respectively. Each buffer has a specific length, take 32KB as an example. This size is an upper bound on the largest packet sent to the backend, and in this case 16 bits, or 4 hex digits, are enough to signal the data length. Chunked packets are of the form "hex-length(DATA)\r\nDATA\r\n".
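A minimal sketch of this wrapping step, assuming a 4-hex-digit, zero-padded length field as suggested by the buffer-size discussion. The names encode_chunk and END_OF_TRANSMISSION are hypothetical and chosen only for this example.

```python
# Zero-size chunk that signals end of transmission to the backend.
END_OF_TRANSMISSION = b"0\r\n\r\n"


def encode_chunk(data: bytes) -> bytes:
    """Wrap one frontside packet as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF.

    Assumes packets are bounded by the proxy buffer size, so 4 hex digits
    normally suffice; the format widens automatically for larger inputs.
    """
    if not data:
        # An empty chunk would be read by the backend as end of transmission.
        raise ValueError("refusing to encode an empty packet")
    return b"%04x\r\n" % len(data) + data + b"\r\n"
```

For a 291 byte packet the length prefix is "0123", since 291 is 0x123.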
"0\r\n\r\n" will be sent to the backend to signal end of transmission (0 size packet).
Because of the request, the response from the backend is known to arrive as HTTP/1.1 with chunked encoding. The solution simply locates the HTTP response header and discards it before sending further packets to the frontside. The end of the HTTP response header is defined by the sequence "\r\n\r\n", which is easy to detect.
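The header removal can be illustrated as follows. This is a sketch under the assumption that the response header may be split across several received packets; the class name HeaderStripper and its feed method are hypothetical.

```python
class HeaderStripper:
    """Discard everything up to and including the first CRLF CRLF sequence."""

    def __init__(self) -> None:
        self._pending = b""   # bytes seen so far while still inside the header
        self._done = False    # True once the header has been discarded

    def feed(self, packet: bytes) -> bytes:
        """Return the part of `packet` that lies after the response header."""
        if self._done:
            return packet
        self._pending += packet
        end = self._pending.find(b"\r\n\r\n")
        if end < 0:
            return b""        # header not complete yet, nothing to forward
        self._done = True
        body = self._pending[end + 4:]
        self._pending = b""
        return body
```

Buffering the unfinished header covers the case where the terminating sequence straddles two packets.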
For each chunk-encoded data packet received from the backend, the preceding length with its "\r\n" and the "\r\n" at the end of the packet are removed before the remaining data is passed to the frontside. This is just the reverse of the operation under 2).
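A sketch of this reverse operation is given below. It assumes the backend sends no chunk trailers and that the response header has already been removed; the class name ChunkDecoder is hypothetical.

```python
class ChunkDecoder:
    """Strip the length line and trailing CRLF from each chunk of a stream."""

    def __init__(self) -> None:
        self._buf = b""
        self.finished = False  # set once the zero-size chunk has been seen

    def feed(self, packet: bytes) -> bytes:
        """Return the payload bytes of all chunks completed by `packet`."""
        self._buf += packet
        out = []
        while not self.finished:
            eol = self._buf.find(b"\r\n")
            if eol < 0:
                break                      # length line not complete yet
            # Ignore optional chunk extensions after ';'.
            size = int(self._buf[:eol].split(b";")[0], 16)
            start = eol + 2
            end = start + size
            if len(self._buf) < end + 2:
                break                      # payload or trailing CRLF missing
            if size == 0:
                self.finished = True       # "0\r\n\r\n" ends the response
            else:
                out.append(self._buf[start:end])
            self._buf = self._buf[end + 2:]
        return b"".join(out)
```

Only the payload is returned, so what reaches the frontside is raw binary data again.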
The last packet received from the backend is always "0
\r\n" data in the backside buffer. Another specific process is responsible for transmitting that buffer data to the frontside; the trick is that the buffer does not hold the data as received from the backside, but the data already modified by steps 4-6.
An important aspect of this system solution (a micro-controller solution with 2 Ethernet ports) is the ability to add missing HTTP header data to the (artificial) HTTP header in step 1. Additionally, simple transformations might be specified here. A simple example would be to send the binary data received from the frontside hex-encoded
to the backend for a 291 byte DATA packet.
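The hex-encoding transformation mentioned above could look like the following sketch. The function name hex_encoded_chunk is hypothetical, and applying the transformation before the chunk wrapping is an assumption made for this example.

```python
import binascii


def hex_encoded_chunk(data: bytes) -> bytes:
    """Hex-encode a frontside packet, then wrap the result as one chunk.

    Hex encoding doubles the payload size, so the chunk length refers to
    the encoded bytes, not to the original binary data.
    """
    payload = binascii.hexlify(data)   # e.g. b"\x01\xff" -> b"01ff"
    return b"%04x\r\n" % len(payload) + payload + b"\r\n"
```

Because the payload doubles, a completely full 32KB buffer would need five hex digits for its length; the format string pads to four digits but widens when required.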
As described above, if client signals shutdown of the socket connection,
\r\n", therefore packet "0123\r\nDATA\r\n" gets sent
\r\n" which is just
While the previous description states that the received packets are directly sent, th...