FTP data compression (RFC0468)

IP.com Disclosure Number: IPCOM000003614D
Original Publication Date: 1973-Mar-08
Included in the Prior Art Database: 2000-Sep-13
Document File: 5 page(s) / 14K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

R.T. Braden: AUTHOR

Abstract

This text was extracted from an ASCII text document.
This is the abbreviated version, containing approximately 21% of the total text.

Network Working Group                                          R. Braden
Request for Comment: 468                                        UCLA/CCN
NIC: 14742                                                 March 8, 1973

FTP DATA COMPRESSION

I. INTRODUCTION

APOLOGIA

Major design objectives of the proposed File Transfer Protocol (FTP)
are reliability and efficiency for transmission of large files.
Efficiency has two faces: efficiency of the host CPUs, and efficient
use of the Network bandwidth. Block mode is intended to minimize CPU
overhead; for bandwidth efficiency, there is a mode called "HASP" in
RFC 454. The "HASP" mode of FTP is really transmission with data
compression, i.e., an encoding scheme to reduce the information
redundancy in the messages.

RFC 454 contains no explicit definition of the "HASP" or compressed
mode, but instead notes that a future RFC by yours truly will define
the mode. Students of FTP may find this scarcely credible, but you
are now reading the promised RFC. It turned out to be much farther
in the future than any of us expected. Mea Culpa.

GENERAL CONSIDERATIONS

In the early years of the Network, its major uses have been remote
terminal interactions and the small-to-medium-sized file transmission
typical of remote job entry. As facilities such as the Illiac IV and
the Data Machine become operational on the Network, and the Network
community begins to include users with heavy data transmission
requirements, large file transmission will become a major mode of
Network use. For example, one user of CCN expects to send 2 x 10**8
bits of data _each_ _day_ over the Network.

Local byte compression of the type proposed here is particularly
effective for reducing the size of "printer" files such as those
transmitted under the Network RJE protocol. Experience with CCN's
RJS service has shown a typical compression of print files by a
factor of between two and three. Since FTP was intended to contain
the data transfer part of Network RJE protocol as a subset, it is
appropriate to include a print file compression mechanism in FTP.

These considerations led the FTP committee to include a compressed
mode within FTP.
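
To see why print files are such good candidates, consider a minimal
sketch of single-pass run detection. It is only an illustration: the
names find_runs and expand_runs and the (count, byte) pair
representation are assumptions made here, not the encoding this RFC
goes on to define, and the 80-column line is a made-up example.

```python
from itertools import groupby


def find_runs(record: bytes) -> list[tuple[int, int]]:
    """Return (count, byte_value) pairs, one per run of identical bytes.

    The record is scanned once, so the work grows linearly with its length.
    """
    return [(sum(1 for _ in group), value) for value, group in groupby(record)]


def expand_runs(runs: list[tuple[int, int]]) -> bytes:
    """Rebuild the original record from (count, byte_value) pairs."""
    return b"".join(bytes([value]) * count for count, value in runs)


# Hypothetical 80-column print line: a short heading padded with blanks.
line = b"TOTAL".ljust(80)
runs = find_runs(line)
assert expand_runs(runs) == line  # the round trip loses nothing
print(len(line), "bytes in", len(runs), "runs")
```

Here the 75 trailing blanks collapse into a single run, which is the
kind of redundancy that blank-padded print lines tend to carry.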

The two main arguments for data compression are economics and
convenience (usability). Consider first economics, which is
essentially a trade-off between CPU time and transmission costs. Of
course, as long as Network use is a free commodity, the economics of
data compression are all bad. That happy state won't last forever.

What does data compression cost?

Let us consider only simple linear compression schemes, such as the
one proposed here. By linear, I mean that the CPU time to examine a
source record is proportional to the number of bytes in the record. A
simple linear scheme could detect repeated single characters, for
example. One could imagine quadratic schemes, which detected
repeated substrings; but excep...