Serving the Web as a file system

IP.com Disclosure Number: IPCOM000009152D
Publication Date: 2002-Aug-09
Document File: 4 page(s) / 13K

Publishing Venue

The IP.com Prior Art Database

Related People

Wayne Gramlich: INVENTOR [+3]

Related Documents

http://www.faqs.org/rfcs/rfc2616.html: URL [+3]

Abstract

[ IPCOM000000010S originally published 2001-07-20 00:08 UTC ]

Traditionally, the World Wide Web is accessed through web browsers. However, it may be useful to provide a file metaphor for accessing web pages: in other words, to make web pages accessible through a file system, by treating URLs as file names. This will allow programs that work with files to have direct access to the data on the Web.

As explained in the companion disclosure "File System Filters", it's relatively easy to simulate a file system using the WebNFS and NFSv3 protocols. This disclosure explains the details of simulating a file system so as to provide access to Web data. The NFS protocols rely on a number of procedure calls made to a server by a client. If all of these calls are handled properly by the server, then the client will "see" a file system. This file system may be used locally and/or served to other machines, including non-Unix machines. The presence of the file system will allow users and client-side programs to use URLs as file names.

This disclosure is written for NFS and Unix (e.g. Solaris and Linux); however, it is applicable to other file systems and operating systems (e.g. SMB and Windows) as well. This is merely an overview of the complete disclosure, which is provided in the accompanying text file.

** NFS Calls **

NFS calls intended to modify the file system usually fail with NFS3ERR_ROFS. GETATTR, FSSTAT, FSINFO, ACCESS, and PATHCONF calls will return fake data. The READLINK call should simply fail. Path lookups should be remembered but not accessed until the client tries to read the file; then the path is reassembled. READ returns data from the desired page. READDIR and READDIRPLUS return (representations of) URLs as explained later. File attributes will mostly be set to some default value. File size probably needs to be correct and may be obtained in several ways.
** Files, Directories, and URLs **

It is desirable to treat a URL as both a file and a directory, so that a program or user can both view the content and follow the links. The full disclosure explains in detail how to do this, both for clients that are willing to return raw data from a directory and for clients that are not. The latter case requires returning two file names for each URL. Each file should be listed as a symbolic link to prevent improper concatenation of the file to the end of the client's current "directory" (URL).

Embedded slashes in hyperlinks usually produce correct behavior. Hyperlinks containing complete hyperlinks may cause ambiguity; special characters that are not allowed in URLs may be added to names returned by the server to resolve this. Various options may be combined to produce the best behavior for the purposes of the user (e.g. human readability, convenience of typing, minimal ambiguity for automated programs). The fileid can be generated by hashing the URL.

In order to determine the true current working directory path after a change of directory, the client goes through a complicated procedure to follow the path back to the root. The full disclosure explains several ways to support this behavior.

** Misc. **

Since NFS servers are expected to be "dumb", the client should handle missing functionality correctly. The server may be configured with command-line options, environment variables, or special protocol hacks such as magic file names. Writing to a page may invoke a reload (clear caches), post form or authentication data, create a local version of the page, or of course modify the page if the web server permits. Long URLs may be split, with a continuation code. Cookies may be supported in several ways, or refused. HTML frames may be treated (for READDIR purposes) as a page containing links to the pages in the frames.

A Web file system may be used to pass a file system through a firewall that allows HTTP but not NFS data; details are in the full disclosure. The Network Lock Manager (NLM) protocol can't be directly supported over HTTP, but may be simulated. Since the file system is simulated, mounting is quite simple and does not need explanation here. Like other file systems, the Web file server can work with an automounter. As is explained in the File System Filters disclosure, an overlay file system may be used in conjunction with this one to let users write new files into their local version of a Web site. Secure HTTP may be handled transparently.

Extensions of this disclosure will be obvious to anyone skilled in NFS, file systems, or HTTP. It will be obvious how to use the Samba program to serve the file system provided by this (or other fsfilters) to Microsoft Windows systems. [ 000000010S 10S ]

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 23% of the total text.

Title: Serving the Web as a file system

Traditionally, the World Wide Web is accessed through web browsers. However, it may be useful to provide a file metaphor for accessing web pages: in other words, to make web pages accessible through a file system, by treating URLs as file names. This will allow programs that work with files to have direct access to the data on the Web.

As explained in the companion disclosure "File System Filters", it's relatively easy to simulate a file system using the WebNFS and NFSv3 protocols. This disclosure explains the details of simulating a file system so as to provide access to Web data. The NFS protocols rely on a number of procedure calls made to a server by a client. If all of these calls are handled properly by the server, then the client will "see" a file system. This file system may be used locally and/or served to other machines, including non-Unix machines. The presence of the file system will allow users and client-side programs to use URLs as file names.

This disclosure is written for NFS and Unix (e.g. Solaris and Linux); however, it is applicable to other file systems and operating systems (e.g. SMB and Windows) as well.

** NFS Calls **

Many of the NFS calls are intended to modify the file system. Since the Web is generally read-only, these calls can simply fail with the error appropriate to a read-only file system, NFS3ERR_ROFS. These calls include SETATTR, WRITE, CREATE, MKDIR, SYMLINK, MKNOD, REMOVE, RMDIR, RENAME, LINK, and COMMIT.
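The read-only behavior described above can be sketched as a small dispatch step. This is only an illustration: the function name `dispatch`, the `handlers` mapping, and the use of procedure names as strings are assumptions made for the example, not part of the disclosure. The status value 30 for NFS3ERR_ROFS is taken from the NFSv3 specification (RFC 1813).

```python
# Sketch only: the names below are illustrative, not from the disclosure.
NFS3_OK = 0
NFS3ERR_ROFS = 30  # "read-only file system" status value in RFC 1813

# Procedures the disclosure lists as modifying the file system.
MODIFYING_PROCS = frozenset({
    "SETATTR", "WRITE", "CREATE", "MKDIR", "SYMLINK", "MKNOD",
    "REMOVE", "RMDIR", "RENAME", "LINK", "COMMIT",
})


def dispatch(proc_name, handlers, *args):
    """Reject modifying calls up front; pass all others to their handler.

    `handlers` is a hypothetical mapping from procedure name to a callable
    that returns a (status, result) pair.
    """
    if proc_name in MODIFYING_PROCS:
        return NFS3ERR_ROFS, None
    return handlers[proc_name](*args)
```

Rejecting the whole set in one place keeps the per-procedure handlers limited to the calls that actually return data.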

GETATTR, FSSTAT, FSINFO, ACCESS, and PATHCONF calls will return fake data intended to portray the files/pages or "file system" in question as read-only and minimally featured. The maximum file name length should of course be set as large as possible.
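A minimal sketch of such fake attributes follows, covering only a few of the fields that NFSv3 reports for a file. The class `FakeAttr` and the functions `fileid_for` and `fake_getattr` are hypothetical names. The abstract says only that the fileid can be generated by hashing the URL; the use of SHA-256 truncated to 64 bits is an assumption made here. The type code 1 for a regular file comes from RFC 1813.

```python
import hashlib
from dataclasses import dataclass

NF3REG = 1  # file type code for a regular file in RFC 1813


@dataclass(frozen=True)
class FakeAttr:
    """A small, illustrative subset of the NFSv3 file attributes."""
    ftype: int
    mode: int
    nlink: int
    uid: int
    gid: int
    size: int
    fileid: int


def fileid_for(url: str) -> int:
    """Derive a stable 64-bit fileid from a URL (hash choice is assumed)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fake_getattr(url: str, size: int = 0) -> FakeAttr:
    """Report every page as a read-only regular file owned by uid/gid 0."""
    return FakeAttr(
        ftype=NF3REG,
        mode=0o444,  # read permission only, for everyone
        nlink=1,
        uid=0,
        gid=0,
        size=size,
        fileid=fileid_for(url),
    )
```

Because the fileid depends only on the URL, repeated GETATTR calls for the same page return the same value without the server keeping a table of assigned ids.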

The READLINK call should simply fail.

NFS does not exactly use open/use/close semantics: since it is "stateless", a client first looks up the file name (obtaining a file handle) with LOOKUP, and then accesses it at any future time. LOOKUP is also used for traversing directories; the call will be made with a handle referring to a directory and the name of a subdirectory, and return a handle corresponding to the subdirectory. In Unix, the path separator "/" is the same as the separator in URLs, so clients will generally interpret a URL as a file under several subdirectories. This will usually result in several LOOKUP requests with substrings of the URL. The server should indicate success on any such request, without actually going out to the Web (since unlike a file system, a prefix of a URL may not be valid), and cache the part of the URL that was provided in that request. When an attempt to use a LOOKUP'd name is made (with READ, READDIR, or READDIRPLUS), the entire name should be reassembled by the server, and the corresponding page fetched from the Web. WebNFS allows a LOOKUP to traverse several directories at once; this is even easier to deal with.
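The deferred lookup described above might be sketched as follows. Everything named here is hypothetical: the class `LazyLookupTable`, the integer handles, and the injected `fetch` callable stand in for whatever the real server uses. The sketch also assumes that the first path component is a host name and that the server prepends "http://" when reassembling; this part of the disclosure does not specify how the scheme is represented.

```python
import itertools


class LazyLookupTable:
    """Remembers LOOKUP'd components; goes to the Web only on a read."""

    ROOT = 0  # hypothetical handle for the mount point

    def __init__(self, fetch):
        self._fetch = fetch              # callable: URL string -> bytes
        self._entries = {}               # handle -> (parent handle, name)
        self._by_key = {}                # (parent handle, name) -> handle
        self._next = itertools.count(1)  # source of fresh handles

    def lookup(self, dir_handle, name):
        """Always succeed and cache the component; no Web access here."""
        key = (dir_handle, name)
        if key not in self._by_key:
            handle = next(self._next)
            self._by_key[key] = handle
            self._entries[handle] = key
        return self._by_key[key]

    def url_for(self, handle):
        """Walk back to the root and rejoin the cached components."""
        parts = []
        while handle != self.ROOT:
            handle, name = self._entries[handle]
            parts.append(name)
        # Assumption: first component is the host; scheme is prepended.
        return "http://" + "/".join(reversed(parts))

    def read(self, handle, offset, count):
        """Reassemble the full name, fetch the page, return the slice."""
        data = self._fetch(self.url_for(handle))
        return data[offset:offset + count]
```

Passing `fetch` in as a parameter keeps the table independent of any particular HTTP client, so the lookup logic can be exercised without network access.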

A READ call will simply return data from a web page. The server...