Browse Prior Art Database

Creating A Unique Time-Based ID In A Multi-Process Database Application

IP.com Disclosure Number: IPCOM000232186D
Publication Date: 2013-Oct-24
Document File: 3 page(s) / 23K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a novel algorithm that generates a timestamp guaranteed to be unique among the processes without having to insert into a table. Depending on usage of the timestamp-based ID, the parameters of the algorithm can be changed to guarantee uniqueness for one million IDs per second, 60 million per hour, or 3,600 million per day while the formats of the timestamps are preserved.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 45% of the total text.

Page 01 of 3

Creating A Unique Time-Based ID In A Multi-Process Database Application

The timestamp-based ID can be used to partition the rows into different buckets of rows per hour, day, month, and year. The most useful application of this is to use the database partition feature to actually store data based on this timestamp-based ID primary key to different tables per month or year. These monthly/yearly tables can then be archived or backed up, and taken off line or on line separately from the rest of data. Hence, keeping the ID in the timestamp-based format is a crucial application.

In a multi-process database application, creating a unique ID of any kind requires some serialization and communication between the processes that are creating these unique IDs. Once a process generates an ID, it needs to communicate to other processes and reserve the ID it generated. If the ID is used, then the process has to regenerate and try again.

Note that creating ID is only one of the many steps in a transaction that creates an item.

An item is a set of rows in a database or multiple databases, in a single machine or multiple machines, which constitutes a single entity in an application data model. For example, in a typical content management system (CMS), an item may consist of rows in a database to store a metadata and another database in a different machine to store the documents this item has. In a CMS, in the same transaction, the ID might need to be sent to other machines for processing before the transaction can be committed.

A common practice in database applications is for all processes to generate a timestamp-based ID and insert it into a table as a unique key. If the insert returns the duplicate key error, then the process must try to generate a new unique ID. Note that this checks against both current and older IDs.

This method has performance problems. For a fast system that has multiple users continually creating new items (e.g., email server) or that has automatic jobs that constantly creates new items, the chance of having more than one process creating the same timestamp in the same millisecond is inevitable.

When two processes are generating the same ID at the same millisecond, the INSERTs are serialized by the database manager and all processes have to wait for the one who inserts first to COMMIT. Depending on applications, the transaction time may be long, hence affecting performance of the whole system.

In a complex system, such as the Content Manager, a transaction that consists of the generation of the ID is usually very complex and hence takes a long time. For example, after generated and INSERTed to ensure uniqueness, the ID needs to be sent to another machine to be stored in a different repository. These complex transactions involving more than one machine must be done in a single transaction so that it can be rolled back if something fails. This makes the transaction very log thus magnifying the INSERT locking problem.

1


Page...