Method for logging writes on Erasure coded storage system to optimize the network usage across nodes for data updates.

IP.com Disclosure Number: IPCOM000249629D
Publication Date: 2017-Mar-08
Document File: 4 page(s) / 61K

Publishing Venue

The IP.com Prior Art Database

Related People

Deodatta Barhate: INVENTOR [+2]

Abstract

Generally, a mirrored log is used for an Erasure Coded (EC) system. This disclosure describes a method for logging writes that can reduce network traffic significantly while keeping the storage overhead of the log very small.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 46% of the total text.

Deodatta Barhate

Ajay Kumar

© 2017 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

Problem Statement

Erasure coding is a technology that provides configurable redundancy and fault tolerance with reduced extra storage. With Reed-Solomon erasure coding, k data chunks are encoded into m parity chunks, and the chunks are stored across distributed storage subsystems. If any subsystem, or the whole system, crashes abruptly while the storage system is in use, the data and parities of the involved stripe can be left inconsistent. In that state it may not be possible to recover the data completely (even if the number of failed subsystems is less than the fault tolerance), which makes the whole storage system inaccessible. To allow recovery in such a scenario, logging is implemented over erasure coded systems.
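The inconsistency described above can be illustrated with a small sketch. It is not the disclosure's method: it assumes a single XOR parity (k = 3, m = 1) as a simplified stand-in for Reed-Solomon, and the helper names `xor_parity` and `reconstruct` are hypothetical. It shows that once a data chunk is updated but the matching parity update is lost in a crash, a later single failure, which is within the fault tolerance, can no longer be repaired correctly.

```python
from functools import reduce


def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks (single-parity code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def reconstruct(chunks, parity, lost_index):
    """Rebuild the chunk at lost_index from the surviving chunks and the parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return xor_parity(survivors + [parity])


# A stripe with k = 3 data chunks and m = 1 parity chunk (assumed sample data).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Consistent stripe: losing chunk 2 is recoverable.
assert reconstruct(data, parity, lost_index=2) == b"CCCC"

# Simulated crash: chunk 0 is overwritten, but the parity update never lands.
data[0] = b"ZZZZ"
stale_parity = parity

# The stripe is now inconsistent, so rebuilding chunk 2 yields wrong bytes.
stripe_is_consistent = xor_parity(data) == stale_parity
recovered = reconstruct(data, stale_parity, lost_index=2)
```

A log that records the new data and new parities before they are applied lets the system replay or discard the half-finished update and bring the stripe back to a consistent state.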

The simplest and most often implemented solution is to employ a mirrored (replicated) log. Such a log maintains multiple full copies of itself, mostly as many as the required redundancy. Replicated logs for EC are similar to redo logs in database management systems: each update is stored in the redo log and then acknowledged, and, based on some heuristics, the log is periodically flushed into the primary database.

The logging requirements for erasure coded systems are similar. Because the log is now an inherent part of the EC storage system, it should be as redundant as the EC subsystem. And because the primary requirement of an EC system is storage efficiency, the log should extend the system in a way that keeps it storage efficient. In the log, the new data and the new parities need to be stored before they are flushed to the EC system.
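The redo-log style flow described above (store the update in every log copy, acknowledge, flush later) can be sketched as follows. This is a minimal in-memory model under stated assumptions, not the implementation from the disclosure: the class `MirroredLog`, its methods, and the dictionary used as the primary store are all hypothetical, and each log copy would live on a different node in a real system.

```python
class MirroredLog:
    """Hypothetical in-memory model of a replicated (mirrored) redo log."""

    def __init__(self, copies):
        # One list per full copy of the log.
        self.replicas = [[] for _ in range(copies)]

    def append(self, stripe_id, new_data, new_parities):
        """Store the new data and new parities in every copy, then acknowledge."""
        record = (stripe_id, new_data, new_parities)
        for replica in self.replicas:
            replica.append(record)  # every copy receives the full record
        return True  # the write can be acknowledged at this point

    def flush(self, store):
        """Apply the logged records to the primary store, then empty the log."""
        for stripe_id, new_data, new_parities in self.replicas[0]:
            store[stripe_id] = (new_data, new_parities)
        for replica in self.replicas:
            replica.clear()


store = {}                   # stands in for the EC storage system
log = MirroredLog(copies=3)  # e.g. fault tolerance N = 2, so N + 1 = 3 copies
acked = log.append(7, [b"new0", b"new1"], [b"par0"])
pending = len(log.replicas[0])
log.flush(store)
```

Note that `append` sends the complete record to every copy, which is the source of the storage and network costs of mirrored logs.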

The disadvantages of mirrored logs are:

1. They can reduce the storage efficiency that one is supposed to get from an EC storage system, depending on how proportionally large the log is.

2. EC storage systems configured over distributed systems make heavy use of the network for IO. The performance of such a system is often decided by how efficiently the network bandwidth is used. With replicated logs, each write leads to (N + 1) times the data (and parities) being pushed on the network, where N is the fault tolerance (or secondary mirrors)...
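The two costs in the list above can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: the function names and the sample values (k = 4, m = 2, N = 2, 4 KiB chunks, a 100-byte log) are assumptions chosen for illustration, and the traffic formula reads the (N + 1) factor as every log copy receiving the full new data and new parities.

```python
def ec_storage_efficiency(k, m):
    """Fraction of raw capacity that holds user data in a k + m layout."""
    return k / (k + m)


def efficiency_with_mirrored_log(user_bytes, k, m, log_bytes, copies):
    """Storage efficiency once `copies` full copies of the log are counted."""
    raw_stripe_bytes = user_bytes * (k + m) / k
    return user_bytes / (raw_stripe_bytes + copies * log_bytes)


def mirrored_log_network_bytes(data_bytes, parity_bytes, fault_tolerance):
    """Bytes pushed to the log per write: (N + 1) copies of data and parities."""
    return (fault_tolerance + 1) * (data_bytes + parity_bytes)


# Assumed example: one 4 KiB data chunk updated, two 4 KiB parities recomputed.
traffic = mirrored_log_network_bytes(
    data_bytes=4096, parity_bytes=2 * 4096, fault_tolerance=2
)

base = ec_storage_efficiency(k=4, m=2)
with_log = efficiency_with_mirrored_log(
    user_bytes=1000, k=4, m=2, log_bytes=100, copies=3
)
```

In this model the log traffic grows linearly with N, and the efficiency drops below k / (k + m) by an amount that depends on how large the log is relative to the data.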