Hsin-Po's Website

Logo

UIUC Math PhD .. UCSD ECE postdoc

Distributed Storage Papers

The following are my works on distributed storage systems. Both concern regenerating code that have applications in distributed storage systems.

Abbreviation Authors Title
MoulinAlg20 D W Multilinear Algebra for Distributed Storage
Atrahasis20 D L W Multilinear Algebra for Minimum Storage Regenerating Codes

D = Iwan Duursma (Advisor of the time);
L = Xiao Li (academic sister).

A regenerating code consists of

The configuration of the nodes satisfies the following conditions:

The code is named regenerating mainly due to the last bullet point—the nodes regenerate themselves.

The theory of regenerating codes concerns the relation among $n, k, d, \alpha, \beta, M$. For example, since any $k$ nodes contain $k\alpha$ symbols and can recover the file, the file size $M$ is at most $k\alpha$. Similarly, since $d\beta$ symbols repair a failing node, the node size $\alpha$ is at most $d\beta$. (Exercise) One can also show that $k - 1$ nodes ($\alpha$) plus $d - k + 1$ help messages ($\beta$) is at least $M$. There is a family of bounds of this type. They are called cut-set bounds and restrict where those parameters can live.

The opposite approach is to construct regenerating codes that aim to achieve low $\alpha$, low $\beta$, and high $M$. MoulinAlg20 utilizes multilinear algebra to do this. We construct a series of regenerating codes which we call Moulin codes. They achieve the best known $\alpha/M$-versus-$\beta/M$ trade-off to date. And it is conjectured that this trade-off is optimal.

See Figure 1 on page 3 in MoulinAlg20 for the $\alpha/M$-versus-$\beta/M$ trade-off for the $(n, 3, 3)$ case. The trade-off of (n, 3, 4) regenerating codes Here is another $\alpha/M$-versus-$\beta/M$ trade-off for the $(n, 3, 4)$ case. (In a newer version of MoulinAlg20 that I am still working on.) The trade-off of (n, 3, 4) regenerating codes For more general parameters, check out this D3.js plot.

See also Table 2 on page 29 for the relations among some competitive constructions. Comparison among several ERRC codes that aim for interior points

Atrahasis20 exploits multilinear algebra to construct MSR codes, which we called Atrahasis codes. Formally, an MSR code is a regenerating code with $M = k\alpha$ and $\beta = \alpha/(d - k + 1)$. From the constraint on $M$ one sees that there is no wastes of storage (hence the name minimum storage regeneration = MSR). Some researchers see MSR codes as the intersection of regenerating codes and MDS codes.

MSR alone attracts significant attentions because people want to minimize node size ($\alpha \geq M/k$), and only then they minimize help messages ($\beta \geq \alpha/(d - k + 1)$ given that $\alpha \geq M/k$). See Table 1 on page 5 in Atrahasis20 for a comparison of some existing contraptions. The alpha--F_q trade-off of some well-known MSR codes