Browse Prior Art Database

Method for Factorizing Large-scale Collective Matrix

IP.com Disclosure Number: IPCOM000240705D
Publication Date: 2015-Feb-19
Document File: 11 page(s) / 4M

Publishing Venue

The IP.com Prior Art Database

Related People

Makoto Yamada: INVENTOR [+2]

Abstract

A method is disclosed for factorizing large-scale collective matrix using Hazan's algorithm. The method provides a convex estimator for jointly estimating the collective matrices to provide theoretical guarantees for a large subset of collective matrix structures. A highly rigorous algebra for the collective-matrix structure is developed and a convex estimate for collective matrix completion is used.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 17% of the total text.

Method for Factorizing Large-scale Collective Matrix

Abstract

A method is disclosed for factorizing large-scale collective matrix using Hazan's algorithm.  The method provides a convex estimator for jointly estimating the collective matrices to provide theoretical guarantees for a large subset of collective matrix structures.  A highly rigorous algebra for the collective-matrix structure is developed and a convex estimate for collective matrix completion is used.

Description

Matrix factorization has a wide range of applications such as recommendation systems, recovering gene-protein interactions and modeling text document collections.  Currently, a Convex Collective Matrix Factorization (CCMF) is used which includes a number of compelling properties and the CCMF can deal with data sparsity and cold start problems.  In practice, the data sparsity and cold start problems are critical in a recommendation system such as news recommendation.  However, the CCMF may not be able to handle a very large scale data set.

Disclosed is a method for factorizing large-scale collective matrix using Hazan's algorithm.  The method provides a new CCMF framework that scales to very large-scale matrices.  More specifically, the method addresses the collective matrix completion problem of jointly recovering a collection of matrices with shared structure from partial (and potentially noisy) observations.  The collections of matrices form an entity-relationship component, where each matrix is a relation between a pair of entities.  A joint low rank structure is imposed, wherein each component matrix is low rank.  The latent space of low rank factors is shared across the entire collection.  First, a highly rigorous algebra is developed for the collective-matrix structure and a convex estimate for solving the collective matrix completion problem is used.  Thereafter, a scalable approximate algorithm is utilized to solve the proposed convex program.  The convex program is solved using Hazan's algorithm.  Finally, the results through experiments on simulated and real life datasets are corroborated.

In an implementation of the method, a fundamental notation is used for representing the structure of collective matrices.  a basic algebra for the collective-matrix structure is developed.  The collective-matrix structure is built upon the framework of the CCMF.  Thus, to enhance readability, notation of the CCMF is adopted wherever applicable.

For basic notation, matrices are denoted by uppercase letters, , , etc.   refers to the transpose of .  The matrix inner product is given by .  The set of symmetric matrices of dimension  is denoted as .  The set of integers  is denoted as .

For a matrix  of rank , with singular values , commonly used matrix norms include the nuclear norm , the spectral norm , and the Frobenius norm .

Definition 1

Dual Norm: Given any norm  on a metric space , the dual norm,  is given by .

Definition 2

Operator Norm: Let  and...