Computing Differential Aberration between different sample sets
Publication Date: 2010-Sep-30
The IP.com Prior Art Database
The following method may be employed in the analysis of data obtained from array-based comparative genome hybridization experiments in order to obtain more information about the difference between two groups of samples based on the aberration calls of the samples. The method may be used independently or readily adapted as an add-on in a variety of genome analysis software programs, including oneClickCGH®, CGH Fusion® and cnTrack® by InfoQuant (London, UK), ImaGene CGH (El Segundo, CA), NimbleScan and SignalMap (NimbleGen, Iceland), Isis-CGH Software (Boston, MA), CGH Explorer (Borresen-Dale laboratory) and Agilent Genomic Workbench (Agilent, CA), among many others.
In describing the method set forth below, it is understood that the exact way in which the method may be performed may deviate from that set forth below. Specifically, many of the steps of the method may be performed using equivalent steps that provide a similar result. For example, many other statistical tests may be used instead of the test provided in the description below. Likewise, there are many ways in which an interval can be defined in addition to the way described below.
In general terms, the method provides for automated detection of aberrations (gains or losses) that are present more frequently in one group of samples compared to another group of samples within a multi-sample comparative genomic hybridization ("CGH") or an array-based CGH ("aCGH") data set. Aberrations include sequence amplifications (which may also be referred to as “gains”) as well as deletions (which may be referred to as “losses”). Any of various aberration-calling techniques may be used to identify aberrant intervals within each of the samples of the multi-sample data set. A set of candidate intervals is constructed to include the aberrant intervals identified by the aberration-calling technique, as well as intersections of the identified aberrant intervals. Four scores are calculated for each candidate aberrant interval for the two groups of samples: enrichment of losses in group 1 compared to group 2, enrichment of gains in group 1 compared to group 2, enrichment of losses in group 2 compared to group 1 and enrichment of gains in group 2 compared to group 1. All results with statistical significance above a user-specified threshold are reported for further analysis.
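The candidate-interval construction and the four scores described above can be sketched as follows. This is a minimal illustration and not the disclosed implementation: the function names, the data layout (a dictionary mapping each sample to its called intervals on one chromosome), the rule that a sample counts as aberrant for a candidate interval only when one of its calls covers that interval entirely, and the use of a one-sided hypergeometric tail probability as the enrichment score are all assumptions made for the example. As noted above, other statistical tests and interval definitions may be substituted.

```python
from math import comb

def candidate_intervals(called):
    """Return the called (start, end) intervals plus all non-empty pairwise
    intersections. `called` pools aberrant intervals from all samples."""
    candidates = set(called)
    for i, (s1, e1) in enumerate(called):
        for s2, e2 in called[i + 1:]:
            s, e = max(s1, s2), min(e1, e2)
            if s <= e:  # the two intervals overlap
                candidates.add((s, e))
    return sorted(candidates)

def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k aberrant samples in a group of
    size n when K of the N samples in total are aberrant (assumed test)."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def four_scores(interval, calls, group1, group2):
    """Return p-values keyed by (kind, group) for one candidate interval.
    `calls` maps sample name -> list of (start, end, kind) with kind being
    "gain" or "loss" (hypothetical layout)."""
    group1, group2 = set(group1), set(group2)
    total = len(group1) + len(group2)
    scores = {}
    for kind in ("gain", "loss"):
        # Samples with a call of this kind that fully covers the interval.
        aberrant = {
            sample for sample, sample_calls in calls.items()
            if any(k == kind and a <= interval[0] and interval[1] <= b
                   for a, b, k in sample_calls)
        }
        k1, k2 = len(aberrant & group1), len(aberrant & group2)
        scores[(kind, 1)] = hypergeom_tail(k1, len(group1), k1 + k2, total)
        scores[(kind, 2)] = hypergeom_tail(k2, len(group2), k1 + k2, total)
    return scores
```

With this layout, a smaller value of `scores[("gain", 1)]` indicates stronger enrichment of gains in group 1, and likewise for the other three keys; each candidate interval would be scored in turn and compared against the user-specified threshold.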
Specifically, the method identifies genomic regions with significant enrichment of gains or losses in one of the groups of samples in comparison to the total sample set. The method then allows the user to narrow down the list of differentially aberrant regions by applying user-defined filtering criteria, such as p-value or number of probes. The user can save this subset of regions for further investigation.
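The filtering step might look like the short sketch below. The `Region` record, its field names, and the default cutoffs are hypothetical and chosen only to illustrate filtering on p-value and probe count; the actual criteria and their values are left to the user.

```python
from dataclasses import dataclass

@dataclass
class Region:
    chrom: str       # chromosome name
    start: int       # start coordinate of the region
    end: int         # end coordinate of the region
    p_value: float   # enrichment p-value for the region
    n_probes: int    # number of array probes inside the region

def filter_regions(regions, max_p=0.05, min_probes=3):
    """Keep regions that pass both user-defined cutoffs (assumed defaults)."""
    return [r for r in regions
            if r.p_value <= max_p and r.n_probes >= min_probes]
```

The returned list is the subset of regions that a user would save for further investigation.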
Array CGH users often look for differences in aberration pattern in the dataset based on some prior knowledge. The two groups to be compared are often formed based o...