Article

Kelmansky, D.M.; Martínez, E.J.; Leiva, V. "A new variance stabilizing transformation for gene expression data analysis" (2013) Statistical Applications in Genetics and Molecular Biology. 12(6):653-666
The final version of this article is for internal use within the institution.

Abstract:

In this paper, we introduce a new family of power transformations, which has the generalized logarithm as one of its members, in the same manner as the usual logarithm belongs to the family of Box-Cox power transformations. Although the new family has been developed for analyzing gene expression data, it allows a wider scope of mean-variance related data to be reached. We study the analytical properties of the new family of transformations, as well as the mean-variance relationships that are stabilized by using its members. We propose a methodology based on this new family, which includes a simple strategy for selecting the family member adequate for a data set. We evaluate the finite sample behavior of different classical and robust estimators based on this strategy by Monte Carlo simulations. We analyze real genomic data by using the proposed transformation to empirically show how the new methodology allows the variance of these data to be stabilized.
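
For orientation, the two reference transformations named in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions: it uses the standard Box-Cox form and one common parametrization of the generalized logarithm, log(x + sqrt(x^2 + c^2)). The function names and the constant c are illustrative choices, not taken from this record. The new family proposed in the paper is not reproduced here, because the record does not give its formula.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transformation for x > 0.

    (x**lam - 1) / lam for lam != 0, and log(x) for lam == 0,
    which is the limit of the power form as lam -> 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def glog(x, c=1.0):
    """Generalized logarithm, assumed parametrization log(x + sqrt(x^2 + c^2)).

    Unlike log, it is defined for zero and negative x (when c != 0),
    and it behaves like log(2x) for x much larger than c.
    """
    return math.log(x + math.sqrt(x * x + c * c))


if __name__ == "__main__":
    for x in (0.5, 5.0, 500.0):
        print(x, box_cox(x, 0.0), box_cox(x, 0.5), glog(x, c=1.0))
```

With c = 1 this glog coincides with the inverse hyperbolic sine, which is why it stays finite at zero while approaching the ordinary logarithm (up to the additive constant log 2) for large intensities.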

Record:

Document: Article
Title: A new variance stabilizing transformation for gene expression data analysis
Author: Kelmansky, D.M.; Martínez, E.J.; Leiva, V.
Affiliation: Departamento de Estadística, Universidad de Valparaíso, Avda. Gran Bretaña 1111, Playa Ancha, Valparaíso, Chile
Instituto de Cálculo, FCEN, Universidad de Buenos Aires, Argentina
Keywords: Classical and robust estimators; Linear models; Microarrays; Monte Carlo method; Power transformations; R software; Regression methods; article; contamination; data analysis; family; gene expression; genetic transformation; genomics; human; human genome; methodology; microarray analysis; Monte Carlo method; statistical model; variance; Algorithms; Computer Simulation; Data Interpretation, Statistical; Gene Expression Profiling; Humans; Linear Models; Models, Genetic; Monte Carlo Method; Oligonucleotide Array Sequence Analysis; Software
Year: 2013
Volume: 12
Issue: 6
First page: 653
Last page: 666
DOI: http://dx.doi.org/10.1515/sagmb-2012-0030
Journal title: Statistical Applications in Genetics and Molecular Biology
Abbreviated journal title: Stat. Appl. Genet. Mol. Biol.
ISSN: 15446115
Record: https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15446115_v12_n6_p653_Kelmansky

References:

  • Barros, M., Paula, G.A., Leiva, V., An R implementation for generalized Birnbaum-Saunders distributions (2009) Comp. Stat. Data Anal, 53, pp. 1511-1528
  • Bengtsson, H., Hössjer, O., Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method (2006) BMC Bioinformatics, 7, p. 100
  • Box, G.E.P., Cox, D.R., An analysis of transformations (1964) J. Roy. Stat. Soc. B, 26, pp. 211-251
  • Cui, X., Kerr, M.K., Churchill, G.A., Transformations for cDNA microarray data (2003) Stat. Appl. Genet. Mol. Biol, 2 (1), Article 4
  • Durbin, B.P., Hardin, J.S., Hawkins, D.M., Rocke, D.M., A variance-stabilizing transformation for gene-expression microarray data (2002) Bioinformatics, 18, pp. S105-S110
  • Emmerson, J.D., Stoto, M.A., Transforming data (1987) Understanding Robust and Exploratory Data Analysis, pp. 65-104, Hoaglin, D.C., Mosteller, F., Tukey, J.W. (Eds.), Wiley, New York
  • Galton, F., The geometric mean, in vital and social statistics (1879) Proc. Royal Soc, 29, pp. 365-367
  • Gibrat, R., (1930) Les Inégalités Économiques, Sirey, Paris
  • Hawkins, D.M., Diagnostics for conformity of paired quantitative measurements (2002) Stat. Med, 21, pp. 1913-1935
  • Huang, S., Qu, Y., The loss in power when the test of differential expression is performed under a wrong scale (2006) J. Comp. Biol, 13, pp. 786-797
  • Huber, P.J., (1987) Robust Statistics, Wiley, New York
  • Huber, W., Heydebreck, A., Sültmann, H., Poustka, A., Vingron, M., Variance stabilization applied to microarray data calibration and to the quantification of differential expression (2002) Bioinformatics, 18 (SUPPL. 1), pp. S96-S104
  • Huber, W., Heydebreck, A., Sueltmann, H., Poustka, A., Vingron, M., Parameter estimation for the calibration and variance stabilization of microarray data (2003) Stat. Appl. Genet. Mol. Biol, 2 (1), Article 3
  • Johnson, N.L., Systems of frequency curves generated by methods of translation (1949) Biometrika, 36, pp. 149-176
  • Johnson, N.L., Kotz, S., Balakrishnan, N., (1994) Continuous Univariate Distributions, Wiley, New York
  • Kapteyn, J., Van Uven, M.J., (1916) Skew Frequency Curves in Biology and Statistics, Hoitsema Brothers, Groningen
  • Kotz, S., Leiva, V., Sanhueza, A., Two new mixture models related to the inverse Gaussian distribution (2010) Meth. Comp. App. Prob, 12, pp. 199-212
  • Leiva, V., Hernández, H., Riquelme, A., A new package for the Birnbaum-Saunders distribution (2006) R J, 6, pp. 35-40
  • Leiva, V., Hernández, H., Sanhueza, A., An R package for a general class of inverse Gaussian distributions (2008) J. Stat. Soft, 26, pp. 1-21
  • Leiva, V., Sanhueza, A., Kelmansky, D.M., Martínez, E.J., On the glog-normal distribution and its association with the gene expression problem (2009) Comp. Stat. Data Anal, 53, pp. 1613-1621
  • McAlister, D., The law of the geometric mean (1879) Proc. Royal Soc, 29, pp. 367-376
  • Purdom, E., Holmes, S.P., Error distribution for gene expression data (2005) Stat. Appl. Genet. Mol. Biol, 4 (1), Article 16
  • R Development Core Team, (2013) R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, www.R-project.org
  • Rocke, D.M., Durbin, B., A model for measurement error for gene expression arrays (2001) J. Comp. Biol, 8, pp. 557-569
  • Rocke, D.M., Lorenzato, S., A two-component model for measurement error in analytical chemistry (1995) Technometrics, 37, pp. 176-184
  • Rousseeuw, P.J., Leroy, A.M., (1987) Robust Regression and Outlier Detection, Wiley, New York
  • Smyth, G.K., Linear models and empirical Bayes methods for assessing differential expression in microarray experiments (2004) Stat. Appl. Genet. Mol. Biol, 3 (1), Article 3
  • Smyth, G.K., Yang, Y.H., Speed, T., (2003) Statistical Issues in cDNA Microarray Data Analysis, Humana Press, Totowa, NJ
  • Speed, T., (2003) Statistical Analysis of Gene Expression Data, Chapman & Hall, New York
  • Van Den Berg, R.A., Hoefsloot, H.C., Westerhuis, J.A., Smilde, A.K., Van Der Werf, M.J., Centering, scaling, and transformations: Improving the biological information content of metabolomics data (2006) BMC Genomics, 7, pp. 142-147
  • Wicksell, S.D., On the genetic theory of frequency (1917) Arkiv för Matematik, Astronomi och Fysik, 12, pp. 1-56

Citations:

---------- APA ----------
Kelmansky, D.M., Martínez, E.J. & Leiva, V. (2013). A new variance stabilizing transformation for gene expression data analysis. Statistical Applications in Genetics and Molecular Biology, 12(6), 653-666.
http://dx.doi.org/10.1515/sagmb-2012-0030
---------- CHICAGO ----------
Kelmansky, D.M., Martínez, E.J., Leiva, V. "A new variance stabilizing transformation for gene expression data analysis." Statistical Applications in Genetics and Molecular Biology 12, no. 6 (2013): 653-666.
http://dx.doi.org/10.1515/sagmb-2012-0030
---------- MLA ----------
Kelmansky, D.M., Martínez, E.J., Leiva, V. "A new variance stabilizing transformation for gene expression data analysis." Statistical Applications in Genetics and Molecular Biology, vol. 12, no. 6, 2013, pp. 653-666.
http://dx.doi.org/10.1515/sagmb-2012-0030
---------- VANCOUVER ----------
Kelmansky, D.M., Martínez, E.J., Leiva, V. A new variance stabilizing transformation for gene expression data analysis. Stat. Appl. Genet. Mol. Biol. 2013;12(6):653-666.
http://dx.doi.org/10.1515/sagmb-2012-0030