      Is Open Access

      Eficiencia relativa de 15 pruebas de discordancia con 33 variantes aplicadas al procesamiento de datos geoquímicos
      Translated title: Relative efficiency of 15 discordancy tests with 33 variants for processing geochemical data

      research-article


          Abstract

          Discordancy tests are a useful statistical tool in many fields of science and engineering, including the Earth Sciences. They provide a rigorous methodology for detecting and removing discordant outliers from a statistically contaminated sample, so that the remaining data follow a normal distribution free of contamination and can be used to estimate correctly the central tendency (mean) and dispersion (standard deviation). For the empirical evaluation of 15 discordancy tests with 33 variants, an extensive geochemical database was compiled for 35 geochemical reference materials (RM) from four countries (Canada, U.S.A., Japan, and South Africa), comprising 2220 cases with 41,821 individual geochemical data. Nine single-outlier tests with 13 variants and seven multiple-outlier tests with 20 variants were evaluated (test N4 belongs to both types), using new critical values of high precision and accuracy. Two statistical parameters quantified the efficiency of the discordancy tests: (1) the relative efficiency criterion (REC), known from previous work; and (2) the relative outlier criterion (ROC), proposed in this work. Additionally, a combined methodology of linear regression and Fisher F and Student t significance tests was used. Among the single-outlier tests, the kurtosis test (N15) showed the greatest efficiency, followed by the Grubbs-type tests (N1 and N4) and the skewness test (N14), whereas among the multiple-outlier tests, test N4 in its three variants was characterized by the greatest efficiencies. The Dixon-type tests, although much more popular than the Grubbs tests, generally showed lower efficiencies. One important implication of these results is that tests N15, N1, N4, and N14 should be preferred when applying the outlier-based methodology to geochemical data handling.

          The quantitative interpretation of linear regressions combined with significance tests confirms the results of the REC and ROC parameters. Finally, it is inferred that, independently of the analytical method used to determine the geochemical composition of reference materials, upper (high) discordant outliers are much more common than lower (low) ones, and samples with symmetrical statistical contamination on both sides are relatively scarce. Robust estimates, such as the median or the Gastwirth mean, are very likely to be biased for this type of geochemical data. Likewise, the rigorous application of discordancy tests before estimating the mean and standard deviation appears to be a basic requirement.
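The workflow the abstract describes (test for a discordant value, remove it, repeat, then estimate the mean and standard deviation from what remains) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the sample measurements, and the constant threshold of 2.0 are all assumptions made for illustration; a real application would look up a critical value that depends on sample size and confidence level, such as the tabulated values the article refers to. The Grubbs-type statistic here takes the largest absolute deviation from the mean, and the skewness and kurtosis functions use the plain moment-ratio forms, which may differ in detail from the N1, N14, and N15 variants evaluated in the article.

```python
from statistics import mean, stdev


def central_moment(values, k):
    """k-th central moment about the sample mean (divides by n)."""
    m = mean(values)
    return sum((x - m) ** k for x in values) / len(values)


def skewness(values):
    """Moment-ratio skewness; positive when the upper tail is heavier."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-ratio kurtosis; about 3 for normally distributed data."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def grubbs_statistic(values):
    """Largest absolute deviation from the mean in units of s, plus its index."""
    m, s = mean(values), stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - m))
    if s == 0:
        return 0.0, idx  # all values identical: nothing can be discordant
    return abs(values[idx] - m) / s, idx


def remove_discordant(values, critical_value):
    """Repeatedly drop the most extreme value while the statistic exceeds
    critical_value(n). critical_value is a caller-supplied function of the
    current sample size; it is an assumption of this sketch, not a real table.
    """
    data, removed = list(values), []
    while len(data) >= 3:
        stat, idx = grubbs_statistic(data)
        if stat <= critical_value(len(data)):
            break
        removed.append(data.pop(idx))
    return data, removed


# Hypothetical measurements with one high value, mimicking upper contamination.
measurements = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8, 10.2, 14.9]
cleaned, removed = remove_discordant(measurements, lambda n: 2.0)  # placeholder threshold
print("removed:", removed)
print("mean and sd before:", mean(measurements), stdev(measurements))
print("mean and sd after: ", mean(cleaned), stdev(cleaned))
```

Passing the critical value as a function keeps the sketch independent of any particular table, so more precise values can be supplied without changing the loop.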


                Author and article information

                Journal
                Journal ID: rmcg
                Title: Revista mexicana de ciencias geológicas
                Abbreviated title: Rev. mex. cienc. geol
                Publisher: Instituto de Geología, UNAM (México, DF, Mexico)
                ISSN: 1026-8774, 2007-2902
                Published: August 2009
                Volume: 26
                Issue: 2
                Pages: 501-515
                Affiliations
                [01] Posgrado en Ingeniería (Energía), Universidad Nacional Autónoma de México, Temixco, Morelos, México
                [02] Centro de Investigación en Energía, Universidad Nacional Autónoma de México, Temixco, Morelos, México; spv@ 123456cie.unam.mx
                Article
                S1026-87742009000200017 S1026-8774(09)02600200017
                586835be-cc60-47ff-8b8f-54bea16e10a6

                This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

                Page count
                Figures: 0, Tables: 0, Equations: 0, References: 67, Pages: 15