Comparison of different methods for imputing genome-wide marker genotypes in Swedish and Finnish Red Cattle

Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper; Journal article; Research; peer-reviewed

  • Peipei Ma, Denmark
  • Rasmus Froberg Brøndum, Denmark
  • Zhang Qin, Department of Animal Genetics and Breeding, China Agricultural University, China
  • Mogens Sandø Lund
  • Guosheng Su
This study investigated the imputation accuracy of different methods, considering both the minor allele frequency and the relatedness between individuals in the reference and test data sets. Two data sets from the combined population of Swedish and Finnish Red Cattle were used to test the influence of these factors on the accuracy of imputation. Data set 1 consisted of 2,931 reference bulls and 971 test bulls, and was used for validation of imputation from 3,000 markers (3K) to 54,000 markers (54K). Data set 2 contained 341 bulls in the reference set and 117 in the test set, and was used for validation of imputation from 54K to high density [777,000 markers (777K)]. Both test sets were divided into 4 groups according to their relationship to the reference population. Five imputation methods (Beagle, IMPUTE2, findhap, AlphaImpute, and FImpute) were used in this study. Imputation accuracy was measured as the allele correct rate and the correlation between imputed and true genotypes. Results demonstrated that the accuracy was lower when imputing from 3K to 54K than from 54K to 777K. Across the imputation methods, the allele correct rates varied from 93.5 to 97.1% when imputing from 3K to 54K, and from 97.1 to 99.3% when imputing from 54K to 777K. IMPUTE2 and Beagle resulted in higher accuracies and were more robust under various conditions than the other 3 methods when imputing from 3K to 54K. When imputing from 54K to high density, the accuracy of FImpute was similar to that of Beagle and IMPUTE2, and higher than that of the remaining 2 methods. The results also showed that a closer relationship between test set and reference set led to a higher accuracy for all the methods. In addition, the correct rate was higher when the minor allele frequency was lower, whereas the correlation coefficient was lower when the minor allele frequency was lower. The results indicate that Beagle and IMPUTE2 provide the most robust and accurate imputation, but considering computing time and memory usage, FImpute is an alternative method.
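The two accuracy measures named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it assumes genotypes are coded as 0, 1, or 2 copies of an allele, that the allele correct rate counts one wrong allele for each unit of difference between true and imputed genotype, and that the correlation is the Pearson coefficient. The function names and the toy genotype vectors are hypothetical and are not data from the study.

```python
from math import sqrt


def allele_correct_rate(true_genotypes, imputed_genotypes):
    """Share of alleles imputed correctly (assumed 0/1/2 genotype coding)."""
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    # A genotype that is off by one has one wrong allele; off by two has two.
    wrong = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return 1 - wrong / (2 * len(true_genotypes))


def genotype_correlation(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed genotypes."""
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        return float("nan")  # undefined when a marker shows no variation
    return cov / sqrt(var_t * var_i)


# Toy marker with a rare allele (hypothetical data): one heterozygote is missed.
rare_true = [0] * 18 + [1, 1]
rare_imputed = [0] * 18 + [1, 0]

# Toy marker with a common allele (hypothetical data): also one wrong allele.
common_true = [0] * 5 + [1] * 10 + [2] * 5
common_imputed = [0] * 5 + [1] * 9 + [2] * 6

rare_rate = allele_correct_rate(rare_true, rare_imputed)
common_rate = allele_correct_rate(common_true, common_imputed)
rare_r = genotype_correlation(rare_true, rare_imputed)
common_r = genotype_correlation(common_true, common_imputed)

print(f"rare marker:   correct rate {rare_rate:.3f}, correlation {rare_r:.3f}")
print(f"common marker: correct rate {common_rate:.3f}, correlation {common_r:.3f}")
```

In this toy case both markers have a single wrong allele among 40, so the correct rate is identical, yet the correlation is clearly lower for the rare-allele marker. Under these assumptions, that is one way to see why the correlation can fall at low minor allele frequency even when the correct rate stays high.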
Original language: English
Journal: Journal of Dairy Science
Volume: 96
Issue: 7
Pages (from-to): 4666-4677
Number of pages: 12
ISSN: 0022-0302
DOI
Status: Published - Jun. 2013

ID: 52084297