Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

Standard

Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne. / Ruttink, Tom; Sterck, Lieven; Rohde, Antje; Bendixen, Christian; Rouzé, Pierre; Asp, Torben; Van de Peer, Yves; Roldán-Ruiz, Isabel.

In: Plant Biotechnology Journal, Vol. 11, No. 5, 06.2013, p. 605-607.

Harvard

Ruttink, T, Sterck, L, Rohde, A, Bendixen, C, Rouzé, P, Asp, T, Van de Peer, Y & Roldán-Ruiz, I 2013, 'Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne', Plant Biotechnology Journal, vol. 11, no. 5, pp. 605-607. https://doi.org/10.1111/pbi.12051

Author

Ruttink, Tom ; Sterck, Lieven ; Rohde, Antje ; Bendixen, Christian ; Rouzé, Pierre ; Asp, Torben ; Van de Peer, Yves ; Roldán-Ruiz, Isabel. / Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne. In: Plant Biotechnology Journal. 2013 ; Vol. 11, No. 5. pp. 605-607.

Bibtex

@article{78d25f43ea4241ab9ab5e775442635f9,
title = "Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in {Lolium perenne}",
abstract = "Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis",
keywords = "de novo transcriptome assembly, genetic diversity, Lolium perenne, heterozygosity, crops, SNP",
author = "Tom Ruttink and Lieven Sterck and Antje Rohde and Christian Bendixen and Pierre Rouz{\'e} and Torben Asp and {Van de Peer}, Yves and Isabel Rold{\'a}n-Ruiz",
year = "2013",
month = "6",
doi = "10.1111/pbi.12051",
language = "English",
volume = "11",
pages = "605--607",
journal = "Plant Biotechnology Journal",
issn = "1467-7644",
publisher = "Wiley-Blackwell Publishing Ltd",
number = "5",

}

RIS

TY - JOUR

T1 - Orthology Guided Assembly in highly heterozygous crops

T2 - creating a reference transcriptome to uncover genetic diversity in Lolium perenne

AU - Ruttink, Tom

AU - Sterck, Lieven

AU - Rohde, Antje

AU - Bendixen, Christian

AU - Rouzé, Pierre

AU - Asp, Torben

AU - Van de Peer, Yves

AU - Roldán-Ruiz, Isabel

PY - 2013/6

Y1 - 2013/6

N2 - Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis

AB - Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis

KW - de novo transcriptome assembly

KW - genetic diversity

KW - Lolium perenne

KW - heterozygosity

KW - crops

KW - SNP

U2 - 10.1111/pbi.12051

DO - 10.1111/pbi.12051

M3 - Journal article

C2 - 23433242

VL - 11

SP - 605

EP - 607

JO - Plant Biotechnology Journal

JF - Plant Biotechnology Journal

SN - 1467-7644

IS - 5

ER -