Annotation data from: Utilizing a comparative approach to assess genome evolution during diploidization in Artemisia tridentata (Asteraceae), a keystone species of western North America
Supporting dataset for the genome assemblies in NCBI BioProjects PRJNA1032953 (UTT2), PRJNA722258 (IDT2), and PRJNA795150 (IDT3, the reference genome). The FASTA assemblies used as input to the EDTA analysis are available from the NCBI Genome database, and the raw sequence data are available from the NCBI SRA database. Each input FASTA contains the nine pseudo-chromosomes described in Melton et al. 2022. Reads from each sample were mapped to the nine pseudo-chromosomes and used to call a consensus sequence. The EDTA analysis produces several outputs, listed below. The primary file of interest is "SAMPLE_consensus.fasta.mod.EDTA.TEanno.gff3", which was used as the input for comparisons of TE content across the three genomes.
Please visit https://github.com/oushujun/EDTA for more information about the EDTA pipeline.
FILES:
SAMPLE_consensus.fasta.mod.EDTA.TEanno.gff3 == Whole-genome TE annotation
SAMPLE_consensus.fasta.mod.EDTA.TEanno.sum == Summary of whole-genome TE annotation
SAMPLE_consensus.fasta.mod.EDTA.TElib.fa == A non-redundant TE library
SAMPLE_consensus.fasta.mod.MAKER.masked == Low-threshold TE masking
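As a starting point for working with the TEanno.gff3 file, the sketch below totals annotated base pairs per feature type. It is an illustration only, not the script used in the study. It assumes the file follows the standard nine-column, tab-separated GFF3 layout, with the feature type in column 3 and 1-based inclusive start/end coordinates in columns 4 and 5. The function name te_bp_by_type is hypothetical. Overlapping or nested features are each counted in full, so the totals may exceed the true masked length.

```python
from collections import defaultdict


def te_bp_by_type(gff_lines):
    """Sum annotated base pairs per feature type (GFF3 column 3).

    Assumes standard 9-column GFF3; overlapping features are not merged.
    """
    totals = defaultdict(int)
    for line in gff_lines:
        line = line.rstrip("\n")
        # Skip blank lines and GFF3 header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a well-formed feature line
        start, end = int(fields[3]), int(fields[4])
        # GFF3 coordinates are 1-based and inclusive.
        totals[fields[2]] += end - start + 1
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python te_summary.py SAMPLE_consensus.fasta.mod.EDTA.TEanno.gff3
    with open(sys.argv[1]) as handle:
        totals = te_bp_by_type(handle)
    for feature_type, bp in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{feature_type}\t{bp}")
```

Running the script on each of the three samples gives per-type totals that can be set side by side. The TEanno.sum file already reports a summary from EDTA itself, which is the more authoritative source.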