Raw sequence reads are available on NCBI under BioProject PRJNA763859, accession numbers SAMN21454079, SAMN21454080, SAMN21454081, and SAMN21454082. Additional sequence data used in this study were previously published under BioProject PRJNA579979. The allele frequency matrix used in GEA analyses and the environmental data, including a correlation matrix, are available on Dryad: https://doi.org/10.5061/dryad.prr4xgxmn. R code and scripts are available in the GitHub repository: https://github.com/YaraAlshw/LG_Chinook.
Abstract
Many species that undergo long breeding migrations, such as anadromous fishes, face highly heterogeneous environments along their migration corridors and at their spawning sites. These environmental challenges encountered at different life stages may act as strong selective pressures and drive local adaptation. However, the relative influence of environmental conditions along the migration corridor compared to the conditions at spawning sites on driving selection is still unknown. In this study, we performed genome-environment associations (GEA) to understand the relationship between landscape and environmental conditions driving selection in seven populations of the anadromous Chinook salmon (Oncorhynchus tshawytscha), a species of important economic, social, cultural and ecological value, in the Columbia River basin. We extracted environmental variables for the shared migration corridors and for the distinct spawning sites of each population, and used a Pool-seq approach to perform whole genome resequencing. Bayesian and univariate genome-environment association tests with migration-specific and spawning site-specific environmental variables indicated many more candidate SNPs associated with environmental conditions of the migration corridor compared to spawning sites. Specifically, temperature, precipitation, terrain roughness, and elevation variables of the migration corridor were the most significant drivers of environmental selection. Additional analyses of neutral loci revealed two distinct clusters representing populations from different geographic regions of the drainage that also exhibit differences in adult migration timing (summer vs. fall). Tests for genomic regions under selection revealed a strong peak on chromosome 28, corresponding to the GREB1L/ROCK1 region that has been identified previously in salmonids as a region associated with adult migration timing.
Our results show that environmental variation experienced throughout migration corridors imposed a greater selective pressure on Chinook salmon than environmental conditions at spawning sites.
Usage Notes
Environmental_data_corr.xlsx
This Excel file contains three sheets. The sheet "140 variables" contains all variable measurements for the seven populations of Chinook salmon. The sheet "Correlation Matrix" shows the correlation value for each pair of variables. Correlations with R values ≥ 0.80 are highlighted in red; correlations with R values ≤ -0.80 are highlighted in yellow. Variables that showed no variation are highlighted in grey. The sheet "Abbreviations" shows the variable name, resolution and geographic resolution, source, and abbreviation for the mean, maximum, minimum, range and point (site).
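The highlighting rule described above can be reproduced programmatically. The following is a minimal Python sketch, not the authors' code (their scripts are in R in the GitHub repository). It assumes Pearson correlation and uses only the 0.80 cutoff and the "no variation" rule stated in these notes. The function name `flag_correlated`, the variable names, and all values in `example` are hypothetical and invented for illustration; they are not taken from the dataset.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean

THRESHOLD = 0.80  # cutoff stated in the usage notes


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def flag_correlated(variables, threshold=THRESHOLD):
    """Return (constant_variables, flagged_pairs) for a dict of name -> values."""
    # Variables with no variation have an undefined correlation; set them aside.
    constant = [name for name, vals in variables.items() if len(set(vals)) == 1]
    varying = {n: v for n, v in variables.items() if n not in constant}
    flagged = []
    for (a, xs), (b, ys) in combinations(varying.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:  # covers both R >= 0.80 and R <= -0.80
            flagged.append((a, b, round(r, 2)))
    return constant, flagged


# Hypothetical values, one per population (seven populations); NOT real data.
example = {
    "temp_mean": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
    "elev_mean": [700, 600, 500, 400, 300, 200, 100],
    "precip_range": [3, 1, 4, 1, 5, 9, 2],
    "landcover_const": [1, 1, 1, 1, 1, 1, 1],
}

constant, flagged = flag_correlated(example)
print("No variation:", constant)
print("Highly correlated pairs:", flagged)
```

With real data, the dictionary would be built from the columns of one of the environmental CSV files; the column layout of those files is not described here, so the loading step is left out.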
env_site.csv
This CSV file contains all the environmental variables used in the "Site" dataset.
env_migration.csv
This CSV file contains all the environmental variables used in the "Migration" dataset.
env_combinations.csv
This CSV file contains all the environmental variables used in the "Combination" dataset.
env_full.csv
This CSV file contains all the environmental variables used in the "Full" dataset.
allele_freqs_Chinook.RData
This .RData file contains the allele frequency matrix used in GEA analyses.
- Chinook salmon (Oncorhynchus tshawytscha)
- SNP