September 12, 2022

Genomic Prediction of Cassava Mosaic Disease

Executive Summary

Cassava (Manihot esculenta) is one of the most nutritionally important starchy root crops in the world, providing sustenance for around 800 million people globally. Its high starch content, drought resistance, and high yield per area make it ideally suited to challenging environments. Cassava mosaic disease (CMD) is a major threat to cassava production, causing losses of between 12% and 82%, equating to about 30 million tons annually. New insight into the drivers of disease development is therefore especially important in cassava breeding. Synomics’ novel technology platform, DISCOVER™, models the complexity of biology through explicit statistical testing of multi-dimensional feature interactions using proprietary, computationally efficient algorithms and GPU implementations. Here we identified SNPs and SNP networks involved in both alcohol and cadmium-ion related pathways, which have previously been reported to be involved in the resistance of cassava to CMD. Not only did these discoveries map to potentially causative biology, but when the SNP networks were incorporated into breeding prediction models, they materially outperformed the current approach, suggesting that selecting cassava clones based on predictions from Synomics’ technology leads to higher CMD resistance.

Background

Cassava

Cassava (Manihot esculenta) is the most important starchy root crop in the tropics. Its drought resistance, wide harvest windows and high yield per area make it an ideal crop in the developing world. As such, it is the primary food staple for millions of people in Africa, Asia, and Latin America (Parmar et al., 2017), making cassava an economically important crop. Cassava mosaic disease (CMD) is a viral infection causing losses ranging from 12% to 82% depending on variety and infection type, leading to annual economic losses of between US$1.9 billion and US$2.7 billion in East and Central Africa alone (Patil and Fauquet, 2009). Given cassava’s global importance and its susceptibility to biological risks, breeding for CMD resistance has been a key priority. Breeding resistant plants is also a more sustainable and deployable solution than agrochemical approaches.

Cassava breeding

While breeding is an excellent strategy for combating CMD, traditional cassava breeding faces challenges such as asynchronous flowering, a small number of seeds per cross, and a long cropping cycle (Rabbi et al., 2022). These, coupled with low multiplication of planting material for multi-environment screening and high variation in the performance of cuttings used for propagation, have slowed cassava’s breeding progress. The most important tool for circumventing such limitations is genomic breeding. Through genomic breeding, we can drastically shorten breeding cycles, because decisions on which plants to advance can be made before phenotypic data are collected. Genomic selection strategies that improve key traits will be very valuable and fundamental to food security.

Study Design

Synomics DISCOVER™

Biological systems are complex, and non-additive effects constitute a large part of the underlying processes. While this is widely understood and acknowledged, the computational cost and the difficulty of statistical inference in such a large search space mean that models incorporating these effects remain largely underutilized.

The Synomics DISCOVER™ platform models traits and diseases not only through individual SNPs, but also through complex multiway genomic interactions that relate to a disease or a trait. Synomics’ novel methodology can lead to discoveries of functionally co-dependent genetic interactions. These discoveries can be used to understand and predict phenotypic outcomes from the same genetic information that other industry-standard tools use, but with a much more complex computational model.
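To make the idea of a non-additive, multiway interaction concrete, the following minimal sketch encodes a "network" as a specific combination of SNP genotypes and compares individuals that carry it with those that do not. It is an illustration only and not the DISCOVER™ algorithm, which is proprietary and not described in this report; the function names, the 0/1/2 genotype coding and the toy data are all assumptions.

```python
from statistics import mean

def carries_network(genotype, network):
    """True if an individual has every SNP genotype listed in the network.

    `network` maps a SNP index to the required genotype (hypothetical encoding).
    """
    return all(genotype[snp] == value for snp, value in network.items())

def network_effect(genotypes, phenotypes, network):
    """Mean phenotype of carriers minus mean phenotype of non-carriers."""
    carriers = [y for g, y in zip(genotypes, phenotypes) if carries_network(g, network)]
    others = [y for g, y in zip(genotypes, phenotypes) if not carries_network(g, network)]
    if not carriers or not others:
        raise ValueError("both carrier and non-carrier groups must be non-empty")
    return mean(carriers) - mean(others)

# Toy data (assumed): two SNPs coded 0/2, with a phenotype that depends only
# on the combination of the two genotypes, not on either SNP alone.
genotypes = [(0, 0), (0, 2), (2, 0), (2, 2)]
phenotypes = [3.0, 1.0, 1.0, 3.0]

single_snp = network_effect(genotypes, phenotypes, {0: 2})
two_snp = network_effect(genotypes, phenotypes, {0: 2, 1: 2})
print(f"single-SNP contrast: {single_snp:.2f}, two-SNP network contrast: {two_snp:.2f}")
```

In this toy example the single-SNP contrast vanishes while the two-SNP combination does not, which is the kind of signal a purely additive, SNP-by-SNP model would miss.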

To demonstrate the general applicability of our discoveries, we apply a standard machine learning technique of leaving a portion of the data out of our discovery process and using it as a blinded test-set.

Dataset

The dataset used in this study was taken from Rabbi et al. (2022). The original CMD data were discrete severity scores ranging from 1 to 5, recorded over 4 years with 3 measurements per year (1, 3 and 6 months after planting). We used the mean of the scores at 1, 3 and 6 months after planting as the new CMD trait. Table 1 describes the observation values. The mean CMD trait was adjusted for the effects of year, location and study design before it entered the Synomics DISCOVER™ platform. As predicting the next generation is a major goal in breeding, the clones from 2015 served as the test set, while all earlier years were used for model training. Prediction accuracy was evaluated as the Pearson correlation coefficient between the 2015 phenotypes and their predictions.

Table 1. CMD severity scores.

| Observation | Description |
|---|---|
| 1 | Clean, no infection |
| 2 | Up to 25% leaf area chlorotic, mild leaf distortion, no stunting |
| 3 | 25-50% leaf area chlorotic, moderate leaf distortion, no stunting |
| 4 | 50-75% leaf area chlorotic, severe leaf distortion, moderate stunting |
| 5 | 75-100% leaf area chlorotic, severe leaf distortion, small leaflets (almost no lamina), severe stunting |
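The trait construction and evaluation scheme described in the Dataset section can be sketched as follows: average the three severity scores per clone, hold out the 2015 clones as the test set, and score predictions by Pearson correlation. The record layout, the earlier trial years, the scores and the predictions are placeholders rather than data from the study, and the adjustment for year, location and design effects is omitted for brevity.

```python
from math import sqrt
from statistics import mean

# Hypothetical records: one entry per clone, with CMD severity scores (1-5)
# taken 1, 3 and 6 months after planting.
records = [
    {"clone": "A", "year": 2013, "scores": (1, 2, 2)},
    {"clone": "B", "year": 2013, "scores": (3, 4, 5)},
    {"clone": "C", "year": 2014, "scores": (2, 2, 3)},
    {"clone": "D", "year": 2015, "scores": (1, 1, 2)},
    {"clone": "E", "year": 2015, "scores": (4, 4, 5)},
    {"clone": "F", "year": 2015, "scores": (2, 3, 3)},
]

def mean_cmd(scores):
    """Mean CMD severity over the 1-, 3- and 6-month observations."""
    return mean(scores)

def split_by_year(records, test_year):
    """Train on all years before `test_year`; test on `test_year` itself."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

train, test = split_by_year(records, test_year=2015)
observed = [mean_cmd(r["scores"]) for r in test]
predicted = [1.6, 3.9, 2.4]  # placeholder predictions, not output of any model
accuracy = pearson(observed, predicted)
print(f"{len(train)} training clones, {len(test)} test clones, accuracy = {accuracy:.2f}")
```

Splitting by year rather than at random mirrors the breeding use case, where the model must predict clones it has never seen from a later cohort.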

Breeding Value Estimation

The networks identified by DISCOVER™ were taken further into the framework of breeding value estimation, predicting the best individuals to select for further breeding.
In this framework, the industry-standard GBLUP approach served as the reference benchmark. Networks identified by DISCOVER™ were incorporated into BLUP models using a range of unique, proprietary approaches developed by Synomics. These approaches are summarized in Table 2.

Table 2. Synomics BLUP models.

| Model | Description |
|---|---|
| Model 1 | Models main effects for the SNPs of the top networks. |
| Model 2 | Models effects for the top networks. |
| Model 3 | Models both main effects for all SNPs, together with network effects. |
| Model 4 | Models effects for all networks. |
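For readers unfamiliar with the GBLUP benchmark, the sketch below shows one common textbook formulation: a genomic relationship matrix built from centred allele counts (VanRaden-style), followed by a BLUP prediction for unphenotyped individuals. It is not the pipeline used in this study. The variance ratio `lam` is assumed to be known (in practice it is estimated, for example by REML), the overall mean is approximated by the training mean, and the genotypes and phenotypes are toy values. Pure Python is used to keep the example self-contained; a real analysis would use a linear algebra library.

```python
def genomic_relationship_matrix(genotypes):
    """G = ZZ' / (2 * sum p(1-p)), with Z the allele counts centred by 2p."""
    n, m = len(genotypes), len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
            for i in range(n)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def gblup_predict(genotypes, phenotypes, train_idx, test_idx, lam=1.0):
    """Predict test phenotypes as mu + G_test,train (G_train + lam I)^-1 (y - mu).

    lam is the assumed ratio of residual to additive genetic variance.
    """
    g = genomic_relationship_matrix(genotypes)
    mu = sum(phenotypes[i] for i in train_idx) / len(train_idx)
    lhs = [[g[i][k] + (lam if i == k else 0.0) for k in train_idx] for i in train_idx]
    rhs = [phenotypes[i] - mu for i in train_idx]
    alpha = solve(lhs, rhs)
    return [mu + sum(g[t][i] * a for i, a in zip(train_idx, alpha)) for t in test_idx]

# Toy data (assumed): 5 clones x 4 SNPs coded as 0/1/2 allele counts.
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1], [0, 2, 1, 0], [1, 0, 2, 2]]
phenotypes = [1.0, 2.0, 3.0, 1.5, 2.5]  # the last value is treated as unobserved
predictions = gblup_predict(genotypes, phenotypes, train_idx=[0, 1, 2, 3], test_idx=[4])
print(predictions)
```

Because every SNP contributes to G with the same weight, this benchmark captures additive relationships only, which is the contrast the network-based models in Table 2 are designed to test.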

Results

Prediction Accuracies

Prediction accuracies were calculated from predictions for plants excluded from model training. As Table 3 shows, all Synomics models predicted CMD more accurately than the GBLUP model. The highest accuracy, 69%, was observed when all networks were modelled (model 4), but models 2 and 3 reached accuracies of 67% and 68% with far fewer SNPs/networks than model 4.

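Table 3. Prediction accuracies for CMD.

| Model | Prediction Accuracy | #SNPs modelled individually | #Networks modelled | #SNPs in networks |
|---|---|---|---|---|
| GBLUP | 52% | 40,397 | - | - |
| Model 1 | 60% | 101 | - | - |
| Model 2 | 67% | - | 58 | 101 |
| Model 3 | 68% | 40,397 | 58 | 101 |
| Model 4 | 69% | - | 225,366 | 11,953 |

Gene Enrichment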

Beyond purely predictive breeding applications, we extract biological knowledge from the SNPs and SNP networks found by DISCOVER™. This extends automatically to the use of orthologues from other species where genome annotation is poor, as is the case for cassava. Using our automated pipelines, we identified multiple biological processes that were enriched in genes associated with the most predictive SNPs from our network analysis. We identified enrichment of genes in alcohol pathways. Cellular response to alcohol has been shown to be upregulated in CMD-tolerant cassava landraces (Allie et al., 2014). Interestingly, we also found enrichment in cadmium-ion related pathways, and cadmium has been shown to provide viral resistance in several studies (Ghoshroy et al., 1998; DalCorso et al., 2010).
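The report does not state which enrichment statistic was used. A common choice for this kind of analysis is a one-sided hypergeometric (over-representation) test, sketched below under that assumption; the function name and all gene counts are hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric test for pathway over-representation.

    Returns P(X >= n_overlap), where X is the number of pathway genes found
    when n_selected genes are drawn without replacement from n_background
    annotated genes, of which n_pathway belong to the pathway.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 annotated genes, 150 in the pathway of interest,
# 80 genes near the most predictive SNPs, 6 of which fall in the pathway.
p = enrichment_p_value(n_background=20_000, n_pathway=150, n_selected=80, n_overlap=6)
print(f"enrichment p-value: {p:.3g}")
```

In a real analysis many pathways are tested at once, so the resulting p-values would also need a multiple-testing correction.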

Conclusion and Discussions

Here we have shown that Synomics’ unique proprietary technology – DISCOVER™ – is able to identify SNPs and SNP-networks that both:
1. Provide insight into the genes involved in the development of CMD; and
2. Predict the phenotype with high accuracy.

Providing insight into genes involved in CMD

Models 1 and 2 show that a small number of SNPs (101) predicts the phenotypes more accurately than all SNPs (40,397) in a GBLUP model. This indicates that a small number of SNPs strongly affects CMD. The presence of quantitative trait loci that regulate the CMD susceptibility of cassava has been reported previously (Lokko et al., 2005; Okogbenin et al., 2012). Similarly, transgenic cassava plants were found to be more resistant to CMD (Zhang et al., 2005). The small number of SNPs found to affect CMD here therefore provides further potential targets for improving CMD resistance in cassava. Further, we were able to identify enriched pathways within those 101 SNPs that have high biological relevance to disease resistance.

Predicting the phenotype with high accuracy

The prediction accuracies in Table 3 show that all of Synomics’ novel BLUP models outperform the standard GBLUP genetic evaluation model. Synomics’ unique proprietary approach includes identifying networks (combinations of SNP genotypes across the genome) that affect a disease or trait. The network-based BLUP models (models 2 to 4) outperformed the purely SNP-based models (GBLUP and model 1). As network-based models allow us to model high-order epistatic effects, these findings indicate that CMD is controlled not only by a small number of SNPs with simple additive effects, but also by interactions between those SNPs. Because prediction accuracy using the SNPs from DISCOVER™ was higher than that of GBLUP, selection based on Synomics’ technology can be expected to lead to higher genetic gain than selection based on GBLUP. Finally, in a multi-environment trial, interactions between SNPs and environments can be used to breed varieties adapted to specific environments, which is essential for a successful crop breeding program. Such SNP-environment interactions were found, for example, for cassava bacterial blight (Sedano et al., 2017), so similar interactions can be expected for CMD.
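The link between prediction accuracy and genetic gain drawn above can be made explicit with the standard breeder's equation, in which the gain per unit time is proportional to the selection accuracy r. As a rough illustration, assume the same selection intensity i, additive genetic standard deviation σ_A and cycle length L for both approaches, and treat the reported correlations as proportional to selection accuracy; the ratio of expected gains then reduces to the ratio of accuracies:

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
\qquad\Longrightarrow\qquad
\frac{\Delta G_{\text{Model 4}}}{\Delta G_{\text{GBLUP}}}
  = \frac{r_{\text{Model 4}}}{r_{\text{GBLUP}}}
  = \frac{0.69}{0.52} \approx 1.33
```

Under these simplifying assumptions, the accuracies in Table 3 would correspond to roughly a one-third higher expected gain per cycle for model 4 than for GBLUP. This is an indicative calculation, not a result measured in this study.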

References

P. Zhang, et al. (2005). Resistance to cassava mosaic disease (…). Plant Biotechnology Journal, Vol. 3, pp. 385–397. https://doi.org/10.1111/j.1467-7652.2005.00132.x

E. Okogbenin, et al. (2012). Molecular Marker Analysis and Validation of Resistance to Cassava Mosaic Disease in Elite Cassava Genotypes in Nigeria. Crop Breeding & Genetics, Vol. 52 (6), pp. 2576–2586. https://doi.org/10.2135/cropsci2011.11.0586

Y. Lokko, et al. (2005). Molecular markers associated with a new source of resistance to the cassava mosaic disease. African Journal of Biotechnology, Vol. 4 (9), pp. 873–881.

F. Allie, et al. (2014). Transcriptional analysis of South African cassava mosaic virus-infected (…). BMC Genomics, Vol. 15. https://doi.org/10.1186/1471-2164-15-1006

S. Ghoshroy, et al. (1998). Inhibition of plant viral systemic infection by non-toxic concentrations of cadmium. The Plant Journal, Vol. 13 (5), pp. 591–602. https://doi.org/10.1046/j.1365-313X.1998.00061.x

G. DalCorso, et al. (2010). Regulatory networks of cadmium stress in plants. Plant Signaling & Behavior, Vol. 5 (6), pp. 663–667. https://doi.org/10.4161/psb.5.6.11425

J.C.S. Sedano, et al. (2017). Major Novel QTL for Resistance to Cassava Bacterial Blight Identified through a Multi-Environmental Analysis. Frontiers in Plant Science, Vol. 8. https://doi.org/10.3389/fpls.2017.01169

I.Y. Rabbi, et al. (2022). Genome-wide association analysis reveals new insights into the genetic architecture of defensive (…). Plant Molecular Biology, Vol. 109, pp. 195–213. https://doi.org/10.1007/s11103-020-01038-3

B. Owor, et al. (2004). The effect of cassava mosaic geminiviruses on symptom severity, growth and root yield of a cassava (…). Annals of Applied Biology, Vol. 145, pp. 331–337. https://doi.org/10.1111/j.1744-7348.2004.tb00390.x

A. Parmar, et al. (2017). Crops that feed the world: Production and improvement of cassava for food, feed, and industrial uses. Food Security, Vol. 9, pp. 907–927.

B.L. Patil and C.M. Fauquet (2009). Cassava mosaic geminiviruses: actual knowledge and perspectives. Molecular Plant Pathology, Vol. 10 (5), pp. 685–701.
