
Optimal sample size for predicting viability of cabbage and radish seeds based on near infrared spectra of single seeds

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

The effect of the number of seeds in a training set on the ability to predict the viability of cabbage or radish seeds is presented and discussed. The supervised classification method extended canonical variates analysis (ECVA) was used to develop the classification models. Calibration subsets of different sizes were chosen either randomly, over several iterations, or with the spectral-based sample selection algorithms DUPLEX and CADEX. An independent test set was used to validate the developed classification models. The results showed that 200 seeds was the optimal calibration set size for both the cabbage and the radish data. At this size, the misclassification rates for the random method (averaged over 10 iterations), DUPLEX and CADEX were 8%, 6% and 7% for cabbage and 3%, 3% and 2% for radish, respectively. These rates were similar to the 6% (cabbage) and 2% (radish) obtained using all 600 seeds in the calibration set. Thus, the number of seeds in the calibration set can be reduced by up to 67% without significant loss of classification accuracy, which improves the cost-effectiveness of NIR spectral analysis. Wavelength regions important for discriminating between viable and non-viable seeds were identified using interval ECVA (iECVA) models, ECVA weight plots and the mean difference spectrum between viable and non-viable seeds.
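
The random-subset part of this procedure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the spectra are synthetic, the test set size of 200 and the 20 "wavelengths" are arbitrary, and a plain nearest-centroid classifier stands in for ECVA, which is not reproduced here. The sketch only shows the shape of the experiment: draw calibration subsets of several sizes, fit a classifier on each, and average the misclassification rate on an independent test set. The final line checks the arithmetic behind the stated reduction, (600 - 200) / 600, which is about 67%.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier (NOT ECVA): mean spectrum of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def misclassification_rate(centroids, X_test, y_test):
    """Fraction of test seeds assigned to the wrong class."""
    errors = sum(predict(centroids, x) != lab for x, lab in zip(X_test, y_test))
    return errors / len(y_test)

def make_synthetic_seeds(n, rng, n_wavelengths=20):
    """Hypothetical data: viable (1) and non-viable (0) seeds differ by a small offset."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(0.3 * label, 1.0) for _ in range(n_wavelengths)])
        y.append(label)
    return X, y

def learning_curve(X_cal, y_cal, X_test, y_test, sizes, n_iter, rng):
    """Mean test-set misclassification rate for random calibration subsets of each size."""
    curve = {}
    for size in sizes:
        rates = []
        for _ in range(n_iter):
            idx = rng.sample(range(len(X_cal)), size)
            model = fit_centroids([X_cal[i] for i in idx], [y_cal[i] for i in idx])
            rates.append(misclassification_rate(model, X_test, y_test))
        curve[size] = sum(rates) / n_iter
    return curve

rng = random.Random(0)
X_cal, y_cal = make_synthetic_seeds(600, rng)    # full calibration pool (600 seeds, as in the abstract)
X_test, y_test = make_synthetic_seeds(200, rng)  # independent test set (size is an assumption)
curve = learning_curve(X_cal, y_cal, X_test, y_test,
                       sizes=[50, 100, 200, 400, 600], n_iter=10, rng=rng)
for size, rate in curve.items():
    print(f"{size:4d} seeds: {rate:.1%} misclassified")

# Reduction in calibration set size when going from 600 to 200 seeds.
reduction = (600 - 200) / 600
print(f"reduction in calibration size: {reduction:.0%}")
```

The rates printed for the synthetic data carry no meaning for real seeds. Spectral-based selection such as DUPLEX or CADEX would replace the `rng.sample` call with a deterministic choice of indices, so only one model per size would be needed.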

Original language: English
Journal: Journal of Near Infrared Spectroscopy
Pages (from-to): 451-461
Number of pages: 11
Publication status: Published - 2011

    Research areas

  • seeds, NIR, classification, ECVA, iECVA, PCA, DUPLEX, CADEX, misclassification rate


ID: 41695102