In accordance with BBSRC policy on data sharing, many of our datasets are freely available for further research, or for use as model data when comparing classification and regression methods. They are shared under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited. Please cite the reference accompanying each dataset; these references also give full experimental details of the original data acquisition.
Please visit our GitHub page QIBChemometrics for secure download of all the latest data repositories, including:
- Benchtop NMR spectra of ‘Arabica’ ground roast coffees: a retail survey (includes several suspected adulterated samples)
- FTIR Spectral Library – illustrative examples of a variety of foods and ingredients
- FTIR Spectra of Red Wines – 37 red wines of two varieties, sub-sampled in triplicate
The following datasets can also be downloaded below as ‘zipped’ folders.
- 983 Mid-infrared (MIR) spectra of fresh fruit purees: strawberry (authentic samples) and non-strawberry (adulterated strawberries and other fruits). Raw data matrix size [983 x 235]. Obtained using Fourier transform infrared (FTIR) spectroscopy with attenuated total reflectance (ATR) sampling. As described in “Use of Fourier transform infrared spectroscopy and partial least squares regression for the detection of adulteration of strawberry purees” Holland JK, Kemsley EK, Wilson RH. (1998). Journal of the Science of Food and Agriculture, 76, 263-269. More on our food authentication work.
- 120 Mid-infrared (MIR) spectra of fresh minced meats – chicken, pork and turkey. Duplicate acquisitions from 60 independent samples. Raw data matrix size [448 x 120]. Obtained using Fourier transform infrared (FTIR) spectroscopy with attenuated total reflectance (ATR) sampling. As described in “Mid-infrared spectroscopy and authenticity problems in selected meats: a feasibility study” Al-Jowder O, Kemsley E K, Wilson R. H. (1997) Food Chemistry 59 195-20. More on our food authentication work.
- 96 Lanes of DGGE data of faecal samples from 13 patients with ulcerative colitis, 11 with irritable bowel syndrome, and 22 healthy controls. Includes technical replicate(s) of most samples. Data matrix size [96 x 731]. The data are described in the Open Access paper Noor SO, Ridgway K, Scovell L, Kemsley EK, Lund EK, Jamieson C, Johnson IT, Narbad A (2010) “Ulcerative colitis and irritable bowel patients exhibit distinct abnormalities of the gut microbiota” BMC Gastroenterology 10 134. The data comprise grayscale intensities for the pre-processed (aligned, autoscaled) lanes, as shown in Figure 4b of the cited paper.
- 120 Mid-infrared (MIR) spectra. Duplicate acquisitions from 60 authenticated extra virgin olive oils from 4 different countries of origin; raw data matrix size [570 x 120]. Obtained using Fourier transform infrared (FTIR) spectroscopy with attenuated total reflectance (ATR) sampling. As described in “FTIR spectroscopy and multivariate analysis can distinguish the geographic origin of extra virgin olive oils” Tapp H.S. et al, J. Agric. Food Chem. 51 (21) 6110-5 (2003). More on our food authentication work.
- 214 Lanes of SDS-PAGE data, from 19 Spanish and 18 French dry-cured hams; data matrix size [214 x 431]. Triplicate lanes of duplicate extractions (giving ~6 data vectors per sample), pre-processed and aligned as described in “Sodium dodecyl sulphate-polyacrylamide gel electrophoresis of proteins in dry-cured hams: Data registration and multivariate analysis across multiple gels”, Olias R. et al., Electrophoresis 27 (7) 1288-1299.
- 144 Facial Muscle Electromyograms (time-varying bioelectrical signals measuring muscle activity; 30 seconds sampled at 1 kHz; raw data matrix size [30000 x 144]). Obtained using surface electromyography (sEMG) from the Right-Masseter muscle group in 6 volunteers, performing duplicate readings of 4 prescribed chewing movements, on 3 separate occasions. As described in “Electromyographic responses to prescribed mastication” Kemsley E.K. et al, J. Electromyog. & Kinesiol. 13 (2) 197-207 (2003). More on our EMG work.
- 56 Mid-infrared (MIR) spectra of authenticated freeze-dried coffee samples (29 arabica and 27 robusta). Raw data matrix size is [286 x 56]. Obtained by Fourier transform infrared (FTIR) spectroscopy with diffuse reflectance (DRIFT) sampling. The data are described in full in the following journal papers: “Discrimination of Arabica and Robusta in Instant Coffee by Fourier Transform Infrared Spectroscopy and Chemometrics” Briandet R et al, J. Agric. Food Chem., 44 (1), 170–174 (1996) and “Near- and Mid-Infrared Spectroscopies in Food Authentication: Coffee Varietal Identification” Downey G. et al, J. Agric. Food Chem. 45 (11) 4357-4361 (1997). More on our food authentication work.
- 1D Proton NMR spectra (600 MHz) of polar and non-polar extracts from 40 coffee samples of assured origin. For the non-polar extracts there are also corresponding benchtop (60 MHz) spectra. These are a subset of the samples described in “16-O-methylcafestol is present in ground roast Arabica coffees: Implications for authenticity testing”.
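Note that the matrix sizes quoted above do not all follow the same orientation: the strawberry puree data are listed as [983 x 235] (spectra in rows), whereas the minced meat data are listed as [448 x 120] (spectra in columns). Most classification and regression software expects one observation per row. The Python sketch below shows one way to put a matrix into that orientation using the observation count quoted in the list. It is illustrative only: the function name `samples_as_rows` is hypothetical, and placeholder arrays stand in for the real data, because the file format inside the zipped folders is not described here.

```python
import numpy as np


def samples_as_rows(matrix, n_samples):
    """Return a 2-D matrix oriented with one observation (spectrum or lane) per row.

    n_samples is the number of observations quoted in the dataset description.
    """
    matrix = np.asarray(matrix)
    if matrix.ndim != 2:
        raise ValueError("expected a 2-D data matrix")
    rows, cols = matrix.shape
    if rows == n_samples and cols != n_samples:
        return matrix  # already one observation per row
    if cols == n_samples and rows != n_samples:
        return matrix.T  # observations were stored in columns
    raise ValueError(
        f"cannot infer orientation of a {rows} x {cols} matrix "
        f"for {n_samples} observations"
    )


# Placeholder arrays with the shapes quoted in the list above (not real data).
strawberry = np.zeros((983, 235))  # 983 spectra, listed as [983 x 235]
meat = np.zeros((448, 120))        # 120 spectra, listed as [448 x 120]

print(samples_as_rows(strawberry, 983).shape)
print(samples_as_rows(meat, 120).shape)
```

Once the real files have been read into arrays, the same check can be applied to each dataset before it is passed to a modelling routine. A square matrix is rejected rather than guessed at, since its orientation cannot be inferred from the shape alone.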