European Congress of Chemical Engineering - 6
Copenhagen 16-21 September 2007

Abstract 4115 - Preprocessing of Chromatographic Data

Preprocessing of Chromatographic Data

Special Symposium - Innovations in Food Technology (LMC Congress)

Innovations in Food Technology - Poster Session (Food - P2)

Dr Thomas Skov
University of Copenhagen
Department of Food Science (IFV), Quality and Technology

Denmark

Keywords: Chromatographic Data

Data from chromatographic systems (GC, HPLC, etc.) often need preprocessing to correct for artifacts introduced during the chromatographic run. This is especially true when the data are to be used for multivariate data analysis, whether as peak areas or as raw chromatograms. Some artifacts can be handled with traditional chromatographic procedures, such as correcting the signal with internal standards or normalization. Other artifacts, such as peak shifts and baseline contributions, need more advanced preprocessing techniques to remove their influence on the subsequent data analysis.
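The two traditional corrections mentioned above can be illustrated with a minimal sketch. The function names, the dictionary layout of the peak table and the compound labels are hypothetical choices made for this example only; they do not come from the study.

```python
def normalize_to_internal_standard(peak_areas, standard_name):
    """Divide every peak area by the area of the internal standard.

    peak_areas: dict mapping a (hypothetical) compound label to its peak area.
    Returns the relative areas of all compounds except the standard itself.
    """
    standard_area = peak_areas[standard_name]
    if standard_area <= 0:
        raise ValueError("internal standard area must be positive")
    return {name: area / standard_area
            for name, area in peak_areas.items() if name != standard_name}


def normalize_to_total_area(chromatogram):
    """Scale a raw chromatogram (list of intensities) so that it sums to 1."""
    total = sum(chromatogram)
    if total == 0:
        raise ValueError("cannot normalize an all-zero chromatogram")
    return [value / total for value in chromatogram]


# Illustrative numbers only, not measured data.
relative = normalize_to_internal_standard({"IS": 200.0, "A": 50.0, "B": 100.0}, "IS")
unit_area = normalize_to_total_area([1.0, 1.0, 2.0])
```

Both functions only rescale the signal; they do not move peaks along the time axis or remove a baseline, which is why the more advanced techniques are needed for those artifacts.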

Several preprocessing methods have been proposed to correct for shifted peaks in chromatographic data and so make the data suitable for subsequent multilinear models such as Principal Component Analysis (PCA) or PARAllel FACtor analysis (PARAFAC). The correlation optimized warping (COW) algorithm has shown great potential for chromatographic data. It is a piecewise (segmented) preprocessing technique that aligns a sample chromatogram (a data vector) to a reference chromatogram by allowing the segment lengths on the sample chromatogram to change [1,2]. With some modifications, these alignment algorithms can also handle shifts in two dimensions (a data matrix), e.g. for GC-MS or GC-GC measurements. In some cases (e.g. GC-GC, with shifts in both dimensions) these techniques can be essential, whereas in other situations the second dimension (e.g. the mass spectrum) can help to align the data more reliably (e.g. when shifted peaks overlap).
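The segment-wise idea behind COW can be sketched as a small dynamic program. This is a simplified illustration, not the implementation of [1,2]: it assumes that the sample and reference have equal length, that the end points are fixed, and that the segment boundaries on the reference are evenly spaced. The names `cow_align`, `resample`, `pearson`, `n_segments` and `slack` are hypothetical. Only the Python standard library is used so that the sketch runs anywhere; an array library would be the natural choice for real data.

```python
from math import exp, sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(p * q for p, q in zip(da, db)) / denom if denom > 0 else 0.0


def resample(signal, start, stop, n_points):
    """Linearly interpolate signal[start..stop] onto n_points evenly spaced positions."""
    out = []
    for k in range(n_points):
        pos = start + (stop - start) * k / (n_points - 1)
        lo = min(int(pos), len(signal) - 2)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[lo + 1] * frac)
    return out


def cow_align(sample, reference, n_segments, slack):
    """Warp `sample` onto `reference`; each sample segment may grow or shrink by up to `slack` points."""
    n = len(reference)
    if len(sample) != n:
        raise ValueError("this sketch assumes equal-length chromatograms")
    # Fixed, evenly spaced segment boundaries on the reference axis.
    ref_bounds = [round(i * (n - 1) / n_segments) for i in range(n_segments + 1)]
    # score[i][x]: best summed correlation with sample boundary i placed at index x.
    score = [{0: 0.0}] + [dict() for _ in range(n_segments)]
    back = [dict() for _ in range(n_segments + 1)]
    for i in range(n_segments):
        r0, r1 = ref_bounds[i], ref_bounds[i + 1]
        ref_seg = reference[r0:r1 + 1]
        for x0, s0 in score[i].items():
            for u in range(-slack, slack + 1):
                x1 = x0 + (r1 - r0) + u
                if x1 <= x0 or x1 > n - 1:
                    continue
                s1 = s0 + pearson(resample(sample, x0, x1, len(ref_seg)), ref_seg)
                if s1 > score[i + 1].get(x1, float("-inf")):
                    score[i + 1][x1] = s1
                    back[i + 1][x1] = x0
    # Backtrack from the fixed end point to recover the optimal sample boundaries.
    bounds = [n - 1]
    for i in range(n_segments, 0, -1):
        bounds.append(back[i][bounds[-1]])
    bounds.reverse()
    # Stretch or compress each sample segment onto its reference segment.
    aligned = [0.0] * n
    for i in range(n_segments):
        r0, r1 = ref_bounds[i], ref_bounds[i + 1]
        aligned[r0:r1 + 1] = resample(sample, bounds[i], bounds[i + 1], r1 - r0 + 1)
    return aligned, bounds


# Synthetic demo (not real data): two Gaussian peaks, shifted by 4 points in the sample.
reference = [exp(-(i - 60) ** 2 / 18) + exp(-(i - 140) ** 2 / 18) for i in range(200)]
sample = [exp(-(i - 64) ** 2 / 18) + exp(-(i - 144) ** 2 / 18) for i in range(200)]
aligned, bounds = cow_align(sample, reference, n_segments=8, slack=3)
print(pearson(sample, reference), pearson(aligned, reference))
```

The segment count and the slack are the two tuning parameters of this sketch: more segments and a larger slack allow more flexible warping, at the cost of a larger search and a higher risk of distorting peak shapes.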

The chromatographic signal (e.g. a FID or MS response) is often a combination of analyte signal and baseline. If the baseline contribution becomes too large, it can interfere with the data modeling and thus lead to a misleading interpretation of the final model. Baseline correction can be quite straightforward when the baseline is a simple additive offset, but more complex contributions, such as linear or even higher-order polynomial baselines, can occur.
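A polynomial baseline of the kind described above can be estimated, for illustration, by least-squares fitting to points that are known to be free of peaks and then subtracting the fit from the whole signal. The sketch below assumes that such peak-free indices are supplied by the user; `fit_polynomial`, `subtract_baseline` and `baseline_idx` are hypothetical names, and this is not necessarily the correction used in the study. A degree of 0 corresponds to the simple additive offset, 1 to a linear baseline, and so on.

```python
from math import exp


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via the normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [[sum(x ** (i + j) for x in xs) for j in range(m)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][m] / rows[i][i] for i in range(m)]


def subtract_baseline(signal, baseline_idx, degree=1):
    """Fit a polynomial to the peak-free points and subtract it from the whole signal."""
    n = len(signal)

    def scale(i):
        # Map the index to [-1, 1] to keep the normal equations well conditioned.
        return 2.0 * i / (n - 1) - 1.0

    coeffs = fit_polynomial([scale(i) for i in baseline_idx],
                            [signal[i] for i in baseline_idx], degree)
    baseline = [sum(c * scale(i) ** k for k, c in enumerate(coeffs)) for i in range(n)]
    return [s - b for s, b in zip(signal, baseline)], baseline


# Synthetic demo (not real data): one Gaussian peak on a quadratic baseline.
peak = [exp(-(i - 60) ** 2 / 18) for i in range(120)]
drift = [0.5 + 0.01 * i + 1e-4 * i * i for i in range(120)]
signal = [p + d for p, d in zip(peak, drift)]
peak_free = list(range(0, 40)) + list(range(80, 120))
corrected, baseline = subtract_baseline(signal, peak_free, degree=2)
```

Choosing the peak-free points is the critical step in this approach: if analyte signal is included among them, part of the peak is fitted as baseline and removed.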

This study illustrates the effect of proper preprocessing of both real and simulated chromatographic data on the subsequent data modeling. The focus will be on the analysis of food matrices using multidimensional chromatographic techniques.
