668a Cutting the Gordian Chemometrics Knot – Advantages of the New Method of Science-Based

Ralf Marbach, VTT Electronics, Kaitoväylä 1, Oulu, Finland

A new method for multivariate calibration has recently become available that combines the best features of "classical" (also called "physical") calibration and "inverse" (or "statistical") calibration, the latter being the approach used by PLS and PCR. By estimating the spectral signal in the physical way and the spectral noise in the statistical way, the prediction accuracy of the inverse model can be combined with the low cost and ease of interpretation of the classical model, including "built-in" proof of specificity of response. The cost of calibration is significantly reduced compared to today's standard practice of statistical calibration (PLS, PCR) because the need for lab-reference values is virtually eliminated. Also eliminated is the need to artificially "upset" an industrial process in order to collect on-line calibration standards that vary over a certain range; a smoothly running process with a minimum of analyte variation is sufficient. R&D time and expense required for developing new, application-specific PAT instruments can also be significantly reduced because spectrometer hardware performance can be translated directly into user-relevant output accuracy, which greatly simplifies the setting of hardware specifications. Another benefit is that the correct definitions of the two "limits of multivariate detection" become clear. The sensitivity is shown to be limited by so-called "spectral noise," and the specificity is shown to be limited by potentially existing "unspecific correlations." Both limits are testable from first principles, i.e., from measurable pieces of data and without the need to perform any calibration. Applying the limits to statistical calibration reveals why current PLS and PCR results are often affected by unspecific correlations (which, unlike spurious correlations, do not disappear when PLS or PCR is applied to ever larger data sets).
The limits of sensitivity and specificity are exactly defined, and the importance of applying spectroscopic expertise and application knowledge to the calibration process is stressed. The method is demonstrated on a well-known data set of near-infrared spectra from pharmaceutical tablets.
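
The idea of estimating the signal physically and the noise statistically can be illustrated with a minimal sketch. The abstract does not state a formula, so the sketch assumes a matched-filter form for the regression vector, b = Σ⁻¹g / (gᵀΣ⁻¹g), where g is the analyte response spectrum (taken from spectroscopic knowledge) and Σ is the covariance of spectra recorded while the process runs smoothly. The function name, the `ridge` stabilizer, and all data below are hypothetical and synthetic; they are not taken from the tablet data set.

```python
import numpy as np


def sbc_regression_vector(g, noise_spectra, ridge=1e-8):
    """Sketch of a matched-filter regression vector (assumed form).

    g             : analyte response spectrum per unit concentration, shape (n_channels,)
    noise_spectra : spectra with (nearly) constant analyte level, shape (n_samples, n_channels)
    ridge         : small relative diagonal loading to keep the covariance invertible
    """
    centered = noise_spectra - noise_spectra.mean(axis=0)
    cov = centered.T @ centered / (noise_spectra.shape[0] - 1)
    n = cov.shape[0]
    cov = cov + ridge * np.trace(cov) / n * np.eye(n)
    w = np.linalg.solve(cov, g)   # Sigma^-1 g
    return w / (g @ w)            # scale so that b . g == 1


# Synthetic example: one analyte band, one overlapping interferent band.
rng = np.random.default_rng(0)
channels = np.arange(20)
g = np.exp(-0.5 * ((channels - 8) / 2.0) ** 2)            # analyte response (assumed known)
interferent = np.exp(-0.5 * ((channels - 11) / 3.0) ** 2)  # unknown to the calibration

# "Smoothly running process": interferent varies, analyte does not.
n_samples = 200
noise_spectra = (
    rng.normal(size=(n_samples, 1)) * interferent
    + 0.01 * rng.normal(size=(n_samples, channels.size))
)

b = sbc_regression_vector(g, noise_spectra)

# Predict the concentration change of a spectrum that differs from the mean
# by 0.5 units of analyte.
x_mean = noise_spectra.mean(axis=0)
x_new = x_mean + 0.5 * g
y_pred = (x_new - x_mean) @ b
```

Because the vector is scaled so that its inner product with g is one, a pure analyte change is reported at its true size, and no lab-reference values enter the computation; only g and the noise spectra are needed.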