Peak Fitting (aka Curve Fitting) in Peak® Spectroscopy Software

Overview

Peak Fitting uses the Levenberg-Marquardt algorithm (LMFit), which is widely used for non-linear curve-fitting problems and is well documented in the literature. Starting from initial estimates, it varies the peak parameters, calculates a spectrum from those peaks, and evaluates the goodness of fit to the sample spectrum. The goodness-of-fit metric is χ² (chi-squared), the sum of the squares of the residual spectrum, where the residual is the sample spectrum minus the calculated spectrum.

LMFit is iterative and requires initial user input, and the resulting fit is only as good as the initial peak estimates provided. LMFit will always return a solution, but that does not mean the solution is optimal or even valid. A non-linear problem can have several solutions, some of which are local minima rather than the best fit. This is especially likely if the initial peak estimates are far from the real answer. Another problem is over-fitting: any spectrum can be fitted given enough starting peaks, but those peaks may not correspond to real peaks in the sample. The user is responsible for providing good input and for interpreting the results.
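The goodness-of-fit calculation described above can be illustrated with a short Python sketch. This is not the code used by the Peak software; the function names, the two-band "sample", and the Lorentzian parameterization (peak height and full width at half height) are assumptions made for illustration only. It shows how the chi-squared value grows as the peak estimates move away from the true peaks.

```python
def lorentzian(x, center, height, fwhh):
    """Lorentzian band defined by its peak height and full width at half height."""
    half_width = fwhh / 2.0
    return height * half_width**2 / ((x - center) ** 2 + half_width**2)


def model_spectrum(xs, peaks):
    """Sum the Lorentzian bands in `peaks`, given as (center, height, fwhh) tuples."""
    return [sum(lorentzian(x, *p) for p in peaks) for x in xs]


def chi_squared(sample, calculated):
    """Sum of squared residuals, where residual = sample - calculated."""
    return sum((s - c) ** 2 for s, c in zip(sample, calculated))


# Hypothetical two-band "sample" on a 1 cm^-1 grid (illustration only).
xs = [1600.0 + i for i in range(101)]
true_peaks = [(1640.0, 1.0, 10.0), (1660.0, 0.5, 12.0)]
sample = model_spectrum(xs, true_peaks)

# Two sets of starting estimates: one close to the truth, one far from it.
good_guess = [(1641.0, 0.9, 10.0), (1659.0, 0.5, 12.0)]
poor_guess = [(1620.0, 0.9, 10.0), (1680.0, 0.5, 12.0)]

chi2_exact = chi_squared(sample, model_spectrum(xs, true_peaks))
chi2_good = chi_squared(sample, model_spectrum(xs, good_guess))
chi2_poor = chi_squared(sample, model_spectrum(xs, poor_guess))

print(chi2_exact, chi2_good, chi2_poor)
```

An optimizer such as Levenberg-Marquardt repeatedly adjusts the peak parameters to reduce this value, which is why estimates that start far from the real peaks can settle into a poor local minimum.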

General Guidelines

  • Peak Fitting works best on small spectral regions containing the overlapping peaks.
  • Before launching the Peak Fitting application, zoom in on the spectral region you want to fit. The displayed region will be used for the peak fitting.
  • If the data is noisy, results may benefit from smoothing before peak fitting.
  • If the data has a sloping baseline, the results may benefit from baseline correction before peak fitting.
  • It is important to account for all peaks, including 'buried' peaks.
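As a rough illustration of the smoothing and baseline-correction guidelines above, the following Python sketch applies a simple boxcar smooth and subtracts a straight-line baseline drawn between the end points of the region. These are generic techniques chosen for brevity; they are assumptions for illustration and are not necessarily the methods the Peak software provides.

```python
def moving_average(ys, window=5):
    """Boxcar smooth; the averaging window is truncated at the ends of the data."""
    half = window // 2
    smoothed = []
    for i in range(len(ys)):
        lo = max(0, i - half)
        hi = min(len(ys), i + half + 1)
        smoothed.append(sum(ys[lo:hi]) / (hi - lo))
    return smoothed


def subtract_linear_baseline(xs, ys):
    """Subtract the straight line joining the first and last points of the region."""
    x0, y0 = xs[0], ys[0]
    slope = (ys[-1] - y0) / (xs[-1] - x0)
    return [y - (y0 + slope * (x - x0)) for x, y in zip(xs, ys)]


# Hypothetical data: a single spike sitting on a sloping baseline.
xs = [float(i) for i in range(11)]
ys = [2.0 + 0.1 * x + (1.0 if x == 5.0 else 0.0) for x in xs]

corrected = subtract_linear_baseline(xs, ys)
smoothed = moving_average([0.0, 0.0, 3.0, 0.0, 0.0], window=3)
```

Both operations change the data that the fit sees, so the same pre-processing should be applied consistently when comparing fits.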

Tutorial

  • Load the 'synthetic_protein.spc' datafile into a workspace. It is installed in the 'peakFit' sub-folder under the PeakSpectroscopy Documents folder.
  • In the 'Analysis' toolbox, choose 'Peak Fitting'.
  • Click the 'Load Peak Fitting Application' button.
  • Click the 'Load Estimates Table' button and select the file 'synthetic_protein.pfit'.

The 'synthetic_protein.spc' spectrum was calculated from the peaks in the table below. All of the peak shapes are Lorentzian. The peaks simulate a mid-infrared protein absorbance spectrum. Also in the peakFit sub-folder is a saved Peak Estimates table, named 'synthetic_protein.pfit'.

Center    Amplitude   FWHH
1627.9    1.00        9.5
1636.7    0.10        9.75
1645.7    0.25        10.2
1656.3    0.65        10.25
1666.7    0.36        9.35
1678.2    0.13        10.05
1690.8    0.05        9.75
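For readers who want to reproduce a spectrum like this outside the software, here is a minimal Python sketch that sums seven Lorentzian bands using the Center, Amplitude, and FWHH values listed above. It assumes that Amplitude means the height of the band at its center and that the x-axis runs from 1600 to 1720 in 0.5 steps; neither detail is stated in this document, and the result is not guaranteed to match 'synthetic_protein.spc' point for point.

```python
# (center, amplitude, fwhh) values copied from the table above.
PEAKS = [
    (1627.9, 1.00, 9.5),
    (1636.7, 0.10, 9.75),
    (1645.7, 0.25, 10.2),
    (1656.3, 0.65, 10.25),
    (1666.7, 0.36, 9.35),
    (1678.2, 0.13, 10.05),
    (1690.8, 0.05, 9.75),
]


def lorentzian(x, center, amplitude, fwhh):
    """Lorentzian band; `amplitude` is assumed to be the height at the center."""
    half_width = fwhh / 2.0
    return amplitude * half_width**2 / ((x - center) ** 2 + half_width**2)


def synthesize(xs, peaks):
    """Return the sum of all bands at each x position."""
    return [sum(lorentzian(x, *p) for p in peaks) for x in xs]


# Assumed x-axis: 1600 to 1720 in steps of 0.5.
xs = [1600.0 + 0.5 * i for i in range(241)]
spectrum = synthesize(xs, PEAKS)
```

Because the bands overlap, the summed spectrum at each band center is higher than that band's own amplitude, which is one reason peak heights read directly from a spectrum are only starting estimates.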

The Peak Estimates Table

The Peak Estimates table looks like this. Designations such as (UU) after the Center, Height, and FWHH values encode the constraints on those settings. A 'B' denotes that the Bounds Handling for that value is Bounded, an 'F' that it is Fixed, and a 'U' that it is Unbounded. For instance, '(B,U)' after a Height value means that the Low Bound is Bounded and the High Bound is Unbounded. Constraints can be applied using the 'Edit Peak' button.

Initial Peak Estimates for Peak Fitting (synthetic_protein.pfit).

After loading the spectrum and the estimates table, click the 'Fit Peaks' button. The fitting will be performed and the results displayed:

Peak Fit Results.

Manual Peak Selection

The first step in Peak Fitting is to tell the program the nominal positions of the peaks that make up the spectrum. This can be done manually with the mouse, by clicking the 'Add Peak' button, by clicking the 'Find Peaks' button, or by loading a saved Peak Estimates table.

Peak Selection with the Mouse

It helps to overlay the 2nd derivative by checking the '2nd derivative' box. The 2nd derivative is useful for finding buried peaks, because a minimum in the 2nd derivative marks the location of a peak.

Spectrum with overlaid 2nd Derivative
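The idea that a buried peak shows up as a minimum in the 2nd derivative can be sketched in Python as follows. The two-band spectrum, the function names, and the simple central-difference derivative are assumptions for illustration; this is not the algorithm used by the software. The weaker band at 1648 does not produce its own maximum in the spectrum, but it does produce a negative minimum in the 2nd derivative.

```python
def lorentzian(x, center, height, fwhh):
    """Lorentzian band defined by its peak height and full width at half height."""
    half_width = fwhh / 2.0
    return height * half_width**2 / ((x - center) ** 2 + half_width**2)


def second_derivative(ys, step):
    """Central-difference estimate; end points are dropped, so d2[j] belongs to ys[j + 1]."""
    return [(ys[i - 1] - 2.0 * ys[i] + ys[i + 1]) / step**2
            for i in range(1, len(ys) - 1)]


def negative_minima(xs, ys):
    """Return x positions where the 2nd derivative has a negative local minimum."""
    d2 = second_derivative(ys, xs[1] - xs[0])
    found = []
    for j in range(1, len(d2) - 1):
        if d2[j] < 0.0 and d2[j] < d2[j - 1] and d2[j] < d2[j + 1]:
            found.append(xs[j + 1])  # shift by one to map back onto the x-axis
    return found


# Hypothetical spectrum: a strong band with a weaker band buried on its shoulder.
xs = [1600.0 + 0.5 * i for i in range(201)]
ys = [lorentzian(x, 1640.0, 1.0, 10.0) + lorentzian(x, 1648.0, 0.3, 10.0) for x in xs]

# Ordinary maxima of the spectrum versus minima of its 2nd derivative.
maxima = [xs[i] for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1]]
minima = negative_minima(xs, ys)
```

This sketch uses noise-free data. Differentiation amplifies noise, so with measured spectra some smoothing is usually needed before the minima are meaningful.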

To manually create a peak, right-click the mouse at a peak location. A peak marker is created at the mouse location. The peak marker consists of three elements: the Center Marker and two Width Markers. The width is initially set to the default width provided in the Peak Options table.

A Peak Marker.

The position, height, and width of the peak can be changed by dragging a marker with the left mouse button. When positioned over the Center Marker, the mouse cursor changes to indicate that the peak can be moved up and down and left and right. Moving the Center Marker moves the Width Markers along with it.

The Peak Center Marker.

When positioned over a vertical width marker, the mouse cursor changes to indicate that the width marker can be moved. Moving a width marker automatically moves the other width marker as well but leaves the Center Marker where it is.

The Peak Width Markers.

When a peak marker is created with the mouse, a corresponding entry is added to the 'Peak Estimates' table. To remove a peak marker using the mouse, position the mouse over either the Center Marker or one of the Width Markers, and right-click. Alternatively, select the row corresponding to the peak in the Peak Estimates table and click the 'Remove Peak' button. The left mouse button can still be used to expand (zoom in on) the spectral display, as long as the cursor is not on a marker when the button is pressed.

Adding Peaks with the Keyboard

Peaks can also be added manually by clicking the 'Add Peak' button. This dialog box appears:
Editing a Peak.

The peak Center, Height, and FWHH can be entered manually. In addition, a peak can be given a name, which can be useful in the analysis of the peaks. For instance, in this synthetic protein spectrum, peaks can be assigned to Beta Sheets, Alpha Helices, Amide bands, and so forth. The 'Bound Handling' entries allow for restricting the peak search. The choices are:

Unbounded   Allow the 'Value' to vary without any restrictions.
Bounded     Restrict the 'Value' to be in the range of 'Low Bound' and/or 'High Bound'.
Fixed       Do not allow the 'Value' to change during the peak fitting optimization.

The 'Low Bound' value is only applied when 'Low Bound Handling' is set to 'Bounded', and the 'High Bound' value is only applied when 'High Bound Handling' is set to 'Bounded'. Note: nothing inherently prevents a peak height from becoming negative during LMFit, so by default the Height Low Bound is set to 0 and the Height Low Bound Handling is set to Bounded.
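The bound-handling rules described here can be modeled with a small Python sketch. The `Parameter` class and `constrain` function are hypothetical names invented for illustration, and simple clamping is only one way an optimizer might honor bounds; how the software enforces them internally is not described in this document. The sketch also assumes that setting either handling to Fixed freezes the value.

```python
from dataclasses import dataclass

UNBOUNDED, BOUNDED, FIXED = "Unbounded", "Bounded", "Fixed"


@dataclass
class Parameter:
    """One peak setting (Center, Height, or FWHH) with its bound handling."""
    value: float
    low_bound: float = 0.0
    high_bound: float = 0.0
    low_handling: str = UNBOUNDED
    high_handling: str = UNBOUNDED


def constrain(param, proposed):
    """Return the value that would be accepted for a proposed update."""
    # Assumption: a Fixed handling on either side keeps the current value.
    if FIXED in (param.low_handling, param.high_handling):
        return param.value
    # Each bound is applied only when its own handling is Bounded.
    if param.low_handling == BOUNDED:
        proposed = max(proposed, param.low_bound)
    if param.high_handling == BOUNDED:
        proposed = min(proposed, param.high_bound)
    return proposed


# Default Height behavior described above: Low Bound of 0, Bounded; High Bound Unbounded.
height = Parameter(value=0.5, low_bound=0.0, low_handling=BOUNDED)
center = Parameter(value=1656.3, low_handling=FIXED, high_handling=FIXED)

clamped = constrain(height, -0.2)      # a negative height is pulled back to the low bound
unchanged = constrain(height, 3.0)     # no high bound is applied
frozen = constrain(center, 1660.0)     # a fixed center ignores the proposed value
```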

Automatic Peak Selection

Clicking the 'Find Peaks' button will perform a 2nd derivative analysis of the spectrum and select peaks on that basis. 'Find Peaks' can be useful, but manually selecting peaks usually yields better results because the eye of an analyst is better at discerning fine structure than a computer algorithm. In this graphic, the 2nd derivative overlay was used to select the positions of the peaks:

Automatic Peak Finding.

And the table of Peak Estimates looks like this:

The table of Peak Estimates from automatic peak finding.