It is the ability of the method to obtain test results proportional to the concentration of analyte.
What are the components of linearity?
Figure 1
How to evaluate linearity?
Figure 2
As per ICH Q2(R1) (formerly Q2B), linearity should first be evaluated by visual inspection of the response versus concentration plot. In practice, once a linear relationship has been established, a statistical calculation can be used to generate the linearity data from the regression line by the least-squares method.
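To make the least-squares step concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `least_squares_fit` and the six data points are hypothetical and only illustrate the arithmetic; they are not the sodium data discussed in this article.

```python
import math

def least_squares_fit(conc, resp):
    """Fit resp = slope * conc + intercept by ordinary least squares."""
    n = len(conc)
    x_mean = sum(conc) / n
    y_mean = sum(resp) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - x_mean) ** 2 for x in conc)
    syy = sum((y - y_mean) ** 2 for y in resp)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(conc, resp))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r

# Hypothetical six-point calibration (illustration only).
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
resp = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept, r = least_squares_fit(conc, resp)
print(f"slope={slope:.4f} intercept={intercept:.4f} r={r:.5f}")
```

The same slope and intercept are what a spreadsheet regression would report for these points; the sketch only exposes the sums behind them.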
As Socrates used to say, “The beginning of wisdom starts by the definition of terms”. So let’s define the output of the regression analysis using the following example. After running a six-point calibration curve for sodium (Na) on ICP-OES, we obtained the responses shown in the table below.
Figure 3
Figure 4
The quality of linearity data can be judged primarily by examining the correlation coefficient and y-intercept of the linear regression line for the response versus concentration plot. Correlation coefficients of > 0.990 for drug product, or > 0.998 for drug substance, are generally regarded as showing a good fit of the data to the regression line. However, evaluating linearity data solely on those parameters does not give a true measure of linearity, because different datasets can produce identical regression statistics (Check source). Therefore, visual evaluation remains essential when assessing linearity, together with examination of the residuals from the linear regression.
The residuals are the differences between the experimental signal and the calculated signal. They provide information on how well the line fits through the data points.
Figure 5
Why go to the trouble of creating the residual plot?
The residual plot shows whether the line is a good model for the relationship between concentration and response. Generally, the closer the points are to the horizontal zero line, the smaller the error in the predicted value. If the points are randomly scattered above and below the horizontal line and you can’t discern any trend, then the line is probably a good model for the data. If the residuals trend upward or downward, or curve up and then down, then the line is not a good fit.
Figure 6
Figure 7
Figure 8
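The reading of a residual plot described above can be sketched numerically. The snippet below is a minimal illustration, assuming hypothetical data and hypothetical helper names (`fit_line`, `residuals`, `sign_pattern`). It fits a straight line, computes the residuals, and prints their signs in concentration order, so a run of same-sign residuals stands out the way a curve would in the plot.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def residuals(x, y):
    """Measured response minus the response predicted by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

def sign_pattern(res):
    """Signs of the residuals, ordered by concentration."""
    return "".join("+" if r > 0 else "-" for r in res)

conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Hypothetical responses that scatter around a straight line.
scattered = residuals(conc, [2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
# Hypothetical responses that follow a curve (response = conc squared).
curved = residuals(conc, [c ** 2 for c in conc])

print(sign_pattern(scattered))  # alternating signs, no trend
print(sign_pattern(curved))     # positive at both ends, negative in the middle
```

A sign pattern is only a rough numerical stand-in for the plot; the visual inspection remains the primary check.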
In practice, software like JMP can perform all the statistical calculations. If you don’t have access to it, the data treatment can be done in an Excel spreadsheet, as shown below:
The output of the regression analysis is generated using the LINEST function in Excel, as shown in the table below:
Figure 9
Data | Description | Value calculated by LINEST |
*Equation of the line, y = mx + c | The relationship between the independent variable (x) and the dependent variable (y) is expressed by the equation of the line. | y = 9312.1634x – 759.6204 |
*Intercept (c) | The value of y when x equals zero. | -759.6204 |
*SE_Intercept | The standard error of the intercept. | 113.7429 |
CI at 95% of intercept | The 95% confidence interval of the intercept. | (-1075.42) – (-443.82) |
*Slope (m) | Gradient of the response curve. It describes how the instrument response changes with concentration. | 9312.1634 |
SE_Slope | The standard error of the slope. | 36.8192 |
CI at 95% of slope | The 95% confidence interval of the slope. | (9209.94) – (9414.39) |
*Coefficient of determination, R² | The square of the correlation coefficient. It represents the proportion of the variance in the response that the model explains. | 0.9999374 |
*Correlation coefficient, R (Multiple R) | The correlation between the predicted and observed values. The closer the value is to 1, the better the correlation. It can be obtained in Excel from the Regression tool under Data Analysis on the Data tab. | 0.9999687 |
*SS_Regression | The regression sum of squares is the amount of variability in the response that is accounted for by the regression line. | 2292203183 |
*SS_Residual / Error sum of squares | The residual sum of squares is the variability about the regression line. | 143337.769 |
*SS_Total | The total sum of squares is the total amount of variability in the response. | 2292346521 |
*The regression statistics marked with an asterisk can be calculated in an Excel spreadsheet.
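The table values can also be cross-checked outside Excel. The sketch below is a minimal illustration that reuses the slope, intercept, standard errors, and sums of squares from the table. It assumes a two-sided 95% Student's t value of 2.7764 for n - 2 = 4 degrees of freedom (six calibration points), and it takes the intercept as negative, as in the equation of the line. The helper name `confidence_interval` is hypothetical.

```python
import math

# Values copied from the LINEST table above (sodium calibration example).
slope, se_slope = 9312.1634, 36.8192
# Intercept taken as negative, matching the equation of the line and its CI.
intercept, se_intercept = -759.6204, 113.7429
ss_regression, ss_residual = 2292203183.0, 143337.769

# Assumption: two-sided 95% Student's t value for n - 2 = 4 degrees of
# freedom (six calibration points), rounded to four decimals.
T_CRIT_95_DF4 = 2.7764

def confidence_interval(estimate, std_error, t_crit=T_CRIT_95_DF4):
    """Return (lower, upper) limits: estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

slope_ci = confidence_interval(slope, se_slope)
intercept_ci = confidence_interval(intercept, se_intercept)

# SS_Total is the sum of the regression and residual sums of squares,
# and R squared is the share of SS_Total explained by the regression.
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)

print("Slope CI:     %.2f to %.2f" % slope_ci)
print("Intercept CI: %.2f to %.2f" % intercept_ci)
print("R squared:    %.7f" % r_squared)
print("R:            %.7f" % r)
```

Under these assumptions the limits agree with the intervals listed in the table to two decimals, which is a quick way to confirm that the table entries are internally consistent.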
How to carry out linearity?
Beginning with the end in mind, let’s work our way backward to examine which parameters we need to fulfill in the linearity study. Typically, by the end of the method development phase we have an idea of the working range, because the LOQ has been established. For the validation itself, ICH Q2B gives guidelines on the ranges that should be considered:
Figure 10
Figure 11
Let’s dive into the following example for a better understanding of the process. Our team received a newly developed method for sodium (Na) to be validated. Based on the preliminary runs, the consensus was to use matrix-free standards to determine the sodium content of the sample. After establishing the LOQ using (10 × SD / slope), where SD is the standard deviation of the response, a six-point calibration curve is prepared from 80 – 120% of the test concentration.
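As a small illustration of the LOQ expression, the sketch below assumes that the numerator term is the standard deviation of the response and that the slope comes from the calibration curve, which is the ICH form LOQ = 10σ/S. The function name and the two input values are hypothetical.

```python
def limit_of_quantitation(sd_response, slope):
    """LOQ = 10 * (standard deviation of the response) / slope."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return 10.0 * sd_response / slope

# Hypothetical inputs: SD of the response in signal units, slope in
# signal units per concentration unit. The LOQ comes out in concentration units.
example_loq = limit_of_quantitation(sd_response=50.0, slope=10000.0)
print(f"LOQ = {example_loq:.3f} concentration units")
```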
Choosing the linearity range is vital, as it ensures we’re not overlooking any change in the dependent variable. Estimating the linearity and determining the linear dynamic working range have to meet certain criteria, which we will discuss in this example.
The first step is preparing the calibration solution levels. While the minimum range required for an assay is 80 – 120%, there’s no cookie-cutter approach. In practice, you start with an educated guess at the proper range, and the range is eventually evaluated by investigating the validation characteristics of linearity, accuracy, and precision. In this example, a wider range of 50 – 130% of the target sample concentration was chosen. To prepare the standards, you first need to know the target concentration of the compound under study (Analyte A), which is 1.0 mg/mL.
As indicated in the table below, the target concentration is 1.0 mg/mL, so the 50% level is equal to 0.5 mg/mL. How?
Multiply: 1.0 mg/mL × (50/100) = 0.5 mg/mL.
Now we know that the working standard concentrations are 0.5, 0.7, 0.85, 1, 1.15, and 1.3 mg/mL. What we need to do now is figure out the stock standard concentration in order to prepare the working standards by serial dilution. In this example, the stock solution has a concentration of 2.67 mg/mL.
Figure 12
Figure 13
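The level arithmetic above can be sketched in a few lines. The percentage levels, the 1.0 mg/mL target, and the 2.67 mg/mL stock come from the text. The 10 mL final volume is a hypothetical flask size, and for simplicity the sketch dilutes each level directly from the stock using C1 × V1 = C2 × V2 rather than serially, so the volumes are an illustration, not the article's preparation scheme.

```python
TARGET_MG_PER_ML = 1.0      # target concentration of Analyte A (from the text)
STOCK_MG_PER_ML = 2.67      # stock solution concentration (from the text)
LEVELS_PERCENT = [50, 70, 85, 100, 115, 130]
FINAL_VOLUME_ML = 10.0      # hypothetical volumetric flask size

def working_concentration(target, percent):
    """Concentration of a level expressed as a percentage of the target."""
    return target * percent / 100.0

def stock_volume_needed(stock_conc, working_conc, final_volume):
    """Volume of stock to dilute to final_volume (C1 * V1 = C2 * V2)."""
    return working_conc * final_volume / stock_conc

plan = []
for percent in LEVELS_PERCENT:
    conc = working_concentration(TARGET_MG_PER_ML, percent)
    volume = stock_volume_needed(STOCK_MG_PER_ML, conc, FINAL_VOLUME_ML)
    plan.append((percent, conc, volume))
    print(f"{percent:>3}%  {conc:.2f} mg/mL  {volume:.3f} mL stock to {FINAL_VOLUME_ML:.0f} mL")
```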
After the working standards were run through the instrument, the calibration curve was constructed for visual examination, to see how the data lie along the least-squares line.
Figure 14
Numbers are arbitrary in a vacuum. The instrument gives out responses in the form of numbers, and it’s our responsibility to know how they’re generated and how they can benefit us. By the same line of thought, we want to examine the quality of the linear response over the specified concentration range.
How to examine the quality of the linear response?
The quality of the linear response is judged partly by inspecting the slope of the response factor versus concentration plot. Ideally, the closer this slope is to zero, the higher the quality of the linear response. To do that, the response factor (RF) is determined first.
The response factor is calculated by dividing the area of the response by the concentration of the analyte.
RF = Area / Conc.
Generally, the response factor is determined at each measured concentration and plotted against the analyte concentration. The slope can then be determined using the LINEST function in Excel. As you can see in the example, the response factor is independent of the concentration, which points to true linearity over the range of concentrations. Additionally, the slope in the example is close to zero, which improves confidence in the quality of the linear response.
Figure 15
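The response factor check can be sketched as follows. The concentrations are the working standard levels from the example. The peak areas are hypothetical placeholders, since the measured values are only shown in the figure. The helper names, and the idea of expressing the slope as a percentage drift in RF across the range, are assumptions made for illustration.

```python
from statistics import mean, stdev

def response_factors(areas, concs):
    """RF = Area / Conc. at each calibration level."""
    return [area / conc for area, conc in zip(areas, concs)]

def slope_of(x, y):
    """Least-squares slope of y against x."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx

concs = [0.5, 0.7, 0.85, 1.0, 1.15, 1.3]  # mg/mL, levels from the example
areas = [5010.0, 6990.0, 8515.0, 10000.0, 11480.0, 13020.0]  # hypothetical

rf = response_factors(areas, concs)
rf_slope = slope_of(concs, rf)

# Relative spread of the response factors, in percent.
rf_rsd_percent = 100.0 * stdev(rf) / mean(rf)
# Change in RF implied by the slope across the whole range, in percent of mean RF.
rf_drift_percent = 100.0 * abs(rf_slope) * (max(concs) - min(concs)) / mean(rf)

print(f"RF slope: {rf_slope:.2f}")
print(f"RF %RSD:  {rf_rsd_percent:.3f}")
print(f"RF drift across range: {rf_drift_percent:.3f}%")
```

Scaling the slope by the concentration range and the mean RF is one way to judge whether a slope is "close to zero", because the raw slope depends on the units of area and concentration.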
Having fulfilled the visual aspect of linearity, along with the response factor, there is still another layer of conformity to perform. It gives a deeper understanding of the model’s behavior and relies more on statistical calculations to show whether there are any deviations from the assumed linearity: the residual analysis. Essentially, a residual is the difference between the measured value and the value calculated from the slope and intercept obtained by fitting all the data. Typically, residuals should be randomly distributed around a mean of zero.
Figure 16
Figure 17
So now we’ve fulfilled the first parameter of the linearity study, which is the correlation coefficient. The remaining parameters are the y-intercept, the residual standard deviation, and the range. The validated working range is determined by investigating the accuracy and precision parameters of the method.
The y-intercept and residual standard deviation calculation:
The calculation is based on the regression line equation, y = mx + b, at the 100% level, where b is the intercept, m is the slope (reported as “X Variable 1” in the Excel regression output), and x = 1068 µg/mL. The following table shows the calculation for the y-intercept and residual standard deviation.
Figure 18
Figure 19
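A minimal sketch of the two calculations follows, under stated assumptions. It expresses the y-intercept as a percentage of the response predicted at the 100% level (a common way to report it, but an assumption here, since the article's own table is only available as a figure), and it takes the residual standard deviation as the square root of SS_Residual / (n - 2). The slope and intercept in the first example are hypothetical. The SS_Residual value is borrowed from the LINEST table of the sodium example.

```python
import math

def intercept_percent_of_target(slope, intercept, x_target):
    """y-intercept as a percentage of the predicted response at x_target."""
    predicted = slope * x_target + intercept  # y = mx + b at the 100% level
    return 100.0 * intercept / predicted

def residual_standard_deviation(ss_residual, n_points):
    """Square root of the residual sum of squares over n - 2 degrees of freedom."""
    return math.sqrt(ss_residual / (n_points - 2))

# Hypothetical slope and intercept; x is the 100% level from the text (ug/mL).
pct = intercept_percent_of_target(slope=25.0, intercept=150.0, x_target=1068.0)
print(f"y-intercept is {pct:.2f}% of the response at the 100% level")

# SS_Residual from the LINEST table (sodium example), six calibration points.
sd_res = residual_standard_deviation(ss_residual=143337.769, n_points=6)
print(f"residual standard deviation = {sd_res:.2f}")
```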
Note: The correlation coefficient is not a measure of linearity but rather a measure of how well the data fit the model. It only reflects how much of the change in response is due to the change in concentration.