The Sound Velocity Experiment as an example.

When students use Excel to draw a trendline to their data, they often are confused by how one can evaluate the quality of that fit and how one can introduce a different function for the fit. This page is to help you overcome the shortcomings. I will describe the procedure using data from one of the experiments: the sound velocity measurements.

**Step 1**: __Enter the data.__ Spread
sheet is shown below.

It is always helpful to label the data, so
that you can keep track of your manipulations with minimal
confusion. In the case of our example, there is a set of data for
the **time** delay in the sound front arrival as a
function of the **distance** between the microphone
and the speaker:

**Step 2**: __Defining the
model/formula for fitting__. Since we are fitting to the
linear equation:

distance = a + velocity*time or y = a + b*x (1)

our formula would have only two parameters, *a*
and *b*, that will be used to "fit" the data
using the Solver in Excel. We create a line where these
parameters are listed. I used cells "D2" and
"E2" to put the values for *a* and *b*,
respectively.

**Step 3**: __Use your model with the
first guess for parameter values__ (1 and 1 on the spreadsheet
below) to compute a distance value at the time values given in
the initial data. Create a chart that displays both the
experimental and the **model** values of distance.
Then you can start developing a good initial guess for parameter
values used in the Solver.

Look at the screen snapshot below for an example. Notice that no number values were used in developing this formula. Instead, the cell address was used. Cell “A4” holds the time value used in this model to compute the model value for distance at time x = 0.00334 (s). The other cell addresses contain the parameter values that help define the model. The “$” sign in front of the column and row identifiers “fix” the cell reference to that specific location during copying and pasting the formula in a different location on the spread sheet. (For example, if it were copied into the cell directly below, where the formula is typed in, the fixed cell values (with "$") will copy without change. The “A4” cell, not fixed, will copy and change to reference cell “A5”, one row below. Spatial relationships are maintained from one row to the next.

Once the model is defined and computes the distance value for initial time value, copy and paste the formula into the remaining cells of the column. It should calculate appropriate 'model' values correctly.

**Step 4**: __Compute the error__
between the model values and the data (true) values

The formula in the cell "D4" for **error**
is entered as: " = C4 – B4". Copy and paste it to
remaining cells

**Step 5**: __Compute the square of the
error value__

The formula for **error squared**
in cell "E4" was typed in as: "= D4^2"
This formula was copied into the remaining cells of that column.

**Step 6**: __Create a graph__
showing both data and model values.

You can use this to adjust model parameters to develop a good initial guess for the solver.

**Step 7**: __Compute the sum of the squared error terms. __

In the example, I placed in the "E21" cell as
"=SUM(E4:E17)" and titled "**sum Err sq**"
in "E20".

**Step 8**: __Use the ____Solver____
to minimize__ the sum of the squared error terms by varying the
values of the model parameters. The **Solver**
command is listed under **Tools** on the menu bar.
If the Solver is not already included in your Tools list, then
you would have to Add it in.

When you open the Solver Tool, you get a box
titled “Solver Parameters”:

- The first field must reference the value that you are trying to
optimize – in this case the sum of the squared error terms,
cell "$E$21" (if you set up your sheet like mine).

- The second field is a bubble entry for what you want to do –
in this case "Min" - for MINIMIZE

- The third field identifies the cells that you want to adjust to
minimize your target cell – these are the model parameters or
cell $D$2 and $E$2.

Then, hit the **Solve** button
- optimized fitting will appear.

You can use similar approach to introduce any other fitting function with more than two parameters that were used here.

As we saw above, upon minimization of the sum of errors
squared, its value did not get to be zero. If you use the built
in trend line option in Excel graphics, it characterizes the
quality of the trendline by R^{2} value. That value does
not tell you how accurate the slope or the intercept are. It is
more appropriate to introduce the standard deviation for each of
them, similar to the analysis of standard deviation in
measurements for a single value. For details of general treatment
I refer you to GNS and will give here
only the treatment for the example above of a linear function.

In this case, the standard deviation of the slope, s_{b},
and the standard deviation of the intercept, s_{a}, can
be calculated using **standard deviation of the fit**,
*s*, and D, as shown in the
equations below:

To implement these features, following additional steps are introduced.

**Step 9**: __Compute the square of the x value, Sum_x2,
and the sum of x, Sumx:__

The formula in cell F4 was typed in as:
"= A4^2" and copied into the remaining cells. Then in
F21 cell **Sum_x2** was calculated as
"=SUM(F4:F17)". Similarly, in A21 cell **Sumx**
was calculated as "=SUM(A4:A17)"

**Step 10**: __Compute ____D____,
and the standard deviations s, s___{a}__, and s___{b}__:__

In cell C23 **delta** (D) was calculated as: =
$B$21*$F$21-$A$21^2. Standard deviations were calculated using
formulas shown in red next to them.

The error for each parameter is obtained by multiplying its
standard deviation by the student coefficient for this number of
degrees of freedom (14-2=12), *t* ~ 2.2 in $C$27.

Thus, for the example given, the sound velocity is measured : **v = 320.1 ****±**** 1.0 m/s**

On the screenshot I also included the trendline with its equation that Excel generates.