Model Fitting and Regression in MATLAB - YouTube

Channel: LearnChemE

[4]
I'll be showing you some simple model fitting and regression techniques in MATLAB in this
[9]
screencast. What I am first going to do is show you how you can import some data from
[13]
Excel into MATLAB, and we're going to use that data in model fitting and regression
[18]
techniques in MATLAB. I've got some data in a file called bacteria, and this is data
[24]
showing the optical density of a culture of bacteria as a function of time. You'll notice
[29]
for each of these times I have the measurements in triplicate. Back in MATLAB I can go up here to the
[35]
workspace and I can click "Import Data", and I will navigate to where my file is, I am going
[41]
to open that, and then you can see it's highlighted the data, it's not highlighting the title,
[47]
which is what I want, and so I can check this to make sure that that is what I want to import,
[52]
and then go up here to "Import" and you'll see that it imported this as untitled, and I
[57]
can double click there to rename it, so I renamed this to data. If you want you can actually
[62]
save this, so I am going to go up here to save, I'll click on that, and I can save this
[68]
to the folder that I want it to be in. At a later time point, if I want to reload that
[73]
data, I can just do load data, and assuming that I am in the right folder then it will
[79]
actually load it and it will put that array over here in the workspace. Now what I am
[84]
going to do is just plot this, so I am plotting data, all rows of column
[90]
1, and the response is all rows of column 2, and I am plotting these points as black
[95]
squares, and putting the grid on. So this is our plot, I've got time on the x-axis, optical
[101]
density on the left, and you can see that for each of these measurement times I've got
[106]
three different measurements of the bacteria culture. So what I am ultimately trying to
[112]
do is develop a model equation that models the growth of these bacterial cells. One way
[119]
to do this is to fit all those different data points to different polynomials, first defining
[126]
x as our time, y as the response, and here I've got p(1), it's a vector, which is the polyfit
[134]
of our data, a first-order polynomial, so just a line, I am creating an actual function
[140]
that's taking those coefficients of our fit and putting it into polyval, so this would
[146]
actually be a line so that we can plot that. I am doing second-order polynomial, third-order
[152]
polynomial, and then this fourth one is actually a logarithmic fit, so to do this using polyfit
[159]
you actually just take the polyfit of the log of y, and that will give us the equation
[165]
of a logarithmic fit. I am then computing the residuals, I have residuals one, which
[170]
is the residuals between our first-order polynomial, or our line and the actual values, I have
[177]
the second-order, third, and then we have our logarithmic residuals. I am next plotting
[182]
a figure. I have subplots of all the different fits, so the first, second, and third-order
[187]
and then the logarithmic, but now I am going to run this file, so when I type bacteria,
[194]
that's the name of that m script that I showed all the plotting commands in, when I run this
[199]
it spits out two figures. The first one is a figure of all the model fits, so I have
[206]
the first-order, that's the line, second-order, third-order, and then our logarithmic fit.
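The script described above might look roughly like the following minimal sketch. The data here are made-up stand-ins for the imported bacteria measurements, and the variable names (p1, pLog, res1, and so on) are assumptions rather than the names used in the screencast. Note that fitting a line to log(y) corresponds to the exponential model y = exp(b)*exp(m*x).

```matlab
% Hypothetical stand-in for the imported data: column 1 is time,
% column 2 is optical density, three replicates per time point.
t  = repelem((0:30:180)', 3);
od = 0.05*exp(0.01*t) + 0.002*repmat([-1; 0; 1], 7, 1);
data = [t, od];

x = data(:,1);   % time
y = data(:,2);   % response (optical density)

% Polynomial fits of order 1, 2, and 3 (coefficients in descending powers)
p1 = polyfit(x, y, 1);
p2 = polyfit(x, y, 2);
p3 = polyfit(x, y, 3);

% Line fitted to log(y); back-transform with exp to get predictions of y
pLog = polyfit(x, log(y), 1);

% Model predictions at the measured times
f1   = polyval(p1, x);
f2   = polyval(p2, x);
f3   = polyval(p3, x);
fLog = exp(polyval(pLog, x));

% Residuals, taken here as data minus model
res1   = y - f1;
res2   = y - f2;
res3   = y - f3;
resLog = y - fLog;

% Figure 1: the four fits; Figure 2: their residuals
fits   = {f1, f2, f3, fLog};
resids = {res1, res2, res3, resLog};
names  = {'First-order', 'Second-order', 'Third-order', 'Fit to log(y)'};
figure;
for k = 1:4
    subplot(2, 2, k);
    plot(x, y, 'ks', x, fits{k}, 'r-');
    grid on; title(names{k});
end
figure;
for k = 1:4
    subplot(2, 2, k);
    plot(x, resids{k}, 'ko');
    grid on; title([names{k} ' residuals']);
end
```

Residuals that scatter randomly around zero suggest an adequate model, while a visible trend in the residuals suggests the model is missing something.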
[212]
The second plot shows the residuals for each of the fits; again, the residuals are the difference
[217]
between the model and our actual experimental data points. For the first-order fit you can see it's a quadratic
[222]
function which means that that's not a very good fit. We have our second-order residuals,
[227]
those look pretty good, they are somewhat random, I'd say our second and third are pretty
[231]
similar, and then we have our logarithmic fit. Bacteria actually grow exponentially,
[236]
so logarithmic should be the best fit, but these cells are actually in something known
[240]
as the lag-phase, so this is one way that we can plot our different fits and then analyze
[246]
to see how good of a fit each of those models is. So now I am going to show you how to do
[253]
these model fits and regression in MATLAB. If you go up here to Tools in the figure,
[259]
and go down here to Basic Fitting you can choose all sorts of different fits, so I can
[264]
choose a linear fit, and you see that that is placed onto the plot. I can put a quadratic,
[270]
a cubic in there. If you want you can plot the residuals, it's actually going to tell
[276]
me that I can't use the bar plot residuals if I have replicates so I am going to do scatter
[281]
plot, we look down here this is the residual plot for all our different fits, I am actually
[285]
going to remove my linear, so I have two fits here, the quadratic and the cubic. If I want
[292]
I can expand this by clicking on this right arrow, and I can get the coefficients for
[297]
these fits, so here I have the cubic fit, if I want to look at a quadratic fit I can
[302]
do that, and these are the equations, and on the top here it has the form of that equation.
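For readers working at the command line instead of the Basic Fitting window, the same coefficients can be obtained with polyfit. The sketch below uses made-up data and an arbitrary query point, so the numbers are illustrative only; it shows that the coefficients are ordered from the highest power down, which matches the form of the equation shown in the tool.

```matlab
% Hypothetical data, not the screencast's measurements
x = (0:30:180)';
y = 0.05 + 1e-3*x + 2e-6*x.^2 + 1e-8*x.^3;

% Cubic fit: y = pc(1)*x^3 + pc(2)*x^2 + pc(3)*x + pc(4)
pc = polyfit(x, y, 3);
fprintf('y = %.3g*x^3 + %.3g*x^2 + %.3g*x + %.3g\n', pc);

% Evaluate the fitted cubic at a query point (value chosen for illustration)
xq = 150;
yq = polyval(pc, xq);

% The same evaluation written out term by term
yManual = pc(1)*xq^3 + pc(2)*xq^2 + pc(3)*xq + pc(4);
```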
[307]
If I want to put in here a cubic fit, and then I can click on the right arrow here,
[312]
it will actually allow you to interpolate between points, so if I wanted to calculate,
[317]
let's say a time of 150, based upon the cubic model, then I can evaluate. I can also plot
[324]
that if I want, and it goes back here onto the figure, and then if you want to you can
[331]
save all of those to the workspace. With the data fitting tool that I just showed you,
[335]
you are somewhat limited to the types of models that you can use, you have to basically go
[340]
with some of the models that are already built in. Fortunately there is a curve fitting tool
[346]
that is built into MATLAB, and you can actually create custom models. So let's do this in
[351]
MATLAB, I've imported a viscosity Excel file, so this is the viscosity of a fluid as a function
[357]
of the temperature, so I am going to just import this, rename it viscosity, and then I've extracted
[364]
the first column and that's my x-values, the second column is my y-values, and now if I
[369]
open up this cftool, which is curve fitting tool, it brings up this window, and I can
[376]
do a lot of things here, I can bring in my x-data and my y-data, and using this I can
[382]
go over here and I can change the different type of model, here I am going to do a polynomial,
[387]
so this is a first-degree, which is just a line. I can go over here, I can see the goodness
[390]
of fit, so a lot of statistical parameters here. We want to maximize adjusted R-squared,
[396]
so I can go here and look at two, and you'll also notice down here I have the different
[401]
parameters, I can go to three for example, and I can basically go through a bunch of
[405]
different models to see which one gives me the best adjusted R-squared. Now I've put
[411]
in a custom equation, and we see that created a pretty good model here just with these three
[416]
terms, the adjusted R-squared is about 0.9972. So you can play around with a bunch of these
[421]
to determine the best fit for your data. This curve fitting tool can also work with multilinear
[429]
models, so that means you have two regressor variables, such as x(1) and x(2), you just
[433]
have two independent regressors and one output, either y or z, and you can have a second-order polynomial.
[439]
Unfortunately the curve fitting tool just by itself isn't capable of these interaction
[444]
terms, which would be like an x(1) times an x(2), or an x times y, but you can easily
[450]
put a custom equation into the curve fitting tool. So here I've got some multilinear data.
[456]
In column A I have the response, which is, we're going to say is z, and then I've got two regressor
[462]
variables, x and y, and I am going to import this, and I am going to rename, and then I
[469]
am extracting my z-values, that's the first column of our data, the x values is the second
[475]
column, and the y-values is the third column. Now if I type in cftool that brings up the
[481]
curve fitting tool, and I brought in my x, my y, and my z, and this time you can actually
[488]
play around with this three-dimensional plot to see how well that surface fits the data, and right now
[492]
I have a polynomial with a degree one for each of those, so the equation is over here
[496]
on the left. If I want I can name this fit and it will actually store it down here as
[502]
first-order, I can create a new fit, and I've adjusted the order to two for both x
[508]
and y, and I can save that and I can compare the two at the bottom. Finally if I want to
[513]
do the interaction, which was the third one I showed on that slide, I can go here to custom.
[518]
So I've created a new fit here, custom, brought in x, y, and z; now I have axy squared, plus by squared,
[525]
plus cxy, so this cxy is the interaction multiplying those two that you can't do with a typical
[530]
fit, and I look down here, I look, I can rotate this and see that's not actually as good of
[537]
a fit, and looking down here at the adjusted R-squared, that's actually not as good of
[542]
a fit as the second-order. So the second-order, the one shown here, actually fits the data
[548]
very nicely.
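The comparison of surface models by adjusted R-squared can also be sketched in base MATLAB with ordinary least squares. This is not what the Curve Fitting Tool does internally. The data are made up, and the three model forms below (a plane, a quadratic in x and y, and a custom model with an x*y interaction term) are assumptions chosen for illustration rather than the exact equations from the screencast.

```matlab
% Hypothetical data: two regressors x and y, one response z
[xg, yg] = meshgrid(1:5, 1:4);
x = xg(:);
y = yg(:);
z = 2 + 0.5*x - 0.3*y + 0.1*x.^2 + 0.05*y.^2 + 0.02*sin(3*x + y);
n = numel(z);

% Design matrices, one column per model term
A1 = [ones(n,1), x, y];                 % first-order: plane
A2 = [ones(n,1), x, y, x.^2, y.^2];     % second-order in x and y
A3 = [x.^2, y.^2, x.*y];                % assumed custom form with interaction

% Adjusted R-squared from a least-squares fit; A\z solves for the coefficients
adjR2 = @(A) 1 - (sum((z - A*(A\z)).^2) / (n - size(A,2))) / ...
                 (sum((z - mean(z)).^2) / (n - 1));

adj1 = adjR2(A1);
adj2 = adjR2(A2);
adj3 = adjR2(A3);

fprintf('Adjusted R^2: first-order %.4f, second-order %.4f, custom %.4f\n', ...
        adj1, adj2, adj3);
```

Adjusted R-squared penalizes extra coefficients, so it is a fairer basis for comparing models with different numbers of terms than plain R-squared.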