Mathematical modeling for verification and validation: find formulas that predict your program's outputs and behavior independently of running it. When you run the program, does it behave in a way that is consistent with those predictions? If not, investigate why the program deviates from what the theoretical formula(s) predict.
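As a minimal sketch of this kind of check, the example below treats a small die-rolling simulation as the "program" and compares its output with the textbook mean and variance of a fair six-sided die. The function names, the seed, and the tolerance of a few standard errors are all assumptions made for illustration, not part of the notes above.

```python
import random

def simulate_mean_roll(n_rolls, seed=0):
    """Program under test: estimate the mean of a fair six-sided die by simulation."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# Independent mathematical predictions for a fair die (uniform on 1..6).
THEORETICAL_MEAN = (1 + 6) / 2           # 3.5
THEORETICAL_VARIANCE = (6**2 - 1) / 12   # 35/12

def consistent_with_theory(n_rolls, seed=0, n_sigmas=4.0):
    """Return True if the simulated mean lies within n_sigmas standard errors of theory."""
    observed = simulate_mean_roll(n_rolls, seed)
    standard_error = (THEORETICAL_VARIANCE / n_rolls) ** 0.5
    return abs(observed - THEORETICAL_MEAN) <= n_sigmas * standard_error

if __name__ == "__main__":
    print(consistent_with_theory(10_000))
```

A result of False would not by itself say whether the program or the formula is wrong; it only flags a discrepancy worth investigating.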
Process modeling is the concise description of the total variation in one quantity, $y$, by partitioning it into
- a deterministic component given by a mathematical function of one or more other quantities, $x_1, x_2, \ldots$, plus
- a random component (What is process modeling?)
There are three main parts to every process model. These are
- the response variable, usually denoted by $y$,
- the mathematical function, usually denoted as $f(\vec{x};\vec{\beta})$, and
- the random errors, usually denoted by $\varepsilon$. (terminology)
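Putting the three parts together, the general form of a process model can be written as a single equation, with $y$ the response variable, $f$ the mathematical function of the predictor variables $\vec{x}$ and parameters $\vec{\beta}$, and $\varepsilon$ the random error:

```latex
y = f(\vec{x};\vec{\beta}) + \varepsilon
```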
The response variable, $y$, is a quantity that varies in a way that we hope to be able to summarize and exploit via the modeling process. Generally it is known that the variation of the response variable is systematically related to the values of one or more other variables before the modeling process is begun, although testing the existence and nature of this dependence is part of the modeling process itself.
The mathematical function consists of two parts. These parts are the predictor variables, $x_1, x_2, \ldots$, and the parameters, $\beta_0, \beta_1, \ldots$. The predictor variables are observed along with the response variable. They are the quantities described on the previous page as inputs to the mathematical function, $f(\vec{x};\vec{\beta})$. The collection of all of the predictor variables is denoted by $\vec{x}$ for short.
The parameters are the quantities that will be estimated during the modeling process. Their true values are unknown and unknowable, except in simulation experiments. As for the predictor variables, the collection of all of the parameters is denoted by $\vec{\beta}$ for short.
The parameters and predictor variables are combined in different forms to give the function used to describe the deterministic variation in the response variable. For a straight line with an unknown intercept and slope, for example, there are two parameters and one predictor variable: $f(x;\vec{\beta}) = \beta_0 + \beta_1 x$.
For a straight line with a known slope of one, but an unknown intercept, there would only be one parameter: $f(x;\vec{\beta}) = \beta_0 + x$.
For a quadratic surface with two predictor variables, there are six parameters for the full model: $f(\vec{x};\vec{\beta}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2$.
(Terminology for process models)
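The three function forms described above can be sketched as plain Python functions, which makes the parameter count of each form explicit. The function names and the parameter names (`b0`, `b1`, `b12`, and so on) are illustrative choices, not notation taken from the source.

```python
def straight_line(x, b0, b1):
    """Two parameters (intercept b0, slope b1), one predictor variable x."""
    return b0 + b1 * x

def unit_slope_line(x, b0):
    """Slope is known to be one, so the intercept b0 is the only parameter."""
    return b0 + x

def quadratic_surface(x1, x2, b0, b1, b2, b12, b11, b22):
    """Full quadratic model in two predictor variables: six parameters."""
    return (b0 + b1 * x1 + b2 * x2
            + b12 * x1 * x2
            + b11 * x1 ** 2 + b22 * x2 ** 2)
```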
Like the parameters in the mathematical function, the random errors are unknown. They are simply the difference between the data and the mathematical function. They are assumed to follow a particular probability distribution, however, which is used to describe their aggregate behavior. The probability distribution that describes the errors has a mean of zero and an unknown standard deviation, denoted by $\sigma$, that is another parameter in the model, like the $\beta$'s. (Terminology for process models)
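Because true parameter values are only known in simulation experiments, a small simulation is a convenient way to see the random errors directly. The sketch below assumes a straight-line function with made-up parameters and normally distributed errors; the specific values (`b0=2.0`, `b1=0.5`, `sigma=0.3`) and the seed are hypothetical.

```python
import random
import statistics

def true_function(x, b0=2.0, b1=0.5):
    """Deterministic component; parameters are known only because we simulate."""
    return b0 + b1 * x

def simulate_process(xs, sigma=0.3, seed=1):
    """Response = deterministic component + zero-mean normal random error."""
    rng = random.Random(seed)
    return [true_function(x) + rng.gauss(0.0, sigma) for x in xs]

xs = [i / 10 for i in range(2000)]
ys = simulate_process(xs)

# The random errors are the differences between the data and the function.
errors = [y - true_function(x) for x, y in zip(xs, ys)]
mean_error = statistics.fmean(errors)   # should be close to zero
sigma_hat = statistics.stdev(errors)    # should be close to the sigma used above
```

With real data the function's parameters are themselves estimated, so the errors are approximated by residuals from the fitted model rather than computed exactly as here.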
Process models are used for four main purposes: estimation, prediction, calibration, and optimization (Process models).
The goal of estimation is to determine the value of the regression function (i.e., the average value of the response variable), for a particular combination of the values of the predictor variables.
The goal of prediction is to determine either (1) the value of a new observation of the response variable, or (2) the values of a specified proportion of all future observations of the response variable, for a particular combination of the values of the predictor variables.
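The difference between estimation and prediction shows up in the uncertainty attached to each. The sketch below fits a straight line by least squares to simulated data and computes, at a chosen point `x0`, an interval for the average response (estimation) and a wider interval for a single new observation (prediction, case 1 only). It uses a normal quantile as a large-sample stand-in for the Student t quantile, and the data, seed, and function names are assumptions for illustration.

```python
import random
from statistics import NormalDist, fmean

def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1 * x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1

def intervals_at(xs, ys, x0, level=0.95):
    """Return (y_hat, interval for the mean response, interval for a new observation)."""
    n = len(xs)
    b0, b1 = fit_line(xs, ys)
    x_bar = fmean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = (rss / (n - 2)) ** 0.5                  # residual standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2)   # large-sample approximation to t
    y_hat = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = z * s * leverage ** 0.5         # uncertainty of the regression function
    half_new = z * s * (1 + leverage) ** 0.5    # adds the new observation's own error
    return (y_hat,
            (y_hat - half_mean, y_hat + half_mean),
            (y_hat - half_new, y_hat + half_new))

rng = random.Random(3)
xs = [float(i) for i in range(50)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
y_hat, mean_interval, new_obs_interval = intervals_at(xs, ys, x0=25.0)
```

The point estimate is the same in both cases; only the interval width differs, because a new observation carries its own random error on top of the uncertainty in the fitted function.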
The goal of calibration is to quantitatively relate measurements made using one measurement system to those of another measurement system.
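One common way to set this up, sketched below under simple assumptions, is to fit a straight line relating an instrument's readings to known reference values and then invert the fitted line to convert a new reading back to the reference scale. The reference values and readings are made-up numbers, and a straight-line relationship is assumed rather than given by the source.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1 * x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical calibration data: reference-system values and instrument readings.
reference = [0.0, 10.0, 20.0, 30.0, 40.0]
readings = [1.0, 21.0, 41.0, 61.0, 81.0]

b0, b1 = fit_line(reference, readings)

def calibrate(reading):
    """Convert a new instrument reading to the reference scale by inverting the line."""
    return (reading - b0) / b1
```

For example, `calibrate(51.0)` maps a reading of 51 back to the reference scale using the fitted intercept and slope.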
Optimization is performed to determine the values of process inputs that should be used to obtain the desired process output. Typical optimization goals might be to maximize the yield of a process, to minimize the processing time required to fabricate a product, or to hit a target product specification with minimum variation in order to maintain specified tolerances.
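As a small illustration of the yield-maximization case, suppose a fitted model says yield is a quadratic function of one process input. The sketch below finds the input that maximizes predicted yield within an allowed range by checking the range endpoints and, when it applies, the vertex of the parabola. The coefficients and the function names are hypothetical.

```python
def fitted_yield(t, b0=10.0, b1=4.0, b2=-0.5):
    """Hypothetical fitted model: predicted yield as a quadratic in the input t."""
    return b0 + b1 * t + b2 * t ** 2

def best_input(lo, hi, b1=4.0, b2=-0.5):
    """Input in [lo, hi] that maximizes the predicted yield."""
    candidates = [lo, hi]
    if b2 < 0:
        vertex = -b1 / (2 * b2)   # stationary point of a concave quadratic
        if lo <= vertex <= hi:
            candidates.append(vertex)
    return max(candidates, key=lambda t: fitted_yield(t, b1=b1, b2=b2))
```

With these coefficients the unconstrained maximum is at t = 4; if the allowed range excludes it, the best available input is the nearest endpoint.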
There is often more than one statistical tool that can be effectively applied to a given modeling application. Some of the more well-established statistical techniques useful for different model-building situations are:
1. Linear Least Squares Regression
2. Nonlinear Least Squares Regression
3. Weighted Least Squares Regression
4. LOESS (aka LOWESS)
(Statistical methods for model building)
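To make two of the listed methods concrete, here is a minimal sketch of a straight-line fit by weighted least squares, which reduces to ordinary linear least squares when all weights are equal. The function name and the example data are assumptions; weighting each point by the reciprocal of its error variance is a common convention, used here only for illustration.

```python
def weighted_line_fit(xs, ys, weights=None):
    """Weighted least squares estimates (b0, b1) for y = b0 + b1 * x.

    With weights=None every point gets weight 1, i.e. ordinary least squares.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical data: the last measurement is much noisier than the others.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.1, 4.9, 7.2, 12.0]
variances = [0.1, 0.1, 0.1, 5.0]

ols_fit = weighted_line_fit(xs, ys)
wls_fit = weighted_line_fit(xs, ys, [1.0 / v for v in variances])
```

Down-weighting the noisy point pulls the fitted line toward the more precise measurements, which is the situation weighted least squares is meant for.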
The basic steps of the model-building process are:
1. model selection
2. model fitting
3. model validation
These three basic steps are used iteratively until an appropriate model for the data has been developed. In the model selection step, plots of the data, process knowledge and assumptions about the process are used to determine the form of the model to be fit to the data. Then, using the selected model and possibly information about the data, an appropriate model-fitting method is used to estimate the unknown parameters in the model. When the parameter estimates have been made, the model is then assessed to see if the underlying assumptions of the analysis appear plausible.
If the assumptions seem valid, the model can be used to answer the scientific or engineering questions that prompted the modeling effort. If the model validation identifies problems with the current model, however, then the modeling process is repeated using information from the model validation step to select and/or fit an improved model.
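The iterative select, fit, validate loop can be sketched in code. In this simplified version, "selection" means trying candidate forms in order, "fitting" uses least squares, and "validation" is reduced to one numeric check: whether the residuals still show a linear trend against the predictor. Real validation relies on residual plots and several diagnostics; the check, the threshold of 0.2, the candidate list, and the simulated data are all assumptions for illustration.

```python
import random
from statistics import fmean

def fit_constant(xs, ys):
    """Candidate 1: response is a constant (its mean)."""
    mean = fmean(ys)
    return lambda x: mean

def fit_line(xs, ys):
    """Candidate 2: straight line fitted by ordinary least squares."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    return lambda x: b0 + b1 * x

def residual_trend(xs, residuals):
    """Absolute correlation between residuals and x (near 0 means no leftover trend)."""
    x_bar, r_bar = fmean(xs), fmean(residuals)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    srr = sum((r - r_bar) ** 2 for r in residuals)
    if sxx == 0 or srr == 0:
        return 0.0
    sxr = sum((x - x_bar) * (r - r_bar) for x, r in zip(xs, residuals))
    return abs(sxr) / (sxx * srr) ** 0.5

def build_model(xs, ys, candidates, threshold=0.2):
    """Try each candidate form in turn until one passes the validation check."""
    for name, fit in candidates:                       # model selection
        model = fit(xs, ys)                            # model fitting
        residuals = [y - model(x) for x, y in zip(xs, ys)]
        if residual_trend(xs, residuals) < threshold:  # model validation
            return name, model
    return None, None

rng = random.Random(2)
xs = [float(i) for i in range(40)]
ys = [1.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
chosen_name, chosen_model = build_model(
    xs, ys, [("constant", fit_constant), ("straight line", fit_line)])
```

Here the constant model leaves a strong trend in its residuals and is rejected, so the loop moves on to the straight line, mirroring the "repeat with an improved model" step described above.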
Once a model that gives a good description of the process has been developed, it can be used for estimation or prediction.