Discussion: Machine Learning and Neural Networks for Process Simulation

This is a Discussion Page with supplementary user information. It is not part of the core SysCAD Help Documentation - please refer to the User Guide for full documentation links.



Introduction to Machine Learning

Machine Learning (ML) is a branch of Artificial Intelligence (AI) based on the idea that systems are capable of learning from data, recognising patterns, and making decisions with minimal human intervention. In ML there are three main categories:

  • Supervised Learning uses datasets for training which contain labelled or true observations. The algorithm makes predictions that are compared with the real output values and, if not within a certain tolerance range, the algorithm is modified until it achieves the correct output. This type of machine learning requires a large labelled dataset and is typically associated with the use of artificial neural networks (NN).
  • Unsupervised Learning uses unlabelled data, i.e. the dataset does not include true or known outputs. In unsupervised learning the algorithm tries to discover a pattern to solve by either clustering (grouping data based on similarities or differences), association (finding relationships between variables), or dimensionality reduction (e.g. principal component analysis).
  • Reinforcement Learning is similar to supervised learning but the model is not trained using a sample dataset of true values. Instead, the model learns as it goes by a trial-and-error method whereby a reward system or "scoring" is used to reinforce model behaviour to achieve a certain objective, leading to adaptive decision-making in dynamic and complex scenarios.

In this discussion page we will focus on Supervised Learning, specifically looking at the use of artificial neural network models applied within a SysCAD project. In later instalments of this discussion series we will look at other types of ML with SysCAD. Part 2 will focus on a step-by-step example of using a supervised learning neural network model for green steelmaking. Part 3 will focus on developing a fully trained convolutional neural network for dynamic simulation.

Supervised Learning and Neural Networks

What are Neural Networks?

Artificial Neural Networks (NN) mimic the structure of real neurons in the brain, perceiving inputs and firing signals through a net of connected neurons. Each of the neurons, also called nodes or perceptrons, combines weighted inputs into a single output. In terms of overall structure, there are various arrangements for NNs. At a very high level, NNs can be distinguished based on their intended purpose:

  • Categorical NNs are used for classification involving non-numeric labels. Outputs are binary true/false (e.g. is this an image of a cat?), or a set of probabilities that the input belongs to a range of categories (e.g. identifying which animal an image represents).
  • Regression NNs are used to predict one or a set of numerical values, similar to a mathematical function (e.g. mass of compounds in a process stream, temperature, size fraction, etc.).

Both NN types have applications in process modelling. A regression NN could be used to solve a mass balance in a reactor unit model, while a categorical NN could be used to predict a quality or state of a process (e.g. colour of a given stream, in- or out-of-spec product, equipment failure, etc.).

In this discussion we will focus on two NN structures that have been used with SysCAD: Deep Feed Forward (DFF) and Convolutional Neural Networks (ConvNet).

Deep Feed Forward Neural Network

The Deep Feed Forward (DFF) neural network is the most common type of neural network. Its structure includes an input layer, an output layer, and at least one hidden layer containing neurons, weights, biases, and an activation function. Data flows from the input to the output (a forward pass) via a series of calculations (described later in detail). The image below represents the structure of a DFF neural network:

DFF (Neural Network).png

Convolutional Neural Network

The Convolutional Neural Network (ConvNet or CNN) is an extended form of the DFF and is primarily used for feature extraction from a grid-like matrix dataset. ConvNets are very powerful tools, typically used for image recognition or signal processing. In SysCAD, we have used ConvNets for dynamic time-based process simulation with great success, achieving much better performance (lower loss) than a DFF trained on the same training and validation dataset.

ConvNet has many layers which include an input layer (typically a 2D or greater dimension matrix-type), one or more convolutional and pooling layers, followed by a fully-connected layer represented by a DFF. Essentially a ConvNet is a DFF with additional pre-processing steps to simplify and summarise the input data.

ConvNet.jpg

How do Neural Networks Work?

Weight bias.png

A neural network is a network of connected neurons. Each connection is given a weight, and each neuron in the network calculates the sum of weighted inputs, plus a bias. The calculated value is then subject to an activation function. The output value from each neuron is then sent to neurons in the next layer, and so on. The mathematical computation done for each node is represented by the formula:

[math]\displaystyle{ output = f_{act}\left(\sum_{i=1}^{n_{inputs}} \left(input_i \cdot weight_i\right) + bias\right) }[/math]
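As an illustration only, the following minimal Python sketch (NumPy) performs this calculation for a single layer of neurons; the function and variable names are placeholders and not part of any SysCAD API:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit activation: max(0, x) element-wise."""
    return np.maximum(0.0, x)

def layer_forward(inputs, weights, biases, f_act=relu):
    """Forward pass through one layer: each neuron sums its weighted
    inputs, adds its bias, and applies the activation function."""
    return f_act(weights @ inputs + biases)

# Example: 3 inputs feeding a layer of 2 neurons
x = np.array([0.5, -1.2, 3.0])
W = np.random.randn(2, 3)   # weights are usually randomised before training
b = np.zeros(2)             # biases are typically zeroed initially
print(layer_forward(x, W, b))
```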

Neural Network Parameters

Neural Network parameters are the learnable / trainable values within the network. The final values of these parameters are obtained by an optimisation process (training or fit) where successive adjustments to these values are made in order to minimise the error (or loss) between the predicted values of the output layer and the true values for each sample in the dataset.

For a DFF network, these are the weights and biases which impact the behaviour of each neuron. Weights are usually randomised and biases are zeroed before the learning session begins. Together with an activation function, they allow the model to propagate forward and produce an acceptable output.

Weight: The weight is multiplied by the input value entering the node. The weights represent the strength of a node connection.
Bias: The output value from a neuron can be shifted using a bias. The bias can be compared to the y-intercept in a linear equation.

For a ConvNet, there are additional trainable parameters. These are the coefficients of a kernel which are used to extract features from the input matrix and also reduce the dimensions or size of the inputs.

Kernel: This is a matrix or set of matrices with a smaller size than the input layer, used to calculate the dot product between a section of the input layer and the kernel. The kernel is displaced, sweeping through the input matrix and calculating the dot product again, repeating until the entire input is processed. This produces a new layer called an activation map or feature map which contains only the dot products and is typically of a smaller dimension than the input matrix. Kernels act as filters, identifying features and simplifying and summarising the input data into more computationally manageable chunks. The values of this kernel matrix need to be optimised during the training process.
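To make the sweeping dot product concrete, here is a minimal NumPy sketch of a single-kernel 2D convolution (stride 1, no padding); it is illustrative only and not how the SysCAD NN API implements it:

```python
import numpy as np

def convolve2d(inputs, kernel):
    """Sweep a kernel across a 2D input, computing the dot product at each
    position to build the activation (feature) map."""
    ih, iw = inputs.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))  # smaller than the input
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # Dot product between the kernel and one section of the input
            out[r, c] = np.sum(inputs[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)          # 5x5 input matrix
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])               # 2x2 kernel (trainable values)
print(convolve2d(image, kernel))               # 4x4 activation map
```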

Neural Network Hyperparameters

Below are some of the hyperparameters that can be modified when setting up a neural network model. These are not trainable parameters but rather design parameters defining the structure or makeup of the NN model. Initially, one might not know how many layers, or how many neurons per layer, will result in the lowest validation loss. Similarly, for a ConvNet one must decide how many convolutional layers to use, how many kernels each convolutional layer should have, the size of the kernel window, whether to use a pooling layer at all, etc.

The process by which one determines the final NN design is often referred to as hyperparameter tuning or optimisation. It involves parametrically running several combinations of designs, training each NN model using the same dataset, and evaluating which combination of hyperparameters results in the lowest training and validation losses. A sketch showing how these choices map onto code follows the hyperparameter lists below.

Neuron Count: A neuron in a neural network is where the sum of the weights multiplied by the inputs is computed and a bias is added. A large number of neurons can be used; popular choices are 32, 64 or 128 (aligning with computer hardware for efficient calculation).
Activation Function: The net output from the neuron is passed through an activation function. The type of activation function can be selected as a hyperparameter by the user, whereas the weights and biases are adjusted during the training process. The choice of activation function(s) depends on the problem you are trying to solve; some examples are the Rectified Linear Unit (ReLU), SoftMax and Sigmoid functions. The purpose of the activation function is to introduce non-linearity into the output of a neuron, making the model much more versatile and capable of modelling very complex problems despite its simple mathematical structure.
Input Layer: The input layer is the first layer where all the inputs are given for the model. The number of neurons for the input layer depends on the number of features or inputs in the dataset.
Hidden Layer(s): The hidden layer obtains the data from the input or previous layers. In the model there can be as many hidden layers as necessary and each hidden layer can have a different number of neurons.
Output Layer: The output layer is the last layer of the neural network. The number of neurons in the output layer corresponds to the number of output variables (true values) being predicted.
Loss: Loss is the metric used during the optimisation or training process. Depending on the type of model, different loss functions can be defined such as Categorical Cross-Entropy and Binary Cross-Entropy (for categorical-type problems) and Mean Squared Error or Mean Absolute Error (for regression-type problems).
Optimiser: Optimisers are crucial in assisting the network in learning to generate ever-better predictions during training. Optimiser routines assist in determining the optimal set of model parameters (weights and biases) so that the model can generate the best results for the problem it is solving. There are many types of optimisers that can be used, such as stochastic gradient descent (SGD), SGD with decay, SGD with momentum, AdaGrad (Adaptive Gradient), RMSProp (Root Mean Square Propagation), and AdaM (Adaptive Momentum). AdaM is the most widely used optimiser as it is often capable of finding the global minimum and avoiding getting stuck in local minima.
Learning Rate: The learning rate determines the amount that the model will change in response to the estimated error every time the model weights are changed. This is similar to the gain in a PID Controller. Choosing the correct value for learning rate is important because if the learning rate is too small then this can result in the training process being too long or it could get stuck. However, if the value is too large then it could result in a sub-optimal set of weights. Learning rate values range between 0 and 1.
Epochs: In neural networks, a forward and a backward pass together count as one iteration during the training session. An epoch is one complete pass through the entire training dataset, so the number of epochs is the total number of times the algorithm works through the full training dataset.

The additional hyperparameters below apply only to Convolutional Neural Networks:

Convolutional Layer: In the convolutional layer, the dot product between two matrices, corresponding to a section of the input layer and the kernel, is performed. The kernel slides across the height and width of the dataset, each time performing a dot product, producing an activation map. Typically an activation function, similar to that used for a DFF NN, is then applied to the activation map.
Pooling Layer: The pooling layer summarises statistics of nearby outputs from the activation map. Different types of pooling functions can be used, such as the average of a rectangular neighbourhood, the L2 norm of a rectangular neighbourhood, or max pooling. Max pooling is the most popular type of pooling function.
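As a sketch of how these hyperparameters map onto code, the example below defines a small regression ConvNet using TensorFlow/Keras (one of the toolkits mentioned later on this page). The input shape, layer sizes, kernel size and learning rate are arbitrary placeholders, not the tuned values used in the example further below:

```python
import tensorflow as tf

# Hyperparameters (placeholders): layer count, neuron count, kernel size, etc.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16, 16, 1)),              # input layer: 16x16 matrix, 1 channel
    tf.keras.layers.Conv2D(8, kernel_size=3,
                           activation="relu"),      # convolutional layer with 8 kernels
    tf.keras.layers.MaxPooling2D(pool_size=2),      # max pooling layer
    tf.keras.layers.Flatten(),                      # flatten into the fully-connected DFF part
    tf.keras.layers.Dense(64, activation="relu"),   # hidden layer with 64 neurons
    tf.keras.layers.Dense(3),                       # output layer: 3 numerical outputs
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # AdaM optimiser
    loss="mse",                                                # Mean Squared Error (regression)
)
# Training would then be: model.fit(x_train, y_train, epochs=100,
#                                   validation_data=(x_val, y_val))
```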

Creating a Neural Network

NN workflow.jpg

The following is a high-level overview of the steps required to set up and train a neural network, summarised in the flowchart to the right. A detailed walkthrough of this process will be presented in Part 2.

  1. Source Data: In order to create and run a neural network you first need to collect data for all the inputs and outputs you require. For example, process data such as temperature, pressure, input flows, tank levels, etc.
  2. Preprocessing: It is generally good practice to pre-process the data, such as by shuffling and scaling. To help detect and avoid overfitting, it is also recommended to set aside about 20% of the data collected for validation while the remaining 80% is run through the NN during training (see the sketch after this list).
  3. Parametric Optimisation: Many different tools can be used to design the NN structure and obtain the final optimised parameters. For example, in Python some available tools include TensorFlow, TensorBoard and PyTorch. Other programming languages that can be used are R, Java, C++, and many more.
  4. Postprocessing: Hyperparameters should also be adjusted to find the optimal NN structure (lowest loss). For DFF, these include learning rate, epochs, hidden layers, and neurons. For ConvNet there is also time window size, convolutional layers, convolutional filters, size of kernel, and pooling layer size.
  5. Save Results: After parameter tuning and hyperparameter optimisation, it is of course very important to save the NN model structure including the final weights and biases for later use.
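A minimal sketch of the preprocessing in step 2, using NumPy only; the dataset shapes and the min-max scaling are illustrative assumptions, while the 80/20 split follows the text:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Placeholder dataset: X holds the inputs (features), Y the true outputs
X = rng.random((2000, 7))
Y = rng.random((2000, 3))

# Shuffle inputs and outputs together so sample pairs stay aligned
idx = rng.permutation(len(X))
X, Y = X[idx], Y[idx]

# Scale each input feature to the 0..1 range (min-max scaling, one option)
X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Hold back ~20% of the data for validation, train on the remaining 80%
split = int(0.8 * len(X))
X_train, X_val = X[:split], X[split:]
Y_train, Y_val = Y[:split], Y[split:]
```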


Neural Networks in SysCAD

There are several ways SysCAD can be combined with Machine Learning and Neural Networks.

  • SysCAD as the Data Source: Using a detailed calibrated SysCAD model of a process, we could generate a large dataset by running scenarios covering a wide range of input conditions and generating the corresponding outputs. This data could then be used to train a NN model.
  • NN as a Unit Model within SysCAD: We can incorporate a predefined NN model as a unit operation within a SysCAD project. Here the dataset used for training and validation has been generated externally (or from another SysCAD model as above). SysCAD can use a NN API, in the form of a controller or reactor unit model, to load the optimised NN parameters and calculate the outputs of the NN model at each SysCAD iteration while interacting with other SysCAD unit models. In the simplest case, custom PGM code could be implemented to perform the NN forward pass. Alternatively, as demonstrated in the example below, a custom unit model running a development C++ NN API can be used directly in its place.

SysCAD NN API

The SysCAD NN API is currently in development. The API is a set of C++ dynamic libraries (DLL) that enable SysCAD to:

  • Load and run an optimised DFF or ConvNet model from file
  • Run a Mass Balance Reactor unit model based on optimised NN parameters

The optimised parameters for the DFF or ConvNet models can be generated using the same SysCAD NN API outside of SysCAD, or some of the publicly available libraries. In Part 2 of this series we will show in detail how this process works using TensorFlow.

Once NN model parameters are loaded, the NN API is used by a SysCAD General Controller or reactor unit model, providing the necessary inputs to the NN model, calculating the outputs, and transferring those outputs back to the SysCAD model.

Mass Balance Neural Network Reactor

The Mass Balance Reactor uses a NN to model chemical reactions within a process, converting input reactants to output products. Critically, this must be done while maintaining mass balance.

For example, let's look at the combustion system CH4 + O2 which produces a range of species (including CO2, CO and H2O). A mass balance system can be defined by system components (SC) and phase constituents (phC), in this case the elements and species respectively. The SC and phC are related by a stoichiometric matrix ([math]\displaystyle{ S }[/math]) (i.e. the elemental composition of each species). A set of independent basis vectors [math]\displaystyle{ \vec{(b_{i})} }[/math] can be obtained for the [math]\displaystyle{ S }[/math] matrix by calculating the nullspace of [math]\displaystyle{ S }[/math]. The number of vectors (the nullity) represents the degrees of freedom (DOF) of the system. Any mass transfer (generation or consumption) in the system with no net overall mass change can be calculated by a set of transformation coefficients ([math]\displaystyle{ \lambda_{i} }[/math]):

[math]\displaystyle{ \vec{y_{out}}[moles]= \vec{y_{in}} + \sum_{i=1}^{DOF} \lambda^{}_{i} \cdot \vec{(b_{i})} }[/math]

The stoichiometric matrix and basis vector for the system (gas phase only) are shown below. 6 phase constituents (species, the rows in [math]\displaystyle{ S }[/math]) and 3 system components (elements, the columns in [math]\displaystyle{ S }[/math]) are considered.

It is important to note that there are infinite sets of independent basis vectors. For this example, a rref (reduced row echelon form) of the basis vectors was chosen for convenience as it makes it easy to calculate the transformation coefficients given a training data set.

[math]\displaystyle{ \ \ \ O \ \ \ C \ \ \ H }[/math]
[math]\displaystyle{ S =\matrix{H_2 \\ CH_4\\ O_2\\ H_2O\\ CO\\ CO_2\\} \pmatrix{ 0 & 0 & 2 \\ 0 & 1 & 4 \\ 2 & 0 & 0 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \\ 2 & 1 & 0 \\} \quad \quad \vec{b} = \pmatrix{1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & -2 & 0 \\ -1 & -4 & 2 \\ 1 & 3 & -2} }[/math]

Note that the columns of the basis vector [math]\displaystyle{ \vec{(b_{i})} }[/math] each represent values for a stoichiometrically balanced reaction, e.g. for the first column: 1 H2 + 1 CO2 ⇔ 1 H2O + 1 CO. The three column vectors of [math]\displaystyle{ \vec{b_{i}} }[/math] represent the three degrees of freedom of the system, i.e. any overall reaction of the system can be represented by some combination of these three reactions.

The overall mass balance equation for this CH4 + O2 combustion system is then:

[math]\displaystyle{ \pmatrix{{y}_{H_2} \\{y}_{CH_4} \\{y}_{O_2} \\{y}_{H_2O} \\{y}_{CO} \\{y}_{CO_2}}_{out} = \pmatrix{{y}_{H_2} \\{y}_{CH_4} \\{y}_{O_2} \\ {y}_{H_2O} \\{y}_{CO} \\{y}_{CO_2}}_{in} + \lambda_{1} \pmatrix{1 \\ 0 \\ 0 \\-1 \\-1 \\1} + \lambda_{2} \pmatrix{0 \\1 \\0 \\-2 \\-4 \\3} + \lambda_{3} \pmatrix{ 0 \\0 \\1 \\0 \\2 \\-2} }[/math]

For any feed stream vector [math]\displaystyle{ \vec{y_{in}} }[/math], an arbitrary set of transformation coefficients [math]\displaystyle{ \lambda_i }[/math] will produce a product vector [math]\displaystyle{ \vec{y_{out}} }[/math] with mass and element balance strictly conserved. However, while the total mass of the system is conserved, this does not guarantee that all phase constituents will have positive mass. That problem needs to be addressed by improving the accuracy of the model and by implementing mechanisms to ensure all output masses are positive (constrained mass balance).
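The NumPy sketch below applies this mass balance equation to the CH4 + O2 system, using the S and b matrices shown above; the feed and the λ values are arbitrary examples chosen to illustrate that element totals are conserved:

```python
import numpy as np

# Stoichiometric matrix S: rows H2, CH4, O2, H2O, CO, CO2; columns O, C, H
S = np.array([[0, 0, 2],
              [0, 1, 4],
              [2, 0, 0],
              [1, 0, 2],
              [1, 1, 0],
              [2, 1, 0]], dtype=float)

# The three rref basis vectors b_i as columns of B (one per degree of freedom)
B = np.array([[ 1,  0,  0],
              [ 0,  1,  0],
              [ 0,  0,  1],
              [-1, -2,  0],
              [-1, -4,  2],
              [ 1,  3, -2]], dtype=float)

y_in = np.array([0.0, 1.0, 2.0, 0.0, 0.0, 0.0])  # e.g. 1 mol CH4 + 2 mol O2
lam = np.array([0.1, -0.4, -0.7])                # arbitrary transformation coefficients

y_out = y_in + B @ lam

# Element balance check: total O, C and H are unchanged for any lam
print(S.T @ y_out - S.T @ y_in)   # ~[0. 0. 0.]

# Thanks to the rref form (identity block on H2, CH4, O2), the true lambdas
# can be recovered directly from the first three species
print(y_out[:3] - y_in[:3])       # recovers lam
```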

The problem now is to find a set of [math]\displaystyle{ \lambda_i }[/math] values given a known set of input species, T and pressure. This is similar to solving a thermodynamic equilibrium problem for the system, where the thermodynamic model (e.g. GFEM or TCE) can calculate the output species representing the most stable state. However, in this case, we will use a trained Neural Network to find [math]\displaystyle{ \lambda_i }[/math] values for each input set. Here, the training and validation datasets are generated using an equilibrium thermodynamics model (but could equally be done from experimental results or operational measurements).

When creating the NN, true values for [math]\displaystyle{ \lambda }[/math] need to be calculated for each set of input species, T, P and true output composition. This can be done using the following equation for [math]\displaystyle{ \lambda_{1} }[/math], [math]\displaystyle{ \lambda_{2} }[/math] and [math]\displaystyle{ \lambda_{3} }[/math] (thanks to the simplified form of the rref basis vector):

[math]\displaystyle{ \lambda_{1}= y_{1,out}-y_{1,in}=y_{H_2,out}-y_{H_2,in} }[/math]
[math]\displaystyle{ \lambda_{2}= y_{2,out}-y_{2,in}=y_{CH_4,out}-y_{CH_4,in} }[/math]
[math]\displaystyle{ \lambda_{3}= y_{3,out}-y_{3,in}=y_{O_2,out}-y_{O_2,in} }[/math]

Once the input and true output dataset is collected or generated, this can be used to train a NN model. For this example, a set of 2000 random combinations of input amounts for all 6 phase constituents and random temperature between 300 and 6000 K was generated as input/training data.

NNs typically expect input values between 0 and 1. As such, the input amounts were normalised so that the sum of all inputs equals 1, while the temperature was divided by the maximum temperature used in this example (6000 K). This ensures that the NN model will work for any set of inputs and temperatures, as long as normalised and scaled inputs are provided before performing a forward pass through the trained model. This step is integrated into the NN API.
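A small sketch of this normalisation step (an assumed helper function, not the NN API itself), with the 6000 K maximum taken from the text:

```python
import numpy as np

T_MAX = 6000.0  # maximum temperature in the training data [K]

def normalise_inputs(amounts, temperature):
    """Scale the species amounts to sum to 1 and the temperature to 0..1,
    matching the scaling applied to the training data."""
    amounts = np.asarray(amounts, dtype=float)
    return np.append(amounts / amounts.sum(), temperature / T_MAX)

# e.g. 1 mol CH4 + 2 mol O2 (order: H2, CH4, O2, H2O, CO, CO2) at 1500 K
x = normalise_inputs([0.0, 1.0, 2.0, 0.0, 0.0, 0.0], 1500.0)
print(x)  # six mole fractions summing to 1, followed by T / T_max
```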

To train the neural network, almost any programming language can be used. Various parameters such as epochs, hidden layers, type of optimiser, and learning and decay rates were adjusted to determine the optimal set of weights and biases, and the structure of the neural network.

For this example, the best hyperparameters were 2 hidden layers, 64 nodes for the hidden layers, Adaptive Momentum (AdaM) for the optimiser, a learning rate of 0.006, a decay rate of 0.005, and 100000 epochs. The training results for [math]\displaystyle{ \lambda_{1} }[/math], [math]\displaystyle{ \lambda_{2} }[/math] and [math]\displaystyle{ \lambda_{3} }[/math] are shown in the animation below. The outputs from the initially randomised weights are shown at iteration 0, improving through to 100000 epochs. As can be seen, results from the last iteration were very accurate!

NNTuning.gif

The Loss and Accuracy throughout the training were also calculated. As shown in the following graphs, the overall loss was ~1e-5 and the accuracy was ~70% after 100000 epochs.

Training Loss.png Training Accuracy.png

Depending on the type of Neural Network (categorical, binary or regression), there are different methods to calculate loss and accuracy.

To calculate the Loss, we determined the Mean Squared Error (MSE) between the true ([math]\displaystyle{ \hat{y} }[/math]) and predicted ([math]\displaystyle{ y }[/math]) values for each of the [math]\displaystyle{ K }[/math] samples (2000 in this case) for each of the [math]\displaystyle{ J }[/math] outputs (3 transformation coefficients [math]\displaystyle{ \lambda }[/math]):

[math]\displaystyle{ L = \frac{1}{K\cdot J} \sum_{k=1}^{K} \sum_{j=1}^{J} \left(y_{k,j} - \hat{y}_{k,j}\right)^2 }[/math]

To calculate the Accuracy for the regression model, we assess each individual sample [math]\displaystyle{ k }[/math] for each output [math]\displaystyle{ j }[/math], and assign a value of 1 if the absolute difference between the true ([math]\displaystyle{ \hat{y} }[/math]) and predicted ([math]\displaystyle{ y }[/math]) values is less than a defined threshold [math]\displaystyle{ \varepsilon_j }[/math], and 0 otherwise:

[math]\displaystyle{ a_{k,j} = \begin{cases} 0 & \text{if } |y_{k,j} - \hat{y}_{k,j}| \geq \varepsilon_j \\ 1 & \text{if } |y_{k,j} - \hat{y}_{k,j}| \lt \varepsilon_j \end{cases} }[/math]

The choice and definition of threshold depends on the application and problem that is being solved. In this case, the threshold [math]\displaystyle{ \varepsilon_j }[/math] is defined, on a per-output [math]\displaystyle{ j }[/math] basis, as an arbitrary fraction of the standard deviation [math]\displaystyle{ \sigma }[/math] of each set of true values [math]\displaystyle{ \hat{y}_{j} }[/math]:

[math]\displaystyle{ \varepsilon_j = \frac{\sigma_{{\hat{y}}_{j}}}{100} }[/math]

The total accuracy is then simply the average of all individual accuracy values [math]\displaystyle{ a_{k,j} }[/math] across all samples and outputs:

[math]\displaystyle{ A = \frac{1}{K\cdot J} \sum_{k=1}^{K} \sum_{j=1}^{J} a_{k,j} }[/math]
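The following NumPy sketch computes these loss and accuracy metrics for a K×J matrix of predictions; the synthetic data is only there to make the example runnable:

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean Squared Error over all K samples and J outputs."""
    return np.mean((y_true - y_pred) ** 2)

def regression_accuracy(y_true, y_pred):
    """Fraction of predictions within the per-output threshold
    epsilon_j = std(true values of output j) / 100."""
    eps = y_true.std(axis=0) / 100.0      # threshold per output, shape (J,)
    a = np.abs(y_true - y_pred) < eps     # a_{k,j}: 1 if within threshold
    return a.mean()                       # average over all samples and outputs

# Synthetic example with K=2000 samples and J=3 outputs (the lambdas);
# y_true plays the role of the true (hat) values in the notation above
rng = np.random.default_rng(0)
y_true = rng.normal(size=(2000, 3))
y_pred = y_true + rng.normal(scale=0.005, size=(2000, 3))
print(mse_loss(y_true, y_pred), regression_accuracy(y_true, y_pred))
```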

Once the model has been trained, the NN API is used to load the optimised parameters for use by the SysCAD Mass Balance Reactor.

CH4O2 MBReactor NN PGM.png

A validation dataset was used to test the model. In this case, fixed input amounts, for example 1 mol of CH4 and 2 mol of O2, were tested across a range of temperatures in the SysCAD model.

On the left side are the results using a TCE Model in SysCAD (considered as true values) and on the right side are the results using the NN Mass Balance Reactor Model with the optimised parameters.

[Figures: TCE Model (left) and Neural Network Mass Balance Trained Model (right)]

When comparing both models, the results are in very good agreement, showing how NNs can be used in many different process simulation scenarios, provided a good set of supervised data is available.

Why Use a NN Model?

So after looking at this high-level description of a supervised machine learning application in SysCAD, and at the mass balance example for the combustion of methane, one might ask: is it worthwhile? Is it more advantageous than just using a GFEM or TCE model or another high-definition model? Well, in some cases, yes.

There are various pros and also some cons to this approach. Listed below are some advantages and disadvantages of using NNs:

Advantages:

  • Neural Networks have simple structures and calculations (sums and multiplications)
  • They are relatively fast to calculate since, once trained, they do not require an iterative process or highly complex calculations. This is where they really make a difference compared to solving the same problem with a highly complex, non-ideal thermodynamic solver. A NN model can be seen as an accelerator: it does not entirely replace the thermodynamic model, but when embedded in a large model with multiple recirculation streams and NN models, the gain in speed can drastically reduce the overall model computation time.
  • Neural Networks can be used for a wide range of applications and several resources are available. They can be used to model non-linear relationships, pattern recognition, and classification problems. This versatility allows adding features to a model that would otherwise be difficult to incorporate.
  • Depending on the amount and quality of training data, the neural network can achieve highly accurate results.
  • Neural Network models can be used for forecasting model outputs. For example, in a time-dependent model, one NN model can be used to predict the immediate outcome while a second model, trained to predict 10, 30 or 60 minutes or 1 day ahead of time, makes predictions in parallel (probably with lower accuracy but the correct trend) to act as an early warning indicator and support decision making.
  • Neural Networks are the backbone of most Machine Learning algorithms.

Disadvantages:

  • They require a large, supervised (checked) dataset. Depending on the problem you are trying to solve, it might take a long time to collect all the data.
  • Training might be difficult and lengthy.
  • Limited extrapolation (they might not perform as well outside of the training dataset). This, of course, is something to address during training by keeping a proper balance between training and validation datasets. It is easy to fall into overtraining a model, making it too rigid and compromising its performance outside the training dataset.
  • Despite the flexibility that ML and NN models provide, there is also a certain rigidity. For example, in the case of the mass balance problem shown above, if we decide to add more phase constituents or system components to the model, then we would need to generate a new training and validation dataset and fit the model again before we can use it in a SysCAD model.

What's Next?

In the next part of this series, we will go into detail on how to create a NN model for steady-state process simulation from scratch, going through each step: creating a GFEM model to generate the training and validation datasets, training the model, optimising the model's hyperparameters, and then implementing the newly trained model back in a SysCAD project.
