Explore the fundamentals of C++ Machine Learning and discover how to implement linear regression from scratch in C++. This hands-on guide covers key concepts such as gradient descent, feature normalization, and predictive modeling, offering a practical approach to understanding linear regression in a performance-efficient language.
C++ Linear Regression
Linear regression models the relationship between a dependent variable (output) and one or more independent variables (features). The goal is to fit a linear equation to the data, allowing predictions based on new inputs. The core objective is to find the best-fitting line that minimizes the difference between predicted and actual values. The linear model can be expressed as:
y = m⋅x + b
where m is the slope (weight), x is the feature, and b is the intercept (bias).
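Since the implementation below works with multiple features, the same idea generalizes to a weighted sum, and this general form is exactly what the code fits:

y = w1⋅x1 + w2⋅x2 + … + wm⋅xm + b

where each wj is the weight learned for feature xj, and b is the shared bias.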
Why Use C++ for Linear Regression?
C++ is not typically the first choice for machine learning tasks, but implementing linear regression in this language has several advantages:
- Performance: C++ offers superior execution speed, which is beneficial for handling large datasets or computationally intensive tasks.
- Control: Coding from scratch provides complete control over the algorithm’s details, enabling custom modifications and optimizations.
- Learning Experience: Implementing the algorithm manually helps in understanding the fundamental concepts behind machine learning models.

Key Components of Our Implementation
Here’s a breakdown of the major components in our simplified linear regression model:
- Feature Normalization: The normalizeFeatures method scales the features so that all values lie between 0 and 1. This step is crucial for ensuring that the gradient descent algorithm converges efficiently.
- Gradient Descent: The fit method uses gradient descent to minimize the cost function. It iteratively updates the weights and bias to reduce the prediction error. The cost function, Mean Squared Error (MSE), is calculated and displayed every 100 iterations to track the model’s progress (the exact update rule is written out after this list).
- Prediction: The predict method calculates the predicted value for new inputs using the learned weights and bias.
- Training and Evaluation: After training, the model’s parameters are printed, and predictions are made for new data points to demonstrate the model’s functionality.
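To make the gradient descent item concrete, here is the math that the computeCost and fit methods implement. With n data points, learning rate α, and per-point error e_i = ŷ_i − y_i, the MSE cost and the batch updates are:

J = (1 / (2n)) ⋅ Σ e_i²
w_j ← w_j − (α / n) ⋅ Σ e_i⋅x_ij
b ← b − (α / n) ⋅ Σ e_i

The factor of 1/2 in the cost is a common convention that cancels when differentiating; it changes the reported cost values but not the learned parameters.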
NOTE: Visit GitHub to access the complete code, or download the C++ Machine Learning code as a zip file.
Training and Evaluation
The model is trained using a dataset, where each data point consists of multiple features and a corresponding output value. During training, the model’s parameters are updated to minimize the error between predicted and actual values. The Mean Squared Error (MSE) is used as the cost function to evaluate the model’s performance. It’s printed periodically during training to monitor the convergence of the algorithm.
After training, the model’s parameters—weights and bias—are printed, providing insight into the learned relationships. Predictions can then be made using the trained model.
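One practical detail: because the weights are learned on normalized features, a new input must be scaled with the same per-feature minimum and maximum observed on the training data before calling predict; otherwise the weights are applied to values on the wrong scale. A minimal sketch of that step (min_vals and max_vals stand for the training-time ranges, as recorded by normalizeFeatures in the full program below):

#include <vector>
using std::vector;

// Apply the training-time min-max scaling to a new input
vector<double> scaleInput(vector<double> x,
                          const vector<double>& min_vals,
                          const vector<double>& max_vals) {
    for (size_t j = 0; j < x.size(); ++j) {
        x[j] = (x[j] - min_vals[j]) / (max_vals[j] - min_vals[j]);
    }
    return x;
}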
Conclusion
Implementing linear regression in C++ provides a valuable learning experience and highlights the fundamental concepts of machine learning. While C++ might not be the most common choice for such tasks, it offers performance benefits and a deep understanding of the algorithm’s workings.
By coding the model from scratch, you gain insights into essential components like gradient descent and feature normalization. This approach not only enhances your grasp of linear regression but also equips you with the skills to implement and optimize machine learning algorithms effectively.
Source Code of C++ Machine Learning
#include <iostream>
#include <vector>
#include <cmath>

using namespace std;

class LinearRegression {
private:
    double learning_rate;
    int iterations;
    vector<double> weights; // Weights for each feature
    double bias;            // Intercept term

    // Compute the cost (Mean Squared Error)
    double computeCost(const vector<vector<double>>& X, const vector<double>& Y) {
        int n = X.size();
        double cost = 0;
        for (int i = 0; i < n; ++i) {
            double y_pred = predict(X[i]);
            cost += pow(y_pred - Y[i], 2);
        }
        return cost / (2 * n);
    }

public:
    LinearRegression(double lr, int iter) : learning_rate(lr), iterations(iter), bias(0) {}

    // Train the model using batch gradient descent
    void fit(const vector<vector<double>>& X, const vector<double>& Y) {
        int n = X.size();    // Number of data points
        int m = X[0].size(); // Number of features

        // Initialize weights to zero
        weights.assign(m, 0);

        for (int iter = 0; iter < iterations; ++iter) {
            vector<double> dW(m, 0); // Accumulated gradient for the weights
            double db = 0;           // Accumulated gradient for the bias

            for (int i = 0; i < n; ++i) {
                double y_pred = predict(X[i]);
                double error = y_pred - Y[i];
                // Accumulate gradients over all data points
                for (int j = 0; j < m; ++j) {
                    dW[j] += error * X[i][j];
                }
                db += error;
            }

            // Update weights and bias
            for (int j = 0; j < m; ++j) {
                weights[j] -= (learning_rate * dW[j]) / n;
            }
            bias -= (learning_rate * db) / n;

            // Print the cost every 100 iterations
            if (iter % 100 == 0) {
                cout << "Cost at iteration " << iter << ": " << computeCost(X, Y) << endl;
            }
        }
    }

    // Predict the output for a given input
    double predict(const vector<double>& x) const {
        double y_pred = bias;
        for (size_t j = 0; j < x.size(); ++j) {
            y_pred += weights[j] * x[j];
        }
        return y_pred;
    }

    // Print the model's parameters
    void printParameters() const {
        cout << "Weights: ";
        for (double w : weights) {
            cout << w << " ";
        }
        cout << "\nBias: " << bias << endl;
    }

    // Feature normalization (scaling features between 0 and 1).
    // The per-feature min and max are returned through min_vals/max_vals
    // so that new inputs can be scaled the same way before prediction.
    static void normalizeFeatures(vector<vector<double>>& X,
                                  vector<double>& min_vals, vector<double>& max_vals) {
        int n = X.size();
        int m = X[0].size();
        min_vals.assign(m, 0);
        max_vals.assign(m, 0);
        for (int j = 0; j < m; ++j) {
            double min_val = X[0][j];
            double max_val = X[0][j];
            for (int i = 1; i < n; ++i) {
                if (X[i][j] < min_val) min_val = X[i][j];
                if (X[i][j] > max_val) max_val = X[i][j];
            }
            double range = max_val - min_val;
            for (int i = 0; i < n; ++i) {
                // Guard against a constant feature (range of zero)
                X[i][j] = (range == 0) ? 0 : (X[i][j] - min_val) / range;
            }
            min_vals[j] = min_val;
            max_vals[j] = max_val;
        }
    }
};

int main() {
    // Example data with two features (X1, X2) and output Y
    vector<vector<double>> X = {
        {1, 2},
        {2, 3},
        {3, 4},
        {4, 5},
        {5, 6}
    };
    vector<double> Y = { 5, 7, 9, 11, 13 };

    // Normalize features, keeping the per-feature ranges for later use
    vector<double> min_vals, max_vals;
    LinearRegression::normalizeFeatures(X, min_vals, max_vals);

    // Hyperparameters
    double learning_rate = 0.01;
    int iterations = 1000;

    // Create Linear Regression model
    LinearRegression lr(learning_rate, iterations);

    // Train the model
    lr.fit(X, Y);

    // Print the learned parameters
    lr.printParameters();

    // Predict a value: scale the new input with the training-time ranges,
    // since the model was trained on normalized features
    vector<double> x_new = { 6, 7 };
    for (size_t j = 0; j < x_new.size(); ++j) {
        x_new[j] = (x_new[j] - min_vals[j]) / (max_vals[j] - min_vals[j]);
    }
    double y_pred = lr.predict(x_new);
    cout << "Predicted value for x = [6, 7] is y = " << y_pred << endl;

    return 0;
}
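Assuming the listing is saved as linear_regression.cpp (the file name is an assumption here), it can be compiled and run with a standard toolchain:

g++ -std=c++17 -O2 linear_regression.cpp -o linear_regression
./linear_regression

The program prints the cost every 100 iterations, then the learned weights and bias, and finally the prediction for the new point.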