Saturday, 16 May 2009

Linear Regression in C#

When looking at time series data, such as a stream of prices, it can often be useful to establish a general trend and represent this with a single number. This can be achieved using a linear regression calculation.

Take this series of prices:
4.8, 4.8, 4.5, 3.9, 4.4, 3.6, 3.6, 2.9, 3.5, 3.0, 2.5, 2.2, 2.6, 2.1, 2.2

If you plot these on an Excel chart and add a linear trend line, you should see a straight line sloping downward through the points.


We can do the same thing in code:

using System;

class Regression
{
    static void Main(string[] args)
    {
        double[] values = { 4.8, 4.8, 4.5, 3.9, 4.4, 3.6, 3.6, 2.9, 3.5, 3.0, 2.5, 2.2, 2.6, 2.1, 2.2 };

        // Means of the x positions (0, 1, 2, ...) and of the prices.
        double xAvg = 0;
        double yAvg = 0;

        for (int x = 0; x < values.Length; x++)
        {
            xAvg += x;
            yAvg += values[x];
        }

        xAvg = xAvg / values.Length;
        yAvg = yAvg / values.Length;

        // v1: sum of (x - xAvg) * (y - yAvg); v2: sum of (x - xAvg) squared.
        double v1 = 0;
        double v2 = 0;

        for (int x = 0; x < values.Length; x++)
        {
            v1 += (x - xAvg) * (values[x] - yAvg);
            v2 += Math.Pow(x - xAvg, 2);
        }

        // Least-squares slope and intercept.
        double a = v1 / v2;
        double b = yAvg - a * xAvg;

        Console.WriteLine("y = ax + b");
        Console.WriteLine("a = {0}, the slope of the trend line.", Math.Round(a, 2));
        Console.WriteLine("b = {0}, the intercept of the trend line.", Math.Round(b, 2));

        // Keep the console window open until Enter is pressed.
        Console.ReadLine();
    }
}
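
For reference, the two loops in the code correspond to the usual least-squares formulas for a line y = ax + b. Here x̄ and ȳ stand for xAvg and yAvg, y_x is values[x], and n is values.Length:

```latex
a = \frac{\sum_{x=0}^{n-1} (x - \bar{x})(y_x - \bar{y})}{\sum_{x=0}^{n-1} (x - \bar{x})^2},
\qquad
b = \bar{y} - a\,\bar{x}
```

For the 15 prices above, x̄ = 7 and ȳ ≈ 3.373, the numerator (v1) works out to -57.7 and the denominator (v2) to 280, so a ≈ -0.21 and b ≈ 4.82. Because the code numbers the points from 0, b is the fitted value at the first price. A chart that numbers the points from 1 would show the same slope but an intercept of b - a.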



Now that you have the slope of the trend line, it can be used as an input for neural networks analysing time series data. I use something similar in NNATS…
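
As a rough sketch of how a slope could become a series of inputs, the calculation can be wrapped in a method and applied to a sliding window of prices. This is only an illustration and not how NNATS does it: the names TrendFeatures, Slope and RollingSlopes, and the window size of 5, are assumptions made up for this example.

```csharp
using System;

static class TrendFeatures
{
    // Least-squares slope of values[start .. start + length - 1],
    // fitted against x = 0, 1, 2, ... within the window.
    public static double Slope(double[] values, int start, int length)
    {
        if (length < 2)
            throw new ArgumentException("Need at least two points to fit a line.");
        if (start < 0 || start + length > values.Length)
            throw new ArgumentOutOfRangeException("start");

        double xAvg = (length - 1) / 2.0; // mean of 0 .. length - 1
        double yAvg = 0;
        for (int i = 0; i < length; i++)
            yAvg += values[start + i];
        yAvg /= length;

        double v1 = 0;
        double v2 = 0;
        for (int i = 0; i < length; i++)
        {
            v1 += (i - xAvg) * (values[start + i] - yAvg);
            v2 += (i - xAvg) * (i - xAvg);
        }

        return v1 / v2;
    }

    // One slope per full window, moving the window one price at a time.
    public static double[] RollingSlopes(double[] values, int window)
    {
        if (window < 2 || window > values.Length)
            throw new ArgumentException("window must be between 2 and values.Length.");

        double[] slopes = new double[values.Length - window + 1];
        for (int start = 0; start < slopes.Length; start++)
            slopes[start] = Slope(values, start, window);
        return slopes;
    }
}

class Program
{
    static void Main()
    {
        double[] prices = { 4.8, 4.8, 4.5, 3.9, 4.4, 3.6, 3.6, 2.9, 3.5, 3.0, 2.5, 2.2, 2.6, 2.1, 2.2 };

        // Hypothetical window of 5 prices; each slope could be one input value.
        foreach (double slope in TrendFeatures.RollingSlopes(prices, 5))
            Console.WriteLine(Math.Round(slope, 3));
    }
}
```

With 15 prices and a window of 5 this produces 11 slopes, one for each position where a full window fits. A shorter window reacts faster to changes in direction but is noisier, so the size would need tuning for the data at hand.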

For a complete explanation of linear regression, see Wikipedia.

John