[c#] How do I calculate a trendline for a graph?

Google is not being my friend - it's been a long time since my stats class in college...I need to calculate the start and end points for a trendline on a graph - is there an easy way to do this? (working in C# but whatever language works for you)

This question is related to c# math graph

The answer is


Here's what I ended up using.

public class DataPoint<T1,T2>
{
    public DataPoint(T1 x, T2 y)
    {
        X = x;
        Y = y;
    }

    [JsonProperty("x")]
    public T1 X { get; }

    [JsonProperty("y")]
    public T2 Y { get; }
}

public class Trendline
{
    public Trendline(IEnumerable<DataPoint<long, decimal>> dataPoints)
    {
        int count = 0;
        long sumX = 0;
        long sumX2 = 0;
        decimal sumY = 0;
        decimal sumXY = 0;

        foreach (var dataPoint in dataPoints)
        {
            count++;
            sumX += dataPoint.X;
            sumX2 += dataPoint.X * dataPoint.X;
            sumY += dataPoint.Y;
            sumXY += dataPoint.X * dataPoint.Y;
        }

        Slope = (sumXY - ((sumX * sumY) / count)) / (sumX2 - ((sumX * sumX) / count));
        Intercept = (sumY / count) - (Slope * (sumX / count));
    }

    public decimal Slope { get; private set; }
    public decimal Intercept { get; private set; }
    public decimal Start { get; private set; }
    public decimal End { get; private set; }

    public decimal GetYValue(decimal xValue)
    {
        return Slope * xValue + Intercept;
    }
}

My data set is using a Unix timestamp for the x-axis and a decimal for the y. Change those datatypes to fit your need. I do all the sum calculations in one iteration for the best possible performance.


This is the way i calculated the slope: Source: http://classroom.synonym.com/calculate-trendline-2709.html

class Program
    {
        public double CalculateTrendlineSlope(List<Point> graph)
        {
            int n = graph.Count;
            double a = 0;
            double b = 0;
            double bx = 0;
            double by = 0;
            double c = 0;
            double d = 0;
            double slope = 0;

            foreach (Point point in graph)
            {
                a += point.x * point.y;
                bx = point.x;
                by = point.y;
                c += Math.Pow(point.x, 2);
                d += point.x;
            }
            a *= n;
            b = bx * by;
            c *= n;
            d = Math.Pow(d, 2);

            slope = (a - b) / (c - d);
            return slope;
        }
    }

    class Point
    {
        public double x;
        public double y;
    }

Here is a very quick (and semi-dirty) implementation of Bedwyr Humphreys's answer. The interface should be compatible with @matt's answer as well, but uses decimal instead of int and uses more IEnumerable concepts to hopefully make it easier to use and read.

Slope is b, Intercept is a

public class Trendline
{
    public Trendline(IList<decimal> yAxisValues, IList<decimal> xAxisValues)
        : this(yAxisValues.Select((t, i) => new Tuple<decimal, decimal>(xAxisValues[i], t)))
    { }
    public Trendline(IEnumerable<Tuple<Decimal, Decimal>> data)
    {
        var cachedData = data.ToList();

        var n = cachedData.Count;
        var sumX = cachedData.Sum(x => x.Item1);
        var sumX2 = cachedData.Sum(x => x.Item1 * x.Item1);
        var sumY = cachedData.Sum(x => x.Item2);
        var sumXY = cachedData.Sum(x => x.Item1 * x.Item2);

        //b = (sum(x*y) - sum(x)sum(y)/n)
        //      / (sum(x^2) - sum(x)^2/n)
        Slope = (sumXY - ((sumX * sumY) / n))
                    / (sumX2 - (sumX * sumX / n));

        //a = sum(y)/n - b(sum(x)/n)
        Intercept = (sumY / n) - (Slope * (sumX / n));

        Start = GetYValue(cachedData.Min(a => a.Item1));
        End = GetYValue(cachedData.Max(a => a.Item1));
    }

    public decimal Slope { get; private set; }
    public decimal Intercept { get; private set; }
    public decimal Start { get; private set; }
    public decimal End { get; private set; }

    public decimal GetYValue(decimal xValue)
    {
        return Intercept + Slope * xValue;
    }
}

Here's a working example in golang. I searched around and found this page and converted this over to what I needed. Hope someone else can find it useful.

// https://classroom.synonym.com/calculate-trendline-2709.html
package main

import (
    "fmt"
    "math"
)

func main() {

    graph := [][]float64{
        {1, 3},
        {2, 5},
        {3, 6.5},
    }

    n := len(graph)

    // get the slope
    var a float64
    var b float64
    var bx float64
    var by float64
    var c float64
    var d float64
    var slope float64

    for _, point := range graph {

        a += point[0] * point[1]
        bx += point[0]
        by += point[1]
        c += math.Pow(point[0], 2)
        d += point[0]

    }

    a *= float64(n)           // 97.5
    b = bx * by               // 87
    c *= float64(n)           // 42
    d = math.Pow(d, 2)        // 36
    slope = (a - b) / (c - d) // 1.75

    // calculating the y-intercept (b) of the Trendline
    var e float64
    var f float64

    e = by                            // 14.5
    f = slope * bx                    // 10.5
    intercept := (e - f) / float64(n) // (14.5 - 10.5) / 3 = 1.3

    // output
    fmt.Println(slope)
    fmt.Println(intercept)

}

If anyone needs the JS code for calculating the trendline of many points on a graph, here's what worked for us in the end:

_x000D_
_x000D_
/**@typedef {{_x000D_
 * x: Number;_x000D_
 * y:Number;_x000D_
 * }} Point_x000D_
 * @param {Point[]} data_x000D_
 * @returns {Function} */_x000D_
function _getTrendlineEq(data) {_x000D_
    const xySum = data.reduce((acc, item) => {_x000D_
        const xy = item.x * item.y_x000D_
        acc += xy_x000D_
        return acc_x000D_
    }, 0)_x000D_
    const xSum = data.reduce((acc, item) => {_x000D_
        acc += item.x_x000D_
        return acc_x000D_
    }, 0)_x000D_
    const ySum = data.reduce((acc, item) => {_x000D_
        acc += item.y_x000D_
        return acc_x000D_
    }, 0)_x000D_
    const aTop = (data.length * xySum) - (xSum * ySum)_x000D_
    const xSquaredSum = data.reduce((acc, item) => {_x000D_
        const xSquared = item.x * item.x_x000D_
        acc += xSquared_x000D_
        return acc_x000D_
    }, 0)_x000D_
    const aBottom = (data.length * xSquaredSum) - (xSum * xSum)_x000D_
    const a = aTop / aBottom_x000D_
    const bTop = ySum - (a * xSum)_x000D_
    const b = bTop / data.length_x000D_
    return function trendline(x) {_x000D_
        return a * x + b_x000D_
    }_x000D_
}
_x000D_
_x000D_
_x000D_

It takes an array of (x,y) points and returns the function of a y given a certain x Have fun :)


Thank You so much for the solution, I was scratching my head.
Here's how I applied the solution in Excel.
I successfully used the two functions given by MUHD in Excel:
a = (sum(x*y) - sum(x)sum(y)/n) / (sum(x^2) - sum(x)^2/n)
b = sum(y)/n - b(sum(x)/n)
(careful my a and b are the b and a in MUHD's solution).

- Made 4 columns, for example:
NB: my values y values are in B3:B17, so I have n=15;
my x values are 1,2,3,4...15.
1. Column B: Known x's
2. Column C: Known y's
3. Column D: The computed trend line
4. Column E: B values * C values (E3=B3*C3, E4=B4*C4, ..., E17=B17*C17)
5. Column F: x squared values
I then sum the columns B,C and E, the sums go in line 18 for me, so I have B18 as sum of Xs, C18 as sum of Ys, E18 as sum of X*Y, and F18 as sum of squares.
To compute a, enter the followin formula in any cell (F35 for me):
F35=(E18-(B18*C18)/15)/(F18-(B18*B18)/15)
To compute b (in F36 for me):
F36=C18/15-F35*(B18/15)
Column D values, computing the trend line according to the y = ax + b:
D3=$F$35*B3+$F$36, D4=$F$35*B4+$F$36 and so on (until D17 for me).

Select the column datas (C2:D17) to make the graph.
HTH.


Regarding a previous answer

if (B) y = offset + slope*x

then (C) offset = y/(slope*x) is wrong

(C) should be:

offset = y-(slope*x)

See: http://zedgraph.org/wiki/index.php?title=Trend


Thanks to all for your help - I was off this issue for a couple of days and just came back to it - was able to cobble this together - not the most elegant code, but it works for my purposes - thought I'd share if anyone else encounters this issue:

public class Statistics
{
    public Trendline CalculateLinearRegression(int[] values)
    {
        var yAxisValues = new List<int>();
        var xAxisValues = new List<int>();

        for (int i = 0; i < values.Length; i++)
        {
            yAxisValues.Add(values[i]);
            xAxisValues.Add(i + 1);
        }

        return new Trendline(yAxisValues, xAxisValues);
    }
}

public class Trendline
{
    private readonly IList<int> xAxisValues;
    private readonly IList<int> yAxisValues;
    private int count;
    private int xAxisValuesSum;
    private int xxSum;
    private int xySum;
    private int yAxisValuesSum;

    public Trendline(IList<int> yAxisValues, IList<int> xAxisValues)
    {
        this.yAxisValues = yAxisValues;
        this.xAxisValues = xAxisValues;

        this.Initialize();
    }

    public int Slope { get; private set; }
    public int Intercept { get; private set; }
    public int Start { get; private set; }
    public int End { get; private set; }

    private void Initialize()
    {
        this.count = this.yAxisValues.Count;
        this.yAxisValuesSum = this.yAxisValues.Sum();
        this.xAxisValuesSum = this.xAxisValues.Sum();
        this.xxSum = 0;
        this.xySum = 0;

        for (int i = 0; i < this.count; i++)
        {
            this.xySum += (this.xAxisValues[i]*this.yAxisValues[i]);
            this.xxSum += (this.xAxisValues[i]*this.xAxisValues[i]);
        }

        this.Slope = this.CalculateSlope();
        this.Intercept = this.CalculateIntercept();
        this.Start = this.CalculateStart();
        this.End = this.CalculateEnd();
    }

    private int CalculateSlope()
    {
        try
        {
            return ((this.count*this.xySum) - (this.xAxisValuesSum*this.yAxisValuesSum))/((this.count*this.xxSum) - (this.xAxisValuesSum*this.xAxisValuesSum));
        }
        catch (DivideByZeroException)
        {
            return 0;
        }
    }

    private int CalculateIntercept()
    {
        return (this.yAxisValuesSum - (this.Slope*this.xAxisValuesSum))/this.count;
    }

    private int CalculateStart()
    {
        return (this.Slope*this.xAxisValues.First()) + this.Intercept;
    }

    private int CalculateEnd()
    {
        return (this.Slope*this.xAxisValues.Last()) + this.Intercept;
    }
}

OK, here's my best pseudo math:

The equation for your line is:

Y = a + bX

Where:

b = (sum(x*y) - sum(x)sum(y)/n) / (sum(x^2) - sum(x)^2/n)

a = sum(y)/n - b(sum(x)/n)

Where sum(xy) is the sum of all x*y etc. Not particularly clear I concede, but it's the best I can do without a sigma symbol :)

... and now with added Sigma

b = (Σ(xy) - (ΣxΣy)/n) / (Σ(x^2) - (Σx)^2/n)

a = (Σy)/n - b((Σx)/n)

Where Σ(xy) is the sum of all x*y etc. and n is the number of points


If you have access to Excel, look in the "Statistical Functions" section of the Function Reference within Help. For straight-line best-fit, you need SLOPE and INTERCEPT and the equations are right there.

Oh, hang on, they're also defined online here: http://office.microsoft.com/en-us/excel/HP052092641033.aspx for SLOPE, and there's a link to INTERCEPT. OF course, that assumes MS don't move the page, in which case try Googling for something like "SLOPE INTERCEPT EQUATION Excel site:microsoft.com" - the link given turned out third just now.


Examples related to c#

How can I convert this one line of ActionScript to C#? Microsoft Advertising SDK doesn't deliverer ads How to use a global array in C#? How to correctly write async method? C# - insert values from file into two arrays Uploading into folder in FTP? Are these methods thread safe? dotnet ef not found in .NET Core 3 HTTP Error 500.30 - ANCM In-Process Start Failure Best way to "push" into C# array

Examples related to math

How to do perspective fixing? How to pad a string with leading zeros in Python 3 How can I use "e" (Euler's number) and power operation in python 2.7 numpy max vs amax vs maximum Efficiently getting all divisors of a given number Using atan2 to find angle between two vectors How to calculate percentage when old value is ZERO Finding square root without using sqrt function? Exponentiation in Python - should I prefer ** operator instead of math.pow and math.sqrt? How do I get the total number of unique pairs of a set in the database?

Examples related to graph

How to plot multiple functions on the same figure, in Matplotlib? Python equivalent to 'hold on' in Matlab How to combine 2 plots (ggplot) into one plot? how to draw directed graphs using networkx in python? What is the difference between dynamic programming and greedy approach? Plotting using a CSV file Python equivalent of D3.js Count number of times a date occurs and make a graph out of it How do I create a chart with multiple series using different X values for each series? Rotating x axis labels in R for barplot