I've taken Irritate's answer and refactored it to minimize per-value computation by collapsing the constant parts of the formula into two precomputed constants. The motivation is to let a scaler be trained on one set of data and then applied to new data (for an ML algorithm). In usage it's much like scikit-learn's preprocessing.MinMaxScaler in Python.
Thus, x' = (b-a)(x-min)/(max-min) + a (where b != a and max != min, so the denominator is nonzero)
becomes x' = x(b-a)/(max-min) + min(a-b)/(max-min) + a,
which can be reduced to two constants in the form x' = x*Part1 + Part2.
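To sanity-check the factoring with concrete (purely illustrative) numbers: suppose the training column has min = 10 and max = 30, and the target range is [a, b] = [0, 1]. Then
Part1 = (1-0)/(30-10) = 0.05
Part2 = 0 + 10*(0-1)/(30-10) = -0.5
so x = 10 maps to 10*0.05 - 0.5 = 0, x = 20 maps to 0.5, and x = 30 maps to 1, exactly as the unfactored formula gives.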
Here's a C# implementation with two constructors: one to train, and one to reload a trained instance (e.g., to support persistence).
using System;
using System.Linq;

public class MinMaxColumnSpec
{
    /// <summary>
    /// To reduce repetitive computation, the min-max formula has been refactored so that the
    /// portions that remain constant are computed only once. This transforms the formula
    ///     x' = (b-a)(x-min)/(max-min) + a
    /// into
    ///     x' = x(b-a)/(max-min) + min(a-b)/(max-min) + a
    /// which can be further factored into
    ///     x' = x*Part1 + Part2
    /// </summary>
    public readonly double Part1, Part2;

    /// <summary>
    /// Use this ctor to train a new scaler from a column of values.
    /// </summary>
    public MinMaxColumnSpec(double[] columnValues, int newMin = 0, int newMax = 1)
    {
        if (newMax <= newMin)
            throw new ArgumentOutOfRangeException(nameof(newMax), "newMax must be greater than newMin");

        var oldMax = columnValues.Max();
        var oldMin = columnValues.Min();

        // Guard against a zero denominator (a constant column can't be min-max scaled).
        if (oldMax == oldMin)
            throw new ArgumentException("columnValues must contain at least two distinct values", nameof(columnValues));

        Part1 = (newMax - newMin) / (oldMax - oldMin);
        Part2 = newMin + (oldMin * (newMin - newMax) / (oldMax - oldMin));
    }

    /// <summary>
    /// Use this ctor to reload a previously-trained scaler from its known constants (e.g., read back from storage).
    /// </summary>
    public MinMaxColumnSpec(double part1, double part2)
    {
        Part1 = part1;
        Part2 = part2;
    }

    /// <summary>
    /// Applies the precomputed constants to scale a single value into [newMin, newMax].
    /// </summary>
    public double Scale(double x) => (x * Part1) + Part2;
}
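And a short usage sketch showing training on one column and later reloading from the persisted constants (the column values and the Example wrapper are just illustrative, not part of the class itself):

using System;

public static class Example
{
    public static void Main()
    {
        // Train on one column of the training set (values are illustrative).
        var trainingColumn = new[] { 3.0, 7.5, 12.0, 4.2 };
        var scaler = new MinMaxColumnSpec(trainingColumn);

        // Persist the two constants however you like (DB row, JSON, etc.).
        double part1 = scaler.Part1, part2 = scaler.Part2;

        // Later, rebuild the scaler from the stored constants and apply
        // the same transform to new data without re-training.
        var reloaded = new MinMaxColumnSpec(part1, part2);
        Console.WriteLine(reloaded.Scale(9.3)); // ≈ 0.7 for this training column
    }
}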