Pragmatic Data Science | 4, feature scaling

Feature scaling is the task of taking a feature and rescaling it to a predefined interval. If this is done for all features, the data set can be said to be normalized. Feature scaling preserves the internal structure of the data: if one value was much bigger than another before the scaling, that relationship persists afterwards.

By using the min and max of all observed values of a feature, we can express each value relative to the largest value seen (100%) and the smallest value seen (0%). Thought of as percentages, the features have been scaled between 0 and 100. The target interval can differ depending on what you are trying to achieve, but the most common scenario is to scale features to [0, 1]. In that sense the scaled value can also be read as a percentage in decimal form.

There are different types of feature scaling, but a very common one is the min/max scaling we have just discussed. The new value x' is given by subtracting the min of x from x and dividing by the interval of x (the max of x minus the min of x): x' = (x - min(x)) / (max(x) - min(x)). Another common data normalization technique is mean normalization.
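To make this concrete, here is a minimal sketch in Python (the helper name min_max_scale and the example data are mine, not from the post; in practice scikit-learn's MinMaxScaler does the same job):

```python
import numpy as np

def min_max_scale(x, new_min=0.0, new_max=1.0):
    """Rescale a 1-D array to the interval [new_min, new_max]."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    # x' = (x - min(x)) / (max(x) - min(x)) maps to [0, 1] ...
    scaled = (x - x_min) / (x_max - x_min)
    # ... then stretch/shift to the requested target interval
    return scaled * (new_max - new_min) + new_min

ages = [18, 25, 40, 60]
print(min_max_scale(ages))  # approx. [0.0, 0.167, 0.524, 1.0]
```

Note that the smallest value always maps to the lower bound and the largest to the upper bound, matching the 0%/100% intuition above.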

Here, x' instead expresses how far x deviates from the mean, as a fraction of the interval of x: x' = (x - mean(x)) / (max(x) - min(x)). If x sits 20% of the interval below the mean, this normalization yields -0.2, while 20% above the mean yields 0.2. Therefore this normalization has the range [-1, 1].
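A sketch of this variant too, under the same assumptions as before (the helper name mean_normalize and the sample data are mine):

```python
import numpy as np

def mean_normalize(x):
    """x' = (x - mean(x)) / (max(x) - min(x)); results fall in [-1, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (x.max() - x.min())

ages = [18, 25, 40, 60]
print(mean_normalize(ages))  # approx. [-0.423, -0.256, 0.101, 0.577]
```

Unlike min/max scaling, the values are now centered on 0, which is often convenient for gradient-based learners.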