Calculate standard deviation using AWK

The standard deviation σ (sigma) is the square root of the average value of (X − μ)².

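In the case where X takes random values from a finite data set x_1, x_2, …, x_N, with each value having the same probability, the standard deviation is

  \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}

  where \mu (mu) is the mean of the data set: \mu = \frac{1}{N} \sum_{i=1}^{N} x_i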

Assume we have an input file foo with, for example, the line number in the first column and the values of interest in the second column ($2 in awk).

File: foo

1 2
2 3
3 6
4 8
5 11
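To follow along, the sample file can be created from the shell. This is only a convenience sketch: it assumes a POSIX shell and writes foo into the current directory, overwriting any existing file of that name.

```shell
# Create the sample file foo: line number in column 1, value in column 2.
# Note: this overwrites any existing file named foo in the current directory.
printf '%s\n' '1 2' '2 3' '3 6' '4 8' '5 11' > foo

# Show the file to confirm its contents.
cat foo
```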

Use one of the following awk commands to calculate the standard deviation:

awk '{sum+=$2; array[NR]=$2} END {for(x=1;x<=NR;x++){sumsq+=((array[x]-(sum/NR))^2);}print sqrt(sumsq/NR)}' foo

awk '{sum+=$2; sumsq+=$2*$2} END {print sqrt(sumsq/NR - (sum/NR)^2)}' foo
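A third variant, not one of the original commands, is sketched here for comparison. The first command above keeps every value in memory. The second can lose precision when the mean is large relative to the spread, because it subtracts two nearly equal numbers. A single-pass running update (commonly known as Welford's method) avoids both. The function name stddev_welford is a hypothetical helper chosen for illustration; it reads the file named as its argument, or standard input if none is given.

```shell
# Hypothetical helper (name is illustrative): population standard deviation
# of column 2, computed in one pass with a running mean.
stddev_welford() {
  awk '{
    delta = $2 - mean           # distance from the running mean so far
    mean += delta / NR          # update the running mean with the new value
    m2 += delta * ($2 - mean)   # accumulate the sum of squared deviations
  } END {
    if (NR > 0) print sqrt(m2 / NR)   # divide by N (population form)
  }' "$@"
}

# Feed the sample data on standard input; "stddev_welford foo" works the same way.
printf '%s\n' '1 2' '2 3' '3 6' '4 8' '5 11' | stddev_welford
```

On the sample data this should print the same value as the two commands above.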

The result is

3.28634
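The value can be checked by hand from the second column of foo (2, 3, 6, 8, 11). Both commands divide by N = 5, so this is the population standard deviation. The last line shows the same result in the form used by the second command (mean of squares minus square of the mean).

```latex
\mu = \frac{2 + 3 + 6 + 8 + 11}{5} = \frac{30}{5} = 6

\sigma = \sqrt{\frac{(-4)^2 + (-3)^2 + 0^2 + 2^2 + 5^2}{5}}
       = \sqrt{\frac{54}{5}} = \sqrt{10.8} \approx 3.28634

\sigma = \sqrt{\frac{2^2 + 3^2 + 6^2 + 8^2 + 11^2}{5} - 6^2}
       = \sqrt{46.8 - 36} = \sqrt{10.8} \approx 3.28634
```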

Average

Here you may find how to calculate the average or arithmetic mean using AWK.

Minimum and maximum

Here you may find how to calculate the minimum and maximum values using AWK.