基于
Wikipedia描述计算方差和标准差与R中的标准函数var()和sd()相比得到了不同的结果.
差异:4对4.571429.标准差:2对2.13809.
有人建议还是解释?
> df <- c(2,4,4,4,5,5,7,9)
> df.length <- length(df)
> df.length
[1] 8
> df.mean <- sum(df) / df.length
> df.mean
[1] 5
> df.difference <- (df - df.mean)**2
> df.difference
[1] 9 1 1 1 0 0 4 16
> sum(df.difference)
[1] 32
> df.variance <- sum(df.difference) / df.length
> df.variance
[1] 4
> df.standard.deviation <- sqrt(df.variance)
> df.standard.deviation
[1] 2
> # mean, var and sd (default R)
> mean(df)
[1] 5
> var(df)
[1] 4.571429
> sd(df)
[1] 2.13809
最佳答案 它是除以n或(n-1)自由度之间的差异.
>df <- c(2,4,4,4,5,5,7,9)
> var(df)
[1] 4.571429
> sum((df-mean(df))^2/length(df))
[1] 4
> sum((df-mean(df))^2/(length(df)-1))
[1] 4.571429
它是n-1,因为…直接从维基百科复制(link)
A common way to think of degrees of freedom is as the number of
independent pieces of information available to estimate another piece
of information. More concretely, the number of degrees of freedom is
the number of independent observations in a sample of data that are
available to estimate a parameter of the population from which that
sample is drawn. For example, if we have two observations, when
calculating the mean we have two independent observations; however,
when calculating the variance, we have only one independent
observation, since the two observations are equally distant from the
mean.