Essentials of Python's statistics module

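I am not fully confident in my grasp of some statistical terms and worry that a few may be mistranslated, so I have kept the original English wherever I was unsure. If you spot a translation error, please point it out. Thanks!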
1. mean()
Computes the arithmetic mean. Note that a map object is a one-shot iterator: once list(y) has consumed it, mean(y) has no data left and raises StatisticsError.
>>> import statistics
>>> statistics.mean([1, 2, 3, 4, 5, 6, 7, 8, 9])
5.0
>>> statistics.mean(range(1,10))
5.0
>>> import fractions
>>> x = [(3, 7), (1, 21), (5, 3), (1, 3)]
>>> y = [fractions.Fraction(*item) for item in x]
>>> y
[Fraction(3, 7), Fraction(1, 21), Fraction(5, 3), Fraction(1, 3)]
>>> statistics.mean(y)
Fraction(13, 21)
>>> import decimal
>>> x = ('0.5', '0.75', '0.625', '0.375')
>>> y = map(decimal.Decimal, x)
>>> y
<map object at 0x00000000033465C0>
>>> list(y)
[Decimal('0.5'), Decimal('0.75'), Decimal('0.625'), Decimal('0.375')]
>>> statistics.mean(y)
Traceback (most recent call last):
  File "<pyshell#411>", line 1, in <module>
    statistics.mean(y)
  File "C:\Python 3.5\lib\statistics.py", line 292, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
>>> list(y)
[]
>>> y = map(decimal.Decimal, x)
>>> statistics.mean(y)
Decimal('0.5625')
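The StatisticsError in the session above comes from reusing an exhausted iterator. The following is a minimal sketch of one way to avoid it, assuming the data fits in memory: build a list once and reuse it. The variable names are illustrative only.

```python
import decimal
import statistics

raw = ("0.5", "0.75", "0.625", "0.375")

# A map object is a one-shot iterator: once consumed, it yields nothing more.
lazy = map(decimal.Decimal, raw)
first_pass = list(lazy)    # consumes the iterator
second_pass = list(lazy)   # already exhausted, so this list is empty

# Materializing the values once lets them be reused safely.
values = [decimal.Decimal(s) for s in raw]
average = statistics.mean(values)
print(average)             # 0.5625
```

Because values is a list, calling statistics.mean(values) a second time gives the same result instead of raising.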
2. median(), median_low(), median_high(), median_grouped()
Various medians. median() returns the middle value, or the average of the two middle values when the number of data points is even. median_low() and median_high() return the lower and the higher of the two middle values respectively, so their result is always a member of the data set. median_grouped() estimates the median of grouped continuous data by interpolation, with interval giving the class width.
>>> statistics.median([1, 3, 5, 7])
4.0
>>> statistics.median_low([1, 3, 5, 7])
3
>>> statistics.median_high([1, 3, 5, 7])
5
>>> statistics.median([1, 3, 7])
3
>>> statistics.median([5, 3, 7])
5
>>> statistics.median(range(1,10))
5
>>> statistics.median_low([5, 3, 7])
5
>>> statistics.median_high([5, 3, 7])
5
>>> statistics.median_grouped([5, 3, 7])
5.0
>>> statistics.median_grouped([5, 3, 7, 1])
4.5
>>> statistics.median_grouped([52, 52, 53, 54])
52.5
>>> statistics.median_low([52, 52, 53, 54])
52
>>> statistics.median_high([52, 52, 53, 54])
53
>>> statistics.median_high([1, 3, 3, 5, 7])
3
>>> statistics.median_low([1, 3, 3, 5, 7])
3
>>> statistics.median_grouped([1, 3, 3, 5, 7])
3.25
>>> statistics.median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
3.7
>>> statistics.median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5], interval=2)
3.4
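The median_grouped() results above are consistent with the usual interpolation formula for grouped data, L + interval * (n/2 - cf) / f, where L is the lower limit of the median class, cf is the number of values below that class, and f is the number of values in it. The helper below, grouped_median, is a hypothetical re-implementation written for illustration; it is a sketch for numeric data, not the library's actual source.

```python
import statistics

def grouped_median(data, interval=1):
    """Estimate the median of grouped data by linear interpolation."""
    values = sorted(data)
    n = len(values)
    if n == 0:
        raise ValueError("no data points")
    x = values[n // 2]           # midpoint of the class containing the median
    lower = x - interval / 2     # L: lower limit of that class
    cf = values.index(x)         # number of values below the median class
    f = values.count(x)          # number of values inside the median class
    return lower + interval * (n / 2 - cf) / f

sample = [1, 3, 3, 5, 7]
print(grouped_median(sample))             # 3.25
print(statistics.median_grouped(sample))  # 3.25
```

For [1, 3, 3, 5, 7] the median class is centered on 3, so L = 2.5, cf = 1, f = 2 and the estimate is 2.5 + (2.5 - 1) / 2 = 3.25.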
3. mode()
Returns the most common value in the data, that is, the value that occurs most often. The session below was recorded on Python 3.5, where mode() raises StatisticsError if several values are equally common. Since Python 3.8, mode() returns the first such value encountered instead of raising.
>>> statistics.mode([1, 3, 5, 7])
Traceback (most recent call last):
  File "<pyshell#435>", line 1, in <module>
    statistics.mode([1, 3, 5, 7])
  File "C:\Python 3.5\lib\statistics.py", line 434, in mode
    'no unique mode; found %d equally common values' % len(table)
statistics.StatisticsError: no unique mode; found 4 equally common values
>>> statistics.mode([1, 3, 5, 7, 3])
3
>>> statistics.mode([1, 3, 5, 7, 3, 5])
Traceback (most recent call last):
  File "<pyshell#437>", line 1, in <module>
    statistics.mode([1, 3, 5, 7, 3, 5])
  File "C:\Python 3.5\lib\statistics.py", line 434, in mode
    'no unique mode; found %d equally common values' % len(table)
statistics.StatisticsError: no unique mode; found 2 equally common values
>>> statistics.mode([1, 3, 5, 7, 3, 5, 5])
5
>>> statistics.mode(["red", "blue", "blue", "red", "green", "red", "red"])
'red'
>>> statistics.mode(list(range(5)) + [3])
3
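When several values are equally common, it can be useful to retrieve all of them rather than handle an exception. The sketch below uses collections.Counter for that; all_modes is a hypothetical helper name, not part of the statistics module. On Python 3.8 and later, statistics.multimode() covers the same need.

```python
from collections import Counter

def all_modes(data):
    """Return every value that shares the highest count."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    # On Python 3.7+ the result follows the order of first occurrence.
    return [value for value, count in counts.items() if count == top]

print(all_modes([1, 3, 5, 7, 3, 5]))     # [3, 5]
print(all_modes([1, 3, 5, 7, 3, 5, 5]))  # [5]
```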
4. pstdev()
Returns the population standard deviation, which is the square root of the population variance.
>>> statistics.pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
0.986893273527251
>>> statistics.pstdev(range(20))
5.766281297335398
>>> statistics.pstdev([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
2.8722813232690143
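To make the definition concrete, the following sketch computes the population standard deviation by hand (mean of squared deviations, then square root) and compares it with statistics.pstdev(). It assumes plain float data; the variable names are illustrative.

```python
import math
import statistics

data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

mean = sum(data) / len(data)
# Population variance: average squared deviation, dividing by n (not n - 1).
pop_var = sum((x - mean) ** 2 for x in data) / len(data)
pop_sd = math.sqrt(pop_var)

# The hand computation agrees with the library up to rounding error.
print(math.isclose(pop_sd, statistics.pstdev(data)))  # True
```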
5. pvariance()
Returns the population variance, also called the second moment about the mean. The optional second argument is the mean of the data, which can be passed in if it has already been computed.
>>> import random
>>> statistics.pvariance([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
0.9739583333333334
>>> statistics.pvariance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
8.25
>>> x = [1, 2, 3, 4, 5, 10, 9, 8, 7, 6]
>>> mu = statistics.mean(x)
>>> mu
5.5
>>> statistics.pvariance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6], mu)
8.25
>>> statistics.pvariance(range(20))
33.25
>>> statistics.pvariance((random.randint(1,10000) for i in range(30)))
10903549.933333334
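As a small illustration of the second argument, the sketch below computes the mean once and reuses it, then checks the result against the definition. It also shows that pvariance() keeps exact arithmetic when given Fraction values. The variable names are illustrative, and mu must be the true mean of the data for the result to be meaningful.

```python
import statistics
from fractions import Fraction

data = [1, 2, 3, 4, 5, 10, 9, 8, 7, 6]

# Compute the mean once and pass it in, so pvariance() need not recompute it.
mu = statistics.mean(data)
with_mu = statistics.pvariance(data, mu)

# Definition: average of squared deviations from the mean.
by_hand = sum((x - mu) ** 2 for x in data) / len(data)
print(with_mu, by_hand)  # 8.25 8.25

# With Fraction input the result stays an exact Fraction.
exact = statistics.pvariance([Fraction(1, 4), Fraction(5, 4), Fraction(1, 2)])
print(exact)             # 13/72
```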
6. variance(), stdev()
Compute the sample variance and the sample standard deviation. The sample standard deviation is the square root of the sample variance, and is sometimes loosely called the root mean square deviation.
>>> statistics.variance((random.randint(1,10000) for i in range(30)))
10229013.655172413
>>> statistics.stdev((random.randint(1,10000) for i in range(30)))
3106.2902337180203
>>> _ * _  # note: the two samples above are not the same data, because each is random
9649039.016091954
>>> statistics.variance(range(20))
35.0
>>> statistics.stdev(range(20))
5.916079783099616
>>> _ * _
35.0
>>> statistics.variance([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
9.166666666666666
>>> statistics.stdev([1, 2, 3, 4, 5, 10, 9, 8, 7, 6])
3.0276503540974917
>>> statistics.variance([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.16875
>>> statistics.stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.0810874155219827
>>> _ * _
1.1687500000000002
>>> statistics.variance([3, 3, 3, 3, 3, 3])
0.0
>>> statistics.stdev([3, 3, 3, 3, 3, 3])
0.0
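The sample and population figures for the same data differ only in the divisor: the sample variance divides the sum of squared deviations by n - 1, the population variance by n. The sketch below makes that explicit for range(20), where the two values are 35.0 and 33.25. The helper sum_sq_dev is hypothetical and written only for illustration.

```python
import statistics

def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

data = list(range(20))
n = len(data)
ss = sum_sq_dev(data)

sample_var = ss / (n - 1)   # divisor n - 1 (Bessel's correction)
population_var = ss / n     # divisor n

print(sample_var, population_var)  # 35.0 33.25
# Equivalent view: sample variance = population variance * n / (n - 1).
print(statistics.pvariance(data) * n / (n - 1))
```

For large n the two values converge, so the choice matters mostly for small samples.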

Original post: http://user.qzone.qq.com/306467355/blog/1446598412


    Original author: dongfuguo
    Original URL: https://blog.csdn.net/dongfuguo/article/details/50163757