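There are existing questions about Postgres not using an index for ORDER BY, but my case is the opposite: it uses the index where it should not.
Sort without an index, hot run after the results were cached. Takes 8.48 seconds.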
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246372.98..246622.98 rows=100000 width=72) (actual time=8451.119..8479.138 rows=100000 loops=1)
Buffers: shared hit=16134 read=35121
-> Sort (cost=246372.98..251348.03 rows=1990021 width=72) (actual time=8451.117..8467.403 rows=100000 loops=1)
Sort Key: userid
Sort Method: top-N heapsort Memory: 20207kB
Buffers: shared hit=16134 read=35121
-> Seq Scan on users (cost=0.00..71155.21 rows=1990021 width=72) (actual time=25.448..7782.830 rows=1995958 loops=1)
Buffers: shared hit=16134 read=35121
Planning time: 40.542 ms
Execution time: 8487.556 ms
(10 rows)
Sort using the index on the userid column. It does more disk I/O and takes up to 6.2 minutes.
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12771.83 rows=100000 width=72) (actual time=35.498..372437.748 rows=100000 loops=1)
Buffers: shared hit=60846 read=39425
-> Index Scan using users_userid_idx on users (cost=0.43..255288.96 rows=1998907 width=72) (actual time=35.496..372372.192 rows=100000 loops=1)
Buffers: shared hit=60846 read=39425
Planning time: 0.160 ms
Execution time: 372476.536 ms
(6 rows)
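A few things to note:
> I ran VACUUM ANALYZE before running both queries.
> Both are hot runs, i.e. I took the timings after running each query 3-4 times.
> There is enough work_mem, so it uses a top-N heapsort. The problem, though, is that the sort without the index is faster.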
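For anyone reproducing these conditions, a minimal sketch of the preparation steps follows. It assumes the index already exists and uses the `enable_indexscan` planner setting, scoped to one transaction, to obtain the sort plan for comparison. That toggle is an assumption about how one might do it, not necessarily how the plans above were produced.

```sql
-- Refresh planner statistics before timing anything.
VACUUM ANALYZE users;

-- Check the memory available to each sort; the top-N heapsort above used about 20 MB.
SHOW work_mem;

-- Assumption: discourage index scans for this transaction only, so the
-- planner falls back to Seq Scan + Sort without dropping users_userid_idx.
BEGIN;
SET LOCAL enable_indexscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users ORDER BY userid LIMIT 100000;
ROLLBACK;
```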
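My question is not about improving the ORDER BY but about understanding why the planner misestimates. At the time of writing, I ran these queries on Postgres 9.4 on Mac OS X. I did not have another machine with a different OS to test on at that moment, though I may soon.
Can anyone confirm whether this is a planner error or whether something is wrong with my machine?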
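One way to see where the estimate comes from: the planner charges a Limit node the child's startup cost plus the fraction of the child's run cost that corresponds to the rows it expects to fetch. The sketch below redoes that arithmetic using only numbers from the Index Scan line above. The displayed costs are rounded, so the result matches only approximately. The remaining statements list inputs that commonly influence an index-scan estimate; which of them matters here is not established by the plans.

```sql
-- startup + (total - startup) * limit_rows / estimated_rows
SELECT 0.43 + (255288.96 - 0.43) * 100000 / 1998907 AS estimated_limit_cost;

-- Cost settings that feed the index-scan estimate
SHOW random_page_cost;
SHOW effective_cache_size;

-- How closely the heap order follows userid, as seen by the planner
-- (values near 1 or -1 mean an index scan reads the heap almost sequentially)
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'users' AND attname = 'userid';
```

The first query lands close to the 12771.83 shown for the Limit node, far below the roughly 246623 estimated for the sort plan, which is why the index scan is chosen. The cost model counts page fetches; it has no notion of how long a fetch actually takes on a given disk at a given moment.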
Best answer: I am quite dismayed by what actually happened. I took the following steps:
> Restarted my Mac
> Changed shared_buffers to 256 MB (previously 128 MB)
> Restarted Postgres
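The shared_buffers change can be made in postgresql.conf or, on 9.4 and later, with ALTER SYSTEM. A sketch of the latter, which may not be how it was done here:

```sql
-- Current value
SHOW shared_buffers;

-- Requires superuser; the value is written to postgresql.auto.conf
ALTER SYSTEM SET shared_buffers = '256MB';

-- shared_buffers is only read at server start, so restart Postgres afterwards.
```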
After doing these, here are the new stats.
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12788.49 rows=100000 width=72) (actual time=0.031..78.785 rows=100000 loops=1)
Buffers: shared hit=100271
-> Index Scan using users_userid_idx on users (cost=0.43..255244.73 rows=1995958 width=72) (actual time=0.030..65.937 rows=100000 loops=1)
Buffers: shared hit=100271
Planning time: 0.119 ms
Execution time: 84.985 ms
(6 rows)
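The only change is that there is no disk I/O because everything is cached, probably thanks to the larger shared_buffers. But the change in actual time defies logic.
The plain top-N heapsort without the index also improved.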
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246955.09..247205.09 rows=100000 width=72) (actual time=707.350..734.954 rows=100000 loops=1)
Buffers: shared hit=26071 read=25184
-> Sort (cost=246955.09..251944.99 rows=1995958 width=72) (actual time=707.348..723.127 rows=100000 loops=1)
Sort Key: userid
Sort Method: top-N heapsort Memory: 20207kB
Buffers: shared hit=26071 read=25184
-> Seq Scan on users (cost=0.00..71214.58 rows=1995958 width=72) (actual time=9.922..270.684 rows=1995958 loops=1)
Buffers: shared hit=26071 read=25184
Planning time: 0.090 ms
Execution time: 743.788 ms
(10 rows)
With shared_buffers changed back to 128 MB, the results are still good.
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..12788.49 rows=100000 width=72) (actual time=0.098..232.314 rows=100000 loops=1)
Buffers: shared hit=61313 read=38958
-> Index Scan using users_userid_idx on users (cost=0.43..255244.73 rows=1995958 width=72) (actual time=0.096..218.272 rows=100000 loops=1)
Buffers: shared hit=61313 read=38958
Planning time: 0.131 ms
Execution time: 238.861 ms
(6 rows)
explain (analyze,buffers) select * from users order by userid limit 100000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=246955.09..247205.09 rows=100000 width=72) (actual time=722.003..749.696 rows=100000 loops=1)
Buffers: shared hit=16192 read=35063
-> Sort (cost=246955.09..251944.99 rows=1995958 width=72) (actual time=722.001..737.715 rows=100000 loops=1)
Sort Key: userid
Sort Method: top-N heapsort Memory: 20207kB
Buffers: shared hit=16192 read=35063
-> Seq Scan on users (cost=0.00..71214.58 rows=1995958 width=72) (actual time=8.584..294.605 rows=1995958 loops=1)
Buffers: shared hit=16192 read=35063
Planning time: 0.070 ms
Execution time: 757.495 ms
(10 rows)
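One detail worth noting in these plans: `read=` counts blocks that were not found in shared_buffers, and such a block can still be served from the operating system's page cache rather than from disk. Dividing the index-scan time by the number of reads gives a crude per-block figure (crude because it attributes all elapsed time to reads). The numbers come from the original 6.2-minute run and from the 128 MB rerun after the reboot.

```sql
-- Index Scan elapsed ms divided by blocks reported under "read="
SELECT round(372372.192 / 39425, 3) AS ms_per_read_first_run,
       round(218.272 / 38958, 4)    AS ms_per_read_after_reboot;
```

The first figure is several milliseconds per block, on the order of a physical random read. The second is a few microseconds, which suggests those reads were satisfied from memory even though they missed shared_buffers.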
I have heard people say not to take timing results on a Mac or desktop machine, but this is completely crazy.