关于重复行问题:
在SQL Server中则可以自动排出重复行,不需要处理。在Oracle中经常遇到upsert语法,来排出冲突行。在PostgreSQL中,也需要手动排出重复行,否则会爆出错误,upsert用法如下:
WITH upsert AS ( UPDATE TEST AS a SET updatetime=b.updatetime, msec=b.msec FROM TEMP_WRITEEVTUPDATETMSTAMP AS b WHERE a.deviceid=b.deviceid AND a.eventtype=b.eventtype RETURNING a.deviceid,a.eventtype ) INSERT INTO TEST SELECT b.deviceid, b.eventtype, b.updatetime,b.msec FROM TEMP_WRITEEVTUPDATETMSTAMP b WHERE NOT EXISTS (SELECT 1 FROM upsert as a WHERE a.deviceid=b.deviceid AND a.eventtype=b.eventtype);
经过测试,发现这种方法效率很低,于是将upsert过程拆分为两个过程,先insert不重复的数据,然后再执行update:
INSERT INTO TEST Select b.* from TEMP_WRITEEVTUPDATETMSTAMP b, (select deviceid, eventtype, max(ctid) max_ctid from TEMP_WRITEEVTUPDATETMSTAMP group by deviceid, eventtype) c where (b.deviceid = c.deviceid and b.eventtype = c.eventtype and b.ctid = c.max_ctid) and NOT EXISTS (SELECT 1 FROM TEST as a WHERE a.deviceid=b.deviceid AND a.eventtype=b.even UPDATE TEST AS a SET updatetime=b.updatetime, msec=b.msec FROM TEMP_WRITEEVTUPDATETMSTAMP AS b WHERE a.deviceid=b.deviceid AND a.eventtype=b.eventtype;
当时在存储过程中大量使用该方法来去除重复数据的插入,但是最后发现还有更好的方法来处理,upsert:
INSERT INTO TEST select * from TEMP_WRITEEVTUPDATETMSTAMP on conflict pk_test_deviceid_eventtype do update set updatetime = excluded.updatetime, msec = excluded.msec;