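【Description】
We used LoadRunner to run a LOB write stress test against an SDB cluster. The LOB pagesize is 256KB and each file is 200KB.
【Error messages】
Under sustained write pressure, some nodes in the cluster exit on their own.
The logs of the exited nodes contain the following errors.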
2019-02-19-16.26.44.197188 Level:SEVERE
PID:421798 TID:422062
Function:createNewEDU Line:760
File:SequoiaDB/engine/pmd/pmdEDUMgr.cpp
Message:
Failed to create new agent: boost::thread_resource_error: Resource temporarily unavailable
2019-02-19-16.26.44.197210 Level:ERROR
PID:421798 TID:422062
Function:createNewEDU Line:780
File:SequoiaDB/engine/pmd/pmdEDUMgr.cpp
Message:
Failed to create new agent, probe = 30
2019-02-19-16.26.44.197226 Level:ERROR
PID:421798 TID:422062
Function:_startSessionEDU Line:1018
File:SequoiaDB/engine/pmd/pmdAsyncSession.cpp
Message:
Failed to create subagent thread, rc: -10
2019-02-19-16.26.44.197251 Level:ERROR
PID:421798 TID:422062
Function:getSession Line:943
File:SequoiaDB/engine/pmd/pmdAsyncSession.cpp
Message:
Failed to start session EDU, rc = -10
2019-02-19-16.26.44.197268 Level:ERROR
PID:421798 TID:422062
Function:_handleSessionMsg Line:282
File:SequoiaDB/engine/pmd/pmdAsyncHandler.cpp
Message:
Failed to create session[ID:305002807972062], rc: -10
The log written when the node then exited:
2019-02-19-16.30.38.931076 Level:ERROR
PID:421798 TID:422310
Function:startJob Line:179
File:SequoiaDB/engine/rtn/rtnBackgroundJobBase.cpp
Message:
Start background job[Job[ReplSync]] failed, rc = -10
2019-02-19-16.30.38.931095 Level:SEVERE
PID:421798 TID:422310
Function:_pushData Line:501
File:SequoiaDB/engine/cls/clsReplBucket.cpp
Message:
Start repl-sync session failed, rc: -10.The node can't to sync data from other node, need to restart
2019-02-19-16.30.38.931116 Level:ERROR
PID:421798 TID:422310
Function:replayByBucket Line:241
File:SequoiaDB/engine/cls/clsReplayer.cpp
Message:
Failed to push log to bucket, rc: -10
2019-02-19-16.30.38.931154 Level:ERROR
PID:421798 TID:422310
Function:replayByBucket Line:295
File:SequoiaDB/engine/cls/clsReplayer.cpp
Message:
sync bucket: replay log [type:1, lsn:14481259800, data:
Version: 0x00000003(3)
LSN : 0x000000035f267d18(14481259800)
PreLSN : 0x000000035f2339ac(14481045932)
Length : 296
Type : INSERT(1)
FullName : SCM.SCMBATCH
Insert : { "_id": { "$oid": "5c6bbdd1e4b0a374109f238e" }, "ECM_BATCH_ID": "fb028a026e3c4d6f9e976897ecb38abf_test", "CREATE_DATE": "20190219162657", "ECM_S_VERSION": 1, "CREATE_USERID": "admin", "LAST_CHANGED_DATE": "20190219162657", "LAST_CHANGED_USERID": "admin", "IS_DELETED": 0 }
] failed, rc: -10
2019-02-19-16.30.38.931177 Level:ERROR
PID:421798 TID:422310
Function:_replayLog Line:1015
File:SequoiaDB/engine/cls/clsReplSession.cpp
Message:
Session[Type:Sync-Dest,NodeID:1075,TID:1]: Failed to replay log, rc: -10
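The full node logs and the node configuration files are in the attachment.
【Question】
What causes this -10 error?
Does the node exit on its own because it was unable to create the sync thread for a long time?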
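One thing that may be worth checking: the first SEVERE entry reports boost::thread_resource_error with "Resource temporarily unavailable", which reads like thread creation failing with EAGAIN. A common cause of that on Linux is a per-user process/thread limit, but that is only an assumption here and is not confirmed by the logs. The sketch below, Linux only, compares a node's current thread count with its soft "Max processes" limit by reading /proc/<pid>/status and /proc/<pid>/limits. The helper names are made up for illustration and are not part of SequoiaDB.

```python
import sys


def parse_thread_count(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    return None


def parse_soft_nproc(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits.

    Returns None when the limit is 'unlimited' or the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            soft = line.split()[2]  # columns: Max, processes, soft, hard, units
            return None if soft == "unlimited" else int(soft)
    return None


def thread_headroom(pid):
    """Read both /proc files for one PID and return (threads, soft_limit)."""
    with open(f"/proc/{pid}/status") as f:
        threads = parse_thread_count(f.read())
    with open(f"/proc/{pid}/limits") as f:
        soft = parse_soft_nproc(f.read())
    return threads, soft


if __name__ == "__main__":
    # Usage: python check_threads.py <pid of the node process>
    threads, soft = thread_headroom(int(sys.argv[1]))
    print(f"threads={threads} soft_max_processes={soft}")
```

Note that the "Max processes" limit applies to all processes and threads of the owning user, not to a single process, so a node can hit it even when its own thread count is well below the limit. Kernel-wide settings such as threads-max, and available memory, can also make thread creation fail, and this sketch does not check them.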