randomForest: Error in na.fail.default: missing values in object

I am trying to train a random forest with cross-validation, using the caret package to train the RF:

library(caret)

### return_customer is a binary variable
idx.train <- createDataPartition(y = known$return_customer, p = 0.8, list = FALSE)
train <- known[idx.train, ]
test <- known[-idx.train, ]
k <- 10
set.seed(123)
model.control <- trainControl(method = "cv", number = k, classProbs = TRUE, summaryFunction = twoClassSummary,  allowParallel = TRUE)
rf.parms <- expand.grid(mtry = 1:10)
rf.caret <- train(return_customer~., data = train, method = "rf", ntree = 500, tuneGrid = rf.parms, metric = "ROC", trControl = model.control)

When I run the train function I get this error, even though there are no missing values in return_customer:

Error in na.fail.default(list(return_customer = c(0L, 0L, 0L, 0L, 0L, :
missing values in object

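I would like to understand why the function finds missing values in the data and how to fix this. I know there are similar questions on the forum, but I have not been able to fix my code. Thanks!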

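Best answer: The missing values will be in your predictors, not in return_customer. The formula return_customer ~ . uses every other column of train as a predictor, so an NA in any of those columns triggers na.fail.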

Try this code to remove the rows that contain NA values:

row.has.na <- apply(train, 1, function(x){any(is.na(x))})
predictors_no_NA <- train[!row.has.na, ]
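
Before dropping rows, it can help to see which columns actually hold the NAs. The following is a minimal sketch on a made-up data frame: `toy`, `age` and `basket_value` are hypothetical stand-ins for the asker's data, not columns from the question. It counts missing values per column and then keeps only complete rows with base R's `complete.cases()`, which gives the same result as the `apply()` approach above.

```r
# Hypothetical toy data: the outcome is complete, two predictors are not.
toy <- data.frame(
  return_customer = c(0L, 1L, 0L, 1L, 0L),
  age             = c(34, NA, 51, 28, 45),
  basket_value    = c(19.9, 42.5, NA, 13.0, 27.3)
)

# Count missing values per column to locate the NAs.
na.per.column <- colSums(is.na(toy))
print(na.per.column[na.per.column > 0])

# Keep only the rows without any NA.
toy.complete <- toy[complete.cases(toy), ]
print(nrow(toy.complete))
```

The filtered data frame is what should then be passed as `data` to `train`. Passing `na.action = na.omit` to the formula interface of `train` may be an alternative, but check the caret documentation for your version before relying on it.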

Hope it helps.
