It may be better to use .I and .SDcols:
dt[dt[, .I[.SD[[1]] > 0], .SDcols = varName], (newVarName) := .SD[[1]],
.SDcols = varName]
In the third expression, the error occurred because it tries to take the column from the whole dataset, whose length differs from the number of rows selected in i. Instead, we could use .SD, which contains only the selected rows:
dt[dt[[varName]]>0, (newVarName):= .SD[[varName]]]
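To make the effect concrete, here is a minimal sketch on a toy table. The four values below are made up for illustration and are not from the original question; only the variable names follow the answer.

```r
library(data.table)

# Hypothetical toy data (assumption: an integer column named 'education')
dt <- data.table(education = c(0L, 3L, 0L, 7L))
varName <- "education"
newVarName <- paste0(varName, "NewVersion")

# Inside j, .SD holds only the rows selected by i,
# so the right-hand side has the same length as the rows being assigned
dt[dt[[varName]] > 0, (newVarName) := .SD[[varName]]]

print(dt)
# Rows that fail the condition (education == 0) get NA in the new column
```

Rows not matched by i are left as NA because := only assigns to the selected rows.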
Benchmarks
set.seed(24)
dt <- data.table(education = sample(0:50, 682446, replace = TRUE))
dt1 <- copy(dt)
varName <- 'education'
newVarName <- paste0(varName, 'NewVersion')
system.time(dt[dt[[varName]]>0, (newVarName):= .SD[[varName]]])
# user system elapsed
# 0.022 0.003 0.026
system.time( dt1[dt1[, .I[.SD[[1]] > 0], .SDcols = varName],
(newVarName) := .SD[[1]],
.SDcols = varName])
# user system elapsed
# 0.023 0.003 0.024
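As a sanity check, the two approaches can also be compared for equality of results. This is a sketch on a smaller sample (1000 rows is an arbitrary choice for illustration, not the size used in the timings above).

```r
library(data.table)

set.seed(24)
# Smaller sample than the benchmark, chosen only to keep the check quick
dt <- data.table(education = sample(0:50, 1000, replace = TRUE))
dt1 <- copy(dt)
varName <- "education"
newVarName <- paste0(varName, "NewVersion")

# Approach 1: logical subset in i, .SD in j
dt[dt[[varName]] > 0, (newVarName) := .SD[[varName]]]

# Approach 2: row indices via .I, restricted to one column with .SDcols
dt1[dt1[, .I[.SD[[1]] > 0], .SDcols = varName],
    (newVarName) := .SD[[1]],
    .SDcols = varName]

# Both tables should hold the same columns and values
same_result <- isTRUE(all.equal(dt, dt1))
print(same_result)
```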