How can I remove special digits in dataframe columns in R?

This is too long for a comment. Using the one example you provided, plus a second invented one, here is a way to do it without sapply():

d2 = data.frame(GISJOIN=c("31000109654001","12345678910112"))
d2$GISJOIN = as.character(d2$GISJOIN)

What you have now:

splitted <- as.data.frame(t(sapply(d2$GISJOIN, function(x) substring(x, first=c(1,4,8), last=c(2,6,14)))))

splitted$v4 <- (paste(splitted$V1, splitted$V2, splitted$V3))

               V1  V2      V3             v4
31000109654001 31 001 9654001 31 001 9654001
12345678910112 12 456 8910112 12 456 8910112

The new string still contains spaces, because paste() separates its arguments with a space by default, so as.numeric() returns NA. Below I split each string into single characters and drop positions 3 and 7:

# vapply() returns a plain character vector, so d2$new is not a list column
d2$new = vapply(strsplit(d2$GISJOIN, ""), function(i) {
  paste(i[-c(3, 7)], collapse = "")
}, character(1))

as.numeric(d2$new)
[1] 310019654001 124568910112
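
If you would rather not split the strings into single characters, the same result can probably be reached with vectorized base R calls. The following is only a sketch on the two example IDs above; the column names new_substr and new_regex are made up for illustration, and it assumes every ID has the same 14-character layout.

```r
# Same example IDs as above
d2 <- data.frame(GISJOIN = c("31000109654001", "12345678910112"),
                 stringsAsFactors = FALSE)

# Option 1: keep characters 1-2, 4-6 and 8-14 and join them with no separator
d2$new_substr <- paste0(substr(d2$GISJOIN, 1, 2),
                        substr(d2$GISJOIN, 4, 6),
                        substr(d2$GISJOIN, 8, 14))

# Option 2: capture the pieces to keep, skipping the 3rd and 7th characters
d2$new_regex <- sub("^(.{2}).(.{3}).(.*)$", "\\1\\2\\3", d2$GISJOIN)

# Both options should give the same strings, which convert without NA
identical(d2$new_substr, d2$new_regex)
as.numeric(d2$new_substr)
```

paste0() is paste() with sep = "", which is why no spaces end up in the result. Both options work on the whole column at once, so no sapply() or lapply() is needed.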
