How can I find trios of duplicated values among different columns?

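One way is to build a fruit-by-doc contingency table and take its cross product, which gives, for every pair of docs, the number of fruits they have in common. After setting the diagonal to zero (each doc trivially matches itself), the indices of the entries that are greater than or equal to three can be used to subset the original data: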

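# Count the fruits shared by every pair of docs; zero the diagonal (self-matches)
res <- `diag<-`(crossprod(table(cbind(stack(dat, select = -1)[-2], dat[[1]]))), 0)

# Keep the rows (docs) that share three or more fruits with another doc
dat[which(res >= 3, arr.ind = TRUE)[, 2], ]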

  doc     A      B      C     D
2 DOC2 prune  apple banana berry
4 DOC4 berry banana   pear prune
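
To see why rows 2 and 4 are returned, the one-liner can be unpacked into separate steps. The following is only a sketch of the same idea: the names `long`, `tab` and `shared` are illustrative and not part of the original answer, and it assumes that no fruit appears twice within a single row (otherwise the table would hold counts above 1 and the cross product would overstate the overlap).

```r
# Same data as in the answer, repeated so the sketch runs on its own
dat <- read.table(text = "doc A B C D
DOC1 apple coconut berry pear
DOC2 prune apple banana berry
DOC3 coconut cherry apple banana
DOC4 berry banana pear prune", header = TRUE)

# Long format: one row per (fruit, doc) pair.
# unlist() walks the columns A, B, C, D in turn, so the doc labels
# are repeated once per fruit column to stay aligned.
long <- data.frame(fruit = unlist(dat[-1], use.names = FALSE),
                   doc   = rep(dat$doc, times = ncol(dat) - 1))

# Fruit-by-doc incidence table: 1 if the doc contains the fruit, else 0
tab <- table(long)

# Doc-by-doc matrix: entry [i, j] counts the fruits that docs i and j share
shared <- crossprod(tab)
diag(shared) <- 0  # ignore each doc's match with itself

shared
```

In this matrix the only entries that reach 3 are the DOC2/DOC4 pair (prune, banana and berry), which is why `which(res >= 3, arr.ind = TRUE)` picks out rows 2 and 4.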

Data:

dat <- read.table(text = "doc      A                B                C               D
1 DOC1    apple           coconut            berry           pear 
2 DOC2    prune            apple            banana           berry
3 DOC3  coconut           cherry             apple          banana
4 DOC4    berry           banana             pear            prune", header = TRUE)
