How do I create a subset of data for the most common duplicates?

I manipulated your data a little to demonstrate how the problem can be solved:

# Changed the last two states to Texas so that you get a two-line result (not just one)
election <- data.frame(State = c("Alabama", "Alabama", "Texas", "Texas"),
                       Candidate = c("D J Trump", "Clinton", "Gary Johnson", "Other"),
                       candidatevotes = c(1318255, 729547, 44467, 21712),
                       totalvotes = c(2123372, 2123372, 2123372, 2123372))
# load the required package
library(dplyr)

election %>% 
  # group by the variable you want the max value for (State)
  dplyr::group_by(State) %>% 
  # keep the rows with the maximum candidatevotes for each State
  dplyr::filter(candidatevotes == max(candidatevotes))

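If you are on dplyr 1.0.0 or newer, slice_max() is a convenient alternative to the filter() call above. This is only a sketch using the same election data frame; it keeps the row with the largest candidatevotes per State (ties are kept by default):

library(dplyr)

election %>% 
  # group by the variable you want the max value for (State)
  dplyr::group_by(State) %>% 
  # keep the row(s) with the largest candidatevotes in each group
  dplyr::slice_max(candidatevotes, n = 1) %>% 
  dplyr::ungroup()

Either way you should end up with one row per State here: the D J Trump row for Alabama and the Gary Johnson row for Texas.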