Find the value that occurs most frequently and indicate relative frequency

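I'd probably use a two-step solution. First, create a data.frame of frequencies and relative frequencies for each id/response pair. Then join the original data to it. We use slice(which.max()) because which.max() returns only the index of the first maximum, so we get exactly one row per id. slice_max() keeps ties by default (with_ties = TRUE), so it may return multiple rows per id.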

library(tidyverse)
# count by id, response, calculate rel frequency
# rename columns to make inner_join easier
freq_table <- dd %>%
  count(id, response) %>%
  group_by(id) %>%
  mutate(rel_freq = n / sum(n)) %>%
  select(id, most_frequent_response = response, rel_freq)

# inner join to sliced freq_table (freq_table is still grouped by id,
# so slice() keeps one row per id); id is the only shared column
dd %>%
  inner_join(freq_table %>% slice(which.max(rel_freq)), by = "id")

#    id response most_frequent_response  rel_freq
# 1   1        2                      3 0.4166667
# 2   1        2                      3 0.4166667
# 3   1        3                      3 0.4166667
# 4   1        3                      3 0.4166667
# 5   1        6                      3 0.4166667
# ...          
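
To make the tie behaviour concrete, here is a minimal sketch on made-up data. The `toy` data frame and all object names below are hypothetical and not part of the original question; id 1 is deliberately given a tie between responses 2 and 3.

```r
library(dplyr)

# Hypothetical toy data: for id 1, responses 2 and 3 each occur twice (a tie)
toy <- data.frame(
  id       = c(1, 1, 1, 1, 2, 2, 2),
  response = c(2, 2, 3, 3, 5, 5, 6)
)

# Same frequency table construction as above, still grouped by id
toy_freq <- toy %>%
  count(id, response) %>%
  group_by(id) %>%
  mutate(rel_freq = n / sum(n))

# which.max() picks the first maximum only: one row per id
first_max <- toy_freq %>% slice(which.max(rel_freq))

# slice_max() keeps ties by default: id 1 contributes two rows
all_max <- toy_freq %>% slice_max(rel_freq)

# with_ties = FALSE makes slice_max() return one row per id as well
one_max <- toy_freq %>% slice_max(rel_freq, with_ties = FALSE)

nrow(first_max)  # 2
nrow(all_max)    # 3
nrow(one_max)    # 2
```

If ties should be reported rather than dropped, joining to `all_max` instead would duplicate the rows of the tied id, which may or may not be what you want.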
