How do I split the file and build a new variable based on a condition?

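Here is one option. Use cumsum on str_detect to create a ‘page’ column that increments at every row where ‘Name’ contains the substring ‘form:ID’ (matched case-insensitively). Then, grouped by that ‘page’, set ‘Keep’ to ‘Y’ if the second element of ‘Name’ is ‘mood’, or else ‘N’. Finally, blank out ‘page’ and ‘Keep’ on the rows where ‘Name’ is empty.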

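library(dplyr)
library(stringr)
Sample.data %>% 
   # each 'form:ID' row (any case) starts a new page
   group_by(page = cumsum(str_detect(Name, '(?i)form:ID'))) %>% 
   # one flag per page, based on the row right after the header
   mutate(Keep = if(Name[2] == 'mood') 'Y' else 'N') %>%
   ungroup() %>%
   # blank both columns on empty separator rows (page becomes character)
   mutate(across(c(Keep, page), ~ replace(.,  Name == "", "")))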

Output:

# A tibble: 17 x 3
#   Name      page  Keep 
#   <chr>     <chr> <chr>
# 1 "form:ID" "1"   "N"  
# 2 "dave"    "1"   "N"  
# 3 "mike"    "1"   "N"  
# 4 "marry"   "1"   "N"  
# 5 "rose"    "1"   "N"  
# 6 ""        ""    ""   
# 7 "Form:ID" "2"   "Y"  
# 8 "mood"    "2"   "Y"  
# 9 "happy"   "2"   "Y"  
#10 "sad"     "2"   "Y"  
#11 "angry"   "2"   "Y"  
#12 ""        ""    ""   
#13 "Form:ID" "3"   "N"  
#14 "dave"    "3"   "N"  
#15 "mike"    "3"   "N"  
#16 "marry"   "3"   "N"  
#17 "rose"    "3"   "N"  
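
The answer does not show how Sample.data was built, so the following is only a sketch under assumptions: the data frame is reconstructed from the Name column printed in the output above, and the same logic is written in base R (grepl, cumsum, ave) rather than dplyr, to make each intermediate step visible. The names is_header, keep, blank and result are made up for illustration.

```r
# Assumed input: rebuilt from the Name column in the printed output
Sample.data <- data.frame(
  Name = c("form:ID", "dave", "mike", "marry", "rose", "",
           "Form:ID", "mood", "happy", "sad", "angry", "",
           "Form:ID", "dave", "mike", "marry", "rose"),
  stringsAsFactors = FALSE
)

# TRUE on every header row; ignore.case plays the role of the (?i) flag
is_header <- grepl("form:ID", Sample.data$Name, ignore.case = TRUE)

# Running count of headers seen so far, i.e. the page each row belongs to
page <- cumsum(is_header)

# Per page: 'Y' if the second row is 'mood', else 'N', repeated for every row
keep <- ave(Sample.data$Name, page, FUN = function(x) {
  rep(if (isTRUE(x[2] == "mood")) "Y" else "N", length(x))
})

# Blank both columns on the empty separator rows
blank <- Sample.data$Name == ""
result <- data.frame(
  Name = Sample.data$Name,
  page = ifelse(blank, "", as.character(page)),
  Keep = ifelse(blank, "", keep),
  stringsAsFactors = FALSE
)
print(result)
```

Because cumsum only increases on header rows, an empty separator row is counted with the page before it, which is why it has to be blanked in a separate step. Wrapping the comparison in isTRUE guards against a page with a single row, where x[2] is NA.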
