RStudio: how do you solve a name substitution error?

Your first problem is that those values are not numbers but strings, and it is hard to take the mean of a character string. Second, you might want to read up on regular expressions (regex).

You can check with str(name):

'data.frame':   13 obs. of  2 variables:
 $ orig   : chr  "50000000-100000000" "1000000-2000000" "5000000-10000000" "2000000-5000000" ...
 $ average: chr  "75000000" "1500000" "7500000" "3500000" ...
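To see why the chr type matters, here is a minimal sketch with a made-up vector `x` (not the original data): mean() on strings gives NA with a warning, while converting with as.numeric() first works.

```r
# hypothetical example values, stored as character strings
x <- c("75000000", "1500000", "7500000")

is.character(x)                    # TRUE: these are strings, not numbers
bad  <- suppressWarnings(mean(x))  # mean() on strings returns NA (with a warning)
good <- mean(as.numeric(x))        # convert first, then average

print(bad)
print(good)
```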

Then:

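# ".+-" matches everything up to and including the "-", so removing it leaves the upper bound
orig <- as.numeric(sub(".+-", "", name$orig))

# "-.+" matches the "-" and everything after it, so removing it leaves the lower bound
orig2 <- as.numeric(sub("-.+", "", name$orig))

# NA entries in name$orig stay NA, so both vectors keep one value per row
name$orig2 <- orig2
name$orig_mean <- as.integer((orig + orig2) / 2)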

                  orig   average orig2 orig_mean
1   50000000-100000000  75000000 5e+07  75000000
2      1000000-2000000   1500000 1e+06   1500000
3     5000000-10000000   7500000 5e+06   7500000
4      2000000-5000000   3500000 2e+06   3500000
5       500000-1000000    750000 5e+05    750000
6        200000-500000    350000 2e+05    350000
7        100000-200000    150000 1e+05    150000
8         50000-100000     75000 5e+04     75000
9          20000-50000     35000 2e+04     35000
10             0-20000     10000 0e+00     10000
11   10000000-20000000  15000000 1e+07  15000000
12 100000000-200000000 150000000 1e+08 150000000
13   20000000-50000000  35000000 2e+07  35000000
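To try the steps above without the original data, here is a self-contained sketch. The data frame `demo`, its three rows (including one NA), and the column `orig2_txt` are made up for illustration. format() is only there to show the lower bound without scientific notation such as 1e+06.

```r
# hypothetical stand-in for the real data frame, with one missing value
demo <- data.frame(orig = c("1000000-2000000", "0-20000", NA),
                   stringsAsFactors = FALSE)

upper <- as.numeric(sub(".+-", "", demo$orig))  # strip "<lower>-", keep the upper bound
lower <- as.numeric(sub("-.+", "", demo$orig))  # strip "-<upper>", keep the lower bound

demo$orig2     <- lower
demo$orig_mean <- as.integer((upper + lower) / 2)

# doubles print in scientific notation by default; format() shows them in full
demo$orig2_txt <- format(demo$orig2, scientific = FALSE, trim = TRUE)

print(demo)
```

The NA row stays NA in every derived column, so the new columns always line up with the original rows.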

This might help in the future: "How to extract everything after a specific string?"
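As a rough sketch of that idea, using a made-up string `s`: the greedy pattern ".*-" removes everything up to the last "-", while "^[^-]*-" removes only up to the first one. With a single "-" per value, as in the ranges above, both give the same result.

```r
# hypothetical input with more than one separator
s <- "a-b-c"

after_last  <- sub(".*-", "", s)      # greedy: keeps what follows the last "-"
after_first <- sub("^[^-]*-", "", s)  # keeps what follows the first "-"

print(after_last)
print(after_first)
```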
