How can you extract strings of any size in R?

Since you don't want 'GN=' in the final output, we can use a lookbehind regex and extract the first word (\\w+) that follows "GN=". Because \\w+ matches one or more word characters, the extracted value can be of any length.

string <- "Signal recognition particle subunit SRP72 OS=Homo sapiens OX=9606 GN=SRP72 PE=1 SV=3"
# (?<=GN=) asserts that "GN=" precedes the match without including it in the result
stringr::str_extract(string, pattern = "(?<=GN=)\\w+")
#[1] "SRP72"

In base R, we can use sub with a capture group and keep only the captured word:

sub('.*GN=(\\w+).*', '\\1', string)
#[1] "SRP72"
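One caveat with the base R approach: sub returns its input unchanged when the pattern does not match, so a string without "GN=" comes back whole instead of as a missing value. Below is a minimal sketch of a guard for that case. The helper name extract_gn and the second example string are made up for illustration and are not part of the original question.

```r
# Hypothetical input: the second entry has no GN= field.
headers <- c(
  "Signal recognition particle subunit SRP72 OS=Homo sapiens OX=9606 GN=SRP72 PE=1 SV=3",
  "Uncharacterized protein OS=Homo sapiens OX=9606 PE=4 SV=1"
)

# Hypothetical helper: return the word after "GN=", or NA when the field is absent.
# grepl() checks for a match first, because sub() alone would return the
# original string unchanged for non-matching elements.
extract_gn <- function(x) {
  ifelse(grepl("GN=\\w+", x),
         sub(".*GN=(\\w+).*", "\\1", x),
         NA_character_)
}

genes <- extract_gn(headers)
print(genes)
```

Both grepl and sub are vectorised, so the same helper works on a whole column of strings at once.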
