How to use regex to add a percent sign to numbers in a string, but only where it is missing?

You may make judicious use of lookarounds here to ensure that the percent sign only gets added where you want it:

df$comment <- gsub("\\b(\\d+\\.\\d+)\\b(?![%.])", "\\1%", df$comment, perl=TRUE)
df

                                comment
1   3.22%-1ST $100000 AND 1.15% BALANCE
2 3.25%  1ST $100000 AND 1.16%  BALANCE
3   3.22% 1ST 100000 AND 1.16%  BALANCE
4   3.22% 1ST 100000 AND 1.15%  BALANCE
5                   3.26%-100 AND 1.16%
6                   3.26%-100 AND 1.16%
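
For a reproducible check, here is a minimal sketch on made-up input. The original input data is not shown, so the two strings in df_demo (a hypothetical name) are assumptions chosen to resemble the output above:

```r
# Hypothetical sample data; the real input is not shown in the answer.
df_demo <- data.frame(
  comment = c("3.22%-1ST $100000 AND 1.15 BALANCE",
              "3.26-100 AND 1.16%"),
  stringsAsFactors = FALSE
)

# perl = TRUE switches to the PCRE engine; R's default regex engine
# does not support lookaheads such as (?![%.]).
df_demo$comment <- gsub("\\b(\\d+\\.\\d+)\\b(?![%.])", "\\1%",
                        df_demo$comment, perl = TRUE)
print(df_demo$comment)
```

Here 1.15 and 3.26 should each gain a percent sign, while 3.22%, 1.16% and the integers 100000 and 100 are left alone.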

Note that I assume here that you only want to target decimal numbers. If you also might want to target integers, then we would need more information about the context of all replacements.
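
To illustrate why integers need more context, here is a sketch of one possible variant (an assumption, not part of the answer above) that makes the decimal part optional. On a made-up sample string it also tags the dollar amount, which is probably not what you want:

```r
# Hypothetical input, for illustration only.
x <- "3.22 1ST $100000 AND 1.15%"

# Assumed variant: (?:\.\d+)? makes the fractional part optional,
# so plain integers become candidates for replacement too.
with_integers <- gsub("\\b(\\d+(?:\\.\\d+)?)\\b(?![%.])", "\\1%",
                      x, perl = TRUE)
print(with_integers)
```

The result should be "3.22% 1ST $100000% AND 1.15%". The dollar amount picks up a percent sign, so a rule for integers would have to be based on what surrounds them.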

The regex pattern says to:

\b            match a word boundary (start of the number)
(             capture
    \d+\.\d+  a number with a decimal component
)             end capture
\b            word boundary
(?![%.])      assert that what follows is NOT % or .

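Note that the final negative lookahead prevents replacements on numbers that already have a %, and on any number immediately followed by a period. With this pattern the integer component of a decimal number can never match on its own, because \d+\.\d+ requires a decimal point.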
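
The effect of the lookahead can be checked on a few edge cases. This is a small sketch; the strings in cases are hypothetical inputs, one per situation:

```r
# Hypothetical inputs, one per case.
cases <- c("1.15%",      # already has a percent sign
           "100",        # integer, no decimal part
           "rate 3.22.", # decimal followed by a period
           "rate 3.22")  # plain decimal
fixed <- gsub("\\b(\\d+\\.\\d+)\\b(?![%.])", "\\1%", cases, perl = TRUE)
print(fixed)
```

Only the last string should change, to "rate 3.22%". The third case is worth noticing: a decimal that ends a sentence is skipped as well.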
