Python: create a list of tuples of strings by splitting on a regex pattern

You may use this regex in findall:

>>> import re
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]

RegEx Details:

  • ^(.*?): Lazily match 0 or more characters from the start of the string and capture them in group #1
  • \s*: Match 0 or more whitespace characters
  • \((\d+\s*-\s*\w+[^)]*)\): Match a (<number>-word ..) string and capture what is inside the parentheses in capture group #2
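If a single flat list of tuples is wanted rather than one list per string, the same pattern can be wrapped in a small helper. This is only a sketch: the name to_pairs is hypothetical, and skipping strings that have no "(<number>-word ..)" part is an assumption about the desired behaviour.

```python
import re

# Same pattern as in the findall example, compiled once.
PATTERN = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')

def to_pairs(strings):
    """Return one (text, parenthesized part) tuple per matching string."""
    pairs = []
    for s in strings:
        m = PATTERN.search(s)
        if m:  # assumption: strings without a match are simply skipped
            pairs.append(m.groups())
    return pairs

arr = [
    'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)',
    'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)',
]
pairs = to_pairs(arr)
print(pairs)
```

Because the pattern is anchored with ^, each string yields at most one tuple, so search and findall give the same pair here.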

Alternatively, you may use this regex in split:

>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']

RegEx Details:

  • (?<!\s): Negative lookbehind that asserts the previous character is not whitespace
  • \s*: Match 0 or more whitespace characters
  • (?=\((\d+\s*-\s*\w+[^)]*)\)): Lookahead that asserts the text ahead of us is (<number>-word ..). Note that we are using a capture group to get the string inside (...) in the result of split; the trailing [:-1] then drops the leftover "(...)" text that follows the split point.
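To see why the [:-1] slice is needed, it can help to look at the unsliced result of split for one of the sample strings. A minimal sketch using the same pattern (the variable names s and parts are just for illustration):

```python
import re

reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
s = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'

# split returns: text before the match, the captured group,
# and the text after the match. The lookahead consumes nothing,
# so the "(...)" part is still present as the last element.
parts = reg.split(s)
print(parts)

# Dropping the last element leaves the two pieces we want.
print(parts[:-1])
```

The first print shows three elements, the last being '(1345-today is wednesday)' with its parentheses; the second shows the two-element list from the session above.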
