clean a string in postgresql

At first, you can use regexp_split_into_table and a regex with a positive lookahead to get a version of your table in which each of the rows contains exactly one update:

select companyID, 
       updated_at, 
       regexp_split_to_table(updates, '\n(?=\y.+:)') as updates 
  from old;

This will split the column updates at any newline (\n) that is followed by a single word and a colon (\y.+:).

#=============#==============#================#
| companyID   |  updated_at  |   updates      |
#=============#==============#================#
| 101         | 2020-11-01   | name:          |
|             |              | -ABC           |
|             |              | -XYZ           |
+-------------+--------------+----------------+
| 101         | 2020-11-01   | url:           |
|             |              | -www.abc.com   |
|             |              | -www.xyz.com   |
+-------------+--------------+----------------+
| 109         | 2020-10-20   | rating:        |
|             |              | -4.5           |
|             |              | -4.0           |
+-------------+--------------+----------------+

From this, you can more easily build your desired table. To do this, you can use e.g. split_part to split the update string into the three parts you want.

Putting this together with the first part gets you the full query:

select companyID, 
       updated_at, 
       split_part(updates, E':', 1) as field, 
       split_part(updates, E'\n-', 2) as old_value, 
       split_part(updates, E'\n-', 3) as new_value  
  from (select companyID, 
               updated_at, 
               regexp_split_to_table(updates, '\n(?=\y.+:)') as updates 
          from old
       )
;

Here is a db<>fiddle example.

More details / additional info:

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top