Filtering on two columns of a TSV file

I want to extract the ID (e.g. CA01g00010) from column 9 whenever column 3 is "gene".

You may use this awk solution:

awk -F '\t' '$3 == "gene" {gsub(/^ID=|;.*/, "", $9); print $9}' file.tsv

Output:

CA01g00010
CA01g00020

Details:

  • -F '\t': sets tab (\t) as the input field separator.
  • $3 == "gene": the pattern; the action runs only on lines whose third column is exactly gene.
  • {...}: the action block, which contains:
    • gsub(/^ID=|;.*/, "", $9): removes the leading ID= and everything from the first ; onward in $9. This assumes ID= is the first entry in column 9.
    • print $9: prints the modified $9, which now holds only the ID.
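
The steps above can be tried on a small test file. The sketch below is only an illustration: the three sample rows and the helper name extract_gene_ids are made up here, not taken from the question. It also shows a possible variant that splits column 9 on ; and looks for the ID= entry, which may help if ID= is not always the first entry in that column.

```shell
# Hypothetical sample data (invented for illustration): 9 tab-separated columns.
sample=$(mktemp)
printf 'chr01\tsrc\tgene\t100\t900\t.\t+\t.\tID=CA01g00010;Name=demo1\n'  > "$sample"
printf 'chr01\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=CA01g00010.1;Parent=CA01g00010\n' >> "$sample"
printf 'chr01\tsrc\tgene\t1500\t2400\t.\t-\t.\tName=demo2;ID=CA01g00020\n' >> "$sample"

# extract_gene_ids is a hypothetical helper name, not part of the original answer.
# For each line whose column 3 is "gene", split column 9 on ";" and print
# the value of the first entry that starts with "ID=".
extract_gene_ids() {
  awk -F '\t' '$3 == "gene" {
    n = split($9, attrs, ";")
    for (i = 1; i <= n; i++)
      if (attrs[i] ~ /^ID=/) {
        print substr(attrs[i], 4)   # drop the 3-character "ID=" prefix
        break
      }
  }' "$1"
}

extract_gene_ids "$sample"
```

On this sample the mRNA row is skipped, and the third row still yields its ID even though ID= comes after another entry. The gsub version would print Name=demo2 for that row, since it only strips an ID= at the very start of the column.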
