Annotate every other column according to a matching condition

EDIT: The OP's requirement became clearer after more samples were added, so here is an updated solution; I am keeping my first answer further down too.

awk '
{
  for(i=4;i<=NF;i+=2){
    if(length($i)==2){
      if(substr($i,1,1) == substr($i,2,1)){
        val=(val?val OFS:"")"homo"
      }
      else{
        val=(val?val OFS:"")"het"
      }
    }
  }
  printf("%s%s\n",$0,(val!=""?OFS val:""))
  val=""
}' Input_file
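
To make the behaviour concrete, here is a small sketch that runs the same program on made-up data. The rows (rs1, rs2 and their values) and the function name annotate are assumptions for illustration only, not the OP's actual file; the program reads standard input here instead of Input_file.

```shell
# Hypothetical wrapper around the command above; reads from stdin.
annotate() {
  awk '
  {
    for(i=4;i<=NF;i+=2){                       # fields 4, 6, 8, ...
      if(length($i)==2){                       # only 2-letter values are annotated
        if(substr($i,1,1) == substr($i,2,1)){  # both letters identical
          val=(val?val OFS:"")"homo"
        }
        else{
          val=(val?val OFS:"")"het"
        }
      }
    }
    printf("%s%s\n",$0,(val!=""?OFS val:""))
    val=""
  }'
}

# Invented sample rows: fields 4 and 6 hold the values that get checked.
printf '%s\n' 'rs1 1 100 AA x AG y' 'rs2 1 200 A x GG y' | annotate
# rs1 1 100 AA x AG y homo het
# rs2 1 200 A x GG y homo
```

In the second row, field 4 is a single letter, so it is skipped and only field 6 contributes an annotation.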

OR, if you don't want to print lines where no checked field is exactly 2 letters long (i.e. lines that would get neither homo nor het), use the following instead.

awk '
{
  for(i=4;i<=NF;i+=2){
    if(length($i)==2){
      if(substr($i,1,1) == substr($i,2,1)){
        val=(val?val OFS:"")"homo"
      }
      else{
        val=(val?val OFS:"")"het"
      }
    }
  }
  if(val!=""){
    print $0,val
  }
  val=""
}' Input_file
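
As a quick check of the skipping behaviour, the sketch below feeds this variant two invented rows, one of which has no 2-letter value in the checked fields. The sample data and the function name annotate_skip are hypothetical, and the program reads standard input rather than Input_file.

```shell
# Hypothetical wrapper around the skipping variant; reads from stdin.
annotate_skip() {
  awk '
  {
    for(i=4;i<=NF;i+=2){
      if(length($i)==2){
        if(substr($i,1,1) == substr($i,2,1)){
          val=(val?val OFS:"")"homo"
        }
        else{
          val=(val?val OFS:"")"het"
        }
      }
    }
    if(val!=""){          # print only lines that received an annotation
      print $0,val
    }
    val=""
  }'
}

# Invented rows: the second one has only 1-letter values in fields 4 and 6.
printf '%s\n' 'rs1 1 100 AA x AG y' 'rs3 1 300 A x G y' | annotate_skip
# rs1 1 100 AA x AG y homo het
```

The rs3 row produces no output because val stays empty for it.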


Could you please try the following, written as per your shown samples only. Your Input_file doesn't look tab-delimited to me; in case it is tab-delimited, then add a BEGIN section like BEGIN{FS=OFS="\t"} right after the awk ' line in the following solution.

awk  '
{
  for(i=6;i<=NF;i+=2){
    if($i==$(i-2)){
      val=(val?val OFS:"")"homo"
    }
    else{
      val=(val?val OFS:"")"het"
    }
  }
  print $0,val
  val=""
}' Input_file

Explanation: Adding a detailed, commented version of the above.

awk  '                                ##Start the awk program.
{
  for(i=6;i<=NF;i+=2){                ##Loop from the 6th field to the last field, visiting every 2nd field.
    if($i==$(i-2)){                   ##Check whether the current field equals the field 2 positions before it.
      val=(val?val OFS:"")"homo"      ##If equal, append homo to val (OFS-separated after the first entry).
    }
    else{                             ##Otherwise the current field differs from the field 2 positions before it.
      val=(val?val OFS:"")"het"       ##If NOT equal, append het to val.
    }
  }
  print $0,val                        ##Print the current line followed by val once the loop is done.
  val=""                              ##Reset val before the next line is read.
}' Input_file                         ##Input_file is the file to read.
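
For a tab-delimited file, the same loop with the BEGIN section added might look like the sketch below. The single sample row and the function name annotate_pairs are invented for illustration; the program reads standard input instead of Input_file.

```shell
# Hypothetical wrapper: pairwise comparison on tab-delimited input from stdin.
annotate_pairs() {
  awk '
  BEGIN{FS=OFS="\t"}                  # split on tabs and join output with tabs
  {
    for(i=6;i<=NF;i+=2){              # fields 6, 8, 10, ...
      if($i==$(i-2)){                 # same value as the field 2 positions back
        val=(val?val OFS:"")"homo"
      }
      else{
        val=(val?val OFS:"")"het"
      }
    }
    print $0,val
    val=""
  }'
}

# Invented row: field 6 equals field 4, field 8 differs from field 6.
printf 'id\ta\tb\tAA\tc\tAA\td\tAG\n' | annotate_pairs
# prints the row followed by a tab, "homo", a tab, and "het"
```

One thing to keep in mind: a line with fewer than 6 fields never enters the loop, so print $0,val emits the line with a trailing separator and an empty val.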
