EDIT: OP's requirement became clearer after more samples were added, so here is an updated solution; I am keeping my first answer further down too.
awk '
{
  for(i=4;i<=NF;i+=2){
    if(length($i)==2){
      if(substr($i,1,1) == substr($i,2,1)){
        val=(val?val OFS:"")"homo"
      }
      else{
        val=(val?val OFS:"")"het"
      }
    }
  }
  printf("%s%s\n",$0,(val!=""?OFS val:""))
  val=""
}' Input_file
OR, if you don't want to print lines where NO field is 2 letters long (basically lines that get no homo or het value) and want to skip them, then do the following.
awk '
{
  for(i=4;i<=NF;i+=2){
    if(length($i)==2){
      if(substr($i,1,1) == substr($i,2,1)){
        val=(val?val OFS:"")"homo"
      }
      else{
        val=(val?val OFS:"")"het"
      }
    }
  }
  if(val!=""){
    print $0,val
  }
  val=""
}' Input_file
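To make the two-letter check easier to try out, here is a minimal sketch of the same logic wrapped in a shell function and fed a made-up sample. The names `classify` and `zygosity` and the sample rows are my own assumptions for illustration and are not taken from OP's data.

```shell
# Hypothetical wrapper around the awk logic above; reads a file argument or stdin.
classify() {
  awk '
  # Hypothetical helper: "homo" when both letters match, else "het".
  function zygosity(gt) {
    return (substr(gt,1,1) == substr(gt,2,1)) ? "homo" : "het"
  }
  {
    val=""
    # Look at every 2nd field starting from the 4th one.
    for(i=4;i<=NF;i+=2){
      if(length($i)==2){
        val=(val?val OFS:"") zygosity($i)
      }
    }
    # Append the collected labels only when at least one was found.
    print $0 (val!=""?OFS val:"")
  }' "$@"
}

# Made-up sample: 3 leading columns, then calls in fields 4, 6, ...
printf '%s\n' 'rs1 1 100 AA x AG y' 'rs2 1 200 T x C y' | classify
```

With this sample the first row gets `homo het` appended (AA matches, AG does not), while the second row is printed unchanged because none of its checked fields is 2 letters long.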
Could you please try the following, written as per your shown samples only. Your Input_file doesn't look tab delimited to me; in case it is tab delimited, then add a BEGIN section like BEGIN{FS=OFS="\t"} after the awk ' line in the following solution.
awk '
{
  for(i=6;i<=NF;i+=2){
    if($i==$(i-2)){
      val=(val?val OFS:"")"homo"
    }
    else{
      val=(val?val OFS:"")"het"
    }
  }
  print $0,val
  val=""
}' Input_file
Explanation: Adding a detailed explanation for the above.
awk '                             ##Starting awk program from here.
{
  for(i=6;i<=NF;i+=2){            ##Starting a for loop from the 6th field to the last field, stepping 2 fields at a time.
    if($i==$(i-2)){               ##Checking if the current field is equal to the field 2 positions before it.
      val=(val?val OFS:"")"homo"  ##If they are equal then append homo to the val variable.
    }
    else{                         ##Else part, in case the current field is NOT equal to the field 2 positions before it.
      val=(val?val OFS:"")"het"   ##If they are NOT equal then append het to the val variable.
    }
  }
  print $0,val                    ##Printing the current line and val once the for loop is completed.
  val=""                          ##Nullifying val here so it starts empty for the next line.
}' Input_file                     ##Mentioning Input_file name here.
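In case the Input_file is tab delimited, the BEGIN section mentioned earlier would sit right after the awk ' line. A minimal sketch follows, assuming a made-up tab-separated row; the wrapper name `pairwise` and the sample values are hypothetical and only there for illustration.

```shell
# Hypothetical wrapper; same comparison as above, but with tab as input and output separator.
pairwise() {
  awk '
  BEGIN{FS=OFS="\t"}   # split on tabs and join the appended labels with tabs
  {
    val=""
    # Compare each 2nd field from the 6th one with the field 2 positions before it.
    for(i=6;i<=NF;i+=2){
      val=(val?val OFS:"") ($i==$(i-2)?"homo":"het")
    }
    print $0,val
  }' "$@"
}

# Made-up tab-separated sample row with 8 fields.
printf 'id1\t1\t100\tA\tx\tA\ty\tG\n' | pairwise
```

Here field 6 equals field 4, so homo is appended, and field 8 differs from field 6, so het is appended, both separated by tabs.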