Channel: Remove string from a particular field using awk/sed - Unix & Linux Stack Exchange

Answer by glenn jackman for Remove string from a particular field using awk/sed

The separator argument of awk's split function is a regular expression, so you can split on either = or ; by using /[=;]/. If you know that $9 begins with "ID=", the ID value is the second piece of the split:

awk -v OFS='\t' '
    $3 == "gene" {
        split($9, id, /[=;]/)
        print $1, $4, $5, id[2], $6, $7
    }
' genes.gff3
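
To see what split produces here, a minimal sketch on a single attribute string may help. The string below is made up for illustration (it is not taken from genes.gff3) and stands in for $9:

```shell
# Hypothetical attribute string standing in for field 9 of a GFF3 line.
attrs='ID=gene0001;Name=abc;biotype=protein_coding'

# split() returns the number of pieces; /[=;]/ breaks on either character,
# so the pieces alternate between keys and values.
out=$(printf '%s\n' "$attrs" | awk '{
    n = split($0, id, /[=;]/)
    print n, id[1], id[2], id[3]
}')
echo "$out"
```

With this input, id[1] is the key "ID" and id[2] is its value, which is why the command above prints id[2].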

If "ID=" is not necessarily at the beginning of the field, then there's a little more work to do:

awk -v OFS='\t' '
    $3 == "gene" {
        id = ""
        len = split($9, f, /[=;]/)
        for (i=1; i<len; i++) {
            if (f[i] == "ID") {
                id = f[i+1]
                break
            }
        }
        print $1, $4, $5, id, $6, $7
    }
' genes.gff3
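
As a quick check, the same loop can be run on a couple of made-up GFF3 lines instead of genes.gff3. This is only a sketch: the sample records (chr1, gene0001, and so on) are invented, and the first one deliberately puts ID after another attribute:

```shell
# Two hypothetical tab-separated GFF3 records; only the "gene" line should print.
out=$(printf '%s\n' \
    "$(printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tName=abc;ID=gene0001')" \
    "$(printf 'chr1\tsrc\tmRNA\t100\t200\t.\t+\t.\tID=mrna0001;Parent=gene0001')" |
awk -v OFS='\t' '
    $3 == "gene" {
        id = ""
        len = split($9, f, /[=;]/)
        # Stop at len-1 so that f[i+1] always exists.
        for (i = 1; i < len; i++) {
            if (f[i] == "ID") {
                id = f[i+1]
                break
            }
        }
        print $1, $4, $5, id, $6, $7
    }
')
printf '%s\n' "$out"
```

The gene line comes out with gene0001 in the fourth column even though ID is the second attribute, and the mRNA line is skipped by the $3 == "gene" test.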
