I have this file, test.txt:
gene 1:362273700-362275735
exon 1:362275166-362275246
exon 1:362274811-362275058
exon 1:362274230-362274685
gene 1:362279796-362287281
exon 1:362279796-362280179
exon 1:362280576-362280662
exon 1:362280858-362280958
exon 1:362281056-362281106
I need to get this output:
gene-1 1:362275166-362275246
gene-1 1:362274811-362275058
gene-1 1:362274230-362274685
gene-2 1:362279796-362280179
gene-2 1:362280576-362280662
gene-2 1:362280858-362280958
gene-2 1:362281056-362281106
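-> In other words, I need to remove the "gene" lines and replace "exon" on each exon line with "gene-X", where X is the number of the gene the exon belongs to (starting at 1).
I'm struggling with this. Here are my attempts and what each one prints: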
awk '$1~/exon/ {print $0 (/^exon/ ? "-" (++c) : "")}' test.txt
exon 1:362275166-362275246-1
exon 1:362274811-362275058-2
exon 1:362274230-362274685-3
exon 1:362279796-362280179-4
exon 1:362280576-362280662-5
exon 1:362280858-362280958-6
exon 1:362281056-362281106-7
awk '$1~/exon/ {$1=$1 "-" (++count[$1])}1' test.txt
gene 1:362273700-362275735
exon-1 1:362275166-362275246
exon-2 1:362274811-362275058
exon-3 1:362274230-362274685
gene 1:362279796-362287281
exon-4 1:362279796-362280179
exon-5 1:362280576-362280662
exon-6 1:362280858-362280958
exon-7 1:362281056-362281106
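Assuming the counter is based solely on the presence of the string gene in the 1st column, one awk idea is to increment a counter on every gene line, skip that line, and print each exon line with its first field replaced by gene- followed by the current counter value. This should generate the output requested above.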
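The code from the original answer is not preserved here, so what follows is only a minimal sketch of the idea described, not necessarily the answerer's exact command. It assumes fields are whitespace-separated and that every exon line comes after the gene line it belongs to. The wrapper function name relabel_exons is hypothetical and exists only to make the example easy to rerun; the heredoc simply recreates the sample test.txt from the question.

```shell
# Recreate the sample input from the question (only needed to run this standalone).
cat > test.txt <<'EOF'
gene 1:362273700-362275735
exon 1:362275166-362275246
exon 1:362274811-362275058
exon 1:362274230-362274685
gene 1:362279796-362287281
exon 1:362279796-362280179
exon 1:362280576-362280662
exon 1:362280858-362280958
exon 1:362281056-362281106
EOF

# relabel_exons is a hypothetical helper name, not part of the original post.
relabel_exons() {
    awk '
        $1 == "gene" { n++; next }                # new gene: bump the counter, drop the line
        $1 == "exon" { $1 = "gene-" n; print }    # relabel the exon with the current gene number
    ' "$1"
}

relabel_exons test.txt
```

Both attempts in the question increment the counter on every exon line, which is why the numbers run from 1 to 7. Here the counter only changes when a gene line is seen, so all exons of the same gene share one number. Assigning to $1 makes awk rebuild the line with a single space between fields, which matches the requested output.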