Question

Ashar

Asked: 2022-03-24 02:15:13 +0800 CST

Find duplicate first fields and join their values on a single line

I have a file with entries in key: value format, like this:

cat data.txt

name: 'tom'
tom_age: '31'
status_tom_mar: 'yes'
school: 'anne'
fd_year_anne: '1987'
name: 'hmz'
hmz_age: '21'
status_hmz_mar: 'no'
school: 'svp'
fd_year_svp: '1982'
name: 'toli'
toli_age: '41'

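and so on...

I need to find the key: value entries whose keys are duplicated and print each such key only once.

The code below gets me the duplicated keys: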

cat data.txt | awk '{ print $1 }' | sort  | uniq -d
name:
school:

However, I want output in which the values of each duplicated key are joined on a single line.

Expected output:

name: ['tom', 'hmz', 'toli']
school: ['anne', 'svp']
tom_age: '31'
status_tom_mar: 'yes'
fd_year_anne: '1987'
hmz_age: '21'
status_hmz_mar: 'no'
fd_year_svp: '1982'
toli_age: '41'

Can you suggest a way to do this?

5 Answers

terdon · Answer 1 · 2022-03-24T03:26:22+08:00

In awk:

$ awk -F': ' '
{
    count[$1]++; 
    data[$1] = $1 in data ? data[$1]", "$2 : $2 
} 
END { 
    for (id in count) { 
        printf "%s: ",id; 
        print (count[id]>1 ? "[ "data[id]" ]" : data[id])
    }
}' data.txt 
hmz_age: '21'
tom_age: '31'
fd_year_anne: '1987'
school: [ 'anne', 'svp' ]
name: [ 'tom', 'hmz', 'toli' ]
toli_age: '41'
fd_year_svp: '1982'
status_hmz_mar: 'no'
status_tom_mar: 'yes'
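
The output above comes back in whatever order awk's `for (id in count)` happens to visit the keys, which is unspecified. If the order of the expected output matters (duplicated keys first, then the remaining keys in input order), one possible variation is sketched below. The function name `merge_dups_ordered` and the `order` array are illustrative assumptions and are not part of the answer above.

```shell
# Hypothetical helper: same idea as the awk answer above, but it records the
# order in which keys are first seen instead of relying on "for (id in ...)".
merge_dups_ordered() {
    awk -F': ' '
    {
        # Remember each key the first time it appears
        if (!($1 in count)) order[++n] = $1
        count[$1]++
        # Join values with ", " once a key has been seen before
        data[$1] = (count[$1] > 1) ? data[$1] ", " $2 : $2
    }
    END {
        # First loop: keys seen more than once, values wrapped in brackets
        for (i = 1; i <= n; i++)
            if (count[order[i]] > 1)
                printf "%s: [%s]\n", order[i], data[order[i]]
        # Second loop: keys seen exactly once, printed unchanged
        for (i = 1; i <= n; i++)
            if (count[order[i]] == 1)
                printf "%s: %s\n", order[i], data[order[i]]
    }' "$@"
}
```

It can be called as `merge_dups_ordered data.txt`, or fed from a pipe with no file argument. Like the original, it assumes values do not themselves contain a colon followed by a space.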

A Perl approach:

$ perl -F: -lane 'push @{$k{$F[0]}},$F[1]; 
        END{ 
            for $key (keys(%k)){ 
                $data=""; 
                if(scalar(@{$k{$key}})>1){ 
                    $data="[" . join(",",@{$k{$key}}) . "]"; 
                } 
                else{
                    $data=${$k{$key}}[0];
                }
                print "$key: $data"
            }
        }' data.txt 
status_tom_mar:  'yes'
fd_year_anne:  '1987'
tom_age:  '31'
toli_age:  '41'
fd_year_svp:  '1982'
hmz_age:  '21'
school: [ 'anne', 'svp']
name: [ 'tom', 'hmz', 'toli']
status_hmz_mar:  'no'

Or, perhaps easier to understand:

perl -F: -lane '@fields=@F; 
                push @{$key_hash{$fields[0]}},$fields[1]; 
                END{ 
                    for $key (keys(%key_hash)){ 
                        $data=""; 
                        @key_data=@{$key_hash{$key}};
                        if(scalar(@key_data)>1){ 
                           $data="[" . join(",", @key_data) . "]"; 
                        } 
                        else{
                            $data=$key_data[0]
                        }
                        print "$key: $data"
                    }
                }' data.txt

roaima · Answer 2 · 2022-03-24T03:54:58+08:00

A short awk program will do this for you:

awk -F': ' '
    # Every line of input; fields split at colon+space
    {
        # Append a comma if we have previous items
        if (h[$1] > "") { h[$1] = h[$1] ", " };

        # Append the item and increment the count
        h[$1] = h[$1] $2;
        i[$1]++
    }

    # Finally
    END {
        # Iterate across all the keys we have found
        for (k in h) {
            if (i[k] > 1) { p = "[%s]" } else { p = "%s" };
            printf "%s: " p "\n", k, h[k]
        }
    }
' data.txt

Output

hmz_age: '21'
tom_age: '31'
fd_year_anne: '1987'
school: ['anne', 'svp']
name: ['tom', 'hmz', 'toli']
toli_age: '41'
fd_year_svp: '1982'
status_hmz_mar: 'no'
status_tom_mar: 'yes'

4

K-att- · Answer 3 · 2022-03-24T05:48:40+08:00

In awk (GNU awk is needed, because this uses arrays of arrays):

awk '{arr[$1][length(arr[$1])+1]=$2}; END {for (i in arr) {printf "%s", i;if (length(arr[i])>1) {xc=" [";for (rr in arr[i]) {printf "%s%s", xc, arr[i][rr];xc=","} print "]"} else print arr[i][length(arr[i])]} }' data.txt

Output:

hmz_age:'21'
fd_year_svp:'1982'
fd_year_anne:'1987'
name: ['tom','hmz','toli']
school: ['anne','svp']
status_tom_mar:'yes'
tom_age:'31'
toli_age:'41'
status_hmz_mar:'no'

2

jubilatious1 · Answer 4 · 2022-04-09T06:13:38+08:00

Using Raku (formerly known as Perl_6):

raku -e 'my %h; for lines() {%h.=append: .split(":").map(*.trim).hash}; .say for %h;'

Or

raku -e 'my %h.=append: .split(":").map(*.trim).hash for lines; .say for %h;'

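With Raku, hash functionality is built in (see the documentation pages at the bottom). Briefly, the code above takes lines, splits each one on the ":" colon, trims whitespace from the resulting 2 elements, and generates a hash (i.e. a key-value pair). Each line's hash is then appended to the hash object named %h, with values being added to their respective keys.

Sample input: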
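
For comparison, the split, trim and append steps described above can be approximated in awk. This is a rough sketch, not a translation of the Raku code: the function name `split_trim_append` and its variable names are assumptions, and unlike `.split(":")` it splits only at the first colon, so a value may contain colons.

```shell
# Hypothetical helper mirroring the split / trim / append steps in awk.
split_trim_append() {
    awk '
    {
        # Split at the first colon only, so values may contain colons
        pos = index($0, ":")
        if (pos == 0) next                  # skip lines that have no colon
        key = substr($0, 1, pos - 1)
        val = substr($0, pos + 1)
        # Trim leading and trailing whitespace from both parts
        gsub(/^[ \t]+|[ \t]+$/, "", key)
        gsub(/^[ \t]+|[ \t]+$/, "", val)
        # Append the value to whatever is already stored under this key
        if (!(key in cnt)) order[++n] = key
        h[key] = (cnt[key]++ ? h[key] ", " : "") val
    }
    END {
        # Print keys in first-seen order; bracket the values of repeated keys
        for (i = 1; i <= n; i++) {
            key = order[i]
            if (cnt[key] > 1) printf "%s: [%s]\n", key, h[key]
            else              printf "%s: %s\n", key, h[key]
        }
    }' "$@"
}
```

Because the whitespace is trimmed explicitly, this version tolerates irregular spacing around the colon, which the `-F': '` approaches do not.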

name: 'tom'
tom_age: '31'
status_tom_mar: 'yes'
school: 'anne'
fd_year_anne: '1987'
name: 'hmz'
hmz_age: '21'
status_hmz_mar: 'no'
school: 'svp'
fd_year_svp: '1982'
name: 'toli'
toli_age: '41'

Sample output:

hmz_age => '21'
fd_year_svp => '1982'
status_tom_mar => 'yes'
fd_year_anne => '1987'
school => ['anne' 'svp']
status_hmz_mar => 'no'
tom_age => '31'
name => ['tom' 'hmz' 'toli']
toli_age => '41'

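Once your data is in the %h object, you can manipulate the output. Substitute .put for .say in the code above to get tab-separated (rather than =>-separated) output. You can also pull out the values associated with a single key like so (added as the final statement):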

say %h<name>;'
['tom' 'hmz' 'toli']

https://docs.raku.org/language/hashmap
https://docs.raku.org/language/101-basics#Hashes

Praveen Kumar BS · Answer 5 · 2022-04-07T23:28:19+08:00

Step 1

for i in $(awk -F ":" '{a[$1]++}END{for(x in a){print x,a[x]}}' file.txt | awk '$NF>1{print $1}'|tac); do grep "^$i" file.txt >/dev/null; if [[ $? == 0 ]]; then awk -v i="$i" -F ":" '$1 == i{print $2}' file.txt|awk 'END{print "\n"}ORS=","'|sed "s/^,//g"|sed "s/,$//g"|awk -v i="$i" '{print i":["$0"]"}';else grep -v "^$i" file.txt;fi; done >output.txt

Step 2

for i in $(awk -F ":" '{a[$1]++}END{for(x in a){print x,a[x]}}' file.txt| awk '$NF==1'); do awk -v i="$i" -F ":" '$1 ~ i' file.txt; done >>output.txt

Output

name: ['tom', 'hmz', 'toli']
school: ['anne', 'svp']
tom_age: '31'
status_tom_mar: 'yes'
fd_year_anne: '1987'
hmz_age: '21'
status_hmz_mar: 'no'
fd_year_svp: '1982'
toli_age: '41'

-1
