I'm using Arch Linux/Debian Linux and want a list of the unique "identifiers" in an ASCII txt file. Here is a snippet of the data I want to reduce:
... (Received from VRW): wind ...
... (Received from 1a00): air_ ...
... (Received from 5710): air_ ...
... (Received from ####): air_ ...
... (Received from 15d8): air_ ...
... (Received from ####): air_ ...
... (Received from 6e9e): baro ...
... (Received from 6e9e): volt ...
... (Received from 6e9e): wind ...
... (Received from 6e9e): air_ ...
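Because the file is large and contains many repeated "identifiers", I want to output each identifier only once, so that the output looks like this: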
... (Received from VRW): wind ...
... (Received from 1a00): air_ ...
... (Received from 5710): air_ ...
... (Received from ####): air_ ...
... (Received from 15d8): air_ ...
... (Received from 6e9e): baro ...
Even better would be simply a list of the unique identifiers, e.g. VRW, 15d8, 6e9e, etc., but I think that would be much harder.
Following suggestions from similar questions, I have tried:
grep "(Received from" datafile.txt
and got a huge list of identifiers, most of them duplicates.
I have also tried:
grep "(Received from" datafile.txt | sort -u
but I can't say whether that made any difference.
I have also tried:
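One way to check whether sort -u changes anything might be to compare line counts before and after it. The following is only a sketch: the sample file, its timestamps and trailing values, and the helper names count_all and count_unique are invented for illustration and are not taken from my real data. Note that sort -u compares whole lines, so two lines with the same identifier but different surrounding text both survive.

```shell
# Hypothetical sample; the leading times and trailing numbers are made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10:01 (Received from 6e9e): baro 1
10:02 (Received from 6e9e): baro 2
10:02 (Received from 6e9e): baro 2
10:03 (Received from VRW): wind 3
EOF

# Number of matching lines before de-duplication.
count_all() { grep -c "(Received from" "$1"; }

# Number of matching lines after sort -u (whole-line comparison).
count_unique() { grep "(Received from" "$1" | sort -u | wc -l | tr -d ' '; }

echo "all lines:    $(count_all "$sample")"
echo "unique lines: $(count_unique "$sample")"
```

If the two counts are close, the lines probably differ outside the identifier, so whole-line de-duplication cannot collapse them.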
parallel --tag --lb grep "Received from" {} | perl -ne '$seen{$_}++ or print;' ::: Data1.txt
which probably shows the extent of my ignorance of these matters.
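To make the two desired outputs concrete, here is a minimal sketch of what I imagine the result could look like in shell, assuming every identifier sits between "(Received from " and the next ")". The function names first_line_per_id and unique_ids are hypothetical, and the sample file is built from the snippet above rather than from the real data.

```shell
# Hypothetical sample file built from the lines quoted in the question.
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
... (Received from VRW): wind ...
... (Received from 1a00): air_ ...
... (Received from ####): air_ ...
... (Received from ####): air_ ...
... (Received from 6e9e): baro ...
... (Received from 6e9e): volt ...
EOF

# Keep the first full line seen for each identifier.
# The de-duplication key is the "(Received from X)" part only.
first_line_per_id() {
    awk 'match($0, /\(Received from [^)]*\)/) {
             key = substr($0, RSTART, RLENGTH)
             if (!seen[key]++) print
         }' "$1"
}

# Print just the identifiers, each once, in first-seen order.
unique_ids() {
    sed -n 's/.*(Received from \([^)]*\)).*/\1/p' "$1" | awk '!seen[$0]++'
}

first_line_per_id "$datafile"
unique_ids "$datafile"
```

Both functions preserve the order in which identifiers first appear; piping unique_ids through sort would give an alphabetical list instead.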