I have a very basic text file on a Linux machine that contains chapters, dialogue, and references.
This is what it looks like:
Chapter: 1 One: Birds and Trees
Birds are beautiful and trees are amazing and
they are dependent on each other. Birds most of the time
choose to make their nests on trees since trees provide more
stability. One day the bird sat on a tree and said;
Bird: Oh my I'm so tired from all the flying, I should take a rest
Tree: Mr Bird, you seem tired, perhaps you should take some rest, and
here are some fruits to quench your thirst.
Bird: Oh thank you very much!
Reference: Chapter 1: birds and trees
Chapter: 2 Two: Trees and Fruits
Fruits are very delicious to eat and they are mostly found
in trees. Fruits contain essential vitamins, minerals and loads
of good fibers.
Reference: Chapter 2: trees and fruits
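That is the content of the txt file. Now suppose I search for quench: I would like the output to start at the Chapter line and run through to the Reference line. So I tried grep: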
$ grep -A 5 -B 5 'quench' file.txt
However, this does not produce the desired output. What I expect is this:
Chapter: 1 One: Birds and Trees
Birds are beautiful and trees are amazing and
they are dependent on each other. Birds most of the time
choose to make their nests on trees since trees provide more
stability. One day the bird sat on a tree and said;
Bird: Oh my I'm so tired from all the flying, I should take a rest
Tree: Mr Bird, you seem tired, perhaps you should take some rest, and
here are some fruits to quench your thirst.
Bird: Oh thank you very much!
Reference: Chapter 1: birds and trees
And searching for the word "vitamins" would print:
Chapter: 2 Two: Trees and Fruits
Fruits are very delicious to eat and they are mostly found
in trees. Fruits contain essential vitamins, minerals and loads
of good fibers.
Reference: Chapter 2: trees and fruits
I would like to know whether this can be done with sed or awk.
PS: every line really is a separate line (each ends with a newline).
One awk idea:
With -v word="quench" this generates:
With -v word="essential" this generates:
With -v word="bubble", or when no -v word=... clause is provided, this generates:
Using Raku (formerly known as Perl_6)
Some people will post Perl answers, but here is one written in Raku (a.k.a. Perl6). Raku has advanced built-in support for Unicode.
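In short: slurp the input file, then comb through it to find the records (chapters). Then, in the final statement, grep returns only the matching records (chapters). The sample input is the same as provided by the OP. Sample output: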
If needed, add a call to trim, trim-leading, or trim-trailing in the final statement to remove surrounding whitespace.
https://raku.org
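
For readers who want something concrete for the awk route, here is a minimal sketch. It is not the code from the awk answer above. It is only one way the same idea might be written, under these assumptions: every block starts with a line beginning "Chapter:" and ends with a line beginning "Reference:", the search is a case-sensitive fixed-string substring match, and an empty or missing word prints nothing. The function name print_block is hypothetical and exists only for illustration.

```shell
# Hypothetical helper (name chosen for illustration only).
# Usage: print_block WORD FILE
# Prints every Chapter..Reference block in FILE that contains WORD.
print_block() {
    awk -v word="$1" '
        # A line starting with "Chapter:" opens a new block and resets state.
        /^Chapter:/   { buf = ""; inblock = 1; found = 0 }

        # While inside a block, collect each line and remember
        # whether the word has been seen (plain substring test, no regex).
        inblock       { buf = buf $0 ORS
                        if (word != "" && index($0, word)) found = 1 }

        # A line starting with "Reference:" closes the block;
        # print the collected lines only if the word was found.
        /^Reference:/ { if (inblock && found) printf "%s", buf
                        inblock = 0 }
    ' "$2"
}
```

With the sample file from the question, print_block quench file.txt should print the Chapter 1 block, print_block essential file.txt the Chapter 2 block, and print_block bubble file.txt nothing. The block is buffered rather than printed line by line because the word may appear after the Chapter line, so the decision to print can only be made once the Reference line is reached. Note that awk interprets backslash escapes in -v assignments, so a search word containing backslashes would need extra care.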