AskOverflow.Dev


Ramón Wilhelm's questions

Ramón Wilhelm
Asked: 2022-02-01 04:39:16 +0800 CST

AWK: Insert target terms after source terms from a dictionary in randomly selected lines

  • 1

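Note: I have already asked a similar question, AWK: Quick way to insert target words after a source term. I am a beginner in AWK.

This question is about inserting multiple target terms after their source terms in randomly selected lines.

With this AWK snippet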

awk '(NR==FNR){a[$1];next}
    FNR in a { gsub(/\<source term\>/,"& target term") }
     1
    ' <(shuf -n 5 -i 1-$(wc -l < file)) file

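I want to insert the target term after the source term in file.

For example: I have a bilingual dictionary dict, which contains the source terms on the left and the target terms on the right, like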

apple     : Apfel
banana    : Banane
raspberry : Himbeere

My file consists of the following lines:

I love the Raspberry Pi.
The monkey loves eating a banana.
Who wants an apple pi?
Apple pen... pineapple pen... pen-pineapple-apple-pen!
The banana is tasty and healthy.
An apple a day keeps the doctor away.
Which fruit is tastes better: raspberry or strawberry?

Suppose that for the first term, apple, lines 1, 3, 5, 4 and 7 are randomly selected. The output for the term apple would then look like this:

I love the Raspberry Pi.
The monkey loves eating a banana.
Who wants an apple Apfel pi?
Apple Apfel pen... pineapple pen... pen-pineapple-apple-pen!
The banana is tasty and healthy.
An apple a day keeps the doctor away.
Which fruit is tastes better: raspberry or strawberry?

Then another 5 random lines (3, 3, 5, 6 and 7) are selected for the term banana:

I love the Raspberry Pi.
The monkey loves eating a banana.
Who wants an apple Apfel pi?
Apple Apfel pen... pineapple pen... pen-pineapple-apple-pen!
The banana Banane is tasty and healthy.
An apple a day keeps the doctor away.
Which fruit is tastes better: raspberry or strawberry?

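The same is done for all other entries in dict, until the last entry has been processed.

I want to select 5 random lines. If these lines contain a complete source term such as apple, I only want to match the whole word (terms like "pineapple" should be ignored). If a line contains the source term twice, e.g. apple, then I also want to insert the target term Apfel after it. The matching should be case-insensitive, so that I can also match source terms like apple and Apple.

My question: how can I rewrite the snippet above so that it uses the dictionary dict, selects random lines of file, and inserts the target term after the source term?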
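
One possible approach, as a minimal sketch rather than a definitive answer: read dict first, keep file in memory, and for each dictionary entry pick n distinct random line numbers inside awk with rand(). This replaces the shuf call of the original snippet so that a single awk process suffices. The helper names annotate and insert_after are made up for this illustration. The whole-word test is written with index() and tolower() instead of the gawk-only \<...\> operators, so it should run in any POSIX awk. A hyphen counts as a word boundary here, just as it does for \<apple\>, so the last apple in pen-pineapple-apple-pen would be matched as well.

```shell
#!/bin/sh
# annotate DICT FILE N: for each "source : target" entry in DICT, pick N random
# lines of FILE and append the target after every whole-word match of the source.
annotate() {
    awk -v n="$3" '
    # Return line with " tgt" added after each whole-word, case-insensitive src.
    function insert_after(line, src, tgt,    low, out, pos, len, prev, nxt) {
        low = tolower(line); src = tolower(src); len = length(src)
        if (len == 0) return line
        out = ""; prev = ""
        while ((pos = index(low, src)) > 0) {
            if (pos > 1) prev = substr(low, pos - 1, 1)   # character before the match
            nxt = substr(low, pos + len, 1)               # character after the match
            out = out substr(line, 1, pos + len - 1)
            if (prev !~ /[a-z0-9_]/ && nxt !~ /[a-z0-9_]/) out = out " " tgt
            prev = substr(low, pos + len - 1, 1)
            line = substr(line, pos + len); low = substr(low, pos + len)
        }
        return out line
    }
    BEGIN { srand() }
    NR == FNR {                        # first file: dictionary entries
        if (split($0, kv, /[ \t]*:[ \t]*/) >= 2) {
            src[++terms] = kv[1]; tgt[terms] = kv[2]
        }
        next
    }
    { text[FNR] = $0; total = FNR }    # second file: the text, held in memory
    END {
        for (t = 1; t <= terms; t++) {
            # Partial Fisher-Yates shuffle: n distinct line numbers from 1..total.
            for (i = 1; i <= total; i++) p[i] = i
            for (i = 1; i <= n && i <= total; i++) {
                j = i + int(rand() * (total - i + 1))
                swap = p[i]; p[i] = p[j]; p[j] = swap
                text[p[i]] = insert_after(text[p[i]], src[t], tgt[t])
            }
        }
        for (i = 1; i <= total; i++) print text[i]
    }' "$1" "$2"
}

# Demo with n equal to the line count, so every line is picked and the result is stable.
dir=$(mktemp -d)
printf '%s\n' 'apple     : Apfel' 'banana    : Banane' > "$dir/dict"
printf '%s\n' 'Who wants an apple pi?' 'You hate pineapples.' \
    'The banana is tasty and healthy.' > "$dir/file"
result=$(annotate "$dir/dict" "$dir/file" 3)
printf '%s\n' "$result"
rm -rf "$dir"
```

For the case described in the question, the call would be annotate dict file 5, with the output redirected to a new file. Because the whole text is held in memory, this sketch suits files that fit comfortably in RAM.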

awk files
  • 2 answers
  • 107 Views
Ramón Wilhelm
Asked: 2022-02-01 01:04:11 +0800 CST

Number of lines increased after running AWK

  • 2

After running my AWK script

awk -i inplace '(NR==FNR){a[$1];next}
    (FNR in a) && gsub(/\<Source Term\>/,"& Target Term")
     1
    ' <(shuf -n 198058 -i 1-$(wc -l < file)) file

and then checking my file with the command

wc -l file

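I noticed that the number of lines in my file had increased from 40058 to 44156. Is there a reason for this?

Is there any way to keep the original number of lines?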
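
A likely cause, assuming the script is exactly as shown: the expression (FNR in a) && gsub(...) stands alone as a pattern without an action, and awk prints the current line whenever such a pattern is true. The bare 1 rule on the next line then prints every line again, so each line that was actually changed ends up in the file twice. That would make 44156 - 40058 = 4098 the number of lines in which a substitution took place. Moving the substitution into an action block keeps the line count unchanged. The sketch below reproduces the effect on two made-up lines; picks stands in for the shuf output, and -i inplace and the \<...\> anchors are left out so that it runs in any awk.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf '%s\n' 'This weather is very cold .' 'I must be going !' > "$dir/file"
printf '%s\n' 1 2 > "$dir/picks"     # stands in for the shuf output

# Pattern without an action: a changed line is printed here AND by the 1 rule.
buggy=$(awk 'NR == FNR { a[$1]; next }
    (FNR in a) && gsub(/weather/, "& Wetter")
    1' "$dir/picks" "$dir/file")

# Substitution inside an action block: only the 1 rule prints, once per line.
fixed=$(awk 'NR == FNR { a[$1]; next }
    (FNR in a) { gsub(/weather/, "& Wetter") }
    1' "$dir/picks" "$dir/file")

echo "buggy: $(printf '%s\n' "$buggy" | awk 'END { print NR }') lines"
echo "fixed: $(printf '%s\n' "$fixed" | awk 'END { print NR }') lines"
rm -rf "$dir"
```

In the first variant the input has two lines and the output has three; in the second the count stays at two.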

bash awk
  • 1 answer
  • 192 Views
Ramón Wilhelm
Asked: 2022-01-30 07:09:39 +0800 CST

Unix paste: merged lines overlap incorrectly

  • 0

I have a file named file1 that contains the following lines

you are searching for a four .
you are searching for a six .
you are searching for a three .
you are searching for an ace .
you are searching for an eight .
you can use empty spaces in the Tab@@ le@@ au to move multiple cards . be careful with K@@ ings in the Reserve : the only way to remove them is by playing them to a Foundation on top of a Queen .
you can use empty spaces in the Tab@@ le@@ au to move multiple cards . be careful with K@@ ings in the Reserve : the only way to remove them is by playing them to a Foundation on top of a Queen .

I have a second file named file2 with these lines

four|||Vier
six|||Se@@ chs
for|||nach
searching|||suchen
eight|||Acht
spaces|||Plätze
spaces|||Plätze spaces|||Plätze

However, after running the command that merges these lines with Unix paste

paste file1 file2 > result

I get a result like this:

you are four|||Vieror a four .
you are six|||Se@@ chsa six .
you are for|||nachfor a three .
you are searching|||suchence .
you are eight|||Achtr an eight .
you can use empty spaces in the Tab@@ le@@ au to move multiple cards . be careful with K@@ ings in the Reserve : the only way to remove them is by playing them to a Fouspaces|||Plätzeof a Queen .
you can use empty spaces in the Tab@@ le@@ au to move multiple cards . be careful with K@@ ings in the Reserve : the only way to remove them is by playing them to a Foundation on top of a Queen .     spaces|||Plätze spaces|||Plätze

I don't understand what is happening. Why do the merged lines from the two files overlap?
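
The output above is consistent with file1 having Windows-style CRLF line endings, although the question does not confirm this, so treat it as an assumption. paste joins the lines with a tab, but a carriage return left at the end of a file1 line sends the terminal cursor back to column 0; the tab then jumps to column 8 and the file2 text is drawn over the start of the line. In that case the data in result is intact and only the display overlaps. A minimal sketch that checks for carriage returns and strips them before pasting (file1.unix is a made-up name):

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'you are searching for a four .\r\n' > "$dir/file1"   # CRLF ending (assumed)
printf 'four|||Vier\n' > "$dir/file2"

# Count lines that end in a carriage return; od -c would show them as \r.
cr_lines=$(awk '/\r$/ { n++ } END { print n + 0 }' "$dir/file1")
echo "lines ending in CR: $cr_lines"

# Strip the carriage returns, then paste the cleaned copy.
tr -d '\r' < "$dir/file1" > "$dir/file1.unix"
merged=$(paste "$dir/file1.unix" "$dir/file2")
printf '%s\n' "$merged"
rm -rf "$dir"
```

If the count is 0 for the real file1, the cause lies elsewhere and this sketch does not apply.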

files paste
  • 1 answer
  • 86 Views
Ramón Wilhelm
Asked: 2022-01-30 05:33:55 +0800 CST

AWK: Quick way to insert target words after a source term

  • 0

I am not familiar with awk. To insert a single target term after a source term in 198058 random lines, I have this code

awk -i inplace '(NR==FNR){a[$1];next}
    (FNR in a) && gsub(/\<Source Term\>/,"& Target Term")
     1
    ' <(shuf -n 198058 -i 1-$(wc -l < file)) file

file contains lines of sentences like these

David has to eat his vegetables .
This weather is very cold .
Can you please stop this music ? This is terrible music .
The teddy bear is very plushy .
I must be going !

For example, if I want to insert the word "Wetter" after "weather", then one of the lines would look like this

This weather Wetter is very cold .

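How can I rewrite the code so that I only need to supply two separate files containing the lists of source terms and target terms?

Suppose the source term file is called sourceterms and the target term file is called targetterms.

If sourceterms contains this list of terms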

vegetables
weather
terrible
plushy
going

and targetterms contains these terms

Gemüse
Wetter
schreckliche
flauschig
gehen

I want my code to check whether each line of file contains a source term and to insert the target term after it, so that my file looks like this:

David has to eat his vegetables Gemüse .
This weather Wetter is very cold .
Can you please stop this music ? This is terrible schreckliche music .
The teddy bear is very plushy flauschig .
I must be going gehen !

Is it possible to rewrite the code above in this way?
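
A minimal sketch of one possible approach, assuming (as in the sample) that the text is tokenized so that every word and punctuation mark is separated by spaces, and that line k of sourceterms corresponds to line k of targetterms. awk reads both term lists into a lookup table and then checks every field of file against it, ignoring case. The function name annotate_tokens is made up for this illustration. The sketch processes every line rather than a random subset, does not handle multi-word terms, and collapses runs of spaces because awk rebuilds the line when a field changes.

```shell
#!/bin/sh
# annotate_tokens SOURCETERMS TARGETTERMS FILE
annotate_tokens() {
    awk '
    FILENAME == ARGV[1] { src[FNR] = tolower($0); next }   # k-th source term
    FILENAME == ARGV[2] { map[src[FNR]] = $0; next }       # source -> k-th target
    {
        for (i = 1; i <= NF; i++) {
            key = tolower($i)
            if (key in map) $i = $i " " map[key]           # append the target term
        }
        print
    }' "$1" "$2" "$3"
}

dir=$(mktemp -d)
printf '%s\n' weather going > "$dir/sourceterms"
printf '%s\n' Wetter gehen > "$dir/targetterms"
printf '%s\n' 'This weather is very cold .' 'I must be going !' > "$dir/file"
result=$(annotate_tokens "$dir/sourceterms" "$dir/targetterms" "$dir/file")
printf '%s\n' "$result"
rm -rf "$dir"
```

Since each field is looked up in an array, the cost per line does not grow with the number of terms, which may matter for long term lists.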

awk files
  • 1 answer
  • 73 Views
Ramón Wilhelm
Asked: 2022-01-25 02:20:40 +0800 CST

Append a word to certain lines with sed, using a list of line numbers

  • 2

I have a file example.txt that contains several lines:

Larry is funny!
Funny, I've no glue!
Look here!
Tom has no pants.
The underpants are in the drawer.
Pants are in the closet!

After creating a file with 3 random line numbers

sed -n '=' example.txt | shuf | head -3 > line_numbers.txt

suppose line_numbers.txt contains the line numbers

1
3
6

I want to edit example.txt by appending the word WORD to every line listed in line_numbers.txt that contains the whole word "pants" (not a partial word like "underpants").

How can I do that?

I want example.txt to look like this

Larry is funny!
Funny, I've no glue!
Look here!
Tom has no pants.
The underpants are in the drawer.
Pants are in the closet!WORD

Edit:

To match only whole words, you have to write source_word as \<source_word\>.

Another possible example:

I have another file that contains these lines:

I love apples.
You hate pineapples.
Apple pie is delicious.
Why do you do not like eating an apple?
We prefer pears to apples.
How many apples do you want to eat?
I have to bake three apple pies for sunday.

I have a list of three random line numbers

6
2
4

I only want to append --OK to the end of each of these lines if the line contains the whole word apples.

The output must look like this:

I love apples.
You hate pineapples.
Apple pie is delicious.
Why do you do not like eating an apple?
We prefer pears to apples.
How many apples do you want to eat?--OK
I have to bake three apple pies for sunday.
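
A minimal sketch of one way to do this, using awk rather than sed because the line-number list can be loaded into an array directly. The function name append_if_word is made up for this illustration. It assumes the word contains no regular expression metacharacters, compares case-insensitively (the expected output of the first example treats Pants as a match for pants), and emulates the \<...\> whole-word test by padding the line with spaces so that it runs in any POSIX awk.

```shell
#!/bin/sh
# append_if_word LINE_NUMBERS FILE WORD SUFFIX
append_if_word() {
    awk -v word="$3" -v suffix="$4" '
    BEGIN { re = "[^a-z0-9_]" tolower(word) "[^a-z0-9_]" }  # word between non-word chars
    NR == FNR { pick[$1]; next }                            # first file: line numbers
    (FNR in pick) && (" " tolower($0) " ") ~ re { $0 = $0 suffix }
    1' "$1" "$2"
}

dir=$(mktemp -d)
printf '%s\n' 6 2 4 > "$dir/line_numbers.txt"
printf '%s\n' 'I love apples.' 'You hate pineapples.' 'Apple pie is delicious.' \
    'Why do you do not like eating an apple?' 'We prefer pears to apples.' \
    'How many apples do you want to eat?' \
    'I have to bake three apple pies for sunday.' > "$dir/example.txt"
result=$(append_if_word "$dir/line_numbers.txt" "$dir/example.txt" apples --OK)
printf '%s\n' "$result"
rm -rf "$dir"
```

To change example.txt itself, the output can be written to a temporary file that is then moved over the original.
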
text-processing sed
  • 4 answers
  • 241 Views
