AskOverflow.Dev


DynV's questions

DynV
Asked: 2022-08-04 03:49:48 +0800 CST

Ignoring timestamps, how do I remove non-chat duplicates?

  • 8

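Ignoring the timestamps, how can I remove the non-chat duplicates below? Chat lines come in 2 formats:

  1. starting with a nickname enclosed in angle brackets, and
  2. starting with a nickname followed by "tells you:".

I'd prefer it be done in Notepad++, but thanks to Cygwin I also have access to a number of utilities.

Original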

[16:29] You see a sheep; it looks like it weighs about 98.
[16:30] You see a sheep; it looks like it weighs about 100.
[16:52] anonymized tells you: Do you know the bank yet?
[17:11] Only anonymized may access the corpse for now.
[17:12] Only anonymized may access the corpse for now.
[17:14] <anonymized> You can do it later.
[17:14] <anonymized> The dagger for example
[17:15] <anonymized> The dagger for example
[17:15] <dynv> hi
[17:32] gnome has been killed by anonymized and dynv
[17:32] The corpse is too far away.
[17:32] The corpse is too far away.
[17:33] anonymized: now is gets dangerous

Desired result

[16:29] You see a sheep; it looks like it weighs about 98.
[16:30] You see a sheep; it looks like it weighs about 100.
[16:52] anonymized tells you: Do you know the bank yet?
[17:11] Only anonymized may access the corpse for now.
[17:14] <anonymized> You can do it later.
[17:14] <anonymized> The dagger for example
[17:15] <anonymized> The dagger for example
[17:15] <dynv> hi
[17:32] gnome has been killed by anonymized and dynv
[17:32] The corpse is too far away.
[17:33] anonymized: now is gets dangerous
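
For anyone who would rather script this than do it in Notepad++, here is a minimal Python sketch of the rule described above. It assumes that "duplicate" means an adjacent repeat of the same text once the `[hh:mm]` prefix is ignored (the sample is consistent with that reading, but the question does not say so explicitly). The function name `drop_non_chat_duplicates` and both patterns are illustrative, not taken from any answer.

```python
import re

# "[hh:mm] rest of line": group 1 is the text after the timestamp.
LINE = re.compile(r"^\[\d{2}:\d{2}\] (.*)$")
# Chat lines: "<nick> ..." or "nick tells you: ..." (assumed formats).
CHAT = re.compile(r"^(?:<[^>]+> |\S+ tells you: )")


def drop_non_chat_duplicates(lines):
    """Drop a non-chat line when it repeats the previous non-chat text.

    Timestamps are ignored in the comparison; chat lines are always kept
    and reset the comparison (assumption: only adjacent repeats count).
    """
    result = []
    previous_body = None
    for line in lines:
        match = LINE.match(line)
        body = match.group(1) if match else line
        is_chat = bool(CHAT.match(body))
        if not is_chat and body == previous_body:
            continue  # same text as the line before, only the time differs
        result.append(line)
        previous_body = None if is_chat else body
    return result
```

Applied to the 13 lines under "Original", this keeps the 11 lines shown under "Desired result": the second "Only anonymized may access the corpse for now." and the second "The corpse is too far away." are dropped, while the repeated `<anonymized> The dagger for example` stays because it is chat.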

Thank you very much

notepad++ cygwin
  • 2 answers
  • 35 Views
DynV
Asked: 2022-04-06 04:42:59 +0800 CST

Making a TSV out of a multi-line list

  • 5

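I have a list of items, each of which spans multiple lines. The token that delimits the items is unique (one per item, the HTML <li>), and I have only seen the text contained in a single tagged paragraph (HTML <p>). I'd like to make a TSV out of it, with the fields in this order:

  1. Date
  2. Name
  3. URL
  4. Summary

In every item I've seen, the URL and the name are duplicated (within each item), so I chose the first URL and the second name since that seemed simplest to me. The summary may contain visual-aid tags (i.e. <strong>), so I used a negative lookahead for it, as opposed to the date, which shouldn't have inner tags, so I used a negated character class.

The first 2 items are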

    <li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550756-america-long-war-race/">On
    Contact: Race and America's long war </a>
    </p>
    <p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550756-america-long-war-race/">
  <font color="#000080">
    <img src="rt.com-on_contact-220405-no_blurb_html_1dff87941f1c724a.jpg" name="Image1" alt="On Contact: Race and America's long war" align="bottom" width="280" height="157" border="1"/>
  </font>
</a>
</p>
    <p style="margin-bottom: 0in">On the show, Chris Hedges discusses
    America's inner and outer wars and its nexus with capitalism and
    empire with Professor of Social and Cultural Analysis and History at
    New York University Nikhil Pal Singh. The internal violence in the
    United... 
    </p>
    <p style="margin-bottom: 0in">Feb 27, 2022 10:36</p>
    <li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550319-george-washington-genocidal-colonist/">
  <font color="#000080">
    <img src="rt.com-on_contact-220405-no_blurb_html_198feb67032166ff.png" name="Image3" alt="On Contact: George Washington and the legacy of white supremacy" align="bottom" width="280" height="157" border="1"/>
  </font>
</a>
</p>
    <p style="margin-bottom: 0in"><strong><a href="https://www.rt.com/shows/on-contact/550319-george-washington-genocidal-colonist/">On
    Contact: George Washington and the legacy of white supremacy </a></strong>
    </p>
    <p style="margin-bottom: 0in">On the show, Chris Hedges discusses
    George Washington, the fallible human being and one of the principal
    architects of the United States, with author Nathaniel Philbrick. As
    America fractures into ideologically hostile camps, it colors how
    we... 
    </p>
    <p style="margin-bottom: 0in">Feb 25, 2022 09:09 
    </p>
    <li>[...]

The regex I attempted is <li>.*<a href="([^"]+)".*alt="On Contact: ([^"]+)".*<p[^>]*>((?:.(?!<\/p>))+)<\/p><p[^>]*>([^<]+)<; if it worked, the replacement would be $4\t$2\t$1\t$3. I'd like the regex to work in Notepad++.
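
As a fallback outside Notepad++, the same extraction can be sketched in Python. This is only an illustration under a few assumptions: every item starts at `<li>`, the first `href` is the URL, the `alt` attribute of the image carries the name (kept whole here, without stripping an "On Contact: " prefix), and the last two `<p>` blocks of an item are the summary and the date. The helper names `clean` and `items_to_tsv` are made up for this sketch.

```python
import re

TAG = re.compile(r"<[^>]+>")
PARAGRAPH = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL)


def clean(fragment):
    """Drop inner tags and collapse runs of whitespace and line breaks."""
    return " ".join(TAG.sub("", fragment).split())


def items_to_tsv(html):
    """Return one 'date<TAB>name<TAB>url<TAB>summary' row per <li> item."""
    rows = []
    for item in html.split("<li>")[1:]:
        url = re.search(r'<a href="([^"]+)"', item)   # first URL in the item
        name = re.search(r'alt="([^"]+)"', item)       # name from the image alt
        paragraphs = PARAGRAPH.findall(item)
        if not (url and name and len(paragraphs) >= 2):
            continue  # skip fragments that do not look like a full item
        summary, date = clean(paragraphs[-2]), clean(paragraphs[-1])
        rows.append("\t".join([date, name.group(1), url.group(1), summary]))
    return "\n".join(rows)
```

Splitting on `<li>` first keeps each match confined to one item, which sidesteps the greedy `.*` problem of matching across items, and stripping tags after the paragraph is captured means inner tags such as `<strong>` need no lookahead.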

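Thanks for your help

Update 1

The test string I used later has more list items and has display tags (i.e. <strong>) added inside the summaries, although that is inconsistent with the titles. I had to remove the tags because they interfere with the TSV creation, and I figured I might as well remove the line breaks in the process (deleting [\t\r\n]), resulting in: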

<li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550756-america-long-war-race/">OnContact: Race and America's long war </a></p><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550756-america-long-war-race/">  <font color="#000080">    <img src="rt.com-on_contact-220405-no_blurb_html_1dff87941f1c724a.jpg" name="Image1" alt="On Contact: Race and America's long war" align="bottom" width="280" height="157" border="1"/>  </font></a></p><p style="margin-bottom: 0in">On the show, Chris Hedges discussesAmerica's inner and outer wars and its nexus with capitalism and <strong>empire</strong> with Professor of Social and Cultural Analysis and History atNew York University Nikhil Pal Singh. The internal violence in theUnited... </p><p style="margin-bottom: 0in">Feb 27, 2022 10:36</p><li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/550319-george-washington-genocidal-colonist/">  <font color="#000080">    <img src="rt.com-on_contact-220405-no_blurb_html_198feb67032166ff.png" name="Image3" alt="On Contact: George Washington and the legacy of white supremacy" align="bottom" width="280" height="157" border="1"/>  </font></a></p><p style="margin-bottom: 0in"><strong><a href="https://www.rt.com/shows/on-contact/550319-george-washington-genocidal-colonist/">OnContact: George Washington and the legacy of white supremacy </a></strong></p><p style="margin-bottom: 0in">On the show, <span class="host">Chris Hedges</span> discusses George Washington, the fallible human being and one of the principalarchitects of the United States, with author Nathaniel Philbrick. AsAmerica fractures into ideologically hostile camps, it colors howwe... 
</p><p style="margin-bottom: 0in">Feb 25, 2022 09:09 </p><li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/549103-oppenheimer-bomb-culture-bird/">  <font color="#000080">    <img src="rt.com-on_contact-220405-no_blurb_html_e46c470920b1171d.jpg" name="Image4" alt="On Contact: Oppenheimer & the bomb culture" align="bottom" width="420" height="236" border="1"/>  </font></a></p><p style="margin-bottom: 0in"><strong><a href="https://www.rt.com/shows/on-contact/549103-oppenheimer-bomb-culture-bird/">OnContact: Oppenheimer &amp; the bomb culture </a></strong></p><p style="margin-bottom: 0in">On the show, Chris Hedges discusses J.Robert Oppenheimer and the making of the bomb with author <span class="author">Kai Bird.J. Robert Oppenheimer</span>, &ldquo;the father of the atomic bomb,&rdquo;was by the end of World War II one of the most celebrated men inAmerica.... </p><p style="margin-bottom: 0in">Feb 20, 2022 06:10 </p><li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/469859-war-iran-stephen-kinzer/">  <font color="#000080">    <img src="rt.com-on_contact-220405-no_blurb_html_15449064d00f77f3.jpg" name="Image149" alt="On Contact – War with Iran? Stephen Kinzer" align="bottom" width="420" height="236" border="1"/>  </font></a></p><p style="margin-bottom: 0in"><strong><a href="https://www.rt.com/shows/on-contact/469859-war-iran-stephen-kinzer/">OnContact &ndash; War with Iran? Stephen Kinzer </a></strong></p><p style="margin-bottom: 0in">Host Chris Hedges talks to journalistand author, Stephen Kinzer, on efforts by Saudi Arabia and Washington to cripple Iran&rsquo;s economy, inevitably putting Saudi Arabia, its Gulf allies and Washington on a collision course with the <em>Islamic</em>... 
</p><p style="margin-bottom: 0in">Sep 29, 2019 07:10 </p><li><p style="margin-bottom: 0in"><a href="https://www.rt.com/shows/on-contact/469339-future-amazon-rain-forest/">  <font color="#000080">    <img src="rt.com-on_contact-220405-no_blurb_html_b82502a96022a758.png" name="Image150" alt="The future of the Amazon rain forest – Sonia Bone Guajajara" align="bottom" width="280" height="157" border="1"/>  </font></a></p><p style="margin-bottom: 0in"><strong><a href="https://www.rt.com/shows/on-contact/469339-future-amazon-rain-forest/">Thefuture of the Amazon rain forest &ndash; Sonia Bone Guajajara </a></strong></p><p style="margin-bottom: 0in">Host Chris Hedges talks to Sonia BoneGuajajara, leader of 300 indigenous ethnic groups in Brazil, aboutthe future of the Amazon rain forest, its people, climate change,and the competing goals of agrobusiness, multinational corporations,and the... </p><p style="margin-bottom: 0in">Sep 22, 2019 07:15 </p></ul>
regex notepad++
  • 2 answers
  • 73 Views
DynV
Asked: 2021-07-01 08:04:43 +0800 CST

Regex to extract information from 3 versions of text "chunks" (to make a TSV out of them)

  • 5

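I have a text paste from a GUI that results in text "chunks", where each line holds a different piece of information about a particular entry. The information I want to collect comes in 3 types, and I don't know how to handle them. I think TSV would be a good output format. I'd prefer the regex be handled by Notepad++, but if it can't be adapted, I'd really prefer it be handled by a free and easy-to-use (and easy-to-install, if it is software) website or program.

The 3 types in question are: one with a base price, one that also has a rebated price, and one with no price at all. For all 3, the first line of the "chunk" should be included; the input and desired output for each follow below. Here is what I have so far for the type with both a base price and a rebated price: \R0\R\R(\w.*)\R(?:\w.*\R){4}[\w-].*\R.*CDN\$\s(\d+\.\d{2})\RCDN\$\s(\d+\.\d{2})\R(?:\w.*\R){3}.

I had to use code blocks even though what follows isn't code, because a block quote bunches the lines together.

Thanks for your help

Input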


0

South Park™: The Stick of Truth™
OVERALL REVIEWS:
OVERWHELMINGLY POSITIVE
RELEASE DATE:
3 MAR, 2014
-75%
CDN$ 39.99
CDN$ 9.99
Add to Cart
RPGComedyAdventureFunnyTurn-Based
Added on 8/9/2020 ( remove )

Output

South Park™: The Stick of Truth™    39.99   9.99

Input


0

Grand Theft Auto V
OVERALL REVIEWS:
VERY POSITIVE
RELEASE DATE:
13 APR, 2015
View Details
Open WorldActionMultiplayerAutomobile SimCrime
Added on 1/15/2020 ( remove )

Output

Grand Theft Auto V      

Input


0

System Shock
OVERALL REVIEWS:
NO USER REVIEWS
RELEASE DATE:
SUMMER 2021
CDN$ 51.49
Add to Cart
ActionAdventureCyberpunkSci-fiImmersive Sim
Added on 6/9/2020 ( remove )

Output

System Shock    51.49   
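
If a single Notepad++ expression for all 3 types proves awkward, the same result can be sketched in Python. The sketch assumes each chunk begins with a lone `0` line followed by a blank line, that the title is the first line after that, and that prices always appear on their own line as `CDN$ ` plus an amount with two decimals. The function name `chunks_to_tsv` is illustrative only.

```python
import re

# A price line such as "CDN$ 39.99"; group 1 is the amount.
PRICE = re.compile(r"^CDN\$\s(\d+\.\d{2})$")


def chunks_to_tsv(text):
    """Return 'title<TAB>base<TAB>rebated' rows; missing prices stay empty."""
    text = text.replace("\r\n", "\n")  # tolerate Windows line endings
    rows = []
    # Assumption: every chunk starts with a lone "0" line and a blank line.
    for chunk in re.split(r"(?m)^0\n\n", text)[1:]:
        lines = [line.strip() for line in chunk.splitlines()]
        if not lines:
            continue
        title = lines[0]
        prices = [m.group(1) for m in map(PRICE.match, lines) if m]
        # Pad so that chunks with one or zero prices still give 3 columns.
        base, rebated = (prices + ["", ""])[:2]
        rows.append("\t".join([title, base, rebated]))
    return "\n".join(rows)
```

Collecting every price line in a chunk and padding the list to two entries handles the three types with one code path, instead of one pattern per type.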
regex notepad++
  • 1 answer
  • 65 Views
