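I wrote a regular-expression rule for SpamAssassin that should match certain words or phrases in the subject and body of an e-mail.
The rules are as follows: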
header __SUBJECT_CHUJ Subject =~ /(powi.ksz.(0,5} penis.|jak powi.kszy.|wcieraj|wmasuj|grubszy|d.u.szy|erekcj.|zwi.sz. rozmiar|b.dzie twardy|b.dzie du.y|b.dzie d.ugi|wzw.d|centymetry)/i
body __BODY_CHUJOWE /\s+(jak powi.kszy.|wcieraj|posmaruj|wmasuj|natrzyj|grubszy|d.u.szy|erekcj.|zwi.sz. rozmiar|b.dzie twardy|wypisz|cz.onek|wypysuj|urosn..|du.ego penisa|b.dzie du.y|b.dzie d.ugi|wzw.d|centymetry|nowy .el|dodatkowe centymetry|dodatkowych centrumetr.w|zadowala. kobiety|)\s+/i
meta CHUJOWY_MAIL (__SUBJECT_CHUJ || __BODY_CHUJOWE )
score CHUJOWY_MAIL 1.4
describe CHUJOWY_MAIL Spam związany z CHUJEM
It matches e-mails that contain just a word or two, like this one:
Return-Path: <[email protected]>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on mail01
X-Spam-Level: **
X-Spam-Status: No, score=2.8 required=5.0 tests=ALL_TRUSTED,CHUJOWY_MAIL,
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_REPLYTO,URIBL_BLOCKED
autolearn=no autolearn_force=no version=3.4.0
Delivered-To: [email protected]
Received: from example.com (unknown [80.*.*.*.])
(Authenticated sender: [email protected])
by mail.glmr.in (Postfix) with ESMTPSA id 47E44428
for <[email protected]>; Tue, 2 Oct 2018 22:27:36 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=example.com;
s=default; t=1538512056;
bh=Rbj/g/DG4Vxz5Q2spNg8e4CJNwTKexCgSS9rpHGC0c8=;
h=Date:To:From:Reply-To:Subject;
b=InLP2mWzq3cWk6N8pNfDxle6swdrp7KaXkQTyHfMJqmZcuRhoJFESQL16RMsyz2LJ
dFLkXa0TO638JP+MC02DKi79dNGjKOncJSiWCN5z5mVGqg7YzzyPokgtBKNmr/bCG+
exxcSU3vngAOEVTAqJxQYTiOIXkonJf9R0UAsw9E=
Date: Tue, 2 Oct 2018 20:27:35 +0000
To: [email protected]
From: test name <[email protected]>
Reply-To: [email protected]
Subject: test subject
Message-ID: <[email protected]>
X-Mailer: WPMailSMTP/Mailer/smtp 1.3.3
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-EsetId: 37303A29DFC057626C7761
The body of that e-mail contains only: test text
Can anyone spot what is wrong with these rules? All the other rules I have written seem to work fine.
You have a typo in the first regular expression: you should use {0,5} instead of (0,5}.
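
To illustrate the typo, here is a minimal Python sketch. It assumes Python's re module as a stand-in for the Perl regex engine that SpamAssassin uses, the two patterns are shortened versions of the Subject rule, and the helper name compiles is made up for this example. With (0,5} the stray ( opens a group that is never closed, so the pattern should not even compile.

```python
import re

# Shortened versions of the Subject pattern; only the quantifier differs.
BROKEN = r"(powi.ksz.(0,5} penis.|wcieraj|centymetry)"
FIXED = r"(powi.ksz.{0,5} penis.|wcieraj|centymetry)"


def compiles(pattern: str) -> bool:
    """Return True if the pattern is a valid regular expression (hypothetical helper)."""
    try:
        re.compile(pattern, re.IGNORECASE)
        return True
    except re.error:
        # Raised for the broken pattern: the extra "(" leaves a group unclosed.
        return False


print("broken compiles:", compiles(BROKEN))
print("fixed compiles: ", compiles(FIXED))
```

Running spamassassin --lint against the real rule file is the more direct check, since it uses the actual Perl engine.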
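You probably also want to remove the last | in the second regular expression. A pattern such as (one|two|three|) matches the empty string, because the fourth alternative inside the parentheses is empty. So your long regular expression also matches plain /\s+\s+/, that is, any two consecutive whitespace characters (spaces, carriage returns, and so on). I assume that is not intended.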
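
The effect of the trailing | can be reproduced with a short Python sketch. The assumptions: Python's re module stands in for SpamAssassin's Perl engine, the patterns are shortened stand-ins for the body rule, and the sample strings are invented (they are not what SpamAssassin actually passes to body rules after rendering the message).

```python
import re

# Shortened stand-ins for the body rule, with and without the trailing "|".
WITH_EMPTY_ALT = re.compile(r"\s+(wcieraj|wmasuj|centymetry|)\s+", re.IGNORECASE)
WITHOUT_EMPTY_ALT = re.compile(r"\s+(wcieraj|wmasuj|centymetry)\s+", re.IGNORECASE)


def hits(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text (hypothetical helper)."""
    return pattern.search(text) is not None


# Invented samples: one harmless text ending in CR LF, one containing a keyword.
harmless = "test text\r\n"
spammy = "prosze wmasuj codziennie"

for label, sample in (("harmless", harmless), ("spammy", spammy)):
    print(label, hits(WITH_EMPTY_ALT, sample), hits(WITHOUT_EMPTY_ALT, sample))
```

Under these assumptions the pattern with the empty alternative fires on the harmless sample, because the group matches nothing and the two \s+ consume the carriage return and the line feed. The pattern without the trailing | fires only on the sample that contains a keyword.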