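This is a canonical question about fighting spam.
Also related:
There are so many techniques and so much to know about fighting spam. What widely used techniques and technologies are available to administrators, domain owners and end users to help keep the junk out?
We're looking for an answer that covers different technologies from various angles. The accepted answer should include a variety of technologies (e.g. SPF/SenderID, DomainKeys/DKIM, greylisting, DNS RBLs, reputation services, filtering software [SpamAssassin, etc.]); best practices (e.g. mail on port 25 should never be allowed to relay, port 587 should be used, etc.); terminology (e.g. open relay, backscatter, MSA/MTA/MUA, spam/ham); and possibly other techniques.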
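To defeat your enemy, you must know your enemy.
What is spam?
For our purposes, spam is any unsolicited bulk electronic message. Spam these days is intended to lure unsuspecting users into visiting a (usually shady) web site where they will be asked to buy products, or have malware delivered to their computers, or both. Some spam delivers malware directly.
It may surprise you to learn that the first spam was sent in 1864. It was an advertisement for dental services, sent via Western Union telegram. The word itself is a reference to a scene in Monty Python's Flying Circus.
Spam, in this case, does not refer to mailing list traffic a user subscribed to, even if they have changed their mind later (or forgot about it) but have not actually unsubscribed yet.
Why is spam a problem?
Spam is a problem because it works for the spammers. Spam typically generates more than enough sales (or malware delivery, or both) to cover the spammer's cost of sending it. The spammer does not consider the costs to the recipient, you and your users. Even when only a tiny minority of the users who receive spam respond to it, that is enough.
So you get to pay the bills for bandwidth, servers, and administrator time to deal with incoming spam.
We block spam for these reasons: we don't want to see it, to reduce our costs of handling email, and to make spamming more expensive for the spammers.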
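How does spam work?
Spam typically is delivered in different ways from normal, legitimate email.
Spammers almost always want to obscure the origin of the email, so a typical spam will contain fake header information. The From: address is usually fake. Some spam includes fake Received: lines in an attempt to disguise the trail. A lot of spam is delivered via open SMTP relays, open proxy servers and botnets. All of these methods make it more difficult to determine who originated the spam.
Once in the user's inbox, the purpose of the spam is to entice the user to visit the advertised web site. There, the user will be enticed to make a purchase, or the site will attempt to install malware on the user's computer, or both. Or, the spam will ask the user to open an attachment that contains malware.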
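How do I stop spam?
As the system administrator of a mail server, you will configure your mail server and domain to make it more difficult for spammers to deliver their spam to your users.
I will be covering issues specifically focused on spam and may skip over things not directly related to spam (such as encryption).
Don't run an open relay
The big mail server sin is to run an open relay, an SMTP server which will accept mail for any destination and deliver it onward. Spammers love open relays because they virtually guarantee delivery. The relays take on the load of delivering messages (and retrying!) while the spammer does something else. They make spamming cheap.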
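Open relays also contribute to the problem of backscatter. These are messages which were accepted by the relay but then found to be undeliverable. The open relay will then send a bounce message, containing a copy of the spam, to the From: address.
To test for an open relay, consider a message whose From: and To: addresses are both outside your domain. The message should be rejected. (Or use an online service such as MX Toolbox to perform the test, but be aware that some online services will submit your IP address to blacklists if your mail server fails the test.)
Reject anything that looks too suspicious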
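Various misconfigurations and errors can be a tip-off that an incoming message is likely to be spam or otherwise illegitimate.
… connections … HELO/EHLO.
… connections where the HELO/EHLO is: …
Authenticate your users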
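Mail arriving at your servers should be thought of in terms of inbound mail and outbound mail. Inbound mail is any mail arriving at your SMTP server which is ultimately destined for your domain; outbound mail is any mail arriving at your SMTP server which will be transferred elsewhere before being delivered (e.g. it's going to another domain). Inbound mail can be handled by your spam filters, and may come from anywhere but must always be destined for your users. This mail cannot be authenticated, because it is not possible to give credentials to every site which might send you mail.
Outbound mail, that is, mail which will be relayed, must be authenticated. This is the case whether it comes from the Internet or from inside your network (though you should restrict the IP address ranges allowed to use your mail server if operationally possible); this is because spambots might be running inside your network. So, configure your SMTP server such that mail bound for other networks will be dropped (relay access will be denied) unless that mail is authenticated. Better still, use separate mail servers for inbound and outbound mail, allow no relaying at all for the inbound ones, and allow no unauthenticated access to the outbound ones.
If your software allows this, you should also filter messages according to the authenticated user; if the from address of the mail does not match the user who authenticated, it should be rejected. Don't silently update the from address; the user should be aware of the configuration error.
You should also log the username that is used to send mail, or add an identifying header to it. This way, if abuse does occur, you have evidence and know which account was used to do it. This allows you to isolate compromised accounts and problem users, and is especially valuable for shared hosting providers.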
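As a rough illustration of the inbound/outbound rules above, here is a minimal Python sketch of the decision a server makes for each recipient. It is not any particular MTA's configuration: `LOCAL_DOMAINS`, `check_recipient` and the `Decision` type are hypothetical names, and it assumes the authenticated username is the user's full email address.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value for illustration only.
LOCAL_DOMAINS = {"example.com"}


@dataclass
class Decision:
    accept: bool
    reason: str  # worth logging together with the authenticated username


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def check_recipient(mail_from: str, rcpt_to: str,
                    auth_user: Optional[str]) -> Decision:
    """Apply the inbound/outbound rules to one MAIL FROM / RCPT TO pair."""
    if domain_of(rcpt_to) in LOCAL_DOMAINS:
        # Inbound mail: cannot be authenticated, goes on to the spam filters.
        return Decision(True, "inbound")
    if auth_user is None:
        # Outbound (relayed) mail without authentication: relay access denied.
        return Decision(False, "relay access denied")
    if mail_from.lower() != auth_user.lower():
        # Reject instead of silently rewriting the sender address.
        return Decision(False, "sender does not match authenticated user")
    return Decision(True, f"relay for {auth_user}")
```

With these assumptions, an unauthenticated message for `b@other.example` is refused, while mail for `bob@example.com` is accepted from anyone and handed to the filters.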
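Filter traffic
You want to be certain that mail leaving your network is actually being sent by your (authenticated) users, not by bots or people from outside. The specifics of how you do this depend on exactly what kind of system you are administering.
Generally, blocking egress traffic on ports 25, 465, and 587 (SMTP, SMTP/SSL, and Submission) for everything but your outbound mail servers is a good idea if you are a corporate network. This is so that malware-running bots on your network cannot send spam from your network either to open relays on the Internet or directly to the final MTA for an address.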
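The egress rule for a corporate network can be sketched as a simple predicate. This is only an illustration of the policy, not firewall configuration; the server address and the name `egress_allowed` are hypothetical.

```python
import ipaddress

# Hypothetical address for illustration only.
OUTBOUND_MAIL_SERVERS = {ipaddress.ip_address("192.0.2.25")}
MAIL_PORTS = {25, 465, 587}  # SMTP, SMTP/SSL, Submission


def egress_allowed(src_ip: str, dst_port: int) -> bool:
    """Return True if an outgoing connection should be let through."""
    if dst_port not in MAIL_PORTS:
        return True  # not mail traffic; outside the scope of this rule
    # Only the designated outbound mail servers may talk SMTP to the Internet.
    return ipaddress.ip_address(src_ip) in OUTBOUND_MAIL_SERVERS
```

In a real deployment the same logic would live in the border firewall's rule set rather than in application code.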
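Hotspots are a special case because legitimate mail from them originates from many different domains, but (because of SPF, among other things) a "forced" mailserver is inappropriate and users should be using their own domain's SMTP server to submit mail. This case is much harder, but using a specific public IP or IP range for Internet traffic from these hosts (to protect your site's reputation), throttling SMTP traffic, and deep packet inspection are solutions to consider.
Historically, spambots have issued spam mainly on port 25, but nothing prevents them from using port 587 for the same purpose, so changing the port used for inbound mail is of dubious value. However, using port 587 for mail submission is recommended by RFC 2476, and allows for a separation between mail submission (to the first MTA) and mail transfer (between MTAs) where that is not obvious from network topology; if you require such separation, you should do this.
If you are an ISP, VPS host, colocation provider, or similar, or are providing a hotspot for use by visitors, blocking egress SMTP traffic can be problematic for users who are sending mail using their own domains. In all cases except a public hotspot, you should require users who need outbound SMTP access because they're running a mailserver to specifically request it. Let them know that abuse complaints will ultimately result in that access being terminated to protect your reputation.
Dynamic IPs, and those used for virtual desktop infrastructure, should never have outbound SMTP access except to the specific mailserver those nodes are expected to use. These types of IPs should also appear on blacklists and you should not attempt to build reputation for them. This is because they are extremely unlikely to be running a legitimate MTA.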
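Consider using SpamAssassin
SpamAssassin is a mail filter which can be used to identify spam based on the message headers and content. It uses a rules-based scoring system to determine the likelihood that a message is spam. The higher the score, the more likely the message is spam.
SpamAssassin also has a Bayesian engine which can analyze spam and ham (legitimate email) samples fed back into it.
The best practice for SpamAssassin is not to reject the mail, but to put it in a Junk or Spam folder. MUAs (mail user agents) such as Outlook and Thunderbird can be set up to recognize the headers that SpamAssassin adds to email messages and to file them appropriately. False positives can and do happen, and while they're rare, when it happens to the CEO, you will hear about it. That conversation will go much better if the message was simply delivered to the Junk folder rather than rejected outright.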
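To make the "file it, don't reject it" idea concrete, here is a small Python sketch of what a delivery script or mail client rule might do. It assumes SpamAssassin is adding its usual `X-Spam-Flag: YES` header to messages it considers spam; the folder names and the function `target_folder` are hypothetical.

```python
from email import message_from_string


def target_folder(raw_message: str) -> str:
    """Choose a folder based on the X-Spam-Flag header, if present."""
    msg = message_from_string(raw_message)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    # File rather than reject, so a false positive can still be recovered.
    return "Junk" if flag == "YES" else "INBOX"


SAMPLE = (
    "From: someone@example.net\n"
    "Subject: hello\n"
    "X-Spam-Flag: YES\n"
    "\n"
    "body\n"
)
```

Here `target_folder(SAMPLE)` selects the Junk folder, while a message without the header stays in the inbox.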
SpamAssassin is almost one-of-a-kind, though a few alternatives exist.
… sa-update.
Consider using DNS-based blackhole lists and reputation services
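DNSBLs (formerly known as RBLs, or realtime blackhole lists) provide lists of IP addresses associated with spam or other malicious activity. These are run by independent third parties based on their own criteria, so research carefully whether the listing and delisting criteria used by a DNSBL are compatible with your organization's need to receive email. For instance, a few DNSBLs have draconian delisting policies which make it very difficult for someone who was accidentally listed to be removed. Others automatically delist an IP address after it has not sent spam for a period of time, which is safer. Most DNSBLs are free to use.
Reputation services are similar, but claim to provide better results by analyzing more data relevant to any given IP address. Most reputation services require a subscription payment or hardware purchase or both.
There are many DNSBLs and reputation services available, though some of the better known and useful ones I use and recommend are:
Conservative lists:
Aggressive lists:
As mentioned before, many others are available and may suit your needs. One of my favorite tricks is to look up the IP address which delivered a spam that got through against several DNSBLs, to see which of them would have rejected it.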
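The lookup trick is easy to script. A DNSBL is queried by reversing the octets of the IPv4 address, appending the list's zone, and asking DNS for that name; if an address record comes back, the IP is listed. The sketch below shows this in Python. The zone names used with it are placeholders, the `resolve` parameter is only there so the function can be tried without network access, and IPv6 is not handled.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    addr = ipaddress.IPv4Address(ip)  # IPv4 only in this sketch
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def listed_on(ip: str, zones, resolve=socket.gethostbyname):
    """Return the zones that list `ip` (an answer means 'listed')."""
    hits = []
    for zone in zones:
        try:
            resolve(dnsbl_query_name(ip, zone))
        except OSError:
            continue  # no such name or lookup failure: treated as not listed
        hits.append(zone)
    return hits
```

Treating every lookup failure as "not listed" is a simplification; a production check would want to distinguish a timeout from a genuine negative answer.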
Use SPF
SPF (Sender Policy Framework; RFC 4408 and RFC 6652) is a means to prevent email address spoofing by declaring which Internet hosts are authorized to deliver mail for a given domain name.
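To show how a receiver uses such a declaration, here is a deliberately reduced Python sketch that evaluates only the `ip4:` and `all` mechanisms of an SPF record that has already been fetched from DNS. Real SPF also has `a`, `mx`, `include`, `redirect` and other terms that this ignores, and the function name `spf_ip4_result` and the example record are made up for illustration.

```python
import ipaddress


def spf_ip4_result(record: str, client_ip: str) -> str:
    """Evaluate the ip4: and all mechanisms of an SPF record, left to right.

    Returns "pass", "fail", "softfail" or "neutral"; "none" means the text
    is not an SPF record at all.
    """
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    ip = ipaddress.ip_address(client_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in qualifiers:
            qualifier, term = term[0], term[1:]
        if term.lower() == "all":
            return qualifiers[qualifier]
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return qualifiers[qualifier]
    return "neutral"  # nothing matched and the record has no "all"


EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.7 -all"
```

With the example record, a connection from `192.0.2.10` passes and one from `203.0.113.5` fails because of the trailing `-all`.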
… -all to reject all others.
Investigate DKIM
DKIM (DomainKeys Identified Mail; RFC 6376) is a method of embedding digital signatures in mail messages which can be verified using public keys published in the DNS. It is patent-encumbered in the US, which has slowed its adoption. DKIM signatures can also break if a message is modified in transit (e.g. SMTP servers occasionally may repack MIME messages).
Consider using greylisting
Greylisting is a technique where the SMTP server issues a temporary rejection for an incoming message, rather than a permanent rejection. When the delivery is retried in a few minutes or hours, the SMTP server will then accept the message.
Greylisting can stop some spam software which is not robust enough to differentiate between temporary and permanent rejections, but does not help with spam that was sent to an open relay or with more robust spam software. It also introduces delivery delays which users may not always tolerate.
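The mechanism can be sketched in a few lines of Python. This version keys on the common (client IP, sender, recipient) triplet and uses a five-minute minimum delay; both are assumptions for illustration rather than anything prescribed above, and a real implementation would also expire old entries and persist its state.

```python
import time


class Greylist:
    """Temporarily reject the first attempt from each (ip, sender, recipient)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a sender must wait before a retry counts
        self.clock = clock          # injectable so the class can be tested
        self.first_seen = {}        # triplet -> time of the first attempt

    def check(self, client_ip, mail_from, rcpt_to):
        """Return an SMTP-style reply: 451 (try again later) or 250 (accept)."""
        triplet = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        first = self.first_seen.setdefault(triplet, now)
        if now - first < self.min_delay:
            return "451 4.7.1 Greylisted, please try again later"
        return "250 OK"
```

A legitimate server that queues and retries after the delay gets `250 OK` on the later attempt; software that never retries never gets past the `451`.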
Consider using nolisting
Nolisting is a method of configuring your MX records such that the highest priority (lowest preference number) record does not have a running SMTP server. This relies on the fact that a lot of spam software will only try the first MX record, while legitimate SMTP servers try all MX records in ascending order of preference. Some spam software also attempts to send directly to the lowest priority (highest preference number) MX record in violation of RFC 5321, so that could also be set to an IP address without an SMTP server. This is reported to be safe, though as with anything, you should test carefully first.
Consider a spam filtering appliance
Place a spam filtering appliance such as Cisco IronPort or Barracuda Spam & Virus Firewall (or other similar appliances) in front of your existing SMTP server to take much of the work out of reducing the spam you receive. These appliances are pre-configured with DNSBLs, reputation services, Bayesian filters and the other features I've covered, and are updated regularly by their manufacturers.
Consider hosted email services
If it's all too much for you (or your overworked IT staff) you can always have a third party service provider handle your email for you. Services such as Google's Postini, Symantec MessageLabs Email Security (or others) will filter messages for you. Some of these services can also handle regulatory and legal requirements.
What guidance should sysadmins give to end users regarding fighting spam?
The absolute #1 thing that end users should do to fight spam is:
DO NOT RESPOND TO THE SPAM.
If it looks funny, don't click the website link and don't open the attachment, no matter how attractive the offer seems. That Viagra isn't that cheap, you aren't really going to get naked pictures of anybody, and there is no $15 million in Nigeria or elsewhere except for the money taken from people who did respond to the spam.
If you see a spam message, mark it as Junk or Spam depending on your mail client.
DO NOT mark a message as Junk/Spam if you actually signed up to receive the messages and just want to stop receiving them. Instead, unsubscribe from the mailing list using the unsubscribe method provided.
Check your Junk/Spam folder regularly to see if any legitimate messages got through. Mark these as Not Junk/Not Spam and add the sender to your contacts to prevent their messages from being marked as spam in the future.
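I've managed over 100 separate mail environments over the years and have used numerous processes to reduce or help eliminate spam.
Technology has evolved over time, so this answer will walk through some of the things I've tried in the past and detail the current state of affairs.
A few thoughts about protection...
Regarding incoming spam...
My current approach:
I'm a strong advocate of appliance-based spam solutions. I want to reject at the perimeter of the network and save the CPU cycles at the mail server level. Using an appliance also provides some independence from the actual mail server (mail delivery agent) solution.
I recommend Barracuda Spam Filter appliances for a number of reasons. I've deployed several dozen units, and the web-driven interface, industry mindshare and set-and-forget appliance nature make it a winner. The backend technology incorporates many of the techniques listed above.
Barracuda Spam & Virus Firewall 300 status console
Newer approach:
I've been experimenting with Barracuda's cloud-based Email Security Service over the past month. This is similar to other hosted solutions, but is well suited to smaller sites, where an expensive appliance is cost-prohibitive. For a nominal yearly fee, this service provides about 85% of what the hardware appliance does. The service can also be run in tandem with an onsite appliance to reduce incoming bandwidth and provide another layer of security. It's also a nice buffer that can spool mail in the event of a server outage. The analytics are still useful, although not as detailed as a physical unit's.
Barracuda Cloud Email Security console
All in all, I've tried many solutions, but given the scale of certain environments and the increasing demands of the user base, I want the most elegant solutions available. Taking the multi-pronged approach and "rolling your own" is certainly possible, but I've done well with some basic security and good usage monitoring of the Barracuda device. Users are very happy with the result.
Note: Cisco Ironport is great as well... just costlier.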
Partly, I endorse what others have said; partly, I don't.
Spamassassin
This works very well for me, but you need to spend some time training the Bayesian filter with both ham and spam.
Greylisting
ewwhite may feel its day has come and gone, but I can't agree. One of my clients asked how effective my various filters were, so here are approximate stats for July 2012 for my personal mailserver:
So about 44000 never made it through the greylisting; if I'd not had greylisting, and had accepted all those, they'd have all needed spam filtering, all using CPU and memory, and indeed bandwidth.
Edit: since this answer seems to have been useful to some people, I thought I'd bring the statistics up-to-date. So I re-ran the analysis on the mail logs from Jan 2015, 2.5 years later.
The numbers aren't directly comparable, because I no longer have a note of how I arrived at the 2012 figures, so I can't be sure the methodologies were identical. But I have confidence that I didn't have to run computationally-expensive spam filtering on an awful lot of content back then, and I still don't, because of greylisting.
SPF
This isn't really an anti-spam technique, but it can reduce the amount of backscatter you have to deal with, if you're joe-jobbed. You should use it both in and out, that is: You should check the SPF record of the sender for incoming email, and accept/reject accordingly. You should also publish your own SPF records, listing fully all machines that are approved to send mail as you, and lock out all others with -all. SPF records that don't end in -all are completely useless.
Blackhole lists
RBLs are problematic, since one can get onto them through no fault of one's own, and they can be hard to get off. Nevertheless, they have a legitimate use in spam-fighting but I would strongly suggest that no RBL should ever be used as a bright-line test for mail acceptance. The way spamassassin handles RBLs - by using many, each of which contributes towards a total score, and it's this score that makes the accept/reject decision - is much better.
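The difference between a bright-line test and a scored one is easy to see in a toy example. The sketch below is not how SpamAssassin is implemented; the list names, weights and threshold are invented purely to show several lists each contributing to a total that makes the decision.

```python
# Hypothetical list names, weights and threshold, for illustration only.
WEIGHTS = {
    "dnsbl-a.example": 3.0,
    "dnsbl-b.example": 1.5,
    "dnsbl-c.example": 1.0,
}
THRESHOLD = 5.0


def rbl_score(hits):
    """Sum the weights of the lists that reported the sending IP."""
    return sum(WEIGHTS.get(zone, 0.0) for zone in hits)


def is_spam(hits, other_score=0.0):
    """Combine list hits with the score from other rules; no single list decides."""
    return rbl_score(hits) + other_score >= THRESHOLD
```

With these numbers a hit on one list alone never crosses the threshold, so a sender who lands on a single list by accident still gets through unless other evidence agrees.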
Dropbox
I don't mean the commercial service, I mean that my mail server has one address which cuts through all my greylisting and spam-filtering, but which instead of delivering to anyone's INBOX, it goes to a world-writeable folder in /var, which is automatically pruned nightly of any emails over 14 days old.
I encourage all users to take advantage of it when e.g. filling out email forms that require a validatable email address, where you're going to receive one email that you need to keep, but from whom you never wish to hear again, or when buying from online vendors who will likely sell and/or spam their address (particularly those outside the reach of European privacy laws). Instead of giving her real address, a user can give the dropbox address, and look in the dropbox only when she expects something from a correspondent (usually a machine). When it arrives, she can pick it out and save it in her proper mail collection. No user need look in the dropbox at any other time.
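For anyone wanting to copy the idea, the nightly pruning is the only moving part. Here is a minimal Python sketch of it, assuming one file per message in a single directory; the function name `prune_old_mail` is made up, and the folder path is left as a parameter because the real location under /var is site-specific.

```python
import os
import time


def prune_old_mail(folder, max_age_days=14, now=None):
    """Delete regular files in `folder` last modified more than max_age_days ago.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    with os.scandir(folder) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue  # leave subdirectories and symlinks alone
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Run from a nightly scheduled job, this keeps the folder from growing without bound while leaving recent messages for their intended readers.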
I am using a number of techniques which reduce spam to acceptable levels.
Delay accepting connections from incorrectly configured servers. A majority of the Spam I receive is from Spambots running on malware-infected systems. Almost all of these do not pass rDNS validation. Delaying for 30 seconds or so before each response causes most Spambots to give up before they have delivered their message. Applying this only to servers which fail rDNS avoids penalizing properly configured servers. Some incorrectly configured legitimate bulk or automated senders get penalized, but do deliver with minimal delay.
Configuring SPF for all your domains protects your domains. Most sub-domains should not be used to send email. The main exception is MX domains, which must be able to send mail on their own. A number of legitimate senders delegate bulk and automated mail to servers that are not permitted by their policy. Deferring rather than rejecting based on SPF allows them to fix their SPF configuration, or you to whitelist them.
Requiring an FQDN (Fully Qualified Domain Name) in the HELO/EHLO command. Spam often uses an unqualified hostname, address literals, IP addresses, or an invalid TLD (Top Level Domain). Unfortunately some legitimate senders use invalid TLDs, so it may be more appropriate to defer in this case. This can require monitoring and whitelisting to enable the mail through.
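A syntactic version of the HELO/EHLO check can be sketched in Python as follows. It only catches the obvious cases named above (unqualified names, address literals, bare IP addresses, malformed labels); it does not check the TLD against the real list of valid TLDs, and the name `helo_looks_valid` is hypothetical.

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, not starting or ending with "-".
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def helo_looks_valid(helo: str) -> bool:
    """Return True if the HELO/EHLO argument is a plausible FQDN."""
    name = helo.strip().rstrip(".")
    if not name or name.startswith("["):
        return False  # empty, or an address literal such as [192.0.2.1]
    try:
        ipaddress.ip_address(name)
        return False  # bare IP address
    except ValueError:
        pass
    labels = name.split(".")
    if len(labels) < 2:
        return False  # unqualified hostname
    if not all(LABEL.match(label) for label in labels):
        return False  # malformed label
    return not labels[-1].isdigit()  # a TLD is never all digits
```

Whether a failure here leads to a rejection or only a deferral is a policy choice; as noted above, deferring is the gentler option for senders who are merely misconfigured.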
DKIM helps with non-repudiation, but is otherwise not highly useful. My experience is that Spam is not likely to be signed. Ham is more likely to be signed so it has some value in Spam scoring. A number of legitimate senders don't publish their public keys, or otherwise improperly configure their system.
Greylisting is helpful for servers which show some signs of misconfiguration. Servers that are properly configured will get through eventually, so I tend to exclude them from greylisting. It is useful to greylist freemailers as they do tend to be used occasionally for Spam. The delay gives some of the Spam filter inputs time to catch the Spammer. It also tends to deflect Spambots as they usually don't retry.
Blacklists and Whitelists can help as well.
Spam filtering software is reasonably good at finding Spam although some will get through. It can be tricky getting the false negative to a reasonable level without increasing the false positive too much. I find Spamassassin catches most of the Spam that reaches it. I've added a few custom rules, that fit my needs.
Postmasters should configure the required abuse and postmaster addresses. Acknowledge the feedback you get to these addresses and act on it. This allows others to help you ensure your server is properly configured and not originating Spam.
If you are a developer, use the existing email services rather than setting up your own server. It is my experience that servers set up for automated mail senders are likely to be incorrectly configured. Review the RFCs and send properly formatted email from a legitimate address in your domain.
End users can do a number of things to help reduce Spam:
Domain owners / ISPs can help by limiting Internet access on port 25 (SMTP) to official e-mail servers. This will limit the ability of Spambots to send to the Internet. It also helps when dynamic addresses return names which do not pass rDNS validation. Even better is to verify that the PTR records for mail servers do pass rDNS validation. (Check for typographical errors when configuring PTR records for your clients.)
I have started classifying email in three categories:
The SINGLE most effective solution I have seen is to use one of the external mail filtering services.
I have experience with the following services at current clients. I am sure there are others. Each of these has done an excellent job in my experience. The cost is reasonable for all three.
The services have several huge advantages over local solutions.
They stop most (>99%) of the spam BEFORE it hits your internet connection and your email server. Given the volume of spam, this is a lot of data not on your bandwidth and not on your server. I have implemented one of these services a dozen times and every one resulted in a noticeable performance improvement to the email server.
They also do anti-virus filtering, typically both directions. This mitigates the need to have a "mail anti-virus" solution on your server, and also keeps the virii completely
They also do a great job at blocking spam. In 2 years working at a company using MXLogic, I have never had a false positive, and can count the legit spam messages that got through on one hand.
No two mail environments are the same. So building an effective solution will require a lot of trial and error around the many different techniques available because the content of email, traffic, software, networks, senders, recipients and a lot more will all vary hugely across different environments.
However I find the following block lists (RBLs) to be well suited for general filtering:
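As mentioned before, SpamAssassin is a great solution if configured correctly; just make sure to install as many of the add-on Perl modules from CPAN as possible, as well as Razor, Pyzor and DCC. Postfix works very well with SpamAssassin, and it is a lot easier to manage and configure than Exim.
Finally, blocking clients at the IP level using something like fail2ban and iptables or similar for short periods of time (say a day to a week) after certain events, such as triggering a hit on an RBL for abusive behaviour, can also be very effective. Why waste resources talking to a known virus-infected host?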
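The bookkeeping behind such a short-lived block is simple. The Python sketch below is not how fail2ban itself works (fail2ban watches log files and drives the firewall); it only illustrates a ban that lapses by itself, with a one-day default chosen from the range suggested above and a hypothetical class name.

```python
import time


class TemporaryBans:
    """Track IPs blocked for a fixed period after an abusive event."""

    def __init__(self, ban_seconds=86400, clock=time.time):
        self.ban_seconds = ban_seconds  # one day by default (an assumption)
        self.clock = clock              # injectable so the class can be tested
        self.expires = {}               # ip -> time at which the ban ends

    def ban(self, ip):
        """Start (or restart) the ban period for this IP."""
        self.expires[ip] = self.clock() + self.ban_seconds

    def is_banned(self, ip):
        """Return True while the ban is in force; forget it once it lapses."""
        until = self.expires.get(ip)
        if until is None:
            return False
        if self.clock() >= until:
            del self.expires[ip]
            return False
        return True
```

Calling `ban()` when an RBL hit or similar event is seen, and checking `is_banned()` before doing any further work for a client, gives the "don't talk to known-bad hosts" behaviour without blocking anyone permanently.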