我有一个位于公共 IP 后面的 DNS 服务器集群。
这些服务器有时会解决,但有时它们只是为任何查询返回一个 ServFail 错误代码
我的设置不是典型的(这是继承的)。
基本上在服务器上有一个名为 gi 的命名空间,这里是新服务调用 srv-gi ''' 使用命名服务的地方
#!/bin/sh
start_service() {
ip netns exec gi /usr/sbin/zebra -d -A 127.0.0.1 -f /etc/quagga/zebra.conf
ip netns exec gi /usr/sbin/bgpd -d -A 127.0.0.1 -f /etc/quagga/bgpd.conf
#DNS service
ip netns exec gi /usr/sbin/named -u named -c /etc/gi-named.conf
}
start_service
'''
named.conf 文件也已重命名为 gi-named.conf 文件。
// // named.conf // // 由 Red Hat 绑定包提供,用于将 ISC BIND named(8) DNS // 服务器配置为仅缓存名称服务器(仅作为 localhost DNS 解析器)。// // 参见 /usr/share/doc/bind*/sample/ 例如命名的配置文件。// // 有关位于 /usr/share/doc/bind-{version}/Bv9ARM.html 中的配置的详细信息,请参阅 BIND 管理员参考手册 (ARM)
options {
listen-on port 53 { Public IP; };
#listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
recursing-file "/var/named/data/named.recursing";
secroots-file "/var/named/data/named.secroots";
allow-query { any; };
allow-query-on { PublicIP; };
/*
- If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
- If you are building a RECURSIVE (caching) DNS server, you need to enable
recursion.
- If your recursive DNS server has a public IP address, you MUST enable access
control to limit queries to your legitimate users. Failing to do so will
cause your server to become part of large scale DNS amplification
attacks. Implementing BCP38 within your network would greatly
reduce such attack surface
*/
recursion yes;
allow-query-cache { Internal Range; };
allow-query-cache-on { PublicIP; };
query-source address Public IP ;
dnssec-enable yes;
dnssec-validation yes;
/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";
managed-keys-directory "/var/named/dynamic";
pid-file "/run/named/named.pid";
session-keyfile "/run/named/session.key";
};
logging
{
/* If you want to enable debugging, eg. using the 'rndc trace' command,
* named will try to write the 'named.run' file in the $directory (/var/named).
* By default, SELinux policy does not allow named to modify the /var/named directory,
* so put the default debug log file in data/ :
*/
/*channel default_debug {
print-time yes;
print-category yes;
print-severity yes;
file "data/named.run";
severity dynamic;
};*/
channel queries_log {
file "/var/log/queries" versions 1 size 20m;
print-time yes;
print-category yes;
print-severity yes;
severity debug 3;
};
category queries { queries_log; };
category client { queries_log; };
};
zone "." IN {
type hint;
file "named.ca";
};
include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";
另请注意,我有一个 quagga riuter 配置为允许通过公共 IP 进行 DNS 解析
/etc/quagga/bgpd.conf
!
! Zebra configuration saved from vty
! 2019/10/11 10:11:45
!
!
router bgp AS
bgp router-id PublicIP
network PublicIP/32
network CoreIP/32
neighbor DUB1-WGW peer-group
neighbor DUB1-WGW remote-as AS
neighbor DUB1-WGW soft-reconfiguration inbound
neighbor DUB1-WGW route-map XXXXX out
neighbor CoreBGPIP peer-group DUB1-WGW
neighbor CoreBGPIP peer-group DUB1-WGW
!
ip prefix-list XXXX seq 5 permit PublicIP/32
ip prefix-list XXXX seq 10 permit PrivateIP/32
!
route-map DNS_TO_GI permit 10
match ip address prefix-list XXXXX
!
line vty
!
/etc/quagga/zebra.conf
!
! Zebra configuration saved from vty
! 2019/10/11 10:11:45
!
hostname hostname
!
interface ens160
ipv6 nd suppress-ra
!
interface ens192
ipv6 nd suppress-ra
!
interface ens192.890
ipv6 nd suppress-ra
!
interface ens192.892
ipv6 nd suppress-ra
!
interface XX
ipv6 nd suppress-ra
!
interface lo
!
ip prefix-list XX seq 5 permit PublicIP3/32
ip prefix-list XX seq 10 permit PrivateIP/32
!
route-map XXXX permit 10
match ip address prefix-list XXX
!
!
!
line vty
!
# show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, A - Babel,
> - selected route, * - FIB route
B>* 0.0.0.0/0 [20/10] via neighbor IP, ens192.892, 00:02:18
C>* 127.0.0.0/8 is directly connected, lo
C>* Public IP/32 is directly connected, lo
C>* NeighborSubnet/30 is directly connected, ens192.890
C>* NeighborIP/30 is directly connected, ens192.892
C>* LocalIP/32 is directly connected, lo
我正在使用测试 APN 测试分辨率,虽然当我引入第二个 APN 时我可以将一个 APN 作为 sson 获得分辨率,但我只是在 tcpdump 中遇到以下错误:
11:29:38.065284 IP PublicIP.domain > internal IP.p2pcommunity: 30622 ServFail 0/0/0 (44)
11:29:38.265736 IP PublicIP.domain > internal IP.32209: 12606 ServFail 0/0/0 (37)
11:29:38.266037 IP PublicIP.domain > internal IP.10793: 26678 ServFail 0/0/0 (37)
11:29:38.295727 IP PublicIP.domain > internal IP.ibm_wrless_lan: 23483 ServFail 0/0/0 (33)
11:29:38.296038 IP PublicIP.domain > internal IP.22097: 8347 ServFail 0/0/0 (33)
11:29:38.297532 IP PublicIP.domain > internal IP.31026: 23400 ServFail 0/0/0 (38)
11:29:38.298117 IP PublicIP.domain > internal IP.23707: 26481 ServFail 0/0/0 (38)
并从 /var/log/queries
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): error
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): send
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): sendto
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): error
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): senddone
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): next
22-Sep-2020 11:31:07.552 client: debug 3: client InternalIP#61793 (www.facebook.com): endrequest
22-Sep-2020 11:31:07.553 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): send
22-Sep-2020 11:31:07.553 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): sendto
22-Sep-2020 11:31:07.553 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): senddone
22-Sep-2020 11:31:07.553 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): next
22-Sep-2020 11:31:07.553 client: debug 3: client InternalIP#48008 (2.android.pool.ntp.org): endrequest
我真的不确定如何解决这个问题,任何指针或建议将不胜感激
dig 命令的输出
dig facebook.com
; <<>> DiG 9.9.4-RedHat-9.9.4-74.el7_6.1 <<>> facebook.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7204
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;facebook.com. IN A
;; ANSWER SECTION:
facebook.com. 93 IN A 31.13.86.36
;; Query time: 2 msec
;; SERVER: internal DNS#53(Internal DNS)
;; WHEN: Tue Sep 22 19:38:58 UTC 2020
;; MSG SIZE rcvd: 57
dig @PublicIP facebook.com
; <<>> DiG 9.9.4-RedHat-9.9.4-74.el7_6.1 <<>> @PublicIP facebook.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
dig @208.67.222.222 facebook.com
; <<>> DiG 9.9.4-RedHat-9.9.4-74.el7_6.1 <<>> @208.67.222.222 facebook.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
ip netns exec gi tcpdump -n -f 'port 53' -i any
09:55:35.676645 IP PublicIP.domain > InternalIP.46571: 36451 ServFail 0/0/0 (32)
09:55:35.676939 IP PublicIP.domain > InternalIP.37817: 52592 ServFail 0/0/0 (32)
09:55:35.677865 IP PublicIP.domain > InternalIP41737: 52624 ServFail 0/0/0 (32)
09:55:35.713870 IP PublicIP.34042 > 193.0.14.129.domain: 11264 [1au] A? mtalk.google.com. (45)
09:55:35.713914 IP PublicIP.11218 > 193.0.14.129.domain: 3623 [1au] NS? . (28)
09:55:35.768649 IP 193.0.14.129.domain > PublicIP.11218: 3623*-| 0/0/1 (28)
09:55:35.784456 IP 193.0.14.129.domain > PublicIP.34042: 11264-| 0/0/1 (45)
09:55:36.045130 IP PublicIP.wcbackup > 192.112.36.4.domain: 28368 A? update.googleapis.com. (39)
09:55:36.063323 IP InternalIP.49382 > PublicIP.domain: 57145+ A? accounts.google.com. (37)
09:55:36.064459 IP PublicIP.48169 > 193.0.14.129.domain: 15825 [1au] A? accounts.google.com. (48)
09:55:36.065883 IP APNIP.54312 > PublicIP.domain: 53585+ A? accounts.google.com. (37)
09:55:36.080202 IP 192.112.36.4.domain > PublicIP.wcbackup: 28368- 0/13/14 (499)
09:55:36.120905 IP 193.0.14.129.domain > PublicIP.48169: 15825- 0/15/27 (1182)
09:55:36.170289 IP InternalIP.59759 > PublicIP.domain: 52061+ A? www.google.com. (32)
09:55:36.224316 IP PublicIP.5346 > 192.112.36.4.domain: 40438 A? www.facebook.com. (34)
09:55:36.257993 IP 192.112.36.4.domain > PublicIP.5346: 40438- 0/13/14 (494)
09:55:36.441576 IP PublicIP.domain > InternalIP.65408: 45517 ServFail 0/0/0 (39)
09:55:36.441666 IP PublicIP.domain > InternalIP.60664: 54663 ServFail 0/0/0 (39)
09:55:36.442994 IP PublicIP.domain > InternalIP.48634: 56799 ServFail 0/0/0 (39)
09:55:36.443474 IP PublicIP.domain > InternalIP.36045: 34980 ServFail 0/0/0 (39)
所以我相信我会深入探讨这个问题。
基本上这个错误来自我对linux和bind服务的误解。
以前的一位同事构建了这些 DNS 服务器并创建了一个服务,该服务搭载了命名服务 /usr/local/bin/service-gi
该服务本质上是使用我的 quagga 虚拟路由器运行命名服务,并且它作为转发器工作(我必须更改配置,因此它现在是一个递归服务器)。
但是我犯的错误是启动和运行命名服务并与自定义服务并行运行(我这样做是为了监控目的,因为我们使用的工具只能识别通用命名服务而不是自定义服务)但是因为 2服务正在同时工作,无法解决查询。
一旦命名服务停止并且我只使用自定义服务,查询开始成功解决