Keep learning, keep living...

0%

PowerDNS模块机制分析

PowerDNS是一个开源DNS服务器。它包括了授权DNS服务器和递归DNS服务器。本文只讨论授权DNS服务器,主要分析PowerDNS的架构及Backend模块机制的实现。

PowerDNS由各种各样的backend来处理DNS解析相关数据,数据可以存储在BIND模式的ZONE文件中,也可以存储在关系型数据库中。官方实现了很多backend,如Bind, Postgresql, Mysql等。每种backend实现为一个SO文件。PowerDNS启动时根据配置中的launch语句加载指定的so文件。这种方式使得PowerDNS的可扩展性非常高。
简略的PowerDNS架构图如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
+-----------------------------------------------------------------------+
| | | |
| +------------+ | +-----------+ | |
| |main thread | | |PacketCache| | |
| +------------+ | +-----------+ | |
| | | |
| +------------+ | | |
| |DynListerner| | | |
| +------------+ | | +-----------+ +----------+ |
| | +---------+--|Distributor|---|DNSBackend| |
| +--------+ | | | +-----------+ +----------+ |
| |receiver|------+-------+ | |
| +--------+ | | | +-----------+ +----------+ |
| | +---------+--|Distributor|---|DNSBackend| |
| ... | | +-----------+ +----------+ |
| | | ... |
| | | |
| +--------+ | | +-----------+ +----------+ |
| |receiver|------+-------+---------+--|Distributor|---|DNSBackend| |
| +--------+ | | | +-----------+ +----------+ |
| | | | |
| | | | +-----------+ +----------+ |
| | +---------+--|Distributor|---|DNSBackend| |
| | | +-----------+ +----------+ |
| |
+-----------------------------------------------------------------------+

下面具体分析PowerDNS的启动流程。

1
2
3
4
5
6
7
8
9
10
11
12
if(::arg().mustDo("guardian") && !isGuarded(argv)) {
if(::arg().mustDo("daemon")) {
L.toConsole(Logger::Critical);
daemonize();
}
guardian(argc, argv);
// never get here, guardian will reinvoke process
cerr<<"Um, we did get here!"<<endl;
}

// we really need to do work - either standalone or as an instance
...

如果启动参数中指定了“–guardian=yes”,则PowerDNS运行在guardian模式,主进程fork出一个子进程执行正常的启动流程,而父进程则一秒钟探测一次子进程是否存活,如果子进程异常退出则重新启动子进程。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
for(;;) {
int ret=waitpid(pid,&status,WNOHANG);

if(ret<0) {
L<<Logger::Error<<"In guardian loop, waitpid returned error: "<<strerror(errno)<<endl;
L<<Logger::Error<<"Dying"<<endl;
exit(1);
}
else if(ret) // something exited
break;
else { // child is alive
// execute some kind of ping here
if(DLQuitPlease())
takedown(1); // needs a parameter..
setStatus("Child running on pid "+itoa(pid));
sleep(1);
}
}

pthread_mutex_lock(&g_guardian_lock);
close(g_fd1[1]);
fclose(g_fp);
g_fp=0;

if(WIFEXITED(status)) {
int ret=WEXITSTATUS(status);

if(ret==99) {
L<<Logger::Error<<"Child requested a stop, exiting"<<endl;
exit(1);
}
setStatus("Child died with code "+itoa(ret));
L<<Logger::Error<<"Our pdns instance exited with code "<<ret<<endl;
L<<Logger::Error<<"Respawning"<<endl;

goto respawn;
}

子进程或非guardian模式的原生进程执行正常的启动流程。

1
2
loadModules();
BackendMakers().launch(::arg()["launch"]); // vrooooom!

loadModules()会根据”load-modules”配置语句调用dlopen来加载相应的SO文件。BackendMakers().launch(::arg()[“launch”]);来加载launch语句指定的modules。然后开始监听指定的UDP和TCP端口。

1
2
3
4
N=new UDPNameserver; // this fails when we are not root, throws exception

if(!::arg().mustDo("disable-tcp"))
TN=new TCPNameserver;

接着调用mainthread()来初始化若干网络包接收线程。

1
2
3
unsigned int max_rthreads= ::arg().asNum("receiver-threads");
for(unsigned int n=0; n < max_rthreads; ++n)
pthread_create(&qtid,0,qthread, reinterpret_cast<void *>(n)); // receives packets

而每个接收线程则会创建若干自己的distributor线程。

1
DNSDistributor *distributor = DNSDistributor::Create(::arg().asNum("distributor-threads")); // the big dispatcher!

接下来接收线程开始自己处理网络数据。接收到DNS查询包后,首先在PacketCache中查询是否有相应的响应。PacketCache是PowerDNS中Backend响应的Cache,它减少对于Backend的访问,可以大大提高性能。如果在PacketCache中没有找到,则调用distributor->question分发给distributor线程。示意代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
for(;;) {
if(!(P=NS->receive(&question))) { // receive a packet inline
continue; // packet was broken, try again
}

if(P->couldBeCached() && PC.get(P, &cached)) { // short circuit - does the PacketCache recognize this question?
NS->send(&cached); // answer it then
continue;
}

distributor->question(P, &sendout);
}

在调用question方法时会传递一个回调函数sendout。distributor线程处理完DNS请求后将调用该回调函数将响应返回给客户端。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
void sendout(const DNSDistributor::AnswerData &AD)
{
static unsigned int &numanswered=*S.getPointer("udp-answers");
static unsigned int &numanswered4=*S.getPointer("udp4-answers");
static unsigned int &numanswered6=*S.getPointer("udp6-answers");

if(!AD.A)
return;

N->send(AD.A);
numanswered++;

if(AD.A->d_remote.getSocklen()==sizeof(sockaddr_in))
numanswered4++;
else
numanswered6++;

int diff=AD.A->d_dt.udiff();
avg_latency=(int)(1023*avg_latency/1024+diff/1024);

delete AD.A;
}

distributor线程中调用backend来处理DNS响应。示意代码如下:

1
2
3
4
5
B.lookup;
DNSResourceRecord rr;
while(yb.get(rr))
cout<<"Found cname pointing to '"+rr.content+"'"<<endl;
}

其中B为UeberBackend实例。UeberBackend是一个特殊的Backend。启动时被加载的其他backend向它注册,被保存在一个vector中。UeberBackend按注册顺序依次调用其他backend的相应成员方法。以reload操作为例:

1
2
3
4
5
6
7
void UeberBackend::reload()
{
for ( vector< DNSBackend * >::iterator i = backends.begin(); i != backends.end(); ++i )
{
( *i )->reload();
}
}

lookup和get成员方法实现比较特殊。UeberBackend有一个成员变量d_handle表示当前处理请求的真实backend.在lookup中,会将第一个注册的backend赋给d_handle, 并调用该backend的lookup。

1
(d_handle.d_hinterBackend=backends[d_handle.i++])->lookup(qtype, qname,pkt_p,zoneId);

UeberBackend的get方法会调用d_handle实例的get方法。而d_handle.get首先调用第一个backend的get方法。如果成功,则返回。否则调用下一个backend的lookup和get方法。直到get方法返回成功或者遍历完所有backend.这种实现保证了get方法所获取的响应是由本backend的lookup所生成。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
bool UeberBackend::handle::get(DNSResourceRecord &r)
{
DLOG(L << "Ueber get() was called for a "<<qtype.getName()<<" record" << endl);
bool isMore=false;
while(d_hinterBackend && !(isMore=d_hinterBackend->get(r))) { // this backend out of answers
if(i<parent->backends.size()) {
DLOG(L<<"Backend #"<<i<<" of "<<parent->backends.size()
<<" out of answers, taking next"<<endl);

d_hinterBackend=parent->backends[i++];
d_hinterBackend->lookup(qtype,qname,pkt_p,parent->domain_id);
}
else
break;

DLOG(L<<"Now asking backend #"<<i<<endl);
}

if(!isMore && i==parent->backends.size()) {
DLOG(L<<"UeberBackend reached end of backends"<<endl);
return false;
}

DLOG(L<<"Found an answering backend - will not try another one"<<endl);
i=parent->backends.size(); // don't go on to the next backend
return true;
}

下面介绍module加载的过程。PowerDNS的module实现需要包含三个部分。

  1. Backend本身,它需要继承父类DNSBackend.
1
2
/* FIRST PART */
class RandomBackend : public DNSBackend{}
  1. 工厂类,它用于产生Backend实例。
1
2
/* SECOND PART */
class RandomFactory : public BackendFactory{}
  1. 加载器类和该类的一个static实例。
1
2
3
/* THIRD PART */
class RandomLoader{};
static RandomLoader randomloader;

当进程调用dlopen时,该static loader实例的构造函数被调用,将其注册在UeberBackend中。

可以参考我写的一个简单的PowerDNS模块: remoteipbackend, 用于返回到达它的递归DNS的出口IP地址。