POSIX DNS APIs considered harmful.
I'm quite upset with the POSIX APIs. I'm trying to write a simple piece of software, to be plugged into a mail chain, that does a lot of DNS queries, mostly to RBLs.
I've met two kinds of problems:
- the POSIX API is synchronous, and there is absolutely no standard API for non-blocking requests. There are of course third-party libraries:
  - adns, but it's very bloated and does not work with IPv6,
  - c-ares, but it's quite bloated, does not work with IPv6, and needs select (so it cannot be used with epoll, for example),
  - libdnsresolv (hahahaha, that's a horrible patch over the BSD resolving API, yuck),
  - udns, which is the best of the four, but still, the code does not feel very solid (it does not check recvfrom/sendto errors, for example).
- there is no way to bypass the system resolver, which is good for usual applications, but very bad when it comes to RBLs.
The second point is in fact the worst. On the mail queues I administer, there is very often a BIND running as a caching DNS resolver. Very well, except that it completely sucks with RBLs. RBLs generate a lot of queries that do resolve but that you will almost never ask again before their TTL expires, and the other kind of queries, the ones that get NOT FOUND answers, are usually not cached.
Too bad: the most useful answers are the NOT FOUND ones, and the cached answers are just there to make the local cache use huge amounts of memory for nothing. So RBLs just end up poisoning your system's efficiency. That's quite ridiculous.
I'd really love to have a decent async resolver API, and a way to tell my DNS cache (here BIND) that a domain serves an RBL, and that I want specific caching behaviour for it (for the sake of both the RBL and my system). I fear that it won't be possible, and that I will end up writing some API specifically designed to craft DNS queries to the RBL servers (once again, there is absolutely no reason to use forwarders for that, as it will end up screwing your forwarder's cache for almost no gain, and won't save you from a lot of useless queries with no answers at all).
The good point is that the kind of queries you need to send to RBL servers is completely trivial:
- only A and maybe one day AAAA queries;
- a query will never be larger than 512 octets, so only UDP is needed;
- you only have to look up the NS that serve your RBL (using the local resolver API here, to benefit from its caches) and query them directly; no recursion is needed.
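To give an idea of how little is involved, here is a minimal sketch in Python of crafting such a query by hand. The function names and the zone `rbl.example.org` are made up for illustration; the packet layout follows the usual DNS wire format (12-byte header, then one question), with the recursion-desired flag left unset since we talk to the authoritative servers directly.

```python
import ipaddress
import struct


def rbl_query_name(ip: str, zone: str) -> str:
    """Build the RBL lookup name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def build_a_query(name: str, query_id: int) -> bytes:
    """Encode a single-question A/IN query with recursion not requested."""
    # Header: ID, flags (0 = standard query, RD unset), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b""
    for label in name.encode("ascii").split(b"."):
        if not 0 < len(label) <= 63:
            raise ValueError("invalid DNS label length")
        qname += bytes([len(label)]) + label
    qname += b"\x00"  # root label terminates the name
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


name = rbl_query_name("192.0.2.1", "rbl.example.org")
packet = build_a_query(name, 0x1234)
```

The resulting bytes can be handed to `sendto()` on a non-blocking UDP socket registered with whatever event loop you like, which is exactly the part the POSIX API does not let you do.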
Then implementing some efficient (RBL-aware) caching on top of that will be rocking fast, and way more efficient than BIND caching, without the poisoning effects. A big gain, I hope. What surprises me the most is that I've not found anybody talking about those side effects of RBLs on nameservers: almost no discussion at all. Very surprising for a protocol that is so vital for the internet as a general rule.
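As a rough sketch of what I mean by RBL-aware caching (Python again; the class name, the default sizes and the fixed negative TTL are my own assumptions, not anything an existing resolver does): keep the NOT FOUND answers, which are the useful ones, for a lifetime you choose, and bound the whole cache so that listed entries can never eat all the memory.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class RblCache:
    """Bounded LRU cache that stores NOT FOUND answers as well as hits."""

    def __init__(self, max_entries: int = 100_000, negative_ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._entries = OrderedDict()  # name -> (listed, expiry time)
        self._max_entries = max_entries
        self._negative_ttl = negative_ttl
        self._clock = clock

    def put(self, name: str, listed: bool, ttl: float = 0.0) -> None:
        # Hits keep the TTL from the answer; misses get our own TTL.
        lifetime = ttl if listed else self._negative_ttl
        self._entries[name] = (listed, self._clock() + lifetime)
        self._entries.move_to_end(name)
        # Evict least recently used entries once the bound is exceeded.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def get(self, name: str) -> Optional[bool]:
        """Return True/False for a cached answer, None if unknown or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        listed, expires = entry
        if self._clock() >= expires:
            del self._entries[name]
            return None
        self._entries.move_to_end(name)
        return listed
```

A `get()` returning `None` means "go and ask the RBL", while `False` means "we asked recently and the address is not listed", which is the answer a mail server sees most of the time.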
