[python] How can I do DNS lookups in Python, including referring to /etc/hosts?

dnspython will do my DNS lookups very nicely, but it entirely ignores the contents of /etc/hosts.

Is there a Python library call that will do the right thing, i.e. check /etc/hosts first and only fall back to DNS lookups otherwise?


The answer is


The answer above was meant for Python 2. If you're using Python 3, here is the code.

>>> import socket
>>> print(socket.gethostbyname('google.com'))
8.8.8.8
>>>
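
If the name might not resolve at all, gethostbyname raises socket.gaierror. A minimal sketch of a wrapper that returns None instead (the helper name resolve_or_none is hypothetical, not part of the socket module):

```python
import socket

def resolve_or_none(hostname):
    """Return one IPv4 address for hostname via the system resolver, or None."""
    try:
        # gethostbyname goes through the operating system's resolver,
        # so entries in /etc/hosts are honoured where the OS is set up that way
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        # raised when the name cannot be resolved
        return None
```

Note that gethostbyname only handles IPv4 and returns a single address; getaddrinfo is the more general call.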

It sounds like you don't want to resolve DNS yourself (this might be the wrong nomenclature). dnspython appears to be a standalone DNS client, so it understandably ignores your operating system's configuration: it bypasses the operating system's resolver utilities.

We can look at a shell utility named getent to understand how a Debian 11-like operating system resolves names for programs. This is likely the standard for all *nix-like systems that use a socket implementation.

See the "hosts" section of man getent, which mentions the use of getaddrinfo; that function is documented in man getaddrinfo.

To use it in Python, we have to extract some information from the data structures it returns:

import socket

def get_ipv4_by_hostname(hostname):
    # see `man getent`, section "hosts"
    # see `man getaddrinfo`

    return list(
        i        # (family, type, proto, canonname, sockaddr) tuple
            [4]  # sockaddr, which is (address, port) for IPv4
            [0]  # address
        for i in
        socket.getaddrinfo(
            hostname,
            0,  # port, required
            family=socket.AF_INET,  # ipv4 only

            # request a single socket type so each address appears once
            type=socket.SOCK_STREAM,
        )
    )

print(get_ipv4_by_hostname('localhost'))
print(get_ipv4_by_hostname('google.com'))


The normal name resolution in Python works fine. Why do you need dnspython for that? Just use socket's getaddrinfo, which follows the rules configured for your operating system (on Debian, it follows /etc/nsswitch.conf):

>>> import socket
>>> print(socket.getaddrinfo('google.com', 80))
[(10, 1, 6, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::93', 80, 0, 0)), (2, 1, 6, '', ('209.85.229.104', 80)), (2, 2, 17, '', ('209.85.229.104', 80)), (2, 3, 0, '', ('209.85.229.104', 80)), (2, 1, 6, '', ('209.85.229.99', 80)), (2, 2, 17, '', ('209.85.229.99', 80)), (2, 3, 0, '', ('209.85.229.99', 80)), (2, 1, 6, '', ('209.85.229.147', 80)), (2, 2, 17, '', ('209.85.229.147', 80)), (2, 3, 0, '', ('209.85.229.147', 80))]
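
Each tuple in a getaddrinfo result has the shape (family, type, proto, canonname, sockaddr), and the same address is repeated once per socket type. As a rough sketch of how to condense that, the hypothetical helper below (addresses_by_family is an illustrative name, not a library function) groups the unique addresses by address family:

```python
import socket

def addresses_by_family(hostname, port=0):
    """Group the unique addresses returned by getaddrinfo by address family."""
    grouped = {}
    for family, socktype, proto, canonname, sockaddr in socket.getaddrinfo(hostname, port):
        address = sockaddr[0]  # sockaddr starts with the address for both IPv4 and IPv6
        family_name = socket.AddressFamily(family).name  # e.g. 'AF_INET' or 'AF_INET6'
        found = grouped.setdefault(family_name, [])
        if address not in found:  # skip repeats for the other socket types
            found.append(address)
    return grouped
```

Calling addresses_by_family('google.com', 80) would then give one list of addresses per family instead of one tuple per family and socket type.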

This code works well for returning all of the IP addresses that might belong to a particular hostname. Since many systems are now hosted behind providers such as AWS or Akamai, a single name may resolve to several IP addresses. The lambda was "borrowed" from @Peter Silva.

import socket

def get_ips_by_dns_lookup(target, port=None):
    '''
        Look up the passed target (and optional port) through the system
        resolver and return the IPs that it finds to the caller.

        :param target:  the hostname that you'd like to get the IP address(es) for
        :type target:   string
        :param port:    which port do you want to do the lookup against?
        :type port:     integer
        :returns ips:   all of the discovered IPs for the target
        :rtype ips:     list of strings

    '''
    if not port:
        port = 443

    # a trailing dot marks the name as fully qualified; strip any existing
    # dot first so that 'google.com.' does not become 'google.com..'
    fqdn = '{}.'.format(target.rstrip('.'))
    return list(map(lambda x: x[4][0], socket.getaddrinfo(fqdn, port, type=socket.SOCK_STREAM)))

ips = get_ips_by_dns_lookup(target='google.com')

import socket

list(map(lambda x: x[4][0], socket.getaddrinfo(
    'www.example.com.', 22, type=socket.SOCK_STREAM)))

gives you a list of the addresses for www.example.com (both IPv4 and IPv6).
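
Because that list mixes IPv4 and IPv6 addresses, it can be useful to separate them afterwards. Here is a small sketch using the standard ipaddress module; the function name split_by_version is made up for this example:

```python
import ipaddress

def split_by_version(addresses):
    """Split address strings into (ipv4_list, ipv6_list), dropping duplicates."""
    ipv4, ipv6 = [], []
    for text in addresses:
        # link-local IPv6 results can carry a zone suffix such as '%eth0';
        # drop it so that older Python versions can parse the address
        ip = ipaddress.ip_address(text.split('%', 1)[0])
        bucket = ipv4 if ip.version == 4 else ipv6
        if str(ip) not in bucket:
            bucket.append(str(ip))
    return ipv4, ipv6
```

The list produced by the getaddrinfo expression can be passed straight in, for example split_by_version(['127.0.0.1', '::1']).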


I found this way to expand a round-robin (RR) DNS hostname, one that resolves to a list of IPs, into the list of member hostnames:

#!/usr/bin/env python3

from socket import getaddrinfo
from dns import reversename, resolver  # third-party package: dnspython

def expand_dnsname(dnsname):
    namelist = []
    # expand hostname into a set of unique ip addresses
    iplist = set()
    for answer in getaddrinfo(dnsname, 80):
        iplist.add(str(answer[4][0]))
    # run through the list of IP addresses to get hostnames
    for ipaddr in sorted(iplist):
        rev_name = reversename.from_address(ipaddr)
        # run through all the hostnames returned, ignoring the dnsname
        # (resolver.resolve is the dnspython 2.x name; 1.x calls it resolver.query)
        for answer in resolver.resolve(rev_name, "PTR"):
            name = str(answer)
            if name != dnsname:
                # add it to the list of answers
                namelist.append(name)
                break
    # if no other choice, return the dnsname
    if len(namelist) == 0:
        namelist.append(dnsname)
    # return the sorted namelist
    return sorted(namelist)

namelist = expand_dnsname('google.com.')
for name in namelist:
    print(name)

Which, when I run it, lists a few 1e100.net hostnames: