I'm talking about a hypothetical ISP that wants to extract all the hostnames its customers are connecting to. To do this it has to analyze the traffic off a live stream and reconstruct each TCP stream itself. Rebuilding TCP streams on a 100 Gbps switch is pretty hard to do. Something like "sniproxy" only extracts the hostname from traffic that connects to it directly, so the OS hands it an already-assembled stream and it never has to rebuild the TCP stream.
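To illustrate why the proxy case is the easy one, here is a rough sketch of pulling the SNI hostname out of the first bytes read from a connection. This is not sniproxy's actual code; `extract_sni` and `build_client_hello` are names I made up, and it assumes the whole ClientHello arrives in one TLS record in a single read. A passive tap gets no such guarantee, which is where the reassembly cost comes from.

```python
import struct
from typing import Optional


def extract_sni(data: bytes) -> Optional[str]:
    """Return the SNI hostname from one complete TLS ClientHello record, or None."""
    try:
        # TLS record header: type (1), version (2), length (2). 0x16 = handshake.
        if data[0] != 0x16:
            return None
        pos = 5
        # Handshake header: type (1), length (3). 0x01 = ClientHello.
        if data[pos] != 0x01:
            return None
        pos += 4
        pos += 2 + 32                      # client version + random
        pos += 1 + data[pos]               # session id
        (suites_len,) = struct.unpack_from("!H", data, pos)
        pos += 2 + suites_len              # cipher suites
        pos += 1 + data[pos]               # compression methods
        (ext_total,) = struct.unpack_from("!H", data, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:              # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                if data[pos + 2] != 0:     # 0 = host_name
                    return None
                (name_len,) = struct.unpack_from("!H", data, pos + 3)
                name = data[pos + 5:pos + 5 + name_len]
                if len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        # Truncated or malformed input: treat as "no hostname found".
        return None


def build_client_hello(hostname: str) -> bytes:
    """Hypothetical helper: build a minimal ClientHello record for demos."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni_data = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0, len(sni_data)) + sni_data
    body = (
        b"\x03\x03" + bytes(32) + b"\x00"      # version, random, empty session id
        + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
        + b"\x01\x00"                          # one compression method (null)
        + struct.pack("!H", len(ext)) + ext
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake
```

Calling `extract_sni(build_client_hello("example.com"))` gives back the hostname, while a truncated or non-TLS buffer just returns `None`. On a tap you would first have to buffer and order segments per flow before a parser like this could run at all.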
For the reverse DNS stuff, yeah, you can't count on PTR records. The easiest thing is to use a third party like DomainTools (https://www.domaintools.com/), or you can roll your own. The quick and dirty way to do this is to get your hands on regularly updated zone files listing the domain names, do a DNS lookup for each domain name, and store the results in an index keyed by IP address. Assuming you get regular updates to your zone files, the daily load is manageable. From memory, for .com you only need to evaluate about 400K domain names a day.
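A minimal sketch of the roll-your-own approach, assuming the domain names have already been pulled out of the zone file into a plain list. The function names are mine, the index is just an in-memory dict, and `resolve` is pluggable so you can swap in a faster or rate-limited resolver.

```python
import socket
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set


def resolve_ipv4(name: str) -> List[str]:
    """Look up the IPv4 addresses for a name using the system resolver."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except (OSError, UnicodeError):
        # Lookup failed or the name is not valid: nothing to index.
        return []


def build_reverse_index(
    names: Iterable[str],
    resolve: Callable[[str], List[str]] = resolve_ipv4,
) -> Dict[str, Set[str]]:
    """Map each IP address to the set of domain names that resolve to it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for raw in names:
        # Normalize: drop whitespace, the trailing root dot, and case.
        name = raw.strip().rstrip(".").lower()
        if not name:
            continue
        for ip in resolve(name):
            index[ip].add(name)
    return dict(index)
```

Something like `build_reverse_index(open("names.txt"))` would work for a small list (the filename is just a placeholder). At hundreds of thousands of names a day you would want concurrent lookups and a persistent store instead of a dict, but the shape of the job is the same.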