After reading about the newest glibc vulnerability, I decided to see how much effort it would take to rewrite parts of glibc in a safe language. Rust is well suited for this, as it should prevent the kind of buffer overflow that caused this problem. So where to start? The first order of business is to get a copy of the current implementation of getaddrinfo from glibc:
git clone git://sourceware.org/git/glibc.git
You will find the definition of getaddrinfo in
It starts at line 2324 and runs for about 300 lines. All in all, not too bad.
getaddrinfo starts by sanitizing the name and service inputs. It treats a NULL pointer and a string consisting of "*" as the same, so here it replaces "*" with NULL. ( Fun aside: name and service are both const char*, so I find it funny that they are reassigned inside the function. I do understand that from the caller's perspective they don't change, but it is still bad form. )
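As a rough idea of how that first sanitizing step might look on the Rust side, here is a minimal sketch. It assumes the inputs have already been converted from C strings into Option<&str>, and normalize_wildcard is a name I made up for illustration, not something from glibc.

```rust
/// Treat a missing value and the literal "*" as the same thing.
/// `normalize_wildcard` is a hypothetical helper name, not part of glibc.
fn normalize_wildcard(input: Option<&str>) -> Option<&str> {
    match input {
        // "*" means "no constraint", which is what None already expresses.
        Some("*") => None,
        // Anything else, including None, passes through unchanged.
        other => other,
    }
}
```

Because the function returns a new value instead of reassigning its parameter, the "bad form" complaint above goes away for free: the caller's name and service are never touched.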
The next step is to check the flags against the list of allowed flags. First it checks whether any bit is set that shouldn't be, then it checks that the combination of flags makes sense, including
Next we start dealing with the service. You need to check whether service is a number. If it's not, that's cool, unless you set AI_NUMERICSERV, in which case we need to error out. After this we really get into the heart of the function.
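The flag and service checks described above could be sketched in Rust roughly like this. Treat it as a sketch under assumptions: the flag values are the ones I believe glibc's netdb.h uses on Linux, the "AI_CANONNAME needs a name" rule is just one example of a combination check, and the function names and error enum are invented for illustration. The number parsing is also simpler than what the C code does.

```rust
// Flag values as I believe glibc's <netdb.h> defines them on Linux
// (an assumption of this sketch; check your headers).
const AI_PASSIVE: i32 = 0x0001;
const AI_CANONNAME: i32 = 0x0002;
const AI_NUMERICHOST: i32 = 0x0004;
const AI_V4MAPPED: i32 = 0x0008;
const AI_ALL: i32 = 0x0010;
const AI_ADDRCONFIG: i32 = 0x0020;
const AI_IDN: i32 = 0x0040;
const AI_CANONIDN: i32 = 0x0080;
const AI_NUMERICSERV: i32 = 0x0400;

const ALLOWED: i32 = AI_PASSIVE | AI_CANONNAME | AI_NUMERICHOST | AI_V4MAPPED
    | AI_ALL | AI_ADDRCONFIG | AI_IDN | AI_CANONIDN | AI_NUMERICSERV;

// Hypothetical error type; the real function returns EAI_* codes instead.
#[derive(Debug, PartialEq)]
enum CheckError {
    UnknownBits(i32),
    CanonNameWithoutName,
    ServiceNotNumeric,
}

#[derive(Debug, PartialEq)]
enum Service<'a> {
    Port(u16),
    Name(&'a str),
}

/// Reject unknown bits first, then one example of a nonsensical combination.
fn check_flags(flags: i32, name: Option<&str>) -> Result<(), CheckError> {
    let unknown = flags & !ALLOWED;
    if unknown != 0 {
        return Err(CheckError::UnknownBits(unknown));
    }
    if flags & AI_CANONNAME != 0 && name.is_none() {
        return Err(CheckError::CanonNameWithoutName);
    }
    Ok(())
}

/// A numeric service becomes a port; a non-numeric one is only
/// acceptable when AI_NUMERICSERV is not set.
fn classify_service(service: &str, flags: i32) -> Result<Service<'_>, CheckError> {
    match service.parse::<u16>() {
        Ok(port) => Ok(Service::Port(port)),
        Err(_) if flags & AI_NUMERICSERV != 0 => Err(CheckError::ServiceNotNumeric),
        Err(_) => Ok(Service::Name(service)),
    }
}
```

Returning a Result with a small enum keeps the error cases explicit; mapping them back to the EAI_* integers would only need to happen at the C boundary.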
Once we have passed the sanity checks, control is passed into gaih_inet, the function that powers getaddrinfo. It also happens to have a ton of gotos and very unhelpfully named variables. After this function does some formatting of data structures, it looks to call out to __getservbyname_r ( inside a nested helper function, gaih_inet_serv ) and __gethostbyname2_r with those data structures and then process the results. Under some circumstances it will also engage NSS to do some lookups using the NSS versions of gethostbyname. Then it looks like the function tries to connect to the services on the given port ( discovered from __getservbyname_r ) and returns a list of connections.
After the actual data is gathered, the next step is sorting it. There appear to be a few places where the data is sorted. First, a list of local IPv6 interfaces is sorted; this is used later to determine if we connected via a "temporary or disabled" interface. The next time sort appears, we are sorting the results according to RFC 3484. The final step is to set the results, which is done with a double assignment operation:
q = p = results[order[0]].dest_addr;
q = q->ai_next = results[order[i]].dest_addr;
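Rust has no chained assignment, so the same "queue the results up in sorted order" step would have to look a bit different. Here is a minimal sketch of one way to do it, assuming the sort has already produced an order slice of indices. AddrNode, link_in_order, and addresses are names I invented for illustration, and a String stands in for the real address data.

```rust
// Hypothetical stand-in for one entry of the addrinfo linked list.
#[derive(Debug, PartialEq)]
struct AddrNode {
    addr: String,
    next: Option<Box<AddrNode>>,
}

/// Build a singly linked list whose nodes appear in the sequence given
/// by `order`. Panics if an index in `order` is out of bounds.
fn link_in_order(addrs: &[&str], order: &[usize]) -> Option<Box<AddrNode>> {
    let mut head: Option<Box<AddrNode>> = None;
    // Walk the order backwards so each new node can own the tail built so far.
    for &idx in order.iter().rev() {
        head = Some(Box::new(AddrNode {
            addr: addrs[idx].to_string(),
            next: head,
        }));
    }
    head
}

/// Collect the addresses in list order, mostly useful for checking the result.
fn addresses(list: &Option<Box<AddrNode>>) -> Vec<&str> {
    let mut out = Vec::new();
    let mut cur = list.as_deref();
    while let Some(node) = cur {
        out.push(node.addr.as_str());
        cur = node.next.as_deref();
    }
    out
}
```

Building from the back avoids keeping a mutable pointer to the tail, which is the part of the C version that the borrow checker would complain about.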
To reimplement this, I think I will break it down into a few separate components ( much like glibc ). Look for follow-up posts for each of these components.
__check_pf ( i.e. getifaddrs wrapper )
At the end I want to benchmark against the libc version to see what the slowdown is, and I want to classify which parts of glibc my version still uses, to see where else things need to be implemented to truly have a glibc replacement for the function.