[vox-tech] Omsoft transparent HTTP proxy

Micah Cowan vox-tech@lists.lugod.org
Mon, 18 Nov 2002 15:09:38 -0800


On Monday, November 18, 2002, at 02:17  PM, Ken Bloom wrote:

>> ---ORIGINAL MESSAGE---
>> Date: Sun, 17 Nov 2002 00:43:20 -0800
>> From: Samuel Merritt <spam@andcheese.org>
>> To: vox-tech@lists.lugod.org
>> Subject: Re: [vox-tech] Omsoft transparent HTTP proxy
>> Reply-To: vox-tech@lists.lugod.org
>>
>>
>> On Sat, Nov 16, 2002 at 10:48:22PM -0800, Ken Bloom wrote:
>>> Who's right, and who's wrong? Is the web browser wrong for not expanding
>>> the Host header? Or is the proxy wrong for relying on the Host header to
>>> resolve IP addresses instead of relying on the IP address that the
>>> actual packets are destined for? Or are they both wrong (this could very
>>> well be the case)?
>>
>> The web browser is right. I think what happens is something like this:
>>
>> 1) The browser gets a request from the user for http://my/.
>> 2) The browser issues a gethostbyname(my) call.
>> 3) The DNS resolver checks its search order, and finds "ucdavis.edu", so
>> sends a query to the name server for "my.ucdavis.edu".
>> 4) The DNS resolver gets an IP back from the name server.
>> 5) gethostbyname(my) returns that IP.
>>
>> Notice that the browser doesn't have any idea that "my" is in domain
>> ucdavis.edu; the search order information is in /etc/resolv.conf, and
>> only the DNS libraries make use of that.
>>
>> The proxy is the one screwing things up.
>>
>> What the proxy should do:
>> 1) Get a request originally destined for IP A.B.C.D, with "Host: my" in
>> the header.
>> 2) Connect to A.B.C.D, passing the Host header (and any other headers)
>> along.
>> 3) If the content is in cache, return it from cache rather than
>> downloading it again.
>>
>> What it does:
>> 1) Get a request originally destined for IP A.B.C.D, with "Host: my" in
>> the header.
>> 2) Look up "my" in the DNS, and fail.
>> 3) Ignore the fact that the request was headed for A.B.C.D, and give an
>> error message.
>>
>> I recommend bugging Omsoft about this; their proxy is clearly broken.
>>
>
> It would seem to me however, that the web browser's behavior is also
> broken, because if my.ucdavis.edu were really virtual hosted, even if
> Omsoft didn't filter HTTP traffic through a transparent proxy,
> then the server hosting my.ucdavis.edu would receive a header stating
> "Host: my", and be just as confused as Omsoft's proxy.

That's true, but it's not the web browser's fault. As Sam points out,
how is the web browser supposed to know that "my" is really
my.ucdavis.edu? That's what you get with virtual hosts: the user should
just hunker down and type my.ucdavis.edu if he expects that name to
appear in the Host header field.
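
For what it's worth, here is a rough Python sketch of both halves of
this. It is not what any real browser or Omsoft's proxy actually runs;
the function names are made up for illustration, and 192.0.2.10 is just
a documentation address standing in for A.B.C.D. The browser side copies
the host part of the URL into the Host header exactly as typed, and the
proxy side (as Sam describes it) keeps the original destination IP and
passes the Host header along instead of looking it up in the DNS.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Build the request a browser would send for `url`, as typed.

    The host part of the URL is copied verbatim into the Host header.
    Any search-order expansion happens later, inside the resolver, and
    the result is never reported back to this code.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


def host_header(request):
    """Return the value of the Host header in a raw request, or None."""
    for line in request.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip()
    return None


def upstream_for(original_dst_ip, request):
    """Hypothetical transparent-proxy decision.

    Connect to the IP the packets were originally headed for, and pass
    the Host header through untouched rather than resolving it again.
    """
    return original_dst_ip, host_header(request)


if __name__ == "__main__":
    short = build_request("http://my/")
    print(host_header(short))
    print(upstream_for("192.0.2.10", short))
    print(host_header(build_request("http://my.ucdavis.edu/")))
```

A virtual-hosted server sees only what host_header() returns, so typing
the short name leaves it with "my" to match against, proxy or no proxy.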

-Micah