[vox-tech] Question: mod_dav.1.0.3 + apache.1.3.26 and CR/LF issues w/ MacOS+MSWin

Jeff Newmiller vox-tech@lists.lugod.org
Wed, 3 Jul 2002 14:22:22 -0700 (PDT)


On Wed, 3 Jul 2002, ME wrote:

> > On Tue, 2 Jul 2002, ME wrote:
> > > Web browsers seem to do this. However, when testing "MS Web Folders" and
> > > "Cadaver" I have found they transmit the files with the actual
> > > line termination encoding of the file.
> 
> On Tue, 2 Jul 2002, Jeff Newmiller wrote:
> > So, if the on-the-wire format is supposed to be CRLF, which is native to
> > Windows, and your server is not storing that as LF, then either the
> > transmitted content type is wrong, or the server is not configured
> > or programmed correctly.
> 
> Web browsers work just fine with text file (text/plain, text/html for
> example) when served to standard web browsers. The browsers themselves
> actually make the translation of the content.

Web browsers work just fine with various newline conventions, because that
is the way web browsers are programmed.  That has nothing to do with how
the server is supposed to deliver the data.

> WebDAV clients (Goliath, MS Windows Web Folders, cadaver (for Linux)) do
> not appear to do any translation like the web clients (Netscape, MSIE,
> Lynx) have done and still do.

Netscape does NOT do any translation.  That was the point of my anecdote
quoted below.

IE _does_ do translation, and while it solves problems for users, it does
so by masking the screwups of the server administrators.


> When a WebDAV client grabs a text file, the file is transmitted as stored
> on the server. If the file was created with ^M termination only, then the
> file received has ^M as line breaks. If the file stored on the server has
> ^M^J line termination, then the WebDAV clients (listed above) transmit a
> file with those same breaks. The same file loaded with MSIE/Netscape
> appears with translation as expected for the client OS's line termination
> strings.

A properly configured web server running on a *nix environment, when
confronted with a "text" file on disk that has CRLFs, is supposed to
transmit CRCRLFs, because it is supposed to translate the native newlines
(LF) to the on-the-wire newlines (CRLF).
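
As a rough sketch of that behaviour (this is not taken from Apache or
mod_dav; the function name to_wire is made up for illustration, and it
assumes the server blindly replaces every native LF without looking for a
CR already in front of it):

```python
def to_wire(data: bytes) -> bytes:
    """Naively translate native *nix newlines (LF) to on-the-wire CRLF."""
    # No check for an existing CR: a CRLF on disk becomes CR CR LF.
    return data.replace(b"\n", b"\r\n")


unix_file = b"one\ntwo\n"      # stored with native LF line endings
dos_file = b"one\r\ntwo\r\n"   # stored with CRLF line endings

print(to_wire(unix_file))  # b'one\r\ntwo\r\n'
print(to_wire(dos_file))   # b'one\r\r\ntwo\r\r\n'
```

The second result is the CRCRLF case: the file on disk was not in the
server's native format, so the translation doubles up the CR.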

> The RFC for HTTP/1.1 says the header is supposed to use CR/LF for HTTP,
> but the body conversion to line breaks is left to the client. With the
> HTTP request, the "body" becomes the actual file (for the most part) but
> not the header.

The client is allowed to _expect_ that "text" files transmitted to it have
CRLFs as newline separators.  Any other sequence (LF or CR) _may_ be
misinterpreted by the client.  The fact that most clients can handle
various alternatives is sugar coating that saves lazy administrators'
butts.

"Text" files have type "text/*" in the Content-Type header.  If the server
is serving up a file (in response to a GET) that it has determined to be
some variant of "text" file, it is expected to make sure the line
separators are CRLF at a minimum, and the client is expected to be able
to handle that.

Similarly, if the client is serving up a "text" file, it is responsible for
telling the server so in a Content-Type header, and for normalizing the line
separators.  Likewise, the server is supposed to recognize the Content-Type
header and denormalize the line separators (CRLF to LF) as appropriate.
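
A minimal sketch of that division of labor, under stated assumptions: the
sender knows its own native line separator, only "text/*" types are
touched, and the helper names (is_text, normalize_for_wire,
denormalize_from_wire) are hypothetical rather than anything from a real
server or client:

```python
def is_text(content_type: str) -> bool:
    """True when the Content-Type value names some variant of text/*."""
    # Drop parameters such as "; charset=iso-8859-1" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("text/")


def normalize_for_wire(body: bytes, content_type: str, native: bytes) -> bytes:
    """Sender side: turn native line separators into CRLF for text types."""
    if not is_text(content_type) or native == b"\r\n":
        return body  # binary data, or already in on-the-wire form
    return body.replace(native, b"\r\n")


def denormalize_from_wire(body: bytes, content_type: str,
                          native: bytes = b"\n") -> bytes:
    """Receiver side: turn CRLF back into the native separator for text types."""
    if not is_text(content_type) or native == b"\r\n":
        return body
    return body.replace(b"\r\n", native)
```

For example, a Mac client (native CR) would call
normalize_for_wire(data, "text/plain", b"\r") before sending, and a *nix
server would call denormalize_from_wire(data, "text/plain") before
writing to disk.  Anything not labeled "text/*" passes through untouched,
which is why the Content-Type has to be right at both ends.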

> > This sounds vaguely like the reverse of the Netscape Communicator "MSWin
> > bug". NS has a habit of believing the server's content-type, so
> > downloading a binary file from a *nix web server that has not been told
> > the file is binary will yield an on-the-wire corrupted file.  A *nix
> > Netscape client will fortuitously un-corrupt the file, but an MSWin or Mac
> > NS Communicator yields corrupt downloads because it believes the file is
> > text.  MSIE second-guesses the content-type, and usually downloads the
> > file correctly.
> 
> http://asg.web.cmu.edu/rfc/rfc2616.html
> ...
> "HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all
> protocol elements except the entity-body (see appendix 19.3 for tolerant
> applications). The end-of-line marker within an entity-body is defined by
> its associated media type, as described in section 3.7."
> ...
> 3.7.1:
> ...
> "When in canonical form, media subtypes of the "text" type use CRLF as the
> text line break. HTTP relaxes this requirement and allows the transport of
> text media with plain CR or LF alone representing a line break when it is
> done consistently for an entire entity-body. HTTP applications MUST accept
> CRLF, bare CR, and bare LF as being representative of a line break in text
> media received via HTTP."
> ...
> 
> (This, above, is what you mentioned in the previous e-mail as being a part
> of http. WebDav runs over http.)
> 
> WebDAV clients are not doing this if "line break" is to be considered the
> local OS's line termination string sequence when transmitting text files.
> 
> It later goes on to say that the CRLF sequence for header
> information/options must use CRLF without substitution.
> 
> Ideally, the clients would need to perform the line break translations for
> files of type "text" that are downloaded from a Dav enabled web server.

Yes, they should recognize the Content-Type and translate as appropriate
for their local system.
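
A sketch of what such a tolerant client could do on download, assuming it
accepts CRLF, bare CR, and bare LF as line breaks in text media and
rewrites them to its own convention.  The function name to_local is
hypothetical and none of the clients named here is known to work this
way:

```python
import re

# CRLF is listed first so it is consumed as one break, not as CR then LF.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_local(body: bytes, content_type: str, local: bytes) -> bytes:
    """Rewrite every line break in a text/* body to the local convention."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if not media_type.startswith("text/"):
        return body  # leave binary content alone
    # A function replacement avoids any escape processing of `local`.
    return _LINE_BREAK.sub(lambda match: local, body)


mixed = b"a\r\nb\rc\nd"
print(to_local(mixed, "text/plain", b"\n"))    # b'a\nb\nc\nd'
print(to_local(mixed, "text/plain", b"\r\n"))  # b'a\r\nb\r\nc\r\nd'
```

On a real client the local separator would come from the platform
(os.linesep.encode() in Python) rather than being passed in by hand.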

> Unfortunately, this does not happen in the Dav clients I have tested.
> 
> I have posted this as an issue to the "Microsoft news groups" complaining
> about "MS Web Folders" not doing this translation. I have yet to post this
> to the Cadaver or Goliath developers list, but will get around to it.

I am puzzled... this is a non-issue for that operating environment
(MSWin).  On-the-wire is no different from their on-the-disk format.
The Mac and Unix clients would be the ones to be concerned about.

> For now, a solution exists on my server that modifies text files served
> to DAV users: based on a match against their client name/version, it
> translates the text file "on-the-fly" to use their native OS line
> termination char sequence. It is not a "good solution" at all, but it does
> work to fill in this feature missing in the clients.

I am boggled.  Are you sure it isn't a client-side misconfiguration?  The
client software has to be able to tell the difference between a text file
and a binary file somehow.  On a Mac, this is easy.  On *nix, it may have
to depend on file extensions, but you don't seem to be doing that.

> It seems the "best solution" is to see if this can be added to Dav.

It must be in both the server and the clients, and the software at both
ends has to be configured to distinguish text and binary files.... but
MIME-type magic is pretty widespread... I thought the bad old days of ftp
servers that had to be told the transfer type for each session were behind
us.

> OK. Fine. I am subscribing to a ietf/w3 working group on WebDAV. If the
> RFC for Dav can be modified to be explicit on content modification for
> text files over http, perhaps the clients will add this feature. :-)

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...2k
---------------------------------------------------------------------------