[vox-tech] Re: Changing data with awk

Richard Crawford vox-tech@lists.lugod.org
Mon Jun 7 17:19:52 PDT 2004


Apologies for breaking the thread.  I'm reading various responses on the
LUGOD website, since the e-mails are stuck on my computer at home, so I'm
replying from SM.

Jay wrote, quoting Mark,

>> Yes, "\n" terminates a record.  But Richard (the original poster) said
>> that the field has embedded "carriage return" characters, which is "\r".
>> Since "\r" is not "\n", the codes do work... at least under *NIX.
>>
>> Yes, pressing the "Enter" key produces carriage return code on a standard
>> keyboard (ASCII 13 "\r"), but *NIX translates it to the linefeed code
>> (ASCII 10 "\n"), whereas DOS/Windows translates it to carriage return
>> followed by linefeed "\r\n", and the older MacOS (before 10) doesn't
>> translate it at all.  And by convention "\n" under C (along with various
>> other languages) represents the default line terminator for that platform
>> ("\n" for *NIX, "\r\n" for DOS/Windows, "\r" under older MacOS.)  But
>> unlike DOS/Windows, UNIX lets you turn off the translation using termios.
>> But now we're off topic... -_-v
>>
>> -Mark

>Thanks Mark, I wasn't aware of those subtleties.
>
>But, I'm betting Richard has embedded ASCII 10s in his file, and that
>is why SQLLoader (Oracle's mass data loading tool, which uses ASCII 10s
>as its record separator) is giving him problems.
>
>Richard, wha'chu'got in your file?
>
>Jay

Here's the scoop.

I've got a moderately-sized table in our database (only 35,000 records)
which is part of our messaging system.  One of the fields, "msgMessage",
contains the text of e-mail messages sent from students to instructors,
and vice versa (don't question me about the logic of this approach; I
didn't design the system, I'm just stuck with it -- heh).  Since many of
these messages have carriage returns as line breaks, they're in the data
for this field for this table.
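
To settle whether the embedded breaks are ASCII 13s, ASCII 10s, or both,
the exported file can be read as raw bytes and the three cases counted.
This is only a rough sketch in Python: the function name and the sample
bytes are made up for illustration, and in practice you would pass in the
contents of the real export opened in binary mode.

```python
def count_line_breaks(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs (ASCII 13) and lone LFs (ASCII 10)."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs.
        "lone_cr": data.count(b"\r") - crlf,
        "lone_lf": data.count(b"\n") - crlf,
    }


if __name__ == "__main__":
    # Hypothetical sample; replace with open(path, "rb").read() on a real file.
    sample = b"1^2^Hello\r\nworld\rbye^4\n"
    print(count_line_breaks(sample))
```

Reading in binary mode matters here, because text mode would apply the
platform's newline translation and hide exactly the difference in question.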

So I'm using a Windows program, DBArtisan, to export the table into an
external .DAT file, using the caret, ^, as the field separator.  I'm using
SQL Loader to create the control file which I'll use to load the data from
this .dat file into Oracle.  Unfortunately, while SQL Loader understands
the ^ field separator just fine, it interprets ALL carriage returns as
end-of-records, whether they really are at the end of a record or within
the msgMessage field.  Thus, the carriage returns in Field 6 are
interpreted as end-of-records.  That's something I didn't really think
through before.
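
One possible workaround is to repair the file before loading it: keep
gluing physical lines together until a record contains the expected number
of ^ separators.  The Python below is a sketch under several assumptions,
none of which I have verified against the real data: the total field count
is known (8 is a made-up number), the ^ never appears inside message text,
and at least one field follows the message field.  If the message is the
last field, the separator count is already complete on the first line and
this approach cannot tell where the record ends.  The names rejoin_records,
num_fields and placeholder are hypothetical.

```python
def rejoin_records(lines, num_fields, sep="^", placeholder=" "):
    """Merge physical lines until each record holds num_fields - 1 separators.

    Embedded line breaks are replaced by `placeholder`.
    """
    records = []
    parts = []  # physical lines collected for the record in progress
    for line in lines:
        parts.append(line.rstrip("\r\n"))
        candidate = placeholder.join(parts)
        if candidate.count(sep) >= num_fields - 1:
            records.append(candidate)
            parts = []
    if parts:
        # Incomplete trailing record; keep it so it can be inspected by hand.
        records.append(placeholder.join(parts))
    return records


if __name__ == "__main__":
    # Hypothetical 8-field export with a three-line message in field 6.
    physical_lines = [
        "1^2^3^4^5^Hello\r\n",
        "world\r\n",
        "bye^7^8\r\n",
        "2^a^b^c^d^single line^x^y\r\n",
    ]
    for record in rejoin_records(physical_lines, num_fields=8):
        print(record)
```

Replacing the embedded breaks with a space loses the original message
formatting; a different placeholder string could be substituted and
converted back after the load if the line breaks need to survive.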

You'd think that this would be a common problem, but I have yet to find a
solution on the web or with any sort of tech support.  I wish there were a
way to fix the data in the tables before exporting from SQL Server, but
there doesn't seem to be an easy way to do that.

-- 
Sláinte,
Richard S. Crawford (AIM: Buffalo2K)
http://www.mossroot.com   http://www.stonegoose.com/catseyeview
"You cannot trust your judgement if your imagination is out of focus." 
--Mark Twain




