[vox-tech] Changing data with awk
Mark K. Kim
vox-tech@lists.lugod.org
Sat Jun 5 06:11:09 PDT 2004
Yes, "\n" terminates a record. But Richard (the original poster) said
that the field has embedded "carriage return" characters, which are "\r".
Since "\r" is not "\n", the code does work... at least under *NIX.
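
A quick way to check this, as a rough sketch only: the sample record below
is made up (Richard's real test.dat isn't shown in this thread), and it uses
the POSIX gsub() function rather than gawk's gensub() so that it should run
under any awk.

```shell
# Hypothetical sample data: two "^"-separated records, the first with two
# carriage returns embedded in field 6. This is not Richard's real file.
sample() {
    printf 'a^b^c^d^e^line1\rline2\rline3\nf^g^h^i^j^plain\n'
}

# awk still sees exactly two records, because only "\n" ends a record.
records=$(sample | awk 'END { print NR }')

# Replace every "\r" in field 6. Setting OFS keeps "^" as the separator
# when awk rebuilds the record after $6 is modified.
fixed=$(sample | awk -F^ 'BEGIN { OFS = "^" } { gsub(/\r/, "<cr>", $6); print }')

echo "records: $records"
echo "$fixed"
```

If the embedded characters were "\n" instead, awk would split the record
early and this approach would not be enough.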
Yes, pressing the "Enter" key produces the carriage return code on a standard
keyboard (ASCII 13, "\r"), but *NIX translates it to the linefeed code
(ASCII 10, "\n"), whereas DOS/Windows translates it to a carriage return
followed by a linefeed ("\r\n"), and the older MacOS (before version 10)
doesn't translate it at all. And by convention "\n" under C (along with
various other languages) represents the default line terminator for that
platform ("\n" for *NIX, "\r\n" for DOS/Windows, "\r" under older MacOS).
But unlike DOS/Windows, *NIX lets you turn off the translation using termios.
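
To make the three line terminators concrete, here is a small sketch. The
function and variable names are made up for illustration; it only looks at
the bytes in a file or pipe, not at the terminal translation itself.

```shell
# The same two-line text with each platform's line terminator.
unix_text() { printf 'one\ntwo\n'; }
dos_text()  { printf 'one\r\ntwo\r\n'; }
mac_text()  { printf 'one\rtwo\r'; }

# od -c prints the raw bytes, so "\r" and "\n" become visible.
dos_text | od -c

# Count records the way awk does by default: only "\n" ends a record.
unix_lines=$(unix_text | awk 'END { print NR }')
dos_lines=$(dos_text | awk 'END { print NR }')
mac_lines=$(mac_text | awk 'END { print NR }')

# Deleting every "\r" turns the DOS/Windows form into the *NIX form.
dos_bytes=$(dos_text | wc -c | tr -d ' ')
converted_bytes=$(dos_text | tr -d '\r' | wc -c | tr -d ' ')

echo "unix=$unix_lines dos=$dos_lines mac=$mac_lines"
echo "dos bytes=$dos_bytes, after tr -d: $converted_bytes"
```

The old MacOS text comes out as a single record because it contains no "\n"
at all, which is the same reason a "\r" inside a field doesn't split it.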
But now we're off topic... -_-v
-Mark
On Fri, 4 Jun 2004 me@heyjay.com wrote:
> Do any of these work? It seems like all of them will break because
> awk and perl are going to read a record at a time, where the record
> terminator is a "\n", reproducing Richard's original problem.
>
> I think in perl you'd have to slurp the whole file (undef $/), then do
> a regex in a loop finding X number of "^"s followed by a "\n",
> memorizing the fields, then substitute the 6th field's "\n"s with
> null, then join the fields back together.
>
> Jay
> ----- Original Message -----
> From: "Mark K. Kim" <lugod@cbreak.org>
> To: <vox-tech@lists.lugod.org>
> Sent: Friday, June 04, 2004 6:24 PM
> Subject: Re: [vox-tech] Changing data with awk
>
>
> > Field separator is not space. BTW, found a better way:
> >
> > awk -F^ '{OFS="^"; $6=gensub(/\r/,"<cr>","g",$6); print}' test.dat
> >
> > ...mark
> >
> >
> > On Fri, 4 Jun 2004, Leo Rainer wrote:
> >
> > > In awk you can directly modify individual fields so this should work:
> > >
> > > awk -F^ '{$6=gensub(/\r/,"<cr>","g",$6); print}' test.dat
> > >
> > > ...leo
> > >
> > > At 03:36 PM 6/4/04, Mark K. Kim wrote:
> > > >Unfortunately that wouldn't work since Richard wants to modify the column
> > > >in the file, not strip it out at the same time he modifies it... Unless
> > > >you know of some way to re-insert the modified column back into the file
> > > >(I don't.) Good try, though.
> > > >
> > > >I'm not an awk expert but I'd guess you could do something like:
> > > >
> > > > awk -F^ '{$6=gensub(/\r/,"<cr>","g",$6);
> > > > printf("%s^%s^%s^%s^%s^%s\n",$1,$2,$3,$4,$5,$6)}' test.dat
> > > >
> > > >I'm sure someone's got a better idea than putting all those %s's...
> > > >
> > > >-Mark
> > > >
> > > >PS: Then there's PERL... =P
> > > >
> > > >
> > > >On Fri, 4 Jun 2004, Dylan Beaudette wrote:
> > > >
> > > > > > I have a large flat file generated by SQL Loader that I'd like to mess
> > > > > > around with; specifically, I'd like to replace all of the carriage returns
> > > > > > in one field with some other character, since they're messing up my data
> > > > > > load.
> > > > > >
> > > > > > I figured I'd use awk, since it's a pretty powerful little tool for
> > > > > > getting right to the data. If I use:
> > > > > >
> > > > > > $ awk -F^ '{print $6}' test.dat
> > > > > >
> > > > > > I get the field that I want. But how do I change the characters in that
> > > > > > field and replace them in test.dat?
> > > > >
> > > > >
> > > > > i recently had a similar problem: trying to convert
> > > > > this:
> > > > > 1<CR>
> > > > > 2<CR>
> > > > > 3<CR>
> > > > > ...
> > > > >
> > > > > into this: 1, 2, 3...
> > > > >
> > > > > here is how i did it:
> > > > >
> > > > > > append a comma+space to the end of each line with sed
> > > > > > then remove each line break ("\n") using tr:
> > > > > >
> > > > > > sed -e 's/$/, /g' input_file | tr -d "\n" > output_file
> > > > >
> > > > > so something like this might do the trick:
> > > > >
> > > > > > awk -F^ '{print $6}' test.dat | sed -e 's/$/, /g' | tr -d "\n" > output_file
> > > > > >
> > > > > >
> > > > > > .. you would be left with one column of data that would have to be
> > > > > > re-inserted into the DB, or added back to the original file.
> > > > >
> > > > > the command 'paste' might be helpful for appending the data to the
> > > > > original...
> > > > >
> > > > >
> > > > > good luck!
> > > > >
> > > > > Dylan
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > vox-tech mailing list
> > > > > vox-tech@lists.lugod.org
> > > > > http://lists.lugod.org/mailman/listinfo/vox-tech
> > > > >
> > > >
> > > >--
> > > >Mark K. Kim
> > > >AIM: markus kimius
> > > >Homepage: http://www.cbreak.org/
> > > >Xanga: http://www.xanga.com/vindaci
> > > >Friendster: http://www.friendster.com/user.jsp?id=13046
> > > >PGP key fingerprint: 7324 BACA 53AD E504 A76E 5167 6822 94F0 F298 5DCE
> > > >PGP key available on the homepage