[vox-tech] awk and multi-line records

Dylan Beaudette dylan.beaudette at gmail.com
Fri Aug 18 14:13:59 PDT 2006


so it looks like my simple approach to dealing with some files that normally 
contain single-line records breaks when they use multi-line records some of 
the time...

the nature of the data is something like this:

fixed number of columns, with '|' delimiter:
 col 1| col 2 | col 3 | ... col_n 
 col 1| col 2 | col 3 | ... col_n 
 col 1| col 2 | col 3 | ... col_n 
 col 1| col 2 | col 3 | ... col_n 

some of the time one of these columns will contain multi-line text:

col 1| col 2 | col 3 |
blah blah blah ... 
blah blah blah ...
blah blah blah ...
blah blah blah ...
blah blah blah ... | ... col_n 

(!)

I was using a simple awk script to add an extra column to each record as 
follows:

awk -v areasymbol="$areasymbol" '{ gsub(/"/, ""); print areasymbol "|" $0 }' \
    input-table > output-table
...
newcol | col 1| col 2 | col 3 | ... col_n 
newcol | col 1| col 2 | col 3 | ... col_n 
...
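
To make the breakage concrete, here is a small reproduction with made-up data 
(the three-column layout and the CA001 value are only for illustration, not 
taken from the real tables):

```shell
# Made-up sample: the second record spans two physical lines.
sample='a|b|c
d|line one
line two|f'

# The per-line approach: prefix every physical line with the new column.
naive=$(printf '%s\n' "$sample" |
    awk -v areasymbol="CA001" '{ gsub(/"/, ""); print areasymbol "|" $0 }')

printf '%s\n' "$naive"
# The continuation line "line two|f" also gets a CA001| prefix,
# so the second record no longer has a consistent number of fields.
```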

the question is then: 

is there any simple way to "add" a column to this kind of flat text file using 
awk when there is:

1. consistent field delimiters
2. consistent number of fields
3. inconsistent record delimiters
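
One direction that might work, as a rough sketch only: since the number of 
fields is fixed, count the '|' characters on each physical line and keep 
gluing lines together until a full record has been seen. The names add_column 
and nfields are made up for this example, and it assumes that '|' never 
appears inside the text of a field.

```shell
# Hypothetical helper: prefix every logical record with a new column.
# Assumes '|' never occurs inside field text, so every complete
# record contains exactly nfields-1 delimiters.
#   $1 = value for the new column, $2 = number of fields per record
add_column() {
    awk -v areasymbol="$1" -v nfields="$2" '
    {
        gsub(/"/, "")                   # strip double quotes, as before
        tmp = $0                        # count on a copy so $0 is untouched
        seen += gsub(/[|]/, "", tmp)    # gsub returns the number of "|" found
        # glue physical lines together, keeping the embedded newlines
        rec = (inrec ? rec "\n" $0 : $0)
        inrec = 1
        if (seen >= nfields - 1) {      # all delimiters seen: record complete
            print areasymbol "|" rec
            rec = ""; seen = 0; inrec = 0
        }
    }
    END { if (inrec) print areasymbol "|" rec }   # flush a trailing partial record
    '
}

# Example call (hypothetical values):
#   add_column "$areasymbol" 3 < input-table > output-table
```

With this, only the first physical line of each record gets the new column. 
The embedded newlines are passed through unchanged, so for COPY in text format 
they would still need to be escaped as \n, which this sketch does not do.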

thoughts? 

PS: this is being used to prep some data for loading into PostgreSQL with the 
COPY command.

Cheers,





-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341

