[vox-tech] Perl - reading fixed width formats

Wes Hardaker wjhns156 at hardakers.net
Thu Aug 13 14:13:58 PDT 2009


>>>>> On Thu, 13 Aug 2009 10:50:58 -0400, Peter Jay Salzman <p at dirac.org> said:

PJS> field 1: line 1, chars 1-4
PJS> field 2: line 1, chars 5-6
PJS> field 3: line 1, chars 7-11
PJS> field 4: line 1, chars 12 to EOL
PJS> field 5: line 2, chars 1-30
PJS> field 6: line 3, chars 1-10
PJS> field 7: line 4, chars 1-2

Are those all a fixed number of columns?  I.e., will line 2 always have
exactly 30 characters, padded if the data is shorter than 30?

If so, then unpack() is probably your best bet for speed, I'd think.  You
can read() in chunks of data based on the exact record length and then
use unpack() to split apart the chunks.
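
A minimal sketch of the read()/unpack() idea, under assumptions that are
not in the original post: field 4 is padded to 20 characters, every line
ends in a single "\n", and the names TEMPLATE, RECORD_LEN and
parse_records are made up for illustration.  Adjust the widths to the
real format.

```perl
use strict;
use warnings;

# ASSUMED layout: field 4 padded to 20 chars, each line ends in "\n".
# 'A<n>' takes n bytes and strips trailing padding; 'x' skips one byte
# (here, the newline at the end of each of the four lines).
my $TEMPLATE   = 'A4 A2 A5 A20 x A30 x A10 x A2 x';
my $RECORD_LEN = 4 + 2 + 5 + 20 + 1 + 30 + 1 + 10 + 1 + 2 + 1;   # 77 bytes

# Read whole records from an open filehandle and return a reference to
# an array of records, each record being an array ref of 7 fields.
sub parse_records {
    my ($fh) = @_;
    my @records;
    my $buf;
    while (my $n = read($fh, $buf, $RECORD_LEN)) {
        die "short record ($n bytes)\n" if $n != $RECORD_LEN;
        push @records, [ unpack $TEMPLATE, $buf ];
    }
    return \@records;
}
```

Calling parse_records($fh) on a filehandle opened for reading gives back
one array ref per four-line record.  This only works if every record is
exactly the same length, which is why the padding question above matters.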

Approaches like split/regexp/etc (as suggested) are, I think,
potentially easier to read and maintain, and they handle file format
changes much better.  But in terms of manipulating data, treating the
data as fixed-size binary records will yield faster results, I think.
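
For comparison, a rough sketch of the line-oriented regexp/substr
approach.  It copes with a variable-length field 4 and with unpadded
lines.  The name parse_record_lines is hypothetical, and it assumes the
caller has already read and chomped the four lines of one record.

```perl
use strict;
use warnings;

# Takes the four chomped lines of one record and returns the 7 fields.
sub parse_record_lines {
    my @lines = @_;
    die "expected 4 lines\n" unless @lines == 4;

    # Line 1: chars 1-4, 5-6, 7-11, then everything up to end of line.
    my @fields = $lines[0] =~ /^(.{4})(.{2})(.{5})(.*)$/
        or die "line 1 is shorter than 11 characters\n";

    # Lines 2-4: take at most 30, 10 and 2 characters respectively.
    push @fields, substr($lines[1], 0, 30),
                  substr($lines[2], 0, 10),
                  substr($lines[3], 0, 2);

    s/\s+$// for @fields;    # strip trailing padding, if any
    return @fields;
}
```

If the widths change, only the regexp and the substr lengths need
touching, at the cost of doing more work per record than a single
unpack() call.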
-- 
\ Wes Hardaker                           http://pontifications.hardakers.net /
 \_____ "In the bathtub of history the truth is harder to hold than ________/
       \_______ the soap, and much more difficult to find." _______/
               \_________ -- Terry Pratchett ______________/
                         \__________________/

