[vox-tech] Finding the right tool for parsing

Ted Deppner ted at psyber.com
Sat Jan 19 17:49:10 PST 2008


On Sat, Jan 19, 2008 at 01:50:08PM -0800, Alex Mandel wrote:
> I've got a big text file to parse(example below)

Stick it on a website somewhere... it's hard to write a parser when inline
email handling might be misformatting things.

> The only pattern I can find to parse on is a:

Looks like there are lots of patterns to parse on.  That "." on a line by
itself, the address line with two commas in it, the " mi " line, etc.

Seems like a simple job in perl to match any of those.

See if setting perl's input record separator, $/ = "\n.\n", might work.
(Note that $/ is a literal string, not a regex.)

e.g., given a file named foo
====
one
.
two
.
three
.
four
.
====

and perl:
perl -ne 'BEGIN{$/="\n.\n"} print "+++$_---"' foo

you get:
+++one
.
---+++two
.
---+++three
.
---+++four
.
---

This demonstrates that perl treats everything (including newlines) up to
and including each "."-on-a-line-by-itself as a unit (in the $_ variable).

-- 
Ted Deppner
http://www.deppner.us/

