[vox-tech] CSV with rogue EOLs
Samuel N. Merritt
vox-tech@lists.lugod.org
Thu, 16 Oct 2003 19:01:26 -0700
On Thu, Oct 16, 2003 at 06:24:32PM -0700, Bill Kendrick wrote:
>
> Has anyone got a Perl or sed script handy that can take a CSV
> (comma-separated values) text file like this:
>
> "1234","Hello","ABCD"
> "1235","Hello
> there","XYZ"
> "1236","Goodbye","LLLL"
>
> and make it look like this:
>
> "1234","Hello","ABCD"
> "1235","Hello there","XYZ"
> "1236","Goodbye","LLLL"
>
> e.g., wherever there are EOLs _within_ fields (between quotes), have it
> replace those with something (in my example above, just a space)
Let's see if this works. (Warning: use at own risk!)
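while (my $line = <>)
{
    chomp $line;
    # Count quotes not preceded by a backslash. An odd count means a
    # quoted field is still open, so pull in the next line.
    while ((() = $line =~ /(?<!\\)"/g) % 2)
    {
        last if eof();        # unbalanced quotes at end of input
        my $next = <>;
        chomp $next;
        $line .= " $next";    # the embedded EOL becomes a space
    }
    print "$line\n";
}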
I tested this on the example you provided, and it seems to work.
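If the replacement string needs to vary, the same quote-counting idea can be wrapped in a function. This is only a rough sketch: the name join_broken_records and its $sep parameter are made up for illustration, and it assumes quotes inside fields are either backslash-escaped or doubled. It makes no attempt to validate the CSV otherwise.

```perl
use strict;
use warnings;

# Hypothetical helper (name and interface are illustrative assumptions).
# Takes a reference to an array of chomped lines and a separator string,
# and returns the records with embedded line breaks replaced by $sep.
sub join_broken_records {
    my ($lines, $sep) = @_;
    $sep = ' ' unless defined $sep;
    my @records;
    my $pending;                      # record still being assembled, if any
    for my $line (@$lines) {
        $pending = defined $pending ? $pending . $sep . $line : $line;
        # Count quotes not preceded by a backslash. A doubled quote ("")
        # adds two to the count, so it does not change the parity.
        my $quotes = () = $pending =~ /(?<!\\)"/g;
        next if $quotes % 2;          # odd: a quoted field is still open
        push @records, $pending;
        undef $pending;
    }
    # Unbalanced trailing input is passed through rather than dropped.
    push @records, $pending if defined $pending;
    return @records;
}

my @input = (
    '"1234","Hello","ABCD"',
    '"1235","Hello',
    'there","XYZ"',
    '"1236","Goodbye","LLLL"',
);
print "$_\n" for join_broken_records(\@input, ' ');
```

Passing something other than a space as the second argument (say '\n' as a literal marker) would let the line breaks be restored later, if that matters for the data.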
> I'm unfortunately dealing with Excel, and even it is too stupid to remember
> when its within a field, so you end up with a spreadsheet like this:
>
> 1234 | Hello | ABCD
> 1235 | Hello | [blank]
> there | XYZ | [blank]
> 1236 | Goodbye | LLLL
>
> I've dealt with this issue before, but it was years ago. And I used C. ;^)
>
> Thx!
>
> -bill!
>
-- 
Samuel Merritt
OpenPGP key is at http://meat.andcheese.org/~spam/spam_at_andcheese_dot_org.asc
Information about PGP can be found at http://www.mindspring.com/~aegreene/pgp/