[vox-tech] ARE (Tcl / Postgresql) REGEX question
Dylan Beaudette
debeaudette at ucdavis.edu
Mon Dec 1 18:45:44 PST 2008
Hi,
I have a rather complex (for me) regular expression that I am trying to figure
out.
Here is an example that works just fine:
-- I am trying to extract the two colors:
-- 10YR 6/4 and 7.5YR 4/4 from the following block of text
SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam, brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky;
hard, friable, sticky and plastic; few very fine and many fine and medium
roots; many very fine and fine interstital and tubular pores; few thin clay
films lining pores; pH 5.4; clear smooth boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])');
regexp_matches
--------------------------
{"10YR 6/4","7.5YR 4/4"}
However, this pattern fails to match (regexp_matches returns no rows) when
the text contains only one color:
SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam; weak coarse subangular blocky; hard, friable, sticky and plastic; few
very fine and many fine and medium roots; many very fine and fine interstital
and tubular pores; few thin clay films lining pores; pH 5.4; clear smooth
boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])');
I have tried making the second capturing group optional by appending the '?'
quantifier to it. With that change the single-color example is parsed
correctly, but the two-color example now returns NULL for the second color:
SELECT regexp_matches('B11t Light yellowish brown (10YR 6/4) gravelly clay
loam, brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky;
hard, friable, sticky and plastic; few very fine and many fine and medium
roots; many very fine and fine interstital and tubular pores; few thin clay
films lining pores; pH 5.4; clear smooth boundary.',
E'([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9]).*?([0-9]?[\\.]?[0-9][Y|y|R|r]+[ ]+?[0-9]/[0-9])?');
regexp_matches
-------------------
{"10YR 6/4",NULL}
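One possible direction, offered as a sketch under assumptions rather than a
tested fix: instead of capturing both colors in a single match, match one
color at a time and pass the 'g' flag so that regexp_matches returns one row
per occurrence. The single-color pattern below is a simplified variant of the
one above (it uses [YyRr] in place of [Y|y|R|r], since '|' inside a bracket
expression is a literal character), and the aliases t, m, and color are
hypothetical names chosen for illustration.

```sql
-- Sketch only: one capture group for ONE color; the 'g' flag repeats the
-- search across the whole string, giving one result row per color found.
-- Aliases t, m, and color are illustrative, not from the original query.
SELECT m[1] AS color
FROM regexp_matches(
       'B11t Light yellowish brown (10YR 6/4) gravelly clay loam, brown to dark brown (7.5YR 4/4) moist; weak coarse subangular blocky',
       E'([0-9]?\\.?[0-9][YyRr]+ +[0-9]/[0-9])',
       'g'
     ) AS t(m);
```

Because each color becomes its own row, the same query should cover text with
one, two, or more colors, with no optional second group needed.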
Any ideas on how to improve this regex?
Thanks!
Dylan
--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341