[vox-tech] regex help - matching literal []

Micah J. Cowan micah at cowan.name
Fri Apr 28 16:18:05 PDT 2006


On Fri, Apr 28, 2006 at 03:45:25PM -0700, Kenneth Herron wrote:
> Micah J. Cowan wrote:
> > On Fri, Apr 28, 2006 at 04:19:34PM -0500, Ken Bloom wrote:
> >> Target exception: java.util.regex.PatternSyntaxException: Unclosed 
> >> character class near index 5
> >> [[].*]
> >>      ^
> >>
> >> java.util.regex.PatternSyntaxException: Unclosed character class near 
> >> index 5
> >> [[].*]
> >>      ^
> > 
> > Then it violates POSIX regex syntax. That's a broken response, IMO.
> 
> I've been reading 
> <http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html> 
> and <http://www.opengroup.org/onlinepubs/007908799/xbd/re.html>, and 
> perhaps I'm missing it but they don't seem to support your assertion. 

No, those are the references I use as well.

> And of course google doesn't allow searching for punctuation. I'd 
> appreciate it if you would explain how "[[].*]" is valid, or point to 
> some source that supports your position.

Well, I'm not sure I understand what you're saying: you yourself seem to
prove the validity of the RE in your final sentence below. However, in the
SUSv3 spec you cited, the proof is in the grammar, and also (easier to
read) in 9.3.5, point #1, where it says that the "[" character loses its
special meaning within a bracket expression.

It would have a /new/ special meaning /if/ it opened a character class
("[:"), collating symbol ("[."), or equivalence class ("[="); but it does
none of those, so what you have is "[[]" => literal "[".

The final "]" of my RE isn't special (since no bracket expression is open
at that point), so it's literal.
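
To make that reading concrete, here is a rough Java sketch of the same
three pieces (literal "[", then ".*", then literal "]") spelled with
backslash escapes, which java.util.regex does accept. The class and method
names are made up for illustration; this is not a POSIX implementation,
just the equivalent pattern as I understand it.

```java
import java.util.regex.Pattern;

// Hypothetical names, for illustration only.
final class LiteralBracketSketch {
    // Escaped spelling of: literal "[", any run of characters, literal "]".
    static final Pattern BRACKETED = Pattern.compile("\\[.*\\]");

    // True if the input contains a "[" with a "]" somewhere after it.
    // Note that "." does not match line terminators by default.
    static boolean containsBracketed(String input) {
        return BRACKETED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBracketed("x[abc]y")); // true
        System.out.println(containsBracketed("abc"));     // false
        System.out.println(containsBracketed("]abc["));   // false
    }
}
```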

> Besides which, Peter was trying to match "[" and "]" individually. A 
> single RE that matches either character isn't what he wanted. Does 
> "[[].*]" match "[" or "]"?

A single RE that matches either character isn't what I provided.

"[[]" means a bracket expression consisting of the single character "[".

> ObAlternateSolution: the expression "[[]" matches "[", "[]]" matches 
> "]", and "[[][]]" matches "[]".

If you'll look closer, you'll see that's exactly what I did. But Java
doesn't support it(!).
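
For anyone stuck with java.util.regex, a minimal sketch of what seems to
work instead: escape the brackets inside the class, or let Pattern.quote
do the escaping. The compiles() helper and the class name are hypothetical,
and the first result simply reflects the PatternSyntaxException quoted
earlier in this thread.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical names, for illustration only.
final class JavaBracketClassSketch {
    // Reports whether java.util.regex accepts the given pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The POSIX-style spelling is rejected, as in the quoted exception.
        System.out.println(compiles("[[].*]"));                 // false
        // Escaped brackets inside a character class are accepted.
        System.out.println("[".matches("[\\[]"));               // true
        System.out.println("]".matches("[\\]]"));               // true
        System.out.println("[]".matches("[\\[][\\]]"));         // true
        // Pattern.quote builds a literal pattern without manual escaping.
        System.out.println("[".matches(Pattern.quote("[")));    // true
    }
}
```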

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

