thr3ads.net - llvm dev - [LLVMdev] Regular Expression lib support [Aug 2009]

Daniel Dunbar

2009-Aug-24 00:32 UTC

[LLVMdev] Regular Expression lib support

On Sun, Aug 23, 2009 at 3:29 PM, Kenneth Uildriks<kennethuil at gmail.com> wrote:
> On Sun, Aug 23, 2009 at 4:56 PM, Daniel Dunbar<daniel at zuster.org> wrote:
>> We would like to have access to some kind of regular expression
>> library inside LLVM. For example, we need this to extend the FileCheck
>> test case checking tool to support regular expressions.
>>
>> There are three obvious options:
>>  1. Roll our own library. Multiple unnamed individuals may even
>> already have implementations lying around! :)
>>  2. Use POSIX regcomp facilities. This implies importing some
>> implementation of this interface, e.g., for Windows. On Linux, BSD, etc.
>> we would try to use the platform version if available (and non-buggy).
>>
>>  3. Import a more heavy weight library such as PCRE, and use it
>> universally.
>
>
> Personally, I'm a big fan of the Boost libraries.  They've got a regex
> library, and a full-blown parser library (which I am using in my
> front-end).  It's definitely heavier than POSIX, but it's portable,
> well-tested, and loaded with features.

This is too heavy, and we don't need the extra features, and regexec
is well tested and much more standard. Unless there is an overwhelming
agreement to add another option, I'd like to keep the discussion to
the obvious choices. That is, I need to be convinced *not* to use #2,
before I get derailed into discussing which form of #3 to take.

 - Daniel
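
For readers unfamiliar with the interface behind option #2, here is a minimal sketch of the POSIX regcomp/regexec calls. It assumes a platform that ships <regex.h> (Linux, BSD, Mac OS X); the helper name posixSearch is hypothetical and is not taken from LLVM or from this thread.

```cpp
#include <regex.h>   // POSIX regex interface; not shipped with MSVC
#include <string>

// Hypothetical helper: returns true if `text` contains a match for the
// POSIX extended regular expression `pattern`. Returns false both when
// there is no match and when the pattern fails to compile.
bool posixSearch(const std::string &pattern, const std::string &text) {
  regex_t re;
  // REG_EXTENDED selects ERE syntax; REG_NOSUB skips sub-match bookkeeping.
  if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  // With REG_NOSUB the nmatch/pmatch arguments are ignored.
  int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
  regfree(&re);      // release whatever regcomp allocated
  return rc == 0;    // 0 means "matched", REG_NOMATCH otherwise
}
```

A FileCheck-style use would be something like posixSearch("r[0-9]+", line) for each line of tool output.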

OvermindDL1

2009-Aug-24 00:44 UTC

Blast, LLVM list not filling in the response headers still...

On Sun, Aug 23, 2009 at 6:32 PM, Daniel Dunbar<daniel at zuster.org> wrote:
> On Sun, Aug 23, 2009 at 3:29 PM, Kenneth Uildriks<kennethuil at gmail.com> wrote:
>> On Sun, Aug 23, 2009 at 4:56 PM, Daniel Dunbar<daniel at zuster.org> wrote:
>>> We would like to have access to some kind of regular expression
>>> library inside LLVM. For example, we need this to extend the FileCheck
>>> test case checking tool to support regular expressions.
>>>
>>> There are three obvious options:
>>>  1. Roll our own library. Multiple unnamed individuals may even
>>> already have implementations lying around! :)
>>>  2. Use POSIX regcomp facilities. This implies importing some
>>> implementation of this interface, e.g., for Windows. On Linux, BSD, etc.
>>> we would try to use the platform version if available (and non-buggy).
>>>
>>>  3. Import a more heavy weight library such as PCRE, and use it
>>> universally.
>>
>>
>> Personally, I'm a big fan of the Boost libraries.  They've got a regex
>> library, and a full-blown parser library (which I am using in my
>> front-end).  It's definitely heavier than POSIX, but it's portable,
>> well-tested, and loaded with features.
>
> This is too heavy, and we don't need the extra features, and regexec
> is well tested and much more standard. Unless there is an overwhelming
> agreement to add another option, I'd like to keep the discussion to
> the obvious choices. That is, I need to be convinced *not* to use #2,
> before I get derailed into discussing which form of #3 to take.

POSIX has a tendency to be rather useless on some major platforms,
notably Windows (which has no built-in regex library). That is why I
still recommend Spirit: it can be as lightweight as you want, you only
pay for what you use, and it is fast and works everywhere.
Boost.xpressive is also quite good if you need the dynamic
functionality (you really should not; how often is the grammar/regex
going to be generated at runtime anyway?).

OvermindDL1

2009-Aug-24 00:50 UTC

On Sun, Aug 23, 2009 at 6:32 PM, Daniel Dunbar<daniel at zuster.org> wrote:
> This is too heavy, and we don't need the extra features, and regexec
> is well tested and much more standard. Unless there is an overwhelming

I had never heard of 'regexec' and figured it was a library; it turns
out to be a function call on *nix systems. Yeah, that is very much not
usable in any way, shape, or form, and it is certainly not a standard if
it does not work on one of the major LLVM platforms (and it is still not
a standard in any pure form, since it is not part of the C/C++ standard
headers).  If that is option #2, then option #2 is very unusable.

And yes, if you must know, I program on Windows, which is why I am
pushing to use something that actually works everywhere instead of
just someone's favorite OS (I prefer BSD honestly, but Windows is what
the desktop world is sadly stuck on, so that is what I have to program
for).

Thomas Neumann

2009-Aug-24 01:06 UTC

Daniel Dunbar wrote:
> This is too heavy, and we don't need the extra features, and regexec
> is well tested and much more standard. Unless there is an overwhelming

Actually, Boost is much more standard. IIRC the Boost library corresponds to
tr1::regex (or std::regex in C++0x), which has the incredible advantage of
being available on all standard-conforming C++ compilers. Admittedly the C++
compiler has to be fairly new to support this, but you can use Boost as a
drop-in replacement until the compiler supports it on its own.

Thomas
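
To illustrate the interface Thomas refers to, here is a minimal sketch using std::regex as it was later standardized in C++11. It assumes a compiler with a working <regex> implementation; the helper name stdSearch is hypothetical. Boost.Regex offers similarly named boost::regex and boost::regex_search, although this sketch has only been written against the standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if `text` contains a match for
// `pattern`, interpreted with POSIX extended syntax. An invalid pattern
// is reported as "no match" instead of propagating the exception.
bool stdSearch(const std::string &pattern, const std::string &text) {
  try {
    // std::regex::extended requests the POSIX ERE grammar.
    std::regex re(pattern, std::regex::extended);
    return std::regex_search(text, re);
  } catch (const std::regex_error &) {
    return false;  // the pattern could not be compiled
  }
}
```

Unlike the POSIX calls, this needs no platform-specific header, which is the portability argument made above; the cost is requiring a sufficiently recent compiler or a Boost dependency.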

Chris Lattner

2009-Aug-24 02:28 UTC

On Aug 23, 2009, at 5:50 PM, OvermindDL1 wrote:
> On Sun, Aug 23, 2009 at 6:32 PM, Daniel Dunbar<daniel at zuster.org> wrote:
>> This is too heavy, and we don't need the extra features, and regexec
>> is well tested and much more standard. Unless there is an overwhelming
>
> 'regexec' I had never heard of, figured it was a library, turns out it
> is a function call on *nix systems, yea, that is very much not usable
> in any way shape or form, and is certainly not a standard if it does
> not work on one of the major LLVM platforms (and it is still not a
> standard in any pure form since it is not part of the C/C++ standard
> headers).  If that is option #2, then option #2 is very unusable.
>
> And yes, if you must know, I program on Windows, which is why I am
> pushing to use something that actually works everywhere instead of
> just someone's favorite OS (I prefer BSD honestly, but Windows is what
> the desktop world is sadly stuck on, so that is what I have to program
> for).

I think you're seriously confused about the proposal.  To put it
bluntly, there is no way we'll use Boost's regex support, sorry.

The proposal is to use the unix standard regexec library interface.   
The LLVM tree would include an imported BSD-licenced implementation  
from one of many sources.  We'd then have configury logic detect when  
the host OS already supports the regexec interfaces, and if so, don't  
build our imported copy.

We'd have a simple layer on top of it to make the interface to the  
regex library less horrible than what regexec provides.

Again, forget boost regex. :)

-Chris
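
The "simple layer on top" that Chris describes could look roughly like the sketch below. This is an assumption about the general shape of such a wrapper, not the design LLVM adopted: the class name SimpleRegex and its methods are hypothetical, and the code assumes a <regex.h> is available, either from the host OS or from an imported BSD-licensed copy.

```cpp
#include <regex.h>
#include <string>

// Hypothetical RAII wrapper that hides regcomp/regexec/regerror/regfree.
class SimpleRegex {
public:
  explicit SimpleRegex(const std::string &pattern) {
    int rc = regcomp(&re_, pattern.c_str(), REG_EXTENDED);
    if (rc != 0) {
      char buf[256];
      regerror(rc, &re_, buf, sizeof(buf));  // human-readable diagnostic
      error_ = buf;
      return;
    }
    compiled_ = true;
  }
  ~SimpleRegex() {
    if (compiled_)
      regfree(&re_);
  }
  // regex_t owns resources, so copying is disabled in this sketch.
  SimpleRegex(const SimpleRegex &) = delete;
  SimpleRegex &operator=(const SimpleRegex &) = delete;

  bool isValid() const { return compiled_; }
  const std::string &error() const { return error_; }

  // Returns true if `text` contains a match. If `matched` is non-null,
  // it receives the text of the overall match.
  bool match(const std::string &text, std::string *matched = nullptr) const {
    if (!compiled_)
      return false;
    regmatch_t m;
    if (regexec(&re_, text.c_str(), 1, &m, 0) != 0)
      return false;
    if (matched)
      *matched = text.substr(m.rm_so, m.rm_eo - m.rm_so);
    return true;
  }

private:
  regex_t re_;
  bool compiled_ = false;
  std::string error_;
};
```

Callers then never touch regex_t directly: construct a SimpleRegex, check isValid(), and call match(). The choice between the system regex and the bundled copy stays a build-time decision that is invisible to this interface.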
