Priyank Bhatt
2014-Dec-18 22:01 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
Hello, I came across the file *omega.cc* which is in directory* xapain-application/omega/* In this file , atoi is used in *Percentage Relevance cutoff *(293 line no) as Percentage lies between 0-100 their is no need to modify atoi . But do we need to check for error's ? Second Implementation is in *collapsing* (301) in which we collapse set of document under a key,range of this key has not been defined anywhere so we can increase the size of the key over here to accumulate more key's ? Third Implementation is in *Sort* (330) And I am not sure what is the val value over here is ?? Is it the entire string of sorted value number's as cgi_params is multimap and find returns the iterator at which it find the element containing the key value ? And I am not sure whether to modify atoi over here. val = cgi_params.find("SORT"); if (val != cgi_params.end()) { sort_key = atoi(val->second.c_str()); Thank You, Priyank Bhatt On 17 December 2014 at 03:38, Olly Betts <olly at survex.com> wrote:> > On Wed, Dec 17, 2014 at 01:15:12AM +0530, Priyank Bhatt wrote: > > I came across this function *HtmlParser::decode_entities(string &s)* in > > *xapian-application/omega/htmlparse.cc* which basically does is extract > hex > > value if any or extract number. > > The code you refer to is actually parsing a decimal value (like &) - > the hex case (like &) uses sscanf(). > > > For extracting number atoi is used and value > > returned by it is stored in variable "val" , I think so replacing atoi > with > > strtoul would be useful here as number can have larger value although the > > variable "val" is unsigned int so i need to change the that definition of > > "val" also. Is that ok to do so ? Just need to clarify . > > We ultimately pass val to Xapian::Unicode::nonascii_to_utf8() which > takes "unsigned" so there's not much point making val a wider type > here. > > While ISO C/C++ only guarantee that int is at least 16 bits, in > practice it is 32 bits on the platforms we support. > > Cheers, > Olly >-------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.xapian.org/pipermail/xapian-devel/attachments/20141219/2f50e170/attachment-0002.html>
Olly Betts
2014-Dec-18 22:55 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
,On 19 December 2014 at 11:01, Priyank Bhatt <priyankbhatt1234 at gmail.com> wrote:> [omega.cc] > In this file , atoi is used in Percentage Relevance cutoff (293 line no) as > Percentage lies between 0-100 their is no need to modify atoi . But do we > need to check for error's ?Yes - the main motivation of switching away from atoi() is that it doesn't report errors, though it is certainly good to be also considering what the appropriate range to support is in each case.> Second Implementation is in collapsing (301) in which we collapse set of > document under a key,range of this key has not been defined anywhere so we > can increase the size of the key over here to accumulate more key's ?It's the range of Xapian::valueno in the C++ API - if you look in xapian-core/include/xapian/types.h you can see it's an unsigned int.> Third Implementation is in Sort (330) And I am not sure what is the val > value over here is ?? Is it the entire string of sorted value number's as > cgi_params is multimap and find returns the iterator at which it find the > element containing the key value ? And I am not sure whether to modify atoi > over here. > > val = cgi_params.find("SORT"); > if (val != cgi_params.end()) { > sort_key = atoi(val->second.c_str());You can find the documentation for each parameter easily - "git grep -w SORT" shows it's documented in docs/cgiparams.rst, which shows it's a value number (like COLLAPSE). Cheers, Olly
James Aylett
2014-Dec-18 23:38 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
On 18 Dec 2014, at 22:01, Priyank Bhatt <priyankbhatt1234 at gmail.com> wrote:> I came across the file omega.cc which is in directory xapain-application/omega/ > > In this file , atoi is used in Percentage Relevance cutoff (293 line no) as Percentage lies between 0-100 their is no need to modify atoi . But do we need to check for error's ?I think we should; I?d suggest that correct error handling here is to not override the default setting (although right now that?s what atoi/atol will usually do because a lot of the defaults are 0), but providing it sounds sensible and is documented (in omega?s cgiparams.rst) I?m not hugely bothered which way we go.> Second Implementation is in collapsing (301) in which we collapse set of document under a key,range of this key has not been defined anywhere so we can increase the size of the key over here to accumulate more key's ?Collapsing is done using a valueno (you can tell because collapse_key is of type Xapian::valueno), which is a 32 bit unsigned integer (see here: getting-started-with-xapian.readthedocs.org/en/latest/concepts/indexing/values.html). For 32 bit unsigned you need unsigned long.> Third Implementation is in Sort (330) And I am not sure what is the val value over here is ?? Is it the entire string of sorted value number's as cgi_params is multimap and find returns the iterator at which it find the element containing the key value ? And I am not sure whether to modify atoi over here.Same comment about it being a valueno. Note also that there are a number of atol calls in omega.cc which will need considering as well. I think you?ve been looking at other uses of atoi/atol; if you?ve successfully updated code then you should make a pull request or email a patch (or attach one to the trac ticket) so someone can review it and get it into trunk (and hence into a future release). Smaller PRs are generally easier to review than ones that touch lots of files, and it?s not a problem to have a number of small PRs, each fixing atoi() usage in one subsystem of Xapian. J -- James Aylett, occasional trouble-maker xapian.org
James Aylett
2014-Dec-19 00:16 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
On 18 Dec 2014, at 22:55, Olly Betts <olly at survex.com> wrote:>> Second Implementation is in collapsing (301) in which we collapse set of >> document under a key,range of this key has not been defined anywhere so we >> can increase the size of the key over here to accumulate more key's ? > > It's the range of Xapian::valueno in the C++ API - if you look in > xapian-core/include/xapian/types.h you can see it's an unsigned int.Our documentation says it?s 32 bit unsigned, which isn?t even always containable in unsigned int (according to TC3). So our documentation needs some attention? (it?s in getting started). J -- James Aylett, occasional trouble-maker xapian.org
Olly Betts
2014-Dec-19 00:44 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
On Thu, Dec 18, 2014 at 11:38:42PM +0000, James Aylett wrote:> Collapsing is done using a valueno (you can tell because collapse_key > is of type Xapian::valueno), which is a 32 bit unsigned integer (see > here: > getting-started-with-xapian.readthedocs.org/en/latest/concepts/indexing/values.html). > For 32 bit unsigned you need unsigned long.Xapian::valueno is actually a typedef for unsigned currently. While the standards permit 16 bit integers, no relevant platforms actually use them nowadays, so pragmatically unsigned is a better choice if you need to pick a single built-in type. It'd be nice to be a bit smarter about this - now we've decided to require C++11 for trunk, we can now use uint32_t here. Prior to C++11 I think we'd have to generate the API header based on configure checks and then install it to an architecture-dependent path. But anyway, there's no "strtou()" so you have to converting to unsigned long with strtoul() and then check for potential overflow from the cast to Xapian::valueno.> I think you?ve been looking at other uses of atoi/atol; if you?ve > successfully updated code then you should make a pull request or email > a patch (or attach one to the trac ticket) so someone can review it > and get it into trunk (and hence into a future release). Smaller PRs > are generally easier to review than ones that touch lots of files, and > it?s not a problem to have a number of small PRs, each fixing atoi() > usage in one subsystem of Xapian.Seconded. Another reason to get some of your changes reviewed early is in case there's something you're doing wrong - better to discover that before you repeat it in a lot of different places. Cheers, Olly