Priyank Bhatt
2014-Dec-16 19:45 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
Hello,

I came across the function *HtmlParser::decode_entities(string &s)* in *xapian-applications/omega/htmlparse.cc*, which extracts either a hex value or a decimal number from an entity. The number is extracted with atoi and the result is stored in the variable "val". I think replacing atoi with strtoul would be useful here, since the number can be larger than atoi can return. However, "val" is declared as unsigned int, so I would also need to change the definition of "val". Is it OK to do that? I just need to clarify.

On 16 December 2014 at 04:34, Olly Betts <olly at survex.com> wrote:
>
> On Tue, Dec 16, 2014 at 02:32:31AM +0530, Priyank Bhatt wrote:
> > I am working on replacing atoi () and atol() functions with strtol() and
> > strtoul() . I came across many files which uses statement like these
> > time_t secs= atoi(data_span.c_str()), here time_t Datatype is not known but
> > wikipedia says that it is integer so is it necessary to replace atoi with
> > strtol over here ??
>
> The time_t type is a standard one - ISO C only says it's a "arithmetic
> type" (so it could potentially be a double) but POSIX says it's an
> "integer type":
>
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html
>
> While that would allow it to be unsigned, in practice it seems to be a
> signed integer, and it is 64 bit on modern systems (since a signed 32
> bit integer with the Unix epoch can only represent dates up to 2038).
>
> You missed out a relevant part of line, which in full is:
>
> xapian-applications/omega/date.cc: time_t secs = atoi(date_span.c_str()) * (24 * 60 * 60);
>
> So in this case, date_span is a number of days, so converting it via a
> long is reasonable - even if long is 32 bits that can still represent a
> span of 5.8 million years. You'd want to make sure the multiplication
> happens in type time_t though.
>
> If we were actually converting a number of seconds, you'd want to use
> strtoll(), at least if sizeof(long) < 8. C++11 includes strtoll(), and
> we decided to require a C++11 compiler for 1.3.x, so there's no need to
> worry whether strtoll() is available.
>
> > And is their any document which helps me what each file function does like
> > date.cc,cgiparams.cc,omega.cc,query.cc.
>
> There should be a short comment summarising the purpose of each file at
> the top, for example:
>
> | /* date.cc: date range parsing routines for omega
>
> A few files may still lack these, but most have them.
>
> Cheers,
> Olly
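The advice quoted above (convert the day count via a long, but do the multiplication in time_t) might look roughly like the sketch below. The helper name days_to_seconds and its bool return convention are made up for illustration; this is not the code in date.cc, only one way the idea could be written.

```cpp
#include <cerrno>
#include <cstdlib>
#include <ctime>
#include <string>

// Hypothetical helper (not part of Omega): parse a span given in days and
// return it in seconds. Returns false if the string is empty, has trailing
// junk, or does not fit in a long.
bool days_to_seconds(const std::string& date_span, std::time_t& secs) {
    const char* start = date_span.c_str();
    char* end;
    errno = 0;
    long days = std::strtol(start, &end, 10);
    if (end == start || *end != '\0' || errno == ERANGE) return false;
    // Cast to time_t before multiplying so the arithmetic happens in
    // time_t rather than in int or long. This sketch does not guard
    // against overflow of time_t itself for absurdly large spans.
    secs = static_cast<std::time_t>(days) * (24 * 60 * 60);
    return true;
}
```

Unlike atoi, strtol reports where parsing stopped and sets errno on overflow, which is what makes the three checks possible.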
Olly Betts
2014-Dec-16 22:08 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
On Wed, Dec 17, 2014 at 01:15:12AM +0530, Priyank Bhatt wrote:
> I came across this function *HtmlParser::decode_entities(string &s)* in
> *xapian-application/omega/htmlparse.cc* which basically does is extract hex
> value if any or extract number.

The code you refer to is actually parsing a decimal value (like &#38;) -
the hex case (like &#x26;) uses sscanf().

> For extracting number atoi is used and value
> returned by it is stored in variable "val" , I think so replacing atoi with
> strtoul would be useful here as number can have larger value although the
> variable "val" is unsigned int so i need to change the that definition of
> "val" also. Is that ok to do so ? Just need to clarify .

We ultimately pass val to Xapian::Unicode::nonascii_to_utf8() which
takes "unsigned" so there's not much point making val a wider type here.

While ISO C/C++ only guarantee that int is at least 16 bits, in practice
it is 32 bits on the platforms we support.

Cheers,
Olly
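To make the point about "val" concrete, here is a minimal sketch of parsing the digits of a decimal entity with strtoul while still ending up with an "unsigned". The helper parse_decimal_entity is hypothetical and is not the code in htmlparse.cc; the upper bound of 0x10FFFF is the Unicode code point limit, and the sketch assumes unsigned is at least 32 bits, as it is in practice on the supported platforms.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper (not the actual Omega code): parse the digits of a
// decimal numeric character reference, e.g. "38" taken from "&#38;".
// Returns false for empty input, non-digit characters, or values that are
// not valid Unicode code points.
bool parse_decimal_entity(const std::string& digits, unsigned& val) {
    const char* start = digits.c_str();
    // strtoul() skips leading whitespace and accepts a sign, so insist
    // that the first character is a digit.
    if (*start < '0' || *start > '9') return false;
    char* end;
    errno = 0;
    unsigned long n = std::strtoul(start, &end, 10);
    if (*end != '\0' || errno == ERANGE) return false;
    // Code points stop at 0x10FFFF, which fits in unsigned when int is
    // 32 bits, so a wider type for val would gain nothing.
    if (n > 0x10FFFF) return false;
    val = static_cast<unsigned>(n);
    return true;
}
```

The result could then be handed to a function taking "unsigned", which is why widening "val" is not needed: anything too large is rejected before the narrowing cast.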
Priyank Bhatt
2014-Dec-18 22:01 UTC
[Xapian-devel] Replace atoi and atol with strtol strtoul:Need Help
Hello,

I came across the file *omega.cc*, which is in the directory *xapian-applications/omega/*. atoi is used in three places in this file.

The first is in *Percentage Relevance cutoff* (line 293). As a percentage lies between 0 and 100, there is no need to replace atoi there. But do we need to check for errors?

The second is in *collapsing* (line 301), where we collapse a set of documents under a key. The range of this key has not been defined anywhere, so can we increase the size of the key here to accommodate more keys?

The third is in *Sort* (line 330). I am not sure what the value of val is here. Is it the entire string of sort value numbers, given that cgi_params is a multimap and find returns an iterator to the element it finds with that key? I am also not sure whether to modify atoi here.

    val = cgi_params.find("SORT");
    if (val != cgi_params.end()) {
        sort_key = atoi(val->second.c_str());

Thank You,
Priyank Bhatt

On 17 December 2014 at 03:38, Olly Betts <olly at survex.com> wrote:
>
> On Wed, Dec 17, 2014 at 01:15:12AM +0530, Priyank Bhatt wrote:
> > I came across this function *HtmlParser::decode_entities(string &s)* in
> > *xapian-application/omega/htmlparse.cc* which basically does is extract hex
> > value if any or extract number.
>
> The code you refer to is actually parsing a decimal value (like &#38;) -
> the hex case (like &#x26;) uses sscanf().
>
> > For extracting number atoi is used and value
> > returned by it is stored in variable "val" , I think so replacing atoi with
> > strtoul would be useful here as number can have larger value although the
> > variable "val" is unsigned int so i need to change the that definition of
> > "val" also. Is that ok to do so ? Just need to clarify .
>
> We ultimately pass val to Xapian::Unicode::nonascii_to_utf8() which
> takes "unsigned" so there's not much point making val a wider type
> here.
>
> While ISO C/C++ only guarantee that int is at least 16 bits, in
> practice it is 32 bits on the platforms we support.
>
> Cheers,
> Olly
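On the questions about what val holds and whether to check for errors, the following is a rough sketch only. The thread says cgi_params is a multimap but not its exact type, so the CgiParams typedef below is an assumption, and get_int_param is a hypothetical helper rather than anything in omega.cc.

```cpp
#include <cerrno>
#include <cstdlib>
#include <map>
#include <string>

// Assumed type: the thread only says that cgi_params is a multimap.
typedef std::multimap<std::string, std::string> CgiParams;

// Hypothetical helper: look up `name` and parse its value as an integer in
// [lo, hi]. Returns false and leaves `out` unchanged if the parameter is
// missing, malformed, or out of range.
bool get_int_param(const CgiParams& params, const std::string& name,
                   long lo, long hi, long& out) {
    // find() returns an iterator to one element with this key, or end()
    // if there is none. With duplicate keys it may be any of them.
    CgiParams::const_iterator val = params.find(name);
    if (val == params.end()) return false;
    // val->second is the value string of that single entry only.
    const char* start = val->second.c_str();
    char* end;
    errno = 0;
    long n = std::strtol(start, &end, 10);
    if (end == start || *end != '\0' || errno == ERANGE) return false;
    if (n < lo || n > hi) return false;
    out = n;
    return true;
}
```

For a percentage cutoff the call would pass 0 and 100 as the bounds, so a value such as "150" or "abc" is rejected instead of being silently turned into a number as atoi would do. If the first entry for a repeated key is needed, lower_bound() could be used in place of find().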