Michael Welsh Duggan
2016-Dec-19 00:57 UTC
fts-solr: Returning 400 on searches; unescaped braces
Using Debian, dovecot-solr 1:2.2.26.0-4, and solr-tomcat 3.6.2+dfsg-9, I
am getting 400 errors when doing searches. Here is an example search
query from dovecot that failed (captured with wireshark):

    Frame 23: 338 bytes on wire (2704 bits), 338 bytes captured (2704 bits) on interface 0
    Linux cooked capture
    Internet Protocol Version 6, Src: ::1, Dst: ::1
    Transmission Control Protocol, Src Port: 56860, Dst Port: 8080, Seq: 1, Ack: 1, Len: 250
    Hypertext Transfer Protocol
    GET /solr/select?fl=uid,score&rows=2664&sort=uid+asc&q={!lucene+q.op%3dAND}(hdr:test+OR+body:test)&fq=%2Bbox:6d5de009f991854df726000012cf7b9c+%2Buser:md5i HTTP/1.1\r\n
    Host: localhost:8080\r\n
    Date: Mon, 19 Dec 2016 00:25:56 GMT\r\n
    Connection: Keep-Alive\r\n
    \r\n
    [Full request URI: http://localhost:8080/solr/select?fl=uid,score&rows=2664&sort=uid+asc&q={!lucene+q.op%3dAND}(hdr:test+OR+body:test)&fq=%2Bbox:6d5de009f991854df726000012cf7b9c+%2Buser:md5i]
    [HTTP request 1/1]
    [Response in frame: 25]

Here is the same query from firefox, which succeeds:

    Frame 66: 646 bytes on wire (5168 bits), 646 bytes captured (5168 bits) on interface 0
    Linux cooked capture
    Internet Protocol Version 6, Src: ::1, Dst: ::1
    Transmission Control Protocol, Src Port: 56862, Dst Port: 8080, Seq: 1, Ack: 1, Len: 558
    Hypertext Transfer Protocol
    GET /solr/select?fl=uid,score&rows=2664&sort=uid+asc&q=%7B!lucene+q.op%3DAND%7D(hdr:test+OR+body:test)&fq=%2Bbox:6d5de009f991854df726000012cf7b9c+%2Buser:md5i HTTP/1.1\r\n
    Host: localhost:8080\r\n
    Connection: keep-alive\r\n
    Cache-Control: max-age=0\r\n
    Upgrade-Insecure-Requests: 1\r\n
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36\r\n
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n
    DNT: 1\r\n
    Accept-Encoding: gzip, deflate, sdch, br\r\n
    Accept-Language: en-US,en;q=0.8\r\n
    \r\n
    [Full request URI: http://localhost:8080/solr/select?fl=uid,score&rows=2664&sort=uid+asc&q=%7B!lucene+q.op%3DAND%7D(hdr:test+OR+body:test)&fq=%2Bbox:6d5de009f991854df726000012cf7b9c+%2Buser:md5i]
    [HTTP request 1/1]
    [Response in frame: 86]

The salient difference seems to be the encoding of the braces. Indeed
in the tomcat 8 logs, I find the following which seems to corroborate
my hypothesis:

    java.lang.IllegalArgumentException: Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986
        at org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:467)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:667)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:789)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1437)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)

Indeed the braces are listed in fts-backend-solr.c as part of
solr_escape_chars, so don't know how the braces are making it through
unencoded.

-- 
Michael Welsh Duggan (md5i at md5i.com)
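The difference between the two captured requests can be checked mechanically. The following is a minimal Python sketch, not Tomcat's actual validation code: it assumes the character set that RFC 3986 permits unescaped in a query component (unreserved characters, sub-delims, ":", "@", "/", "?", plus "%" to introduce an escape). The helper name `invalid_query_chars` is hypothetical and exists only for this illustration.

```python
import string

# Characters RFC 3986 allows unescaped in a query component:
# unreserved / sub-delims / ":" / "@" / "/" / "?"; "%" introduces an escape.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
QUERY_ALLOWED = set(UNRESERVED + SUB_DELIMS + ":@/?%")


def invalid_query_chars(query):
    """Return the distinct characters in `query` that need percent-encoding."""
    return sorted({ch for ch in query if ch not in QUERY_ALLOWED})


# The q= parameter as sent by dovecot (rejected) and by the browser (accepted).
dovecot_q = "q={!lucene+q.op%3dAND}(hdr:test+OR+body:test)"
browser_q = "q=%7B!lucene+q.op%3DAND%7D(hdr:test+OR+body:test)"

print(invalid_query_chars(dovecot_q))  # ['{', '}']
print(invalid_query_chars(browser_q))  # []
```

Under this assumption, only the two braces in the dovecot request fall outside the permitted set, which is consistent with the "Invalid character found in the request target" exception in the Tomcat log.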
Michael Welsh Duggan
2017-Jan-16 05:06 UTC
fts-solr: Returning 400 on searches; unescaped braces
Should I try to get more information on this?

-- 
Michael Welsh Duggan (md5i at md5i.com)
Michael Welsh Duggan
2017-Jan-16 19:52 UTC
fts-solr: Returning 400 on searches; unescaped braces
Michael Welsh Duggan <mwd at md5i.com> writes:

> Indeed the braces are listed in fts-backend-solr.c as part of
> solr_escape_chars, so don't know how the braces are making it through
> unencoded.

I have attached a patch which solves this problem. I initially tried
changing http_url_escape_param() to include braces, but this did not
solve the problem. I have to guess that the {!lucene+q.op=AND} bit does
not travel through this function. So I just changed the braces in the
lines where they were introduced into their encoded values. Since the
equals-sign was already encoded this way there, it seemed to make sense.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: escape-braces.patch
Type: text/x-diff
Size: 831 bytes
Desc: not available
URL: <http://dovecot.org/pipermail/dovecot/attachments/20170116/ec00edb6/attachment.bin>

-- 
Michael Welsh Duggan (md5i at md5i.com)
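The approach described in the message above (write the fixed `{!lucene q.op=AND}` prefix in already-encoded form, and escape only the user-supplied terms) can be sketched as follows. This is an illustration in Python, not the contents of escape-braces.patch and not Dovecot's C code: `build_q_param` and `QUERY_PREFIX` are hypothetical names, `urllib.parse.quote` stands in for the escaping function, and the uppercase `%7B`/`%7D` spelling is copied from the browser capture rather than from the patch.

```python
from urllib.parse import quote

# Fixed prefix written out pre-encoded: "{" -> %7B, "=" -> %3D, "}" -> %7D.
# It never passes through the escaping function, so it is encoded by hand.
QUERY_PREFIX = "%7B!lucene+q.op%3DAND%7D"


def build_q_param(field_terms):
    """Build the value of the q= parameter from (field, term) pairs.

    Only the user-supplied terms are percent-encoded here; the field
    names and the prefix are trusted constants in this sketch.
    """
    clauses = "+OR+".join(
        f"{field}:{quote(term, safe='')}" for field, term in field_terms
    )
    return f"{QUERY_PREFIX}({clauses})"


q = build_q_param([("hdr", "test"), ("body", "test")])
print(q)  # %7B!lucene+q.op%3DAND%7D(hdr:test+OR+body:test)
```

This also suggests why extending the escaping function alone would not help: in a layout like this, the prefix is concatenated after the terms have been escaped, so any unencoded brace in the prefix reaches the wire untouched.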
On 16-1-2017 at 20:52, Michael Welsh Duggan wrote:
> Michael Welsh Duggan <mwd at md5i.com> writes:
>
>> Indeed the braces are listed in fts-backend-solr.c as part of
>> solr_escape_chars, so don't know how the braces are making it through
>> unencoded.
> I have attached a patch which solves this problem. I initially tried
> changing http_url_escape_param() to include braces, but this did not
> solve the problem. I have to guess that the {!lucene+q.op=AND} bit does
> not travel through this function. So I just changed the braces in the
> lines where they were introduced into their encoded values. Since the
> equals-sign was already encoded this way there, it seemed to make sense.

Applied:

https://github.com/dovecot/core/commit/c32d111cf4d8be4ffdc582b440b5348d87461066

Regards,

Stephan.