Aki Tuomi
2016-Oct-17 05:51 UTC
Dict proxy client returning empty string instead of multiline string
Hi! This does sound like a bug, we'll have look. Aki On 17.10.2016 01:26, Pierre Jaury wrote:> I dived a little bit further into the rabbit hole, up to the point where > debugging has become unpracticle but I still haven't found the root > cause for sure. > > I read most of the code for "p_strdup" based on datastack memory pools > (which are used for dictionary lookups both with doveadm and by extdata) > and it seems ok. Still, after "t_malloc_real" is called in "t_malloc0", > the allocated buffer has the same address as the source string. > > The only sensible explanation I can come up with is that during > unescaping, strings are not allocated properly, leading to the memory > pool reusing the string address and zeroing it in the process before the > string copy operation. > > I will follow on this path tomorrow, any lead is more than welcome. > > kaiyou. > > On 10/16/2016 11:16 PM, Pierre Jaury wrote: >> Hello, >> >> I am using a dict proxy for my sieve extdata plugin to access some >> fields from an SQLite database (autoreply text and other >> database-configured items). >> >> All tests are performed against version 2.2.25. >> >> $ dovecot --version >> 2.2.25 (7be1766) >> >> My configuration looks like: >> >> dict { >> sieve = sqlite:/etc/dovecot/pigeonhole-sieve.dict >> } >> >> [...] >> sieve_extdata_dict_uri = proxy::sieve >> >> I am able to read pretty much any attribute without any issue, except >> when the value contains a special character like "\r" or "\n". By using >> the doveadm dict client, I narrowed it down to the dictionary management >> part (either server, protocol or client). >> >> I was suspecting escaping functions from "lib/strescape.c" (mostly >> str_tabescape and its counterpart, used by "lib-dict/client.c"), so I >> monitored socket communications. It seems that escaping is done properly >> on the server and the socket is not an issue either. >> >> The following strace dump results from running "doveadm dict get" >> against the dict socket: >> >> connect(8, {sa_family=AF_UNIX, sun_path="..."}, 110) = 0 >> fstat(8, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 >> [...] >> write(8, "H2\t0\t0\tadmin at domain.tld\tsieve\n", 30) = 30 >> [...] >> read(8, "Otest\1r\1ntest\n", 8192) = 14 >> >> Indeed "\1r" and "\1n" are the escape sequences used by >> "lib/strescape.c". I went deeped and debugged the call to "dict_lookup" >> performed by doveadm. Indeed the client gets the proper string from the >> socket and to my surprise, it is properly unescaped. >> >> Then, in "client_dict_lookup" ("lib-dict/dict-client.c"), the call to >> "p_strdup" returns an empty string (null byte set at the target address). >> >> Before the call to the dict "->lookup" attribute (client_dict_lookup): >> >> RAX: 0x7ffff73a37c0 (push r14) >> RBX: 0x6831b8 ("priv/reply_body") >> RCX: 0x7fffffffe240 --> 0x682a60 --> 0x6831b8 ("priv/reply_body") >> RDX: 0x6831b8 ("priv/reply_body") >> RSI: 0x683288 --> 0x7ffff7653120 --> 0x7ffff73ea620 ([...]) >> RDI: 0x690ad0 --> 0x7ffff7400713 --> 0x75250079786f7270 ('proxy') >> >> 0x7ffff73a1f10 <dict_lookup+32>: mov rcx,r11 (value_r) >> 0x7ffff73a1f13 <dict_lookup+35>: mov rdx,r8 (key) >> 0x7ffff73a1f16 <dict_lookup+38>: mov rsi,r10 (pool) >> 0x7ffff73a1f19 <dict_lookup+41>: mov rdi,r9 (dict) >> 0x7ffff73a1f1c <dict_lookup+44>: add rsp,0x8 >> => 0x7ffff73a1f20 <dict_lookup+48>: jmp rax >> >> Before the call to p_strdup in "client_dict_lookup": >> >> RSI: 0x6832d8 ("test\r\ntest") (lookup.result.value) >> RDI: 0x683288 --> 0x7ffff7653120 --> [...] (pool) >> RAX: 0x0 (result) >> >> 0x7ffff73a384f: nop >> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >> 0x7ffff73a3855: mov rdi,r14 >> => 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >> 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >> 0x7ffff73a386f: mov eax,ebx >> >> After the call: >> >> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >> 0x7ffff73a3855: mov rdi,r14 >> 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >> => 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >> 0x7ffff73a386f: mov eax,ebx >> 0x7ffff73a3871: jne 0x7ffff73a38da >> >> RSI: 0x0 >> RDI: 0x6832d8 --> 0x0 >> RAX: 0x6832d8 --> 0x0 (result) >> >> It is worth noting that I can reproduce the exact same execution flow >> with a non-multiline result string (lookup.result.value) that is >> properly copied by "p_strdup" and returned in RAX, then displayed by >> doveadm. >> >> I am not familiar with the pooling mechanism hidden behind the call to >> p_strdump and not quite sure why this behaviour is emerging. Maybe I am >> even miles away from an understanding of the issue here, but it sounds >> to me like something is wrong in the way "p_strdup" performs the copy. >> >> Hope this helps, >> kaiyou. >> >> >>
Pierre Jaury
2016-Oct-17 07:52 UTC
Dict proxy client returning empty string instead of multiline string
While trying to isolate properly and reproduce, I was able to trigger the same bug with the following code: struct dict *dict; char* dict_uri = "proxy::sieve"; char* key = "priv/key"; char* username = "admin at domain.tld"; char* value, error; dict_drivers_register_builtin(); dict_init(dict_uri, DICT_DATA_TYPE_STRING, username, doveadm_settings->base_dir, &dict, &error); dict_lookup(dict, pool_datastack_create(), key, &value); printf(">%s\n", value); // outputs an empty string dict_deinit(&dict); I trimmed it to the bare minimal string manipulation functions involved but cannot reproduce in that case: pool_t pool = pool_datastack_create(); char* s1 = "test\001n\001rtest"; char* s2 = t_str_tabunescape(s1); char* s3 = p_strdup(pool, s2); printf("1>%s\n", s1); printf("2>%s\n", s2); printf("3>%s\n", s3); // all three output the string with NL and CR Maybe I am missing a function call in the process or maybe the issue is related to the way unescaping is performed in the async callback function in "dict-client.c", or maybe even some other edge case. Finally, I was able to run the first snippet without bug by removing the string duplication in "t_str_tabunescape" (which I realize is not a proper solution), or by explicitely using the following pool: return str_tabunescape(p_strdup(pool_datastack_create(), str)); Hope this helps. kaiyou On 10/17/2016 07:51 AM, Aki Tuomi wrote:> Hi! > > This does sound like a bug, we'll have look. > > Aki > > > On 17.10.2016 01:26, Pierre Jaury wrote: >> I dived a little bit further into the rabbit hole, up to the point where >> debugging has become unpracticle but I still haven't found the root >> cause for sure. >> >> I read most of the code for "p_strdup" based on datastack memory pools >> (which are used for dictionary lookups both with doveadm and by extdata) >> and it seems ok. Still, after "t_malloc_real" is called in "t_malloc0", >> the allocated buffer has the same address as the source string. >> >> The only sensible explanation I can come up with is that during >> unescaping, strings are not allocated properly, leading to the memory >> pool reusing the string address and zeroing it in the process before the >> string copy operation. >> >> I will follow on this path tomorrow, any lead is more than welcome. >> >> kaiyou. >> >> On 10/16/2016 11:16 PM, Pierre Jaury wrote: >>> Hello, >>> >>> I am using a dict proxy for my sieve extdata plugin to access some >>> fields from an SQLite database (autoreply text and other >>> database-configured items). >>> >>> All tests are performed against version 2.2.25. >>> >>> $ dovecot --version >>> 2.2.25 (7be1766) >>> >>> My configuration looks like: >>> >>> dict { >>> sieve = sqlite:/etc/dovecot/pigeonhole-sieve.dict >>> } >>> >>> [...] >>> sieve_extdata_dict_uri = proxy::sieve >>> >>> I am able to read pretty much any attribute without any issue, except >>> when the value contains a special character like "\r" or "\n". By using >>> the doveadm dict client, I narrowed it down to the dictionary management >>> part (either server, protocol or client). >>> >>> I was suspecting escaping functions from "lib/strescape.c" (mostly >>> str_tabescape and its counterpart, used by "lib-dict/client.c"), so I >>> monitored socket communications. It seems that escaping is done properly >>> on the server and the socket is not an issue either. >>> >>> The following strace dump results from running "doveadm dict get" >>> against the dict socket: >>> >>> connect(8, {sa_family=AF_UNIX, sun_path="..."}, 110) = 0 >>> fstat(8, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 >>> [...] >>> write(8, "H2\t0\t0\tadmin at domain.tld\tsieve\n", 30) = 30 >>> [...] >>> read(8, "Otest\1r\1ntest\n", 8192) = 14 >>> >>> Indeed "\1r" and "\1n" are the escape sequences used by >>> "lib/strescape.c". I went deeped and debugged the call to "dict_lookup" >>> performed by doveadm. Indeed the client gets the proper string from the >>> socket and to my surprise, it is properly unescaped. >>> >>> Then, in "client_dict_lookup" ("lib-dict/dict-client.c"), the call to >>> "p_strdup" returns an empty string (null byte set at the target address). >>> >>> Before the call to the dict "->lookup" attribute (client_dict_lookup): >>> >>> RAX: 0x7ffff73a37c0 (push r14) >>> RBX: 0x6831b8 ("priv/reply_body") >>> RCX: 0x7fffffffe240 --> 0x682a60 --> 0x6831b8 ("priv/reply_body") >>> RDX: 0x6831b8 ("priv/reply_body") >>> RSI: 0x683288 --> 0x7ffff7653120 --> 0x7ffff73ea620 ([...]) >>> RDI: 0x690ad0 --> 0x7ffff7400713 --> 0x75250079786f7270 ('proxy') >>> >>> 0x7ffff73a1f10 <dict_lookup+32>: mov rcx,r11 (value_r) >>> 0x7ffff73a1f13 <dict_lookup+35>: mov rdx,r8 (key) >>> 0x7ffff73a1f16 <dict_lookup+38>: mov rsi,r10 (pool) >>> 0x7ffff73a1f19 <dict_lookup+41>: mov rdi,r9 (dict) >>> 0x7ffff73a1f1c <dict_lookup+44>: add rsp,0x8 >>> => 0x7ffff73a1f20 <dict_lookup+48>: jmp rax >>> >>> Before the call to p_strdup in "client_dict_lookup": >>> >>> RSI: 0x6832d8 ("test\r\ntest") (lookup.result.value) >>> RDI: 0x683288 --> 0x7ffff7653120 --> [...] (pool) >>> RAX: 0x0 (result) >>> >>> 0x7ffff73a384f: nop >>> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >>> 0x7ffff73a3855: mov rdi,r14 >>> => 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >>> 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >>> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >>> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >>> 0x7ffff73a386f: mov eax,ebx >>> >>> After the call: >>> >>> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >>> 0x7ffff73a3855: mov rdi,r14 >>> 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >>> => 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >>> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >>> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >>> 0x7ffff73a386f: mov eax,ebx >>> 0x7ffff73a3871: jne 0x7ffff73a38da >>> >>> RSI: 0x0 >>> RDI: 0x6832d8 --> 0x0 >>> RAX: 0x6832d8 --> 0x0 (result) >>> >>> It is worth noting that I can reproduce the exact same execution flow >>> with a non-multiline result string (lookup.result.value) that is >>> properly copied by "p_strdup" and returned in RAX, then displayed by >>> doveadm. >>> >>> I am not familiar with the pooling mechanism hidden behind the call to >>> p_strdump and not quite sure why this behaviour is emerging. Maybe I am >>> even miles away from an understanding of the issue here, but it sounds >>> to me like something is wrong in the way "p_strdup" performs the copy. >>> >>> Hope this helps, >>> kaiyou. >>> >>> >>>
Pierre Jaury
2016-Oct-17 13:14 UTC
Dict proxy client returning empty string instead of multiline string
Okay, it seems to me that the bug is due to "t_str_tabunescape" using the unsafe datastack ("t_strdup_noconst") while the string is actually returned in an async callback. Before it is handled by "client_dict_lookup", "client_dict_wait" actually fires some IO loops that are more than likely to call "t_pop" and free/flush the result string (I checked, it does call "t_pop" a couple times indeed). Maybe it would be safer to use a standard datastack pool when unescaping a string in that context. I could work on a patch that would: - add an optional "pool" attribute to the "client_dict_cmd" structure, - pass the pool to the async lookup function, - use the pool when escaping strings that should survive the callback chain What do you think? Regards, kaiyou On 10/17/2016 09:52 AM, Pierre Jaury wrote:> While trying to isolate properly and reproduce, I was able to trigger > the same bug with the following code: > > struct dict *dict; > char* dict_uri = "proxy::sieve"; > char* key = "priv/key"; > char* username = "admin at domain.tld"; > char* value, error; > > dict_drivers_register_builtin(); > dict_init(dict_uri, DICT_DATA_TYPE_STRING, username, > doveadm_settings->base_dir, &dict, &error); > dict_lookup(dict, pool_datastack_create(), key, &value); > printf(">%s\n", value); // outputs an empty string > dict_deinit(&dict); > > I trimmed it to the bare minimal string manipulation functions involved > but cannot reproduce in that case: > > pool_t pool = pool_datastack_create(); > > char* s1 = "test\001n\001rtest"; > char* s2 = t_str_tabunescape(s1); > char* s3 = p_strdup(pool, s2); > > printf("1>%s\n", s1); > printf("2>%s\n", s2); > printf("3>%s\n", s3); // all three output the string with NL and CR > > Maybe I am missing a function call in the process or maybe the issue is > related to the way unescaping is performed in the async callback > function in "dict-client.c", or maybe even some other edge case. > > Finally, I was able to run the first snippet without bug by removing the > string duplication in "t_str_tabunescape" (which I realize is not a > proper solution), or by explicitely using the following pool: > > return str_tabunescape(p_strdup(pool_datastack_create(), str)); > > > Hope this helps. > > kaiyou > > > On 10/17/2016 07:51 AM, Aki Tuomi wrote: >> Hi! >> >> This does sound like a bug, we'll have look. >> >> Aki >> >> >> On 17.10.2016 01:26, Pierre Jaury wrote: >>> I dived a little bit further into the rabbit hole, up to the point where >>> debugging has become unpracticle but I still haven't found the root >>> cause for sure. >>> >>> I read most of the code for "p_strdup" based on datastack memory pools >>> (which are used for dictionary lookups both with doveadm and by extdata) >>> and it seems ok. Still, after "t_malloc_real" is called in "t_malloc0", >>> the allocated buffer has the same address as the source string. >>> >>> The only sensible explanation I can come up with is that during >>> unescaping, strings are not allocated properly, leading to the memory >>> pool reusing the string address and zeroing it in the process before the >>> string copy operation. >>> >>> I will follow on this path tomorrow, any lead is more than welcome. >>> >>> kaiyou. >>> >>> On 10/16/2016 11:16 PM, Pierre Jaury wrote: >>>> Hello, >>>> >>>> I am using a dict proxy for my sieve extdata plugin to access some >>>> fields from an SQLite database (autoreply text and other >>>> database-configured items). >>>> >>>> All tests are performed against version 2.2.25. >>>> >>>> $ dovecot --version >>>> 2.2.25 (7be1766) >>>> >>>> My configuration looks like: >>>> >>>> dict { >>>> sieve = sqlite:/etc/dovecot/pigeonhole-sieve.dict >>>> } >>>> >>>> [...] >>>> sieve_extdata_dict_uri = proxy::sieve >>>> >>>> I am able to read pretty much any attribute without any issue, except >>>> when the value contains a special character like "\r" or "\n". By using >>>> the doveadm dict client, I narrowed it down to the dictionary management >>>> part (either server, protocol or client). >>>> >>>> I was suspecting escaping functions from "lib/strescape.c" (mostly >>>> str_tabescape and its counterpart, used by "lib-dict/client.c"), so I >>>> monitored socket communications. It seems that escaping is done properly >>>> on the server and the socket is not an issue either. >>>> >>>> The following strace dump results from running "doveadm dict get" >>>> against the dict socket: >>>> >>>> connect(8, {sa_family=AF_UNIX, sun_path="..."}, 110) = 0 >>>> fstat(8, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 >>>> [...] >>>> write(8, "H2\t0\t0\tadmin at domain.tld\tsieve\n", 30) = 30 >>>> [...] >>>> read(8, "Otest\1r\1ntest\n", 8192) = 14 >>>> >>>> Indeed "\1r" and "\1n" are the escape sequences used by >>>> "lib/strescape.c". I went deeped and debugged the call to "dict_lookup" >>>> performed by doveadm. Indeed the client gets the proper string from the >>>> socket and to my surprise, it is properly unescaped. >>>> >>>> Then, in "client_dict_lookup" ("lib-dict/dict-client.c"), the call to >>>> "p_strdup" returns an empty string (null byte set at the target address). >>>> >>>> Before the call to the dict "->lookup" attribute (client_dict_lookup): >>>> >>>> RAX: 0x7ffff73a37c0 (push r14) >>>> RBX: 0x6831b8 ("priv/reply_body") >>>> RCX: 0x7fffffffe240 --> 0x682a60 --> 0x6831b8 ("priv/reply_body") >>>> RDX: 0x6831b8 ("priv/reply_body") >>>> RSI: 0x683288 --> 0x7ffff7653120 --> 0x7ffff73ea620 ([...]) >>>> RDI: 0x690ad0 --> 0x7ffff7400713 --> 0x75250079786f7270 ('proxy') >>>> >>>> 0x7ffff73a1f10 <dict_lookup+32>: mov rcx,r11 (value_r) >>>> 0x7ffff73a1f13 <dict_lookup+35>: mov rdx,r8 (key) >>>> 0x7ffff73a1f16 <dict_lookup+38>: mov rsi,r10 (pool) >>>> 0x7ffff73a1f19 <dict_lookup+41>: mov rdi,r9 (dict) >>>> 0x7ffff73a1f1c <dict_lookup+44>: add rsp,0x8 >>>> => 0x7ffff73a1f20 <dict_lookup+48>: jmp rax >>>> >>>> Before the call to p_strdup in "client_dict_lookup": >>>> >>>> RSI: 0x6832d8 ("test\r\ntest") (lookup.result.value) >>>> RDI: 0x683288 --> 0x7ffff7653120 --> [...] (pool) >>>> RAX: 0x0 (result) >>>> >>>> 0x7ffff73a384f: nop >>>> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >>>> 0x7ffff73a3855: mov rdi,r14 >>>> => 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >>>> 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >>>> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >>>> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >>>> 0x7ffff73a386f: mov eax,ebx >>>> >>>> After the call: >>>> >>>> 0x7ffff73a3850: mov rsi,QWORD PTR [rsp+0x8] >>>> 0x7ffff73a3855: mov rdi,r14 >>>> 0x7ffff73a3858: call 0x7ffff736d3c0 <p_strdup at plt> >>>> => 0x7ffff73a385d: mov QWORD PTR [r13+0x0],rax >>>> 0x7ffff73a3861: mov rsi,QWORD PTR [rsp+0x18] >>>> 0x7ffff73a3866: xor rsi,QWORD PTR fs:0x28 >>>> 0x7ffff73a386f: mov eax,ebx >>>> 0x7ffff73a3871: jne 0x7ffff73a38da >>>> >>>> RSI: 0x0 >>>> RDI: 0x6832d8 --> 0x0 >>>> RAX: 0x6832d8 --> 0x0 (result) >>>> >>>> It is worth noting that I can reproduce the exact same execution flow >>>> with a non-multiline result string (lookup.result.value) that is >>>> properly copied by "p_strdup" and returned in RAX, then displayed by >>>> doveadm. >>>> >>>> I am not familiar with the pooling mechanism hidden behind the call to >>>> p_strdump and not quite sure why this behaviour is emerging. Maybe I am >>>> even miles away from an understanding of the issue here, but it sounds >>>> to me like something is wrong in the way "p_strdup" performs the copy. >>>> >>>> Hope this helps, >>>> kaiyou. >>>> >>>> >>>>