Hello,
we are developing a library to show last arrived messages of all threads
in a folder with more than 300k messages.
As per the RFC 5256, the IMAP thread command has only the option to
specify the grouping criteria (REFERENCES vs ORDEREDSUBJECT). So we
implemented an algorithm which gets the full UID SORT ordered by date in
reverse order then gets all thread relations and then post process all
outputs to obtain the final result.
Once we have both output we loop through the first uids set (which is
ordered) and for every item of it we loop through the other uids set
with all the thread relations. If the uid of the first set is contained
in one of the elements of the thread messages, we pick it up and push it
in the resulting array.
This is a pseudo-code that shows what we are doing:
array_of_latest_messages_in_thread = []
array_of_sorted_uids = [n,n,n,n,...] // UID SORT (REVERSE DATE) UTF-8 ALL
array_of_thread_uids = [n,[n,n],n,[n,n,n],[n,n],n,n,...] // UID THREAD
(REFERENCES) UTF-8 ALL
foreach(array_of_sorted_uids as s_uid){
??? foreach(array_of_thread_uids as t_uids){
??????? if(t_uids contains s_uid){ // or a function to loop recursively
t_uids to search if a leaf is equal to s_uid
??????????? array_of_latest_messages_in_thread.push(s_uid)
??????????? break
??????? }
??? }
}
We have also made some little optimizations in the above code like for
example skipping from the outer loop the uids already processed in the
inner loop that have not being selected. Anyway, the loop is very
expensive in term of computation and we are wondering if there is a
better approach to this issue directly using IMAP instead of post
processing the output of the two commands at application level.
Thanks.
--
Alessio Cecchi
Postmaster @ http://www.qboxmail.it
https://www.linkedin.com/in/alessice
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<https://dovecot.org/pipermail/dovecot/attachments/20181002/3f31488f/attachment.html>
On 2 Oct 2018, at 12.22, Alessio Cecchi <alessio at skye.it> wrote:> > Hello, > > we are developing a library to show last arrived messages of all threads in a folder with more than 300k messages. > > As per the RFC 5256, the IMAP thread command has only the option to specify the grouping criteria (REFERENCES vs ORDEREDSUBJECT). So we implemented an algorithm which gets the full UID SORT ordered by date in reverse order then gets all thread relations and then post process all outputs to obtain the final result. >It sounds like what you want is REFS threading, which Dovecot supports, but it didn't make it into official RFC: https://tools.ietf.org/html/draft-gulbrandsen-imap-inthread-05 <https://tools.ietf.org/html/draft-gulbrandsen-imap-inthread-05> -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20181002/4ee3d8f0/attachment.html>
Il 02/10/2018 12:08, Timo Sirainen ha scritto:> On 2 Oct 2018, at 12.22, Alessio Cecchi <alessio at skye.it > <mailto:alessio at skye.it>> wrote: >> >> Hello, >> >> we are developing a library to show last arrived messages of all >> threads in a folder with more than 300k messages. >> >> As per the RFC 5256, the IMAP thread command has only the option to >> specify the grouping criteria (REFERENCES vs ORDEREDSUBJECT). So we >> implemented an algorithm which gets the full UID SORT ordered by date >> in reverse order then gets all thread relations and then post process >> all outputs to obtain the final result. >> > It sounds like what you want is REFS threading, which Dovecot > supports, but it didn't make it into official RFC: > https://tools.ietf.org/html/draft-gulbrandsen-imap-inthread-05 >Thanks Timo, you are the best! p.s. was a pleasure to meet you and the OX team in Rome last week -- Alessio Cecchi Postmaster @ http://www.qboxmail.it https://www.linkedin.com/in/alessice -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://dovecot.org/pipermail/dovecot/attachments/20181002/dfc3c024/attachment.html>