> On 13/06/2022 02:09 gravitini <dovecot at gravitini.com> wrote:
> 
>  
> Replying to: https://dovecot.org/pipermail/dovecot/2022-May/124816.html
> 
> 
> Hi,
> 
> Looking at the code (and tested via local build from source) it looks 
> like doveadm deduplicate in 2.3.19 can cause significant data loss.
> 
> A 2022-02-11 commit removed the key duplication, resulting in undefined 
> behaviour, which often shows up as truncation of a mailbox to 67 entries 
> (HASH_TABLE_MIN_SIZE).
> 
> https://github.com/dovecot/core/commit/320844f50cd669b602d30210e2e5216f65d2050f?diff=split#diff-5842cf9d4248dc515d80ebb45575341b7d76832f979a8ac5f602784cb5b03f2cL121
> 
> diff --git a/src/doveadm/doveadm-mail-deduplicate.c b/src/doveadm/doveadm-mail-deduplicate.c
> index caec758112..2152482876 100644
> --- a/src/doveadm/doveadm-mail-deduplicate.c
> +++ b/src/doveadm/doveadm-mail-deduplicate.c
> @@ -63,8 +63,10 @@ cmd_deduplicate_box(struct doveadm_mail_cmd_context *_ctx,
>  		if (key != NULL && *key != '\0') {
>  			if (hash_table_lookup(hash, key) != NULL)
>  				mail_expunge(mail);
> -			else
> +			else {
> +				key = p_strdup(pool, key);
>  				hash_table_insert(hash, key, POINTER_CAST(1));
> +			}
>  		}
>  	}
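
For anyone following along, here is a minimal sketch of why the copy in the patch above matters. It is not Dovecot's hash table code: NBUCKETS, str_hash(), copy_string() and count_kept() are made-up names, the table never resizes, and the reused buffer only stands in for key storage that does not outlive one loop iteration. The sketch assumes a table that stores the key pointer rather than the key bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 67	/* hypothetical fixed size; this sketch never resizes */

struct node {
	const char *key;	/* the table keeps the pointer, not the bytes */
	struct node *next;
};

static unsigned int str_hash(const char *s)
{
	unsigned int h = 5381;

	while (*s != '\0')
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/* Portable stand-in for strdup(): returns a heap copy of s. */
static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy == NULL)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Feeds n unique keys through the table and returns how many are kept,
   i.e. not mistaken for duplicates. Every key is formatted into the same
   buffer, so without a copy all stored keys alias that buffer. */
static size_t count_kept(size_t n, int copy_keys)
{
	struct node *buckets[NBUCKETS] = { NULL };
	char buf[32];
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		snprintf(buf, sizeof(buf), "msg-%zu", i);
		unsigned int b = str_hash(buf) % NBUCKETS;

		struct node *e = buckets[b];
		while (e != NULL && strcmp(e->key, buf) != 0)
			e = e->next;
		if (e != NULL)
			continue;	/* treated as a duplicate, would be expunged */

		struct node *fresh = malloc(sizeof(*fresh));
		if (fresh == NULL)
			abort();
		fresh->key = copy_keys ? copy_string(buf) : buf;
		fresh->next = buckets[b];
		buckets[b] = fresh;
		kept++;
	}

	/* Release the nodes, and the key copies if any were made. */
	for (size_t b = 0; b < NBUCKETS; b++) {
		while (buckets[b] != NULL) {
			struct node *next = buckets[b]->next;
			if (copy_keys)
				free((char *)buckets[b]->key);
			free(buckets[b]);
			buckets[b] = next;
		}
	}
	return kept;
}
```

With copy_keys set, count_kept(200, 1) keeps all 200 unique keys. Without the copy, every stored key points at the same buffer, so any non-empty bucket compares equal to the current key and count_kept(200, 0) can never keep more than NBUCKETS entries. Whether this is exactly the mechanism behind the 67-entry truncation in 2.3.19 is an assumption on my part.
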

Thank you both for the report. We'll look into this!

Aki