Mikael Jagan
2023-Nov-22 18:14 UTC
[Rd] R-4.3 version list.files function could not work correctly in chinese
FWIW, a user on Stack Overflow just reported the same issue with list.files
running R 4.3.z on Windows. They do not observe the issue running R-devel,
with Tomas' patch (r84960). It is still the case that their file names did
not exceed 260 wide characters.
https://stackoverflow.com/q/77527167/12685768
Mikael
On 2023-08-17 6:00 am, r-devel-request at r-project.org wrote:
> Message: 5
> Date: Wed, 16 Aug 2023 16:00:13 +0200
> From: Tomas Kalibera <tomas.kalibera at gmail.com>
> To: Ivan Krylov <krylov.r00t at gmail.com>
> Cc: "r-devel at r-project.org" <r-devel at r-project.org>
> Subject: Re: [Rd] R-4.3 version list.files function could not work
>  correctly in chinese
> Message-ID: <21e91609-85b2-103b-8e23-12eadff62784 at gmail.com>
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>
>
> On 8/16/23 13:22, Ivan Krylov wrote:
>> On Wed, 16 Aug 2023 09:42:09 +0200
>> Tomas Kalibera<tomas.kalibera at gmail.com> wrote:
>>
>>> Fixed in R-devel (84960). Please let me know if you see any problem
>>> with the fix.
>> Thank you for implementing the fix! I gave ??? the link to the
>> GitHub Action build of the r84960 installer.
> Thanks and thanks for looking at the change.
>> I'm worried that ??? was seeing FindNextFileA fail for a different
>> reason (all the examples given at the Capital of Statistics forum
>> seemed to use less than 256/4 = 64 characters per file name...), but
>> maybe this won't reappear with the switch to FindNextFileW. If this
>> keeps happening, it might be worth producing a warning when
>> FindNextFileW() fails with an unexpected GetLastError() value.
> I've added a warning to R-devel when list.files() on Windows stops
> listing a directory due to an error.
>
> There is probably not more we can do unless there is a revised bug
> report of the original problem.
>
>> fs::dir_ls() uses NtQueryDirectoryFile() and WideCharToMultiByte()
>> instead of FindNextFileW() and wcstombs(), but maybe this shouldn't
>> matter. In particular, both list.files() and fs::dir_ls() would fail
>> given a file name that cannot be represented in UTF-8 (invalid UTF-16
>> surrogate pairs?)
> Right, R only supports file names that are valid strings. This assumption
> is present in many places in the code, so it is fine/consistent for it to
> hold here as well. The choice of opendir/readdir in R was probably
> motivated by minimization of platform-specific code.
>
> Best
> Tomas
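
The two encoding points quoted above (the 256/4 = 64 estimate and the
remark about names that cannot be represented in UTF-8) can be checked
with a minimal sketch. It is written in Python rather than R or C, the
helper names are hypothetical, and it only illustrates the encoding
arithmetic; it is not the code path that list.files() or fs actually use.

```python
# Hypothetical helpers for illustration only; not R's or fs's implementation.


def utf8_length(name: str) -> int:
    """Return the number of bytes the name occupies once encoded as UTF-8."""
    return len(name.encode("utf-8"))


def representable_in_utf8(name: str) -> bool:
    """Return False if the name holds an unpaired surrogate, which UTF-8 cannot encode."""
    try:
        name.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# UTF-8 needs at most 4 bytes per code point, so 64 characters
# fit in 256 bytes even in the worst case.
worst_case = "\U00020000" * 64  # CJK Extension B character, 4 bytes each
typical = "\u4e2d" * 64         # common CJK character, 3 bytes each

print(utf8_length(worst_case))                       # 256
print(utf8_length(typical))                          # 192
print(representable_in_utf8("\u6570\u636e.csv"))     # True
print(representable_in_utf8("bad\ud800name"))        # False
```

Common Chinese characters take 3 bytes each in UTF-8, so names shorter
than 64 characters stay well under 256 bytes, which is consistent with
the doubt expressed above that name length alone explains the original
report.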
Tomas Kalibera
2023-Nov-22 18:29 UTC
[Rd] R-4.3 version list.files function could not work correctly in chinese
On 11/22/23 19:14, Mikael Jagan wrote:
> FWIW, a user on Stack Overflow just reported the same issue with list.files
> running R 4.3.z on Windows. They do not observe the issue running R-devel,
> with Tomas' patch (r84960). It is still the case that their file names did
> not exceed 260 wide characters.
>
> https://stackoverflow.com/q/77527167/12685768

Great, thanks!

Tomas