I would really like the bug fixed. At least this one, because I know people in my institute using this function.
I understand your arguments about open source, but I also saw on this mailing list a proposal for a fix for this bug which got no answer from the people who are able to include it in the distribution. It looks as if there are interesting bugs and the other ones.
I don't understand the other arguments: the example was reproduced with a simple USB key, and you cannot state that a disk will eternally be empty enough, especially when it has several users.

JLL

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Tuesday, 4 July 2017 14:24
To: Lipatz Jean-Luc; r-devel at r-project.org
Subject: Re: [Rd] write.csv

On 04/07/2017 5:40 AM, Lipatz Jean-Luc wrote:
> Hi all,
>
> I am currently studying how to generalize the usage of R in my statistical institute and I encountered a problem that I cannot declare on bugzilla (cannot understand why).

Bugzilla was badly abused by spammers last year, so you need to have your account created manually by one of the admins to post there. Write to me privately if you'd like me to create an account for you. (If you want it attached to a different email address, that's fine.)

> Sorry for trying this mailing list but I am really worried about the problem itself and the possible implications in using R in a professional data production context.
> The issue about 'write.csv' is that it just doesn't check if there is enough space on disk and doesn't report failure to write data.
>
> Example (R 3.4.0 windows 32 bits, but I reproduced the problem with older versions and under Mac OS/X)
>
> > fwrite(as.list(1:1000000),"G:/Test")
> Error in fwrite(as.list(1:1e+06), "G:/Test") :
>   No space left on device: 'G:/Test'
> > write.csv(1:1000000,"G:/Test")
> >
>
> I have a big concern here, because it means that you could save some important data at one point in time and discover a long time afterwards that you actually lost them.
> I suppose that the fix is relatively straightforward, but how can we be sure that there is not another function with the same bad properties?

R is open source. You could work out the patch for this bug, and in the process see the pattern of coding that leads to it. Then you'll know if other functions use the same buggy pattern.

> Is the lesson that you should not use an R function, even from the core, without having personally tested it against extreme conditions?

I think the answer to that is yes. Most people never write such big files that they fill their disk: if they did, all sorts of things would go wrong on their systems. So this kind of extreme condition isn't often tested. It's not easy to test in a platform independent way: R would need to be able to create a volume with a small capacity. That's a very system-dependent thing to do.

> And wouldn't it be the work of the developers to do such elementary tests?

Again, R is open source. You can and should contribute code (and therefore become one of the developers) if you are working in unusual conditions.

R states quite clearly in the welcome message every time it starts: "R is free software and comes with ABSOLUTELY NO WARRANTY." This is essentially the same lack of warranty that you get with commercial software, though it's stated a lot more clearly.

Duncan Murdoch
This doesn't really strike me as a bug. Lots of (most?) programming languages expect you to handle this as an error condition. If you tried the same thing in C you would get the same error.
On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
> I would really like the bug fixed. At least this one, because I know people in my institute using this function.
> I understand your arguments about open source, but I also saw in this mail list a proposal for a fix for this bug for which there were no answer from the people who are able to include it in the distribution. It looks like if there were interesting bugs and the other ones.

Please post a link to that, and I'll look. Bug reports should be posted to the bug list. It's unfortunate that it is currently so difficult to do so, but if they are only posted here, they are often overlooked.

> I don't understand the other arguments : the example was reproduced with a simple USB key and you cannot state that a disk will eternally be empty enough, specially when it has several users.

I am not denying that it's a bug, I'm just saying that it is a difficult one to test automatically (so we probably won't add a regression test once it's fixed), and it's not one that has been reported often. I didn't know there were any reports before yours.

Duncan Murdoch
On 04/07/2017 8:46 AM, Nathan Sosnovske wrote:
> This doesn't really strike me as a bug. Lots of (most?) programming languages expect you to handle this as an error condition. If you tried the same thing in C you would get the same error.

The bug is that there is no error signalled. It looks as though the write succeeded, when it didn't.

Duncan Murdoch
I tested this myself, and the "reason" why write.csv() is not giving any
error is that a file is created. I tested the following with a USB stick
containing only 32 MB of free space:
    write.csv(data.frame(V  = rnorm(2e7),
                         V2 = rnorm(2e7),
                         V3 = rnorm(2e7)),
              file = "G:/Test.csv")
    X <- read.csv("G:/Test.csv")
Gives:
> str(X)
'data.frame': 506336 obs. of 4 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ V : num 0.0666 -1.2052 -0.2288 -0.4758 1.9168 ...
$ V2: num -0.304 -1.766 -1.611 -0.221 -1.118 ...
$ V3: num -0.6774 0.0841 0.2062 1.7053 -0.2105 ...
So the first part of the data is actually stored. I totally agree that at
least a warning could be given to tell you that not all lines were saved.
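
To illustrate the kind of check meant here, a minimal sketch follows. It is not part of R and not a fix for write.csv() itself. The names count_lines() and write_csv_checked() are made up for this example, and it assumes that no field contains an embedded newline, so that the file should hold exactly nrow(df) + 1 lines (header plus data).

```r
# Count the lines of a file in chunks, so a large file is never held in memory at once.
count_lines <- function(file, chunk = 100000L) {
  con <- file(file, open = "r")
  on.exit(close(con))
  total <- 0
  repeat {
    n <- length(readLines(con, n = chunk, warn = FALSE))
    if (n == 0L) break
    total <- total + n
  }
  total
}

# Hypothetical wrapper: write the data frame, then verify the line count on disk.
write_csv_checked <- function(df, file, ...) {
  write.csv(df, file, ...)
  expected <- nrow(df) + 1  # data rows plus the header line
  found <- count_lines(file)
  if (found != expected) {
    stop(sprintf("'%s' has %.0f lines, expected %.0f: the write was probably truncated",
                 file, found, expected))
  }
  invisible(file)
}
```

A matching line count is necessary but not sufficient: a file cut off in the middle of its last line would still pass, so comparing file sizes or reading the data back is a stricter check.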
While Duncan's reaction might come off a bit direct, please understand that
they are not employees but volunteers. You can demand things from a
company, but in the case of R that's actually rather rude, even when not
intended that way.
Given my limited C skills and my wife hating it when I'm solving other
people's problems in the middle of the night, I'm not hacking in the R
core myself. But for now, I can offer you this very naive and, for big
datasets, very time-consuming function to check beforehand whether you have
enough space:
    testSpace <- function(df, dir) {
      # Rough character count of the data (ignores separators, quotes and row names)
      totchar <- do.call(sum,
                         lapply(df,
                                function(i) sum(nchar(as.character(i)))))
      # On Windows!
      path <- path.expand(dir)
      # Keep only the drive letter and colon, e.g. "G:"
      path <- gsub("(^[A-Z]{1}:)/.*", "\\1", path)
      disks <- system("wmic logicaldisk get freespace, caption",
                      intern = TRUE)
      available <- disks[grep(path, disks)]
      # Keep only the digits of the free-space figure (reported in bytes)
      available <- gsub("\\D", "", available)
      # Assume 2 bytes per char in UTF-8, which is very liberal
      # but not uncommon
      totchar * 2 < as.numeric(available)
    }
Gives after about half a minute:
> mydf <- data.frame(V=rnorm(1e7))
> testSpace(mydf, "G:/text.csv")
[1] FALSE
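
A possibly cheaper way to get the size estimate, sketched here under stated assumptions rather than offered as a tested replacement: write an evenly spaced sample of rows to a temporary file and scale its size up to the full data frame. The name estimate_csv_bytes() is hypothetical, and the sketch assumes the sampled rows are representative of the rest and that tempdir() is not on the disk that is nearly full.

```r
# Estimate the size in bytes of the CSV that write.csv(df) would produce,
# by writing a sample of rows to a temporary file and extrapolating.
estimate_csv_bytes <- function(df, n = 1000L) {
  n <- min(n, nrow(df))
  if (n == 0L) return(0)
  # Evenly spaced row indices, so short and long row names are both sampled
  idx <- unique(round(seq(1, nrow(df), length.out = n)))
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  write.csv(df[idx, , drop = FALSE], tmp)
  # Scale the sample size up to the full number of rows
  file.size(tmp) * nrow(df) / length(idx)
}
```

The result could then be compared with the free space reported for the target drive, in the same way as in testSpace() above.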
Best regards
Joris
--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics
tel : +32 (0)9 264 61 79
Joris.Meys at Ugent.be
On Linux at least, you can use `/dev/full` [1] to test writing to a full device.
> echo 'foo' > /dev/full
bash: echo: write error: No space left on device
That won't be a perfect test for this case, though, since here part of the
file is written successfully.
An alternative suggestion for testing this is to create and mount a
loop device [2] with a small file.
[1]: https://en.wikipedia.org/wiki//dev/full
[2]: https://stackoverflow.com/a/16044420/2055486
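
From within R, the `/dev/full` idea could be tried along these lines. This is only a sketch: signals_condition() is a made-up helper, not an R function, and no expected result is shown because the outcome depends on the R version and platform.

```r
# Returns TRUE if evaluating `expr` signals a warning or an error, FALSE otherwise.
# `expr` is evaluated lazily, inside tryCatch(), so its conditions are caught here.
signals_condition <- function(expr) {
  tryCatch({
    expr
    FALSE
  },
  warning = function(w) TRUE,
  error = function(e) TRUE)
}

# Only meaningful on systems that provide /dev/full (Linux)
if (file.exists("/dev/full")) {
  cat("write.csv signalled a condition:",
      signals_condition(write.csv(1:100000, "/dev/full")), "\n")
}
```

A FALSE here would mean the failed write went unreported, which is the behaviour discussed in this thread.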