Alexandra Catena
2015-Apr-10 20:07 UTC
[R] Finding values in a dataframe at a specified hour
Hello,

I have a large dataframe (windHW) of wind speeds (ws) at each hour from many days over a set of years. Some of these values are obviously wrong (600 m/s), and I want to get rid of all the values that are larger than 5*sigma for each hour. The 5*sigma values (variable name sigma5) are located in different dataframes for each season, with each dataframe named after a season. For example, in the dataframe spring, the 5*sigma value is 79.6 m/s for hour 1.

So my question is as follows: how can I find all the wind speed values in windHW at a specific hour that are higher than the 5*sigma value for that hour? For example, I would like to find whether any of the wind speed values at hour 1 are higher than 79.6 m/s and, if so, replace those values with NA.

I have something like this, but I can't figure out how to make it work for specific hours:

    windHW$ws[windHW$ws>=spring$sigma5] <- NA

I imported the data using readLines into the dataframe windHW. I am using R version 3.1.1.

Any help would be appreciated!

Thanks,
Alexandra
Alexandra Catena
2015-Apr-10 21:06 UTC
[R] Finding values in a dataframe at a specified hour
Update:

I have this so far. The first column of windHW is the wind speed. The 5th column of the dataframe spring is the 5*sigma value for every hour. hourRow holds all the rows of wind speed at a given hour.

    for (i in 0:23){
      hourRow = which(windHW$hour==i,arr.ind=TRUE)
      for (h in hourRow){
        if (windHW[h,1]>=spring[spring$hour==i,5]){
          windHW[h,1]<-NA}
      }
    }

This then gives the error:

    Error in if (windHW[h, 1] > spring[spring$hour == i, 5]) { :
      argument is of length zero

Note: the dataframe for each season has 24 rows, one for each hour of the day (0:23).

Thanks,
Alexandra
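The nested loop above can also be written without loops. The following is a minimal sketch, assuming toy data invented for illustration and assuming that both dataframes have a column named hour and that spring has a column named sigma5 (the thread refers to these columns only by position). It uses match() to look up the threshold for each row's hour, then replaces offending values in one step.

```r
# Hypothetical toy data; column names follow the thread (ws, hour, sigma5).
windHW <- data.frame(
  ws   = c(5.2, 600, 7.1, 80.0, 3.3),
  hour = c(0, 0, 1, 1, 23)
)
spring <- data.frame(
  hour   = 0:23,
  sigma5 = rep(79.6, 24)
)

# For every row of windHW, find the row of spring with the same hour.
idx <- match(windHW$hour, spring$hour)

# Per-row threshold; NA where an hour has no match in spring.
threshold <- spring$sigma5[idx]

# Flag values at or above the threshold; which() drops NA comparisons.
too_high <- which(windHW$ws >= threshold)
windHW$ws[too_high] <- NA

print(windHW)
```

Because the threshold vector has one entry per row of windHW, the comparison lines up hour by hour instead of recycling the 24 sigma5 values down the column, which is what the original one-liner would do.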
Hi Alexandra,

The error probably comes from the first iteration of i in 0:23. As indexing in R begins at 1, there is no element 0. Try using:

    for(i in 1:24) {
    ...

and see what happens.

Jim
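To check where the "argument is of length zero" message comes from, it can help to reproduce it in isolation. The sketch below assumes a hypothetical spring table whose hour column runs 1:24 rather than 0:23; that is only one possible cause, and a missing or differently named hour column would produce the same empty lookup. The data are invented for illustration.

```r
# Hypothetical lookup table whose hours run 1:24 instead of 0:23.
spring <- data.frame(hour = 1:24, sigma5 = rep(79.6, 24))

# No row has hour 0, so the lookup returns a zero-length vector.
limit <- spring[spring$hour == 0, "sigma5"]
length(limit)   # 0

# Using a zero-length condition in if() raises the error from the thread.
msg <- tryCatch(
  if (600 >= limit) "too high" else "ok",
  error = function(e) conditionMessage(e)
)
print(msg)

# A quick check of which hours are missing from the lookup table.
missing_hours <- setdiff(0:23, spring$hour)
print(missing_hours)   # 0
```

Running the setdiff() check (and names(spring)) against the real seasonal dataframes shows whether the hour values in the two tables actually line up before the loop range is changed.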
Hi,

I presume your data frames are not big. What about merging them by hour and comparing the appropriate columns? Something like:

    windMerged <- merge(windHW, spring, by = "hour", all=TRUE)
    sel <- which(windMerged[, xx] >= windMerged[, yy])
    windMerged[sel, xx] <- NA

Here xx and yy stand for the columns holding the wind speed and the 5*sigma value.

Untested because of lack of data.

Cheers
Petr
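The merge suggestion above can be sketched with concrete column names. This is a minimal example under stated assumptions: the toy data are invented, and the column names ws, hour, and sigma5 are taken from the thread's description rather than from the real dataframes. It also uses all.x = TRUE instead of all = TRUE, which is a deliberate variation so that hours present only in spring do not add empty rows.

```r
# Hypothetical toy data; column names follow the thread.
windHW <- data.frame(
  ws   = c(5.2, 600, 7.1, 80.0, 3.3),
  hour = c(0, 0, 1, 1, 23)
)
spring <- data.frame(hour = 0:23, sigma5 = rep(79.6, 24))

# Keep every wind observation and attach the threshold for its hour.
# all.x = TRUE keeps only rows that exist in windHW.
windMerged <- merge(windHW, spring[, c("hour", "sigma5")],
                    by = "hour", all.x = TRUE)

# Rows whose wind speed is at or above the hourly threshold.
sel <- which(windMerged$ws >= windMerged$sigma5)
windMerged$ws[sel] <- NA

print(windMerged)
```

Note that merge() sorts the result by the key column by default, so the rows of windMerged may not be in the same order as windHW. If the original order matters, a row identifier can be added before merging and used to re-sort afterwards.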