lucy88
2012-Oct-04 11:18 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hello,

I have two different files which I'd like to combine to make one data frame but I've no idea how to do it! The first file has two columns; one is the date, the other is a binary code for debris flow events. My other file also has two columns; the date and then precipitation data.

The thing is that the two date columns don't all contain the same dates. The binary one is every day from April - October from 1900 - 2005, yet the precipitation file has dates from, say, 1911 to 2004, with some missing data in certain months and during certain years.

So my question is how to make a data frame which would have the date, the binary 0 or 1, and then the corresponding precip value for that particular date. I only want the precip information for the days where I have information in the binary file; the others can be disregarded.

I have tried using code which I found in answers to other questions but none of it works with my issue. If I'm honest I don't really know if this is what I need. I'm hoping to end up doing a logistic regression. I've uploaded the two files in case I've not been very clear...

I'd be really grateful if anyone could help me and suggest a way to do it! I'm also really not very technical and am not at all comfortable with R so if you could be really basic in your advice I'd appreciate it!

Many thanks in advance,
Lucy

Landeck_vec.txt <http://r.789695.n4.nabble.com/file/n4644986/Landeck_vec.txt>
Kaurnetal_vec.txt <http://r.789695.n4.nabble.com/file/n4644986/Kaurnetal_vec.txt>

--
View this message in context: http://r.789695.n4.nabble.com/R-combining-vectors-into-a-data-frame-but-without-a-continuous-common-variable-tp4644986.html
Sent from the R help mailing list archive at Nabble.com.
lucy88
2012-Oct-04 17:22 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Oh my word, you're a genius!! That is absolutely perfect, thank you so much!! I've no idea how you've learnt these things but I would never ever have been able to do that. You've just made my day so much better after the horror of confusion before. Thank you!!
Rui Barradas
2012-Oct-04 17:42 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hello,

Try the following.

url1 <- "http://r.789695.n4.nabble.com/file/n4644986/Landeck_vec.txt"
url2 <- "http://r.789695.n4.nabble.com/file/n4644986/Kaurnetal_vec.txt"

dat1 <- read.table(url1, header = TRUE)
dat2 <- read.table(url2, header = TRUE)
str(dat1)
str(dat2)

# Precip is read as a factor (or as character in R >= 4.0), so convert to numeric.
# as.character() makes the conversion work in both cases; entries that are
# not numbers become NA with a warning.
dat2$Precip <- as.numeric(as.character(dat2$Precip))

dat1$Landeck <- as.Date(dat1$Landeck, format = "%d.%m.%Y")
dat2$Date <- as.Date(dat2$Date, format = "%d.%m.%Y")

dat3 <- merge(dat1, dat2, by.x = "Landeck", by.y = "Date")
str(dat3)
head(dat3, 20) # See first 20 rows

Hope this helps,

Rui Barradas
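To make the behaviour of merge() easier to see, here is a minimal sketch on made-up data. The data frames `events` and `precip` and all their values are assumptions for illustration only, not the contents of the two uploaded files. It contrasts the default merge(), which keeps only dates found in both data frames, with `all.x = TRUE`, which keeps every day of the first data frame and fills in NA where no precipitation record exists.

```r
# Toy stand-ins for the two files (made-up values, not the real data)
events <- data.frame(
  Date  = as.Date(c("1911-06-01", "1911-06-02", "1911-06-03", "1911-06-04")),
  Event = c(0L, 1L, 0L, 0L)
)
precip <- data.frame(
  Date   = as.Date(c("1911-06-02", "1911-06-03", "1911-06-05")),
  Precip = c(3.0, 0.0, 12.4)
)

# Default merge(): keeps only the dates present in BOTH data frames
both <- merge(events, precip, by = "Date")

# all.x = TRUE: keeps every row of 'events'; Precip is NA on days
# that have no matching precipitation record
all_events <- merge(events, precip, by = "Date", all.x = TRUE)

print(both)
print(all_events)
```

Which variant is preferable probably depends on the analysis: for a logistic regression, rows with NA in Precip would be dropped by the model fit anyway, so the default merge() may be all that is needed.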
arun
2012-Oct-04 19:03 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hi Lucy,

No problem. Just a correction to my earlier email.

dat1<-read.table("Landeck_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-read.table("Kaurnetal_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
colnames(dat1)[1]<-"Date"

(Rui: the Date format in dat2 is inconsistent.)

dat2$Date<-gsub("\\.","\\/",dat2$Date)
dat1$Date<-as.POSIXct(dat1$Date,format="%d.%m.%Y")
dat2$Date<-as.POSIXct(dat2$Date,format="%d/%m/%Y")
str(dat1)
#'data.frame':    22623 obs. of  2 variables:
# $ Date : POSIXct, format: "1900-04-01" "1900-04-02" ...
# $ Event: int  0 0 0 0 0 0 0 0 0 0 ...
str(dat2)
#'data.frame':    36598 obs. of  2 variables:
# $ Date  : POSIXct, format: "1900-01-01" "1900-01-02" ...
# $ Precip: chr  "0" "0" "0" "0" ...

Precip is "character", so I convert it to numeric:

dat2<-within(dat2,{Precip<-as.numeric(Precip)})
#Warning message:
#In eval(expr, envir, enclos) : NAs introduced by coercion

The reason for the warning is that some data points contain unusual characters.

which(is.na(dat2$Precip))
# [1]  7060  8584  8798 11235 12848 13701 14006 14038 14098 14311 16016 16748
#[13] 18575 19307 19489 19702 19764 21196
dat2[8584,]
#           Date Precip
#8584 1923-09-01     NA

When I looked into the data, I found this:
01/09/1923 Lücke

# count() is not base R; the output below matches plyr::count(), so plyr is assumed here
library(plyr)
count(is.na(dat2$Precip))
#      x  freq
#1 FALSE 36580
#2  TRUE    18

#Removed those rows.
dat3<-subset(dat2,!is.na(Precip))
nrow(dat3)
#[1] 36580
dat4<-merge(dat1,dat3,by="Date")
dat5<-subset(dat4,Event!=0)
nrow(dat5)
#[1] 132
rownames(dat5)<-1:nrow(dat5)
head(dat5)
#        Date Event Precip
#1 1901-06-02     1    0.0
#2 1905-06-02     1    0.0
#3 1906-08-03     1   15.6
#4 1908-05-08     1    0.0
#5 1911-06-02     1    3.0
#6 1911-09-15     1   23.2

A.K.
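The two clean-up steps described above (making the date separator consistent, and finding the entries that do not convert to numbers) can be tried on a small made-up example first. This is only a sketch: the data frame `raw`, its values, and the placeholder text "gap" are assumptions for illustration and are not taken from the real files. It uses as.Date() rather than as.POSIXct(), since the data are daily.

```r
# Toy stand-in for the precipitation file (made-up values); the dates mix
# "." and "/" as separators, and one Precip entry is text, not a number
raw <- data.frame(
  Date   = c("31.08.1923", "01/09/1923", "02.09.1923"),
  Precip = c("0", "gap", "4.2"),
  stringsAsFactors = FALSE
)

# Make the separator consistent first, then parse with a single format
raw$Date <- as.Date(gsub("/", ".", raw$Date, fixed = TRUE), format = "%d.%m.%Y")

# Convert to numeric; suppressWarnings() hides the expected
# "NAs introduced by coercion" warning
num <- suppressWarnings(as.numeric(raw$Precip))

# Rows whose original text was present but did not convert to a number
bad <- which(is.na(num) & !is.na(raw$Precip))
print(raw[bad, ])

# Store the numeric values and drop the rows that could not be converted
raw$Precip <- num
clean <- raw[!is.na(raw$Precip), ]
print(clean)
```

Printing the offending rows before dropping them makes it possible to check whether they are genuine gaps or values that could be repaired by hand.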
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.