lucy88
2012-Oct-04 11:18 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hello,

I have two different files which I'd like to combine to make one data frame, but I've no idea how to do it! The first file has two columns: one is the date, the other is a binary code for debris flow events. My other file also has two columns: the date and the precipitation data.

The problem is that the two date columns don't contain the same dates. The binary file has every day from April to October for 1900 to 2005, whereas the precipitation file has dates from, say, 1911 to 2004, with some data missing in certain months and years.

So my question is how to make a data frame which would have the date, the binary 0 or 1, and then the corresponding precip value for that particular date. I only want the precip information for the days where I have information in the binary file; the others can be disregarded.

I have tried code which I found in answers to other questions, but none of it works for my problem. If I'm honest, I don't really know if this is what I need. I'm hoping to end up doing a logistic regression. I've uploaded the two files in case I've not been very clear...

I'd be really grateful if anyone could help me and suggest a way to do it! I'm also really not very technical and am not at all comfortable with R, so if you could be really basic in your advice I'd appreciate it!

Many thanks in advance,
Lucy

Landeck_vec.txt <http://r.789695.n4.nabble.com/file/n4644986/Landeck_vec.txt>
Kaurnetal_vec.txt <http://r.789695.n4.nabble.com/file/n4644986/Kaurnetal_vec.txt>

--
View this message in context: http://r.789695.n4.nabble.com/R-combining-vectors-into-a-data-frame-but-without-a-continuous-common-variable-tp4644986.html
Sent from the R help mailing list archive at Nabble.com.
lucy88
2012-Oct-04 17:22 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Oh my word, you're a genius!! That is absolutely perfect, thank you so much!! I've no idea how you've learnt these things but I would never ever have been able to do that. You've just made my day so much better after the horror of confusion before. Thank you!!
Rui Barradas
2012-Oct-04 17:42 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hello,

Try the following.

url1 <- "http://r.789695.n4.nabble.com/file/n4644986/Landeck_vec.txt"
url2 <- "http://r.789695.n4.nabble.com/file/n4644986/Kaurnetal_vec.txt"
dat1 <- read.table(url1, header = TRUE)
dat2 <- read.table(url2, header = TRUE)
str(dat1)
str(dat2)

# Precip is read as a factor (as character in R >= 4.0), so convert it to numeric.
# as.character() makes the conversion work in both cases; entries that are
# not numbers become NA, with a warning.
dat2$Precip <- as.numeric(as.character(dat2$Precip))

dat1$Landeck <- as.Date(dat1$Landeck, format = "%d.%m.%Y")
dat2$Date <- as.Date(dat2$Date, format = "%d.%m.%Y")

dat3 <- merge(dat1, dat2, by.x = "Landeck", by.y = "Date")
str(dat3)
head(dat3, 20)  # See first 20 rows

Hope this helps,

Rui Barradas
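A small sketch of what merge() does here, on invented toy data rather than the uploaded files: by default it keeps only the dates present in both data frames (an inner join). If every day from the event file should be kept, with NA where no precipitation reading exists, all.x = TRUE is one way to get that. The object names events, precip, inner and left are made up for illustration; the column names follow the reply above.

```r
# Toy stand-ins for the two files (values invented for illustration)
events <- data.frame(
  Landeck = as.Date(c("1911-04-01", "1911-04-02", "1911-04-03")),
  Event   = c(0L, 1L, 0L)
)
precip <- data.frame(
  Date   = as.Date(c("1911-04-02", "1911-04-03", "1911-04-04")),
  Precip = c(12.5, 0, 3.1)
)

# Inner join (the default): only dates found in both data frames
inner <- merge(events, precip, by.x = "Landeck", by.y = "Date")

# Left join: every event date is kept; Precip is NA where no reading exists
left <- merge(events, precip, by.x = "Landeck", by.y = "Date", all.x = TRUE)

print(inner)
print(left)
```

For a regression, rows with NA in Precip would be dropped by glm() anyway under the default na.action, so the two variants differ mainly in how visible the gaps are in the merged table.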
arun
2012-Oct-04 19:03 UTC
[R] R combining vectors into a data frame but without a continuous common variable
Hi Lucy,
No problem.
Just a correction to my earlier email.
dat1<-read.table("Landeck_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-read.table("Kaurnetal_vec.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
colnames(dat1)[1]<-"Date"
(Rui:
#dat2 Date format is inconsistent.)
dat2$Date<-gsub("\\.","\\/",dat2$Date)
dat1$Date<-as.POSIXct(dat1$Date,format="%d.%m.%Y")
dat2$Date<-as.POSIXct(dat2$Date,format="%d/%m/%Y")
str(dat1)
#'data.frame':    22623 obs. of  2 variables:
# $ Date : POSIXct, format: "1900-04-01" "1900-04-02" ...
# $ Event: int  0 0 0 0 0 0 0 0 0 0 ...
str(dat2)
#'data.frame':    36598 obs. of  2 variables:
# $ Date  : POSIXct, format: "1900-01-01" "1900-01-02" ...
# $ Precip: chr  "0" "0" "0" "0" ...
Precip is "character", so I convert it to numeric:
dat2<-within(dat2,{Precip<-as.numeric(Precip)})
#Warning message:
#In eval(expr, envir, enclos) : NAs introduced by coercion
The warning appears because some data points contain non-numeric characters.
which(is.na(dat2$Precip))
# [1]  7060  8584  8798 11235 12848 13701 14006 14038 14098 14311 16016 16748
#[13] 18575 19307 19489 19702 19764 21196
dat2[8584,]
#           Date Precip
#8584 1923-09-01     NA
When I looked into the data, I found this ("Lücke" is German for "gap"):
01/09/1923 Lücke
# count() is not in base R; the x/freq output format matches plyr::count()
count(is.na(dat2$Precip))
#      x  freq
#1 FALSE 36580
#2  TRUE    18
#Removed those rows.
dat3<-subset(dat2,!is.na(Precip))
nrow(dat3)
#[1] 36580
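Before dropping rows, it can help to see exactly which raw entries failed to convert. The following is a rough sketch on an invented character vector, not the real Precip column; the names precip_raw, precip_num and bad are hypothetical, and "gap" stands in for whatever non-numeric marker the file contains.

```r
# Toy column mimicking the problem: numbers stored as text plus stray markers
# (values invented for illustration)
precip_raw <- c("0", "15.6", "gap", "3.0", "", "23.2")

# Coerce, silencing the expected "NAs introduced by coercion" warning
precip_num <- suppressWarnings(as.numeric(precip_raw))

# Entries that were present as text but could not be read as numbers
bad <- is.na(precip_num) & !is.na(precip_raw)

which(bad)                # positions of the offending entries
unique(precip_raw[bad])   # the distinct raw strings that failed
table(bad)                # base-R tally of good and bad entries
```

Looking at unique(precip_raw[bad]) shows whether the failures are all the same gap marker or a mix of different problems, which is worth knowing before deciding to remove them.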
dat4<-merge(dat1,dat3,by="Date")
dat5<-subset(dat4,Event!=0)
nrow(dat5)
#[1] 132
rownames(dat5)<-1:nrow(dat5)
head(dat5)
#        Date Event Precip
#1 1901-06-02     1    0.0
#2 1905-06-02     1    0.0
#3 1906-08-03     1   15.6
#4 1908-05-08     1    0.0
#5 1911-06-02     1    3.0
#6 1911-09-15     1   23.2
A.K.
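Since the original question mentions a logistic regression as the end goal, here is a hedged sketch of that step on simulated data. Note that a data frame restricted to Event != 0, like dat5 above, contains only ones and cannot be used to fit such a model; the full merged table (dat4 above) would be the input. Everything below is invented for illustration: the name merged, the sample size, and the assumed relationship between Precip and Event.

```r
set.seed(1)  # simulated data only; not the values from the uploaded files

# Stand-in for the merged data frame: one row per day, Event is 0 or 1
n <- 500
merged <- data.frame(Precip = round(rexp(n, rate = 1 / 5), 1))

# Make events more likely on wet days (an arbitrary relationship, for illustration)
merged$Event <- rbinom(n, size = 1, prob = plogis(-3 + 0.15 * merged$Precip))

# Logistic regression of event occurrence on daily precipitation
fit <- glm(Event ~ Precip, data = merged, family = binomial)
summary(fit)

# Predicted event probability at Precip = 0 and at Precip = 30
predict(fit, newdata = data.frame(Precip = c(0, 30)), type = "response")
```

With the real data the same call would be glm(Event ~ Precip, data = dat4, family = binomial), assuming the column names shown in the output above.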
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.