Hello

I am a new user of R and I need help merging two large datasets about stocks that have different numbers of rows and columns.
Both have two variables (columns) with the same values ("name" and "date1"), but they are not in the same order, and "data3" contains many more observations.
"data" holds dividend information about each stock, and "data3" holds information about the stock prices.

I tried to use this function:
data4<-merge(data,data3, by="name","date1")

But it does not work.

"data3" has more than 800.000 observations of daily stock prices for each stock, and "data" has only 14.000, so I want to merge the two datasets into a new dataset with the same length as "data" that also includes all the variables in "data3" where "name" (example: Statoil) and "date1" (example: 31.12.2000) are the same in both sets.

Can someone please help me?

Regards

--
View this message in context: http://r.789695.n4.nabble.com/merging-two-dataframes-tp3932869p3932869.html
Sent from the R help mailing list archive at Nabble.com.
Here is some more info:
http://r.789695.n4.nabble.com/file/n3932898/help.jpg

This is what happens when I run the function in R.

I would appreciate it if someone could give me some input here.

Thanks in advance
Hi

> I tried to use this function:
> data4<-merge(data,data3, by="name","date1")

Does

data4<-merge(data,data3, by=c("name","date1"), all=T)

do what you want? I think the merge help page is quite explanatory and you should look at it.

Regards
Petr

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
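To illustrate the suggested call, here is a minimal sketch on made-up toy data. The data frames `div_df` and `price_df` and all their values are hypothetical stand-ins for "data" and "data3". In merge(), `by` takes a single character vector, so both key columns go inside c(); `all = TRUE` then keeps unmatched rows from both inputs.

```r
# Hypothetical stand-ins for "data" (dividends) and "data3" (daily prices)
div_df <- data.frame(
  name  = c("statoil", "Yara"),
  date1 = c("17.05.2000", "17.05.2000"),
  div   = c(5, 10),
  stringsAsFactors = FALSE
)
price_df <- data.frame(
  name  = c("statoil", "statoil", "Yara", "Yara"),
  date1 = c("16.05.2000", "17.05.2000", "16.05.2000", "17.05.2000"),
  price = c(119, 120, 199, 200),
  stringsAsFactors = FALSE
)

# Both keys in one character vector; all = TRUE keeps every row of both inputs
full <- merge(div_df, price_df, by = c("name", "date1"), all = TRUE)

nrow(full)             # one row per distinct name/date1 pair across both inputs
sum(is.na(full$div))   # price rows without a dividend record get NA in div
```

With `all = TRUE` the result is at least as long as the larger input, so on the real data it would follow the length of "data3" rather than "data".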
Thank you Petr!

I have read the merge help page, but I can't figure out how to write this function.
When I use your function it includes all data from "data3", but all columns from "data" are "NA" (except "name" and "date"). I hoped to keep these values too.

I will try to explain it more precisely:
"data" has 14000 observations of company name, date, size of dividend etc.
"data3" has 800000 daily observations of the stock prices for each company listed:

name      date         div
statoil   17.05.2000   5
statoil   18.05.2001   6
...       ...          ...
Yara      17.05.2000   10
etc

I want to get the stock price for statoil, Yara etc. from "data3" and merge it into a new dataset like this:

name      date         div   price
statoil   17.05.2000   5     120
statoil   18.05.2001   6     130
...       ...          ...
Yara      17.05.2000   10    200
etc

And also keep the rest of the columns from both datasets.

name      date         div   price   industry   secid   etc
statoil   17.05.2000   5     120     ...        ...     ...
statoil   18.05.2001   6     130     ...        ...     ...
...       ...          ...
Yara      17.05.2000   10    200     ...        ...     ...
etc

Using the first function I get this result:

name      date         div   price   industry   secid   etc
statoil   17.05.2000   NA    120     ...        ...     ...
statoil   18.05.2001   NA    130     ...        ...     ...
...       ...          ...
Yara      17.05.2000   NA    200     ...        ...     ...
etc

I hope you understand what I want to do.

Thanks again, I really appreciate it!
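The result described above, one row per dividend record with the price columns attached, corresponds to what merge() calls `all.x = TRUE` (a left join). A minimal sketch using the example values from this message follows. The data frame names `dividends` and `prices`, the extra 16.05.2000 price row and the `industry` values are assumptions made for illustration.

```r
# Hypothetical stand-in for "data": one row per dividend
dividends <- data.frame(
  name = c("statoil", "statoil", "Yara"),
  date = c("17.05.2000", "18.05.2001", "17.05.2000"),
  div  = c(5, 6, 10),
  stringsAsFactors = FALSE
)
# Hypothetical stand-in for "data3": daily prices, more rows than dividends
prices <- data.frame(
  name     = c("statoil", "statoil", "statoil", "Yara"),
  date     = c("16.05.2000", "17.05.2000", "18.05.2001", "17.05.2000"),
  price    = c(119, 120, 130, 200),
  industry = c("energy", "energy", "energy", "chemicals"),  # made-up values
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of dividends; price rows with no
# matching dividend record are dropped
left <- merge(dividends, prices, by = c("name", "date"), all.x = TRUE)

# Same length as dividends, provided each name/date pair occurs
# at most once in prices
nrow(left) == nrow(dividends)
left[, c("name", "date", "div", "price", "industry")]
```

If a name/date pair appeared more than once in the price data, the merged result would contain one row per match and so be longer than the dividend data.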
> Thank you Petr!
>
> I have read on the merge help page, but I cant figure out how to write this
> function.
> When I use your function it includes all data from "data3", but all columns
> in "data" has "NA"(without "name" and "date". I hoped to keep these values
> to.

Because name and date are common columns and you do not need them twice.

Merge should work as you expect provided the input data are OK. The result of the command I suggested should be all values from both data frames aligned by name and date. It is impossible to say what is wrong without reproducible data.

Did you check the size of your data frames after the merge? I suspect that date in data and date in data3 are somewhat different. You should check it with the str function (see ?str).

As the Posting guide says, you should provide some suitable data so that we can look at it and work with it. Something like

dput(data[1:20,])

could be enough.

Regards
Petr
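The check suggested above can be sketched as follows, on made-up data where the two date columns are stored as text in different formats. All object names, values and the two date formats are hypothetical; the actual cause in the original data is not known from this thread. str() shows how each column is stored, and converting both keys to class Date before merging lets the rows line up.

```r
dividends <- data.frame(
  name = c("statoil", "Yara"),
  date = c("17.05.2000", "17.05.2000"),   # text in dd.mm.yyyy form
  div  = c(5, 10),
  stringsAsFactors = FALSE
)
prices <- data.frame(
  name  = c("statoil", "Yara"),
  date  = c("2000-05-17", "2000-05-17"),  # text in yyyy-mm-dd form
  price = c(120, 200),
  stringsAsFactors = FALSE
)

# Inspect how each key column is stored
str(dividends)
str(prices)

# The date strings differ, so no dividend row finds a price
bad <- merge(dividends, prices, by = c("name", "date"), all.x = TRUE)
all(is.na(bad$price))

# Convert both date columns to class Date, then merge again
dividends$date <- as.Date(dividends$date, format = "%d.%m.%Y")
prices$date    <- as.Date(prices$date,    format = "%Y-%m-%d")
good <- merge(dividends, prices, by = c("name", "date"), all.x = TRUE)
anyNA(good$price)

# A small reproducible sample that could be posted to the list
dput(head(dividends, 20))
```

The same kind of comparison applies to the name column, where differences in case or stray whitespace would also prevent rows from matching.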