Dear list users,
I have two data frames df1 and df2, where the columns of df1 are

Sensor_RM Place_RM Station_RM Y_init_RM M_init_RM D_init_RM Y_fin_RM M_fin_RM D_fin_RM

and the columns of df2 are

Sensor_RM Station_RM Place_RM Province_RM Region_RM Net_init_RM GaussBoaga_EST_RM GaussBoaga_NORD_RM Gradi_Long_RM Primi_Long_RM Secondi_Long_RM Gradi_Lat_RM Primi_Lat_RM Secondi_Lat_RM Long_Cent_RM Lat_Cent_RM Height_RM

When I merge the two data frames through

  df3 <- merge(df1, df2, by=c("Sensor_RM", "Station_RM"))

I get a new data frame with columns

Sensor_RM Station_RM Place_RM.x Y_init_RM M_init_RM D_init_RM Y_fin_RM M_fin_RM D_fin_RM Place_RM.y Province_RM Region_RM Net_init_RM GaussBoaga_EST_RM GaussBoaga_NORD_RM Gradi_Long_RM Primi_Long_RM Secondi_Long_RM Gradi_Lat_RM Primi_Lat_RM Secondi_Lat_RM Long_Cent_RM Lat_Cent_RM Height_RM

I am sure that df1$Place_RM and df2$Place_RM are equal. I checked it from the shell using awk and diff.
Why then do I have a duplicate of Place_RM, i.e. Place_RM.x and Place_RM.y, and only of that column?

Thank you for your help
Stefano

On Mar 13, 2014, at 10:19 AM, Stefano Sofia <stefano.sofia at regione.marche.it> wrote:

> Dear list users,
> I have two data frames df1 and df2, where the columns of df1 are
>
> Sensor_RM Place_RM Station_RM Y_init_RM M_init_RM D_init_RM Y_fin_RM M_fin_RM D_fin_RM
>
> and the columns of df2 are
>
> Sensor_RM Station_RM Place_RM Province_RM Region_RM Net_init_RM GaussBoaga_EST_RM GaussBoaga_NORD_RM Gradi_Long_RM Primi_Long_RM Secondi_Long_RM Gradi_Lat_RM Primi_Lat_RM Secondi_Lat_RM Long_Cent_RM Lat_Cent_RM Height_RM
>
> When I merge the two data frames through
>
> df3 <- merge(df1, df2, by=c("Sensor_RM", "Station_RM"))
>
> I get a new data frame with columns
>
> Sensor_RM Station_RM Place_RM.x Y_init_RM M_init_RM D_init_RM Y_fin_RM M_fin_RM D_fin_RM Place_RM.y Province_RM Region_RM Net_init_RM GaussBoaga_EST_RM GaussBoaga_NORD_RM Gradi_Long_RM Primi_Long_RM Secondi_Long_RM Gradi_Lat_RM Primi_Lat_RM Secondi_Lat_RM Long_Cent_RM Lat_Cent_RM Height_RM
>
> I am sure that df1$Place_RM and df2$Place_RM are equal. I checked it from the shell using awk and diff.
> Why then do I have a duplicate of Place_RM, i.e. Place_RM.x and Place_RM.y, and only of that column?
>
> Thank you for your help
> Stefano

From the Details section of ?merge:

"If the columns in the data frames not used in merging have any common names, these have suffixes (".x" and ".y" by default) appended to try to make the names of the result unique. If this is not possible, an error is thrown."

Place_RM is the only column that appears in both data frames without being listed in 'by', which is why it is the only one duplicated. merge() does not check whether the values in the two columns are equal; it looks only at the names.

If you don't want both columns in the resultant data frame, either include Place_RM in the 'by' argument or remove it from one of the data frames prior to merge()ing.

If you include it in the 'by' argument, be sure that the values will compare as exactly equal, which can be problematic if they are floating point values. In that case, you would be better off subsetting one of the source data frames to remove the column first:

  df3 <- merge(df1, subset(df2, select = -Place_RM), by=c("Sensor_RM", "Station_RM"))

Regards,

Marc Schwartz
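
For anyone who wants to try this out, here is a minimal sketch of the behaviour and of the two workarounds. The column names follow the thread, but the data frames are cut down to a few columns and all values are invented for illustration; this is not the poster's data.

```r
# Toy stand-ins for df1 and df2 (hypothetical values, reduced column set).
df1 <- data.frame(Sensor_RM  = c(1, 2),
                  Place_RM   = c("Ancona", "Pesaro"),
                  Station_RM = c(10, 20),
                  Y_init_RM  = c(2001, 2005))
df2 <- data.frame(Sensor_RM   = c(1, 2),
                  Station_RM  = c(10, 20),
                  Place_RM    = c("Ancona", "Pesaro"),
                  Province_RM = c("AN", "PU"))

# Place_RM is common to both frames but is not a 'by' column,
# so merge() keeps both copies and appends ".x" / ".y".
dup <- merge(df1, df2, by = c("Sensor_RM", "Station_RM"))
names(dup)

# Workaround 1: use Place_RM as a key as well
# (the values must then compare as exactly equal).
by_all <- merge(df1, df2, by = c("Sensor_RM", "Station_RM", "Place_RM"))
names(by_all)

# Workaround 2: drop Place_RM from one frame before merging.
dropped <- merge(df1, subset(df2, select = -Place_RM),
                 by = c("Sensor_RM", "Station_RM"))
names(dropped)
```

With character keys like these, both workarounds should give a single Place_RM column. Workaround 2 is the safer choice when the shared column holds floating point values.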