Dear Experts,

I'm very new to R, and after some days of reading and testing I tried to make my first small application (and failed ...). In general I would like to work with sqldf and ggplot2 to create some graphical output.

At present I am stuck with this:

PROG #############################################################################
deviceSummary <- data.frame(item = character(0), value = numeric(0), unit = character(0))
print(sapply(deviceSummary, class))

newRow <- c("primitiveSpace", 1.1, "mm2")
deviceSummary <- rbind(deviceSummary, newRow)
print(deviceSummary)

newRow <- c("primitiveCellSpace", 2.2, "mm2")
deviceSummary <- rbind(deviceSummary, newRow)
print(deviceSummary)

OUTPUT ############################################################################
     item     value      unit
 "factor" "numeric"  "factor"
  X.primitiveSpace. X.1.1. X.mm2.
1    primitiveSpace    1.1    mm2
  X.primitiveSpace. X.1.1. X.mm2.
1    primitiveSpace    1.1    mm2
2              <NA>   <NA>    mm2
Warning messages:
1: In `[<-.factor`(`*tmp*`, ri, value = "primitiveCellSpace") :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, ri, value = "2.2") :
  invalid factor level, NA generated

Inserting the first record went fine, but the next one fails, as you can see. Repeating only the first one (value 1.1) went fine.

Maybe my mental picture of a DF is totally wrong. I hope someone can guide me.

Thanks a lot
Rolf

Rolf Kemper, Manager, Mixed Signal Design, Networking, Renesas Electronics Europe GmbH, Arcadiastr. 10, 40472 Duesseldorf, Germany, Phone: +49 211 6503-1475, Fax: +49 211 6503-1540, mailto:Rolf.Kemper@renesas.com, http://www.renesas.eu
Hi,

Not sure if this helps.

deviceSummary <- data.frame(item = character(0), value = numeric(0), unit = character(0), stringsAsFactors = FALSE)
newlst <- list("primitiveSpace", 1.1, "mm2")
deviceSummary[nrow(deviceSummary) + 1, ] <- newlst
newlst2 <- list("primitiveSpace", 2.2, "mm2")
deviceSummary[nrow(deviceSummary) + 1, ] <- newlst2
str(deviceSummary)
#'data.frame':   2 obs. of  3 variables:
# $ item : chr  "primitiveSpace" "primitiveSpace"
# $ value: num  1.1 2.2
# $ unit : chr  "mm2" "mm2"
deviceSummary
#            item value unit
#1 primitiveSpace   1.1  mm2
#2 primitiveSpace   2.2  mm2

A.K.

----- Original Message -----
From: "rolf.kemper at renesas.com" <rolf.kemper at renesas.com>
To: r-help at r-project.org
Sent: Tuesday, October 1, 2013 9:24 AM
Subject: [R] Basic help on DF creation row by row

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
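As a possible alternative to growing the data frame one row at a time, the rows can be collected in a list first and bound together once at the end. This is only a sketch of a common idiom, not something proposed in the thread; the name `rows` is introduced here for illustration, and stringsAsFactors = FALSE is passed explicitly so the result does not depend on the R version.

```r
# Collect one-row data frames in a list (the name `rows` is hypothetical),
# then bind them in a single step instead of calling rbind() repeatedly.
rows <- list(
  data.frame(item = "primitiveSpace", value = 1.1, unit = "mm2",
             stringsAsFactors = FALSE),
  data.frame(item = "primitiveCellSpace", value = 2.2, unit = "mm2",
             stringsAsFactors = FALSE)
)

# do.call() passes every list element as a separate argument to rbind()
deviceSummary <- do.call(rbind, rows)

str(deviceSummary)  # item and unit stay character, value stays numeric
```

Because each row is itself a data frame, every column keeps its own type, which is what the plain c() vector could not do.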
Hello,

The main problem is in the way you form newRow. You can't mix data classes in a vector created with c(), so all its elements are coerced to character, the least common denominator:

newRow <- c("primitiveSpace", 1.1, "mm2")
newRow
[1] "primitiveSpace" "1.1" "mm2"

Then, when you rbind it with the data frame, they are all converted to factors. This is because the default behavior is to have the option stringsAsFactors set to TRUE. Try to check it:

options()$stringsAsFactors # TRUE

Now, each of those factors was created with only one level, so when you try to assign something different to them with the second rbind, NAs are generated.

The correct way would be something like the following.

newRow <- data.frame(item = "primitiveSpace", value = 1.1, unit = "mm2")
deviceSummary <- rbind(deviceSummary, newRow)
print(deviceSummary)
str(deviceSummary) # to check what you have

newRow <- data.frame(item = "primitiveCellSpace", value = 2.2, unit = "mm2")
deviceSummary <- rbind(deviceSummary, newRow)
print(deviceSummary)

Hope this helps,

Rui Barradas
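The coercion described above can be checked directly. The short sketch below compares c() with list(); the names `row_vec` and `row_list` are made up for this illustration. Note also that the TRUE default for stringsAsFactors applied to the R versions of that time; from R 4.0.0 on the default is FALSE, so the conversion to factors no longer happens by default.

```r
# c() builds an atomic vector, so mixed inputs are coerced to a single type
row_vec <- c("primitiveSpace", 1.1, "mm2")
class(row_vec)            # "character": the number 1.1 became the string "1.1"

# list() keeps each element's own type, which is why a list (or a one-row
# data frame) is the better container for a row with mixed column types
row_list <- list("primitiveSpace", 1.1, "mm2")
sapply(row_list, class)   # "character" "numeric" "character"
```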
Hello,

Inline.

On 02-10-2013 15:16, rolf.kemper at renesas.com wrote:
> Hello Rui,
>
> thanks a lot for the quick and exhaustive reply!
> It shows that my understanding of DF is totally wrong.
> I'm familiar with relational DBs, where the columns of a table have specific data types.
> Hence, I thought a DF column is always of one specific type and the rows are just an index common to all columns.

Right. In a df, each column is a vector, therefore all elements are of the same type.

> In particular, what you call FACTORS is something new to me. I have no idea of statistics, but looking into several examples it looks
> very useful to me.

A factor corresponds to the statistical concept of a categorical variable. Factors are internally coded as integers, with a levels attribute giving the categories of the factor.

> What is the best reference manual to get a better picture of how a DF is implemented and what I can do with the c() function?
> The doc I found so far is not very exhaustive.

Try the R project home page, http://www.r-project.org. There's a link to several manuals, on the lower left.

Rui Barradas

> Thanks a lot
> R. Kemper
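To make the description of factors more concrete, here is a small sketch using only base R. The variable names `f`, `g` and `h` are invented for this example. It shows the integer codes behind a factor, reproduces the "invalid factor level" warning from the original post on a tiny scale, and shows one way to avoid it by adding the level first.

```r
# A factor stores integer codes plus a levels attribute
f <- factor(c("mm2", "mm2", "um2"))
levels(f)       # "mm2" "um2"
as.integer(f)   # 1 1 2

# Assigning a value that is not one of the existing levels yields NA,
# with the warning "invalid factor level, NA generated"
g <- factor("primitiveSpace")
g[2] <- "primitiveCellSpace"
is.na(g[2])     # TRUE

# Extending the levels first makes the same assignment valid
h <- factor("primitiveSpace")
levels(h) <- c(levels(h), "primitiveCellSpace")
h[2] <- "primitiveCellSpace"
as.character(h) # "primitiveSpace" "primitiveCellSpace"
```

In practice it is often simpler to keep such columns as character (stringsAsFactors = FALSE) while the data frame is being built, and convert to factor afterwards if needed.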