thr3ads.net - R help - [R] Fwd: Questions about working with a dataframe [Jun 2013]

Jacqueline Oehri

2013-Jun-25 14:25 UTC

[R] Fwd: Questions about working with a dataframe

> Dear R-Users, 
> I hope this is the right e-mail address to post questions about programming
> in R, and I hope some of you can help me with the troubles I have :)
> 
> 
> 1) First Question:
> 
> I have a dataframe called "WWA" (it's attached to this e-mail as WWA.txt:
> <https://stat.ethz.ch/pipermail/r-help/attachments/20130625/f8a2e152/attachment-0001.txt>).
> It looks a little bit like the following one:
> 
> 
> testcoordID testcommunity testaltitude     testSpeciesName
> 1      503146       Bournes        523.2     Bellis perennis
> 2      503146       Bournes        321.5 Cynosurus cristatus
> 3      557154       Bournes        654.1   Festuca pratensis
> 4      557154         Aigle        938.6     Bellis perennis
> 5      569226         Aigle        401.3     Bellis perennis
> 6      599246         Aigle        765.9   Prunella vulgaris
> 
> I programmed this little one like this:
> testcoordID <- c(as.integer("503146"), as.integer("503146"), as.integer("557154"),
>                  as.integer("557154"), as.integer("569226"), as.integer("599246"))
> testcommunity <- factor(c("Bournes", "Bournes", "Bournes", "Aigle", "Aigle", "Aigle"))
> testaltitude <- c(523.2, 321.5, 654.1, 938.6, 401.3, 765.9)
> testSpeciesName <- c("Bellis perennis",
>                      "Cynosurus cristatus",
>                      "Festuca pratensis",
>                      "Bellis perennis",
>                      "Bellis perennis",
>                      "Prunella vulgaris")
> testframe <- data.frame(testcoordID, testcommunity, testaltitude, testSpeciesName)
> 
> 
> 
> I needed to manipulate WWA in Excel, therefore I wrote
> it as a text file:
> 
>> write.table(WWA, "WWA.txt", col.names=TRUE, row.names=FALSE, sep=";", quote=TRUE)
> 
> Then I manipulated WWA.txt in Excel, saved it as "noWWA.csv", and re-imported it under the new name "oWWA" in R:
> 
>> oWWA <- read.csv("~/Desktop/NCCR master projekt/BDM Beschreibungen/BDM Daten/noWWA.csv", header=TRUE, sep=";")
> 
> What I finally need to do with this "WWA" or "oWWA" is to create a list
> (or a dataframe, but I think this is not possible) that shows, for each
> coordinate ID ("testcoordID"), the species names occurring at this place:
> 
>> species_per_coordID1<- tapply((WWA$speciesName), WWA$coordID, list)
>> species_per_coordID2 <- split(WWA$speciesName, WWA$coordID)
> 
> ---> Now my question: this works very well with the WWA table, but not
> with oWWA! I think I changed something in the dataframe by converting it to
> a .txt file and then back to a .csv.
> Does anybody know why it works with WWA and not with oWWA? How can I
> edit the WWA dataframe in Excel without changing its format?
> 
> 
> Thanks a lot for any help or suggestions!
> 
> Have a nice day, 
> 
> Kind regards Jacqueline
>

John Kane

2013-Jun-25 15:57 UTC

[R] Fwd: Questions about working with a dataframe

Hi, welcome to R

Try using the function str() on both data frames, so str(WWA) and str(oWWA), and compare
the structures that you get.  Probably one of the variables you defined when
creating the original WWA data set has changed from a character variable to a
factor or vice versa.
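
As a rough illustration of that comparison, the sketch below builds two small stand-in data frames that differ only in the type of the species column, then lists the columns whose class differs. The names wwa_demo and owwa_demo and their contents are hypothetical, not the poster's real data.

```r
# Hypothetical stand-ins for WWA and oWWA; only the species column type differs.
wwa_demo <- data.frame(
  coordID     = c(503146L, 503146L, 557154L),
  speciesName = c("Bellis perennis", "Cynosurus cristatus", "Festuca pratensis"),
  stringsAsFactors = FALSE
)
owwa_demo <- wwa_demo
owwa_demo$speciesName <- factor(owwa_demo$speciesName)

# str() shows the type of every column
str(wwa_demo)
str(owwa_demo)

# Collect the class of each column and find the ones that differ
classes_wwa  <- sapply(wwa_demo, class)
classes_owwa <- sapply(owwa_demo, class)
changed <- names(classes_wwa)[classes_wwa != classes_owwa]
print(changed)
```

Comparing the class vectors directly is only a convenience for wide data frames. For a handful of columns, reading the two str() outputs side by side is usually enough.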

It is a good idea to use dput to supply sample data here.

So run dput(WWA), paste the results into the email, and repeat with the other data
set.  Then readers can paste the actual data sets into R and work on them
directly.
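
A minimal sketch of why dput() is useful for this, assuming a small made-up data frame called demo (hypothetical, not the poster's data): the printed text is R code that recreates the object, including its column types.

```r
# Hypothetical small data frame standing in for the poster's data
demo <- data.frame(
  coordID     = c(503146L, 557154L),
  speciesName = c("Bellis perennis", "Festuca pratensis"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; this text is what goes into the email
dput(demo)

# A reader would paste that text into R. Here the text is captured and evaluated
# in one step to check that the round trip preserves the data and column types.
demo_text <- paste(capture.output(dput(demo)), collapse = "\n")
rebuilt   <- eval(parse(text = demo_text))
print(all.equal(rebuilt, demo))
print(class(rebuilt$speciesName))
```

Unlike a printed table, the dput() text distinguishes character columns from factors, which is exactly the information needed for this kind of problem.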

If the str() approach does not give you enough information please paste in the
dput results in your next email.

Good luck

John Kane
Kingston ON Canada


John Kane

2013-Jun-25 16:12 UTC

[R] Fwd: Questions about working with a dataframe

Ouch. My apologies David, after reading the message I didn't bother to look
at the txt file.
John Kane
Kingston ON Canada

> -----Original Message-----
> From: dwinsemius at comcast.net
> Sent: Tue, 25 Jun 2013 09:09:15 -0700
> To: jrkrideau at inbox.com
> Subject: Re: [R] Fwd: Questions about working with a dataframe
> 
> 
> On Jun 25, 2013, at 8:57 AM, John Kane wrote:
> 
>> Hi, welcome to R
>> 
>> Try using the function str() on both data frames, so str(WWA) and str(oWWA), and
>> compare the structures that you get.  Probably one of the variables you
>> defined when creating the original WWA data set has changed from a
>> character variable to a factor or vice versa.
>> 
>> It is a good idea to use dput to supply sample data here.
>> 
>> So dput(WWA) and paste the results into the email and repeat with the
>> other data set.  Then readers can paste the actual data sets into R and
>> work on them directly.
> 
> In this case I think it would be much more courteous to include:
> 
> dput(head(WWA))
> dput(head(oWWA))
> 
> ... in light of the attached 5MB file in that email.
> 
> I apologize to the other list readers for approving it in the moderation
> queue. It should have been rejected, but my excuse is that the moderation
> viewer doesn't always highlight all aspects of the postings being viewed
> that should be highlighted.
> 
> --
> David.
> 
> 
> David Winsemius
> Alameda, CA, USA
>

John Kane

2013-Jun-26 15:34 UTC

[R] Fwd: Questions about working with a dataframe

It is always better to use plain text when posting to R-help. HTML messes things up
badly sometimes. It is also a good idea to reply to the R-help list rather
than to individual respondents. You can get more responses if the problem
continues, and if either of us were away it might be weeks before we managed
to reply.

Other responses inline.

John Kane
Kingston ON Canada

-----Original Message-----
From: jacqueline.oehri at gmx.ch
Sent: Wed, 26 Jun 2013 11:18:41 +0200 (CEST)
To: dwinsemius at comcast.net, jrkrideau at inbox.com, r-help at r-project.org
Subject: Aw: Re: [R] Fwd: Questions about working with a dataframe

Dear Mr. Kane and dear Mr. Winsemius

Thanks a lot for your quick answers and good recommendations!!! And I apologise
for attaching such a big file before!!

I think I have solved the problem.

Maybe you can tell me whether what I have done is right?

As John said, str(WWA) and str(oWWA) gave different outputs for
"speciesName":
> class(WWA$speciesName)
[1] "character"
> class(oWWA$speciesName)
[1] "factor"

What I did is this:
> oWWA$speciesName <- as.character(oWWA$speciesName)
and now I've got:
> class(oWWA$speciesName)
[1] "character"

and the function I wanted to use works well:
> Sp_per_coordID_oWWA <- tapply(oWWA$speciesName, oWWA$coordID, list)
--> Question: Do you think I did this right and that it didn't mess up the
structure of the dataset? As far as I can see there is no problem, but I'm not as
experienced as you are!

Yes, I think you found the problem. It is always a good idea to use str() when
reading in a csv file, and even when doing data transformations within R, as things
can sometimes change in unexpected ways.

You might also want to look at the "stringsAsFactors" argument, which is either TRUE
or FALSE. For historical reasons, which I cannot remember, R often reads
strings in as factors rather than as character vectors.
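
The effect of stringsAsFactors can be sketched with a small round trip through a semicolon-separated file, similar to the one described in this thread. The data frame demo and the temporary file are made up for illustration. Note that the default for stringsAsFactors in read.csv() was TRUE before R 4.0.0 and has been FALSE since, so setting it explicitly keeps the result independent of the R version.

```r
# Hypothetical miniature of the poster's data; the species column is character
demo <- data.frame(
  coordID     = c(503146L, 503146L, 557154L),
  speciesName = c("Bellis perennis", "Cynosurus cristatus", "Festuca pratensis"),
  stringsAsFactors = FALSE
)

# Write a temporary semicolon-separated file, mirroring the write.table() call in the thread
tmp <- tempfile(fileext = ".csv")
write.table(demo, tmp, col.names = TRUE, row.names = FALSE, sep = ";", quote = TRUE)

# Read it back twice, once with each setting of stringsAsFactors
as_factor <- read.csv(tmp, header = TRUE, sep = ";", stringsAsFactors = TRUE)
as_char   <- read.csv(tmp, header = TRUE, sep = ";", stringsAsFactors = FALSE)

print(class(as_factor$speciesName))
print(class(as_char$speciesName))

# With character data, split() gives one plain character vector per coordinate ID
species_per_coord <- split(as_char$speciesName, as_char$coordID)
print(species_per_coord)

unlink(tmp)
```

Reading with stringsAsFactors = FALSE avoids the later as.character() conversion. Converting afterwards, as done above in the thread, gives the same column type.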

Thank you very very much for your answers!!! It helped me a lot!!

--> Second question: I had problems with using dput(head(WWA)), because I
think it's still too big, so I'm not able to post all the output from
"dput(head(WWA))", even when I subsetted it first to only three rows:
> WWAsubset <- WWA[c(1:3),]
> dput(head(WWAsubset))

(see after str(WWA) and str(oWWA))

Yes, you can set the number of rows of output with something like
dput(head(dat1, 10)), which would output the first 10 rows of the data.frame
dat1.
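
One possible reason, not confirmed in this thread, why dput(head(...)) can stay large even for a few rows is that factor columns keep all of their levels after subsetting. The sketch below uses a made-up data frame dat1 with a 100-level factor and shows droplevels() trimming the unused levels before dput().

```r
# Hypothetical data frame with a factor column that has many levels
dat1 <- data.frame(
  id      = 1:100,
  species = factor(paste("species", 1:100))
)

# head(dat1, 3) keeps only three rows, but the factor still carries all 100 levels,
# so the dput() output stays long
full_levels <- capture.output(dput(head(dat1, 3)))

# droplevels() removes the unused factor levels before dput()
small   <- droplevels(head(dat1, 3))
trimmed <- capture.output(dput(small))

# Compare the size of the two outputs in characters
print(sum(nchar(full_levels)))
print(sum(nchar(trimmed)))

# The trimmed version is short enough to paste into an email
dput(small)
```

If the large columns are character vectors rather than factors, droplevels() changes nothing and selecting fewer rows or columns is the only way to shrink the output.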
