thr3ads.net - similar to: "Basic aggregate help"

2006 Oct 01

3

aggregate function with 'NA'

Dear r-help reader, I have some problems with the aggregate function. My data frame looks like:
> frame
  Day Time V1 V2
1   M    0  3 NA
2   M    0  4 NA
3   M    0  5  2
4   M    1 NA  4
5   M    1 10  6
6   T    0  4 45
7   T    1  4  3
8   T    1  3  2
9   T    1  6  1
I used the aggregate function to obtain the mean in V1 and V2 over the grouping variables Time and Day
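A minimal sketch of one way to get per-group means despite the missing values, assuming the goal is simply to ignore `NA` within each group. The data frame is rebuilt from the post; the result name `means` is made up for illustration.

```r
# Data frame reconstructed from the post above
frame <- data.frame(
  Day  = c("M", "M", "M", "M", "M", "T", "T", "T", "T"),
  Time = c(0, 0, 0, 1, 1, 0, 1, 1, 1),
  V1   = c(3, 4, 5, NA, 10, 4, 4, 3, 6),
  V2   = c(NA, NA, 2, 4, 6, 45, 3, 2, 1)
)

# Extra arguments to aggregate() are passed on to FUN, so na.rm = TRUE
# makes mean() skip missing values within each group and column.
means <- aggregate(frame[, c("V1", "V2")],
                   by = list(Day = frame$Day, Time = frame$Time),
                   FUN = mean, na.rm = TRUE)
print(means)
```

Because the data frame method applies `FUN` to each column separately, a missing value in V1 does not discard the V2 value in the same row.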

Iterations

2008 Jul 15

4

I have a command that reads in some data: x <- read.csv("Sales2007.dat", header=TRUE) Then I try to organize the data: sc <- split(x, list(x$Category, x$SubCategory), drop=TRUE) Then I want to iterate through the data. I was able to get the following to run on the R console: for(i in 1:length(sc)) { sum(sc[[i]]$Quantity) } But nothing is printed on the console. I find
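A short sketch of why the loop appears silent: inside a `for` loop R does not auto-print the value of an expression, so `print()` has to be called explicitly. The small data frame below is a hypothetical stand-in for `Sales2007.dat`, and `totals` is an illustrative name.

```r
# Hypothetical stand-in for the contents of Sales2007.dat
x <- data.frame(
  Category    = c("HOLIDAY", "HOLIDAY", "PET COSTUMES"),
  SubCategory = c("Christmas", "Christmas", "Famous (Licensed)"),
  Quantity    = c(1, 2, 5)
)
sc <- split(x, list(x$Category, x$SubCategory), drop = TRUE)

# Values computed inside a loop are not auto-printed, so call print()
for (i in seq_along(sc)) {
  print(sum(sc[[i]]$Quantity))
}

# Alternative without a loop: one named total per group
totals <- sapply(sc, function(g) sum(g$Quantity))
print(totals)
```

`seq_along(sc)` is used instead of `1:length(sc)` so that the loop body is skipped when the list is empty.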

Reshape set operations?

2008 Aug 09

1

I have managed to use the reshape library to give me the data structures that I want. Specifically: m2008 <- melt(t2008, id.var=c("DayOfYear","Category","SubCategory","Sku"), measure.var=c("Quantity")) m2007 <- melt(t2007, id.var=c("DayOfYear","Category","SubCategory","Sku"),

PDF Corrupted?

2009 Oct 28

5

I am running R 2.9.2 and creating a PDF that I am trying to open with Adobe Reader 9.2, but when I try to open it the reader responds with "There was an error opening this document. The file is damaged and cannot be repaired." I am using the R command(s): pdf(file="cat.pdf", title="Historical Sales By Category") for(j in 1:length(master)) { d <-
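The post is cut off before the end of the loop, so the actual cause cannot be confirmed here. One common cause of an unreadable PDF is that the graphics device is never closed; the sketch below shows the full open, plot, close cycle under that assumption. The `master` list and the plotting call are hypothetical stand-ins, and the file is written to a temporary directory.

```r
# Hypothetical stand-in for the 'master' list in the post
master <- list(a = c(3, 1, 2), b = c(5, 4, 6))

out_file <- file.path(tempdir(), "cat.pdf")
pdf(file = out_file, title = "Historical Sales By Category")
for (j in seq_along(master)) {
  # One page per list element
  plot(master[[j]], type = "l", main = names(master)[j])
}
# Closing the device finishes writing the file; without this call
# the PDF on disk is left incomplete.
dev.off()
```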

Simple vector question.

2008 Jul 26

1

I have some data that I read in via read.csv: sales2007 <- read.csv("Total2007.dat", header=TRUE) The data looks like:
> sales2007[1:605,]
  Year DayOfYear    Sku Quantity CatId     Category       SubCategory
1 2007         1 100091        1 10862      HOLIDAY         Christmas
2 2007         1 100138        1 11160 PET COSTUMES Famous (Licensed)
3 2007

how to speed up "aggregate"

2011 Jan 26

2

I have
> df
  quantity branch client       date  name
1       10      1      1 2010-01-01   one
2       20      2      1 2010-01-01   one
3       30      3      2 2010-01-01   two
4       15      4      1 2010-01-01   one
5       10      5      2 2010-01-01   two
6       20      6      3 2010-01-01 three
7     1000      1      1 2011-01-01   one
8     2000      2      1 2011-01-01

Data length mismatch.

2008 Jul 26

4

I have two vectors (lists) that each represent a year of data. Each "row" is represented by the day of year and the quantity that was sold for that day. I would like to form a new vector that is the difference between the two years of data. A sample of A (and similarly B) looks like:
> A[1:5,]
  DayOfYear    x
1         1 1429
2         2 3952
3         3 3049
4         4 2844
5         5
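A minimal sketch of one way to subtract two years whose rows do not line up, assuming a day with no row means nothing was sold that day. `A` uses the first four rows shown in the post; `B` and the column names `x.A`, `x.B`, and `diff` are made up for illustration.

```r
# A uses the first four rows shown in the post; B is hypothetical
A <- data.frame(DayOfYear = 1:4, x = c(1429, 3952, 3049, 2844))
B <- data.frame(DayOfYear = c(1, 2, 4), x = c(1000, 4000, 2000))

# Outer join on DayOfYear so days missing from either year are kept
both <- merge(A, B, by = "DayOfYear", all = TRUE, suffixes = c(".A", ".B"))

# Assumption: a day with no row means zero sales on that day
both[is.na(both)] <- 0

both$diff <- both$x.A - both$x.B
print(both)
```

Matching on `DayOfYear` instead of subtracting by position avoids the length mismatch when one year has gaps.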

RESHAPE cast help.

2008 Aug 05

1

I have a set of data that is basically sales figures for a given year. It has columns for Year, Day Of Year, Sku, SubCategory, and Category. The first few lines of data look like:
  Year DayOfYear    Sku Quantity CatId     Category SubCategory
1 2007         1 100091        1 10862      HOLIDAY   Christmas
2 2007         1 100138        1 11160 PET COSTUMES Famous

Updating a list.

2008 Aug 27

1

I have a list that is generated by the reshape package function 'cast'. It consists of three columns: Sku, DayOfYear, and variable. It is generated like: r2007 <- cast(m2008, DayOfYear ~ variable | Sku, sum) Now DayOfYear can range from 1:365 but there are not necessarily that many rows in the list. What I want to do is make every row in the list of length 365 and have the values

question on aggregate

2011 Jan 11

1

An example available on the net goes like:
> df
  identifier quantity
1          1       10
2          1       20
3          2       30
4          1       15
5          2       10
6          3       20
> aggregate(df$quantity, by=list(df$identifier), sum)
  Group.1  x
1       1 45
2       2 40
3       3 20
I'd like Group.1 to retain the name "identifier" and would like to
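A small sketch of two ways to keep the column names, covering only the part of the question visible above. The result names `by_list` and `by_formula` are illustrative.

```r
df <- data.frame(identifier = c(1, 1, 2, 1, 2, 3),
                 quantity   = c(10, 20, 30, 15, 10, 20))

# Naming the element of the 'by' list sets the grouping column's name;
# passing df["quantity"] (a one-column data frame) keeps "quantity" too.
by_list <- aggregate(df["quantity"],
                     by = list(identifier = df$identifier),
                     FUN = sum)

# The formula interface keeps both column names automatically
by_formula <- aggregate(quantity ~ identifier, data = df, FUN = sum)
print(by_formula)
```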

Length of data.frame column

2008 Aug 08

2

I have a beginner question. After finally getting the data into a data.frame that I can work with, I have a data frame that is fairly long:
> length(r2007)
[1] 17409
If I look at the first element:
> r2007[1]
$`100009`
  DayOfYear Quantity
1        66        1
2       128        1
3       137        1
4       193        1
Now how do I get the length of this list (actually it is

Reshape question.

2009 Mar 11

1

This hopefully is trivial. I am trying to reshape the data using the reshape package. First I read in the data: a2009 <- read.csv("Total2009.dat", header = TRUE) Then I trim it so that it only contains the columns that I am interested in: m2009 <- melt(a2009, id.var=c("DayOfYear","Category","SubCategory","Sku"),

setdiff bizarre (was: odd behavior out of setdiff)

2009 May 30

3

Dear R-devel, Please see the recent thread on R-help, "Odd Behavior Out of setdiff(...) - addition of duplicate entries is not identified" posted by Jason Rupert. I gave an answer, then read David Winsemius' answer, and then did some follow-up investigation. I would like to change my answer. My current version of setdiff() is acting in a way that I do not understand, and a way

Associative array?

2008 Jul 12

1

I have searched the archive and could not find what I need, so I will try to ask the question here. I read a table in (read.table) a <- read.table(.....) The table has column names like DayOfYear, Quantity, and Category. The values in the row for Category are strings (characters). I want to get all of the rows grouped by Category. The number of unique category names could be around 50. Say
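A minimal sketch of grouping rows by a string column, assuming a named list keyed by category is an acceptable substitute for an associative array. The data frame is a hypothetical stand-in for the table read with `read.table()`, and `by_cat` is an illustrative name.

```r
# Hypothetical stand-in for the table read with read.table()
a <- data.frame(
  DayOfYear = c(1, 1, 2, 3),
  Quantity  = c(4, 2, 7, 1),
  Category  = c("HOLIDAY", "PET COSTUMES", "HOLIDAY", "HOLIDAY")
)

# split() returns a named list with one data frame per Category,
# which can be indexed by name much like an associative array
by_cat <- split(a, a$Category)
print(names(by_cat))
print(by_cat[["HOLIDAY"]])
```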

Odd Behavior Out of setdiff(...) - addition of duplicate entries is not identified

2009 May 29

2

I think I am using the improved version of setdiff(...) that handles data.frames, so I think some odd behavior was expected, but this one is escaping me. It appears that the addition of duplicate entries is not caught by setdiff(...). Is this expected behavior? If so, is there another method or approach that should be used to identify duplicate row entries between two different data

setdiff for data frames

2007 Dec 10

1

Hello, I have been interested in setdiff() for data frames that operates row-wise. I looked in the documentation, mailing lists, etc., and didn't find exactly the right thing. Given data frames A, B with the same columns, the goal is to extract the rows that are in A, but not in B. Of course, one can usually do setdiff(rownames(A), rownames(B)) but that is cheating. :-) I played around a
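A rough sketch of one possible row-wise difference, not the approach from the truncated post. The helper `row_setdiff` and the sample frames are hypothetical. It compares rows through string keys, so it assumes both frames have the same columns in the same order and that values survive conversion to character unchanged.

```r
A <- data.frame(id = c(1, 2, 3), label = c("a", "b", "c"))
B <- data.frame(id = c(2, 4),    label = c("b", "d"))

# Hypothetical helper: collapse each row into one string key, then
# keep the rows of A whose key does not occur among the keys of B.
row_setdiff <- function(A, B) {
  key_A <- do.call(paste, c(unname(as.list(A)), sep = "\r"))
  key_B <- do.call(paste, c(unname(as.list(B)), sep = "\r"))
  A[!(key_A %in% key_B), , drop = FALSE]
}

res <- row_setdiff(A, B)
print(res)
```

Unlike `setdiff()` on vectors, this keeps duplicate rows of A that are absent from B; wrap the result in `unique()` if set semantics are wanted.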

question about setdiff()

2004 Feb 27

1

Thank you for your answers, I have another question: the behaviour of setdiff(indicesFalse, indicesNA) does not seem predictable to me.
> indices
[1] 1 2 3 4 5 6
> compareVector
[1]    NA  TRUE  TRUE  TRUE FALSE    NA
> indicesNA = indices[is.na(compareVector)]
> indicesNA
[1] 1 6
> indicesFalse = indices[compareVector == FALSE]
> indicesFalse
[1] NA  5 NA
>
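A short sketch of where the `NA` entries in `indicesFalse` come from: comparing `NA` with `FALSE` gives `NA`, and indexing with a logical `NA` returns an `NA` element. Wrapping the comparison in `which()` is one way to drop those positions before calling `setdiff()`; the name `remaining` is illustrative.

```r
indices <- 1:6
compareVector <- c(NA, TRUE, TRUE, TRUE, FALSE, NA)

# NA == FALSE is NA, and a logical NA index yields an NA element,
# which is why the post sees NA 5 NA here
raw <- indices[compareVector == FALSE]

# which() returns only the positions that are TRUE, dropping NA
indicesFalse <- indices[which(compareVector == FALSE)]
indicesNA    <- indices[is.na(compareVector)]

remaining <- setdiff(indicesFalse, indicesNA)
print(remaining)
```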

Help summarizing R data frame

2010 Dec 02

5

I am trying to aggregate data in column 2 by the identifiers in column 1, e.g. take this:
identifier quantity
1 10
1 20
2 30
1 15
2 10
3 20
and make this:
identifier quantity
1 45
2 40
3 20
Thanks in

Can we interlink these three if conditions?

2012 Aug 14

1

key1.=c(1, 2, 3) key2.=c(2) if (identical(key1.,key2.) == "TRUE") { cat("No Errors found") } if (length(setdiff(key1., key2.)) !=0) {
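The post is cut off before its second and third conditions are complete, so the sketch below only illustrates the general pattern of chaining tests with `if` / `else if` / `else`. The helper `describe_keys`, its messages, and the final catch-all branch are assumptions. Note that `identical()` already returns a logical, so comparing it with `"TRUE"` is unnecessary.

```r
key1. <- c(1, 2, 3)
key2. <- c(2)

# Hypothetical helper: the branches are tested in order and only
# the first one whose condition holds is run.
describe_keys <- function(a, b) {
  if (identical(a, b)) {
    "No Errors found"
  } else if (length(setdiff(a, b)) != 0) {
    "First vector has values missing from the second"
  } else {
    # Assumption: catch-all for every remaining case
    "Vectors differ in some other way"
  }
}

cat(describe_keys(key1., key2.), "\n")
```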
