plessthanpointohfive at gmail.com
2013-Mar-18 17:44 UTC
[R] Loop or some other way to parse by data generated values when it is not linear
I'm sorry for the really vague subject line but I am not sure how to
succinctly describe what I am doing and what the problem is.
But, here goes:
1. I have two-way data with frequencies. Below is an
example, though in reality I am looking at about 10 different variables
that I am crossing, so the values of X1 and X2 change. X1 and X2 are
placeholders.
Here's the dataset (this first part only builds the example; it does not
happen in reality):
X1 <- matrix(c(0, 1, 2, 3, 4, 99), nrow=18, ncol=1, byrow=T)
X2 <- sort(matrix(c(0, 2, 4), nrow=18, ncol=1, byrow=T), decreasing=F)
Y <- matrix(c(83, 107, 47, 27, 38, 1, 12 ,25, 14, 4, 9, 0, 14, 27, 28,
13, 18, 0), nrow=18, ncol=1, byrow=T)
tmp.n <- data.frame(X1, X2, Y)
The final data frame is what I actually get:
   X1 X2   Y
1   0  0  83
2   1  0 107
3   2  0  47
4   3  0  27
5   4  0  38
6  99  0   1
7   0  2  12
8   1  2  25
9   2  2  14
10  3  2   4
11  4  2   9
12 99  2   0
13  0  4  14
14  1  4  27
15  2  4  28
16  3  4  13
17  4  4  18
18 99  4   0
2. What I want is:
     0  2  4
0   83 12 14
1  107 25 27
2   47 14 28
3   27  4 13
4   38  9 18
99   1  0  0
3. I've been trying to do it using this (which is inside a function so
I can vary what variables X1 and X2 are):
X1 <- table(tmp.n[,1])
X2 <- table(tmp.n[,2])
# Create the tmp.n.# datasets that contain the Y's. Do this in a loop to automate
dta <- NULL
for (i in 0:length(X1)) {
assign("tmp.n_", tmp.n[tmp.n[,1] == i, c(1,3)])
tmp.n_ <- data.frame(tmp.n_[,2])
dta[i] <- assign(paste("tmp.n.", i, sep=""), tmp.n_)
dta
}
dta2 <- (data.frame(matrix(unlist(dta), nrow=n2[1], byrow=T)))
colnames(dta2) <- names(X2)
dta2
And that works so long as X1 and X2 are linear, that is, consecutive
integers such as X1 <- seq(0, 4, 1). But that 99 throws the whole thing
off and it gives me this:
X1 X2
1 107 25
2 27 47
3 14 28
4 27 4
5 13 38
6 9 18
It basically breaks the whole thing.
I've not been able to figure this out and I've been like a dog with a
bone trying to make it work with modifications to the for loop. I know
there is an easier way to do this, but my brain is no longer capable of
thinking outside the box I've put it in. So, I am turning to you for help.
Best,
Jen
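For what it's worth, the loop most likely goes wrong because i runs over the counter 0 to length(X1), while the comparison tmp.n[,1] == i treats i as a data value. Below is a small sketch of that mismatch, using the example data from the post. The names vals and wide are made up for illustration, and the sapply() step assumes every X1 value has exactly one row per X2 value, with rows already ordered by X2.

```r
# Rebuild the example data from the post
tmp.n <- data.frame(
  X1 = rep(c(0, 1, 2, 3, 4, 99), times = 3),
  X2 = rep(c(0, 2, 4), each = 6),
  Y  = c(83, 107, 47, 27, 38, 1, 12, 25, 14, 4, 9, 0, 14, 27, 28, 13, 18, 0)
)

# The loop counter covers 0..6, but the data values are 0..4 and 99
0:length(table(tmp.n[, 1]))     # 0 1 2 3 4 5 6
sort(unique(tmp.n[, 1]))        # 0 1 2 3 4 99

# i = 5 and i = 6 match no rows, and the value 99 is never visited
nrow(tmp.n[tmp.n[, 1] == 5, ])  # 0

# Looping over the observed values instead of a counter avoids the mismatch
# (assumes one row per X1/X2 pair, ordered by X2 within each X1 value)
vals <- sort(unique(tmp.n[, 1]))
wide <- t(sapply(vals, function(v) tmp.n[tmp.n[, 1] == v, 3]))
dimnames(wide) <- list(vals, sort(unique(tmp.n[, 2])))
wide
```

The replies that follow avoid the loop entirely, which is probably the more robust route.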
arun
2013-Mar-18 17:49 UTC
[R] Loop or some other way to parse by data generated values when it is not linear
Hi,
library(reshape2)
dcast(tmp.n, X1 ~ X2, value.var="Y")
#  X1   0  2  4
#1  0  83 12 14
#2  1 107 25 27
#3  2  47 14 28
#4  3  27  4 13
#5  4  38  9 18
#6 99   1  0  0
A.K.
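If installing reshape2 is not an option, a similar table can probably be built with base R's tapply(). This is only a sketch, not part of the reply above. It rebuilds the example data from the original post, and sum() is an assumed choice so that any duplicate X1/X2 pairs would be added together.

```r
# Example data from the original post
tmp.n <- data.frame(
  X1 = rep(c(0, 1, 2, 3, 4, 99), times = 3),
  X2 = rep(c(0, 2, 4), each = 6),
  Y  = c(83, 107, 47, 27, 38, 1, 12, 25, 14, 4, 9, 0, 14, 27, 28, 13, 18, 0)
)

# Cross-tabulate Y by X1 (rows) and X2 (columns);
# sum() combines the Y values of any duplicate X1/X2 pairs
wide <- tapply(tmp.n$Y, list(X1 = tmp.n$X1, X2 = tmp.n$X2), sum)
wide
```

Unlike dcast(), the result is a matrix with the X1 values as row names. Combinations that never occur in the data come back as NA rather than 0.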
Marc Schwartz
2013-Mar-18 17:50 UTC
[R] Loop or some other way to parse by data generated values when it is not linear
Something like this?

> xtabs(Y ~ X1 + X2, data = tmp.n)
    X2
X1     0   2   4
  0   83  12  14
  1  107  25  27
  2   47  14  28
  3   27   4  13
  4   38   9  18
  99   1   0   0

See ?xtabs

Regards,

Marc Schwartz
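Since the original code lives inside a function so that the crossed variables can change, the xtabs() call could be wrapped along the following lines. This is a rough sketch: crosstab and its argument names are hypothetical, and it assumes the columns are passed as character strings.

```r
# Example data from the original post
tmp.n <- data.frame(
  X1 = rep(c(0, 1, 2, 3, 4, 99), times = 3),
  X2 = rep(c(0, 2, 4), each = 6),
  Y  = c(83, 107, 47, 27, 38, 1, 12, 25, 14, 4, 9, 0, 14, 27, 28, 13, 18, 0)
)

# Hypothetical helper: cross-tabulate a frequency column by any two columns
crosstab <- function(data, row_var, col_var, freq_var) {
  # reformulate() builds a formula such as Y ~ X1 + X2 from the column names
  f <- reformulate(c(row_var, col_var), response = freq_var)
  xtabs(f, data = data)
}

tab <- crosstab(tmp.n, "X1", "X2", "Y")
tab

# If a plain data frame is needed, with the X1 values as row names
wide <- as.data.frame.matrix(tab)
wide
```

Because xtabs() works from the values actually present in the data, a code such as 99 is handled like any other level.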