Dale Steele
2006-Jan-26 18:31 UTC
[R] Data management problem: convert text string to matrix of 0's and 1's
I have a data management problem which exceeds my meager R programming skills, and I would greatly appreciate suggestions on how to proceed. The data consist of a series of observation periods. Specific behaviors are recorded for each time period in the order each is observed. There are 8 possible behaviors, coded as "i" "c" "s" "r" "v" "e" "p" "f".

The data looks like:
-->
icsrvepf
fpevrsci
ics
p

f
ic
<--

I would like to convert the above to a matrix of the form:

i c s r v e p f
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 0 0 0 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
1 1 0 0 0 0 0 0

Thanks.

Dale

Dale Steele, MD
Pediatric Emergency Medicine
Brown Medical School
Thomas Lumley
2006-Jan-26 18:40 UTC
[R] Data management problem: convert text string to matrix of 0's and 1's
On Thu, 26 Jan 2006, Dale Steele wrote:
> The data looks like:
> -->
> icsrvepf
> fpevrsci
> ics
> p
>
> f
> ic
> <--
>
> I would like to convert the about to a matrix of the form:
>
> i c s r v e p f
> 1 1 1 1 1 1 1 1
> 1 1 1 1 1 1 1 1
> 1 1 1 0 0 0 0 0
> 0 0 0 0 0 0 1 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 1
> 1 1 0 0 0 0 0 0
>

One possibility is to use grep()

> a
[1] "icsrvepf" "fpevrsci" "p"        ""         "f"        "ic"
> grep("i",a)
[1] 1 2 6

so

> results<-matrix(0,nrow=length(a),ncol=length(behaviours))
> colnames(results)<-behaviours
> for(b in behaviours) results[grep(b,a),b]<-1
> results
     i c s r v e p f
[1,] 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 1
[3,] 0 0 0 0 0 0 1 0
[4,] 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 1
[6,] 1 1 0 0 0 0 0 0
>

-thomas
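For readers who want to run the grep() approach end to end, here is a minimal sketch. The session above does not show how `a` and `behaviours` were created, so both definitions below are assumptions: `a` holds all seven observation periods from the original post (including "ics", which does not appear in the printed vector above), and `behaviours` holds the eight codes in the requested column order. The `fixed = TRUE` argument is an added precaution so that each code is matched literally rather than as a regular expression.

```r
# Assumed inputs: the session above does not show how these were defined.
a <- c("icsrvepf", "fpevrsci", "ics", "p", "", "f", "ic")
behaviours <- c("i", "c", "s", "r", "v", "e", "p", "f")

# One row per observation period, one column per behaviour code.
results <- matrix(0, nrow = length(a), ncol = length(behaviours),
                  dimnames = list(NULL, behaviours))

# grep() returns the indices of the strings that contain the code;
# those rows get a 1 in that code's column.
for (b in behaviours) results[grep(b, a, fixed = TRUE), b] <- 1

results
```

The empty string for the fifth period matches none of the codes, so that row stays all zeros, as in the requested matrix.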
Martin Lam
2006-Jan-27 08:58 UTC
[R] Data management problem: convert text string to matrix of 0's and 1's
Hi Dale,
Unfortunately, you didn't say what format your
data is saved in, so I presume it's stored as a list
of strings. Perhaps there is a faster/better way, but
this should suffice if your dataset isn't enormous.
data = list()
data[1] = "icsrvepf"
data[2] = "fpevrsci"
data[3] = "ics"
data[4] = "p"
data[5] = ""
data[6] = "f"
data[7] = "ic"
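# the eight behavior codes, in the desired column order
names = c("i", "c", "s", "r", "v", "e", "p", "f")
# one row per observation period, one column per behavior
mymatrix = matrix(0, nrow = length(data), ncol = length(names))
colnames(mymatrix) = names
for (i in seq_along(data)) {
  # split the string into separate characters
  chars = strsplit(data[[i]], split = "")[[1]]
  # set the columns of the behaviors seen in this period to 1
  mymatrix[i, names %in% chars] = 1
}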
mymatrix
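As a possible variation on the loop above, the same membership test can be applied to all strings at once. This is only a sketch under assumptions: `obs` and `codes` are hypothetical names, and the data are assumed to be held in a character vector rather than a list.

```r
# Assumed input: a character vector of observation periods (hypothetical name).
obs <- c("icsrvepf", "fpevrsci", "ics", "p", "", "f", "ic")
codes <- c("i", "c", "s", "r", "v", "e", "p", "f")

# Split every string into single characters in one call.
chars <- strsplit(obs, split = "")

# For each period, flag which codes occur; vapply() returns one column
# per period, so transpose to get one row per period.
mymatrix <- t(vapply(chars,
                     function(x) as.integer(codes %in% x),
                     integer(length(codes))))
colnames(mymatrix) <- codes

mymatrix
```

Because the dimensions come from `obs` and `codes`, nothing needs to change if more periods or codes are added.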
HTH,
Martin Lam
--- Dale Steele <Dale_Steele at brown.EDU> wrote:
> I have a data management problem which exceeds my
> meager R programming
> skills and would greatly appreciate suggestions on
> how to proceed? The
> data consists of a series of observation periods.
> Specific behaviors are
> recorded for each time period in the order each is
> observed. Their are
> 8 possible behaviors, coded as "i" "c" "s" "r" "v"
> "e" "p" "f".
>
> The data looks like:
> -->
> icsrvepf
> fpevrsci
> ics
> p
>
> f
> ic
> <--
>
> I would like to convert the about to a matrix of the
> form:
>
> i c s r v e p f
> 1 1 1 1 1 1 1 1
> 1 1 1 1 1 1 1 1
> 1 1 1 0 0 0 0 0
> 0 0 0 0 0 0 1 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 1
> 1 1 0 0 0 0 0 0
>
> Thanks.
>
> Dale
>
> Dale Steele, MD
> Pediatric Emergency Medicine
> Brown Medical School