thr3ads.net - R help - [R] data manipulation [Apr 2005]

Yoko Nakajima

2005-Apr-14 00:56 UTC

[R] data manipulation

Hello,
my question is about data handling.

I have a data set that is laid out as:

4 1 17 1 1
 -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081 -0.2227
  0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033 -0.0796
 -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
4 1 17 2 1
 -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081 -0.2227
  0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033 -0.0796
 -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611

This means that 29 variables belong together as a set. You see two such sets
in the example. I have about 1000 sets (of 29 variables) in my data. When I
"scan" this data set, the result comes out with 7 columns, and so far it has
not been possible to read the table column-wise, so I cannot analyze the
data. I would like to know whether there is a way to solve this problem in R,
say, by arranging the columns or increasing the number of columns of the data
matrix.

Also, I would like to know how to name each column of the data so that
each column can be used separately.

Sincerely.

John Fox

2005-Apr-14 01:11 UTC

[R] data manipulation

Dear Yoko,

If you're sure that the data are complete, then
data <- matrix(scan("file-name"), ncol=29) should do the trick. Then to name
the columns of the data matrix, colnames(data) <- c("one", "two", etc.).
[Of course, you'd substitute meaningful names.]
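
As a minimal sketch of the naming step only: the small matrix below is a hypothetical stand-in for the real 29-column data, and the names V1, V2, V3 are placeholders rather than anything from the original file.

```r
# Hypothetical 2 x 3 matrix standing in for the real 29-column data.
data <- cbind(c(4, 4), c(1, 1), c(17, 17))

# Generated placeholder names V1, V2, V3; substitute meaningful names as needed.
colnames(data) <- paste("V", 1:ncol(data), sep = "")

# A single column of a matrix can then be pulled out by name.
v3 <- data[, "V3"]

# Converting to a data frame also allows the df$V3 form.
df <- as.data.frame(data)
mean(df$V3)
```

Whether to keep a matrix or convert to a data frame depends on the analysis; the data frame form is usually more convenient when columns are used individually.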

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

Marc Schwartz

2005-Apr-14 01:15 UTC

[R] data manipulation

On Wed, 2005-04-13 at 20:56 -0400, Yoko Nakajima wrote:
> Hello,
> my question is about the data handling.
> 
> I have a data set that is lined as:
> 
> 4 1 17 1 1
>  -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081
> -0.2227
>   0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033
> -0.0796
>  -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
> 4 1 17 2 1
>  -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081
> -0.2227
>   0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033
> -0.0796
>  -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
> 
> This means that 29 variables are together as a set. You saw two sets
> of them in example. I have about 1000 sets (of 29 variables) in my
> data. When I "scan" this data set, the result comes with 7 columns and
> it is not possible, so far, to read the table by column wise, and thus
> it is not possible to analyze the data. I would like to know whether
> there is a way to solve this problem, say, by arranging columns or
> increasing the number of columns of data matrix by R.
> 
> Also, I would like to know how you could name each column of the data
> so that you could use the individual column separately.

You probably changed some default setting in scan(). By default it treats
'white space' as field delimiters.

Using your data above, which I saved in a file called 'test.dat':
> mat <- matrix(scan("test.dat"), ncol = 29)
Read 58 items
> dim(mat)
[1]  2 29
> colnames(mat) <- paste("Col", 1:29, sep = "")
> mat
     Col1 Col2    Col3    Col4    Col5   Col6    Col7    Col8    Col9
[1,]    4   17  1.0000 -0.1668 -0.5062 0.3640 -0.5081  0.8142 -0.0445
[2,]    1    1 -5.1536 -2.3412  0.9621 0.3678 -0.2227 -0.0389 -0.0578
       Col10   Col11   Col12   Col13   Col14  Col15 Col16 Col17   Col18
[1,] -0.1175  0.8673 -0.0796 -0.1716 -0.7014 0.5611     1     2 -5.1536
[2,] -0.1232 -0.1033 -0.0341 -0.1801  0.6578 4.0000    17     1 -0.1668
       Col19  Col20   Col21   Col22   Col23   Col24   Col25   Col26
[1,] -2.3412 0.9621  0.3678 -0.2227 -0.0389 -0.0578 -0.1232 -0.1033
[2,] -0.5062 0.3640 -0.5081  0.8142 -0.0445 -0.1175  0.8673 -0.0796
       Col27   Col28  Col29
[1,] -0.0341 -0.1801 0.6578
[2,] -0.1716 -0.7014 0.5611

In this case, 'mat' is a matrix with 2 rows and 29 columns.

You can restructure this differently as per your requirements.
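
The column order in the output above follows from how matrix() fills its result: by default the values go down each column in turn. A small sketch with made-up values (two hypothetical records of three values each, not the poster's file) shows the effect:

```r
# Six values in the order scan() would return them: two records of three.
x <- c(4, 1, 17, 4, 1, 17)

# The default fill runs down each column first, so the records get interleaved.
m <- matrix(x, ncol = 3)
m
#      [,1] [,2] [,3]
# [1,]    4   17    1
# [2,]    1    4   17
```

Each record ends up spread across both rows rather than occupying a row of its own.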

HTH,

Marc Schwartz

Yoko Nakajima

2005-Apr-23 21:13 UTC

[R] data manipulation

Hello,

may I ask a further question?

I have realized that "data <- matrix(scan("file-name"), ncol=29)" reads the
data differently than I thought: with this code, (4,1) is the first column,
(17,1) is the second column, (1,1) is the third, and so on - please see the
data below. Therefore, the data set I have would not be in order if I used
this code.

It needs to be read as: (4,4) the first column, (1,1) the second column,
(17,17) the third, and so on (i.e., the values from 4 to 0.5611 make the
first row, the next values from 4 to 0.5611 make the second row, and so on).
So,

V1 V2 V3 ...     V29
4    1    17   ...  0.5611
4    1    17   ...  0.5611

was needed.

(Now I have ,
V1 V2 V3  ....         V29
4    17   1           ...  0.6578
1    1   -5.1536  ...   0.5611)


[The data set I have may have around 1000 sets of them (29 variables times
around 1000 sets of these 29 variables). I only paste here two sets of
them.]
4 1 17 1 1
-5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678
-0.5081 -0.2227
0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673
-0.1033 -0.0796
-0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611

4 1 17 2 1
-5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678
-0.5081 -0.2227
0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673
-0.1033 -0.0796
-0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611



I need 29 columns. This is true. But the data was read differently by
"ncol=29". Is there any way I can handle this problem in R?

I would very much appreciate it if you could let me know. My guess is that I
should probably rearrange the data set in Excel or similar. I used
"data.entry(data)" and found this. I cannot analyze this data set.

Thank you very much, in advance.
Sincerely,
Yoko.

Liaw, Andy

2005-Apr-23 21:37 UTC

[R] data manipulation

You just need to try harder in reading the documentation.  Try:

data <- matrix(scan("file-name"), ncol=29, byrow=TRUE)

Andy
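
A self-contained sketch of the byrow=TRUE approach, using the two sample records from the thread. It assumes R's scan(text = ...) argument in place of a file so the example runs as is, and the names V1 to V29 are placeholders. The length check is an added safeguard, not part of the original suggestion.

```r
# The two sample records from the thread, supplied as text instead of a file.
txt <- "
4 1 17 1 1
 -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081 -0.2227
  0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033 -0.0796
 -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
4 1 17 2 1
 -5.1536 -0.1668 -2.3412 -0.5062  0.9621  0.3640  0.3678 -0.5081 -0.2227
  0.8142 -0.0389 -0.0445 -0.0578 -0.1175 -0.1232  0.8673 -0.1033 -0.0796
 -0.0341 -0.1716 -0.1801 -0.7014  0.6578  0.5611
"

# scan() ignores line breaks and returns one long numeric vector.
values <- scan(text = txt, quiet = TRUE)

# Guard against incomplete records: the total must be a multiple of 29.
stopifnot(length(values) %% 29 == 0)

# byrow = TRUE places each run of 29 values in its own row.
data <- matrix(values, ncol = 29, byrow = TRUE)
colnames(data) <- paste("V", 1:29, sep = "")

# Inspect a few columns: V1, V2 and V3 now hold 4, 1 and 17 in both rows.
data[, c("V1", "V2", "V3", "V4", "V29")]
```

With a real file, scan("file-name") would replace the scan(text = txt, ...) call.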
