Displaying 20 results from an estimated 40000 matches similar to: "read.table() versus scan()"
2011 Mar 05
2
Repeating the same calculation across multiple pairs of variables
Hi all,
I frequently encounter datasets that require me to repeat the same calculation across many variables. For example, given a dataset with total employment variables and manufacturing employment variables for the years 1990-2010, I might have to calculate manufacturing's share of total employment in each year. I find it cumbersome to have to manually define a share for each year and
2008 Jan 02
2
Seeking a more efficient way to read in a file
Hi.
I have a matrix stored in a large, tab-delimited flat file. The first row contains column names. Because the matrix is symmetric, the file has lower triangular format, so the second row contains one number, the third row two numbers, etc. In general, row k+1 contains k numbers; the matrix has 3000 rows, so the file has 3001 rows. The file has variable length records, so each row ends
2011 Feb 02
2
Efficient way to determine if a data frame has missing observations
I have a data set covering a large number of cities with values for characteristics such as land area, population, and employment. The problem I have is that some cities lack observations for some of the characteristics and I'd like a quick way to determine which cities have missing data. For example:
2003 May 21
2
moving onto returning a data.frame?
I've been studying some of the code and I'm still a little shaky on the
proper method for returning a data.frame from a C function (which is my
ultimate goal here). I've started some code that I've "stolen" from the
archives and I'm running into crashes, etc. I've been trying to glean some
insight from the src/main/scan.c file and didn't find many comments in
2011 Mar 09
1
How does the cex parameter scale circles?
I'm wondering how the cex parameter is used to scale circles (i.e. does it scale the radius, diameter, area, circumference, etc.?). In my case I'm using lattice with filled circles (pch=19).
Based on the following example, it looks like R scales the radius of the circle:
library(lattice)
dta <- data.frame(x = rep(1, 6), y = rep(1, 6), sz = c(1, 2, 4, 8, 16, 32))
2012 Feb 02
9
sqldf for Very Large Tab Delimited Files
Hi All,
I have a very (very) large tab-delimited text file without headers. There
are only 8 columns and millions of rows. I want to make numerous pieces of
this file by sub-setting it for individual stations. Station is given in
the first column. I am trying to learn and use the sqldf package for this but am
stuck in a couple of places.
To simulate my requirement, I have taken iris dataset as an
2011 Apr 16
1
CairoPDF, Fonts, and Windows 7
Hi All:
I have some basic questions about the Cairo graphics engine. I'm trying to use the Cairo package to produce PDF output, mainly because I perceive it to be easy to use with a wide variety of fonts.
But right now, I'm stuck trying to figure out what fonts are available to be used with Cairo, specifically the CairoPDF function. I've been able to successfully produce some test PDFs
2004 Nov 11
1
polr probit versus stata oprobit
Dear All,
I have been struggling to understand why, for the housing data in the MASS
library, R and Stata give coef. estimates that are really different. I also
tried to come up with many many examples myself (see below, of course I
did not have the set.seed command included) and all of my
`random' examples seem to give very similar output. For the housing data,
I have changed the data into numeric
2011 Aug 12
1
Details of subassignment (for vectors and data frames)
Hi All:
I'm looking to find out a bit more about how subassignment actually works and am hoping someone with knowledge of the details can fill me in (I've looked at the source code, but my knowledge of C is lacking).
In the case of vectors, my reading of ?"[" indicates that, given vec <- 1:25, vec[c(1,5,25)] <- c(101,102,103) is functionally the same as
2011 Feb 08
2
Convert the output of by() to a data frame
I'd like to summarize several variables in a data frame, for multiple groups, and store the results in a data.frame. To do so, I'm using by(). For example:
df <- data.frame(a = 1:10, b = 11:20, c = 21:30, grp1 = c("x", "y"), grp2 = c("x", "y"), grp3 = c("x", "y"))
dfsum <- by(df[c("a", "b", "c")],
2011 Jan 23
1
How does the data.frame function generate column names?
Hi all,
I'm a new R user and am confused about how R behaves when converting a vector to a data frame when using the data.frame function. I'm specifically interested in cases where the vector is expressed as a subset of another data frame. For example, say I want to create a data frame from the last three rows of the third column of the data frame, d, that I've created below:
2000 Dec 26
1
More on scan: extra field at end of line
Suppose, I have a file "data1" containing:
450 390 467 654 30 542 334 432 421
357 497 493 550 549 467 575 578 342
446 547 534 495 979 479
I can read this file with:
scan("data1")
Read 24 items
[1] 450 390 467 654 30 542 334 432 421 357 497 493 550 549 467 575 578 342 446
[20] 547 534 495 979 479
2010 May 11
3
Advice needed on awkward tables
Dear r-help list members,
I am quite new to R, and hope to seek advice from you about a problem I have
been cracking my head over. Apologies if this seems like a simple problem.
I have essentially two tables. The first (Table A) is a standard patient
clinicopathological data table, where rows correspond to patient IDs and
columns correspond to clinical features. Records in this table are stored
2011 Mar 07
2
Preferred way to create bubble plots?
I have to create a number of bubble plots, and am wondering what methods folks prefer for this task. I've been experimenting with the symbols() function, with text() to provide plot labels. Any opinions on the relative merits of this method versus others? One criterion would be the ability to fine-tune the placement of text labels. I would like to use lattice, but haven't found a way to
2011 Jul 12
1
suggestions regarding reading in a messy file
I have a file in stata format, which I have read in, and I am trying
to create a text file. I have exported the data using various
delimiters, but I'm unable to read it back in. I originally read in
the file with:
library(foreign)
myData <- read.dta("mydata.dta")
I then exported it with write.table using commas, tabs, and exclamation
marks as delimiters.
When I was unable to
2001 Sep 24
1
Problem with read.table and scan
I have just installed R on a Windows NT system. Unfortunately I am unable
to open any of the data files I wish to work with. I have tried using
read.table and scan and to the best of my knowledge am using the correct
syntax. The error message I receive is
Error in file(file, "r") : cannot open file [file name]
I have the data in text files in white-space delimited form. I put them
2012 Apr 01
1
scan() vs readChar() speed
Dear list,
I am trying to find a fast solution to read moderately large (1 -- 10
million entries) text files containing only tab-delimited numeric
values. My test file is the following,
nr <- 1000
nc <- 5000
m <- matrix(round(rnorm(nr * nc), 3), nrow = nr)
write.table(m, file = "a.txt", append = FALSE,
row.names = FALSE, col.names = FALSE)
scan() is faster than
2010 Nov 01
1
Very odd problem
I had previously tried to migrate our PDC to a new machine by simply
copying the config over and such. That failed miserably but luckily the
various home servers (BDC's in samba speak I think) took up the slack.
So after much debate, this weekend we moved the PDC back to the original
machine. We never moved LDAP off of the original machine, as only samba
functions moved.
I now know I did
2012 May 18
1
UTF-16 input and read.delim/scan
Hi all,
I am running 64-bit R 2.15.0 on Windows 7. I am trying to use read.delim
to read from a file that has 2-byte unicode (CJK) characters.
Here is an example of the data (it is tab-delimited if that gets messed up):
HITId HITTypeId Title
2Q69Z6KW4ZMAGKKFRT6Q4ONO6MJF68 2LVJ1LY58B72OP36GNBHH16YF7RS7Z 看看句子,写写想法
请看以下的句子,再回答问
So read.delim (code below) doesn't read in correctly. It reads
2003 Apr 12
3
WG: Samba 2.2.7a and XP pro
Dear Michael,
I just have the same problem you have with samba 2.2.3a on a Suse 8.0
machine and win xp pro. Besides I performed all the steps below.
From my postings I learned that the prob might have something to do with
a wrong mapping in smbusers file. It seems to me that the workstation
account is mapped to user nobody.
Up to now I was not able to solve the prob.
Do you have any ideas