Hi:

I have a data file in the following format. The first three digits are the ID of a respondent, such as 402 or 403. Different respondents may have the same ID. Following the ID are 298 single-digit numbers ranging from 1 to 5. My question is how to read this data file into R. I tried "scan" and "read" but they do not work because the numbers in the file are not separated. Any suggestions?

Thank you!

zhong
-----------------------------------------------------------------------------------------------
40212221211
11212323114345531221314222132311445542524542111412113212124145315324113113112411
14153111131421112514214413141115311543241411222213124115411415351211114251213111
51512242411411311413352111512221311113211423111131141413151433212221211454124214
11531111311211331512415451222211112311131311111512
40222221211
31122312123333532122221122221311444321432343211313223213222134323223114122313322
13343113112421222413324223221213431321322212332223222115222414431222214351312113
52221232312323213313222111222242312112411323122131143223251332112123231333121222
11422211214312332312324351213121121311233412212512
40312191211
21112311121534431151234222211213324522211214112212112312311125413412313112211111
24233113124442121211213442233211221321444411253212211113221344241211424252112314
11121224211414112421111111111141332231511345133211131132431122111311121443131114
32541311112223211512414541111111121451211311111541
40322191211
31124311121411333143142111311111444331311315212111221131221115333113115111411312
15414112115544144211325324213211121151345211152412111114121535153531514152111425
32141123112335112111121111111151241243511135125221111111441121111322141111131114
35411111444313111112413531111111111341311411111511
Hi,

See ?read.fwf. Try:

    read.fwf("filename.ext", widths = c(3, rep(1, 298)), header = FALSE)

Blay

yyan liu wrote:
>
> Hi:
> I have a data file in the following format. The first three digits
> are the ID of a respondent, such as 402 or 403. Different respondents
> may have the same ID. Following the ID are 298 single-digit numbers ranging
> from 1 to 5. My question is how to read this data file into R. I tried
> "scan" and "read" but they do not work because the numbers in the file are
> not separated. Any suggestions?
> Thank you!
>
> zhong
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-----
Blay S
KATH
Kumasi, Ghana.
--
View this message in context: http://www.nabble.com/how-to-important-a-date-file-into-R-tp17715519p17716831.html
Sent from the R help mailing list archive at Nabble.com.
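As a minimal sketch of the read.fwf() call suggested above, the following runs it on a made-up two-line sample instead of the real file. The sample strings, the shortened record length (6 answers rather than 298) and the column names id, q1 ... q6 are assumptions for illustration only. It also assumes that each respondent occupies exactly one line; the printout in the original post appears to spread each record over several lines, in which case a plain vector of widths would not line up with the data.

```r
# Hypothetical miniature of the layout: a 3-digit ID followed by
# single-digit answers (6 here instead of 298), one respondent per line.
sample_lines <- c("402122212",
                  "403121912")

n_items <- 6  # would be 298 for the real file

# A text connection stands in for the data file in this sketch.
con <- textConnection(sample_lines)
answers <- read.fwf(con,
                    widths = c(3, rep(1, n_items)),
                    header = FALSE,
                    col.names = c("id", paste0("q", seq_len(n_items))))
close(con)

str(answers)
```

For the real file, the connection would be replaced by the file name and n_items set to 298.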
Try this. From your printout it seems that there are some extraneous spaces in the file, so first we read it in, remove the spaces and, just in case, remove any completely blank lines. Then we re-read it using read.fwf. Note that the widths= argument of read.fwf can be a list; here we specify the widths of the 5 successive lines of each record as 5 vectors:

    Lines <- readLines("mydata.dat")
    Lines <- gsub(" ", "", Lines)        # strip stray spaces
    Lines <- subset(Lines, Lines != "")  # drop completely blank lines
    ones <- function(n) rep(1, n)        # n fields of width 1
    mydata <- read.fwf(textConnection(Lines),
                       list(c(3, ones(8)), ones(80), ones(80), ones(80), ones(50)))

On Sun, Jun 8, 2008 at 12:09 AM, yyan liu <zhliur at yahoo.com> wrote:
> Hi:
> I have a data file in the following format. The first three digits are the ID of a respondent, such as 402 or 403. Different respondents may have the same ID. Following the ID are 298 single-digit numbers ranging from 1 to 5. My question is how to read this data file into R. I tried "scan" and "read" but they do not work because the numbers in the file are not separated. Any suggestions?
> Thank you!
>
> zhong
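To see how the list form of widths behaves, here is a small sketch on invented data. It is a scaled-down stand-in for the real file, not the real layout: each record spans two lines rather than five, with one line holding the ID plus two answers and the next holding four answers. The values, the stray spaces and the names raw_lines, clean and mini are all hypothetical.

```r
# Invented input: 2 records of 2 lines each, with stray spaces and a blank line.
raw_lines <- c("40212 ",
               "3451",
               "",
               "40321",
               " 1129")

# Same clean-up as above: drop spaces, then drop empty lines.
clean <- gsub(" ", "", raw_lines)
clean <- clean[clean != ""]

# One widths vector per line of a record:
#   line 1: ID (3 chars) + 2 single digits, line 2: 4 single digits.
con <- textConnection(clean)
mini <- read.fwf(con, widths = list(c(3, 1, 1), rep(1, 4)))
close(con)

print(mini)
```

Each group of two lines should come back as a single row with seven columns (the ID plus six answers), which is the shape wanted for the real file with one row per respondent and 299 columns.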
> Hi:
> I have a data file in the following format. The first three digits are the ID of a respondent, such as 402 or 403. Different respondents may have the same ID. Following the ID are 298 single-digit numbers ranging from 1 to 5. My question is how to read this data file into R. I tried "scan" and "read" but they do not work because the numbers in the file are not separated. Any suggestions?
> Thank you!

The answers provided to date (read.fwf()) look just fine. I thought I'd mention that you could always pre-process the data file in any text editor to insert commas or tabs between every number (other than that leading 3-digit number) and then use scan() or read.csv().

Carl
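The comma-inserting step could also be done in R itself rather than in a text editor. What follows is only a rough sketch of that idea, assuming the lines of each record have already been joined into one string per respondent. The sample strings (6 answers instead of 298) and the helper add_commas() are made up for illustration.

```r
# Hypothetical records, already joined to one string per respondent:
# a 3-digit ID followed by single-digit answers.
records <- c("402122212",
             "403121912")

# Keep the 3-digit ID whole, split the rest into single characters,
# and join everything with commas.
add_commas <- function(rec) {
  id   <- substr(rec, 1, 3)
  rest <- strsplit(substring(rec, 4), "")[[1]]
  paste(c(id, rest), collapse = ",")
}

csv_lines <- vapply(records, add_commas, character(1), USE.NAMES = FALSE)

# The comma-separated text can now be read like any CSV input.
dat <- read.csv(text = csv_lines, header = FALSE)

print(csv_lines)
print(dat)
```

Compared with read.fwf(), this trades a widths specification for a bit of string handling; which is more convenient probably depends on how regular the file is.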