2008 Dec 21
Globbing Files in R
Dear all, For example, I want to process a set of files. The typical Perl idiom would be:
__BEGIN__
@files = glob("/mydir/*.txt");
foreach my $file (@files) {
    # process the file
}
__END__
What's the R way to do that?
- Gundala Viswanath Jakarta - Indonesia
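One possible approach to the question above, offered as a minimal sketch rather than a definitive answer: base R's `Sys.glob()` expands shell-style wildcards much like Perl's `glob`, and `list.files()` does the same job with a regular expression. The temporary directory and the file names below are made up for illustration.

```r
# Build a throwaway directory with a few files (names are hypothetical)
demo_dir <- file.path(tempdir(), "glob_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("a", file.path(demo_dir, "one.txt"))
writeLines("b", file.path(demo_dir, "two.txt"))
writeLines("c", file.path(demo_dir, "skip.csv"))

# Sys.glob() expands wildcards, similar to Perl's glob()
files <- Sys.glob(file.path(demo_dir, "*.txt"))

# list.files() is an alternative that filters by regular expression
files_re <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

for (file in files) {
  lines <- readLines(file)  # process the file here
  cat(basename(file), ":", length(lines), "line(s)\n")
}
```

For the path in the question, the call would presumably be `Sys.glob("/mydir/*.txt")`.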
2008 Jun 16
Superimposing Line over Histogram in Density Plot
Hi, Currently I have a density plot generated with this snippet. Is there a way I can add a line curve on top of it, in the same figure?
__BEGIN__
myhist <- hist(x, col = "blue", main = "Density Plot", xlab = "Exp Level")
__END__
- Gundala Viswanath Jakarta - Indonesia
2008 Jun 11
Finding Coordinate of Max/Min Value in a Data Frame
Hi, Suppose I have the following data frame. __BEGIN__ > library(MASS) > data(crabs) > crab.pca <- prcomp(crabs[,4:8],retx=TRUE) > crab.pca$rotation PC1 PC2 PC3 PC4 PC5 FL 0.2889810 0.3232500 -0.5071698 0.7342907 0.1248816 RW 0.1972824 0.8647159 0.4141356 -0.1483092 -0.1408623 CL 0.5993986 -0.19...
2008 Aug 05
Iterating Named List
...-1063.893 -1062.815 -1062.121 -1059.004 $`200071_at` [1] -959.823 -953.980 -953.886 -948.781 -974.890 $`200084_at` [1] -1135.804 -1132.863 -1128.197 -1128.633 -1125.890 What I want to do is iterate over this named list and process its members. To do that I tried the following code, which failed:
__BEGIN__
ny <- names(y)
for (i in ny) {
  val <- paste("`", i, "`", sep = "")
  print(y$val)  # later we want to process y$val
}
__END__
However, the print call gives "NULL". What's wrong with my code above?
- Gundala Viswanath Jakarta - Indonesia
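A likely explanation for the NULL above, sketched under the assumption that `y` is an ordinary named list: `y$val` looks for an element literally called `val`, not for the element whose name is stored in the variable `val`. Indexing with `[[` accepts a character string, so no backtick quoting is needed. The list below reuses only the two complete elements visible in the excerpt.

```r
# Rebuild the two complete elements shown in the excerpt
y <- list(
  `200071_at` = c(-959.823, -953.980, -953.886, -948.781, -974.890),
  `200084_at` = c(-1135.804, -1132.863, -1128.197, -1128.633, -1125.890)
)

# [[ ]] takes the name as a string; y$val would search for an element named "val"
for (name in names(y)) {
  val <- y[[name]]
  cat(name, "mean:", mean(val), "\n")
}

# The same iteration without an explicit loop
means <- sapply(y, mean)
```

Using the mean here is only a stand-in for whatever processing each member needs.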
2009 Jan 08
Faster Printing Alternatives to 'cat'
Dear all, I found that printing with 'cat' is very slow. For example, on my machine this snippet
__BEGIN__
# I need to resort to using this type of loop,
# because using write(), I need to create a matrix which
# consumes so much memory. Note that "foo, bar, qux" object
# is already very large (>2Gb)
for ( s in 1:length(x) ) {
cat(as.character(foo[s]),"\t",bar[s],"\t"...
2009 Jan 09
Pack and Unpack Strings in R
...function than Perl. Yet the data I need to process is so large that I have to compress it into smaller units -> process them -> finally recover them back into strings with new information. In Perl the implementation would look like the code below. I wonder how this can be implemented in R. __BEGIN__ my %charmap = ( A => '00', C => '01', G => '10', T => '11', ); my %digmap = ( '00' => "A", '01' => "C", '10' => "G", '11' => "T"...
2008 Jun 23
Pairwise Partitioning of a Vector
....9 part2 = 60.1 70.0 73.0 75.0 83.9 93.1 97.6 98.8 113.9 PAIR2 part1 = 30.9 60.1 part2 = 70.0 73.0 75.0 83.9 93.1 97.6 98.8 113.9 .... PAIR9 part1 = 30.9 60.1 70.0 73.0 75.0 83.9 93.1 97.6 98.8 part2 = 113.9 I'm stuck with this kind of loop: __BEGIN__ # gexp is a Vector process_two_partition <- function(gexp) { sort.gexp <- sort(as.matrix(gexp)) print(sort.gexp) for (posb in 1:ncol(gexp)) { for (pose in 1:ncol(gexp)) { sp_b <- pose+1 sp_e <- ncol(gexp) # This two doesn't do w...
2008 Aug 01
Extract Element of String with R's Regex
Hi, I have this string, from which I want to extract some of its elements:
> x <- "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35"
yielding this array
[1] "211952_at" "RANBP5" "2"
In Perl we would do it this way:
__BEGIN__
my @needed = ();
my $str = "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35";
$str =~ /Best-K Gene \d+ (\w+) (\w+) Noc= \d - (\d) LL= (.*)/;
push @needed, ($1,$2,$3);
__END__
How can we achieve this with R?
- E.W.
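One way this might be done in base R, as a sketch rather than the only answer: `regexec()` records the positions of the capture groups and `regmatches()` extracts them, which is roughly what `$1`, `$2`, `$3` give in Perl. The pattern is the Perl one from the question with backslashes doubled for an R string; the variable names `patt`, `m` and `needed` are illustrative.

```r
x <- "Best-K Gene 11340 211952_at RANBP5 Noc= 3 - 2 LL= -963.669 -965.35"

# Same pattern as the Perl version, with backslashes escaped for R
patt <- "Best-K Gene \\d+ (\\w+) (\\w+) Noc= \\d - (\\d) LL= (.*)"

# Element 1 is the full match; elements 2..5 are the four capture groups
m <- regmatches(x, regexec(patt, x))[[1]]

# Keep the first three groups, like push @needed, ($1,$2,$3)
needed <- m[2:4]
print(needed)
```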
2008 May 23
About Passing Arguments to Function
Hi, Below I have a function mlogl_k, later it's called with "nlm" . __BEGIN__ vsamples<- c(14.7, 18.8, 14, 15.9, 9.7, 12.8) mlogl_k <- function( k_func, x_func, theta_func, samp) { tot_mll <- 0 for (comp in 1:k_func) { curr_mll <- (- sum(dgamma(samp, shape = x_func, scale=theta_func, log = TRUE))) tot_mll <- tot_mll + curr_mll }...
2008 Jun 19
Create Matrix from Loop of Vectors, Sort It and Pick Top-K
...or each row, b) append variance with its original row in a vector, c) store a vector into multidimensional array (matrix), d) sort that array. But I am stuck at the step (b). Can anybody suggest what's the best way to achieve my aim above? This is the sample code I have so far (not working). __BEGIN__ #data <- read.table("testdata.txt") # Is this a right way to initialize? all.arr = NULL for (gi in 1:nofrow) { gex <- as.vector(data.matrix(data[gi,],rownames.force=FALSE)) #compute variance gexvar <- var(gex) # join variance with its original vector nvec <...
2008 Jun 10
Concat Multiple Plots into one PNG figure
Dear experts, I tried to put the two plots into one final PNG figure with the following script. However, instead of giving two plots in one figure, it gives only the last plot. What's wrong with my script below? __BEGIN__ in_fname <- paste("mydata.txt.",sep="") out_fname <- paste("finalplot.png",sep="") dat <- read.table(in_fname, comment.char = "!" , na.strings = "null"); dat.pca <- prcomp(dat[,1:ncol(dat)], retx=TRUE, scores=TRUE)
2008 Jun 13
Regex for Special Characters under Grep
Hi all, I am trying to capture lines of a file that DO NOT start with one of the following header characters: !, #, ^ But somehow my regex used under grep doesn't work. Please advise what's wrong with my code below.
__BEGIN__
in_fname <- paste("mydata.txt", ".soft", sep="")
data_for_R <- paste("data_for_R/", args[3], ".softR", sep="")
# my regex construction
cat(temp[-grep("^[\^\!\#]",temp,perl=TRUE)], file=data_for_R, sep="\n")
dat <- r...
2008 May 22
Computing Maximum Loglikelihood With "nlm" Problem
Hi, I tried to compute maximum likelihood under gamma distribution, using nlm function. The code is this: __BEGIN__ vsamples<- c(103.9, 88.5, 242.9, 206.6, 175.7, 164.4) mlogl <- function(alpha, x) { if (length(alpha) > 1) stop("alpha must be scalar") if (alpha <= 0) stop("alpha must be positive") return(- sum(dgamma(x, shape = alpha, log = TRUE))) } mlogl_out <...
2008 May 26
Joining Histograms Into a Figure
Hi, I have two histograms created separately using the following code. It creates two separate figures. __BEGIN__ dat <- read.table(file="GDS1096.modelout", head = FALSE ) hist(dat$V2, main="AIC Freq", xlab = "\# Component", breaks = 36, xlim = c(0,max(dat$V2)), col = "dark red", freq = TRUE) hist(dat$V3, main="BIC Freq", xlab = "\# Component", br...
2008 Jul 21
Howto Restart A Function with Try-Error Catch
...n. To avoid the problem, what I intend to do is the following: 1. Catch the try-error using class. 2. Redo the function if it returns "try-error" 3. Otherwise keep the output of the function. I'm not sure how to create the above construct. The code I have below doesn't work: __BEGIN__ myfunction <- function(the_x) { # do something a = list(output1=val1, output2 = val2) a } out <- try(suppressWarnings(myfunction(x)),silent=T) if (class(out) == "try-error") { #this clause doesn't seem to "redo"...
2009 Jan 05
Process File Line By Line Without Slurping into Object
Dear all, In general practice one would slurp the whole file using this method before processing the data: dat <- read.table(filename) or variations of it. Is there a way we can access the file line by line without slurping/storing it into an object? I am thinking of something like this in Perl:
__BEGIN__
open INFILE, '<' , 'filename.txt' or die $!;
while (<INFILE>) {
    my $line = $_;
    # then process line by line
}
__END__
The reason I want to do that is that the data I am processing are large (~ 5GB), and my PC may not be able to handle that.
- Gundala Viswanath Jakar...
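A rough R counterpart to the Perl loop above, as a sketch under the assumption that the input is plain text: open a connection with `file()` and pull one line per iteration with `readLines(con, n = 1)`, so only the current line is held in memory. The temporary file and its contents are made up for the example.

```r
# Create a small stand-in file (contents are hypothetical)
infile <- tempfile(fileext = ".txt")
writeLines(c("line one", "line two", "line three"), infile)

con <- file(infile, open = "r")
n_lines <- 0
# readLines() returns a zero-length vector once the end of the file is reached
while (length(line <- readLines(con, n = 1, warn = FALSE)) > 0) {
  n_lines <- n_lines + 1  # process 'line' here
}
close(con)

cat("processed", n_lines, "lines\n")
```

Reading one line at a time is usually slow in R; raising `n` to read a block of lines per call tends to be a better trade-off for a file of several gigabytes.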
2008 Jun 12
Data.matrix fail to convert data.frame into matrix
...Select n-genes by random sample # n = 1 nosamp <- 1 geneid <- sequence(nrow(dat)) geneid.samp <- sample(geneid,nosamp) geneid.samp gexp<- dat[geneid.samp,] gexp.arr <- data.matrix(gexp, rownames.force = NA) print(is.matrix(gexp.arr)) print(gexp.arr) __END__ Yielding this output: __BEGIN__ > print(is.matrix(gexp.arr)) [1] TRUE > print(gexp.arr) V1 V2 V3 V4 V5 V6 V7 V8 10354 803.1 1107.8 431.6 349.8 386.7 646.3 744.2 620.9 __END__ I expect "gexp.arr" to be a plain vector (numeric). What's wrong with my code above? -- Gunda...
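A possible reading of the output above, as a sketch: `data.matrix()` always returns a matrix, so a one-row data frame becomes a 1 x n matrix rather than a vector. Dropping the dimensions afterwards gives the plain numeric vector. The small data frame below is a stand-in: its first row reuses three values from the printed output, and the second row is made up.

```r
# Stand-in for 'dat' (second row is hypothetical)
dat <- data.frame(V1 = c(803.1, 10.5), V2 = c(1107.8, 20.5), V3 = c(431.6, 30.5))

gexp <- dat[1, ]                  # one-row data frame
gexp.arr <- data.matrix(gexp)     # 1 x 3 matrix, as in the question

gexp.vec <- as.numeric(gexp.arr)  # plain numeric vector, dim and names dropped
gexp.named <- unlist(gexp)        # alternative that keeps the column names

print(is.matrix(gexp.arr))
print(gexp.vec)
```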
2008 Aug 05
About Creating a List by Parsing Text
...LL= -812.083 " __END__ I expect to get this kind of data structure: > wanted_output [['211952_at']] $ll.list [1] -970.692 -965.35 -963.669 [['213301_x_at']] $ll.list [1] -948.527 -947.275 -947.379 etc. How can I achieve that? I am stuck with the following construct __BEGIN__ comp.ll <- model_all[grep("Gene .* k=.*", model_all)] print(comp.ll) patt <- "Gene \\d+ ([\\w-/]+) [\\w-]+ k= (\\d) LL= ([-]\\d+\.\\d+)" nresk <- unlist(strsplit(sub(patt, "\\1 \\2 \\3",comp.ll,perl=TRUE)," ")) __END__ - Gundala Viswanath Jakar...
2013 Dec 07
How to perform clustering without removing rows where NA is present in R
I have data in which some elements are NA. What I want to do is to **perform clustering without removing rows** where an NA is present. I understand that the `gower` distance measure in `daisy` allows for this. But why doesn't my code below work? __BEGIN__ # plot heat map with dendrogram together. library("gplots") library("cluster") # Arbitrarily assigning NA to some elements mtcars[2,2] <- "NA" mtcars[6,7] <- "NA" mydata <- mtcars hclustfunc <- function(x) hcl...
2009 Sep 02
Howto Superimpose Multiple Density Curves Into One Plot
I have data that looks like this: And I intend to draw multiple density curves in one plot, where each curve corresponds to a unique ID. I tried the "sm" package with this code, but without success.
__BEGIN__
library(sm)
dat <- read.table("mydat.txt");
plotfn <- ("~/Desktop/flowgram_superimposed.pdf");
pdf(plotfn);
sm.density.compare(dat$V1, dat$V2, xlab = "Flow Signal")
colfill <- c(2:10);
legend(locator(1), levels(dat$V2), fill=colfill);
__END__
Pleas...