thr3ads.net - similar to: "How to do indexing after splitting my data-frame?"

Displaying 20 results from an estimated 80 matches similar to: "How to do indexing after splitting my data-frame?"

2008 Dec 20

NA, where no NA should (could!) be!

Hello, again I'm on my weblog-script... having problems... This code: =========================== weblog <- read_weblog("web.log") weblog_by_date <- split(weblog, weblog$date) #for ( i in names(weblog_by_day) ) { print(i); print(weblog_by_day$i) } for ( datum in names(weblog_by_date) ) { print(datum) selected <- weblog_by_date[[datum]] res_size_by_host <-

2009 Sep 09

ggplot2: mixing colour and linetype in geom_line

Hi all, I try to represent a multiple curve graphic where the x-axis is the temperature and the different y-axes are the different X (X22,X43,X44...) some X corresponds to the same molecule (22 and 44 are for CO2 for instance) so I use the same colour for them. I wanna mix the linetype with the colour to be able to visually see the difference between X43 and X45. The best I have done up to now

2003 Apr 08

use of variable labels

The R documentation for some of the foreign package's functions says that the set of variable labels becomes attributes in the resulting data frame. Thus, e.g., 5="strongly agree", 4="agree", etc. I'm happy that the labels are being passed, but unfortunately, when R summarizes the data, it will list it only as categories, and doesn't deal with the

2006 Aug 22

summary(lm ... conrasts=...)

Hi Folks, I've encountered something I hadn't been consciously aware of previously, and I'm wondering what the explanation might be. In (on another list) using R to demonstrate the difference between different contrasts in 'lm' I set up an example where Y is sampled from three different normal distributions according to the levels ("A","B","C")

2003 Feb 06

signif {base}: changes to scientific notation

PROBLEM 'signif' does change to scientific notation at different levels depending on the number of significant digits in the input. This can generate tables where figures change "irregularly" from normal to scientific notation. PROPOSAL The change to the scientific notation should be made only if the figure in scientific notation - with potentially as

2024 Apr 16

read.csv

Hum... This boils down to > as.numeric("1.23e") [1] 1.23 > as.numeric("1.23e-") [1] 1.23 > as.numeric("1.23e+") [1] 1.23 which in turn comes from this code in src/main/util.c (function R_strtod) if (*p == 'e' || *p == 'E') { int expsign = 1; switch(*++p) { case '-': expsign = -1; case

2024 Apr 16

read.csv

Dear R-developers, I came across a somewhat unexpected behaviour of read.csv() which is trivial but worthwhile to note -- my data involves a protein named "1433E" but to save space I drop the quote so it becomes, Gene,SNP,prot,log10p YWHAE,13:62129097_C_T,1433E,7.35 YWHAE,4:72617557_T_TA,1433E,7.73 Both read.csv() and readr::read_csv() consider prot(ein) name as (possibly confused by

2003 Jul 24

pls regression - optimal number of LVs

Dear R-helpers, I have performed a PLS regression with the mvr function from the pls.pcr package and I have 2 questions: 1- do you know if mvr automatically centers the data? It seems to me that it does so... 2- why in the situation below does the output say that the optimal number of latent variables is 4? In my humble opinion, it is 2 because the RMS increases and the R2 decreases when 3 LVs

2012 Dec 06

Assignment of values with different indexes

I would like to take the values of observations and map them to a new index. I am not sure how to accomplish this. The result would look like so: x[1,2,3,4,5,6,7,8,9,10] becomes y[2,4,6,8,10,12,14,16,18,20] The "newindex" would not necessarily be this sequence, but a sequence I have stored in a vector, so it could be all kinds of values. here is what happens: > x <- rnorm(10)

2013 Mar 14

Modifying a data frame based on a vector that contains column numbers

Hello! # I have a data frame: mydf<-data.frame(c1=rep(NA,5),c2=rep(NA,5),c3=rep(NA,5)) # I have an index whose length is always the same as nrow(mydf): myindex<-c(1,2,3,2,1) # I need c1 to have 1s in rows 1 and 5 (based on the information in myindex) # I need c2 to have 1s in rows 2 and 4 (also based on myindex) # I need c3 to have 1 in row 3 # In other words, I am trying to achieve this

2003 Jun 09

estimate the number of clusters

Dear All, I am using Silhouette to estimate the number of clusters in a microarray dataset. Initially, I used the iris data to test my piece of code as follows: library(cluster) data(iris) mydata<-iris[,1:4] maxk<-15 # at most 15 clusters myindex<-rep(0,maxk) # hold the si values for each k clusters mdist<-1-cor(t(mydata)) #dissimilarity

2010 Oct 25

Find index of a string inside a string?

Hi, I am searching for the equivalent of the function Index from SAS. In SAS: index("abcd", "bcd") will return 2 because bcd is located in the 2nd cell of the abcd string. The equivalent in R should do this: > myIndex <- foo("abcd", "bcd") #return 2. What is the function that I am looking for? I want to use the return value in substr, like I do

2012 Oct 08

turn list into dataframe

Dear R users, I'm starting to use 'apply' functions rather than for loops in R, and sometimes the output is a bit different than what I want. In this case, the command was tapply(myvector,myindex,cumsum) And the output was something like this: $`SNRL1 Core 120` [1] 2.8546 4.0778 5.2983 6.3863 7.5141 8.5498 9.5839 10.6933 $`SNRL1 Core 230` [1] 7.6810 8.7648 9.8382

2005 May 17

how to use list index to get vector

I have a simple question, but I couldn't find the answer in R manuals. Assume I have a list: > foo <- list() > foo[[1]] <- c(1, 2, 3) > foo[[2]] <- c(11,22,33) > foo[[3]] <- c(111,222,333) > foo [[1]] [1] 1 2 3 [[2]] [1] 11 22 33 [[3]] [1] 111 222 333 How to use list index to get a vector of, say, the first elements of list elements? That is, how to get a vector

2007 Apr 28

Determine how many documents a term occurs in

Is there a fast way to determine how many documents a term occurs in, besides iterating through every document with TermDocEnum? -- Best regards, Stian Gryt?yr

2006 Sep 05

No matches

The following script creates a search index and then searches it. I get no results? Where am I going wrong? Thanks. -----------BEGIN SCRIPT---------------- require 'rubygems' require 'ferret' include Ferret path = '/tmp/myindex' field_infos = Ferret::Index::FieldInfos.new() field_infos.add_field(:name, :store => :yes, :index => :yes)

2002 Mar 21

repeating rows or columns within a matrix

Hello. Suppose I have a matrix, say 1 2 3 4 5 6 7 8 9, and I would like to expand it by repeating rows within the matrix, to get, if the repeating factor is 2, say: 123 123 456 456 789 789 (or columnwise as well). There must be a smart way of doing that? Many thanks Juhana Vartiainen juhana.vartiainen at labour.fi

2007 Aug 16

time series with quality codes

list(...), I am working with environmental time series (eg rainfall, stream flow) that have attached quality codes for each data point. The quality codes have just a few factor levels, like "good", "suspect", "poor", "imputed". I use the quality codes in plots and summaries. They are carried through when a time series is aggregated to a longer time-step,

2007 May 10

Segmentation fault on large index

I'm getting a segmentation fault on a large index (15GB). I'm running ferret 0.11.4 on OpenSuSE 10.2 with ruby 1.8.6. The segmentation fault appeared after I optimized the index, see further below for the error message I got before that. Ferret works perfectly on other (smaller) indexes. Is this a known issue, and if so, is there a workaround? --------------------- after

2017 Dec 06

[Gluster-devel] Crash in glusterd!!!

Hi Atin, Please find the backtrace and logs files attached here. Also, below are the BT from core. (gdb) bt #0 0x00003fff8834b898 in __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:55 #1 0x00003fff88350fd0 in __GI_abort () at abort.c:89 [**ALERT: The abort() might not be exactly invoked from the following function line. If the trail function
