thr3ads.net - similar to: "subset question"

Displaying 20 results from an estimated 40000 matches similar to: "subset question"

Need a variant of rbind for datasets with different numbers of columns

2007 Aug 22

Hello. I am looking for a function that will let me paste rows together regardless of the number of columns in the datasets being joined. The only columns that need to be aligned correctly are at the beginning; the rest of the columns represent differing numbers of ICD9 (disease) codes reported by each person (record) at a health visit. They are in no particular order.

weight median by count for multiple records

2009 Jul 30

Hello everyone, I have a .csv file with the following format:

uniqueID  SubjectID  Distance_miles  Tag
1         1001       5.5             3
2         1001       7               1
3         1001       6.5             1
4         1001       5               1
5         1002

simple coding question

2007 Jul 30

I have a list of ICD9 (disease) codes with various formats - 3 digit, 4 digit, 5 digit. The first three digits of these codes are what I am most interested in. I would like to either add zeros to the 3 and 4 digit codes to make them 5 digit codes or add decimal points to put them all in the format ###.##. I did not see a function that allows me to do this in the formatting command. This seems

sampling question

2007 Jun 28

I am interested in locating a script to implement a sampling scheme that would basically make it more likely that a particular observation is chosen based on a weight associated with the observation. I am trying to select a sample of ~30 census blocks from each ZIP code area based on the proportion of women in a ZCTA living in a particular block. I want to make it more likely that a block will

multidimensional scaling with long form data

2009 Feb 18

I have a dissimilarity dataset with the form: 1 1 dissimilarity value 1 2 ... 1 3 1 4 2 2 2 3 2 4 ... I would like to do nonmetric multidimensional scaling with this data, but I am having trouble using this format. I would like to either find a function that accepts this format or find a way to easily convert this format to a matrix for use with existing functions. Thanks!

dataframe subsetting

2003 Sep 10

I can create a small dataset, "x" below, and subset out rows based on values of a certain variable. However, on the dataset I'm working on now, "latdata" below, I get a subscript error. Any advice is appreciated! Ryan

Successful:
> is.data.frame(x)
[1] TRUE
> x
  X1 X2 X3
1  1  3  5
2  2  4  6
> x[x$X2 %in% c(3),]
  X1 X2 X3
1  1  3  5

Unsuccessful:
>

basic subset question of matrix

2012 Mar 31

Dear list, I would like to subset a large expression matrix based on rownames. That is, I have a list (as a txt-file) with gene names that matches some of the rows in my matrix. I've loaded my matrix as well as gene list using the read.table() command. myMatrix <- read.table("name_of_file.txt", header=T, row.names=1) list_to_keep <- read.table("name_of_file.txt",

Bug in list subset assignment due to NAMED optimization

2013 Jan 09

In R version 2.15.2 (2012-10-26) i386-apple-darwin9.8.0/i386 (32-bit) I get the following:

> a <- list(1)
> (a[[1]] <- a)
[[1]]
[[1]][[1]]
[1] 1

but

> a <- list(1)
> b <- a
> (a[[1]] <- a)
[[1]]
[1] 1

And similarly:

> a <- list(x=1)
> (a$x <- a)
$x
$x$x
[1] 1

but

> a <- list(x=1)
> b <- a
> (a$x <- a)
$x
[1] 1

In both cases the

sorting a data.frame using a vector

2004 Nov 26

Hi all, I'm looking for an efficient solution (speed and memory) for the following problem: Given - a data.frame x containing numbers of type double with nrow(x)>ncol(x) and unique row labels and - a character vector y containing a sorted order of labels. Now, I'd like to sort the rows of the data.frame x w.r.t. the order of labels in y. Example: x <- data.frame(c(1:4),c(5:8))

auth*.c

2001 Dec 26

Folks, During testing, we found a couple of issues with openssh3.0.2p1: 1. In userauth_finish() in auth2.c (as well as in do_authloop in auth1.c), the following check: if (authctxt->failures++ > AUTH_FAIL_MAX) is never satisfied and thus packet_disconnect() never gets called. I suspect the code just drops out of the dispatch_run function list instead. This should be an == instead of >.

multiple secondary axes

2009 Jan 14

Dear R experts, I want to plot a line chart with an additional secondary axis placed to the right of the standard secondary axis (the one available through the axis command), so that all the data lines appear in the same plot. Is there any way to do this in R? Many thanks, Kirsten.

Why is the diag function so slow (for extraction)?

2015 May 05

Looks like the c(x)[...] bit used to be as.matrix(x)[...]. Not sure why the change was made many years ago, but this was before names were handled explicitly. It would definitely be better to not force the duplicate, at least in the case where we are sure c() and [ would not dispatch. Best, luke On Mon, 4 May 2015, peter dalgaard wrote: > >> On 04 May 2015, at 19:59 , franknarf

use of "@" character in variable name

2009 Mar 27

Importing data with a header row using read.delim, one variable should be named @5HTT but it is automatically renamed to X.5HTT, presumably because the "@" is either unacceptable or misunderstood. I've tried to find out what the rules are on variable names but have been unsuccessful. I'll bet someone here can tell me where to look. Maybe it's hidden away in here

replace() error: new columns would leave holes after existing columns

2008 Oct 31

Hello, I have a problem with using replace() to convert a vector of dates from yyyy-mm-dd to julian date. For example, I type replace(x,2004-05-14,134) and I receive an error: Error in `[<-.data.frame`(`*tmp*`, list, value = 134) : new columns would leave holes after existing columns If I can successfully convert, I have a script that will convert all of the dates in

plot residuals per factor

2013 Jan 08

Dear R-users, I want to plot residuals vs fitted values for multiple groups with ggplot2. I tried this code, but without success. library("plyr") models<-dlply(dat1,"d",function(df) mod<-lm(y~x,data=df) ggplot(models,aes(.fitted,.resid), color=factor(d))+ geom_hline(yintercept=0,col="white",size=2)+ geom_point()+ geom_smooth(se=F) -- --- Catalin-Constantin ROIBU

error message re: max(i), but code and output seen O.K.

2009 May 20

I have a researcher who is consistently getting the warning message: In max(i) : no non-missing arguments to max; returning -Inf As best I can tell, the code is working properly and the output is as expected. I would like some help in understanding why he is getting this message and what its implications are. I have his code. Sincerely, Kirsten Miles Support Specialist Research Computing Lab

Ifelse statements and combining columns

2017 Jul 24

Hi everyone, I'm having some trouble with my ifelse statements. I'm trying to put 12 conditions within 3 groups. Here is the code I have so far: dat$cond <- ifelse(test = dat$cond == "cond1" | dat$cond == "cond2" | dat$cond == "cond3" dat$cond == "cond4" yes = "Uniform" no = ifelse(test =

Samba4, DHCP, & BIND DLZ

2012 Sep 20

Hello, I have recently compiled, installed and configured samba4 to run on a FreeBSD server. samba -V reports the version to be Version 4.1.0pre1-GIT-57990cb. The server has working BIND 9.9 and ISC-DHCP services running on it. I have provisioned samba 4 to use the BIND_DLZ DNS backend. On the whole things seem to be working. Local names are being resolved. phpLDAPAdmin shows the new

How to make a figure plotting p-values by range of different adjustment values?

2017 Jul 13

Hi Jim, Thanks for your help, I really appreciate it. Perhaps I'm misunderstanding, but does this formula run different adjustment values for this function? logit(p = doc$value, adjust = 0.025) I'm looking to plot the p-values of different adjustment values. Thanks so much, Kirsten On Wed, Jul 12, 2017 at 8:49 PM, Jim Lemon <drjimlemon at gmail.com> wrote: > Hi Kirsten,

Conditional model in R

2012 Nov 28

Hello all, I have a data set where the response variable is the percent cover of a specific plant (represented in cover classes 0,1,2,3,4,5, or 6). This data set has a lot of zeros (plots where the plant was not present). I am trying to model cover class of the plant as a function of both total nitrogen and shrub cover. After quite a bit of research I have come across a conditional approach
