thr3ads.net - search: "dsheuman"

Displaying 10 results from an estimated 10 matches for "dsheuman".

Removing leading and trailing spaces (string manipulation)

2004 Mar 31

Hi all, I'm running the following code to generate 40 different jpegs based on the resulting data. I'd like the file names to be 'Cluster1.jpeg'; however, the code writes file names like 'Cluster 1 .jpeg'. How can I get rid of the unwanted spaces? I've looked at ?format and it doesn't seem to work - at least in this context. ################### ClusCount <- 40
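The poster's code is cut off in this snippet, so the cause can only be guessed. A file name like 'Cluster 1 .jpeg' is what paste() produces with its default separator (a single space), so a minimal sketch under that assumption is shown here. Only `ClusCount` comes from the post; the other names are hypothetical.

```r
# ClusCount is from the post; everything else here is illustrative.
ClusCount <- 40

# paste() joins its arguments with a space by default; sep = "" removes it.
with_spaces <- paste("Cluster", 1, ".jpeg")
file_names  <- paste("Cluster", seq_len(ClusCount), ".jpeg", sep = "")

# sprintf() builds the same names from a format string.
file_names_sprintf <- sprintf("Cluster%d.jpeg", seq_len(ClusCount))

print(with_spaces)
print(head(file_names, 3))
```

Each element of `file_names` could then be passed as the file name when opening the jpeg device inside the loop.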

Re: Re: Find Closest 5 Cases?

2004 Feb 13

Art (and group), I'm doing this as a form of missing value analysis. Approximately 30% of the cases are missing data for one variable. To impute values for those cases, I'd like to match those cases that are missing the variable to all other cases and then take an average of those to infill. I realize there are many methods for imputing data. I'm not well versed on any in

Calculate Distance and Aggregate Data?

2004 Feb 24

Hi all, I've been struggling to learn R and need to turn to the list again. I've got a dataset (comma-delimited file) with the following fields: recid, latitude, longitude, population, dwelling and age. For each observation, I'd like to calculate the total number of people and dwellings and the average age within 2 km. Distance could be Euclidean, however, a proper distance
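The question is truncated, so the following is only a rough sketch of one way such an aggregation could be done. It assumes great-circle (haversine) distance on a sphere of radius 6371 km, counts each observation as part of its own neighbourhood, and takes an unweighted mean of age. The column names follow the post; the data values and function names are made up for illustration.

```r
# Great-circle distance in km between one point and a vector of points.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Toy data: three records close together and one far away.
dat <- data.frame(
  recid      = 1:4,
  latitude   = c(43.650, 43.655, 43.660, 44.000),
  longitude  = c(-79.380, -79.385, -79.390, -80.000),
  population = c(100, 200, 300, 400),
  dwelling   = c(40, 80, 120, 160),
  age        = c(30, 40, 50, 60)
)

# Totals and mean age over all records within max_km of record k
# (record k itself is included, which is an assumption).
summarise_within <- function(k, dat, max_km = 2) {
  d <- haversine_km(dat$latitude[k], dat$longitude[k],
                    dat$latitude, dat$longitude)
  near <- d <= max_km
  data.frame(
    recid      = dat$recid[k],
    population = sum(dat$population[near]),
    dwelling   = sum(dat$dwelling[near]),
    age        = mean(dat$age[near])
  )
}

result <- do.call(rbind, lapply(seq_len(nrow(dat)), summarise_within, dat = dat))
print(result)
```

Computing one row of distances at a time keeps memory use proportional to the number of records rather than its square, at the cost of a loop over records.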

Calculate Closest 5 Cases?

2004 Feb 13

I've only begun investigating R as a substitute for SPSS. I have a need to identify for each CASE the closest (or most similar) 5 other CASES (not including itself as it is automatically the closest). I have a fairly large matrix (50000 cases by 50 vars). In SPSS, I can use Correlate > Distances to generate a matrix of similarity, but only on a small sample. The entire matrix can not

Distance and Aggregate Data - Again...

2004 Feb 26

I appreciate the help I've been given so far. The issue I face is that the data I'm working with has 53000 rows, so calculating distance, finding all recids that fall within 2 km and summing the population, etc. a) takes too long and b) gives me no sense of progress. Below is a loop that reads each recid one at a time, calculates the distance and identifies the recids that fall within 2

How to improve this code?

2004 Apr 04

Hi all, I've got some functioning code that I've literally taken hours to write. My 'R' coding is getting better...it used to take days :) I know I've done a poor job of optimizing the code. In addition, I'm missing an important step and don't know where to put it. So, three questions: 1) I'd like the resulting output to be sorted on distance (ascending) and

Speed up graphics output?

2004 May 03

Hi all, I've written some code to generate 4 maps per screen and write the output to a jpeg. The output is fairly quick at the start (about 5 jpegs per minute) and then slows down greatly (1-2 jpegs per minute). Is there some way to speed it up? One of my thoughts is to keep the base map static on the screen and just update the points that are being plotted on the map (with the exception

Cluster Analysis with minimum cluster size?

2004 Mar 27

Hi all, Is it possible to run kmeans, pam or clara with a constraint such that no resulting cluster has fewer than X cases? These kmeans algorithms often find clusters that are too small for my use. There are usually a few clusters with 1-10 cases (generally substantial outliers). I then have to manually assign the small ones to other sizable clusters. If this doesn't exist, is there such

Calculating sum of squares deviation between 2 similar matrices

2004 Jul 13

Hi all, I've got clusters and would like to match individual records to each cluster based on a sum of squares deviation. For each cluster and individual, I've got 50 variables to use (measured in the same way). Matrix 1 is individuals and is 25000x50. Matrix 2 is the cluster centroids and is 100x50. The same variables are found in each matrix in the same order. I'd like to
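The snippet ends before the poster states the exact goal, so the sketch below assumes it is to assign each individual to the centroid with the smallest sum of squared deviations. It uses small random stand-ins for the 25000x50 and 100x50 matrices, and all object names are hypothetical. The full deviation matrix is obtained from the identity ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2, which avoids an explicit double loop.

```r
set.seed(1)
individuals <- matrix(rnorm(20 * 5), nrow = 20)  # stand-in for 25000 x 50
centroids   <- matrix(rnorm(3 * 5),  nrow = 3)   # stand-in for 100 x 50

# ssd[i, k] = sum((individuals[i, ] - centroids[k, ])^2)
ssd <- outer(rowSums(individuals^2), rowSums(centroids^2), "+") -
  2 * individuals %*% t(centroids)

# Index of the best-matching centroid for each individual.
nearest <- apply(ssd, 1, which.min)

print(dim(ssd))
print(table(nearest))
```

At the sizes given in the post, `ssd` would hold 25000 x 100 doubles (about 20 MB), so the whole matrix should fit in memory on most machines.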

Sparse Matrices in R

2004 Aug 31

I have data in i,j,r format, where r is the value in location A[i,j] for some imaginary matrix A. I need to build this matrix A, but given the sizes of i and j, I believe that using a sparse format would be most appropriate. Hopefully this will allow me to perform some basic matrix manipulation such as multiplication, addition, rowsums, transpositions, subsetting, etc. Is there any way
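As an illustration only, present-day versions of the Matrix package (a recommended package distributed with R) provide sparseMatrix(), which builds a sparse matrix directly from triplets. Whether this was available or suggested at the time of the post is not known from this snippet. The triplet values and dimensions below are made up.

```r
library(Matrix)

# Toy triplets: r[k] is the value at A[i[k], j[k]].
i <- c(1, 3, 4)
j <- c(2, 2, 5)
r <- c(10, 20, 30)

# Only the non-zero entries are stored.
A <- sparseMatrix(i = i, j = j, x = r, dims = c(4, 5))

# The operations listed in the post work on the sparse object.
row_totals <- rowSums(A)    # row sums
At         <- t(A)          # transposition
AAt        <- A %*% t(A)    # multiplication
A2         <- A + A         # addition
top_rows   <- A[1:2, ]      # subsetting

print(A)
print(row_totals)
```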
