Christopher Wills
2003-Oct-06 23:06 UTC
[R] randomizing within factors and combining unequal-sized arrays
Dear R folks - Sorry to be so dense, but can you help me with two programming problems?

1. Randomizing within factors.

I have factored the values in an array according to a series of ranked breakpoints. Now I want to randomize, without replacement, the values that fall within each pair of breakpoint boundaries. I want to end up with an array in which the values within each factored category have been shuffled, but in which the factored category of each position is unchanged. For example, starting with the array 10,11,12,20,21,22, which has been factored into two groups (values between 10 and 19, and values between 20 and 29), I want to end up with an array that might look like this: 11,10,12,22,21,20

I can generate the randomized values using a statement of the form: randvalues = tapply(values,grouped.ranks,sample). How do I get these randomized values back into the original array, so that each factored subsample of the array has its values randomized within that subsample? I do not want values in one factored category to be replaced by values from another category.

2. Combining unequal-sized arrays.

I have a series of arrays of different lengths in a dataframe. One column of each array holds values between 1 and 5,000, and each of these values denotes the position of that row of data in a larger data set that has 5,000 positions. Each row of the arrays carries other information about the individual that has that positional value. I would like to take each array from the dataframe in turn and "paste" it into the larger data set, performing various arithmetic operations in the process.

At the moment I can only do this through a series of loops, in which I go through the shorter arrays position by position, do the arithmetic operations on each row in turn, and then add the results to the values at that position in the larger array. The loops, of course, are very slow. Is there some clever way to do this quickly? The major difficulty is that the shorter arrays often have several sets of values with the same position - that is, there may be two or more sets of values that have position 200.

Thanks in advance for your help!

Chris Wills
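
For reference, here is a minimal sketch of one way both problems might be handled in base R. It is only an illustration under stated assumptions: the names shuffle, big, small, pos, val and contrib are hypothetical, the breakpoints 10, 20, 30 are taken from the example in the question, and doubling val stands in for whatever arithmetic is actually needed. Part 1 uses ave(), which applies a function within each group and returns the results in their original positions. Part 2 uses rowsum() to add up contributions that share a position before a single vectorized update.

```r
## Part 1: shuffle values within each factor level, keeping positions per level.
values <- c(10, 11, 12, 20, 21, 22)
grouped.ranks <- cut(values, breaks = c(10, 20, 30), right = FALSE)

# Index-based shuffle: sample(x) on a single number x draws from 1:x,
# so a group containing one value would otherwise give surprising results.
shuffle <- function(x) x[sample.int(length(x))]

# ave() splits by group, applies shuffle, and puts results back in place.
randomized <- ave(values, grouped.ranks, FUN = shuffle)

## Part 2: add rows with possibly duplicated positions into a long vector.
big <- numeric(5000)                      # hypothetical 5,000-position target
small <- data.frame(pos = c(200, 200, 17, 4999),
                    val = c(1.5, 2.5, 3, 4))

contrib <- small$val * 2                  # placeholder for the real arithmetic

# rowsum() sums contrib within each distinct position; row names are positions.
sums <- rowsum(contrib, small$pos)
idx  <- as.integer(rownames(sums))

# Each position now appears once, so the indexed update is safe.
big[idx] <- big[idx] + sums[, 1]
```

The pre-summing step matters because an indexed assignment such as big[pos] <- big[pos] + val keeps only one contribution when pos contains duplicates. If there are several short arrays, one option is to stack them with rbind() first and run the rowsum() step once.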