Hi all:

I'm struggling with getting my data re-formatted using functions in
reshape/reshape2 to get from:

1957 0.862500000
1958 0.750000000
1959 0.300000000
1960 0.287500000
1963 0.675000000
1964 0.937500000
1965 0.025000000
1966 0.387500000
1969 0.087500000
1970 0.275000000
1973 0.500000000
1974 0.362500000
1976 0.925000000
1978 0.712500000
1979 0.337500000
1980 0.700000000
1981 0.425000000
1982 0.212500000
1983 0.312500000
1986 0.237500000
1958 0.643564356
1963 0.250000000
1968 0.211538462
1976 0.317307692
1981 0.673076923
1985 0.730769231
1986 0.057692308
1957 0.073394495
1966 0.742574257
1961 0.082568807
1964 0.165137615
1965 0.137614679
1959 0.128712871
1968 0.587155963
1969 0.660550459
1970 0.477064220
1971 0.513761468
1973 0.449541284
1974 0.128440367
1968 0.415841584
1977 0.009174312
1979 0.339449541
1981 0.596330275
1982 0.348623853
1984 0.146788991
1986 0.651376147
1959 0.451923077
1965 0.750000000
1962 0.326732673
1964 0.782178218
1970 0.336538462
1975 0.277227723
1978 0.712871287
1957 0.509615385
1960 0.490384615
1961 0.721153846
1966 0.298076923
1969 0.413461538
1971 0.500000000
1972 0.692307692
1974 0.653846154
1984 0.049504950
1978 0.442307692
1973 0.079207921
1983 0.355769231
1984 0.038461538
1979 0.237623762
1982 0.564356436

to:

1957        1958        1959        1960        ... 1985        1986
0.509615385 0.750000000 0.451923077 0.287500000 ... 0.651376147 0.509615385

and so on. It's likely the column lengths will be different, so I'm
guessing padding with NAs will be needed. I have on the order of 1335 rows
with years spanning 1957 to 2016.

Thank you...
Tom
This does not use reshape/reshape2, but it is pretty straightforward.
Assuming X is your example data:

> Y <- split(X[, 2], X[, 1])
> vals <- sapply(Y, length)
> pad <- max(vals) - vals
> Y2 <- lapply(seq_along(Y), function(x) c(Y[[x]], rep(NA, pad[x])))
> names(Y2) <- names(Y)
> X2 <- do.call(cbind, Y2)
> X2[, 1:6]
          1957      1958      1959      1960       1961      1962
[1,] 0.8625000 0.7500000 0.3000000 0.2875000 0.08256881 0.3267327
[2,] 0.0733945 0.6435644 0.1287129 0.4903846 0.72115385        NA
[3,] 0.5096154        NA 0.4519231        NA         NA        NA

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Thomas Adams
Sent: Wednesday, July 5, 2017 1:17 PM
To: r-help at r-project.org
Subject: [R] Help with reshape/reshape2 needed

Hi all:

I'm struggling with getting my data re-formatted using functions in
reshape/reshape2 [...]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
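
For anyone who wants to try the split-and-pad idea from David's reply without the full data set, here is a minimal sketch on a small subset of the posted values. The helper name pad_wide() and the column names "year" and "score" are made up for illustration and are not part of any package. It also pads by assigning a longer length to each vector rather than with rep(NA, ...), which should give the same result.

```r
# Toy two-column data standing in for X (a small subset of the posted values).
X <- data.frame(
  year  = c(1957, 1958, 1959, 1960, 1958, 1957, 1959, 1957, 1960),
  score = c(0.8625, 0.75, 0.30, 0.2875, 0.643564356, 0.073394495,
            0.128712871, 0.509615385, 0.490384615)
)

# pad_wide() is a hypothetical helper name, not a function from any package.
pad_wide <- function(values, groups) {
  by_group <- split(values, groups)   # one vector of values per year
  n_max <- max(lengths(by_group))     # length of the longest column
  # Assigning a longer length to a vector pads its tail with NA.
  padded <- lapply(by_group, function(v) {
    length(v) <- n_max
    v
  })
  do.call(cbind, padded)              # list names become column names
}

X2 <- pad_wide(X$score, X$year)
X2
```

Within each year the values keep the order in which they appear in the input, so the row position carries no meaning beyond "first seen, second seen, and so on".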
David,

That's just awesome! LOL -- no wonder I couldn't see how the reshape
functions could do this; you saved me MANY days!

Thank you so much!

Best,
Tom

On Wed, Jul 5, 2017 at 3:48 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> This does not use reshape/reshape2, but it is pretty straightforward.
> [...]

--
Thomas E Adams, III
1724 Sage Lane
Blacksburg, VA 24060
tea3rd at gmail.com (personal)
tea at terrapredictions.org (work)

1 (513) 739-9512 (cell)
The reason it doesn't work easily with reshape/reshape2 is that the
order of the rows is not determined. Your answer could be

     1957      1958 ...      1985       1986
0.8625000 0.7500000 ... 0.7307692 0.23750000
0.0733945 0.6435644 ...        NA 0.05769231
0.5096154        NA ...        NA 0.65137615

or

     1957      1958 ...      1985       1986
0.0733945 0.6435644 ...        NA 0.05769231
0.8625000 0.7500000 ... 0.7307692 0.23750000
0.5096154        NA ...        NA 0.65137615

or

     1957      1958 ...      1985       1986
0.8625000 0.6435644 ...        NA 0.23750000
0.0733945        NA ... 0.7307692 0.05769231
0.5096154 0.7500000 ...        NA 0.65137615

or any other combination of orders. You might not care about the
order, but reshape does.

The usual way around it is to just make up an order variable, e.g.,
assuming your data.frame is named "example_data" and the columns are
named "year" and "score":

library(reshape2)  # provides dcast()

# Number the observations within each year so every row has a defined position
example_data <- do.call(rbind,
                        lapply(split(example_data,
                                     example_data$year),
                               transform,
                               obs = seq_along(year)))

# One row per observation number, one column per year; missing cells become NA
dcast(example_data,
      obs ~ year,
      value.var = "score")

Best,
Ista

On Wed, Jul 5, 2017 at 2:16 PM, Thomas Adams <tea3rd at gmail.com> wrote:
> Hi all:
>
> I'm struggling with getting my data re-formatted using functions in
> reshape/reshape2 [...]
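
The "make up an order variable" idea in Ista's reply can also be sketched in base R alone, in case reshape2 is not installed. This swaps in ave() for the split/transform/rbind step and stats::reshape() for dcast(), so it is a variation on the approach above rather than the same code. The toy data are a small subset of the posted values, with the column names "year" and "score" assumed as in that reply.

```r
# Toy data; "year" and "score" follow the column names assumed above.
example_data <- data.frame(
  year  = c(1957, 1958, 1959, 1958, 1957, 1957),
  score = c(0.8625, 0.7500, 0.3000, 0.643564356, 0.073394495, 0.509615385)
)

# Running index within each year: 1 for the first row seen for that year,
# 2 for the second, and so on. This is the made-up order variable.
example_data$obs <- ave(seq_along(example_data$year),
                        example_data$year,
                        FUN = seq_along)

# Base-R long-to-wide reshape: one row per obs, one "score.<year>" column
# per year. Year/obs combinations that never occur are filled with NA.
wide <- reshape(example_data,
                idvar = "obs",
                timevar = "year",
                direction = "wide")
wide
```

The columns come out named score.1957, score.1958 and so on; they can be renamed afterwards if bare years are preferred as headers.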
Hi Tom,

Or perhaps:

#assume the data frame is named "tadf"
library(prettyR)
stretch_df(tadf,1,2)

Jim

On Thu, Jul 6, 2017 at 6:50 AM, Ista Zahn <istazahn at gmail.com> wrote:
> The reason it doesn't work easily with reshape/reshape2 is that the
> order of the rows is not determined.
> [...]