Dear friends,

I apologize if the description is a bit long, but I think that I need to be as specific as possible so that you guys can help.

I will share with you a file (train.csv), which contains gray-scale images of hand-drawn digits, from zero through 9.

Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255, inclusive. The training data set (train.csv) has 785 columns. The first column, called "label", is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image. Each pixel column in the training set has a name like pixel x, where x is an integer between 0 and 783, inclusive. To locate this pixel on the image, suppose that we have decomposed x as x = i * 28 + j, where i and j are integers between 0 and 27, inclusive. Then pixel x is located on row i and column j of a 28 x 28 matrix (indexing by zero). For example, pixel 31 indicates the pixel that is in the fourth column from the left, and the second row from the top, as in the ascii-diagram below.

This data is set up in a csv file, which will require reshaping the data into 28 x 28 matrices representing the images. There are 42000 images in the train.csv file. For this problem it is only necessary to process approximately 100 images, 10 each of the numbers from 0 through 9. The goal is to learn how to generate features from images using transforms and first order statistics.

So I need to develop an algorithm to store the data in a data structure such that the data is reshaped into a matrix of size 28 x 28, and then I have to plot the developed matrix for indices 1, 2, 4, 7, 8, 9, 11, 12, 17 and 22.

I have been looking for information about how to process this with R, but have not found anything yet.
The dataset is attached to this e-mail for your reference.

Any help and/or guidance will be greatly appreciated.

Best regards,
Paul

train.csv
<https://drive.google.com/file/d/1WPb7bKHJ8BlzuLKJogMOAOqb-VCoXDMp/view?usp=drive_web>
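As a quick check of the index arithmetic described in the message above, here is a minimal sketch in R. It assumes the zero-based pixel numbering stated there (x = i * 28 + j); the helper name pixel_position is made up for illustration and is not part of any package.

```r
# Map a zero-based pixel number x to its zero-based row i and column j,
# assuming x = i * width + j as described in the message.
pixel_position <- function(x, width = 28L) {
  stopifnot(length(x) == 1, x >= 0, x < width * width)
  c(row = x %/% width, col = x %% width)
}

# Pixel 31 is on row 1 and column 3 (zero-based), that is, the second row
# from the top and the fourth column from the left.
pixel_position(31)
```

Adding 1 to both values gives the usual one-based R matrix indices.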
On Thu, 24 Feb 2022 11:00:08 -0500
Paul Bernal <paulbernal07 at gmail.com> wrote:

> Each pixel column in the training set has a name like pixel x, where
> x is an integer between 0 and 783, inclusive. To locate this pixel on
> the image, suppose that we have decomposed x as x = i * 28 + j, where
> i and j are integers between 0 and 27, inclusive.

> I have been looking for information about how to process this with R,
> but have not found anything yet.

Given a 784-element vector x, you can reshape it into a 28 by 28 matrix:

    dim(x) <- c(28, 28)

Or create a new matrix:

    matrix(x, 28, 28)

Working with more dimensions is also possible. A matrix X with dim(X) == c(n, 784) can be transformed into a three-way array in place or copied into one:

    dim(X) <- c(dim(X)[1], 28, 28)
    array(X, c(dim(X)[1], 28, 28))

(Replace 28,28 with 784 for an inverse transformation. In modern versions of R, two-way arrays are more or less the same as matrices, but old versions may disagree with that in some corner cases.)

For more information, see ?dim, ?matrix, ?array.

-- 
Best regards,
Ivan
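One detail worth keeping in mind with the reshaping calls above: R fills matrices column by column, while the pixel numbering in the question runs row by row. The sketch below illustrates the difference on made-up data (the vector 0:783 and the two-row matrix X are stand-ins, not values from train.csv), so that each entry simply equals its own pixel number.

```r
# Stand-in for one image: entry k holds pixel number k - 1 (row-major order)
x <- 0:783

# matrix() and dim<- fill column by column, so this result is transposed
# relative to the layout described in the question
m_col <- matrix(x, 28, 28)

# byrow = TRUE places pixel i * 28 + j at row i + 1, column j + 1
m_row <- matrix(x, 28, 28, byrow = TRUE)

m_row[2, 4]   # pixel 31: second row, fourth column
m_col[4, 2]   # the same pixel when the matrix is filled by column

# Several images at once: X has one image per row (n x 784)
X <- rbind(0:783, 783:0)
A <- array(X, c(nrow(X), 28, 28))   # A[k, , ] is image k, filled by column
img1 <- t(A[1, , ])                 # transpose to get the row-major layout
identical(img1, m_row)
```

Whether the transpose matters depends on what is done next: summary statistics over all pixels are unaffected, but a plot would show the digit flipped along the diagonal.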
Hi Paul,

I may be missing something, but you can transform a vector to a matrix of any desired size by using matrix().

For more nuanced processing of images, you might look into one of the many image processing packages in R, or even the raster package (or the newer terra).

Sarah

-- 
Sarah Goslee (she/her)
http://www.sarahgoslee.com
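Putting the matrix() suggestion together with the task in the original question, a base-R sketch might look like the following. Several things here are assumptions: the data frame has a "label" column followed by 784 pixel columns as described for train.csv; a small random stand-in replaces the real file so the code runs on its own; the requested indices are taken to refer to rows of the 100-image subset; and row_to_image and plot_digit are hypothetical helper names.

```r
# Random stand-in with the same shape as the described data. With the real
# file, use instead: train <- read.csv("train.csv")
set.seed(1)
train <- data.frame(
  label = rep(0:9, each = 12),
  matrix(sample(0:255, 120 * 784, replace = TRUE), nrow = 120)
)

# Keep the first 10 rows of each digit (100 images in total)
keep <- unlist(lapply(split(seq_len(nrow(train)), train$label), head, 10))
subset100 <- train[keep, ]

# Convert row k of the data frame to a 28 x 28 matrix, dropping the label
# column and filling by row to match the pixel numbering
row_to_image <- function(df, k) {
  matrix(as.numeric(unlist(df[k, -1])), 28, 28, byrow = TRUE)
}

# image() draws the first matrix index along the x axis with y increasing
# upward, so transpose and reverse the columns to show the digit upright.
# The gray ramp runs from white to black, so higher values appear darker.
plot_digit <- function(m, ...) {
  image(t(m)[, nrow(m):1], col = gray(seq(1, 0, length.out = 256)),
        axes = FALSE, ...)
}

idx <- c(1, 2, 4, 7, 8, 9, 11, 12, 17, 22)
old_par <- par(mfrow = c(2, 5), mar = c(1, 1, 2, 1))
for (k in idx) plot_digit(row_to_image(subset100, k), main = subset100$label[k])
par(old_par)
```

With the random stand-in the panels show only noise; with the real data each panel should show one digit, titled with its label.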