Dear All,

For a project I have been given a set of images. They represent either healthy or tumoral tissue, but the specific nature of the images does not matter. I need to train a classifier that tells me which category (let's call them 0 and 1) each image falls into. I am thinking about a random forest classifier, but I am uncertain about a couple of (fairly important) points:

(1) The size of the images varies, so the number of pixels is not the same for every image. As a consequence, some methods (e.g. PCA) applied to these images will give results that are not immediately comparable. Is blurring/flattening the images a good way to (artificially) give every image the same size (number of pixels)?

(2) Which features do you recommend computing for every image? These are what I will train my model on.

Any suggestion is welcome.

Cheers,
Lorenzo
This is an R-help list. These are not questions about R. You should talk to a local statistical expert instead of posting here.

Cheers,
Bert

On Mon, Oct 14, 2013 at 1:23 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> For a project I am given a set of images. [...]

--
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
Hello Lorenzo,

Try to locate related R packages from here:
http://cran.r-project.org/web/views/MedicalImaging.html

On 14 October 2013 22:23, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> For a project I am given a set of images. [...]
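
On points (1) and (2) of the original question, one possible approach (not one proposed by anyone in this thread) is to avoid resizing altogether and describe each image by summary features whose count does not depend on the number of pixels. The sketch below is in Python rather than R and uses only the standard library. The function name `size_invariant_features`, the choice of 8 histogram bins, and the assumption that images are grayscale with intensities scaled to [0, 1] are all illustrative, not taken from the thread.

```python
import math


def size_invariant_features(image, n_bins=8):
    """Return a fixed-length feature vector for one grayscale image.

    `image` is a list of rows of pixel intensities, assumed to lie in [0, 1].
    The vector holds a normalized intensity histogram (n_bins values),
    followed by the mean and the standard deviation of the intensities.
    Its length is n_bins + 2 whatever the image dimensions are.
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        raise ValueError("image has no pixels")
    n = len(pixels)

    # Normalized histogram: fraction of pixels in each intensity bin.
    counts = [0] * n_bins
    for p in pixels:
        idx = max(0, min(int(p * n_bins), n_bins - 1))  # clamp p == 1.0 into last bin
        counts[idx] += 1
    hist = [c / n for c in counts]

    # First- and second-order intensity statistics.
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    return hist + [mean, std]


# Two toy images of different sizes (2x2 and 3x3), made up for illustration.
small = [[0.0, 1.0], [0.5, 0.5]]
large = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

features = [size_invariant_features(img) for img in (small, large)]
print([len(f) for f in features])  # both vectors have n_bins + 2 entries
```

Each image then becomes one row of a feature table with the same columns, which is the shape of input a random forest classifier such as the one Lorenzo mentions expects. Whether intensity histograms actually separate healthy from tumoral tissue is an empirical question for the data at hand; texture or shape descriptors may be needed as well.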