Paul Bernal
2020-Sep-01 20:19 UTC
[R] Odd Results when generating predictions with nnet function
Dear friends,

Hope you are all doing well. I am currently using R version 4.0.2 and working with the nnet package.

My dataframe consists of three columns, FECHA which is the date, x, which is a sequence from 1 to 159, and y, which is the number of covid cases (I am also providing the dput for this data frame below).

I tried fitting a neural net model using the following code:

    xnew = 1:159
    Fit <- nnet(a$y ~ a$x, a, size = 5, maxit = 1000, lineout = T, decay = 0.001)

Finally, I attempted to generate predictions with the following code:

    predictions <- predict(Fit, newdata = list(x = xnew), type = "raw")

But obtained extremely odd results: As you can see, instead of obtaining numbers, more or less in the range of the last observations of a$y, I end up getting a bunch of 1s, which doesn't make any sense (if anyone could help me understand what could be causing this):

    dput(predictions)
    structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
    .Dim = c(159L, 1L), .Dimnames = list(c(
    "1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
    "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
    "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
    "31", "32", "33", "34", "35", "36", "37", "38", "39", "40",
    "41", "42", "43", "44", "45", "46", "47", "48", "49", "50",
    "51", "52", "53", "54", "55", "56", "57", "58", "59", "60",
    "61", "62", "63", "64", "65", "66", "67", "68", "69", "70",
    "71", "72", "73", "74", "75", "76", "77", "78", "79", "80",
    "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
    "91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
    "101", "102", "103", "104", "105", "106", "107", "108", "109", "110",
    "111", "112", "113", "114", "115", "116", "117", "118", "119", "120",
    "121", "122", "123", "124", "125", "126", "127", "128", "129", "130",
    "131", "132", "133", "134", "135", "136", "137", "138", "139", "140",
    "141", "142", "143", "144", "145", "146", "147", "148", "149", "150",
    "151", "152", "153", "154", "155", "156", "157", "158", "159"), NULL))

    head(a)
           FECHA x  y
    1 2020-03-09 1  1
    2 2020-03-10 2  8
    3 2020-03-11 3 14
    4 2020-03-12 4 27
    5 2020-03-13 5 36
    6 2020-03-14 6 43

    dput(a)
    structure(list(FECHA = structure(c(
    18330, 18331, 18332, 18333, 18334, 18335, 18336, 18337, 18338, 18339,
    18340, 18341, 18342, 18343, 18344, 18345, 18346, 18347, 18348, 18349,
    18350, 18351, 18352, 18353, 18354, 18355, 18356, 18357, 18358, 18359,
    18360, 18361, 18362, 18363, 18364, 18365, 18366, 18367, 18368, 18369,
    18370, 18371, 18372, 18373, 18374, 18375, 18376, 18377, 18378, 18379,
    18380, 18381, 18382, 18383, 18384, 18385, 18386, 18387, 18388, 18389,
    18390, 18391, 18392, 18393, 18394, 18395, 18396, 18397, 18398, 18399,
    18400, 18401, 18402, 18403, 18404, 18405, 18406, 18407, 18408, 18409,
    18410, 18411, 18412, 18413, 18414, 18415, 18416, 18417, 18418, 18419,
    18420, 18421, 18422, 18423, 18424, 18425, 18426, 18427, 18428, 18429,
    18430, 18431, 18432, 18433, 18434, 18435, 18436, 18437, 18438, 18439,
    18440, 18441, 18442, 18443, 18444, 18445, 18446, 18447, 18448, 18449,
    18450, 18451, 18452, 18453, 18454, 18455, 18456, 18457, 18458, 18459,
    18460, 18461, 18462, 18463, 18464, 18465, 18466, 18467, 18468, 18469,
    18470, 18471, 18472, 18473, 18474, 18475, 18476, 18477, 18478, 18479,
    18480, 18481, 18482, 18483, 18484, 18485, 18486, 18487, 18488),
    class = "Date"), x = 1:159, y = c(1, 8, 14, 27,
    36, 43, 55, 69, 86, 109, 137, 200, 245, 313, 345, 443, 558, 674,
    786, 901, 989, 1075, 1181, 1317, 1475, 1673, 1801, 1988, 2100,
    2249, 2528, 2752, 2974, 3234, 3400, 3472, 3574, 3751, 4016, 4210,
    4273, 4467, 4658, 4821, 4992, 5166, 5338, 5538, 5779, 6021, 6200,
    6378, 6532, 6720, 7090, 7197, 7387, 7523, 7731, 7868, 8070, 8282,
    8448, 8616, 8783, 8944, 9118, 9268, 9449, 9606, 9726, 9867, 9977,
    10116, 10267, 10577, 10926, 11183, 11447, 11728, 12131, 12531,
    13015, 13463, 13837, 14095, 14609, 15044, 15463, 16004, 16425,
    16854, 17233, 17889, 18586, 19211, 20059, 20686, 21422, 21962,
    22597, 23351, 24274, 25222, 26030, 26752, 27314, 28030, 29037,
    29905, 30658, 31686, 32785, 33550, 34463, 35237, 35995, 36983,
    38149, 39334, 40291, 41251, 42216, 43257, 44352, 45633, 47177,
    48096, 49243, 50373, 51408, 52261, 53468, 54426, 55153, 55906,
    56817, 57993, 58864, 60296, 61442, 62223, 63269, 64191, 65256,
    66383, 67453, 68456, 69424, 70231, 71418, 72560, 73651, 74492,
    75394, 76464, 77377, 78446, 79402)), row.names = c(NA, 159L),
    class = "data.frame")

Any help and/or guidance will be greatly appreciated,

Cheers,

Paul
peter dalgaard
2020-Sep-02 06:41 UTC
[R] Odd Results when generating predictions with nnet function
Generically, nnet(a$y ~ a$x, a ...) should be nnet(y ~ x, data=a, ...) otherwise predict will go looking for a$x, no matter what is in xnew.

But more importantly, nnet() is a _classifier_, so the LHS should be a class, not a numeric variable.

-pd

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
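The point about the formula can be illustrated with a minimal sketch. The toy data frame d, the object names, and the choice of size = 2 are assumptions made for illustration and are not taken from the thread; linout = TRUE is used here only so that a numeric response can be fitted.

```r
library(nnet)  # recommended package shipped with standard R installations

set.seed(1)    # nnet() starts from random weights

# Hypothetical toy data standing in for the poster's data frame 'a'
d <- data.frame(x = 1:20)
d$y <- 0.5 + 0.1 * d$x

# Bare variable names plus data = d: the model terms are 'x' and 'y',
# so predict() can look up 'x' by name in whatever newdata it is given
fit <- nnet(y ~ x, data = d, size = 2, linout = TRUE, trace = FALSE)

# newdata carries its own column 'x'; predict() returns one row per new value
new_x <- data.frame(x = c(2.5, 7.5, 12.5))
pred <- predict(fit, newdata = new_x)
dim(pred)
```

Had the formula been written as d$y ~ d$x, the term would be the expression d$x itself, which newdata cannot supply under that name, so the values in newdata would not be used.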
Martin Maechler
2020-Sep-02 07:37 UTC
[R] Odd Results when generating predictions with nnet function
>>>>> peter dalgaard
>>>>>     on Wed, 2 Sep 2020 08:41:09 +0200 writes:

    > Generically, nnet(a$y ~ a$x, a ...) should be nnet(y ~ x,
    > data=a, ...) otherwise predict will go looking for a$x, no
    > matter what is in xnew.

    > But more importantly, nnet() is a _classifier_,
    > so the LHS should be a class, not a numeric variable.

    > -pd

Well, nnet() can be used for both classification *and* regression, which is quite clear from the MASS book, but indeed, not from its help page, which mentions one formula 'class ~ ...' and then only has classification examples. So, indeed, the ?nnet help page could be improved.

In his case, y are counts, so John Tukey's good old "first aid transformation" principle would suggest modelling sqrt(y) ~ .. in a *regression* model, which nnet() can do.

Martin Maechler
ETH Zurich and R Core team
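A minimal sketch of the square-root regression suggested above might look as follows. The simulated counts, the object names, and the tuning values are assumptions for illustration only, not the poster's data or a recommended configuration. Note that the nnet() argument for a linear output unit is spelled linout; the lineout spelling in the original call does not appear to match any named argument, so it would likely be passed along unused, leaving the default logistic output unit, whose values cannot exceed 1.

```r
library(nnet)

set.seed(42)   # nnet() starts from random weights

# Hypothetical cumulative counts, simulated here; not the data from the post
d <- data.frame(x = 1:30)
d$y <- cumsum(rpois(30, lambda = 20))

# Regression on the square-root scale; linout = TRUE requests a linear
# output unit instead of the default logistic one
fit <- nnet(sqrt(y) ~ x, data = d, size = 3, linout = TRUE,
            decay = 0.001, maxit = 1000, trace = FALSE)

# predict() returns fitted values on the sqrt scale;
# squaring them brings the predictions back to the count scale
pred_sqrt <- predict(fit, newdata = data.frame(x = 1:30))
pred_counts <- as.vector(pred_sqrt)^2
range(pred_counts)
```

How well such a fit tracks the data depends on the starting weights, size, and decay, so the result is worth checking against the observed counts before using it.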