Dear R users,

I am a beginner with R and, at the same time, a beginner with discriminant analysis. I wanted to perform a DA using just 80% of the original data, but I have run into a problem with variables that are reported as constant within groups. Here is my script:

set.seed(123)
data80 <- data[sample(472, 378), ]
data80

# Remove all missing values listwise
data80.withoutna <- na.omit(data80)

# Group variable
data_grouping80 <- data80.withoutna[, 2]
data_grouping80
# dim(data80)

# Possible independent variables
variables80 <- data80.withoutna[, 4:447]
variables80

# Data set for discriminant analysis
ldadataset80 <- cbind(data_grouping80, variables80)
ldadataset80

# Discriminant analysis as SPSS does it (variables excluded by SPSS are denoted by -)
library(MASS)
model_lda80 <- lda(data_grouping80 ~ ., data = ldadataset80, prior = c(255/471, 100/471, 76/471, 40/471))
model_lda80 <- lda(data_grouping80 ~ .
  -CHLOSTA-DIGGRAN-DRYFILMA-EQUSYLV-EUPALPI-GERPHAE-GERROBE-HYPORAD-JUNCOMNA-JUNEFFU-JUNFILI-JUNJAQU-JUNTRIF-KNAARVE-KNAMAXI-KOBMYOS-KOEHIRS-KOEPYRA-LASHALL-LASKRAP-LATLAEV-LATPRAT-LEOHISP-LEUALPI-LEUVULG-
  LILMART-LINCART-LISOVAT-LOIPROC-LOLPERE-LOTCORSL-LUZCAMP-LUZLUTE-LUZLUZO-LUZPILO-LUZSPIC-LUZSYLV-LYCALPI-MAIBIFO-MELPRAT-MELSYLV-MENAQUA-MINGERA-MOECILI-MOLCAER-MUTADON-MYOALPE-MYOARVE-MYODECU-MYOSCOR-
  NARSTRI-NIGRHEL-ONOMONT-OREDIST-OXYCAMP-PARLILI-PARPALU-PEDELON-PEDFOLI-PEDROSC-PEDTUBE-PEDVERT-PELAPHT-PERBIST-PERVIVI-PEUOSTR-PHLCOMM-PHLPRAT-PHLRHAE-PHYBETO-PHYHEMI-PHYORBI-PHYOVAT-PICABIE-PIMMAJO-
  PIMSAXI-PINCEMB-PINMUG-PINVULG-PLAALPI-PLAATRA-PLABIFOL-PLALANC-PLAMEDI-PLESCHR-POAALPI-POAAMAR-POAANN-POAPRAT-POASUPI-POATRIV-POAVARI-POLALPE-POLAMAR-POLCOMO-POLJUNI-POLVULG-POTANSE-POTAURE-POTCRAN-POTEREC-
  POTGRAND-PRIAURI-PRIELAT-PRIFARI-PRIMINI-PRIVERI-PRUGRAN-PRUVULG-PSEALBI-PULALPAL-PULALPAP-PULANGU-PULVERN-PYRCHLO-PYRMEDI-RANACON-RANACRI-RANBULB-RANMONT-RANNEMO-RHIALEC-RHIGLAC-RHIMINO-RHOFERR-RHYSQUA-
  RHYTRIQ-ROSPEND-RUMACELL-RUMACET-RUMALPE-RUMALPI-RUMOBTU-RUMSCUT-SAGINASP-SALAURI-SALHERB-SALRETI-SALRETU-SALVPRA-SANMINO-SANOFFI-SCACANE-SCACOLU-SCALUCI-SCOAUTU-SCOHELV-SCOHUMI-SCOMONT-SELSELA-SEMARAC-
  SEMMONT-SEMWULF-SENABRO-SENDORO-SENINCA--SESALBI-SIBPROCU-SILACAU-SILDIOI-SILLATI-SILNUTA-SILVULG-SOLALPI-SOLMINI-SOLPUSI-SOLVIRG-SORAUCU-STEGRAM-STEMEDI-TARALPI-TAROFFI-THAAQUI-THEALPI-THYPRAE-THEPYR-
  THYPULE-THYSERP-TOFCALY-TRAGLOB-TRAPRAT-TRIALPE-TRIALPI-TRIBADI-TRICESP-TRIFLAV-TRIMEDI-TRIMONT-TRIPRAT-TRIREPE-TROEURO-URTDIOI-VACGAUL-VACMYRT-VACVITI-VALMONT-VALOFFI-VERALBU-VERALPI-VERBELL-VERCHAM-
  VERFRUT-VEROFFI-VERSERP-VICCRAC-VIOBIFL-VIOCANI-VIOHIRT-VIOTHOM-VIOTRIC-WILSTIP, data = ldadataset80)

# Answer from R:
# (variables 82 103 128 146 181 appear to be constant within groups)

So the answer I got was that some variables are constant within groups, and I deleted them from the data as follows:

# New variables
set.seed(123)
data80 <- data[sample(472, 378), ]
data80
newdata80 <- data80[c(-82, -103, -128, -146, -181)]
newdata80

Then I ran the whole analysis again, but at the end I got the same answer, only this time the variables listed are different:

# Remove all missing values listwise
newdata80.withoutna <- na.omit(newdata80)
newdata80.withoutna

# Group variable
ndata_grouping80 <- newdata80.withoutna[, 2]
ndata_grouping80
dim(newdata80)

# Possible independent variables
nvariables80 <- newdata80.withoutna[, 4:442]
nvariables80

ldadatasetn80 <- cbind(ndata_grouping80, nvariables80)
ldadatasetn80

library(MASS)
model_ldan80 <- lda(ndata_grouping80 ~ . -CHLOSTA-DIGGRAN-DRYFILMA-EQUSYLV-EUPALPI-GERPHAE-GERROBE-HYPORAD-JUNCOMNA-JUNEFFU-JUNFILI-JUNJAQU-JUNTRIF-KNAARVE-KNAMAXI-KOBMYOS-KOEHIRS-KOEPYRA-LASHALL-LASKRAP-LATLAEV-LATPRAT-LEOHISP-LEUALPI-LEUVULG- LILMART-LINCART-LISOVAT-LOIPROC-LOL......., data = ldadatasetn80)

Where am I going wrong? I can't understand this second answer, which lists new variables as constant within groups, when I had already dropped the variables that were initially reported as constant within groups.
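
One thing that may be worth checking (an assumption, not something confirmed by the output above): the numbers in the lda() message may index the columns of the predictor matrix that lda() builds from the formula, not the columns of data80. If so, dropping columns 82, 103, ... from data80 by position could remove unrelated variables and only shift the numbers in the next message. The sketch below uses a small made-up data frame to find such variables by name instead. The names toy, grp, x1 to x4 and the helper constant_within_groups() are hypothetical and not part of my script, and the check only approximates what lda() does internally.

```r
# Toy data, invented purely for illustration.
toy <- data.frame(
  grp = factor(rep(c("a", "b", "c"), each = 4)),
  x1  = c(1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6),  # varies within every group
  x2  = rep(c(0, 1, 2), each = 4),              # constant within each group
  x3  = c(5, 1, 4, 2, 6, 3, 7, 1, 2, 8, 3, 9),  # varies within every group
  x4  = rep(0, 12)                              # constant everywhere
)

# Hypothetical helper: return the NAMES of predictors whose pooled
# within-group standard deviation is numerically zero.
# tol = 1e-4 mirrors the default 'tol' argument of MASS::lda().
constant_within_groups <- function(x, g, tol = 1e-4) {
  x <- as.matrix(x)
  # Subtract each observation's group mean, column by column.
  centered <- x - apply(x, 2, function(col) ave(col, g))
  within_sd <- sqrt(colSums(centered^2) / (nrow(x) - nlevels(g)))
  names(within_sd)[within_sd < tol]
}

predictors <- toy[, setdiff(names(toy), "grp")]
bad <- constant_within_groups(predictors, toy$grp)
print(bad)

# Drop by name rather than by position, so shifted indices cannot
# cause the wrong columns to be removed.
cleaned <- toy[, setdiff(names(toy), bad)]
```

With the real data, the same helper would be applied to the predictors after na.omit(), since removing rows can itself make a variable constant within a group.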

Thank you very much in advance!

Kind regards