I have a dataset (data) with 700 rows and 7000 columns. I am trying to do recursive feature selection with an SVM model. A quick Google search helped me find code for a recursive search with SVM. However, I am unable to understand the first part of the code. How do I introduce my dataset into the code, if the dataset is a matrix named data? Please give me an example of recursive feature selection with SVM. Below is the code I found for the recursive feature search.

    svmrfeFeatureRanking = function(x,y){

        #Checking for the variables
        stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)

        n = ncol(x)
        survivingFeaturesIndexes = seq_len(n)
        featureRankedList = vector(length=n)
        rankedFeatureIndex = n

        while(length(survivingFeaturesIndexes)>0){
            #train the support vector machine
            svmModel = svm(x[, survivingFeaturesIndexes], y, cost = 10,
                           cachesize=500, scale=FALSE,
                           type="C-classification", kernel="linear" )

            #compute the weight vector
            w = t(svmModel$coefs)%*%svmModel$SV

            #compute ranking criteria
            rankingCriteria = w * w

            #rank the features
            ranking = sort(rankingCriteria, index.return = TRUE)$ix

            #update feature ranked list
            featureRankedList[rankedFeatureIndex] = survivingFeaturesIndexes[ranking[1]]
            rankedFeatureIndex = rankedFeatureIndex - 1

            #eliminate the feature with smallest ranking criterion
            (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}
        return (featureRankedList)}

I tried to take the idea from the above code and incorporate it into my own code, as shown below.

    library(e1071)
    library(caret)

    data<- read.csv("matrix.csv", header = TRUE)

    x <- data
    y <- as.factor(data$Class)

    svmrfeFeatureRanking = function(x,y){

        #Checking for the variables
        stopifnot(!is.null(x) == TRUE, !is.null(y) == TRUE)

        n = ncol(x)
        survivingFeaturesIndexes = seq_len(n)
        featureRankedList = vector(length=n)
        rankedFeatureIndex = n

        while(length(survivingFeaturesIndexes)>0){
            #train the support vector machine
            svmModel = svm(x[, survivingFeaturesIndexes], y, cross=10, cost = 10,
                           type="C-classification", kernel="linear" )

            #compute the weight vector
            w = t(svmModel$coefs)%*%svmModel$SV

            #compute ranking criteria
            rankingCriteria = w * w

            #rank the features
            ranking = sort(rankingCriteria, index.return = TRUE)$ix

            #update feature ranked list
            featureRankedList[rankedFeatureIndex] = survivingFeaturesIndexes[ranking[1]]
            rankedFeatureIndex = rankedFeatureIndex - 1

            #eliminate the feature with smallest ranking criterion
            (survivingFeaturesIndexes = survivingFeaturesIndexes[-ranking[1]])}

        return (featureRankedList)}

But I couldn't get anywhere at the stage "update feature ranked list". Please guide.
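As a small illustration of the "rank the features" and "update feature ranked list" steps in the code above, one elimination round can be worked through by hand in base R. The weight values and column numbers below are made up for illustration only; in the real function, w comes from the fitted SVM.

```r
# Made-up weight vector for four surviving features (illustrative values only).
w <- c(0.5, -0.1, 2.0, -1.5)

# Hypothetical original column numbers of the features still in play.
survivingFeaturesIndexes <- c(10, 20, 30, 40)

# Ranking criterion: the squared weight of each feature.
rankingCriteria <- w * w

# Positions of the features, ordered from smallest to largest criterion.
ranking <- sort(rankingCriteria, index.return = TRUE)$ix

# The feature with the smallest criterion is the one recorded and dropped.
eliminated <- survivingFeaturesIndexes[ranking[1]]

# Remove it from the surviving set before the next round.
survivingFeaturesIndexes <- survivingFeaturesIndexes[-ranking[1]]

print(ranking)
print(eliminated)
print(survivingFeaturesIndexes)
```

Here ranking[1] is a position within the surviving set, not an original column number, which is why the code looks it up in survivingFeaturesIndexes before storing it.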
On 1/1/19 4:40 AM, Priyanka Purkayastha wrote:
> I have a dataset (data) with 700 rows and 7000 columns. I am trying to do
> recursive feature selection with the SVM model. A quick google search
> helped me get a code for a recursive search with SVM. However, I am unable
> to understand the first part of the code, How do I introduce my dataset in
> the code?

Generally the "labels" are given to such a machine learning device as the y argument, while the "features" are passed as a matrix to the x argument.

-- 
David.

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
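To make the x/y advice concrete, here is a minimal base R sketch of splitting a data frame into a feature matrix and a label factor. The data frame is a small synthetic stand-in for read.csv("matrix.csv", header = TRUE); the column names gene1 to gene3 are hypothetical, and only the Class column name is taken from the original post.

```r
# Synthetic stand-in for the real data (hypothetical feature names and values).
set.seed(42)
data <- data.frame(
  gene1 = rnorm(10),
  gene2 = rnorm(10),
  gene3 = rnorm(10),
  Class = rep(c("case", "control"), each = 5)
)

# Labels: the Class column as a factor.
y <- as.factor(data$Class)

# Features: every column except Class, converted to a numeric matrix.
x <- as.matrix(data[, setdiff(names(data), "Class"), drop = FALSE])

print(dim(x))
print(levels(y))
```

Note that x <- data on its own would leave the Class column among the features, so the label would also be treated as a predictor.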
Thank you, David. I tried the same: I gave x as the data matrix and y as the class label. But it returned an empty "featureRankedList", and I get no output when I run the code.

On Tue, 1 Jan 2019 at 11:42 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> Generally the "labels" is given to such a machine learning device as the
> y argument, while the "features" are passed as a matrix to the x argument.
>
> --
> David.

-- 
Regards,
Priyanka Purkayastha, M.Tech, Ph.D.,
SERB National Postdoctoral Researcher
Genomics and Systems Biology Lab, Department of Chemical Engineering,
Indian Institute of Technology Bombay (IITB), Powai, Mumbai- 400076
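The thread does not show how the function was called, so the cause of the empty result is not confirmed here. Two things may be worth checking, both stated as assumptions: defining svmrfeFeatureRanking only creates the function, so it returns nothing until it is called and its result is stored or printed; and the "update feature ranked list" step needs an explicit assignment. The following is a minimal sketch on synthetic two-class data, not the original author's exact code. The data, the feature names f1 to f6, the use of drop = FALSE, and the use of order() are choices made for this illustration, and the weight computation assumes exactly two classes.

```r
library(e1071)  # provides svm()

svmrfeFeatureRanking <- function(x, y) {
  stopifnot(!is.null(x), !is.null(y))

  n <- ncol(x)
  survivingFeaturesIndexes <- seq_len(n)
  featureRankedList <- integer(n)
  rankedFeatureIndex <- n

  while (length(survivingFeaturesIndexes) > 0) {
    # Train a linear SVM on the surviving features only.
    # drop = FALSE keeps x a matrix even when one column is left.
    svmModel <- svm(x[, survivingFeaturesIndexes, drop = FALSE], y,
                    cost = 10, scale = FALSE,
                    type = "C-classification", kernel = "linear")

    # Weight vector of the linear decision function (two-class case).
    w <- t(svmModel$coefs) %*% svmModel$SV

    # Ranking criterion: squared weight per surviving feature.
    rankingCriteria <- as.numeric(w * w)

    # Position (within the surviving set) of the smallest criterion.
    worst <- order(rankingCriteria)[1]

    # Record the eliminated feature, filling the list from the back,
    # so the last feature to be eliminated ends up in position 1.
    featureRankedList[rankedFeatureIndex] <- survivingFeaturesIndexes[worst]
    rankedFeatureIndex <- rankedFeatureIndex - 1

    # Remove that feature before the next round.
    survivingFeaturesIndexes <- survivingFeaturesIndexes[-worst]
  }

  featureRankedList
}

# Synthetic data: 40 rows, 6 features; the label depends on f1 and f2 only.
set.seed(1)
x <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(NULL, paste0("f", 1:6)))
y <- factor(ifelse(x[, "f1"] + x[, "f2"] > 0, "pos", "neg"))

# The function has to be called, and its result kept, to see any output.
ranking <- svmrfeFeatureRanking(x, y)
print(colnames(x)[ranking])
```

With 7000 columns, eliminating one feature per round means 7000 SVM fits, so on the real data this loop may take a long time; removing several features per round is a common variation, though it is not part of the code in this thread.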