Dear all,

I am new to R. I am writing a review paper about batteries, and I would like to analyze all the papers by keywords, authors, references and year. This can be done with "refviz", a program that runs only on Windows machines and is not free.

So my question is: is it possible to write a script that can do all of this work? And if so, where should I start?

Thanks a lot in advance,
Schwan

--
-------------------------------------------
Hosseiny, MSc. S.S. (Seyed Schwan)
University of Twente Science and Technology
Meander, ME 322
P.O. Box 217 7500 AE Enschede
The Netherlands
Phone +31 534892869
Email: S.S.Hosseiny at utwente.n
Schwan wrote:
> So my question to you is, is it somehow possible to write a script that
> can do all of this work?

Describing what you mean by "all of this work" would be useful. The above is rather vague, as you don't describe what analysis refviz actually performs.

Schwan wrote:
> And if yes, with what i should start?

Start by learning how to use R. There are good links on the R-project homepage under the Wiki, Other and Books sections. I found Braun & Murdoch, "A First Course in Statistical Programming with R", a good book to get me going.

Neil

--
View this message in context: http://n4.nabble.com/Literature-analysis-tp960960p960968.html
Sent from the R help mailing list archive at Nabble.com.
Hi,

from what I understand, you may be interested in text mining, so perhaps you want to look at the tm package.

Then again, depending on what you are really trying to do, you may be better served by Perl, awk and similar tools than by R...

HTH,
Stephan
Thanks for all the comments, and sorry about the unstructured question!

I am trying to:
1: analyze the keywords, author names and years of publication of citations (with abstracts) that I downloaded from various sites (these downloads can also be converted into ".txt" files)
2: cluster this literature according to the analyzed keywords, authors or year of publication

The software Refviz I was referring to earlier can be found here: http://refviz.com/

As I said, I have never worked with R before, so I cannot send any example. I hope this helps to make my question clearer.

Cheers

On Fri, 2009-12-11 at 13:06 +0100, Stephan Kolassa wrote:
> from what I understand, you may be interested in text mining, so perhaps
> you want to look at the tm package.
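The two goals listed in this message (extract keywords, then cluster the papers) can be sketched in base R. This is only a minimal illustration under stated assumptions: the four records, their ids and their keywords are made up for the example, the keywords are assumed to be separated by semicolons, and a binary (Jaccard-type) distance with average linkage is just one of several reasonable choices. It is not a reconstruction of what Refviz does.

```r
# Hypothetical records; in practice these would come from the downloaded citations.
papers <- data.frame(
  id       = c("p1", "p2", "p3", "p4"),
  year     = c(2009, 1992, 2005, 2008),
  keywords = c("vanadium; membrane; diffusion",
               "separator; membrane; redox flow",
               "vanadium; redox flow; electrolyte",
               "lithium; cathode; electrolyte"),
  stringsAsFactors = FALSE
)

# Split each keyword string into a vector of trimmed keywords.
kw_list <- lapply(strsplit(papers$keywords, ";", fixed = TRUE), trimws)
names(kw_list) <- papers$id

# Paper-by-keyword incidence matrix: 1 if the paper lists the keyword, else 0.
vocab <- sort(unique(unlist(kw_list)))
incidence <- t(sapply(kw_list, function(k) as.integer(vocab %in% k)))
colnames(incidence) <- vocab

# Distance between papers based on shared keywords, then hierarchical clustering.
d  <- dist(incidence, method = "binary")
hc <- hclust(d, method = "average")

# Cut the tree into two groups (the number of groups is an arbitrary choice here).
groups <- cutree(hc, k = 2)
print(groups)
```

Calling plot(hc) afterwards draws the dendrogram, which is often easier to read than a 3D scatterplot for this kind of data.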
Dear Liviu,

I have tried the Rcmdr GUI, but when I load the data there is no active data set (this is the error message I got). Can someone help me further with my project?

Thanks

On Fri, 2009-12-11 at 19:05 +0000, Liviu Andronic wrote:
> Hello
>
> On 12/11/09, Schwan <s.s.hosseiny at utwente.nl> wrote:
> > However, I dont know how to tell R that it just should look for e.g.
> > author,keywords and year and how to plot these for example on x axis the
> > author and y axis the keywords and on z axis the year?
>
> I suggest that you try to get going in R with the Rcmdr GUI. Import
> the data using Data > Import > Text file. Then in Graphics try various
> graphs: Scatterplot, Scatterplot matrix, 3D Scatterplot.
>
> Regards
> Liviu
Hello

On 12/14/09, Schwan <s.s.hosseiny at utwente.nl> wrote:
> I have tried the Rcmdr GUI but when i load the data, there is no active
> data set(this is the error message i got).

Can you post the exact error message? Also, could you post a sample data file (for example, on tinyupload.com) so that we could replicate what you do?

Liviu
I have uploaded the csv file here:

http://s000.tinyupload.com/?file_id=79349565739733953435

It contains the citations, which I generated from the BibTeX file. The exact error message is:

[2] ERROR: more columns than column names

A screenshot of Rcmdr is located here:

http://s000.tinyupload.com/?file_id=82322752628856954474

On Mon, 2009-12-14 at 11:37 +0000, Liviu Andronic wrote:
> Can you post the exact error message? Also, could you post a sample
> data file (for example, on tinyupload.com) so that we could replicate
> what you do?
The screenshot shows your error. Just have a look at the error message in the bottom part of the window: your csv file has more columns than column names.

On Dec 14, 2009, at 2:42 PM, Schwan wrote:
> The exact error message is:
>
> [2] ERROR:
> more columns than column names

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On 12/14/09, Schwan <s.s.hosseiny at utwente.nl> wrote:
> [2] ERROR:
> more columns than column names

I looked at the data and there is a column named "ISBN#". Try removing the "#" and then import the data again.

Liviu
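A likely reason the "#" matters: read.table() treats "#" as a comment character by default, so a header such as "Title,ISBN#,Year,Journal" is cut off after "ISBN" and ends up with fewer names than the data has columns. The sketch below reproduces this on a small made-up file (the column names other than "ISBN#" and all values are assumptions for illustration) and shows one alternative to editing the file, namely switching comment handling off.

```r
# Made-up CSV content with a "#" in one header name.
csv_text <- "Title,ISBN#,Year,Journal
Paper A,0736-6299,2009,Journal X
Paper B,0376-7388,1992,Journal Y
"

# With the default comment.char = "#", the header is truncated after "ISBN",
# leaving two names for four data columns, so the import fails.
err <- tryCatch(
  read.table(text = csv_text, header = TRUE, sep = ","),
  error = function(e) e
)
if (inherits(err, "error")) message("Import failed: ", conditionMessage(err))

# Disabling comment handling lets the full header through. With the default
# check.names = TRUE, R sanitizes the invalid name "ISBN#" to "ISBN.".
ok <- read.table(text = csv_text, header = TRUE, sep = ",",
                 comment.char = "", stringsAsFactors = FALSE)
print(names(ok))
```

Note that read.csv() already uses comment.char = "" by default, so importing with read.csv() from the R console is another way around the problem.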
Hello,

Sorry to arrive late on this. Did you try the recently uploaded "bibtex" package? It reads a BibTeX file into an R list (of class citationList).

So, for example, if your posted example lives in biblio.bib, you can read it like this:

> bib <- read.bib( "biblio.bib" )
> sapply( bib[[1]], function(item) item$title )
[1] "Adsorption and Diffusion of VO<sup>2+</sup> and\nVO<sub>2</sub> <sup>+</sup>\n\tacross Cation Membrane for All-Vanadium Redox Flow Battery"
[2] "Modification of Daramic, microporous separator, for redox\nflow battery\n\tapplications"

Romain

On 12/11/2009 03:41 PM, Schwan wrote:
> They are in Bibtex
>
> For example:
>
> "@ARTICLE{adsdifvanadiumcationexchange,
> author = {Jin-qing Chen and Bao-guo Wang and Ji-chu Yang},
> title = {Adsorption and Diffusion of VO<sup>2+</sup> and
> VO<sub>2</sub> <sup>+</sup>
> across Cation Membrane for All-Vanadium Redox Flow Battery},
> journal = {Solvent Extraction and Ion Exchange},
> year = {2009},
> volume = {27},
> pages = {312--327},
> number = {2},
> abstract = {A method based on a selectivity coefficient and the
> Nernst-Planck
> equation is proposed to determine diffusion coefficients of vanadium
> ions across a cation exchange membrane in VO<sup>2+</sup>/H<sup>+</sup>
> and VO<sub>2</sub> <sup>+</sup>/H<sup>+</sup> systems. This simplified
> method can be applied to high concentrations of vanadium ions. Three
> cation exchange membranes were studied. The logarithmic value of
> the selectivity coefficient was linearly dependent on the molar
> fraction
> of vanadium ions in solution. The diffusion coefficient of vanadium
> ions decreased with decreasing water content. The membrane with the
> lowest diffusion coefficient was selected as a battery separator
> and showed the lowest capacity loss of the studied membranes.},
> issn = {0736-6299},
> publisher = {Taylor \& Francis},
> url = {http://www.informaworld.com/10.1080/07366290802674614}
> }
>
> @ARTICLE{Chieng1992,
> author = {Chieng, S.C. and Kazacos, M. and Skyllas-Kazacos, M.},
> title = {Modification of Daramic, microporous separator, for redox
> flow battery
> applications},
> journal = {Journal of Membrane Science},
> year = {1992},
> volume = {75},
> pages = {81--91},
> number = {1-2},
> month = dec,
> issn = {0376-7388},
> keywords = {Daramic, microporous separator, redox flow cell and
> battery},
> owner = {schwan},
> timestamp = {2009.11.30},
> url
> {http://www.sciencedirect.com/science/article/B6TGK-43S71CR-7K/2/06f90d391c0eff0ff5df3f282ad5fe28}
> }"
>
> On Fri, 2009-12-11 at 15:37 +0100, Gustaf Rydevik wrote:
>> On Fri, Dec 11, 2009 at 3:04 PM, Schwan<s.s.hosseiny at utwente.nl>
>> wrote:
>> Thanks, but how should I put the citation inside a data frame?
>>
>> data.frame(first txt file, second txt file...)
>> plot (what should I insert here????) type="p"
>>
>> And how should I load the txt files anyway inside the frame?
>>
>> Can you give an example of a couple of text files? Are they in a
>> standardised format (i.e. bibTEX or similar)?
>>
>> /Gustaf
>>
>> --
>> Gustaf Rydevik, M.Sci.
>> tel: +46(0)703 051 451
>> address:Essingetorget 40,112 66 Stockholm, SE
>> skype:gustaf_rydevik

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/HlX9 : new package : bibtex
|- http://tr.im/Gq7i : ohloh
`- http://tr.im/FtUu : new package : highlight
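On the earlier question of how to get citations into a data frame: the sketch below shows one way to do it in base R, without the bibtex package, for simple entries like the two quoted above. It is an illustration only. The helper get_field() is hypothetical, the regular expression handles only `field = {value}` pairs without nested braces, and only a few fields from the posted entries are included. A real BibTeX parser (such as the bibtex package mentioned above) is the more robust choice.

```r
# Two shortened entries taken from the thread (only some fields kept).
bib_text <- '@ARTICLE{adsdifvanadiumcationexchange,
  author = {Jin-qing Chen and Bao-guo Wang and Ji-chu Yang},
  journal = {Solvent Extraction and Ion Exchange},
  year = {2009}
}

@ARTICLE{Chieng1992,
  author = {Chieng, S.C. and Kazacos, M. and Skyllas-Kazacos, M.},
  journal = {Journal of Membrane Science},
  year = {1992},
  keywords = {Daramic, microporous separator, redox flow cell and
    battery}
}'

# One string per entry: split at each newline that is followed by "@".
entries <- strsplit(bib_text, "\n(?=@)", perl = TRUE)[[1]]

# Hypothetical helper: return the value of one field, or NA if it is absent.
# Only works for values in braces that contain no nested braces.
get_field <- function(entry, field) {
  pattern <- paste0("(?i)\\b", field, "\\s*=\\s*\\{([^{}]*)\\}")
  m <- regmatches(entry, regexec(pattern, entry, perl = TRUE))[[1]]
  if (length(m) < 2) return(NA_character_)
  gsub("\\s+", " ", m[2])  # collapse line breaks and repeated spaces
}

# One column per field, one row per entry.
fields <- c("author", "journal", "year", "keywords")
cols <- lapply(fields, function(f) {
  vapply(entries, get_field, character(1), field = f, USE.NAMES = FALSE)
})
names(cols) <- fields
bib_df <- data.frame(cols, stringsAsFactors = FALSE)
bib_df$year <- as.integer(bib_df$year)

print(bib_df[, c("journal", "year")])
```

Once the citations are in a data frame like bib_df, the usual tools apply, for example table(bib_df$year) to count papers per year.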
Thanks, but unfortunately the package "bibtex" does not seem to work. It installs, but if I follow your instruction:

> bib <- read.bib( "/home/schwan/Desktop/science.bib" )

I get an error message which says:

Error: could not find function "read.bib"

I already installed the package as root, but this did not help. I also copied the package into the R library directory, where all the other packages are, but this was not successful either.

Any help?

Thanks

On Mon, 2009-12-14 at 15:41 +0100, Erich Neuwirth wrote:
> The screenshot shows your error.
> Just have a look at the error message in the bottom part of the window.
> Your csv file has more columns than column names.
On 12/14/09, Schwan <s.s.hosseiny at utwente.nl> wrote:
> but unfortunately somehow the package "bibtex" dont want to install
> (actually it installs, but if i follow your instruction: > bib <-
> read.bib( "/home/schwan/Desktop/science.bib" ) I got an Error Message
> which says: > Error: could not find function "read.bib"

Did you run library(bibtex) before calling the function?

Liviu
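The distinction behind this question is that installing a package only copies it to disk, while library() attaches it to the current session so that its functions can be found by name. A minimal sketch of that check follows. It assumes nothing about the file from the thread and only tests whether the package is available on the machine where it is run.

```r
# Check whether the package is installed, without attaching it yet.
bibtex_installed <- requireNamespace("bibtex", quietly = TRUE)

if (bibtex_installed) {
  # Attaching the package puts its exported functions on the search path.
  library(bibtex)
  stopifnot(exists("read.bib"))
} else {
  # Installation is a one-time step; library() is needed in every new session.
  message("Package 'bibtex' is not installed; run install.packages(\"bibtex\") first.")
}
```

Alternatively, a single function can be called without attaching the package by writing bibtex::read.bib(...).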
OK, this does work: I can extract e.g. the titles. But how can I tell R to make a link between title, keyword and frequency of the keywords, in order to have a 3D plot?

Thanks a lot for helping me this far anyway!

Cheers

On Mon, 2009-12-14 at 16:15 +0100, Romain Francois wrote:
> > bib <- read.bib( "biblio.bib" )
> > sapply( bib[[1]], function(item) item$title )
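One possible way to link titles, keywords and keyword frequency is to reshape the data into a "long" table with one row per (title, keyword) pair and then count. The sketch below assumes the citations are already in a data frame with comma-separated keywords. The three records are made up for illustration, and a frequency table with a bar chart is shown here in place of a 3D plot, since counts per keyword and year are usually easier to read in two dimensions.

```r
# Hypothetical records standing in for the parsed citations.
papers <- data.frame(
  title    = c("Paper A", "Paper B", "Paper C"),
  year     = c(2009, 1992, 2009),
  keywords = c("vanadium, membrane",
               "membrane, separator",
               "vanadium, redox flow"),
  stringsAsFactors = FALSE
)

# Split the keyword strings and build one row per (title, keyword) pair.
kw <- lapply(strsplit(papers$keywords, ",", fixed = TRUE), trimws)
long <- data.frame(
  title   = rep(papers$title, lengths(kw)),
  year    = rep(papers$year,  lengths(kw)),
  keyword = unlist(kw),
  stringsAsFactors = FALSE
)

# Frequency of each keyword per year, keeping only combinations that occur.
freq <- as.data.frame(table(keyword = long$keyword, year = long$year),
                      responseName = "n", stringsAsFactors = FALSE)
freq <- freq[freq$n > 0, ]
print(freq)

# Overall keyword frequency, most common first; plotted only in interactive use.
totals <- sort(table(long$keyword), decreasing = TRUE)
if (interactive()) barplot(totals, las = 2, ylab = "Number of papers")
```

The `long` table keeps the link back to each title, so subset(long, keyword == "vanadium") lists the papers behind any count.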