Hi
I am very acquainted with R. I use it occasionally via the org-babel library of
GNU emacs.
I wanted to check the first, second and third quartiles of the JCR
(Journal Citation Reports) index
https://support.clarivate.com/ScientificandAcademicResearch/s/article/Journal-Citation-Reports-Quartile-rankings-and-other-metrics?language=en_US
Its criterion is
#+begin_src
| Quartile | Range             |                                       |
|----------+-------------------+---------------------------------------|
| Q1       | 0.0 < Z \leq 0.25 | Highest ranked journals in a category |
| Q2       | 0.25 < Z \leq 0.5 |                                       |
| Q3       | 0.5 < Z \leq 0.75 |                                       |
| Q4       | 0.75 < Z          | Lowest ranked journals in a category  |
#+end_src
Z=(X/Y)
Where X is the journal rank in category and Y is the number of journals in the
category.
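The criterion above can be sketched in R as follows. This is only an illustration under the assumption that the ranks are simply 1 to 267 with no ties; the variable names (n_journals, journal_rank, jcr_quartile, last_rank) are made up for the example and are not part of any JCR tooling.

```r
# Minimal sketch: classify ranks 1..267 by the rule Z = X / Y.
n_journals   <- 267                      # Y: number of journals in the category
journal_rank <- seq_len(n_journals)      # X: rank of each journal (assumed 1..Y, no ties)
z            <- journal_rank / n_journals

# right = TRUE gives the intervals (0, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, 1]
jcr_quartile <- cut(z,
                    breaks = c(0, 0.25, 0.5, 0.75, 1),
                    labels = c("Q1", "Q2", "Q3", "Q4"),
                    right  = TRUE)

# Last rank that still falls into each quartile
last_rank <- tapply(journal_rank, jcr_quartile, max)
print(last_rank)
#  Q1  Q2  Q3  Q4
#  66 133 200 267

# Number of journals per quartile
print(table(jcr_quartile))
```

Under this rule the last member of each quartile is the largest rank X with X/Y at or below the cut point, i.e. floor(Y * p) for p = 0.25, 0.5, 0.75.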
Now I have a list of 267 journals.
What drives me crazy is that R, MATLAB and the JCR each calculate the
quartiles differently and give different results.
Here is a table
#+begin_matlab :exports both :eval never-export :results output latex
#+RESULTS:
| quartile-limit (last member) |    | floor_Rlang | jcr | jcr_check | floor_check |
|------------------------------+----+-------------+-----+-----------+-------------|
| 67.5                         | Q1 |          67 |  66 |    0.2472 |      0.2509 |
| 134                          | Q2 |         134 | 133 |    0.4981 |      0.5019 |
| 200.5                        | Q3 |         200 | 200 |    0.7491 |      0.7491 |
| 267                          |    |         267 | 267 |         1 |           1 |
#+TBLFM: $5=$4/267::$6=$3/267
#+end_matlab
I calculated these using R (the data, which I don't include here, is just the vector from 1 to 267)
#+begin_src R :colnames t :var t1=jcr22
quantile(t1$Data,c(1/4,1/2,3/4,1))
#+end_src
#+begin_src
#+RESULTS:
| x |
|-------|
| 67.5 |
| 134 |
| 200.5 |
| 267 |
#+end_src
So you see the problem with Q1 and Q2: rounding R's limits down gives 67 and
134, while the JCR criterion gives 66 and 133.
On top of that MATLAB gives
#+begin_src matlab :exports results :eval never-export :results output latex
format short
x=1:267;
q1 = quantile(x,1/4);
q2 = quantile(x,1/2);
q3 = quantile(x,3/4);
Q=[q1; q2; q3];
sprintf('|%g| \n', Q)
#+end_src
#+RESULTS:
#+begin_export latex
|67.25|
|134|
|200.75|
#+end_export
Which is also slightly different from R.
Can anybody enlighten me please?
Thanks and regards
Uwe Brauer
--
I strongly condemn Putin's war of aggression against the Ukraine.
I support to deliver weapons to Ukraine's military.
I support the ban of Russia from SWIFT.
I support the EU membership of the Ukraine.
Read ?quantile carefully, please (and any references therein that you may wish to consult). You are estimating a continuous function by a discrete finite step function, and as the Help page (and further references) explains, there are many ways to do this.

Bert
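To make "many ways" concrete, here is a rough sketch of two of the interpolation rules described in ?quantile, computed by hand. The helper interp_quantile is hypothetical (written for this illustration, not an R function), and it skips the clamping that the real quantile() applies when p is close to 0 or 1. The type 5 numbers happen to coincide with the MATLAB output quoted earlier in the thread; that MATLAB uses exactly this rule is an assumption here, not something checked against its source.

```r
# Hypothetical helper: linear interpolation at the (possibly fractional) position h
# in the sorted sample. Valid only for 1 <= h <= length(x).
interp_quantile <- function(x, h) {
  x  <- sort(x)
  lo <- floor(h)
  hi <- ceiling(h)
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

x <- 1:267
n <- length(x)
p <- c(0.25, 0.5, 0.75)

h7 <- (n - 1) * p + 1   # position used by type 7 (R's default)
h5 <- n * p + 0.5       # position used by type 5

q7 <- interp_quantile(x, h7)
q5 <- interp_quantile(x, h5)

print(q7)  # 67.50 134.00 200.50
print(q5)  # 67.25 134.00 200.75

# Cross-check against the built-in implementation
print(all.equal(q7, quantile(x, p, type = 7, names = FALSE)))
print(all.equal(q5, quantile(x, p, type = 5, names = FALSE)))
```

Both rules interpolate linearly between neighbouring order statistics; they differ only in where they place the position h, which is why the results agree at the median and differ by a quarter at the other two quartiles.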
On Thu, 14 Jul 2022 14:58:17 +0200 Uwe Brauer <oub at mat.ucm.es> wrote:

> What turns me crazy is that the way R, matlab and the JCR calculate
> the quartiles gives different results.

R by itself can give up to 9 slightly different results:

sapply(1:9, function(type) quantile(1:267, 1:3/4, type = type))
#     [,1] [,2] [,3]   [,4]   [,5] [,6]  [,7]      [,8]     [,9]
# 25%   67   67   67  66.75  67.25   67  67.5  67.16667  67.1875
# 50%  134  134  134 133.50 134.00  134 134.0 134.00000 134.0000
# 75%  201  201  200 200.25 200.75  201 200.5 200.83333 200.8125

Choose the ones that fit your ideas of quantile best. See ?quantile for
more info.

--
Best regards,
Ivan
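As a possible way of choosing, one could compare each of the nine types with the JCR cut points (66, 133, 200 for 267 journals). The sketch below assumes the data are plain ranks 1 to n and that the JCR boundary is floor(n * p), which follows from the Z = X/Y rule quoted at the start of the thread; the names jcr_last, all_types and matching_types are invented for the example.

```r
n <- 267
p <- 1:3 / 4

# Last rank in Q1, Q2, Q3 under the rule X / n <= p (assumed from the JCR table)
jcr_last <- floor(n * p)

# One column per quantile type, one row per probability
all_types <- sapply(1:9, function(type)
  quantile(seq_len(n), p, type = type, names = FALSE))

# Which types reproduce the JCR boundaries once rounded down?
matching_types <- which(apply(floor(all_types), 2,
                              function(col) all(col == jcr_last)))
print(jcr_last)        # 66 133 200
print(matching_types)  # 4
```

For this particular input only type 4, rounded down, lands on the JCR boundaries; whether that carries over to other category sizes or to data with ties would need checking.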