thr3ads.net - R devel - [Rd] An interesting chat with ChatGPT [Feb 2023]

Boris Steipe

2023-Feb-13 15:54 UTC

[Rd] An interesting chat with ChatGPT

Duncan -

Indeed, this has now been well documented; I have called these constructions
"Schrödinger Facts", since they arise from a superposition of truths
in the training data that collapse into an untruth when observed.

   https://sentientsyllabus.substack.com/p/chatgpts-achilles-heel

Now, the curious question is: why can it program? Why is its production of
R-syntax less vulnerable to this effect than a literature quote, or reference?
Maybe it has to do with the fact that in producing correct syntax there is a
ground truth - errors just will not run. But I'm not sure.

Regardless: it signals that we need a reinvigorated culture of validation.
Actually ChatGPT will happily make test data for you. Which, in a way, it just
did  ;-)


Boris



> On 2023-02-13, at 08:37, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
> I was wondering which anchor would match in the regular expression
> "(a)|(b)" if both "a" and "b" match the target string.  What appears to
> happen is that the first match wins, and that's how it is documented in
> Python, but I was looking for docs on this in R.
>
> As far as I can see, the ?regex help page doesn't resolve this. So just
> for fun, I decided to ask ChatGPT, and got an interesting response.  I
> won't show the whole conversation here, but just some extracts:
>
> ---------------------------------------
> Me:  In a regular expression in R, if both "a" and "b" match, which does
> "(a)|(b)" match?  Please include authoritative references in your answer.
>
> ChatGPT: [long answer about grep]
>
> Me:  In grep(), it doesn't matter which of a or b matches.  It's only in
> sub() or gsub() that it makes a difference.  Do you know an authoritative
> reference that discusses those functions?
>
> ChatGPT:  [long irrelevant answer about sub and gsub, with a reference
> to "R Cookbook" by P. Teetor.]
>
> Me:  Teetor's book is not authoritative, and the other reference doesn't
> discuss ambiguous alternation.
>
> ... finally we get to the interesting part ...
>
> ChatGPT: I apologize for the mistake in my previous answer.
>
> A more authoritative reference for the behavior of regular expressions
> in R can be found in the official R documentation.
>
> Regarding the behavior of ambiguous alternation, the regex section of
> the manual "An Introduction to R" explains it as follows:
>
> "When several alternatives are possible, the first alternative is
> chosen, so it is important to specify alternatives in the order in which
> they should be tried."
>
> Here is the reference for the official R documentation:
>
>    "An Introduction to R" manual:
>    https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Regular-expression-facilities
>
> I hope this reference provides the information you were looking for.
> ---------------------------------------
>
>
> All right, an authoritative answer!  The only problem is that *it
> doesn't exist*.  There's no such anchor in that document.  R-intro isn't
> a very rapidly changing document, so I could look at every change
> recorded in svn, and that section has *never* existed.
>
> I guess ChatGPT is a pattern matching machine, and if such a question
> had an answer, it would probably be in a location like that.  (I would
> have guessed a help page like ?regex instead of R-intro, but I had just
> told it that answer was wrong.)
>
> Duncan Murdoch
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
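
For anyone who wants to check the Python side of the question, here is a
minimal sketch of the "first alternative wins" behaviour that Duncan says is
documented for Python, using the standard `re` module. The helper name
winning_alternative is hypothetical and exists only for this illustration.
It says nothing about R: the default engine in R and the perl = TRUE engine
would each have to be tested separately, e.g. with sub() or regexec().

```python
import re


def winning_alternative(pattern, text):
    """Return (group_index, matched_text) for the first match, or None.

    group_index is the number of the capture group that took part in the
    match, i.e. which alternative of "(x)|(y)" was chosen.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.lastindex, m.group(0)


# Both alternatives could match at position 0 of "ab". Python tries them
# left to right and accepts the first one that succeeds, so "(a)" wins
# even though "(ab)" would give a longer match.
print(winning_alternative(r"(a)|(ab)", "ab"))

# Swapping the order of the alternatives changes the outcome.
print(winning_alternative(r"(ab)|(a)", "ab"))
```

The order of the alternatives, not the length of the match, decides the
result in Python; whether the same holds in R is exactly the point the
?regex page leaves open.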

--
Boris Steipe MD, PhD

Professor em.
Department of Biochemistry 
Temerty Faculty of Medicine
University of Toronto

Kevin Coombes

2023-Feb-13 17:14 UTC


Chat bots are like politicians, or talking dogs. The fact that they exist
is interesting. But no sane person would believe anything they say.


Ravi Varadhan

2023-Feb-16 23:37 UTC


ChatGPT gave me R code containing an error when I asked about predicting a new
data point from a linear mixed-effects model.  So, it is not immune to coding
mistakes!

Ravi

