Chatbot-generated R Code
A good summary. At the level of the summary, none of it is specific to R. However, asking a chatbot "I want a program to analyze my data" will not produce anything useful. You have to phrase the question so the chatbot knows which language to use, what the data look like, and what you want to do. So I might write: "I have a data frame with three variables where V1 is time. I want an R program to calculate the mean of V3 for each unique value in V1." The answer will be R specific. The chatbot may guess at missing elements, including generating fake data to act as an example.
For "does the generated code work", the quality of the end result depends on the quality of the question asked. The chatbot works best with tightly focused, simple questions. Complicated problems require complicated prompts, and chatbots often ignore parts of long prompts. Developing good prompts is also a learned skill.
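For the example prompt above, an answer might look something like the following minimal sketch. The data frame `df` and all of its values are made up for illustration (much as a chatbot might invent them), and base R's `aggregate()` is only one of several reasonable ways to do this.

```r
# Made-up example data: V1 is time, V2 and V3 are measurements (values are hypothetical)
df <- data.frame(
  V1 = c(1, 1, 2, 2, 3),
  V2 = c(0.5, 0.7, 0.2, 0.9, 0.4),
  V3 = c(10, 20, 30, 50, 70)
)

# Mean of V3 for each unique value of V1
means <- aggregate(V3 ~ V1, data = df, FUN = mean)
print(means)
```

Because the data are small, the result is easy to check by hand, which is the validation step: the two rows with V1 = 1 have V3 values 10 and 20, so their mean should be 15.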
I would add
- is it worthwhile?
It takes time to write code. It also takes time to develop a good prompt. In both cases there will be a code validation and debugging step. What is the most efficient use of your time? What gives the best final product?
I might also ask: will you learn from it? The chatbot gave you an answer that you could not have written yourself when you asked. Given the chatbot's answer, do you understand it, and could you now get that answer yourself?
The chatbot wrote: if (!require(tidyverse)) {install.packages('tidyverse')}
It also explained the solution. I understand it, and it helps to avoid reinstalling packages that are already installed. You can take it further and ask: I have a program that requires 30 packages. In R, what is the shortest code that will check whether they are already installed and install the ones that I do not have?
I get several answers, one of which is:
pkgs <- c("dplyr", "ggplot2", "lme4")
to_install <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(to_install)) install.packages(to_install)
I can then copy the code and ask the chatbot to explain the code, and progress from there.
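One way to progress from the snippet above is to split it into a part that only decides what is missing and a part that installs. This is a sketch, not what the chatbot returned: the function names `missing_packages` and `install_if_missing` are my own, hypothetical names, and the `installed` argument exists only so the logic can be checked without touching the real library.

```r
# Return the packages in `pkgs` that are not in `installed`.
# `installed` defaults to what R reports for the current library paths.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Install only what is missing; returns the names invisibly.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

# Check the logic against a pretend library, without installing anything
missing_packages(c("dplyr", "ggplot2", "lme4"), installed = c("dplyr", "stats"))
```

Being able to test the decision separately from the installation is the kind of understanding that keeps the code from becoming a black box.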
If the chatbot is generating black boxes where "a miracle occurs", they will in the end cause you more problems than they solve. If all the code is generated by a chatbot and you do not understand it, what will you do when the boss stops by and asks for a modification or enhancement, or a customer stops by and states that your code generates errors?
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Richard O'Keefe
Sent: Tuesday, December 9, 2025 8:02 PM
To: Gregg Powell <g.a.powell at protonmail.com>
Cc: R help project <r-help at r-project.org>; Hans W <hwborchers at gmail.com>; Robert Knight <bobby.knight at gmail.com>
Subject: Re: [R] Chatbot -generated R Code
So to summarise, there are three key issues so far:
- does the generated code work
- does it infringe on someone else's intellectual property rights
- do the AI's terms of service permit you to use it
What are some other things people who want to use an AI to generate code should consider? Other than the application domain, is any of this specific to R?
On Wed, 10 Dec 2025 at 8:30 AM, Gregg Powell via R-help <r-help at r-project.org> wrote:
I did not say blindly trust LLMs nor did I recommend their use. That is up to each individual. Those who choose not to use LLMs will not be competitive against their peers who do - that is my claim.
As for me, I use LLMs. I have no axe to grind against using LLMs or those who use them. Honestly, at 58 - I did not think I'd see AI in my lifetime. I see LLMs as a tool. A very useful tool. I would not want to be a younger person having to compete against AI. I am glad to be in a position where AI and its impact on society will have little or no financial impact on me personally. I commiserate with those not in a similar circumstance.
I see many taking a supercilious attitude toward those who use AI (as demonstrated in your emails, for instance) - particularly among coders. Ironically, coders are among the first and hardest hit by AI, along with graphic designers, writers, researchers, data scientists... there is a long and growing list.
The genie is out of the bottle. Governments are run by people either too greedy or power hungry to curtail the technology. It is the start of a new arms race. Some claim it will help society, others claim it will destroy it. As most things usually go - the truth most probably lies somewhere in the middle. Only time will tell.
All the best!
Gregg
On Tuesday, December 9th, 2025 at 10:06 AM, Robert Knight <bobby.knight at gmail.com> wrote:
Responding with LLM output to a question about risk and the legality of something is not comforting. Naked Capitalism reported that hallucinations are increasing, not decreasing, in language models.
I shall trust my own brain over an LLM output. Are you really suggesting that people trust an LLM counterview of the meaning of the contracts they sign?
This kind of thinking, and that guy who did not understand the central law of large numbers, both experts in the field, is why people like me have to work in other occupations and argue in the public sphere until someone like Kennedy can get into place.
An LLM tells me not to believe my lying eyes and cognitive understanding of the contract I am about to sign.... Trust the LLM, you say.
My word.
On Tuesday, December 9, 2025, Gregg Powell <g.a.powell at protonmail.com> wrote:
Let's let Claude respond back itself:
r/ Gregg
On Tuesday, December 9th, 2025 at 8:39 AM, Robert Knight <bobby.knight at gmail.com> wrote:
It seems like malpractice to recommend Claude to someone using R or big data, since what they would use it for is *explicitly* against the terms of service. Machine learning predates the microchip.
See below.
Also, quality control will make a comeback. Expert systems cannot be replaced with something akin to Bayes probability charts indefinitely.
you may not use the service to "develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models."
Claude's terms further state:
"Equitable relief. You agree that (a) no adequate remedy exists at law if you breach Section 3 (Use of Our Services); (b) it would be difficult to determine the damages resulting from such breach, and any such breach would cause irreparable harm; and (c) a grant of injunctive relief provides the best remedy for any such breach. You waive any opposition to such injunctive relief, as well as any demand that we prove actual damage or post a bond or other security in connection with such injunctive relief."
Machine learning includes linear regression. Other machine learning algorithms include logistic regression, decision trees, random forests, support vector machines, k-nearest neighbors, and Bayes algorithms. It seems to me that, as of 14 October 2024, no one seeking to handle any data science can legitimately use Claude.
On Tuesday, December 9, 2025, Gregg Powell via R-help <r-help at r-project.org> wrote:
Humans who don't adapt to LLMs, or whatever form AI takes as it evolves, will be left in the dust.
People may just now be waking up to the fact that we're three years into a tremendous revolution, one of the greatest in human history. It follows the Bronze Age, the Iron Age, the Industrial Revolution, the computer revolution, the Information Age, and now... AI.
AGI is approaching. How quickly? Who can say. Whether AI can ever be truly sentient remains a mystery. But once it can adequately replicate sentience, some will ask: what's the difference?
As to the question of who judges what's acceptable from a coding standpoint: capitalism will. Corporations will. And the question of whether this is the future of coding is already behind us. It is coding now, and it will only continue to improve in capability.
Try Replit, Cursor, Claude Code. Humans are incapable of keeping up. AI still struggles with some of the most complex tasks, and it does poorly at orchestrating across large repositories, but it's improving rapidly.
Just my observations.
Those who look down their noses at all this will be left behind.
All the best! Gregg
On Tuesday, December 9th, 2025 at 6:32 AM, Hans W <hwborchers at gmail.com> wrote:
SORRY if I missed such a discussion somewhere on R-HELP
For many years I wanted to write an R function that finds the closest pair of points among a, maybe huge, set of points on the 2-dimensional plane. I never did, perhaps considering the possible complexity of this task.
Now I found a book, among others describing the "sweeping algorithm", perfectly suited for the problem. And as a test, I questioned chatbots like DeepSeek and ChatGPT about such a function - and mentioned the sweeping algorithm.
DeepSeek, for instance, came immediately up with a complete, efficient solution and test cases that I checked with brute force. I can see that it utilized the sweeping algorithm, documented the code, and set up a help file. I made some changes, improved the code a bit, but still it is code generated by a clever chatbot, whatever I do.
Now I ask myself: Is this a correct and lawful way to write code in the future? I am not even sure DeepSeek may not have used an implementation of the sweeping algorithm that is under ACM license and would not be allowed on CRAN.
I wonder how one handles this matter? Will this be the future of code writing (for R and other languages)? I would really appreciate to hear your opinion or a hint to a discussion about it.
Hans Werner
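For readers wondering what the brute-force check mentioned in the message above might look like, here is a minimal sketch. It is not the code Hans or DeepSeek wrote, and it is not the sweeping algorithm: it simply compares every pair of points using base R's `dist()`, so it needs memory on the order of n^2 and is only suitable for validating a faster implementation on small inputs. The function name `closest_pair_brute` and the example points are assumptions made for illustration.

```r
# Brute-force closest pair: compare all pairwise Euclidean distances.
# `pts` is a numeric matrix with one point per row and two columns (x, y).
closest_pair_brute <- function(pts) {
  stopifnot(is.matrix(pts), ncol(pts) == 2, nrow(pts) >= 2)
  d <- as.matrix(dist(pts))   # full n x n distance matrix
  diag(d) <- Inf              # ignore the distance of a point to itself
  idx <- which(d == min(d), arr.ind = TRUE)[1, ]  # first minimal pair
  list(pair = sort(as.integer(idx)), distance = min(d))
}

# Hypothetical test points: rows 1 and 3 are one unit apart
pts <- rbind(c(0, 0), c(5, 5), c(1, 0), c(10, 10))
res <- closest_pair_brute(pts)
print(res)
```

A sweep-based function can then be run on the same random point sets and its reported distance compared against `res$distance` with `all.equal()`.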
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.