35 Comments
David Silverbell:

Boo! I expected clickbait; I received a very well-thought-out piece. 1/5 stars, will read again.

Alan:

It's a very interesting question of what languages are optimal when agents are generating and editing most of the lines of code.

Having written a lot of Python and Rust, I don't think I would pick Rust unless I really knew I needed control over memory allocation. As a codebase gets larger, build times can get pretty rough, and that really starts to slow down cycle time. I don't really like Go, but if agents are writing it, that's less problematic. Also, it is harder to write hyper-complicated code in Go.

I have been interested in structured editing for years, but I would push back on the idea that agents think in structure, not text. They quite literally think in text. But it is interesting for me to imagine what structured editing for LLMs would look like. It could go a long way towards alleviating the problem of long build times, because the tool calls would presumably validate every edit. My concern would be that LLMs are currently very adapted for text editing.

Julio Nobrega Netto:

May I plug a project of mine, as I have strong opinions about this topic? I am designing Sigil https://github.com/inerte/sigil/ with LLMs only in mind. I can't say I've solved the speed issue; right now Sigil compiles to TypeScript, because I wanted the usefulness of building a web app. But exactly because of performance, I've been debating whether I shouldn't just target LLVM IR...

Caleb Fenton:

Neat! You should add some information about the project: why Sigil exists, what problem it solves, how it solves it, etc. That would help with discoverability and getting traction.

Shawn Willden:

I'm having a similar experience in the other direction. I've always been a fan of statically-typed languages and always disliked languages (like Python) that are loosey-goosey. I just know I'm writing a lot of bugs. I have 35 years of C++ under my belt, plus a lot of Java, and the last couple of years I've mostly been writing Rust. And in the last six months I've been using AI to write a lot of Rust -- and found it to be an incredibly smooth and excellent experience.

But in the last month or so I've switched to writing a lot of C++ using AI (a C++ adapter implementing an industry-standard C++ API that surfaces functionality from my Rust library)... and even though C++ is also a statically-typed language, it's a much looser one than Rust, and the AI is not nearly as good at it. There are a lot more bugs, and I have to scrutinize the code a lot more closely.

When using the AI to write Rust, I mostly scrutinize the data structures and high-level flows, and make sure the tests are checking the right things. I don't need to check the details of the code closely, because if the input types and the output types are right, and the function works (as proven by decent unit tests), the code works. The odds that there are any weird edge cases are very small.

In C++ this isn't true, even using the Modern C++ style that is supposed to accomplish most of what Rust does. I've found memory leaks, dangling pointers and data races which are all just impossible in Rust. I've also found many more subtle bugs. It does help to have an agent critique the code that the agent wrote (note: the same model is fine, but it must be a fresh context window -- AI is just as blind to its own mistakes as people are blind to theirs).

I've thought a bit about how this would translate into really "loose" languages, like Python or <shudder>JavaScript</shudder>, and I was pretty sure it wouldn't be good, even though the LLMs have a lot of coding habits that Java/C++/Rust/Haskell programmers don't: things like manually scanning the code to see where all of the effects of a change might be, where an experienced statically-typed-language programmer would make the change, run the compiler, and let it find all of the other places that need to change. (I have written a few rules to try to break the agents of that Pythonish habit, with moderate success at best.)

I don't know that I agree that Python is a great language for humans. It's okay. But current-generation AI needs a lot of guardrails to keep it on track, and Rust is *great* for that.

Nick Ruisi:

Every minute spent chasing down type-mismatch errors at runtime is a consequence of choosing a non-type-safe language at design time. Compiler errors are better than pissed-off customers, who may be unable to conduct business, or may have lost their data, because someone chose Python or Node/JS as the platform for the system being built. Compilers will catch model hallucinations when writing code as well. "Oh, it didn't compile because that method doesn't actually exist in the library! Let me rewrite it so that it calls methods that actually exist." is what Copilot told me two weeks ago about a snippet it spat out when I asked it for one. That code *looked* good, but the red squiggly underline in the IDE, driven by the background compilation, suggested otherwise.
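The runtime-vs-compile-time point above can be sketched in a few lines of Python (hypothetical function and values; a static checker like mypy, pyright, or ty would flag the bad call before it ever ran, while plain Python only fails once the code executes):

```python
def total(prices: list[float]) -> float:
    """Sum a list of prices."""
    return sum(prices)

# An agent "hallucinates" that total() takes a dict of name -> price.
# Nothing stops this call until runtime, where sum() iterates the dict's
# string keys and blows up:
try:
    total({"apple": 1.5, "banana": 2.0})
except TypeError as e:
    print("runtime failure:", e)
```

A compiled, type-checked language rejects the equivalent call at build time, which is exactly the "red squiggle" doing its job.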

helmingstay:

This is a great piece, thanks. I've been thinking about what an LLM-centered lang might look like and this answered a lot of my questions.

I do have one comment re "clean code" guidelines. It's plausible that we want to import some of these guidelines into LLM programming practice for reasons other than human preferences. For example, preferring smaller, composable functions. On their own, LLMs don't reduce the computational complexity of problems, e.g., the combinatorics of NP-hard problems. We often use linearization techniques to reason about complex systems and nonlinear dynamics, all of which are a given in all but the simplest codebase. Even for the clankers, I expect that smart design decisions can reduce the effective solution-space that needs to be explored.

All of which is to say that there's no free lunch for hard problems. It makes perfect sense to me that employing the compiler as a verifiable rewards engine reduces churn and increases robustness. It's also plausible that we continue to get improvements from modularity and clean factorization even when humans move outside the primary programming loop.

Alexander:

On that note, I just had a bug today where a text message and a web click should trigger the same business logic. Yet the LLM had duplicated the logic and, to nobody's surprise, drift occurred and the text-message route didn't work properly. Had the logic been extracted, as is recommended practice for humans, this wouldn't have happened.

I wonder which principles will remain useful. I believe SOLID principles have a solid future ahead of them.
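The fix being described, extracting the shared logic so both entry points call one function, might look like this minimal sketch (all names hypothetical):

```python
def process_order(order_id: str) -> str:
    # Single copy of the business logic. Both routes below call this,
    # so the two code paths cannot drift apart.
    return f"processed {order_id}"

def handle_text_message(body: str) -> str:
    # e.g. body == "ORDER 42"; pull out the order id, then delegate.
    return process_order(body.split()[1])

def handle_web_click(order_id: str) -> str:
    return process_order(order_id)
```

With the logic duplicated instead, a change applied to one route and forgotten in the other is exactly the drift bug above.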

helmingstay:

Thanks, this is such a spot-on example. I'm curious to see how much tools like speckit can handle "principled" coding versus more formal tools. At the end of the day, no compiler in the world can stop a clanker from making poor life choices.

Scott Locklin:

Should have picked golang. Congrats on your transition anyway.

Caleb Fenton:

Seriously eyeing Elixir so both the Go and Rust folks can hate on me.

Bill Allen:

I love writing Elixir much like I love writing Python, which might give you pause. Typing in Elixir isn't as loose as Python's, but it can bite at times.

Scott Locklin:

Always been fascinated by this one as well.

TheElectricPilgrim:

Hahaha, such a clickbait article. Man, there's still COBOL, PL/I, and other 'dead' languages alive and well in the world in places like banking and engineering. Greenfield design and exploration is great, but most people are dealing with legacy systems tied to monetisation, risk, assurance, and statutory regulation, none of which can be vibed away in an afternoon.

Peter W.:

Actually, it would be interesting if anyone has tried to vibe-code in COBOL, and what the results were like. Let's see what happens when Claude does COBOL 😎!

Codebra:

Claude or any of the other frontier models can write perfectly good COBOL. Any language that is documented online and not ridiculously obscure can be written by Claude or Codex.

Peter W.:

To follow up: here's something from Anthropic/Claude about this, but AFAIK they didn't list any examples. https://claude.com/blog/how-ai-helps-break-cost-barrier-cobol-modernization

Peter W.:

I agree, but haven't heard of an example for COBOL. There's actually a real demand there, as COBOL isn't going away anytime soon.

TheElectricPilgrim:

True, that would be cool.

Peter W.:

I've wondered for a while now how good Claude or another AI would be at writing machine language (machine code).

Lucas Mior:

Lack of type checking was never a virtue of Python.

Rothbard’s Spectacles:

Excellent article. I'm working on something myself in Rust that 3 months ago was all Python. I only kept a small sidecar in Python, but I've been rethinking that as well, and I've already downgraded source files to a projection of the “code atoms”. I hope it works ;)

Prompt or not - AI 4 everyone:

Even so, I'm a big fan of Python. But that doesn't mean I won't try Rust to see how it works.

Caleb Fenton:

I love Python. Been coding hardcore for 30 years, almost every day. Used a lot of Python. The past 4 months was all agent coding, and I noticed at that level I'm spending 30-40% of my time telling the agents: "add types, you didn't add types, we need types, it's in the style guide, it's in the agents.md, it's in the plan, it's in the prompt, add types," and then "no, don't just use Any for everything." You can wire in ty and other type checkers so that failures automatically kick the agents in the head. But then you're also fighting: "no, don't use RST-style docstrings, use Markdown." And then you're building pex files or wheels and all kinds of nonsense. Meanwhile, with Rust and Go, you compile it, it works, it's 1000x faster, single binary.
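The "don't just use Any for everything" complaint boils down to something like this sketch (hypothetical names; a checker such as mypy, pyright, or ty rejects bad callers of the typed version but is completely silent about the Any version):

```python
from typing import Any

def get_port_any(config: Any) -> Any:
    # Any silences the type checker entirely: passing None, a list, or a
    # misspelled key all "type-check" fine and fail only at runtime.
    return config["port"]

def get_port(config: dict[str, int]) -> int:
    # Real types let the checker reject a bad call before the code runs.
    return config["port"]

print(get_port({"port": 8080}))  # 8080
```

An agent that annotates everything as Any technically satisfies "add types" while giving the checker nothing to verify, which is why the style-guide fight keeps recurring.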

Prompt or not - AI 4 everyone:

Then I really need to try Rust. Where can I find more information about this solution?

Pjohn:

Regarding a machine-oriented programming language: If AIs learned to write Rust and Python by reading textbooks, StackOverflow, RosettaCode, etc. - could they learn to write similar-quality code in a brand-new niche language that had only an API and that by definition wouldn't have human-generated StackOverflow posts, textbooks, etc.?

Could a general-purpose AI (plus maybe automated transpiling etc.) be used to generate training examples for the new-language-AI to train on? Or would these examples necessarily be too low-quality to bootstrap a comprehensive understanding of the language from?

Caleb Fenton:

Great question.

In-context learning takes you really far, but you're spending a lot on context. AI labs do more than pour tokens into pretraining. They pay a gazillion different groups for bespoke datasets for RLHF. They are looking for datasets that challenge the model: tasks the model doesn't do well on yet, but can learn to do well. For example, here's some new niche language; the agent only gets the syntax right 30% of the time at first, because it's different from other languages and the model isn't trained on it. One team might be pouring tokens for that language into the pretraining pipeline, while another team gets RLHF datasets to teach the model how to write good code against tricky challenges in that language.

So we’re probably 6 months away from an AI specific language, closer to 3 years is my guess before we can use it. It’s “good enough” for now the way things are. Maybe there’s an edge for using rust over python, but you’ll need to be a lot better to pay the switching cost to some new agent only language.

Kabeer Kumar:

Dude, don't tell me Python is dead. I am making an ML project in Python right now 😂

Hearing that kind of stuff makes me want to cry. BTW, why do you think Python is dead?

Codebra:

"why do you think Python is dead?"

Try reading the article.

Kabeer Kumar:

Dude, I did think about that while writing that comment. The problem is that it's very hard to follow through on your headline in a post; as a Substack writer myself, I know, and although you tried your best, it felt more like "I like Rust and not Python because of types and the ability to specify everything, like in C++," rather than Python literally being dead. Could you do me a favor and actually write about what you intend to write about? I'm sure that way you'll write masterpieces.

P.S. I did actually read the post at your request, and liked this line: "Mandarin has been doing compression that programmers think they invented"

spence witten:

For all the thinking I have to do about AI I’ve never once thought about how languages are designed for humans and how different they might be if designed for machines. Thanks for making me feel dumb.

Kryptologyst:

Mojo aims to collapse the gap between high-level code and hardware performance by giving Python developers the power to speak directly to the silicon without leaving the syntax they already know.

Mark Sundstrom:

This is both clickbait and also very interesting. My perspective: I'm a retired programmer who has programmed for fun in Python for 20+ years. It's the most enjoyable language I've used, and I started with APL, ALGOL, and Fortran in the '70s. But I've used AI enough to see that the author has some great points.