The amount of goal post shifting is so amusing to see. Yes, sure, this was probably not an "important" or particularly "challenging" problem, even if it had been open for a while. Sure, maybe it remained open only because it never got enough eyeballs from the right people who cared to spend time on it. Yes, there is too much overhyping and we are all somewhat tired of it. Still: if someone had told me 10 years ago that we would get "AI" to a stage where it can solve olympiad-level problems and earn gold medals at the IMO, taking its input not in some rigid structured format but in our complex, messy natural language, while also interpreting (to whatever degree "interpreting" means anything here) image and video data, and doing all of this in near real time, I would have called them nuts and the whole scenario, on that timescale, science fiction. So part of me finds it crazy how quickly we have normalized this new reality.
A reality where we are debating whether the problem solved by an automated model, with a formally verified proof, was too easy.
Don't get me wrong: I am not saying any of this means we get AGI, or even that we will continue to see improvements. We can still appreciate things. It doesn't need to be a binary. What a time to be alive!
> The amount of goal post shifting is so amusing to see
Can you be specific about which goal posts are being shifted? As in, the specific comments you're referring to here. Maybe I'm just falling for the bait, but non-specific claims like this seem designed to annoy while offering nothing concrete to discuss.
I got to the end of your comment and, counting all the claims you yourself discounted, the only goal post I see left is that people aren't using a sufficiently excited tone while sifting fact from hype. A lot of us follow this work pretty closely and don't feel the need to start every post with "there is no need for excitement to abate, still exciting! but...".
> I am not saying any of this means we get AGI, or even that we will continue to see improvements. We can still appreciate things. It doesn't need to be a binary.
You'll note, however, that the hype guys happily include statements like "Vibe proving is here" in their posts with no nuance, all binary. Why not call them out?
Maybe the Turing test, for one. Maybe others.
Yeah, I agree. As a mathematician it's easy to say that it's not a super hard proof. But then again, the first proofs found by running AI on a large bucket of open problems were always going to be easy proofs of not-very-worked-on problems. The fact that no one did it before them definitely shows real progress being made. When you're an expert, it's easy to lose track of the fact that what you consider trivial versus very hard may in fact be a very small distance apart in the grand scheme of things (relative to entities whose strength differs by orders of magnitude).
I am rather pro-AI in general, but I can't imagine what I would have thought in 2015 if you had told me we would have AI that can solve an Erdős problem stated in natural language yet can't answer my work emails.
It actually doesn't help me with anything at my job, and I really wish it could.
That isn't really goal-post moving so much as a goal post with a very strange shape.
The most amazing thing about AI is how it's amazing and disappointing at the same time.
Everything is amazing and nobody is happy: https://www.youtube.com/watch?v=nUBtKNzoKZ4
"How quickly the world owes him something, he knew existed only 10 seconds ago"
The overhyped tweet from the Robinhood guy raising money for his AI startup is nicely put into perspective by Thomas Bloom (who notes, among other things, that #124 is not from the cited paper, "Complete sequences of sets of integer powers" [BEGL96]):
> This is a nice solution, and impressive to be found by AI, although the proof is (in hindsight) very simple, and the surprising thing is that Erdos missed it. But there is definitely precedent for Erdos missing easy solutions!
> Also this is not the problem as posed in that paper
> That paper asks a harder version of this problem. The problem which has been solved was asked by Erdos in a couple of later papers.
> One also needs to be careful about saying things like 'open for 30 years'. This does not mean it has resisted 30 years of efforts to solve it! Many Erdos problems (including this one) have just been forgotten about it, and nobody has seriously tried to solve it.[1]
And, indeed, Boris Alexeev (who ran the problem) agrees:
> My summary is that Aristotle solved "a" version of this problem (indeed, with an olympiad-style proof), but not "the" version.
> I agree that the [BEGL96] problem is still open (for now!), and your plan to keep this problem open by changing the statement is reasonable. Alternatively, one could add another problem and link them. I have no preference.[2]
Not to rain on the parade out of spite; it's just that this is neat, but not unusually neat compared to the last few months.
[1] https://twitter.com/thomasfbloom/status/1995083348201586965
[2] https://www.erdosproblems.com/forum/thread/124#post-1899
Reading the original paper and the Lean statement that got proven, it's kind of fascinating what exactly is considered interesting and hard in this problem.
Roughly, what the Lean theorem (and the statement on the website) asks is this: take some numbers t_i, for each of them form all the powers t_i^j, then combine them all into a multiset T. Barring some necessary conditions, prove that you can pick a subset of T summing to any number you want.
What the Erdős problem in the paper asks adds one more step: arbitrarily cut off the beginnings of the t_i^j power sequences before merging. Erdős and his co-authors conjectured that only a finite number of subset sums stop being achievable.
"Subsets sum to any number" is an easy condition to check (that's why "olympiad level" gets mentioned in the discussion); it's the "arbitrarily cut off" part that the original question is all about, and "only finitely many disappear" is hard to pin down with a formula.
So: overhyped, yes; not actually the Erdős problem that got proven, yes; and ordinary math-olympiad-level problems being solvable by the current level of AI, as this year's IMO showed, also yes (just don't fall for the https://en.wikipedia.org/wiki/AI_effect on the backlash side, because olympiads are hard, really!).
I’d be less skeptical about this year’s IMO claims if we had any information at all on how it was done.
I was so happy about this result until I saw your mention of the Robinhood hype :/
See, this is one of the reasons I struggle to get on board the AI hype train. Any time I've seen some breathless claim about its capabilities that feels a bit too good to be true, someone with knowledge of the domain takes a closer look and it turns out to have been exaggerated, meant to draw eyeballs and investors to some fledgling AI company.
I just feel that if we were genuinely on the cusp of an AI revolution, as is claimed, we wouldn't need to keep seeing this sort of thing. A lot of the industry feels full of flim-flam men trying to scam people, and if the tech were as capable as we keep being told, there would be no need for dishonesty or sleight of hand.
The thing is, we genuinely are going through an AI revolution. I don't even think that's that breathless of a claim. The contention is over whether it's about to revolutionize our economy, which is a far harder claim to substantiate and should be largely self-substantiating if it is going to happen.
The crypto train kinda ran out of steam, so all aboard the AI train.
That being said, I think AI has a lot more immediately useful cases than cryptocurrency. But it does feel a bit overhyped by people who stand to gain a tremendous amount of money.
I might get slammed/downvoted on HN for this, but I really wonder how much of VC is get-rich-quick cheerleading versus supporting products that will create strong and lasting growth.
I don't think you really need to wonder how much is cheerleading. Effectively all public statements from VCs are cheerleading for companies they have already invested in.
The more interesting part is the closed-door conversations. Earlier this year, for example, there seemed to be a pattern of VCs heavily invested in AI asking the other software companies they had invested in to figure out how to make AI useful for them and report back. I.e. "we invested heavily in hype, tell us how to make it real."
From my perspective, having worked in both industries and simply followed my passions and opportunities, all I see is that the same two bandwagons who latched onto crypto, either to grift or just to egotistically talk shit, have moved over to the latest technological breakthrough, while those of us silently working on interesting things are constantly rolling our eyes at comments from both sides of the peanut gallery.
If OpenAI saw an imminent path to AGI in 6 months (or in 5 years for that matter) they would not be pivoting to become a banal big tech ad company.
Short AI and tech, and just hope you get the timing right.
Also in the thread comments:
"I also wonder whether this 'easy' version of the problem has actually appeared in some mathematical competition before now, which would of course pollute the training data if Aristotle [Ed.: the clanker's name] had seen this solution already written up somewhere."
So in short, it was an easy problem that had already been solved thousands of years ago and the proof was so simple that it doesn't really count, and the AI used too many em-dashes in its response and it totally sucks.
> problem that had already been solved thousands of years ago
If by this you refer to "Aristotle" in the parent post - it's not that Aristotle. This is "Aristotle AI" - the name of their product.
I predict that the way AI will be useful to science, from the perspective of mathematics, is by figuring out combinatorially complex solutions to problems that would otherwise not be interesting to (or would be far too complex to be comprehended by) humans. With such capabilities, one could imagine AI being useful for designing super-materials, doing fancy things in biology and medicine, and generally finding useful patterns in complex systems.
But abstract mathematics doesn’t care about solutions to problems; it cares about understanding problem spaces. I do not think current AI helps with that.
Problems like the one discussed aren't interesting to applied mathematicians either, because of the lack of applications.
But yes, if this kind of AI produces new materials, solves diseases, etc., it will be very useful. We wouldn't care whether it arrived at those solutions through valid reasoning, though. A sloppy AI that has better 'guesses/intuitions' than humans, or that can guess and check such 'guesses/intuitions' for correctness faster, would be very useful.
And engineers don’t care about abstract mathematics: we care about math that solves problems. Being able to solve more problems with less human-years of effort is a big win.
However, at most they would be the heuristic function of a search mechanism: a good heuristic, but a heuristic nonetheless. For search we also need to identify when to abandon a path and which other entry point is promising, and I'm not sure our current techniques are good at that kind of problem.
I think it will be even more helpful to know that a simple proof doesn't exist because AI has tried for long enough and didn't find one. Once people know there is no easy proof of, say, Collatz or the Twin Primes Conjecture, those will not be as alluring to waste your time on.
"Recently, yet another category of low-hanging fruit has been identified as within reach of automated tools: problems which, due to a technical flaw in their description, are unexpectedly easy to resolve. Specifically, problem #124 https://www.erdosproblems.com/124 was a problem that was stated in three separate papers of Erdos, but in two of them he omitted a key hypothesis which made the problem a consequence of a known result (Brown's criterion). However, this fact was not noticed until Boris Alexeev applied the Aristotle tool to this problem, which autonomously located (and formalized in Lean) a solution to this weaker version of the problem within hours."
https://mathstodon.xyz/@tao/115639984077620023
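For reference, Brown's criterion (the "known result" Tao mentions) is, as best I remember it, the following completeness test; please check the original reference before relying on my wording:

    % Brown's criterion, stated from memory (verify against the literature):
    % a nondecreasing sequence of positive integers a_1 <= a_2 <= ... with a_1 = 1
    % is complete (every positive integer is a sum of finitely many distinct terms)
    % if and only if
    \[
      a_{n+1} \;\le\; 1 + \sum_{k=1}^{n} a_k \qquad \text{for all } n \ge 1.
    \]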
That doesn’t seem very fair. The problem was stated, and remained unsolved for all this time. You can’t take away that accomplishment just because the solution seems easy in hindsight.
It's technically true that this version of the problem was "low-hanging fruit", so that's an entirely fair assessment. Systematically spotting low-hanging fruit that others had missed is an accomplishment, but it's quite different from solving a genuinely hard problem and we shouldn't conflate the two.
My point is stronger than that. Some things only appear low hanging fruit in hindsight. My own field of physics is full of examples. Saying “oh that should’ve been easy” is wrong more often than it is right.
It’s a combination of the problem appearing to be low-hanging fruit in hindsight and the fact that almost nobody actually seemed to have checked whether it was low-hanging in the first place. We know it’s the latter because the problem was essentially uncited for the last three decades, and it didn't seem to have spread by word of mouth (spreading by word of mouth is how many interesting problems get spread in math).
This is different from the situation you are talking about, where a problem genuinely appears difficult, attracts sustained attention, and is cited repeatedly as many people attempt partial results or variations. Then eventually someone discovers a surprisingly simple solution to the original problem, which basically makes all the previous papers look ridiculous in hindsight.
In those cases, the problem only looks “easy” in hindsight, and the solution is rightly celebrated because there is clear evidence that many competent mathematicians tried and failed before.
Is there any evidence that this problem was ever attempted by a serious mathematician?
Sure, but unless all solvable problems can be said to "appear as low-hanging fruit in hindsight" this doesn't detract from Tao's assessment in any way. Solving a genuinely complex problem is a different matter than spotting simpler solutions that others had missed.
In this case, the solution might have been missed before simply because the difference between the "easy" and "hard" formulations of the problem wasn't quite clear, including perhaps to Erdős, prior to it being restated (manually) as a Lean goal to be solved. So this is a success story for formalization as much as AI.
One of the math academics on that thread says the following:
> My point is that basic ideas reappear at many places; humans often fail to realize that they apply in a different setting, while a machine doesn't have this problem! I remember seeing this problem before and thinking about it briefly. I admit that I haven't noticed this connection, which is only now quite obvious to me!
Doesn't this sound extremely familiar to anyone who has been taught difficult or abstract topics? Looking at the problem, you don't have the slightest idea what it's about, but then someone comes along, explains the inner workings, and it suddenly "clicks" for you.
So, yeah, this is exactly what I think is happening here. The solution was there, and it was simple, but nobody discovered it until now. And now that we have an explanation for it, we say "oh, it was really simple".
The bit that makes it very interesting is that this hadn't been discovered before, and now it has been discovered by an AI model.
Tao pushes back by hypothesizing that it may actually have been solved before but never officially written up, which would be how the model was able to solve the problem. However, there's no evidence (yet) for that hypothesis.
Is your argument that Terence Tao says it was a consequence of a known result and categorizes it as low-hanging fruit, but to you it feels like one of those things that's only obvious in retrospect once it's explained, and that without "evidence" for Tao's claim you're going to go with your vibes?
What exactly would constitute evidence?
"A consequence of a known result" describes literally every proof.
A known result from where? I didn't see a reference showing it was proven before. Maybe I missed it. Do you have a reference?
Related, independent, and verified:
GPT-5 solved Erdős problem #848 (combinatorial number theory):
https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8...
https://lifearchitect.ai/asi/
I read how GPT-5 contributed to the proof. It was not fully solved by GPT-5; it assisted. For more, see https://www.math.columbia.edu/~msawhney/Problem_848.pdf
FTL? Seriously?
This seems to be the second proof in a row from the same author using AI models. The first time, it was ChatGPT that wrote the formal Lean proof for Erdős Problem #340.
https://arxiv.org/html/2510.19804v1#Thmtheorem3
> In over a dozen papers, beginning in 1976 and spanning two decades, Paul Erdős repeatedly posed one of his “favourite” conjectures: every finite Sidon set can be extended to a finite perfect difference set. We establish that {1, 2, 4, 8, 13} is a counterexample to this conjecture.
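A quick sanity check of the Sidon property of that set is easy (assuming the standard definition that all pairwise differences of distinct elements are distinct); the hard part, showing it cannot be extended to a finite perfect difference set, is the actual content of the paper and is not reproduced here:

    from itertools import combinations

    def is_sidon(s):
        """True iff all pairwise differences of distinct elements are distinct."""
        diffs = [b - a for a, b in combinations(sorted(s), 2)]
        return len(diffs) == len(set(diffs))

    print(is_sidon({1, 2, 4, 8, 13}))  # True: the set is Sidon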
That one was vibe-coded: the AI was given a step-by-step, already-found proof and asked to "please rewrite this in Lean".
---
Here the AI did the proof itself.
For reference, this is from: https://harmonic.fun/
related article:
> Is Math the Path to Chatbots That Don’t Make Stuff Up?
https://www.nytimes.com/2024/09/23/technology/ai-chatbots-ch...
Meanwhile I have to use AI just to understand the problem statement.
Which makes you one of the top 1% of AI power users. [The other 99% are asking for the breast size of whatever starlet appeared on TV.]
That's not what we mean when we're asking for "the release of bigger models"
200DD parameters
Bigger is not always better. [Gemini told me that yesterday, during my daily psychotherapy]
"Mathematicial superintelligence" is so obnoxious. Why exactly do they think it is called an Erdős problem when Erdős didn't find the solution? Because Erdős discovered the real mathematics: the conjecture!
These people treat math research as if it is a math homework assignment. There needs to be an honest discussion about what the LLM is doing here. When you bang your head against a math problem you blindly try a bunch of dumb ideas that don't actually work. It wastes a lot of paper. The LLM automates a lot of that.
It is actually pretty cool that modern AI can help speed this up and waste less paper. It is very similar to how classical AI (Symbolica) sped up math research and wasted less paper. But we need to have an honest discussion about how we are using the computer as a tool. Instead malicious idiots like Vlad Tenev are making confident predictions about mathematical superintelligence. So depressing.
OK... has this been verified? I see no publication, or even an announcement, on Harmonic's webpage. If this were a big deal, you'd think they would treat it as a big deal. Or is this just hype?
Verified by Lean, so 99.99% yes.
Lean verified a proof of a solution to a problem, but was it the same problem as Erdős problem #124?
https://www.erdosproblems.com/forum/thread/124#post-1899
> My summary is that Aristotle solved "a" version of this problem (indeed, with an olympiad-style proof), but not "the" version.
> I agree that the [BEGL96] problem is still open (for now!), and your plan to keep this problem open by changing the statement is reasonable. Alternatively, one could add another problem and link them. I have no preference. — BorisAlexeev
There we go, so there is hype to some degree.
Is there some good literature to read about Lean? This is the first time I'm hearing about it and it seems pretty cool.
Are you kidding? This is an incredible result. Stuff like this is the most important work happening in AI right now. Automated theorem proving? It's not going too far to say that the entire point of the technology was to get us to this.
This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others? That's what's ultimately being sold by the tweet, right?
Ultimately the main thing that will stop it from solving literally "all the others" are things like the impossibility of solving the halting problem, considerations like P ≠ NP, etc. But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.
> But as we have just seen, despite these impossibility theorems, AI systems are still able to make substantive progress on solving important open real-world problems.
You seem to be smuggling in the assumption that this problem was "important".
>This is me being snarky and ignorant, but if it solved one problem and it is automated what’s stopping it from solving all the others?
Yeah, that's the crux of the matter. How did the AI do it? Using already existing math. If we need new math to prove Collatz, Goldbach, or Riemann, LLMs are simply SOL. That's what's missing, and the hype boys always avoid mentioning it.
> How did the AI do it? Using already existing math. If we need new math to prove Collatz, Goldbach, or Riemann, LLMs are simply SOL.
An unproved theorem now proved is by definition new math. Will LLMs get you to Collatz, Goldbach, or Riemann? Unclear.
But it's not like there's some magical, entirely unrelated to existing math, "new math" that was required to solve all the big conjectures of the past. They proceeded, as always, by proving new theorems one by one.
Yes, "new math" is neither magical nor unrelated to existing math, but that doesn't mean any new theorem or proof is automatically "new math." I think the term is usually reserved for the definition of a new kind of mathematical object, about which you prove theorems relating it to existing math, which then allows you to construct qualitatively new proofs by transforming statements into the language of your new kind of object and back.
I think eventually LLMs will also be used as part of systems that come up with new, broadly useful definitions, but we're not there yet.
>An unproved theorem now proved is by definition new math.
No. By _new math_ I mean new mathematical constructs and theories like (to mention the "newest" ones) category theory, information theory, homotopy type theory, etc. Something like Cantor inventing set theory, or Shannon with information theory, or Euler with graph theory.
AFAIK no new field of mathematics has been created _by_ AI. Feel free to correct me.
Sucks that fuckers dupe suckers. Personally, I find the news to be incredible.
(We detached this subthread from https://news.ycombinator.com/item?id=46094763.)
I think it is way too far to say that!
We've had automated theorem proving since the 60s. What we need is automated theorem discovery. Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence. Instead intelligence is a little game AI plays with Lean.
AI researchers keep trying to reduce intelligence into something tiny and approachable, like automated theorem proving. It's easy: you write the theorem you want proven and hope you get a proof. It works or it doesn't. Nice and benchmarkable.
Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space? Or how to suggest an alternative to Turing machines / lambda calculus that expresses the same underlying idea?
> What we need is automated theorem discovery.
I've been thinking: mathematicians have fun doing math, making discoveries, crafting proofs.
Does the Tour de France & Co. make no sense now that small, lightweight, and powerful e-bikes have appeared?
Using the computer as a helper is like using a bicycle; using LLMs seems more like riding an e-bike, which is something else entirely.
> We've had automated theorem proving since the 60s.
By that logic, we've had LLMs since the 60s!
> What we need is automated theorem discovery.
I don't see any reason you couldn't train a model to do this. You'd have to focus it on generating follow-up questions to ask after reading a corpus of literature, playing around with some toy examples in Python and making a conjecture out of it. This seems much easier than training it to actually complete an entire proof.
> Erdős discovered these theorems even if he wasn't really able to prove them. Euler and Gauss discovered a ton of stuff they couldn't prove. It is weird that nobody considers this to be intelligence.
Who says they don't? I wouldn't be surprised if HarmonicMath, DeepMind, etc. have also thought about this kind of thing.
> Automated axiom creation seems a lot harder. How is an LLM supposed to know that "between any two points there is a line" formalizes an important property of physical space?
That's a good question! It would be interesting to see if this is an emergent property of multimodal LLMs trained specifically on this kind of thing. You would need mathematical reasoning, visual information and language encoded into some shared embedding space where similar things are mapped right next to each other geometrically.
>> We've had automated theorem proving since the 60s.
> By that logic, we've had LLMs since the 60s!
From a bit earlier [0], actually. Were those "large"? I'm sure at the time they were thought to be so.
[0] https://ai-researchstudies.com/history-of-large-language-mod...
More interesting discussion than on Twitter here:
https://www.erdosproblems.com/forum/thread/124#post-1892
Ok, we've changed the link to that from https://twitter.com/vladtenev/status/1994922827208663383, but I've included the latter in the toptext. Thanks!
This is the response from a mathematician: "This is quite something, congratulations to Boris and Aristotle!
On one hand, as the nice sketch provided below by tsaf confirms, the final proof is quite simple and elementary - indeed, if one was given this problem in a maths competition (so therefore expected a short simple solution existed) I'd guess that something like the below would be produced. On the other hand, if something like this worked, then surely the combined talents of Burr, Erdős, Graham, and Li would have spotted it.
Normally, this would make me suspicious of this short proof, in that there is overlooked subtlety. But (a) I can't see any and (b) the proof has been formalised in Lean, so clearly it just works!
Perhaps this shows what the real issue in the [BEGL96] conjecture is - namely the removal of 1 and the addition of the necessary gcd condition. (And perhaps at least some subset of the authors were aware of this argument for the easier version allowing 1, but this was overlooked later by Erdős in [Er97] and [Er97e], although if they were aware then one would hope they'd have included this in the paper as a remark.)
At the moment I'm minded to keep this as open, and add the gcd condition in the main statement, and note in the remarks that the easier (?) version allowing 1 and omitting the gcd condition, which was also asked independently by Erdős, has been solved."
The commentator is saying: "I can't believe this famous problem was solved so easily. I would have thought it was a fake proof, but the computer verified it. It turns out the solution works because it addresses a slightly different set of constraints (regarding the number 1) than what Erdős originally struggled with." (Generated by Gemini)
> Generated by Gemini
Please don't post generated comments here. We want HN to be a place for humans writing in their own voice.
I think that person is the owner of the website discussing it too.
He was cited at https://the-decoder.com/leading-openai-researcher-announced-... where, a while back, OpenAI made a misleading claim about solving some of these problems.
[flagged]
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
"Don't be curmudgeonly. Thoughtful criticism is fine, but please don't be rigidly or generically negative."
https://news.ycombinator.com/newsguidelines.html
Seriously, blame the investors.
Nah, everyone in the bubble is to blame here.
I know the problem the AI solved wasn't that hard, but one should consider math toast within the next 10-15 years.
Software engineering is more subtle since you actually care about the performance and semantics of a program. Unless you get really good at program specification, it would be difficult to fully automate.
But with math the only thing you care about is whether the moves you made were right.
> one should consider math toast within the next 10-15 years.
I keep a list of ridiculously failed predictions, and this goes into it.
Can I ask you to be more concrete? What does "math is toast" mean to you?
I'd like to see some other instances from your list.
It’s crazy that the empathetic AI I sometimes share my life’s mundane problems with wrote this. It understands such high level maths and speaks in a lexicon with words I can’t even pronounce. Incredible versatility. If this isn’t AGI (or even super intelligence) then what is?
Sure, it suffers from amnesia and sometimes cannot get basic things right, but the first is a design limitation that can be overcome, and the second a possible problem caused by training (we're discovering that an overt focus during training on prompt adherence somewhat lobotomizes general competence).
As for the first: everything points towards that problem NOT being solvable with current architectures.
As for the second: we are completely, 100% sure this cannot be solved. It is not a training issue; it is the case for any statistical machine. No amount of training can ever solve it.