DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

206 points by victorbuilds 7 hours ago

Notable: they open-sourced the weights under Apache 2.0, unlike OpenAI and DeepMind whose IMO gold models are still proprietary.

PunchyHamster 3 hours ago

I think we should treat copyright for the weights the same way the AI companies treat source material ;)
- littlestymaar 2 hours ago
  
  We don't even have to do that: weights being entirely machine generated without human intervention, they are likely not copyrightable in the first place.
  In fact, we should collectively refuse to abide by these fantasy licenses before weight copyrightability gets created out of thin air because it's been commonplace for long enough.
SilverElfin 6 hours ago

If they open source just weights and not the training code and data, then it’s still proprietary.
- very_illiterate 5 hours ago
  
  Stop kvetching and read the submission title.
- mips_avatar 6 hours ago
  
  Yeah but you can distill
  - littlestymaar 2 hours ago
    
    You can distill closed weights models as well. (Just not logit-distillation)
  - amelius 6 hours ago
    
    Is that the equivalent of decompile?
    
    c0balt 6 hours ago
    
    No, that is the equivalent of lossy compression.
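    To make the logit-distillation distinction above concrete, here is a minimal sketch, assuming a closed model's API returns only sampled text while open weights expose the full next-token logits. The function names and the toy logits are hypothetical, not taken from any real training code.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def logit_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the whole next-token distribution.

    Needs the teacher's logits, which open weights let you compute yourself.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def sampled_token_loss(sampled_token_id, student_logits):
    """Cross-entropy against one sampled token.

    This is the only signal available when the teacher returns text alone.
    """
    q = softmax(student_logits)
    return -math.log(q[sampled_token_id])

# Toy three-token vocabulary (made-up numbers for illustration).
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.0, 1.0, 0.0]

print(logit_distillation_loss(teacher_logits, student_logits))
print(sampled_token_loss(0, student_logits))
```

    The first loss compares full distributions; the second only sees which token the teacher happened to emit, which is why it is the lossier of the two.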
- falcor84 6 hours ago
  
  Isn't that a bit like saying that if I open source a tool, but not a full compendium of all the code that I had read, which led me to develop it, then it's not really open source?
  - KaiserPro 5 hours ago
    
    No, it's like releasing a binary. I can hook into it and its API and make it do other things. But I can't rebuild it from scratch.
    
    falcor84 4 hours ago
    
    > rebuild it from scratch
    That's beyond the definition of Open Source. Doing a bit of license research now, only the GPL has such a requirement - GPLv3:
    > The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities.
    But all other Open Source compliant licenses I checked don't, and just refer to making whatever is in the repo available to others.
    
    PunchyHamster 3 hours ago
    
    ok but just the model isn't even close to anything open, it's literally a compiled binary, without even the source data
  - nextaccountic 6 hours ago
    
    No, it's like saying that if you release under Apache license, it's not open source even though it's under an open source license
    For something to be open source it needs to have sources released. Sources are the things in the preferred format to be edited. So the code used for training is obviously source (people can edit the training code to change something about the released weights). Also the training data, under the same rationale: people can select which data is used for training to change the weights
    
    falcor84 4 hours ago
    
    Well, this is just semantics. I can have a repo that includes a collection of json files that I had generated via a semi-manual build process that depends on everything from the state of my microbiome to my cat's scratching pattern during Mercury's last retrograde. If I attach an open source license to it, then that's the source - do with it what you will. Otherwise, I don't see how this discussion doesn't lead to "you must first invent the universe".
    
    typ an hour ago
    
    The difference is that you can customize/debug it or not. You might say that a .EXE can be modified too. But I don't think that's the conventional definition of open source.
    I understand that these days, businesses and hobbyists just want to use free LLMs without paying subscriptions for economic motives, that is, either saving money or making money. They don't really care whether the source is truly available or not. They are just end users of a product, not open-source developers by any means.
  - exe34 5 hours ago
    
    "open source" as a verb is doing too much work here. are you proposing to release the human readable code or the object/machine code?
    if it's the latter, it's not the source. it's free as in beer. not freedom.
    
    falcor84 an hour ago
    
    Yes, I 100% agree. Open Source is a lot more about not paying than about liberty.
    This is exactly the tradeoff that we had made in the industry a couple of decades ago. We could have pushed all-in on Stallman's vision and the FSF's definition of Free Software, but we (collectively) decided that it's more important to get the practical benefits of having all these repos up there on GitHub and us not suing each other over copyright infringement. It's absolutely legitimate to say that we made the wrong choice, and I might agree, but a choice was made, and Open Source != Free Software.
    https://www.gnu.org/philosophy/open-source-misses-the-point....
  - nurettin 6 hours ago
    
    Is this a troll? They don't want to reproduce your open source code, they want to reproduce the weights.
    
    falcor84 2 hours ago
    
    What does open sourcing have to do with "reproducing"? Last I checked, open sourcing is about allowing others to modify and to distribute the modified version, which you can do with these. Yes, having the full training data and tooling would make it significantly easier, and it is a requirement for GPL, but not for Open Source licenses in general. You may add this as another argument in favor of going back in time and doing more to support Richard Stallman's vision, but this is the world in which we live now.
    
    nurettin 2 hours ago
    
    For obvious reasons, there is no world in which you can "build" this kind of so-called open source project without the data sets. Play around with words all you want.
  - fragmede 6 hours ago
    
    No. In that case, you're providing two things, a binary version of your tool, and the tool's source. That tool's source is available to inspect and build their own copy. However, given just the weights, we don't have the source, and can't inspect what alignment went into it. In the case of DeepSeek, we know they had to purposefully cause their model to consider Tiananmen Square something it shouldn't discuss. But without the source used to create the model, we don't know what else is lurking around inside the model.
    
    NitpickLawyer 6 hours ago
    
    > However, given just the weights, we don't have the source
    This is incorrect, given the definitions in the license.
    > (Apache 2.0) "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
    (emphasis mine)
    In LLMs, the weights are the preferred form of making modifications. Weights are not compiled from something else. You start with the weights (randomly initialised) and at every step of training you adjust the weights. That is not akin to compilation, for many reasons (both theoretical and practical).
    In general licenses do not give you rights over the "know-how" or "processes" in which the licensed parts were created. What you get is the ability to inspect, modify, redistribute the work as you see fit. And most importantly, you modify the work just like the creators modify the work (hence the preferred form). Just not with the same data (i.e. you can modify the source of chrome all you want, just not with the "know-how and knowledge" of a google engineer - the license can not offer that).
    This is also covered in the EU AI act btw.
    > General-purpose AI models released under free and open-source licences should be considered to ensure high levels of transparency and openness if their parameters, including the weights, the information on the model architecture, and the information on model usage are made publicly available. The licence should be considered to be free and open-source also when it allows users to run, copy, distribute, study, change and improve software and data, including models under the condition that the original provider of the model is credited, the identical or comparable terms of distribution are respected.
    
    fragmede 6 hours ago
    
    > In LLMs, the weights are the preferred form of making modifications.
    No they aren't. We happen to be able to do things to modify the weights, sure, but why would any lab ever train something from scratch if editing weights was preferred?
    
    NitpickLawyer 5 hours ago
    
    training is modifying the weights. How you modify them is not the object of a license, never was.
    
    v9v 5 hours ago
    
    Would you accept the argument that compiling is modifying the bytes in the memory space reserved for an executable?
    I can edit the executable at the byte level if I so desire, and this is also what compilers do, but the developer would instead be modifying the source code to make changes to the program and then feed that through a compiler.
    Similarly, I can edit the weights of a neural network myself (using any tool I want) but the developers of the network would be altering the training dataset and the training code to make changes instead.
    
    falcor84 an hour ago
    
    The big difference that an Open Source license gives me is that regardless of the tool I use to make the edits, if I rewrite the bytes of the Linux kernel, I can freely release my version with the same license, but if I rewrite the bytes of Super Mario Odyssey and try to release the modified version, I'll soon be having a very fun time at the bankruptcy court.
    
    NitpickLawyer 5 hours ago
    
    I think the confusion for a lot of people comes from what they imagine compilation to be. In LLMs, the process is this (simplified):
    define_architecture (what the operations are, and the order in which they're performed)
    initialise_model(defined_arch) -> weights. Weights are "just" hardcoded values. Nothing more, nothing less.
    The weights are the result of the arch, at "compile" time.
    optimise_weights(weights, data) -> better_weights.
    ----
    You can, should you wish, totally release a model after initialisation. It would be a useless model, but, again, the license does not deal with that. You would have the rights to run, modify and release the model, even if it were a random model.
    tl;dr; Licenses deal with what you can do with a model. You can run it, modify it, redistribute it. They do not deal with how you modify them (i.e. what data you use to arrive at the "optimal" hardcoded values). See also my other reply with a simplified code example.
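    A minimal sketch of the three steps above, assuming a toy linear model and plain gradient descent in place of a real LLM. The function names mirror the pseudocode; the data, learning rate and step count are hypothetical.

```python
import random

def define_architecture(n_inputs):
    """The 'architecture' is only the shape of the computation (one linear layer here)."""
    return {"n_inputs": n_inputs}

def initialise_model(arch, seed=0):
    """Weights exist from the start; before training they are just random numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(arch["n_inputs"])]

def predict(weights, row):
    return sum(w * v for w, v in zip(weights, row))

def loss(weights, xs, ys):
    """Mean squared error of the model on the data."""
    return sum((predict(weights, r) - y) ** 2 for r, y in zip(xs, ys)) / len(ys)

def optimise_weights(weights, xs, ys, lr=0.05, steps=500):
    """Every step nudges the same list of numbers; the data only steers the updates."""
    w = list(weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, y in zip(xs, ys):
            err = predict(w, row) - y
            for i, v in enumerate(row):
                grad[i] += 2.0 * err * v / len(ys)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical training data generated from known target weights.
data_rng = random.Random(1)
xs = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(64)]
ys = [1.0 * a - 2.0 * b + 0.5 * c for a, b, c in xs]

arch = define_architecture(3)
weights = initialise_model(arch)
better_weights = optimise_weights(weights, xs, ys)
print(loss(weights, xs, ys), loss(better_weights, xs, ys))
```

    Both `weights` and `better_weights` are the same kind of object, a list of numbers of the same length, and either could be released, run or edited further.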
    
    noodletheworld 5 hours ago
    
    > And most importantly, you modify the work just like the creators modify the work
    Emphasis mine.
    Weights are not open source.
    You can define terms to mean whatever you want, but fundamentally if you cannot modify the “output” the way the original creators could, it's not in the spirit of open source.
    Isn't that literally what you said?
    How can you possibly claim both that a) you can modify it the way the creators did, b) that's all you need to be open source, but…
    Also c) the categorically incorrect assertion that the weights allow you to do this?
    Whatever, I guess, but your argument is logically wrong, and philosophically flawed.
    
    NitpickLawyer 5 hours ago
    
    > Weights are not open source.
    If they are released under an open source license, they are.
    I think you are confusing two concepts. One is the technical ability to modify weights. And that's what the license grants you. The right to modify. The second is the "know-how" on how to modify the weights. That is not something that a license has ever granted you.
    Let me put it this way:
```python
THRESHOLD = 0.73214
# input() returns a string, so convert it before comparing
if float(input()) < THRESHOLD:
    print("low")
else:
    print("high")
```
    If I release that piece of code under Apache 2.0, you have the right to study it, modify it and release it as you see fit. But you can not have the right (at least the license doesn't deal with that) to know how I reached that threshold value. And me not telling you does not in any way invalidate the license being Apache 2.0. That's simply not something that licenses do.
    In LLMs the source is a collection of architecture (when and how to apply the "ifs"), inference code (how to optimise the computation of the "ifs") and hardcoded values (weights). You are being granted a license to run, study, modify and release those hardcoded values. You do not, never had, never will in the scope of a license, get the right to know how those hardcoded values were reached. The process by which those values were found can be anything from "dreamt up" to "found via ML". The fact that you don't know how those values were derived does not in any way preclude you from exercising the rights under the license.
    
    roblabla 4 hours ago
    
    You are fundamentally conflating releasing a binary under an open source license with the software being open source. Nobody is saying that they're violating the license of Apache2 by not releasing the training data. What people are objecting to is that calling this release "open source", when the only thing covered by the open source license is the weights, is an abuse of the meaning of "Open Source".
    To give you an example: I can release a binary (without sources) under the MIT - an open source license. That will give you the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of said binary. In doing so, I would have released the binary under an open source license. However, most people would agree that the software would not be open source under the conventional definition, as the sources would not be published. While people could modify it by disassembling it and modifying it, there is a general understanding that Open Source requires distributing the _sources_.
    This is very similar to what is being done here. They're releasing the weights under an open source license - but the overall software is not open source.
- amelius 6 hours ago
  
  True. But the headline says open weights.
- ekianjo 5 hours ago
  
  It's just open weights, the source has no place in this expression
- jimmydoe 4 hours ago
  
  you are absolutely right. I'd rather use true closed models, not fake open source ones from China.

yorwba 6 hours ago

Previous discussion: https://news.ycombinator.com/item?id=46072786 218 points 3 days ago, 48 comments

victorbuilds 5 hours ago

Ah, missed that one. Thanks for the link.

ilmj8426 6 hours ago

It's impressive to see how fast open-weights models are catching up in specialized domains like math and reasoning. I'm curious if anyone has tested this model for complex logic tasks in coding? Sometimes strong math performance correlates well with debugging or algorithm generation.

alansaber 4 hours ago

It makes complete sense to me: highly-specific models don't have much commercial value, and at-scale llm training favours generalism.
stingraycharles 4 hours ago

kimi-k2 is pretty decent at coding but it’s nowhere near the SOTA models of Anthropic/OpenAI/Google.
- tripplyons an hour ago
  
  Are you referring to the new reasoning version of Kimi K2?

simianwords 4 hours ago

A bit important that this model is not general purpose whereas the ones Google and OpenAI used were general purpose.

yorwba 4 hours ago

Both OpenAI and Google used models made specifically for the task, not their general-purpose products.
OpenAI: https://xcancel.com/alexwei_/status/1946477756738629827#m "we are releasing GPT-5 soon, and we’re excited for you to try it. But just to be clear: the IMO gold LLM is an experimental research model. We don’t plan to release anything with this level of math capability for several months."
DeepMind: https://deepmind.google/blog/advanced-version-of-gemini-with... "we additionally trained this version of Gemini on novel reinforcement learning techniques that can leverage more multi-step reasoning, problem-solving and theorem-proving data. We also provided Gemini with access to a curated corpus of high-quality solutions to mathematics problems, and added some general hints and tips on how to approach IMO problems to its instructions."
- simianwords 2 hours ago
  
  https://x.com/sama/status/1946569252296929727
  >we achieved gold medal level performance on the 2025 IMO competition with a general-purpose reasoning system! to emphasize, this is an LLM doing math and not a specific formal math system; it is part of our main push towards general intelligence.
  asterisks mine
  - yorwba 2 hours ago
    
    DeepSeekMath-V2 is also an LLM doing math and not a specific formal math system. What interpretation of "general purpose" were you using where one of them is "general purpose" and the other isn't?
    
    simianwords an hour ago
    
    This model can’t be used for say questions on biology or history.
    
    yorwba an hour ago
    
    How do you know how well OpenAI's unreleased experimental model does on biology or history questions?
- simianwords 3 hours ago
  
  Not true
mangolie 4 hours ago

https://x.com/deepseek_ai/status/1995452646459858977
Boom
- andy12_ 2 hours ago
  
  Do note that that is a different model. The one we are talking about here, DeepSeekMath-V2, is indeed overcooked with math RL. It's so eager to solve math problems, that it even comes up with random ones if you prompt it with "Hello".
  https://x.com/AlpinDale/status/1994324943559852326?s=20
- yorwba 3 hours ago
  
  That's a different model: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale
- simianwords 4 hours ago
  
  Oh you may be correct. Are these models general purpose or fine tuned for mathematics?

terespuwash 6 hours ago

Why isn’t OpenAI’s gold medal-winning model available to the public yet?

esafak 2 hours ago

'coz it was for advertisement. They'll roll their lessons into the next general purpose model.

H8crilA 5 hours ago

How do you run this kind of a model at home? On a CPU on a machine that has about 1TB of RAM?

pixelpoet 4 hours ago

Wow, it's 690GB of downloaded data, so yeah, 1TB sounds about right. Not even my two Strix Halo machines paired can do this, damn.
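A rough back-of-the-envelope check of that estimate, assuming the weights are loaded as downloaded and that runtime overhead (KV cache, activations, OS) adds about 20%. The overhead figure and the 128 GB per-machine size are assumptions for illustration, not measurements.

```python
def ram_needed_gb(checkpoint_gb, overhead_fraction=0.2):
    """Checkpoint size plus an assumed fractional allowance for runtime overhead."""
    return checkpoint_gb * (1.0 + overhead_fraction)

def fits(checkpoint_gb, ram_gb, overhead_fraction=0.2):
    """True if the estimated requirement fits in the given amount of RAM."""
    return ram_needed_gb(checkpoint_gb, overhead_fraction) <= ram_gb

CHECKPOINT_GB = 690  # download size quoted in the thread

print(round(ram_needed_gb(CHECKPOINT_GB)))  # estimated GB needed
print(fits(CHECKPOINT_GB, 1024))            # a 1 TB machine
print(fits(CHECKPOINT_GB, 2 * 128))         # two machines at an assumed 128 GB each
```

Quantizing the weights further would shrink the first number, at some cost in quality.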
bertili 3 hours ago

Two 512GB Mac Studios connected with thunderbolt 5.
Gracana 4 hours ago

You can do it slowly with ik_llama.cpp, lots of RAM, and one good GPU. Also regular llama.cpp, but the ik fork has some enhancements that make this sort of thing more tolerable.

letmetweakit 4 hours ago

Does anyone know if this will become available on OpenRouter?

sschueller 5 hours ago

How is OpenAI going to be able to serve ads in chatgpt without everyone immediately jumping ship to another model?

Coffeewine 5 hours ago

I suppose the hope is that they don’t, and we wind up with commodity frontier models from multiple providers at market rates.
miroljub 5 hours ago

I don't care about OpenAI even if they don't serve ads.
I can't trust any of their output until they become honest enough to change their name to CloseAI.
PunchyHamster 3 hours ago

by having datacenters with GPUs and API everyone uses.
So they are either earning money directly or on the API calls.
Now, competition can come and compete on that, but they will probably still be the first choice for the foreseeable future
KeplerBoy 4 hours ago

Google served ads for decades and no one ever jumped ship to another search engine.
- sschueller 4 hours ago
  
  Because Google gave the best results for a long time.
  - PunchyHamster 3 hours ago
    
    and now, when they are not, everyone else's results are also pretty terrible...
- bootsmann 4 hours ago
  
  They pay $30bn (more than OpenAI's lifetime revenue) each year to make sure no one does.
  - KeplerBoy 34 minutes ago
    
    What are you referring to?
dist-epoch 4 hours ago

The same way people stayed on Google despite DuckDuckGo existing.