漢德百科全書 | 汉德百科全书
       
Chinese — German
深度求索
  Shēn dù qiú suǒ
  4 11 months ago 10 months ago
DeepSeek
DeepSeek (深度求索) 成立于 2023 年,是一家致力于将 AGI 变为现实的中国公司。与以前的模型相比,DeepSeek-V3 在推理速度上实现了重大突破。它在开源模型中名列前茅,可与全球最先进的闭源模型相媲美。

DeepSeek-V3 Capabilities

DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models.

It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

  Benchmark (Metric) DeepSeek V3 DeepSeek V2.5 Qwen2.5 Llama3.1 Claude-3.5 GPT-4o
    0905 72B-Inst 405B-Inst Sonnet-1022 0513
               
  Architecture MoE MoE Dense Dense - -
               
  # Activated Params 37B 21B 72B 405B - -
               
  # Total Params 671B 236B 72B 405B - -
English MMLU (EM) 88.5 80.6 85.3 88.6 88.3 87.2
MMLU-Redux (EM) 89.1 80.3 85.6 86.2 88.9 88.0
MMLU-Pro (EM) 75.9 66.2 71.6 73.3 78.0 72.6
DROP (3-shot F1) 91.6 87.8 76.7 88.7 88.3 83.7
IF-Eval (Prompt Strict) 86.1 80.6 84.1 86.0 86.5 84.3
GPQA-Diamond (Pass@1) 59.1 41.3 49.0 51.1 65.0 49.9
SimpleQA (Correct) 24.9 10.2 9.1 17.1 28.4 38.2
FRAMES (Acc.) 73.3 65.4 69.8 70.0 72.5 80.5
LongBench v2 (Acc.) 48.7 35.4 39.4 36.1 41.0 48.1
Code HumanEval-Mul (Pass@1) 82.6 77.4 77.3 77.2 81.7 80.5
LiveCodeBench (Pass@1-COT) 40.5 29.2 31.1 28.4 36.3 33.4
LiveCodeBench (Pass@1) 37.6 28.4 28.7 30.1 32.8 34.2
Codeforces (Percentile) 51.6 35.6 24.8 25.3 20.3 23.6
SWE Verified (Resolved) 42.0 22.6 23.8 24.5 50.8 38.8
Aider-Edit (Acc.) 79.7 71.6 65.4 63.9 84.2 72.9
Aider-Polyglot (Acc.) 49.6 18.2 7.6 5.8 45.3 16.0
Math AIME 2024 (Pass@1) 39.2 16.7 23.3 23.3 16.0 9.3
MATH-500 (EM) 90.2 74.7 80.0 73.8 78.3 74.6
CNMO 2024 (Pass@1) 43.2 10.8 15.9 6.8 13.1 10.8
Chinese CLUEWSC (EM) 90.9 90.4 91.4 84.7 85.4 87.9
C-Eval (EM) 86.5 79.5 86.1 61.5 76.7 76.0
C-SimpleQA (Correct) 64.1 54.1 48.4 50.4 51.3 59.3
This image, video or audio may be copyrighted. It is used for educational purposes only. If you find it, please notify us byand we will remove it immediately.
11 months ago