• @Sl00k@programming.dev
    link
    fedilink
    English
    -22 months ago

    The jump from GPT-4o -> o1 (preview not full release) was a 20% cumulative knowledge jump. If that’s not an improvement in accuracy I’m not sure what is.

    • @Aceticon@lemmy.world
      link
      fedilink
      3
      edit-2
      2 months ago

      One of the first things they teach you in Experimental Physics is that you can’t derive a curve from just 2 data points.

      You can just as easilly fit an exponential growth curve to 2 points like that one 20% above the other, as you can a a sinusoidal curve, a linear one, an inverse square curve (that actually grows to a peak and then eventually goes down again) and any of the many curves were growth has ever diminishing returns and can’t go beyond a certain point (literally “with a limit”)

      I think the point that many are making is that LLM growth in precision is the latter kind of curve: growing but ever slower and tending to a limit which is much less than 100%. It might even be like more like the inverse square one (in that it might actually go down) if the output of LLM models ends up poluting the training sets of the models, which is a real risk.

      You showing that there was some growth between two versions of GPT (so, 2 data points, a before and an after) doesn’t disprove this hypotesis. I doesn’t prove it either: as I said, 2 data points aren’t enough to derive a curve.

      If you do look at the past growth of precision for LLMs, whilst improvement is still happening, the rate of improvement has been going down, which does support the idea that there is a limit to how good they can get.

      • @Sl00k@programming.dev
        link
        fedilink
        English
        12 months ago

        which does support the idea that there is a limit to how good they can get.

        I absolutely agree, im not necessarily one to say LLMs will become this incredible general intelligence level AIs. I’m really just disagreeing with people’s negative sentiment about them becoming worse / scams is not true at the moment.

        I doesn’t prove it either: as I said, 2 data points aren’t enough to derive a curve

        Yeah only reason I didn’t include more is because it’s a pain in the ass pulling together multiple research papers / results over the span of GPT 2, 3, 3.5, 4, 01 etc.