Posts: 19
Comments: 162
Joined: 2 yr. ago

  • My original comment was poorly worded. Ghana gained independence 69 years ago, not 200.

    The kidnapping and exploitation were in full swing 200 years ago, but they started earlier than that and ended way later, if ever.

  • Plunder a country

    Kidnap their people to make slaves

    Impose anti-gay laws on them

    200 years later: They are expanding the anti-gay laws! Why are they so backward?!

    Pay reparations or shut up.

  • Top tier post. Can't wait for the experts at awful.systems to come explain. The question by OP is a very good question so I am sure we will get expert explanations from users of the awful.systems instance.

  • They said they're working on Orthrus for Qwen 3.5. It'll be amazing!

  • My oversimplified and possibly wrong understanding: this is like speculative decoding, but instead of a separate draft model (which does its own prompt processing), they use some diffusion thing strapped on top of the main model. The diffusion reuses the high-quality prompt processing result of the main model.

    The 7.8x faster claim sounds almost too good to be true. But even if we get like 3x then this is still a huge revolution in localLLMing.
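
A minimal sketch of plain speculative decoding, the technique the comment above compares this to. It is not Orthrus's diffusion-based method, and every name in it (`target_next`, `draft_next`, `speculative_decode`) is a hypothetical toy stand-in rather than a real model API. The sketch assumes greedy decoding: a cheap draft proposes a few tokens, the main model checks them, and the result is identical to what the main model would have produced alone.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]


def target_next(ctx: List[Token]) -> Token:
    # Toy "main model": a deterministic function of the context (assumption).
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx: List[Token]) -> Token:
    # Toy draft: agrees with the target except when len(ctx) is a multiple of 4.
    tok = target_next(ctx)
    return tok if len(ctx) % 4 else (tok + 1) % 50


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    out = list(prompt)
    goal = len(prompt) + n_new
    passes = 0  # number of verification rounds by the main model
    while len(out) < goal:
        # 1. The draft proposes k tokens, conditioning on its own guesses.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The main model verifies. A real system scores all k positions
        #    in one forward pass; this toy loops over them instead.
        passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first wrong guess
                break
        out.extend(accepted)
    return out[:goal], passes
```

Because every kept token is one the main model would have picked itself, the speed-up comes only from needing fewer verification rounds than tokens, which is why the quality of the draft matters so much.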

  • LocalLLaMA @sh.itjust.works

    Orthrus-Qwen3: up to 7.8× tokens/forward on Qwen3, identical output distribution

    github.com/chiennv2000/orthrus
  • If you would rather not trust any AI company with your data, consider heading to !localllama@sh.itjust.works where self-hosted LLMs are discussed!

  • I see. Mistral was the favorite in self-hosted LLM circles back in 2023-2024, but the general opinion is that they have since been far surpassed by Chinese and American models, hence my question.

    Good to know they've found a market with their online offering.

  • Is there a use case where Mistral still beats Qwen or Gemma? If you're using Mistral, which model and what do you use it for?

  • Reinforcement learning makes the model better over time, so why should there be fewer and fewer good results?

    If you're talking about the rate of improvement going down, then yes, of course. That's bound to happen (unless you have an actual intelligence explosion, but in that case you won't know what "good results" even mean anyway).

  • Yes :)

  • No one feeds random LLM output straight back though. The whole idea of reinforcement learning is you take some ML model output, check if it is good, and push the model in that direction if it is good.

    As long as you believe that e.g. it's easier to verify a mathematical result than to come up with one, then RL should work.
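
A toy sketch of that sample, check, reinforce loop, assuming a task where checking is much cheaper than finding: guessing a non-trivial divisor of a number. The "policy" here is just a table of sampling weights, and the names `verify`, `train`, and `success_mass` are hypothetical; real RL on language models is far more involved.

```python
import random
from typing import Dict, Iterable


def verify(n: int, d: int) -> bool:
    # Checking a proposed divisor is a single modulo operation.
    return 1 < d < n and n % d == 0


def success_mass(weights: Dict[int, float], n: int) -> float:
    # Probability that sampling from the current weights yields a verified answer.
    total = sum(weights.values())
    good = sum(w for d, w in weights.items() if verify(n, d))
    return good / total


def train(
    n: int, candidates: Iterable[int], steps: int = 200, lr: float = 0.1, seed: int = 0
) -> Dict[int, float]:
    rng = random.Random(seed)
    weights = {d: 1.0 for d in candidates}
    for _ in range(steps):
        # Sample an output from the current policy.
        d = rng.choices(list(weights), weights=list(weights.values()))[0]
        # Only verified outputs push the policy; unchecked output is never fed back.
        if verify(n, d):
            weights[d] *= 1 + lr
    return weights
```

For example, `train(91, range(2, 21))` starts from uniform weights over 2..20 and only ever raises the weights of 7 and 13, so the share of probability on correct answers cannot go down.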

  • You took those quotes wildly out of context. Of course there is a hard limit on how much information can be extracted from data. Clever processing won't break that limit. But only in basic cases have we seen proofs that certain statistical inference methods make optimal use of the data. In complicated systems like neural nets it is basically impossible to prove such optimality. In fact the models are almost definitely not using the data optimally. Processing can help. A lot.

  • My not very confident guess is that it's just to label what kind of galaxies are observed by the instrument at that range. Really not sure about the less dense ring in the outer part though.

  • what. there definitely are differences between the universe today and the universe billions of years ago.

  • the latter. the map looks different further away from the center of the circle because further away = earlier time. if they attempted to compensate for how things far away have changed since the light was emitted, the map would look uniform.

  • 196 @lemmy.blahaj.zone

    2nd rule

  • Putin's Russia is a nationalist and imperialist state. The Russian army has committed countless war crimes in Ukraine.

    The PRC is an authoritarian state with a deeply corrupted party, extreme capitalist culture, and very little regard for individual privacy or rights. It is also engaged in homogenization campaigns suppressing the social and cultural identities of Uyghur people, constituting an ethnocide.

    PieFed is promoted by people claiming they want to take influence away from the tankies who support Russia and the PRC. But from the hardcoded behavior discussed in this very thread we can see that PieFed's devs and promoters intend to suppress criticism of the (countries|institutions|capitalists) they support, even though these entities have committed no less evil than Russia and the PRC.

    Does that satisfy you?

    We'll see if I get banned from .ml now (not being snarky. i genuinely am curious to see if they will ban me).

  • How would you go about changing the seven_things_plus variable via the GUI?

    You can't. You can get around the filtering by other means, but that doesn't make seven_things_plus any less hardcoded.

    Maybe the term hardcoded has some particular negative connotation for you. In that case please explain what that connotation may be.

    By the usual definition, hardcoded just means the data is written into the code rather than loaded at runtime. It is not necessarily a bad thing (things like unit conversion factors are perfectly reasonable to hardcode). In this case, I'll let everyone judge for themselves if hardcoding 'enoughmuskspam' is acceptable.
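
A small illustration of that distinction, as a sketch only. None of it is taken from the PieFed codebase: the function names, the JSON layout, and the `filtered_names` key are all hypothetical.

```python
import json
from typing import List

# Hardcoded: the value lives in the source and changes only with a code edit.
# A unit conversion factor is a reasonable thing to hardcode.
KM_PER_MILE = 1.609344


def miles_to_km(miles: float) -> float:
    return miles * KM_PER_MILE


def load_filtered_names(path: str) -> List[str]:
    # Loaded at runtime: an operator can edit the file without touching code.
    # The "filtered_names" key is an assumed, illustrative layout.
    with open(path, encoding="utf-8") as f:
        return json.load(f)["filtered_names"]


def is_filtered(name: str, filtered: List[str]) -> bool:
    # Case-insensitive membership test against whatever list was supplied.
    return name.lower() in {n.lower() for n in filtered}
```

The filtering logic is the same either way; what differs is who can change the list, and whether doing so takes a configuration edit or a new release.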

  • Hey. I saw your comment but didn't get to reply before you deleted it. I think you should restore it. Ignore the tankies bullying you.

    I fully stand by my wording that it is "hardcoded", but it is good that you show the code, so people can judge for themselves.

  • Memes @lemmy.ml

    "content curation"

  • Science @mander.xyz

    Private donors pledge 860M EUR for CERN's Future Circular Collider

    home.cern/news/press-release/cern/private-donors-pledge-860-million-euros-cerns-future-circular-collider
  • Physics @mander.xyz

    Private donors pledge 860M EUR for CERN's Future Circular Collider

    home.cern/news/press-release/cern/private-donors-pledge-860-million-euros-cerns-future-circular-collider
  • Technology @lemmy.zip

    So You Think You've Awoken ChatGPT...

    www.lesswrong.com/posts/2pkNCvBtK6G6FKoNn/so-you-think-you-ve-awoken-chatgpt
  • Memes @lemmy.ml

    someone teach them LaTeX

  • Science Memes @mander.xyz

    fruit flies are not ergodic

  • Memes @lemmy.ml

    I love open-weight models, especially when they steal from proprietary models (OC)

  • Science Memes @mander.xyz

    vibes-based astrophysics

  • LocalLLaMA @sh.itjust.works

    New open-weight 🐋 DeepSeek V3. 685B MoE. Beats Claude 3.5 Sonnet on Aider coding benchmark

    huggingface.co/deepseek-ai/DeepSeek-V3
  • 196 @lemmy.blahaj.zone

    "blahaj" is pronounced "blo-hai" rule

  • Science Memes @mander.xyz

    ur dada so buff he falls significantly faster than g

  • Science Memes @mander.xyz

    they tricked us

  • Science Memes @mander.xyz

    👁️ 🌹 💨 💨

  • Science Memes @mander.xyz

    the final boss after you clear Donald Knuth

  • Science Memes @mander.xyz

    what if the shop is empty?

  • Science Memes @mander.xyz

    I don’t understand quantum physics

  • Science Memes @mander.xyz

    calculate the transmission coefficient