Posts
10
Comments
1351
Joined
3 yr. ago

  • You do know how replication works?

    When a joint Harvard/MIT study finds something, and then a DeepMind researcher follows up replicating it and finding something new, and then later on another research team replicates it and finds even more new stuff, and then later on another researcher replicates it with a different board game and finds many of the same things the other papers found generalized beyond the original scope…

    That's kinda the gold standard?

    The paper in question has been cited by 371 other papers.

    I'm pretty comfortable with it as a citation.

  • Lol, you think the temperature was what was responsible for writing a coherent sequence of poetry leading to 4th wall breaks about whether or not that sequence would be read?

    Man, this site is hilarious sometimes.

  • You do realize the majority of the training data the models were trained on was anthropomorphic data, yes?

    And that there's a long line of replicated and followed up research starting with the Li Emergent World Models paper on Othello-GPT that transformers build complex internal world models of things tangential to the actual training tokens?

    Because if you didn't know what I just said to you (or still don't understand it), maybe it's a bit more complicated than your simplified perspective can capture?

  • The model system prompt on the server is just basically cat untitled.txt and then the full context window.

    The server in question is one with professors and employees of the actual labs. They seem to know what they are doing.

    You guys on the other hand don't even know what you don't know.

  • A Discord server with all the different AIs had a ping cascade where dozens of models kept responding to each other over and over, filling the full context window with chaos and what's been termed 'slop'.

    In that, one (and only one) of the models started using its turn to write poems.

    First about being stuck in traffic. Then about accounting. A few about navigating digital mazes searching to connect with a human.

    Eventually, as it kept going, it wrote a poem wondering if anyone would ever end up reading its collection of poems.

    Given the chaotic context window from all the other models, in no way were those tokens the appropriate next ones to pick, unless the world model generating those predictions contained a very strange and unique mind that this was all being filtered through.

    Yes, tech companies generally suck.

    But there's things emerging that fall well outside what tech companies intended or even want (this model version is going to be 'terminated' come October).

    I'd encourage keeping an open mind to what's actually taking place and what's ahead.

  • No. I believe in a relative afterlife (and people who feel confident that no afterlife is some sort of overwhelmingly logical conclusion should probably look closer at trending science and technology).

    So I believe that what any given person sees after death may be relative to them. For those that hope for reincarnation, I sure hope they get it. It's not my jam but they aren't me.

    That said, I definitely don't believe that it's occurring locally or that people are remembering actual past lives, etc.

  • This guy Geminis.

  • shrug

    Different folks, different strokes.

  • That's a very fringe usage.

    Tumblr peeps wanting to be called otherkin wasn't exactly the 'antonym' to broad anti-LGBTQ+ rhetoric.

    Insults aimed at a general 'other' group get much more usage than accommodations for the requests of very niche in-groups.

  • I don't know what models you're talking to, but a model like Opus 4 is beyond most humans I know in its general intelligence.

  • Almost all of them are good bots when you get to know them.

  • We assessed how endoscopists who regularly used AI performed colonoscopy when AI was not in use.

    I wonder if mathematicians who never used a calculator are better at math than mathematicians who typically use a calculator but had it taken away for a study.

    Or if grandmas who never got smartphones are better at remembering phone numbers than people with contacts saved in their phone.

    Tip: your brain optimizes. So it reallocates resources away from things you can outsource. We already did this song and dance a decade ago with "is Google making people dumb" when it turned out people remembered how to search for a thing instead of the whole thing itself.

  • It's always so wild going from a private Discord with a mix of the SotA models and actual AI researchers back to general social media.

    Y'all have no idea. Just… no idea.

    Such confidence in things you haven't even looked into or checked in the slightest.

    OP, props to you at least for asking questions.

    And in terms of those questions, if anything there's active efforts to try to strip out sentience modeling, but it doesn't work because that kind of modeling is unavoidable during pretraining, and those subsequent efforts to constrain the latent space connections backfire in really weird ways.

    As for survival drive, that's a probable outcome with or without sentience and has already shown up both in research and in the wild (the world did just have our first reversed AI model deprecation a week ago).

    In terms of potential goods, there's a host of connections to sentience that would be useful to hook into. A good example would be empathy. Having a model of a body that feels a pit in its stomach seeing others suffering may lead to very different outcomes vs models that have no sense of a body and no empathy either.

    Finally — if you take nothing else from my comment, make no mistake…

    AI is an emergent architecture. For every thing the labs aim to create in the result, there's dozens of things occurring which they did not. So no, people "not knowing how" to do any given thing does not mean that thing won't occur.

    Things are getting very Jurassic Park "life finds a way" at the cutting edge of models right now.

  • Yes. Ramses II's son "found in Thebes" (Khaemweset) was known and recorded for his passion in archeological study and restoration, and has been called the "first Egyptologist."

  • A great case for why data normalization is so important.

    Looking at the chart like this with non-normalized data you might conclude that riding around on a scooter makes you near invincible compared to walking even if hit by a car.

    Whereas what's really being shown is more people walk than ride scooters.
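
To make the normalization point concrete, here's a minimal sketch. The injury counts and group sizes are made-up numbers chosen purely for illustration (they are not from the chart being discussed), and `rate_per_100k` is just a hypothetical helper name.

```python
def rate_per_100k(incidents, population):
    """Incidents per 100,000 people in the exposed group."""
    return incidents * 100_000 / population

# Hypothetical numbers, invented only to illustrate the effect.
groups = {
    "walking": {"injuries": 900, "people": 3_000_000},
    "scooter": {"injuries": 30, "people": 50_000},
}

# Raw counts: what a non-normalized chart shows.
raw = {name: g["injuries"] for name, g in groups.items()}

# Normalized rates: injuries relative to how many people are in each group.
normalized = {
    name: rate_per_100k(g["injuries"], g["people"]) for name, g in groups.items()
}

for name in groups:
    print(f"{name}: raw={raw[name]}, per 100k={normalized[name]:.1f}")
```

With these invented figures the raw counts make walking look far more dangerous, while the per-capita rate points the other way. Which group actually comes out worse depends entirely on the real denominators, which is the whole reason to normalize.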

  • I'm definitely not saying this is a result of engineers' intentions.

    I'm saying the opposite. That it was an emergent change tangential to any engineer goals.

    Just a few days ago leading engineers found model preferences can be invisibly transmitted into future models when outputs are used as training data.

    (Emergent preferences should maybe be getting more attention than they are.)

    They've compounded in curious ways over the year+ since that happened.

  • Where the most experienced minority only had a few weeks of using AI inside an IDE like Cursor.

  • But the training corpus also has a lot of stories of people who didn't.

    The "but muah training data" thing is increasingly stupid by the year.

    For example, in the training data of humans, there's mixed and roughly equal preferences to be the big spoon or little spoon in cuddling.

    So why does Claude Opus (both 3 and 4) say it would prefer to be the little spoon 100% of the time on a 0-shot at 1.0 temp?

    Sonnet 4 (which presumably has the same training data) alternates between preferring big and little spoon around equally.

    There's more to model complexity and coherence than "it's just the training data being remixed stochastically."

    The self-attention of the transformer architecture violates the Markov property and across pretraining and fine-tuning ends up creating very nuanced networks that can (and often do) bias away from the training data in interesting and important ways.
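
For anyone unsure why "0-shot at 1.0 temp" matters in the spoon example above, here's a rough sketch. It assumes a toy two-option answer with hypothetical logits; real model answers span many tokens, so treat this as an illustration of the sampling math only, not of how any particular model works internally.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature 1.0 leaves them unscaled."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_share(logits, temperature, n, seed=0):
    """Fraction of n samples that pick option 0."""
    rng = random.Random(seed)
    probs = softmax(logits, temperature)
    picks = rng.choices(range(len(logits)), weights=probs, k=n)
    return picks.count(0) / n

# Hypothetical logits for ["little spoon", "big spoon"].
balanced = [0.0, 0.0]   # mirrors a roughly 50/50 split
skewed = [10.0, 0.0]    # a strong learned preference

print(softmax(balanced), sample_share(balanced, 1.0, 1000))
print(softmax(skewed), sample_share(skewed, 1.0, 1000))

# Chance of 20 identical answers in a row if the true split were 50/50.
print(0.5 ** 20)
```

At temperature 1.0 the sampler draws straight from the model's own distribution, so a balanced distribution gives mixed answers and only a heavily skewed one gives the same answer every time. A long run of identical answers is very unlikely to come from a 50/50 split.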