Posts
13
Comments
8
Joined
3 yr. ago

  • Please don't post links to reddit.

  • It seems that for creative text generation tasks, automatic metrics have been shown to be deficient; this holds even for the new model-based metrics. That leaves human evaluation (both intrinsic and extrinsic) as the gold standard for those types of tasks. I wonder if the results from this paper (and other future papers that look at automatic CV metrics) will lead reviewers to demand more human evaluation in CV tasks, as they already do for certain NLP tasks.

  • Machine Learning @kbin.social

    Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

    arxiv.org/abs/2306.04675
  • hmmm... not sure which model you're referring to. do you have a paper link?

  • do you have a link?

  • @machinelearning am I in the right place? Lol

  • Machine Learning @kbin.social

    Extending Context Window of Large Language Models via Positional Interpolation

    arxiv.org/abs/2306.15595
  • Machine Learning @kbin.social

    Inverse Scaling: When Bigger Isn't Better

    arxiv.org/abs/2306.09479
  • Machine Learning @kbin.social

    Craft an Iron Sword: Dynamically Generating Interactive Game Characters by Prompting Large Language Models Tuned on Code

    www.microsoft.com/en-us/research/project/grounded-conversational-characters/in-depth/
  • Machine Learning @kbin.social

    r/MachineLearning finally received a warning from u/ModCodeOfConduct

  • If the effect is strong enough, it could seriously harm LLM training in the near future, since more and more of the internet contains ChatGPT and GPT-4 output and automatic detectors are currently quite poor.