Monday, October 13, 2025

Chrysanthemums Bloom

2 comments:

  1. In the spirit of letting a thousand flowers bloom... (but not the "Curse of the Golden Flower")

    Hi Bill,
    it may be of value to you, or vice versa, to contact the ACX grantees re:
    - assisting / insight... "to train LLMs to honestly report their internal decision processes via introspection.", and
    - "approximately five thousand novels about AI going well."

    Either way Bill, I'd say you may get value and data from them re your work.

    "ACX Grants Results 2025
    Oct 13, 2025
    ...
    "Adam Morris, $15K, to train LLMs to honestly report their internal decision processes via introspection. ... But Adam and his collaborators have found some glimmers of surprisingly good introspective ability into decision-making processes - for example, ability to explain how past fine-tuning affects the relative values of different goods - and has some evidence that this can improve with training. He wants to create an introspection benchmark, and to see what happens when you train AIs to succeed on that benchmark. This could supplement other forms of interpretability, improve chain of thought faithfulness, and help us answer questions about AI consciousness. Adam is excited to chat with potential collaborators who have experience in technical AI safety work (especially in interpretability, CoT faithfulness, and fine-tuning frontier open models); reach out to him at thatadammorris@gmail.com."
    https://www.astralcodexten.com/p/acx-grants-results-2025

    @counterblunder
    Computational cognitive scientist, studying introspection in humans and LLMs.
    https://substack.com/@counterblunder

    And as serendipity would have it, this morning I saw:
    "AI Models Need to be Disinfected — Or George Orwell’s “1984” Will Come True
    Commentary by NewsGuard Co-CEO Gordon Crovitz
    Oct 09, 2025
    ...
    "Huw Dylan and Elena Grossfeld of the Department of War Studies at King’s College London entitled their paper, “Revisionist future: Russia’s assault on large language models, the distortion of collective memory, and the politics of eternity,” published by the journal Dialogues on Digital Society.
    The authors peg their concerns to the article “The AI Disinformation Battlefield” in the technology journal Enterprise Security Tech that details the results of a NewsGuard assessment. ... at breathtaking scale, this network spread the Kremlin’s 207 favorite false claims by concocting and spreading 3.6 million articles in 2024 alone, using 150 fake news websites targeting 49 countries in dozens of languages, NewsGuard found. In other words, the Russians used AI to infect AI globally.
    NewsGuard analysts tested the 10 largest AI models and found they routinely spread these Pravda Network false claims. 
    ...
    https://www.newsguardrealitycheck.com/p/ai-models-need-to-be-disinfected

    "Aaron Silverbook, $5K, for approximately five thousand novels about AI going well. This one requires some background: critics claim that since AI absorbs text as training data and then predicts its completion, talking about dangerous AI too much might “hyperstition” it into existence. Along with the rest of the AI Futures Project, I wrote a skeptical blog post, which ended by asking - if this were true, it would be great, right? You could just write a few thousand books about AI behaving well, and alignment would be solved! At the time, I thought I was joking. Enter Aaron, ... He and a cofounder have been working on an “AI fiction publishing house” that considers itself state-of-the-art in producing slightly-less-sloplike AI slop than usual. They offered to literally produce several thousand book-length stories about AI behaving well and ushering in utopia, on the off chance that this helps. Our grant will pay for compute. We’re still working on how to get this included in training corpuses. He would appreciate any plot ideas you could give him to use as prompts."
    https://www.astralcodexten.com/p/acx-grants-results-2025

    Seren Dipity

    1. As serendipity would have it...
      "A small number of samples can poison LLMs of any size"
      Oct 9, 2025
      https://www.anthropic.com/research/small-samples-poison
      SD
