
OpenAI and Anthropic: Hopes for AI alignment and safety should not be centralized

  • David Stephen 

For all the harms and misuses that kept AI in the news over the last year, what broad solutions did Anthropic or OpenAI offer?

One may contend that the models of OpenAI and Anthropic are not the sources or the causes, and that their models are fairly safe. But if harm remains possible with the same technology at which they are the best, then expectations that they have answers to general problems in AI alignment and safety should perhaps be tempered.

The risks and threats of AI demand general AI safety and alignment, not safety and alignment for a few models alone. The current possibilities for AI misuse indicate that directing safety, testing, evaluation, regulation, governance and alignment at major models alone is inadequate. An intensity aimed at misuses from anywhere would be more decisive against risks than two modest apples in a big bad gang.

All the efficient paths to making AI less misused and less harmful in general would, in the end, be technical, but arriving at those technical answers does not have to start technically.

In seeking progress in AI safety and alignment, what prime assumptions are being made against the hurdles ahead? The history of technology has shown that advancement is often easier than ethics or safety. So if safety for AI, a technology that can be misused at scale, needs to match the pace of progress, what far-reaching assumptions are the frontier firms making?

There would have to be at least 18 other companies like OpenAI and Anthropic, to make 20, to come close to anything like a serious effort at general AI safety, against harms and misuses from any source or era. Though OpenAI and Anthropic have teams from various fields, their work is led by engineering, which guarantees steep AI advances and then some safety supplements. Also, commercial and competitive pressures require that they focus on advances. Still, safety, no matter how they try, may remain confined to their own territory, not encompassing.

The closest field for generating assumptions for AI is neuroscience. But what would be useful for AI safety would be new assumptions in neuroscience, not existing ones, since current neuroscience is still framed around neurons, while the brain, conceptually, does not function or organize information by neurons. Artificial neural imitations have limits for safety without novel assumptions, especially about how the human mind enables consequence-induced caution in society.

What new assumptions in theoretical neuroscience can Anthropic and OpenAI currently generate with enough promise to inspire unparalleled engineering approaches to alignment?

If their assumptions cannot break from those in neuroscience that say the brain makes predictions without saying what exactly predicts or how; or that there is long-term memory and short-term memory without naming the relays of memory or the phases in which memory exists; or that neurons are excited, inhibited or fire, without saying what is responsible and whether that, not neurons, should be the focus, then their trajectory in stellar assumption generation has not begun.

This is just neuroscience; there are also physics, economics, and so on. Problems became obvious with e-commerce, social media, email, instant messaging, text messages, and so forth, where assumptions from different fields could have produced leads for technical safety that made much difference, but many of the safety efforts started and ended technically, limiting the reach of solutions for safety and ethics in those technologies.

In fairness to Anthropic and OpenAI, neither Google, Meta, Microsoft, nor others hold much promise for general AI safety and alignment in ways that would be decisive for society. Their models would be safe, to an extent, but AI harms may sprawl.

There is a new press release, U.S. AI Safety Institute Signs Agreements Regarding AI Safety Research, Testing and Evaluation With Anthropic and OpenAI, stating that, “Today, the U.S. Artificial Intelligence Safety Institute at the U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) announced agreements that enable formal collaboration on AI safety research, testing and evaluation with both Anthropic and OpenAI. Each company’s Memorandum of Understanding establishes the framework for the U.S. AI Safety Institute to receive access to major new models from each company prior to and following their public release. The agreements will enable collaborative research on how to evaluate capabilities and safety risks, as well as methods to mitigate those risks. Additionally, the U.S. AI Safety Institute plans to provide feedback to Anthropic and OpenAI on potential safety improvements to their models, in close collaboration with its partners at the U.K. AI Safety Institute.”

There is a recent story on WSJ, Apple, Nvidia Are in Talks to Invest in OpenAI, stating that, “Apple and Nvidia are in talks to invest in OpenAI, a move that would strengthen their ties to a partner integral to their efforts in the artificial-intelligence race. The investment would be part of a new OpenAI fundraising round that would value the ChatGPT maker above $100 billion, people familiar with the situation said.”

There is a recent report on Reuters, Ask Claude: Amazon turns to Anthropic’s AI for Alexa revamp, stating that, “Amazon’s revamped Alexa due for release in October ahead of the U.S. holiday season will be powered primarily by Anthropic’s Claude artificial intelligence models, rather than its own AI, five people familiar with the matter told Reuters. Amazon plans to charge $5 to $10 a month for its new “Remarkable” version of Alexa as it will use powerful generative AI to answer complex queries, while still offering the “Classic” voice assistant for free, Reuters reported in June. Announcing a deal to invest $4 billion in Anthropic in September last year, Amazon said its customers would gain early access to its technology. Reuters could not determine if Amazon would have to pay Anthropic additionally for the use of Claude in Alexa. Amazon declined to discuss the details of its agreements with the startup. Alphabet’s Google has also invested at least $2 billion in Anthropic.”
