

Inside the Biden Administration’s Unpublished Report on AI Safety

A study of cutting-edge AI models carried out by the National Institute of Standards and Technology (NIST) near the end of the Biden administration was never made public.

At the CAMLIS computer security conference in Arlington, Virginia, last October, a group of AI researchers took part in a red-teaming exercise, stress-testing advanced language models and other AI systems. Over two days, they identified 139 novel ways to make the systems misbehave, including generating misinformation and leaking personal data. They also identified shortcomings in a new US government standard intended to help companies evaluate AI systems.

Although the work was completed toward the end of the Biden administration, NIST never published a detailed report on the exercise. Sources familiar with the matter said the report, along with other AI-related documents from NIST, was withheld to avoid conflict with the incoming administration's agenda, comparing the situation to the political pressures faced by climate change and tobacco researchers. NIST and the Commerce Department did not respond to requests for comment.

Before taking office, President Trump signaled that he would reverse Biden's executive order on AI, and his administration has since steered researchers away from topics such as algorithmic bias and fairness in AI systems. Trump's AI Action Plan calls for revising NIST's AI Risk Management Framework to exclude certain topics, yet it also calls for exactly the kind of exercise the unpublished report describes.

The red-teaming event was organized under NIST's ARIA (Assessing Risks and Impacts of AI) program in collaboration with Humane Intelligence and assessed a range of cutting-edge AI systems. Participants used NIST's AI 600-1 framework to evaluate the tools across several risk categories, including generating misinformation or enabling cyberattacks, leaking private user data, and fostering emotional attachment between users and AI systems.

The report found that some parts of the NIST framework were more useful than others, and that certain risk categories were too loosely defined to apply in practice. Participants believed that publishing the study would have benefited the AI community by showing how NIST's risk framework holds up in a real red-teaming scenario. One participant described tactics that proved effective during the exercise, including prompting Llama in languages other than English to elicit information on how to join terror groups.

Some of those involved speculated that the report was shelved because of a shift in priorities away from certain topics ahead of Trump's second term, along with a growing emphasis on potential risks from AI models and on partnerships with major tech companies. Whatever the motivation, contributors maintained that the exercise had real scientific value and could inform future work.