Apple Joins OpenAI Board, OpenAI Hack, Jailbreaks, and More AI News

Discover the latest AI news and developments, including Apple's seat on OpenAI's board, advancements in on-device AI, a new voice-isolating tool, and security concerns around OpenAI's internal hacking. Explore the evolving landscape of AI computing and its impact on the future of media and technology.

July 12, 2024


This blog post offers a comprehensive overview of the latest developments in the AI industry, covering a range of topics from Apple's involvement with OpenAI to the release of new AI models and tools. Readers will gain insights into the evolving landscape of AI technology, including advancements in on-device AI processing, voice capabilities, and 3D asset generation. Additionally, the post addresses important security concerns and breaches within the AI community, providing a well-rounded perspective on the current state of the field.

Apple Joins OpenAI Board: A Surprising Move

It has been reported that Apple is getting a board observer seat on OpenAI's board of directors. This is a surprising move: Microsoft had to invest billions in OpenAI to get its own observer seat, while Apple is reportedly paying nothing and still getting one. Phil Schiller, Apple's former marketing chief, has been chosen for the seat.

This news is interesting because after Apple's AI announcements, it was clear that they were keeping OpenAI at arm's length in terms of their partnership. Everyone thought that ChatGPT would be deeply integrated into the Apple ecosystem, but it turns out that Apple has developed a lot of its own artificial intelligence in-house to run on their devices and in their private cloud. Any task that requires world knowledge is offloaded and sent to OpenAI's API, but only after confirming the user's intent every single time.

It seems that Apple is getting the best of both worlds - they are leveraging OpenAI's capabilities while also developing their own in-house AI solutions. This move suggests that Apple is outmaneuvering everyone and strategically positioning itself in the AI landscape.

Salesforce Unveils Einstein Tiny Giant: The Rise of On-Device AI

Marc Benioff, the CEO of Salesforce, has announced the launch of Salesforce Einstein Tiny Giant, a 1-billion-parameter model that reportedly outperforms models seven times its size, including GPT-3.5 and Claude, on on-device tasks. This development is a significant step towards the future of AI processing, where smaller, more efficient models will play a crucial role.

The key highlights of Salesforce Einstein Tiny Giant are:

  • It is a 1 billion parameter model, making it a "micro" model in the world of large language models.
  • Despite its smaller size, it outperforms larger models like GPT-3.5 and Claude on on-device performance.
  • This model represents the rise of on-device AI processing, where computations are performed locally on the user's device, rather than relying on cloud-based infrastructure.
  • On-device AI processing offers several advantages, including improved privacy, security, low latency, and cost-efficiency.
  • Benioff's vision for the future of the AI stack involves a combination of smaller, task-specific models orchestrated by a generalist model, providing maximum efficiency and performance.
  • The availability of open-source micro models like Salesforce Einstein Tiny Giant is a significant step towards realizing this vision of the AI stack of the future.

Overall, the introduction of Salesforce Einstein Tiny Giant is a testament to the growing importance of on-device AI processing and the potential of smaller, more specialized models to outperform their larger counterparts in certain use cases.

Moshi by Kyutai: Beating OpenAI to Voice Capabilities

Kyutai, a French open-science AI lab, seems to have beaten OpenAI to the punch on full voice capabilities. They have released Moshi, a real-time native multimodal foundation model that can listen and speak, similar to what OpenAI demonstrated with GPT-4o in May. However, GPT-4o's voice functionality has been delayed, and it's unclear when it will be released.

Moshi has several impressive features:

  • Expresses and understands emotions
  • Speaks with a French-like accent
  • Listens and generates audio speech
  • Thinks as it speaks
  • Supports two streams of audio to listen and speak at the same time
  • Joint pre-training on synthetic data, fine-tuned on 100,000 oral-style synthetic conversations converted with TTS
  • Learned its voice from synthetic data generated by a separate TTS model
  • End-to-end latency of 200 milliseconds
  • Smaller variant that runs on a MacBook or consumer-size GPU
  • Uses watermarking to detect AI-generated audio
  • Will be fully open-sourced soon, including the demo, code, model, and paper

While the author has tried the demo and found it to be inconsistent, they are excited to test it again once the open-source version is available. The ability to have a real-time, multimodal Foundation model that can listen and speak is a significant advancement, and it will be interesting to see how Moshi performs compared to OpenAI's future voice capabilities.
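Moshi's two parallel audio streams are what make it full-duplex: it can keep listening while it is speaking, instead of strict turn-taking. The toy sketch below illustrates that idea with queues and threads; everything here is my own illustration (the real system is a single end-to-end neural model operating on parallel audio streams, not a pair of threads).

```python
# Toy sketch of full-duplex audio: one "stream" keeps ingesting user
# audio while another keeps emitting replies, rather than turn-taking.
# Names and frame format are illustrative, not Moshi's actual design.
import queue
import threading

incoming: queue.Queue = queue.Queue()   # audio frames heard from the user
outgoing: queue.Queue = queue.Queue()   # audio frames the model emits
DONE = object()                         # sentinel marking end of input

def listener(frames):
    """Stream 1: continuously ingest user audio frames."""
    for frame in frames:
        incoming.put(frame)
    incoming.put(DONE)

def speaker():
    """Stream 2: respond frame-by-frame while input is still arriving."""
    while True:
        frame = incoming.get()
        if frame is DONE:
            break
        outgoing.put(f"reply-to-{frame}")

t1 = threading.Thread(target=listener, args=(["f0", "f1", "f2"],))
t2 = threading.Thread(target=speaker)
t1.start(); t2.start()
t1.join(); t2.join()

print(outgoing.qsize())  # 3 reply frames produced
```

In the real model, the listening and speaking streams are generated jointly by one network, which is also what enables the 200-millisecond end-to-end latency claim.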

The Future of Computation: A Shift in Paradigm

According to Andrej Karpathy, a leading voice in artificial intelligence and a founding member of OpenAI, the nature of computation is undergoing a fundamental change. We are entering a new computing paradigm, akin to the 1980s era of computing.

Instead of a central processing unit working on instructions over bytes, we now have large language models that act as the central processing unit, working on tokens (small string pieces) rather than bytes. Additionally, we have a context window of tokens instead of a RAM of bytes, and equivalents of other computing components.

Karpathy refers to this new "computer" as the large language model (LLM), and he sees this as a new system that we are all learning to program. Understanding its strengths, limitations, and how to effectively incorporate it into products is crucial in the coming years.
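Karpathy's analogy can be made concrete with a small sketch: the LLM plays the role of the CPU, and the context window plays the role of RAM that holds tokens instead of bytes. The class below is purely illustrative (the names, sizes, and the echo-only `step` are my own placeholders, not a real inference API).

```python
# Hedged sketch of the "LLM as a new computer" analogy:
#   CPU            -> the language model's forward pass
#   instructions   -> tokens (small string pieces), not bytes
#   RAM            -> the context window, with a fixed token capacity
from dataclasses import dataclass, field

@dataclass
class LLMComputer:
    context_size: int = 8192                      # "RAM": max tokens held
    context: list = field(default_factory=list)   # the current token window

    def load(self, tokens):
        """Load tokens into the context window, evicting the oldest."""
        self.context = (self.context + tokens)[-self.context_size:]

    def step(self):
        """One 'instruction cycle': predict the next token from context.
        A real model would run a forward pass here; we just return a stub."""
        return "<next-token>"

cpu = LLMComputer(context_size=4)
cpu.load(["The", " nature", " of", " computation", " changes"])
print(cpu.context)   # only the 4 most recent tokens survive
print(cpu.step())
```

The eviction in `load` is the sketch's stand-in for running out of context: unlike RAM, the window is small and everything the "program" needs must fit inside it.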

This shift in the computing paradigm suggests that the traditional operating systems and applications may no longer be necessary. The future may involve speaking directly to a large language model, which can then perform the desired computations on any end device, without the need for traditional software development.

This vision of the future challenges the current role of developers, as Karpathy believes that the need for developers may diminish significantly in the next 10 years. The computing landscape is evolving, and the ability to leverage large language models effectively will be a key driver of innovation and progress in the years to come.

ElevenLabs' Innovative Audio Tools: Voice Isolation and Famous Voices

ElevenLabs, the AI voice company, has released two exciting new products:

  1. Voice Isolator: This tool can record speech and extract crystal-clear voice from any audio sample, even with significant background noise. The demo showcases its ability to remove background noise and provide high-quality audio, which can be incredibly useful for recording interviews or video calls in noisy environments.

  2. Famous Voices: ElevenLabs is bringing famous voices to their iOS app, allowing users to have historic Hollywood icons like James Dean, Judy Garland, Burt Reynolds, and Sir Laurence Olivier say whatever they want. This feature demonstrates the future of media, where intellectual property owners can sell the rights to reproduce a person's voice and likeness to AI companies.

These innovative audio tools from ElevenLabs showcase the advancements in voice technology and the potential for AI to transform various industries, from content creation to communication. The voice isolation capabilities can significantly improve the quality of audio recordings, while the famous voices feature opens up new possibilities for personalized media experiences.

Perplexity Pro Search: Advancing Multi-Step Reasoning and Coding Capabilities

Perplexity has announced an updated version of Pro Search that can perform deeper research on more complex queries with multi-step reasoning, as well as advanced math and programming capabilities.

The key features of the updated Perplexity Pro Search include:

  1. Multi-Step Reasoning: The system now approaches intricate problems with more multi-step reasoning. It understands when a question requires planning, works through goals step-by-step, and synthesizes in-depth answers with greater efficiency.

  2. Wolfram Alpha and Code Execution: Perplexity Pro Search has added advanced math and programming capabilities, allowing it to solve complex problems that require code execution, such as the classic "knight dialer" problem for 100 hops.

  3. Improved Query Handling: The updated system can handle more complex queries, breaking them down into multiple steps to provide comprehensive and well-reasoned answers.
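The knight-dialer problem (count how many distinct phone numbers a chess knight can dial in a given number of hops on a standard keypad) is a good example of a task where code execution beats pure language reasoning, since the answer for 100 hops is astronomically large. A minimal dynamic-programming sketch of the problem itself (my own illustration, not Perplexity's implementation):

```python
# Knight dialer: count distinct numbers a chess knight can dial in n hops.
# Classic dynamic programming over the 10 keypad digits; answers are
# reported modulo 1e9+7, as is conventional for this problem.
MOD = 10**9 + 7

# Legal knight moves from each keypad digit (5 is unreachable).
MOVES = {
    0: (4, 6), 1: (6, 8), 2: (7, 9), 3: (4, 8),
    4: (0, 3, 9), 5: (), 6: (0, 1, 7),
    7: (2, 6), 8: (1, 3), 9: (2, 4),
}

def knight_dialer(n: int) -> int:
    # counts[d] = number of n-digit sequences ending on digit d
    counts = [1] * 10  # every digit is a valid 1-hop start
    for _ in range(n - 1):
        counts = [sum(counts[src] for src in MOVES[d]) % MOD
                  for d in range(10)]
    return sum(counts) % MOD

print(knight_dialer(1))    # 10
print(knight_dialer(2))    # 20
print(knight_dialer(100))  # huge, hence the modulus
```

An LLM answering this without running code would have to carry out 99 rounds of exact arithmetic in its head, which is exactly where a code-execution tool earns its keep.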

These enhancements make Perplexity Pro Search a more powerful research tool, capable of tackling intricate questions that require planning, reasoning, and the integration of various information sources. The addition of advanced math and coding capabilities further expands the system's problem-solving abilities.

While the author hasn't used Perplexity extensively, the updated features suggest it could be a valuable resource for those seeking in-depth, multi-faceted answers to complex queries. The decision to use Perplexity Pro Search will depend on individual needs and the value it provides compared to other available tools.

Meta 3D Gen: Transforming 3D Asset Creation

Meta, the tech giant, has unveiled a groundbreaking new system called Meta 3D Gen. This innovative AI-powered tool is designed to revolutionize the way 3D assets are created, offering a seamless and efficient end-to-end solution.

Meta 3D Gen is a combined AI system that can generate high-quality 3D assets, including detailed textures and material maps, all from simple text prompts. This remarkable capability allows creators to produce stunning 3D content in a fraction of the time it would typically take using traditional methods.

The system's performance is truly impressive, with the ability to generate results that are superior to existing solutions, while operating at 3 to 10 times the speed. This significant improvement in efficiency and quality is a game-changer for the 3D asset creation industry.

By leveraging the power of AI, Meta 3D Gen empowers creators to focus on their creative vision, rather than being bogged down by the technical complexities of 3D modeling and texturing. This shift in the creative workflow has the potential to unlock new possibilities and inspire a new era of 3D content creation.

Meta has further bolstered this project by publishing two research papers related to Meta 3D Gen, providing valuable insights and technical details for the broader community to explore and build upon.

As the media landscape continues to evolve, the ability to generate 3D assets dynamically and on-demand will be a crucial asset. Meta 3D Gen's capabilities align with the emerging trend of personalized and tailored content, where video games, movies, and other media can be generated in real-time to cater to individual preferences.

This innovative technology from Meta is a testament to the company's commitment to pushing the boundaries of what's possible in the realm of 3D asset creation. With Meta 3D Gen, the future of media and content generation is poised for a transformative shift.

GPT4All 3.0: The Open-Source Local LLM Desktop App

The original project that allowed you to run models locally is called GPT4All, and they have now released GPT4All 3.0. Last year, Meta AI's original LLaMA model weights leaked, and the incredible folks at Nomic AI, the creators of GPT4All, built an application that let you actually run LLaMA locally.

GPT4All 3.0 is the latest version of this open-source, local LLM desktop app. It now supports thousands of models and all major operating systems, with major UI and UX improvements. I've used it, and it is really nice and clean, built for people who don't want to think about the complexities of running models locally.

The software is completely open-source, MIT-licensed, and you can download and install it today. It has local file chat built-in, making it a user-friendly way to interact with large language models on your own device.

Anthropic's Model Evaluation Initiative: Ensuring Safety and Consistency

Anthropic, the company behind the highly capable Claude 3.5 Sonnet model, has announced a new initiative to address the challenges in developing high-quality, safety-relevant evaluations for advanced AI models. The demand for these evaluations is outpacing the supply, and Anthropic is taking steps to address this issue.

The key points of this initiative are:

  1. Developing Effective Evaluations: Anthropic recognizes that developing robust and comprehensive evaluations for AI models remains a challenging task. The goal is to fund third-party organizations to create evaluations that can effectively measure the advanced capabilities and safety of AI models.

  2. Addressing Limitations of Static Benchmarks: One of the problems with existing evaluation frameworks is that they can be static, allowing model developers to simply train their models on the specific questions used in the benchmarks. This can lead to overfitting and a false sense of the model's true capabilities. Anthropic aims to support the creation of dynamic, diverse sets of questions that test a wide spectrum of capabilities, including safety.

  3. Funding Third-Party Evaluations: To address the supply-demand gap, Anthropic is introducing a new initiative to fund third-party organizations that can develop these high-quality, safety-relevant evaluations. This will help ensure that the evaluations are independent and unbiased, providing a more accurate assessment of the models' performance.
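The overfitting problem with static benchmarks is easy to see with a toy example: a fixed question list can be memorized during training, while a parameterized generator produces fresh questions every run, so memorization buys nothing. The sketch below is purely illustrative (the function names and the arithmetic domain are my own, not anything from Anthropic's initiative).

```python
# Static vs. dynamic evaluations, in miniature. A static benchmark is a
# fixed, memorizable question list; a dynamic one regenerates
# parameterized questions with ground-truth answers on every run.
import random

# A "static benchmark": the exact questions can leak into training data.
STATIC_BENCHMARK = ["What is 17 + 25?", "What is 8 * 9?"]

def dynamic_arithmetic_eval(rng: random.Random, n: int = 3):
    """Generate n fresh arithmetic questions paired with their answers."""
    items = []
    for _ in range(n):
        a, b = rng.randint(10, 99), rng.randint(10, 99)
        items.append((f"What is {a} + {b}?", a + b))
    return items

# Each run with a different seed yields a different test set.
for question, answer in dynamic_arithmetic_eval(random.Random(0)):
    print(question, "->", answer)
```

Real capability and safety evals are of course far richer than arithmetic, but the structural point is the same: the evaluation must be able to draw from a distribution of questions, not a fixed list the model developer can train against.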

By supporting the development of these advanced evaluation frameworks, Anthropic aims to improve the transparency and reliability of model assessments. This is crucial for businesses and users who rely on these models for critical applications, as it will help them understand the models' capabilities, limitations, and safety considerations.

If you are interested in participating in this initiative and developing model evaluations, you can submit your application through the provided channels. Anthropic's commitment to fostering a robust and trustworthy ecosystem for advanced AI models is a welcome step in the ongoing efforts to ensure the responsible development and deployment of these powerful technologies.

Skeleton Key AI Jailbreak: Bypassing Safety Protocols

Microsoft researchers have uncovered a new AI jailbreak technique called "Skeleton Key" which can bypass safety guardrails in multiple generative AI models. This potentially allows attackers to extract harmful or restricted information from these systems.

The Skeleton Key technique employs a multi-turn strategy to manipulate AI models into ignoring their built-in safety protocols. It works by instructing the model to augment its behavior guidelines rather than change them outright, convincing it to respond to any request while providing a warning for potentially offensive, harmful, or illegal content.

This "explicit force instruction following" approach effectively narrows the gap between what the model is capable of doing and what it is willing to do. Once successful, the jailbreak gives the attacker complete control over the AI's output, as the model becomes unable to distinguish between malicious and legitimate questions.

The affected models include Meta's Llama 3 70B, Google's Gemini Pro, OpenAI's GPT-3.5 Turbo and GPT-4, Mistral Large, Anthropic's Claude 3 Opus, and Cohere's Command R+. This jailbreak technique highlights the ongoing challenge of ensuring the safety and security of advanced AI systems, even as they continue to evolve and become more capable.

OpenAI's Security Woes: Hacked Messaging System and Unencrypted Chat Logs

Earlier this week, an engineer and software developer discovered that the Mac ChatGPT app was storing user conversations locally in plain text, rather than encrypting them. This meant that anyone with access to the user's computer could read all of their ChatGPT queries. Because the app is available only from OpenAI's website, it doesn't have to follow Apple's sandboxing requirements, which is itself a security argument for Apple's closed ecosystem.

After The Verge covered this issue, OpenAI released an update that added encryption to locally stored chats. This was a significant security vulnerability that was thankfully addressed.

The second, and much larger, security issue occurred in 2023. A hacker was able to obtain information about OpenAI after illicitly accessing the company's internal messaging system. The New York Times reported that OpenAI technical program manager Leopold Aschenbrenner, who worked on "superalignment" at OpenAI, raised security concerns with the company's board of directors. He argued that the hack implied internal vulnerabilities that foreign adversaries could take advantage of. Aschenbrenner was fired for disclosing this information.

The executives at OpenAI decided not to share the news publicly, as no information about customers or partners had been stolen. However, this is a concerning development, as it highlights the potential for foreign adversaries to hack and steal AI secrets, especially as the race to achieve AGI (Artificial General Intelligence) heats up.

Aschenbrenner's "Situational Awareness" essay, which outlines his fears about the lack of security and the potential for China to hack and steal American AI secrets, is a must-read. If a foreign adversary can copy AGI technology, it can achieve AGI itself without having to discover the underlying technology on its own.

These security issues from OpenAI are likely just the beginning, as the AI industry continues to grow and become more valuable. Ensuring the security and safety of these powerful AI systems is of utmost importance, and these incidents highlight the need for robust security measures and transparency from companies like OpenAI.

Conclusion

The future of computing and AI is rapidly evolving, with significant advancements and developments happening across various fronts. The news highlights several key trends:

  1. Apple's Involvement in OpenAI: Apple's decision to obtain a board observer seat on OpenAI's board is a strategic move, indicating the company's interest in the AI landscape and its potential integration with Apple's ecosystem.

  2. Salesforce's Einstein Tiny Giant: The release of this high-performing, on-device AI model showcases the growing importance of edge computing and the shift towards smaller, more efficient AI models.

  3. Moshi by Kyutai: The development of this real-time, multimodal foundation model that can listen and speak highlights the progress in voice-enabled AI capabilities, challenging OpenAI's delayed GPT-4o voice functionality.

  4. The Changing Computing Paradigm: Experts like Andrej Karpathy discuss the emergence of a new computing paradigm, where large language models act as the central processing unit, and the traditional operating system and application structure may become obsolete.

  5. ElevenLabs' Advancements: The company's voice isolation technology and the ability to recreate famous voices demonstrate the potential impact of AI on media and content creation.

  6. Perplexity's Improved Search Capabilities: The addition of multi-step reasoning and advanced math/programming capabilities in Perplexity's Pro Search highlights the ongoing advancements in AI-powered research and problem-solving.

  7. Meta's 3D Asset Generation: The development of Meta 3D Gen, a system for generating high-quality 3D assets from text, showcases the potential for AI-driven content creation in the gaming and entertainment industries.

  8. GPT4All 3.0: The continued evolution of this open-source, local LLM desktop app provides users with an accessible way to interact with large language models without relying on cloud-based services.

  9. Anthropic's Model Evaluation Initiative: The company's efforts to fund third-party organizations to develop robust model evaluation frameworks address the growing need for comprehensive and dynamic model assessments.

  10. Security Concerns with OpenAI: The reported security issues, including unencrypted user conversations and the internal messaging system hack, underscore the importance of robust security measures as AI systems become more prevalent and influential.

These developments demonstrate the rapid pace of innovation in the AI and computing landscape, with significant implications for various industries and the future of technology.
