Apple + AI: What to Expect at WWDC 2024

Google and Microsoft have already shared their plans for integrating AI into their products. Now, it’s Apple’s turn. If the rumors are even remotely true, Apple will present a more pragmatic and user-centric strategy at WWDC 2024 next week. A widely cited Bloomberg news report suggests these features:

  • Project Greymatter will likely introduce a suite of AI tools integrated into core applications such as Safari, Photos, and Notes, along with enhanced notifications in iOS 18 and macOS 15. Processing will take place on-device for less intensive tasks and in the cloud for more complex ones. The “Greymatter Catch-up” feature will offer recaps of recent notifications through Siri.
  • AI enhancements are poised to enhance web searches in Safari, making them quicker and more dependable. Safari is also expected to include summaries, similar to a feature currently available in the upstart Arc Browser.
  • Other expected AI features include voice-memo transcription, AI-powered photo retouching, quicker Spotlight searches, enhanced Safari web searches, and auto-suggested replies for emails and messages. On-device AI capabilities are also expected to extend to the Apple Watch. Expect an AI emoji creator, too.
  • More natural interactions with Siri. Apple has reportedly upgraded Siri, enhancing its ability to understand and respond to users more naturally using its proprietary large language models.
  • Developer tools like Xcode will get an AI makeover.
  • A revamped iPhone home screen allowing custom app icon colors and placement.
  • A partnership with OpenAI to integrate its chatbot into iOS 18. Discussions are also reportedly ongoing with Google to offer its Gemini chatbot as an option.
  • AI features positioned as a “preview” in developer betas, suggesting the technology is still a work in progress.

AppleInsider reports that a newly improved Siri will be able to do the following:

  • In Books, Siri can open specific books or sections, flip pages, change themes, and open the bookstore.
  • With Siri, users can take pictures and videos, switch between front and rear cameras, and set timers.
  • In Keynote, Siri can add media to slides, create new presentations, set bullet points and more.
  • In Mail, Siri can send and unsend emails, mark emails as junk, mute senders, summarize emails, and enable smart replies.
  • In Notes, Siri can create and delete folders, add or remove tags, and include audio recordings and transcriptions.
  • In Photos, Siri will be able to search for objects/people, rotate/edit/organize photos into albums, hide photos, etc.
  • In Safari, Siri can read aloud or summarize webpages and create tab groups, including private ones.
  • Siri will instantly generate smart-reply suggestions in Mail and Messages.

On the surface, these might not be whiz-bang offerings, but by embedding AI in its products, Apple can enhance the user experience and ease its customers into this new “AI-first” world. However, as mentioned earlier, this is all still speculation.

Siri was first introduced as an integrated feature of the iPhone 4S on Oct. 4, 2011, before the technology world had taken its first baby steps into the world of large language models. Geoffrey Hinton and his team of researchers later made crucial breakthroughs that eventually led to the current excitement around AI.

Siri, when introduced, was essentially a rules-based voice engine with very limited capabilities; Apple oversold it. The underlying technologies weren’t ready, and Apple’s silicon journey had just begun. (I have found that Siri seems to work better on Vision Pro and recent editions of the Apple Watch.) It didn’t help that Siri was shackled by its inability to work with third-party apps and was more restrictive than products eventually released by its rivals, Amazon and Google.

Amazon’s Alexa, released in 2014, and Google’s Assistant have shown improvements as the underlying technologies — natural language processing (NLP) and machine learning — continue to advance, enabling them to handle complex conversational queries.

Advances in deep learning models have enhanced recent OpenAI technologies, such as GPT-4. These systems now boast the ability to engage in human-like conversational interactions, generate creative content, and perform a wide range of tasks beyond predefined commands.

“Siri and Alexa are kind of old now. They just need a Botox,” I said on the Trends with Friends podcast. “And I think LLMs are going to be the Botox.” My bet is that Apple might start with OpenAI, but it would work with anyone — say, Google’s Gemini — as a backfill. Apple’s own models would run locally, with OpenAI filling in for other services to bolster Siri.

There has been speculation that Apple might discard Siri — a move that would be a mistake. Siri is an integral part of Apple’s brand identity. Despite being criticized for its inefficiency, it still holds significant brand cachet. A whole generation has grown up familiar with Siri and her voice. We don’t often think of voice interfaces as a “brand,” but in a future where screens and keyboards may vanish, they become a crucial part of brand identity. If Apple can enhance Siri, the early issues will be forgotten — much like the initial disaster of Apple Maps is rarely discussed now.

This aligns with typical Apple strategy. Apple is usually not the first to adopt technological shifts and prefers to take its time entering a market. Historically, the company has lagged behind others in internet software and services. It relied on Google Maps and Google Search to enhance its products before fully committing to developing its own maps. Now, it is reportedly partnering with OpenAI, as noted recently by Daniel Raffel, a Silicon Valley veteran, on his blog.

Reflecting on Apple’s past, particularly a rumored $1 billion deal with Nuance for a perpetual license to self-host and privately fork its speech technology for Siri’s iOS launch, a pattern emerges. When Apple needed to catch up quickly, it licensed and integrated advanced technologies. This strategy allowed it to provide iPhone users with a cutting-edge assistant featuring voice recognition and speech synthesis. Given this history, it wouldn’t be surprising if Apple is taking a similar approach with OpenAI to offer cloud-based AI features. It would be in character for Apple to have negotiated a perpetual license to self-host and fork as much OpenAI software as possible, and to potentially license custom models.

It will be interesting to see how Apple tackles the challenge of AI while maintaining its public and marketing stance of prioritizing privacy. Apple has emphasized “on-device” processing and data encryption as its unique selling points, in contrast to companies like Google or Meta. About a decade ago, I examined the differing approaches of Google and Apple in an article for The New Yorker. We live in infinitely more complex times now — AI as we know it relies heavily on cloud computing, and it often compromises privacy, absorbing data liberally from the cloud as part of the personalization process.

In recent months, Apple has allowed details of its employees’ on-device AI research to become public. I interpret these leaks as strategic marketing moves, signaling how Apple is likely to address the shift toward artificial intelligence.

A partnership with OpenAI would enable Apple to apply AI directly on devices while leveraging OpenAI’s capabilities in the background — a strategy that upholds the “privacy” argument. However, it is still unclear how Apple would manage this without diminishing the services provided.

Nonetheless, I am pleased that Apple is beginning to adapt to this shifting paradigm. Just as smartphones revolutionized our interaction with information, AI signifies another transformation. In a recent article, I wrote:

Perhaps not today, but in the near future, we will begin to interact with information through non-textual interfaces. Until now, we have primarily used computers via keyboards (both physical and virtual) and mice. With smartphones, we started to use “touch” as a method of interaction. Whether it’s capturing receipts for Expensify or taking photos of items to shop for later, we have begun using the “camera” to capture information. As technology improves, we are increasingly using the camera as a conduit between text and ourselves. 

Cameras are already acting as non-textual conduits. Facebook’s Glasses are a good case study. We will continue to see proliferation and experimentation with new devices. Vision Pro, Ai Pin, Rabbit, or “Enter Name” are just the beginning and might become failed experiments, but there is no turning back. Fast forward a decade from now — perhaps sooner — one thing is certain: the share of “keyboards” as a way to interact will be much less. A good comparison is the share of viewing minutes for linear television in the age of streaming. 

This future will indeed feature a larger presence of voice interfaces in our computing lives. The reason this hasn’t occurred so far is that the technology is not quite complete. The new AI technologies will act like Botox for voice interfaces. I discussed this topic recently on Howard Lindzon’s podcast. 

For me, Apple’s true strength in AI lies in its ability to drive innovations centered on core technologies. In previous discussions about Apple’s chips and AI, I noted that the company was slow to embrace the AI revolution. However, its capability to develop chips with graphical processing and neural capabilities, and then seamlessly integrate them with its operating system and software, stands out as a distinct advantage that is often underestimated.

Today, it may lag behind Nvidia in terms of chips needed for training, but the company can leverage its “device scale” to develop more powerful yet cheaper GPUs and neural capabilities required for on-device inference models. “Silicon” is Apple’s edge. Consequently, like everyone else, I am patiently waiting to learn what Apple will announce.

June 8, 2024. San Francisco
