Two decades into the new millennium, it is pretty apparent: hardware without software and smarts is nothing more than a gimmick. Apple’s product launch event today was a timely reminder of this new hardware reality. And I’m glad to say Apple delivered. The new iMac video cameras and the new iPad Pro’s front-facing cameras are a big step toward our visual future.
Ever since the pandemic began, and we started to live most of our lives on Zoom and its video conferencing cousins, we have complained about the poor quality of video cameras inside our computers. Apple computers, especially.
We mostly put up with poor quality, though some of us bought a gadget to connect our bigger and better camera to our computers. Or if you are like me, you used the camera on the iPhone via Camo App — after all, the iPhone camera stack is way better than any standalone streaming camera. Whatever the hack, we all were sick and tired of substandard front-facing cameras on our computers.
Throughout this pandemic experience, I have wondered why we don’t have computers with cameras that can use computer vision and depth perception to create a better video-calling experience. After all, the necessary technology building blocks are all here.
Computer vision algorithms are progressing by the day. The computers in our pockets, bags, and homes are packed with computing power, machine learning chips, and GPUs. Tiny cameras and sensors (you know, the ones they use in billions of smartphones) are now capable and cheap enough to be built into computers and to eventually create a better conference experience.
As I have written on multiple occasions, Zoom (or at least, working and interacting over video) culture is here to stay. Unlike the 1918 flu epidemic, when the telephone failed to live up to its potential because telephone exchanges were manned by operators who fell victim to the flu, the current pandemic has been made bearable by reliable video conferencing. The always-on servers are impervious to the kind of virus the rest of us are dealing with. It is not far-fetched to say that, without technology, the economic impact of the pandemic would have been far worse than what we have experienced so far.
Just as a taste of streaming's convenience got us all hooked on Spotify and Netflix, we are all going to be video-calling for a long time. We tend to underestimate how much the pandemic and video calling have changed our behavior. Whether it is casual social contact, telemedicine, remote learning, or remote work, we have entered an era where communication will be increasingly visual.
Data from the pandemic makes this clear. Google Duo and Google Meet hosted over 1 trillion minutes of video calls globally in 2020. OpenVault, a company that studies US network traffic patterns, noted in a recent study that total upstream traffic increased 63 percent in 2020. A lot of that was due to the use of video conferencing apps. OpenVault data shows that a one-hour call on Zoom takes up between 360 megabytes and 1.2 gigabytes. Bandwidth consumption nearly doubled during business hours. This additional demand presents new challenges and opportunities, and Google is already on the case with its AI division, using new codecs to compress audio and video into less space.
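To put the OpenVault range in more familiar terms, here is a rough back-of-the-envelope conversion from "data per one-hour call" to an average bitrate. It is only a sketch: it assumes decimal units (1 gigabyte = 1,000 megabytes) and a steady stream for the whole hour, and the function name is my own.

```python
def average_bitrate_mbps(megabytes_per_hour: float) -> float:
    """Convert data used in one hour (decimal megabytes) to average megabits per second."""
    bits = megabytes_per_hour * 1_000_000 * 8   # megabytes -> bits
    return bits / 3600 / 1_000_000              # per second, expressed in megabits

# The two ends of the range quoted above (1.2 GB assumed to be 1,200 MB)
low = average_bitrate_mbps(360)
high = average_bitrate_mbps(1200)
print(f"{low:.1f} to {high:.1f} Mbps")  # 0.8 to 2.7 Mbps
```

In other words, under those assumptions a single call averages somewhere between roughly 0.8 and 2.7 megabits per second.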
With so much of our work and life going visual, there is an opportunity for a superior experience. Apple has recognized that — and so have Amazon, Facebook, Google, and Microsoft, who have all launched dedicated video-calling devices.
The new iMac has a 1080p HD camera that taps into the image signal processor in the M1 chip and the Neural Engine, which boosts image quality with better noise reduction, greater dynamic range, and improved auto exposure and white balance. The three-microphone array results in less feedback. There is directional beamforming that helps ignore the background noise and focus on a user’s voice. In short, Apple says conversations will sound natural and clear. The six speakers, too, are designed to use algorithms and enable spatial audio.
Apple isn’t the first to marry video with artificial intelligence and machine learning. Google and Microsoft have been offering better features for a while. Google, for example, launched noise cancellation, background blur, and low-light mode, among many other features, on its video calling services last year. Apple, however, can take full advantage of its vertical integration. The new audio and video hardware will use the M1 chip’s components, such as the image signal processor, Neural Engine, and GPU, to their full potential.
The question is, where are we going when it comes to video calling in the future? We don’t have to look too far: the new iPad Pro gives us ample clues. The iPad Pro is now powered by the same M1 chip that powers the Mac devices. In addition to the TrueDepth camera system, the iPad Pro has a new ultra-wide front camera. Looking at the new iPad Pro’s camera rig, it is not difficult to see these same capabilities coming to your laptop or desktop. And most certainly to your iPhone. The visual sensors are much more capable than what we have experienced. It is all about using that power and artificial intelligence to create a better user experience.
Apple hopes to do that with a new feature launched on the iPad Pro called Center Stage. Not that I would recommend anything related to Facebook to even my worst enemy, but Center Stage reminds me of Facebook’s Portal device that offered similar zooming capabilities and tracked people’s movements.
“Center Stage uses the much larger field of view on the new front camera and the machine learning capabilities of M1 to recognize and keep users centered in the frame,” Apple notes in a press release. “As users move around, Center Stage automatically pans to keep them in the shot. When others join in, the camera detects them too, and smoothly zooms out to fit everyone into the view and make sure they are part of the conversation.”
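Apple has not said how Center Stage works under the hood, but the behavior described in that quote (keep people centered, zoom out when someone joins, pan smoothly) can be sketched in a few lines. What follows is a guess at the general idea, not Apple's method: it assumes some face detector has already produced bounding boxes in pixel coordinates, and `Box`, `framing_crop`, and `ease_toward` are names I made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float
    h: float

def framing_crop(faces, frame_w, frame_h, margin=0.5):
    """Return a crop with the frame's aspect ratio that contains every face."""
    if not faces:
        return Box(0, 0, frame_w, frame_h)  # nobody detected: show the full view
    left = min(f.x for f in faces)
    top = min(f.y for f in faces)
    right = max(f.x + f.w for f in faces)
    bottom = max(f.y + f.h for f in faces)
    # Pad the group so nobody is pressed against the edge of the shot
    w = (right - left) * (1 + 2 * margin)
    h = (bottom - top) * (1 + 2 * margin)
    # Grow the shorter side until the crop matches the frame's aspect ratio
    aspect = frame_w / frame_h
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect
    # The crop can never be larger than what the ultra-wide camera sees
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale
    # Center on the group, then slide the crop back inside the frame if needed
    cx, cy = (left + right) / 2, (top + bottom) / 2
    x = min(max(cx - w / 2, 0), frame_w - w)
    y = min(max(cy - h / 2, 0), frame_h - h)
    return Box(x, y, w, h)

def ease_toward(current, target, alpha=0.1):
    """Move a fraction of the way to the target on each video frame, so pans look smooth."""
    def step(a, b):
        return a + alpha * (b - a)
    return Box(step(current.x, target.x), step(current.y, target.y),
               step(current.w, target.w), step(current.h, target.h))
```

With one face in view the crop is tight. When a second face appears farther away, the union of the boxes grows, so the crop widens, which is the "zooms out to fit everyone" effect. Calling `ease_toward` once per frame keeps the transition from looking like a jump cut.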
Today, it is hard to tell the difference between my competent 2018 model of iPad Pro and the newest iPad Pro. The 2021 model’s screen might be better. It might have a newer chip. It even has a more capable port to connect peripherals. But in the end, the only reason for me to upgrade is to use the new camera capabilities to have a more productive video calling experience. The underlying specs won’t matter as much — it is all about smarts and intelligence.
April 20, 2021, San Francisco