Connected TV so far has been an experiment in futility, and it’s due to one thing: the act of typing out letters, either with a keyboard or a remote control, is just plain unnatural.
Attempts to integrate PCs, create set-top boxes with keyboards, and redesign interfaces are all leading nowhere fast. Techies love them and will play with them and maybe even figure out how to use them, but for mainstream TV viewers it’s just not going to happen.
I read a comment on an article today that said it all for me:
Smart TVs will fail because most people watch TV to avoid thinking.
There’s a great book on web design called “Don’t Make Me Think”. Many of the same user interaction principles apply to connected TV, but the barrier of the keyboard and mouse remains. A lot of the actions we now find intuitive on a PC just aren’t that way on a TV. If you stream TV content to your PC, you watch on a smaller screen and leverage the usability features you’re used to from the web, but you give up the big screen living room experience.
There are a lot of smart TV announcements at CES. Lenovo announced the Android K91, and there’s the LG Google TV with its Magic Motion remote, which integrates QWERTY capability with voice recognition. Further confusing matters, Canonical showed off its Ubuntu TV concept.
These are attempts to differentiate, so manufacturers can capture users. (Another problem: if you switch TV brands, you get to relearn everything about the interface.) But these operating system and visual redesigns won’t ultimately win users without some help making input more natural, and that help will come in one of two forms: a smartphone or tablet app, or voice input.
Smartphone or tablet app
I know, the Apple fans reading this are thinking that voice will immediately trump everything else, and it’s a distinct possibility that turns out to be true. We’ll talk about that in a second. There is a use case emerging that may make sense for apps, and it’s not just putting the remote control on the phone.
It’s a reality that most people armed with smartphones or tablets are engaged in some form of social networking while watching TV, unless they are fully engaged with a new-to-them movie, event, or premiere programming. What if the social experience was more integrated with the mobile device?
Shelby rolled out TouchPlay at CES. Here’s their vid. They have completely rethought the way their web app works, and integrated it with the TV experience.
To activate TouchPlay, turn on AirPlay with Mirroring in your iPad settings and when you launch the Shelby app, you’ll be greeted with gorgeous full screen video on your TV that has that familiar Shelby look and feel. We’ve also added a gesture remote, which turns your iPad into a no-look controller so you can kick back and fully enjoy what you’re watching.
If the tablet on your lap on the couch naturally translated your gestures into stuff happening on the TV across the room, it’d be huge. I can’t find any evidence that the Google community is after this type of functionality, but it’s not far-fetched. (Comment with a URL if you know something.)
Then there is voice input. The LG Magic Motion remote has limited voice control capability, but not the full range of things users will want. Searching for content, posting status updates, and controlling the experience via voice could transform the whole TV landscape.
We’ve seen the beginnings with Microsoft Kinect Voice and its integration with Bing. For Xbox 360 users it’s a great idea, but this kind of thing needs to move into the TV.
At CES, Nuance introduced Dragon TV. They’re trying to take advantage of how people are starting to interact with phones and cars using natural voice commands. They’ve got the most experience and patent IP and it’ll be interesting to see how OEMs adopt their technology.
Of course, the 800lb gorilla in the room is Apple, and the world’s worst kept secret of applying the Siri interface to what presumably would be branded iTV. Apple could have an instant winner with their design principles applied if they line up all the pieces.
All these voice innovations will require a better content API to be truly effective. Speech-to-text is pretty good these days, so posting a status update without a keyboard should be possible. Searching for and sifting the right content will be the true test, and presents a higher level of difficulty. (“TV, find me an image of Rick Santorum” could produce a really undesirable result in your living room. There’s an infinite range of those kinds of things that we’ve learned to manage in a full web browser.)
That’s why there still may be a case for tablet/TV integration. Nuance, interestingly, has both sides of this game covered since they acquired Swype. Instead of opening up a huge keyboard on-screen to send a Tweet, the keyboard could be on the tablet, with the text appearing on the TV and the rest of the gesture control available.
There’s also the whole idea of HTML5 speech input, which could make an app much more portable to different platforms (Apple or Google or Ubuntu). There is also a great case for mobile sites optimized for phones – the same interface would apply to a view that takes up 1/3 of the TV screen. I have often wanted to pull up baseball-reference.com while watching a game, and if I had an easy interface to do that in 1/3 of the TV screen with the other 2/3 still showing the game, then swap that out for the Twitter feed, it’d be awesome.
Whatever the winner here, it’ll have to be a much more natural act than using a keyboard and a pointer, or a motion game controller or remote, trying to type words. Connected TV will only start to move when this gets solved.