What’s broken, patents or the legal system?

Referring to the America Invents Act (AIA), which aims to cull low-quality software patents, the head of the United States Patent and Trademark Office, David Kappos, says:

“Give it a rest already. Give the AIA a chance to work. Give it a chance to even get started.”

He’s mostly reacting to studies that claim patent trolls enabled by the USPTO cost the economy upwards of $29 billion annually. While awards vary, what’s constant is the exorbitant cost of litigating patent cases. Large-scale cases can easily run into the tens of millions of dollars and take months or even years.

One way to make sense of this situation is to declare the very notion of (software) patents archaic and indefensible in the 21st century. But what if the problem isn’t the fundamental notion or the general utility of patents, but rather the inefficiencies in our legal system?

If the legal costs associated with getting and defending patents were 10X cheaper and the process of adjudication much faster, more professional and more predictable, would we feel differently about patent claims?

Is Siri really Apple’s future?

Siri is a promise. A promise of a new computing environment, enormously empowering to the ordinary user, a new paradigm in our evolving relationship with machines. Siri could change Apple’s fortunes like iTunes and App Store…or end up being like the useful-but-inessential FaceTime or the essential-but-difficult Maps or the desirable-but-dead Ping. After spending hundreds of millions on acquiring and improving it, what does Apple expect to gain from Siri, at once the butt of late-night TV jokes but also the wonder of teary-eyed TV commercials?

Everyone expects different things from Siri. Some think the top 5 wishes for Siri should include the ability to change iPhone settings. The impatient already think Siri should have become the omniscient Knowledge Navigator by now. And of course, the favorite pastime of Siri commentators is comparing her query output to Google Search results while giggling.

Siri isn’t a sexy librarian

The Google comparison, while expected and fun, is misplaced. It’d be very hard for Siri (or Bing or Facebook, for that matter) to beat Google at conventional Command Line Interface search given its intense and admirable algorithmic tuning and enormous infrastructure buildup over a decade. Fortunately for competitors, though, Google Search has an Achilles heel: you have to tell Google your intent and essentially instruct the CLI to construct and carry out the search. If you wanted to find a vegetarian restaurant in Quincy, Massachusetts within a price range of $25-$85 and you were a Google Search ninja, you could manually enter a very specific keyword sequence: “restaurant vegetarian quincy ma $25…$85” and still get “about 147,000 results (0.44 seconds)” to sift through. [All examples hereon are grossly simplified.]


This is a directed navigation system around The Universal Set — the entirety of the Internet. The user has to essentially tell Google his intent one. word. at. a. time and the search engine progressively filters the universal set with each keyword from billions of “pages” to a much smaller set of documents that are left for the user to select the final answer from.
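To make the narrowing concrete, here is a toy sketch of keyword-by-keyword filtering. The four "documents" and the whole-word matching rule are invented for illustration; this is emphatically not how Google's ranking works, only the shape of the interaction described above.

```python
# Toy illustration of keyword-by-keyword narrowing. The documents and the
# matching rule are invented for this sketch; real search engines rank,
# they do not simply intersect.
DOCUMENTS = {
    "doc1": "vegetarian restaurant in quincy ma with entrees around $30",
    "doc2": "steakhouse restaurant in quincy ma",
    "doc3": "vegetarian restaurant in boston ma",
    "doc4": "history of quincy ma",
}


def progressive_filter(documents, keywords):
    """Return the candidate set that remains after each keyword is applied."""
    candidates = set(documents)
    steps = []
    for keyword in keywords:
        # Keep only the documents that contain the keyword as a whole word.
        candidates = {doc_id for doc_id in candidates
                      if keyword in documents[doc_id].split()}
        steps.append((keyword, sorted(candidates)))
    return steps


steps = progressive_filter(DOCUMENTS, ["restaurant", "vegetarian", "quincy"])
for keyword, remaining in steps:
    print(f"after '{keyword}': {remaining}")
```

Each keyword shrinks the set, but it is the user who has to supply every one of them, in words the engine happens to index.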

Passive intelligence

Our computing devices, however, are far more “self-aware” circa 2012. A mobile device, for instance, is considerably more capable of passive intelligence thanks to its GPS, cameras, microphone, radios, gyroscope, myriad other in-device sensors, and dozens of dedicated apps, from finance to games, that know enough about the user to dramatically reduce the number of unknowns…if only all these input and sensing data could somehow be integrated.

Siri’s opportunity here to win the hearts and minds of users is to change the rules of the game from relatively rigid, linear and largely decontextualized CLI search towards a much more humane approach where the user declares his intent but doesn’t have to tell Siri how to do it every step of the way. The user starts a spoken conversation with Siri, and Siri puts an impressive array of services together in the background:

  • precise location, time and task awareness derived from the (mobile) device,
  • speech-to-text, text-to-speech, text-to-intent and dialog flow processing,
  • semantic data, services APIs, task and domain models, and
  • personal and social network data integration.

Let’s look at the contrast more closely. Suppose you tell Siri:

“Remind me when I get to the office to make reservations at a restaurant for mom’s birthday and email me the best way to get to her house.”

Siri already knows enough to integrate Contacts, Calendar, GPS, geo-fencing, Maps, traffic, Mail, Yelp and OpenTable apps and services to complete the overall task. A CLI search engine like Google’s could complete only some of these and only with a lot of keyword and coordination help from the user. Now let’s change “a restaurant” above to “a nice Asian restaurant”:

“Remind me when I get to the office to make reservations at a nice Asian restaurant for mom’s birthday and email me the best way to get to her house.”

“Asian” is easy, as any restaurant-related service would make at least a rough attempt to classify eateries by cuisine. But what about “nice”? What does “nice” mean in this context?

A conventional search engine like Google’s would execute a fairly straightforward search for the presence of “nice” in the text of restaurant reviews available to it (that’s why Google bought Zagat), and perhaps go the extra step of doing a “nice AND (romantic OR birthday OR celebration)” compound search to throw in potentially related words. Since search terms can’t be hand-tuned for an infinite number of domains, this comes into play for highly searched categories like finance, travel, electronics, automobiles, etc. In other words, if you’re searching for airline tickets or hotel rooms, the universe of relevant terms is finite, small and well understood. Goat shearing or olive-seed spitting contests, on the other hand, may not benefit as much from such careful human taxonomic curation.

Context is everything

And yet even when a conventional search engine can correlate “nice” with “romantic” or “cozy” to better filter Asian restaurants, it won’t matter to you if you cannot afford it. Google doesn’t have access to your current bank account, budget or spending habits. So for the restaurant recommendation to be truly useful, it would make sense for it to start at least in a range you could afford, say $$-$$$, but not $$$$ and up.

Therein comes the web browser vs. apps unholy war. A conventional search engine like Google has to maintain an unpalatable level of click-stream snooping to track your financial transactions to build your purchasing profile. That’s not easy (and likely illegal on several continents), especially if you’re not constantly using Google Play or Google Wallet, for example. While your credit card history or your bank account is opaque to Google, your Amex or Chase app has all that info. If you allow Siri to securely link to such apps on your iPhone, because this is a highly selective request and you trust Siri/Apple, your app and/or Siri can actually interpret what “nice” is within your budget: up to $85 this month and certainly not in the $150-$250 range and not a $25 hole-in-the-wall Chinese restaurant either because it’s your mother’s birthday.

Speaking of your mother, her entry in your Contacts app has a custom field next to “Birthday” called “Food” which lists: “Asian,” “Steak,” and “Rishi Organic White Tea”. On the other hand, Google has no idea, but your Yelp app has 37 restaurants bookmarked by you and every single one is vegetarian. Your mother may not care, but you need a vegetarian restaurant. Siri can do a proper mapping of the two sets of “likes” and find a mutually agreeable choice at their intersection.
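The matching described above is, at heart, a set intersection plus a price window. Here is a minimal sketch under loud assumptions: the restaurant names, prices and the `pick_restaurants` helper are all invented for illustration, and nothing here reflects how Siri, Contacts or Yelp actually expose their data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    name: str            # invented names, for illustration only
    cuisines: frozenset  # tags such as "Asian" or "Vegetarian"
    price: int           # assumed typical cost per person, in dollars


def pick_restaurants(restaurants, guest_likes, host_requires, floor, ceiling):
    """Keep places that serve something the guest likes, meet every one of
    the host's requirements, and cost more than `floor` but at most `ceiling`."""
    return [
        r.name
        for r in restaurants
        if r.cuisines & guest_likes      # at least one shared "like"
        and host_requires <= r.cuisines  # all of the host's constraints met
        and floor < r.price <= ceiling   # inside the affordable window
    ]


CANDIDATES = [
    Restaurant("Lotus Garden", frozenset({"Asian", "Vegetarian"}), 60),
    Restaurant("Prime Cut", frozenset({"Steak"}), 120),
    Restaurant("Noodle Shack", frozenset({"Asian", "Vegetarian"}), 20),
    Restaurant("Golden Wok", frozenset({"Asian"}), 70),
]

choices = pick_restaurants(
    CANDIDATES,
    guest_likes={"Asian", "Steak"},  # the "Food" field in the Contacts entry
    host_requires={"Vegetarian"},    # inferred from the Yelp bookmarks
    floor=25, ceiling=85,            # the budget window from the text
)
print(choices)
```

With these made-up candidates only one place survives all three tests; the point is that each constraint comes from a different app, and none of them had to be typed in by the user.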

So a simple search went from “a restaurant” to “a nice Asian vegetarian restaurant I can afford” because Siri already knew (as in, she can find out on demand) about your cuisine preference and your mother’s and your ability to pay:

Restaurant chain

Mind you, this whole series of data lookups and rule arbitrations among multiple apps happens in milliseconds. Quite a bit of your personal info is cached at Apple servers and the vast majority of data lookups in third party apps are highly structured and available in a format Siri has learned (by commercial agreement between companies) to directly consume. Still, the degree of coordination underneath Siri’s reassuring voice is utterly nontrivial. And given the clever “personality” Siri comes with, it sounds like pure magic to ordinary users.

The transactional chain

In theory, Siri’s execution chains can be arbitrarily long. Let’s consider a generic Siri request:

Check weather at and daily traffic conditions to an event at a specific location, only if my calendar and my wife’s shared calendar are open and tickets are available for under $50 for tomorrow evening.

Siri would parse it semantically as:


and translate into an execution chain by apps and services:


Further, being an integral part of iOS and having programmatic access to third party applications on demand, Siri is fully capable of executing a fictional request like:

Transfer money to purchase two tickets, move receipt to Passbook, alert in own calendar, email wife, and update shared calendar, then text baby sitter to book her, and remind me later.

by translating it into a transactional chain, with bundled and 3rd party apps and services acting upon verbs and nouns:


By parsing a “natural language” request, lexically and semantically, into structural subject-predicate-object parts, Siri can not only find documents and facts (like Google) but also execute stated or implied actions with granted authority. The ability to form deep semantic lookups, integrate information from multiple sources, devices and 3rd party apps, perform rules arbitration and execute transactions on behalf of the user elevates Siri from a schoolmarmish librarian (à la Google Search) into an indispensable butler, with privileges.
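For readers who think in code, the chains in this section can be pictured as an ordered list of verb-noun steps, each handed to a service, with the whole chain halting as soon as a precondition fails. The sketch below is purely hypothetical: `Step`, `run_chain`, the service names and the stub data are my own inventions and say nothing about Siri's actual internals or APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    verb: str     # the action, e.g. "check"
    noun: str     # the object acted upon, e.g. "calendars"
    service: str  # hypothetical app or service that handles the step


def run_chain(steps: List[Step],
              handlers: Dict[str, Callable[[Step], bool]]) -> Tuple[bool, List[str]]:
    """Run steps in order; stop at the first step whose handler returns False."""
    log = []
    for step in steps:
        ok = handlers[step.service](step)
        log.append(f"{step.service}: {step.verb} {step.noun} -> {'ok' if ok else 'stop'}")
        if not ok:
            return False, log
    return True, log


# Stub data standing in for what real apps would report (all invented).
STUB_DATA = {"my_calendar_free": True, "shared_calendar_free": True, "ticket_price": 45}

handlers = {
    "Calendar": lambda s: STUB_DATA["my_calendar_free"] and STUB_DATA["shared_calendar_free"],
    "Tickets": lambda s: STUB_DATA["ticket_price"] < 50,
    "Weather": lambda s: True,
    "Maps": lambda s: True,
}

chain = [
    Step("check", "calendars", "Calendar"),
    Step("find", "tickets under $50", "Tickets"),
    Step("check", "weather", "Weather"),
    Step("check", "traffic", "Maps"),
]

completed, log = run_chain(chain, handlers)
for line in log:
    print(line)
```

Raise the stubbed ticket price above $50 and the chain stops at the second step, never bothering with weather or traffic, which is exactly the "only if" behavior the generic request asks for.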

The future is Siri and Google knows it

After indexing 40 billion pages and their PageRank, legacy search has largely run its course. That’s why you see Google, for example, buying the world’s largest airline search company ITA, restaurant rating service Zagat, and cloning Yelp/Foursquare with Google Places, Amazon with Google Shopping, iTunes and App Store with Google Play, Groupon with Google Offers, Hotels.com with Google Hotel Finder…and, ultimately, Siri with Google Now. Google has to accumulate domain specific data, knowledge and expertise to better disambiguate users’ intent in search. Terms, phrases, names, lemmas, derivations, synonyms, conventions, places, concepts, user reviews and comments…all within a given domain help enormously to resolve issues of context, scope and intent.

Whether surfaced in Search results or Now, Google is indeed furiously building a semantic engine underneath many of its key services. “Normal search results” at Google are now almost an afterthought once you go past the various Google and third party (overt and covert) promoted services. Google has been giving Siri-like answers directly instead of providing interminable links. If you searched for “Yankees” in the middle of the MLB playoffs, you got real-time scores by inning, first and foremost, not the history of the club, the new stadium, etc.

Siri, a high-maintenance lady?

Google has spent enormous amounts of money on an army of PhDs, algorithm design, servers, data centers and constant refinements to create a global search platform. The ROI on search in terms of advertising revenue has been unparalleled in internet history. Apple’s investment in Siri has a much shorter history and far smaller visible footprint. While it’d be suicidal for Apple to attack Google Search in the realm of finding things, can Apple sustainably grow Siri to its fruition nevertheless? Very few projects at Apple that don’t manage to at least provide for their own upkeep tend to survive. Given Apple’s tenuous relationship with direct advertising, is there another business model for Siri?

By 2014, Apple will likely have about 500 million users with access to Siri. If Apple could get half of that user base to generate just a dozen Siri-originated transactions per month (say, worth on average $1 each, with a 30% cut), that would be roughly a $1 billion business every month: 250 million users × 12 transactions × $1 × 30% comes to about $900 million monthly, or nearly $11 billion a year. Optimistically, the average transaction could be much more than $1 or the number of Siri transactions much higher than 12/month/user or Siri usage more than 50% of iOS users, especially if Siri were to open to 3rd party apps. While these assumptions are obviously imaginary, even under the most conservative conditions, transactional revenue could be considerable. Let’s recall that, even within its media-only coverage, iTunes has now become an $8 billion business.
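Since the whole argument rests on back-of-the-envelope numbers, it may help to spell the arithmetic out. The function below simply multiplies the paragraph's own assumptions together; the parameter names are mine, and the inputs are the admittedly imaginary figures from the text.

```python
def monthly_cut(users, active_share, tx_per_user_month, avg_tx_value, cut):
    """Platform's share of Siri-originated transaction value per month."""
    return users * active_share * tx_per_user_month * avg_tx_value * cut


monthly = monthly_cut(
    users=500_000_000,     # users with access to Siri
    active_share=0.5,      # half of them actually transact
    tx_per_user_month=12,  # a dozen transactions per month
    avg_tx_value=1.00,     # $1 on average
    cut=0.30,              # 30% commission
)
annual = monthly * 12

print(f"monthly: ${monthly:,.0f}")
print(f"annual:  ${annual:,.0f}")
```

Under these inputs the cut works out to about $900 million a month, or roughly $10.8 billion a year. Doubling any single assumption doubles the result, which is why even conservative variations still leave a sizable business.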

As Siri moves up the value chain from its original CLI-centric simplicity prior to the Apple acquisition, through its current status of speech recognition-dictation-search, to a more conversational interface focused on transactional task completion, she becomes far more interesting and accessible to hundreds of millions of non-computer-savvy users.

Siri as a transaction machine

A transactional Siri has the seeds to shake up the $500 billion global advertising industry. For a consumer with intent to purchase, the ideal input comes close to “pure” information, as opposed to an ephemeral ad impression or a series of search results which need to be parsed by the user. Siri, well-oiled by the very rich contextual awareness of a personal mobile device, could deliver “pure” information with unmatched relevance at the time it’s most needed. Eliminating all intermediaries, Siri could “deliver” a customer directly to a vendor, ready for a transaction Apple doesn’t have to get involved in. Siri simply matches intent and offer more accurately, voluntarily and accountably than any other method at scale that we’ve ever seen.

Another advantage of Siri transactions over display and textual advertising is the fact that what’s transacted doesn’t have to be money. It could be discounts, Passbook coupons, frequent flyer miles, virtual goods, leader-board rankings, check-in credits, credit card points, iTunes gifts, school course credits and so on. Further, Siri doesn’t even need an interactive screen to communicate and complete tasks. With Eyes Free, Apple’s bringing Siri to voice-controlled systems, first in cars, then perhaps to other embedded environments that don’t need a visual UI. Apple having the largest and the most lucrative app and content ecosystem on the planet with half a billion users with as many credit card accounts would make the nature of Siri “transactions” an entirely different value proposition to both users and commercial entities.

Siri, too early, too late or merely in progress?

And yet with all that promise, Siri’s future is not a certain one. A few potential barriers stand out:

  • Performance — Siri works mostly in the cloud, so any latency or network disruption renders it useless. It’s hard to overcome this limitation since domain knowledge must be aggregated from millions of users and coordinated with partners’ servers in the cloud.
  • Context — Siri’s promise is not only lexical, but also contextual across countless domains. Eventually, Siri has to understand many languages in over 100 countries where Apple sells iOS devices and navigate the extremely tricky maze of cultural differences and local data/service providers.
  • Partners — Choosing data providers, especially overseas, and maintaining quality control is nontrivial. Apple should also expect bidding wars for partner data, from Google and other competitors.
  • Scope — As Siri becomes more prominent, so grow expectations over its accuracy. Apple is carefully and slowly adding popular domains to Siri coverage, but the “Why can’t Siri answer my question in my {esoteric field}?” refrain is sure to erupt.
  • Operations — As Siri operations grow, Apple will have to seriously increase its staffing levels, not only for engineers from the very small semantic search and AI worlds, but also in the data acquisition, entry and correction processes, as well as business development and sales departments.
  • Leadership — Post-acquisition, two co-founders of Siri have left Apple, although another one, Tom Gruber, remains. Apple recently hired William Stasior, CEO of Amazon’s A9 search engine, to lead Siri. Siri needs as much engineering attention as data partnership building, however, and Stasior’s A9 is an older search engine different from Siri’s semantic platform.
  • API — Clearly, third party developers want and expect Apple someday to provide an API to Siri. Third party access to Siri is both a gold mine and a minefield, for Apple. Since same/similar data can be supplied via many third parties, access arbitrage could easily become an operational, technical and even legal quagmire.
  • Regulation — A notably successful Siri would mean a bevy of competitors likely to petition DoJ, FTC, FCC here and counterparts in Europe to intervene and slow down Apple with bundling/access accusations until they can catch up.

Obviously, no new platform as far-reaching as Siri comes without issues and risks. It also doesn’t help that the two commercial online successes Apple has had, iTunes and App Store, were done in another era of technology and still contain vestiges of many operational shortcomings. More recent efforts such as MobileMe, Ping, Game Center, iCloud, iTunes Match, Passbook, etc., have been less than stellar. Regardless, Siri stands as a monumental opportunity both for Apple as a transactional money machine and for its users as a new paradigm of discovery and task completion more approachable than any we’ve seen to date. In the end, Siri is Apple’s game to lose.


Slate iphone5

In what passes as technology journalism, 3 months = 180° turn. (Why this particular author changed his heart, brain, spleen and testosterone level for this particular story is a matter for another day.) What is worrisome here is that such fickleness of opinion has become excruciatingly common in online journalism. It pays to shout, shout first, shout often, shout loud, shout different, but most familiarly, just shout.

Shouting sells. We’ve known this for a long time. If companies are daft enough to let their ad buyers talk them into spending money on those who shout the most, then publishers would be reckless to leave money on the table. Some publishers say they would like to steer their publications away from yellow journalism, but in a compensation system based solely on pageviews and clicks, they are beholden to a Romneyesque principle: “We’re not going to let our campaign be dictated by fact-checkers.”

It’s far less important how one author feels about the iPhone 5 than the alarming fact that Slate let this author publish a 1,200-word essay about a device he hadn’t used, nearly three months before it shipped. Why? Because shouting creates pageviews and clicks, and…well, there’s nothing more to say: shouting sells. If this author or another wants to be in the game, sooner rather than later, he or she will have to start shouting, louder and louder.

Paradoxically, some of the most thoughtful people around work in journalism. And yet all efforts of transition from print-based to online publishing without reliance on pageviews and clicks have essentially flopped. The current crop ranges from VC-supported publicity outlets masquerading as online newsdailies to those whose contribution to civilization stops at copy-and-paste aggregation in a slide show.

While what’s new may not be fully satisfying, there’s no going back to the old either. Regardless, all around the world and especially in Europe there are calls to subsidize old print by taxing new tech:


Mind you, these aren’t really calls to incentivize companies to create new models of service delivery online but to subsidize and sustain their existing operating structures during transition to an online regime that expects them to inevitably adopt, yes, pageview advertising for survival.


Nobody likes advertising, and yet we seem to be stuck with its corrupting effects on public discourse online. It corrupts news delivery, Facebook privacy, Twitter flow, Google search, Kindle reading and so on. There doesn’t seem to be any way to make profits online, or often just survive, without pageviews and clicks, and all the shouting that entails.

Sadly, publishing is not the only industry suffering the ravages of transition to digital. We want better and cheaper telephony, faster and more ubiquitous Internet access, digitally efficient health care, on-demand online education, 21st century banking, always-available music, TV and movies…

We believe the future is fully digital, and the future is now. And yet experiments with new digital models not based on advertising, at a scale that matters, have not been successful. Entrenched players spend hundreds of millions to maintain their regulatory moats and leverage their concentrated distribution power. In Canada, just three publishing groups own 54% of newspapers. If allowed to merge, Universal and EMI would control 51 of 2011’s Billboard Hot 100 songs. Six Hollywood studios account for well over 3/4 of the market. AT&T and Verizon alone have over 440,000 employees. Predictably, the FCC remains the poster child of regulatory capture.

The un-digital camp is far from relinquishing its power. Models that can replace it aren’t here. Advertising online has been corruptive of user privacy and editorial integrity. I’m afraid it’ll be a miracle if the shouting subsides anytime soon.

A Memory Hole

I am a phlegmatic man. But once, just once, I want to wake up and invent a new design philosophy, and acronymize it so sublimely even a sixth-grader can instantly grasp its exultation of the human spirit:


I want to shout down from the rooftops — especially from the rooftop of what was once the largest computer vendor in the world — making sure every soul hears it, even the Proles:


I want to get on every telescreen to explain The Theory and Practice of MUSE Design Philosophy:


I want to show everyone how hard our team worked:


Then, right after a Two Minutes Hate, I want to take the stage, hold the fruits of our labor in my hand and let everyone soak in its glory:


Yes, there will be doubters. And there will be haters. But we will deal with them…in Room 101:


In the fullness of time, there will be learning, there will be understanding, and there will be acceptance. One unperson after another.

One bright cold day in September when the clocks strike thirteen, I will come back and reassure everyone that we do what we do for the greater good.


Spirit of Siri at Apple 25 years ago


As a budding standup comedienne, Siri opened Apple’s WWDC 2012 Monday morning and concluded her act with the prophetic:

It’s really hard for me to get emotional, because as you can tell, my emotions haven’t been coded yet.

Clearly, Siri is a work in progress and she knows it. What others may not know, though, is that while Siri is a recent star in the iOS family, her genesis in the Apple constellation goes far back.

The Assistant and Assist

Nearly three decades ago, fluid access to linked data displayed in a friendly manner to mere mortals was an emerging area of research at Apple.

Samir Arora, a software engineer from India, was involved in R&D on application navigation and what was then called hypermedia. He wrote an important white paper entitled “Information Navigation: The Future of Computing.” In fact, working for Apple CEO John Sculley at the time, Arora had a hand in the making of the 1987 “Knowledge Navigator” video — Apple’s re-imagining of human-computer interaction into the future:


Unmistakably, the notion of Siri was firmly planted at Apple 25 years ago. But “Knowledge Navigator” was only a concept prototype, Apple’s last one to date. Functional code shipped to users along the same lines had to evolve gradually over the next few years.

After the “Knowledge Navigator,” Arora worked on important projects at Apple and ran the applications tools group that created HyperCard and 4th Dimension (one of the earliest GUI-based desktop relational databases). The group invented a new proprietary programming language called SOLO (Structure of Linked Objects) to create APIs for data access and navigation mostly for mobile devices.

In 1992, Arora and the SOLO group spun off from Apple as Rae Technology, headquartered on the Apple campus. A year later, Rae Assist, one of the first Personal Information Managers (PIMs), was introduced. Based on the 4th Dimension database, Assist combined contact management, scheduling and note taking in an integrated package (automatically linking contact and company information, categorizing scheduled items, etc.) for PowerBook users on the go. Although three versions of Assist were released in the following two years, Rae didn’t make any money in the PIM business. But as Rae also worked with large enterprise customers like Chevron and Wells Fargo on database-centric projects, the company realized the SOLO frameworks could also be used to design large-scale commercial websites:

SOLO is based on a concept that any pieces of data must accommodate the requirement of navigation and contextual inheritance in a database environment. In layman terms, it means that every piece of text, graphics and page is embedded with an implicit navigation framework based on the groupings or order in which the items are organized. In other words, a picture, which is a data object, placed in this programming environment will automatically know the concept of ‘next’ and ‘previous’ without having to write an explicit line of code. This simplifies the coding process. Since the information and business logic organization models were already completed for the client-software, converting this to a web application was simply a recompilation of the codes for a different delivery platform. The project was completed within four weeks and we were stunned as to how simple it was. This was an important validation point illustrating the portability of our technology for cross-platform development.

It wasn’t long before we realized that SOLO, a technology based on information organization models, could be adapted and modified for an application to build web sites. A prototype was developed immediately and soon after a business plan was developed to raise venture funding. NetObjects was founded.
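SOLO itself was proprietary and I have no sample of it to show, but the idea in the quote, that an object picks up “next” and “previous” from the grouping it sits in rather than from hand-written code, is easy to approximate. The `LinkedGroup` class below is my own Python analogy, not SOLO syntax, and the file names are made up.

```python
class LinkedGroup:
    """Ordered group whose members get 'next' and 'previous' from position.

    An analogy for the implicit navigation described in the quote; the
    items themselves carry no navigation code at all.
    """

    def __init__(self, items):
        self._items = list(items)

    def next(self, item):
        """Return the item after `item`, or None at the end of the group."""
        i = self._items.index(item)
        return self._items[i + 1] if i + 1 < len(self._items) else None

    def previous(self, item):
        """Return the item before `item`, or None at the start of the group."""
        i = self._items.index(item)
        return self._items[i - 1] if i > 0 else None


album = LinkedGroup(["cover.png", "map.png", "team.png"])
print(album.next("cover.png"))
print(album.previous("cover.png"))
```

Drop the same picture into a differently ordered group and its neighbors change without the picture knowing anything about it, which is presumably what made moving such an organization model from a client app to a website feel like a recompilation.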

Rae quickly applied for patents for website design software and transferred its technology IP to NetObjects. With seed money and the core team from Rae, NetObjects had a splashy entry into what later came to be known as Content Management Systems (CMS). Unfortunately, the rest was rough going for the fledgling company. Not long after IBM invested about $100M for 80% of NetObjects, the company went public on NASDAQ in 1999. Heavily dependent on IBM, NetObjects never made a profit and it was delisted from NASDAQ. IBM sold it in 2001.

Outside Apple, SOLO traveled a meandering path into insignificance. Rae Technology became a venture capital firm and NetObjects eventually atrophied.

Flying through WWW

Only three years after the SOLO group left Apple for Rae, Ramanathan V. Guha, a researcher in Apple’s Advanced Technology Group, started work on the interactive display of structured, linkable data, from file system hierarchy to sitemaps on the emerging WWW. Guha had earlier worked on CycL knowledge representation language and created a database schema mapping tool called Babelfish, before moving to Apple to work for Alan Kay in 1994.

His new work at Apple, Project X (HotSauce, as it was later called), was based on 3D representation of data that a user could “fly through” and Meta-Content Format (MCF), a “language for representing a wide range of information about content” that defined relationships among individual pieces of data. At an Apple event at the time, I remember an evangelist telling me that HotSauce would do for datasets what HTML did for text on the web.


Apple submitted MCF to IETF as a standard for describing content and HotSauce (with browser plugins for Mac OS and Windows) found some early adopters. However, shortly after Steve Jobs’ return in 1997, it was a casualty of the grand house cleaning at Apple. Guha left Apple for Netscape, where he helped create an XML version of MCF, which later begot RDF (W3C’s Resource Description Framework) and the initial version of RSS standards.

It’s the metadata, stupid!

Even in its most dysfunctional years in the mid-1990s, Apple had an abiding appreciation of the significance of metadata and the relationships among its constituent parts.

SOLO attempted to make sense of a user’s schedule by linking contacts and dates. HotSauce allowed users to navigate faceted metadata efficiently and with some measure of fun to find required information without having to become a data architect. The Assistant in the “Knowledge Navigator” had enough contextual data about its master to interpret temporal, geo-spatial, personal and other contextual bits of info to draw inferential conclusions to understand, recommend, guide, filter, alert, find or execute any number of actions automatically.

There is an app for that

A decade later, Apple was now in need of technology to counter Google’s dominance in search-driven ad revenue on its iOS platform. A frontal assault on Google Search would have been silly and suicidal, notwithstanding the fact that Apple had no relevant scalable search technology. But there was an app for that. And it was called Siri.


Siri was a natural language abstraction layer accessed through voice recognition technology from Nuance to extract information from primarily four service partners: OpenTable, Google Maps, MovieTickets and TaxiMagic. Siri was on the iPhone first but it was headed to BlackBerry and Android. Apple bought Siri on April 28, 2010 and that original app was discontinued on October 15, 2011. Now Siri is a deeply embedded part of iOS.

Of course, the Siri code and the team came to Apple from an entirely different trunk of the semantic forest, from SRI International’s DARPA-funded Artificial Intelligence Center projects: Personalized Assistant that Learns (PAL) and Cognitive Assistant that Learns and Organizes (CALO), with research also conducted at various universities.

What made Siri interesting to Apple wasn’t the speech recognition or the simple bypassing of browser-based search, but the semantic relationships in structured and linkable data accessed through natural language. It was SOLO redux at scale and HotSauce cloaked in speech. It wasn’t meant to compete with Google in search results but to provide something Google couldn’t: making contextual sense.

Unlike Google, Siri knows, for example, what “wife” or “son’s birthday” means and can thus provide, not a long list of departures for further clicks, but precise answers. Siri delivers on the wildest dreams of SOLO and HotSauce of an earlier generation. In two years, even as limited to just a few service partners, Siri progressed far more than the developers of SOLO or HotSauce could have imagined. It now speaks the vast majority of the world’s most prominent languages, with connections to local data providers around the globe.

Having intimate conversations with Samuel Jackson and John Malkovich, Siri has become a TV star. Most iOS users already think Siri has a personality, if not an attitude altogether. Hard to say what will happen when she actually gets her “emotions coded.”