How Hambrecht Uses Machine Learning to Up its Venture Hit Rate 3X
Venture capital is a “hit business.” The relatively few hits pay for the high percentage of flops. Venture capitalists are risk takers who embrace the high-risk/high-reward equation. It's the allure of a mega-hit that animates the industry, that keeps investors dreaming, prospecting and writing checks.
But is frequent venture failure inescapable? Could a more probabilistic approach based on big data, machine learning and advanced computing help investors do better?
Thomas Thurston, a data scientist and CTO of venture firm W.R. Hambrecht, is someone who “look(s) at things the way they are, and ask(s) why?”* Speaking in New York recently at Tabor Communications’ HPC & AI on Wall Street conference, Thurston discussed his decade-plus quest to take on venture's immutable laws. His goal: to build a data-driven strategy for investment discovery and assessment that attacks the industry's high flop rate.
The conventional approach consists of building a personal network that unearths investment possibilities, listening to entrepreneurs talk about their startups, deciding who gets funding and then keeping tabs (riding herd) on companies in the fund.
This means meetings, meetings and more meetings, many of which prove fruitless. The feature of these meetings, of course, is the entrepreneur’s funding pitch. One might assume these sessions resemble an economics seminar at business school or a meeting of the Federal Reserve. Instead, they imitate “Shark Tank.” Yes, there’s a business case with balance sheets, total available market sizing and growth projections, but there also are strong elements of salesmanship, personal chemistry and PowerPoint panache. Question: is an entrepreneur's charisma predictive of a good investment? Or is it, as Thurston contends, a “substitution error” (see below)?
Thurston himself used to work within venture's conventional discovery methodology at Intel Capital, that company’s venture, investment and M&A arm. Common practice, he said, is for members of the investment committee to listen to a startup's funding pitch and then, after the entrepreneurs have left the room, go around the table sharing impressions.
“They say what… they liked, what they didn’t like, and… I’d sit at some of these tables and I just remember feeling it was kind of random…, and then whoever was the boss would just decide whatever that person wanted to do,” said Thurston. “The whole process started to seem very arbitrary to me..., and it’s wrong 70 percent of the time. It's a strange way of allocating capital that hasn’t worked for a hundred years.”
Perhaps Thurston's willingness to challenge venture’s precepts came out of his earlier experiences with data and supercomputing-class simulations. In college, he was a political science major who studied public policy outcomes based on statistical analysis. Later he worked in high performance computing for Intel (before moving to Intel Capital), and it was there he saw the power of HPC to simulate scientific problems on a massive scale. At Intel Capital he began forming ideas of using data, HPC and complex business models to analyze investment decisions.
“If you think about the billions of dollars being spent in venture capital and innovation, and the hit rate was so low…,” he said, “I thought, ‘Gosh, can we get new insights through simulation on a more statistical basis, a more algorithmic approach?’ That’s what started all this.”
Thurston said most VCs did not, and still don’t, respond positively to this approach. He often ran into the argument that “there are just too many variables in business, it’s intractable, you could never model all of the things.” But Thurston countered that business “is no more multivariate or complex than the cosmos, weather systems, cellular biology, nuclear explosions or other data-intensive things we model (in HPC) all the time. There’s no reason why business is more complex than physics, so certainly we can get some (investment) insight by using models. That was the idea.”
In his last year at Intel, Thurston began building a data set to look for patterns, then he spent a year at Harvard and began working with Bill Hambrecht on a data-driven approach to finding and evaluating venture opportunities.
Hambrecht, now 84, made his first venture deal more than 60 years ago and has invested in, among others, Adobe, Amazon, Apple, Google, Intel, Nvidia, Overstock.com, Pixar and Salesforce. In the 1990s, Hambrecht collaborated with Harvard Professor Clayton Christensen, whose The Innovator's Dilemma launched the influential “Disruption Theory” of market dynamics and innovation. W.R. Hambrecht built its investment strategy based on Christensen's research, leading to development of the firm’s Market Exaptation Simulation Engine (MESE) big data platform that Thurston and his group have worked on for more than a decade.
In fact, of Hambrecht’s 15 people, nine are on Thurston’s technical team. “Functionally, there’s more of us than any other single type of role,” he said, “which is very strange for a venture capital firm.”
MESE is highly proprietary – “We’d never let another venture capitalist come within a million miles of our code, it’s our proprietary advantage,” said Thurston – but to the extent he’s willing to reveal its workings, the system is a case study in how data can augment, or replace, venture’s traditional ways.
MESE is the result of endless toil, trial and error, a triumph of human will over tedium. Along with building out the investment simulation model, developing data sources – strands of data both dark and unstructured floating around the internet – that feed the MESE engine took years of development and remains ongoing.
“We can’t buy the database we need, we can’t find any aggregators that give us this data,” Thurston said, “so we actually have to use our own bots that go out there. Using AI, we’re able to figure out where those different data are that are helpful when used and aggregated in the right combinations.”
Not to be confused with the dark web, dark data is public, unformatted data on the internet; no hacking or firewall penetration is involved, Thurston said. Comprising about 90 percent of data on the internet, dark data is growing by roughly 60 percent a year.
“Most of the data out in the world is just sitting there and nobody sees the value in it, and for the most part we agree, a lot of it is just garbage,” Thurston said, declining to provide specifics about Hambrecht's data sources. “But we’ve found a few little gems in there, about 40 different things we can aggregate and fetch on call that, when combined in the right ways with the right coefficients, are actually quite helpful. They give us a really good indication of which solutions are gaining or losing relevant traction in their markets.”
How does the system work?
“We use very massive surveillance, the use of bots to harvest data to give us a sense of who’s winning in any market we’re interested in,” said Thurston. “Then we have big actuarial models that are very HPC in style, kind of like a multivariate weather simulation where we’ll create a marketplace populated with that company we're looking at, and all its competitors and data about the customers. Then we run iterative simulations where essentially the companies compete with each other to see who wins inside a virtual environment. If our company wins over 60 percent of the time in simulation, then we’ll move on to real due diligence” – i.e., Hambrecht investigates the simulation winner and, when warranted, reaches out to meet entrepreneurs themselves.
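MESE itself is proprietary and its internals are not public, so the following is only a toy sketch of the general idea Thurston describes: repeat a noisy head-to-head competition many times and count how often the target company comes out on top. The function name, the "traction score" inputs, the Gaussian noise model and the sample numbers are all assumptions made for illustration; only the 60 percent bar comes from the quote.

```python
import random

def simulate_win_rate(scores, target, runs=10_000, noise=1.0, seed=0):
    """Estimate how often `target` finishes first across noisy market runs.

    `scores` maps company name -> hypothetical traction score (an assumption,
    not a MESE input). Each run adds Gaussian noise to every score and the
    highest perturbed score wins that run.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    wins = 0
    for _ in range(runs):
        outcome = {name: s + rng.gauss(0.0, noise) for name, s in scores.items()}
        if max(outcome, key=outcome.get) == target:
            wins += 1
    return wins / runs

# Hypothetical scores, invented for this example.
market = {"startup": 3.0, "rival_a": 1.0, "rival_b": 0.5}
win_rate = simulate_win_rate(market, "startup")
advance = win_rate > 0.60  # the 60 percent bar Thurston cites
print(f"win rate: {win_rate:.2f}, advance to due diligence: {advance}")
```

A real system would presumably model customers and competitive responses rather than a single score per company; the point here is only the repeat-and-count structure behind a "wins over 60 percent of the time" rule.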
Thurston said MESE, which runs on AWS, lets the firm identify targeted companies and look “thematically” at different markets. “We do this daily and we can pull up all the companies in those segments globally and see how they’re growing or declining,” he said, and they do it “all through the data we’ve learned to harvest and curate using bots because none of these startups are publicly disclosing much if not any data about their financial or commercial status. They’re all private little startups and nobody knows how they’re doing. But our system is able to if we can point our system at a certain segment of the market, like gut microbiome or some new SaaS building technology or whatever the case may be. We can see who’s out there globally and more importantly who’s winning. That’s the game changer.”
The result, according to Thurston: a 3x improvement over the firm’s previous, conventional approach to discovery and assessment.
The work assignment for Thurston and his technical team is to continually hunt for data and to refine MESE’s simulation engine.
“All we do every day is look for ways to improve our capabilities and our systems, we run them and work to improve them, we run experiments all the time, sometimes several a day, trying things, hypotheses, a whole queue of development,” he said. “Some are incremental, to add effectiveness or accuracy or functionality to what we have, some are brand new blue-sky development projects where we’re trying whole new hypotheses. We’ve been doing nothing but this for 12 years.
“It’s like Thomas Edison trying 10,000 things, and then maybe one works,” said Thurston. “But in a field like this, even one thing working gives you a tremendous advantage. And at this point, we have a number of tools that are very good at the specific things that we can combine when looking at a specific investment decision.”
In Thurston’s view, MESE does more than improve Hambrecht’s hit rate. It also lets the firm do an end-run around venture’s conventional, meeting-intensive and geographically limited approach.
In a word, MESE allows Hambrecht to scale.
“Most venture capitalists, if they want to choose their favorite 20 or 30 companies to invest in, they have to scan a thousand companies,” Thurston said, “so that's a thousand meetings with investors to hear their pitches. It takes at least a year to sit through a thousand pitches.”
Instead, Hambrecht can scan hundreds of companies a day and identify companies that appear to be performing well – this comprises roughly 5 percent of the startups scanned “that are highly correlated and predictive of the kind of success we need in our fund.” Those companies then are processed through a second and larger simulation, “more of an actuarial, predictive simulation on just those few companies.” Companies that emerge successfully then go through Hambrecht human due diligence, meeting with the investment committee and so forth.
This three-step process enables the firm to cast a wider net, to examine more startups while spending less time in meetings.
“We get the best of both worlds,” Thurston said, “we’re sourcing an order of magnitude more companies on the intake side, but we can quickly winnow them down to a tiny number that are already high performing with really high odds (of success). So when we do human due diligence it's very focused and a very high percentage of those companies at that stage make it to getting funded. It’s nice to sit down and look at a startup knowing it's already winning…, there are a lot of questions you don’t have to worry about, you can focus on a few things that might be specific risk factors and you can fine tune some of the areas where you might want to look closer.”
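The funnel described above (broad scan, keep roughly the top 5 percent, run a deeper simulation, then hand the survivors to humans) might be sketched as follows. This is not Hambrecht's code. The function, the field names and the scoring callbacks are hypothetical, and the 0.60 pass bar for the second stage is an assumption borrowed from the simulation threshold Thurston mentions, since this passage does not state one.

```python
def run_funnel(companies, screen_score, simulate, top_fraction=0.05, win_bar=0.60):
    """Return the companies that survive both automated stages.

    screen_score: callable giving a cheap first-pass score (hypothetical).
    simulate:     callable giving a simulated win rate in [0, 1] (hypothetical).
    """
    # Stage 1: rank everything scanned and keep roughly the top 5 percent.
    ranked = sorted(companies, key=screen_score, reverse=True)
    keep = max(1, round(len(ranked) * top_fraction))
    shortlist = ranked[:keep]
    # Stage 2: run the more expensive simulation on the shortlist only.
    # Stage 3 (human due diligence) starts from whatever this returns.
    return [c for c in shortlist if simulate(c) > win_bar]

# Invented sample data: 100 companies with made-up traction and win rates.
companies = [{"name": f"co{i}", "traction": i, "win_rate": i / 100} for i in range(100)]
finalists = run_funnel(
    companies,
    screen_score=lambda c: c["traction"],
    simulate=lambda c: c["win_rate"],
)
print([c["name"] for c in finalists])
```

Running the expensive step only on the shortlist is what lets the intake side grow by an order of magnitude without a matching increase in meetings.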
Asked if MESE has uncovered counterintuitive factors for venture success, Thurston said the system has shown that the better mousetrap proverb is usually invalid. That is, start-ups that produce a better-performing product in a mature product category, in which there are larger, mature players, fail 86 percent of the time.
Why?
The competitive response from existing market share leaders. “Once the new entrant starts to take away customers, the big incumbents have to respond very aggressively and tend to have the market dominance and reach to do that,” he said. “Their odds were terrible, so it’s not that they couldn’t have won, but it was just a 14 percent probability, and we’re not going to invest on those odds. You can’t bet against the average and consistently hope to beat the average.”
Another non-predictive factor for start-up success, Thurston said, is the company’s initial team.
“Most of what we’ve found empirically is very counterintuitive to what we, even I, thought before I got into the data side,” Thurston said. “I also believed that teams explained everything. As a human being… it seems the people are always screwing things up, that’s a very intuitive way to look at business and what tends to work.”
Especially with an early-stage business, the people who comprise it are “all you have, they may not have a product that works yet and all you can see and interact with is human beings.”
But MESE has determined “statistically we can’t pin more than 12 percent of the variance in outcome on the team, so it doesn’t account for 88 percent of the variance no matter how you define teams or outcomes. So teams matter, but in terms of prediction there’s almost no way of slicing or dicing a team that’s awfully useful.”
Which is to say no matter how much members of an investment committee may like an entrepreneur, it’s not in itself a strong factor likely to point a path to future riches.
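To put the 12 percent figure in perspective, here is a small worked calculation. It assumes the figure can be read as a coefficient of determination (R squared) from a single-predictor linear model; the article does not say how MESE actually measures the team's contribution, so this is an interpretation rather than the firm's method.

```python
import math

r_squared = 0.12                 # share of outcome variance attributed to the team (from the quote)
unexplained = 1 - r_squared      # share left to everything else
# Under the single-predictor assumption, |correlation| is the square root of R squared.
implied_correlation = math.sqrt(r_squared)

print(f"unexplained share: {unexplained:.2f}")
print(f"implied |correlation|: {implied_correlation:.2f}")
```

Under that reading, a team measure would correlate with outcomes at only about 0.35, which fits Thurston's point that teams matter but predict little on their own.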
“Bill Hambrecht was an early investor in Apple…,” he said, “he and Steve Jobs had a close relationship but (Hambrecht) is the first one to tell you nobody using today’s criteria of what an investable entrepreneur is would have said Jobs was an investable entrepreneur, he just wasn’t out of that central casting idea of today.” The same applies to Bill Gates, who as a young man also “seemed awkward,” Thurston said. “None of the greats were cool kids, all the greats didn’t fit the mold that we use today.”
Then what is predictive of startup success?
For that, you need to brush up on Christensen’s Disruption Theory. Related to “creative destruction,” it boils down to creation of innovations that form new markets, that disrupt existing industries, replacing market leading firms, products, supply chains and business networks.
The extent that venture capitalists rely on their personal sense of an entrepreneur’s charisma, leadership and brilliance, Thurston said, is “a substitution error.”
“You don’t know if the business is going to succeed," he said, "so you substitute it for an easier question, which is ‘Who do I like?’”
* George Bernard Shaw: "There are those who look at things the way they are, and ask why? I dream of things that never were, and ask why not?"