Rise of the machines (decision making when milliseconds matter)
One of the benefits of being part of a seed-stage fund is that we often get to see the next big trends before they become widely known. For example, Josh has a well-known thesis on data exhaust and the implicit web, in which he identifies the value of the data trail we all leave behind. This thesis supported multiple investments, including our investment in Mint.
[Image caption: In the 200 milliseconds it takes you to draw your gun, my algorithmic gunslinger could make 39 stock trades and shoot you in the heart]
I believe we are now seeing a change in data processing and decision-making that will be equally significant for investors and entrepreneurs.
Historically, data analysis and computation were done on log files and stored data pools. In these types of businesses, the data decisioning was done “out of band”, that is, not in real-time. However, we’ve now started to see a whole series of applications and businesses where data analysis and decisioning are happening “in band” on streams of data. In these applications, milliseconds matter. Massive data analysis and computation are being performed in real-time, and the user’s experience is affected by this analysis.
This is a fundamental change. Humans do not operate in milliseconds. For the real-time web to function, the human decisions have to occur before the clock starts. We need to focus on predictive analysis and on algorithms that make rule-based decisions for us, informed by the data stream.
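To make the idea of "human decisions before the clock starts" concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how any particular company does it: the rule names, the event fields (`cart_total`, `visits`) and the actions are all hypothetical. The point is that people write the rules ahead of time, and the machine simply applies them to each event as it arrives in the stream.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Rule:
    """A decision a human made in advance (hypothetical structure)."""
    name: str
    matches: Callable[[Dict], bool]  # predicate evaluated on each event
    action: str                      # what to do when the predicate holds


# Written "out of band" by people, before any event arrives.
# Field names and thresholds are illustrative assumptions.
RULES: List[Rule] = [
    Rule("high_value_cart", lambda e: e.get("cart_total", 0) >= 100, "show_free_shipping"),
    Rule("returning_visitor", lambda e: e.get("visits", 0) > 1, "show_recommendations"),
]
DEFAULT_ACTION = "show_default"


def decide(event: Dict, rules: List[Rule] = RULES) -> str:
    """Return the action of the first matching rule; no human in the loop."""
    for rule in rules:
        if rule.matches(event):
            return rule.action
    return DEFAULT_ACTION


def decide_stream(events: Iterable[Dict]) -> Iterator[Tuple[Dict, str]]:
    """Apply the pre-made rules "in band", one event at a time."""
    for event in events:
        yield event, decide(event)
```

Feeding it a handful of events, for example `list(decide_stream([{"cart_total": 150}, {"visits": 3}, {}]))`, pairs each event with an action the moment it is seen. A real system would add a latency budget and predictive scoring, which this sketch leaves out.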
The Facebook Newsfeed (which already has machine intervention) and the Twitter Stream are the most frequently noted “data streams”, but the auto-generated data created by every consumer action, ad impression and click is orders of magnitude larger. Add the data streams generated by CDNs and ISPs and you can see the exponential nature of the decision requirements in this new streaming data world.
The first vertical to move to real-time is advertising (Spark Capital’s Mo Koyfman has a nice summary of the shift in online advertising on his blog). It is not surprising that advertising is the first industry to move in this direction: it is the industry most similar to the financial markets, where over the past ten years the percentage of equity trades on US exchanges driven by algorithms has grown to over 70%. This was the insight behind AppNexus and Invite Media, and it may explain why some of the first guys to envision a transparent market for display inventory and real-time bidding came out of a finance school.
In our portfolio I see the power of operating in stream. Ad insertion order compliance can now happen in real-time, based on contextual data streams analyzed by DoubleVerify. VigLink can identify unaffiliated links across the web and not only append an affiliate code to the link, but also choose the profit-maximizing link in each instance, in real-time. Knewton’s testing platform, built on its real-time adaptive education platform, gives each student a personalized test-prep experience. Aggregate Knowledge automatically produces personalized and dynamic creative for advertisements by using real-time algorithms. In milliseconds, Monetate applies a specific set of merchandising rules to individual consumer data streams. The result is a unique shopping experience for each visitor to an e-commerce site.
As we move from a world of data pools to data streams, and processing power is distributed to the edge, what other changes will take place now that milliseconds matter? Will infrastructure changes take place as well? Will the real-time web increase the value of a millisecond enough to force companies to co-locate their algorithms at the CDN site or even at the endpoint device level?
I would love to discuss it in the comments, @phineasb or phin@firstround.com