The team behind Startup Victoria held the inaugural Above All Human conference in Melbourne last week, co-directed by Susan Wu and Bronwen Clune, and MC’d by futurist Mark Pesce. If there was a single, overarching theme to the day, I would sum it up as: don’t overlook the human component in what you do.
Whether you are a startup founder or investor, defining your purpose is not enough; building an innovative, successful and sustainable business also takes considerable self-awareness, along with curiosity, risk-taking, resourcefulness, empathy, creativity, resilience, perception, drive, reflection, vision, perseverance, passion, luck and critical thinking….
The program featured an interesting mix of established, experienced and emerging startup entrepreneurs and experts, and we were treated to a broad range of themes, including:
bringing financial services to the “unbanked” world;
the importance of design;
building startup platforms and ecosystems;
the power of storytelling;
challenging gender bias in the tech sector;
the potential of mass customisation;
understanding the value of an accelerator program;
the ethics of driverless cars;
changing minds with technology; and
the wisdom of knowing when to give up the dream and move on to the next opportunity.
Aside from the plenary, Q&A and panel sessions, there were product demos and startup pitches, and the whole event offered a valuable learning opportunity for anyone interested in engaging with the local startup community, or curious about making connections between technology and the human condition.
Finally, it should be said that without Melbourne’s growing status as a global startup venue, the organisers would have been unable to attract such an impressive cohort of international speakers. This also reinforces Melbourne’s reputation as one of the world’s most livable cities (#1 or #11 depending on which list you are reading…).
Here’s the paradox facing the consumption and analysis of #BigData: the cost of data collection, storage and distribution may be decreasing, but the effort to turn data into unique, valuable and actionable insights is actually increasing – despite the expanding availability of data mining and visualisation applications.
One colleague has described the deluge of data that businesses are having to deal with as “the firehose of information”. We are almost drowning in data and most of us are navigating up river without a steering implement. At the risk of stretching the aquatic metaphor, it’s rather like the Sorcerer’s Apprentice: we wanted “easy” data, so the internet, mobile devices and social media granted our wish in abundance. But we got lazy/greedy, forgot how to turn the tap off and now we can’t find enough vessels to hold the stuff, let alone figure out what we are going to do with it. Switching analogies, it’s a case of “can’t see the wood for the trees”.
Perhaps it would be helpful to provide some terms of reference: what exactly is “big data”?
First, size definitely matters, especially when you are thinking of investing in new technologies to process more data more often. For any database smaller than, say, 0.5TB, the economics may dissuade you from doing anything other than deploying more processing power and/or storage capacity, rather than paying for a dedicated, super-fast analytics engine. (Of course, the situation also depends on how fast the data is growing, how many transactions or records need to be processed, and how often those records change.)
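To make that concrete (using purely hypothetical numbers, not a vendor benchmark), here is a rough back-of-envelope sketch in Python for estimating when a growing data set might cross that kind of threshold:

```python
# Back-of-envelope sketch (hypothetical numbers, not a vendor formula):
# project data growth forward to see when a dedicated analytics engine
# might start to look more attractive than simply scaling up.

def months_until_threshold(current_tb, monthly_growth_rate, threshold_tb=0.5):
    """Months until the data set crosses the threshold (None if it never will)."""
    if current_tb >= threshold_tb:
        return 0
    if monthly_growth_rate <= 0:
        return None
    months, size = 0, current_tb
    while size < threshold_tb:
        size *= 1 + monthly_growth_rate
        months += 1
    return months

# Example: 200GB today, growing at 8% a month, crosses 0.5TB in about a year.
print(months_until_threshold(0.2, 0.08))  # -> 12
```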
Second, processing velocity, volume and data variety are also factors – for example, unless you are a major investment bank with a need for high-frequency, low-latency algorithmic market trading solutions, you can probably make do with off-the-shelf order routing and processing platforms. Even “near real-time” data processing speeds may be overkill for what you are trying to analyse. Here’s a case in point:
Slick advertorial content, and I agree that the insights (and opportunities) are in the delta – what’s changed, what’s different? But do I really need to know what my customers are doing every 15 seconds? For a start, it might have been helpful to explain what APM is, i.e., application performance management (I had to Google it, and CA did not come up in the Top 10 results). Then explain what it is about the resulting analytics that NAB is now using to drive business results. For instance, what does it really mean if peak mobile banking usage is 8-9am (and did I really need an APM solution to find this out)? Is NAB going to lease more mobile bandwidth to support client access on commuter trains? Has NAB considered push technology to give clients account balances at scheduled times? Is NAB adopting technology to shape transactional and service pricing according to peak demand? (Note: when discussing this example with some colleagues, we found it ironic that a simple inter-bank transfer can still take several days before the money reaches your account…)
Third, there are trade-offs when dealing with structured versus non-structured data. Buying dedicated analytics engines may make sense when you want to do deep mining of structured data (“tell me what I already know about my customers”), but that might only work if the data resides in a single location, or in multiple sites that can easily communicate with each other. Often, highly structured data is also highly siloed, meaning the efficiency gains may be marginal unless the analytics engine can do the data trawling and transformation more effectively than traditional data interrogation (e.g., query and matching tools). On the other hand, the real value may be in unstructured data (“tell me something about my customers I don’t know”), typically captured in a single location but usually monitored only for visitor volume or stickiness (e.g., a customer feedback portal or user bulletin board).
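To illustrate that distinction (with invented data, and no particular tool in mind), here is a toy Python sketch contrasting a structured query you already know how to ask with a crude scan of unstructured feedback that can surface themes you weren’t looking for:

```python
# Toy contrast (invented data): structured data answers questions you
# already know how to ask; unstructured text can surface topics nobody
# thought to put in a schema.
from collections import Counter
import re

customers = [
    {"id": 1, "segment": "premium", "churned": False},
    {"id": 2, "segment": "basic", "churned": True},
]
feedback = [
    "The new login page keeps timing out on mobile",
    "Love the product, but the mobile app drains my battery",
]

# Structured: "tell me what I already know" is a simple filter and aggregate.
churn_by_segment = Counter(c["segment"] for c in customers if c["churned"])

# Unstructured: "tell me something I don't know" via crude term frequency,
# which here surfaces "mobile" as a recurring theme across unrelated comments.
words = re.findall(r"[a-z']+", " ".join(feedback).lower())
themes = Counter(w for w in words if len(w) > 4).most_common(3)

print(churn_by_segment, themes)
```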
So, to data visualisation.
Put simply, if a picture can paint a thousand words, data visualisation should be able to unearth the nuggets of gold sitting in your data warehouse. Our “visual language” is capable of identifying patterns as well as discerning abstract forms, of describing subtle nuances of shade as well as defining stark tonal contrasts. But I think we are still working towards a visual taxonomy that can turn data into meaningful and actionable insights. A good example of this is so-called sentiment analysis (e.g., derived from social media commentary), where content can be weighted and scored (positive/negative, frequency, number of followers, level of sharing, influence ranking) to show what your customers might be saying about your brand on Twitter or Facebook. The resulting heat map may reveal what topics are hot, but unless you can establish some benchmarks, distinguish between genuine customers and “followers for hire”, or identify other connections within this data (e.g., links with your CRM system), it remains an interesting abstract image, and can you really understand what it is saying?
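As a rough illustration of that kind of weighted scoring (the fields and weights below are my own assumptions, not a reference implementation of any sentiment tool):

```python
# Minimal sketch of a weighted sentiment score: per-post sentiment in the
# range [-1, +1], amplified by reach signals (followers, shares). The
# weighting scheme is an assumption for illustration only.

def weighted_sentiment(posts):
    total, weight_sum = 0.0, 0.0
    for p in posts:
        reach = 1 + p["followers"] / 1000 + p["shares"] / 10  # crude reach proxy
        total += p["sentiment"] * reach
        weight_sum += reach
    return total / weight_sum if weight_sum else 0.0

posts = [
    {"sentiment": 0.8, "followers": 500, "shares": 3},      # positive, modest reach
    {"sentiment": -0.6, "followers": 20000, "shares": 50},  # negative, big reach
]
print(round(weighted_sentiment(posts), 2))  # -> -0.51: reach drags the score negative
```

Even a toy score like this shows why benchmarks (and filtering out the “followers for hire”) matter: a handful of high-reach accounts can swing the blended result on their own.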
Another area where data visualisation is being used is targeted marketing based on customer profiles and sales history (e.g., location-based promotions using NFC solutions powered by data analytics). For example, with more self-serve check-outs, supermarkets have to re-think where they place the impulse-buy confectionery displays (and those magazine racks that were great for killing time while queuing up to pay…). What if they could scan items as you place them in your basket and, combined with what they already know about your shopping habits, map your journey around the store to predict what’s on your shopping list, prompting you via your smartphone (or the basket itself?) towards your regular items, and even saving you time in the process? Then they could reward you with a special “in-store only” offer on your favourite chocolate. It sounds a bit spooky, but we know retailers already do something similar with their existing loyalty cards and rewards programs.
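For what it’s worth, here is a hypothetical sketch of how such a prediction might work, simply scoring candidate items by how often they have co-occurred with what is already in the basket:

```python
# Hypothetical sketch: score candidate items by how often they have
# co-occurred with the current basket in past baskets. The data is invented
# and the approach is deliberately simplistic.
from collections import Counter
from itertools import combinations

past_baskets = [
    {"milk", "bread", "chocolate"},
    {"milk", "bread", "eggs"},
    {"bread", "chocolate"},
]

co_occurrence = Counter()
for basket in past_baskets:
    for a, b in combinations(sorted(basket), 2):
        co_occurrence[(a, b)] += 1
        co_occurrence[(b, a)] += 1

def suggest(current_basket, top_n=2):
    """Rank items not yet in the basket by co-occurrence with items that are."""
    scores = Counter()
    for item in current_basket:
        for (a, b), count in co_occurrence.items():
            if a == item and b not in current_basket:
                scores[b] += count
    return scores.most_common(top_n)

print(suggest({"bread"}))  # milk and chocolate rank highest
```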
Finally, what are some of the tools that businesses are using? Here are just a few that I have heard mentioned recently (please note I have not used any of these myself, although I have seen sales demos of some applications – these are definitely not personal recommendations, and you should obviously do your own research and due diligence):
For managing and distributing big data, Apache Hadoop was name-checked at a financial data conference I attended last month, along with kdb+ to process large time-series data, and GetGo to power faster download speeds. Python was cited for developing machine learning and even predictive tools, while DataWatch is taking its data transformation platform into real-time social media sentiment analysis (including heat and field map visualisation). YellowFin is an established dashboard reporting tool for BI analytics and monitoring, and of course Tableau is a popular visualisation solution for multiple data types. Lastly, ThoughtWeb combines deep data mining (e.g., finding hitherto unknown connections between people, businesses and projects via media coverage, social networks and company filings) with innovative visualisation and data display.
Next week: a few profundities (and many expletives) from Dave McClure of 500 Startups
In recent years, market data vendors and their clients have been fixated on meeting the demand for low-latency feeds to support high-frequency, algorithmic and dark pool trading, while simultaneously responding to the post-GFC regulatory environment. New regulations continue to place increased operating burdens and costs on market participants, with a current focus on know-your-customer (KYC) obligations, pre-trade analytics and benchmark transparency.
For banks and asset managers, the cost of managing data is now seen as being as big an issue as the cost of acquiring the data itself. Furthermore, the need to meet regulatory obligations at every stage of every client transaction is adding to operating expenses – costs which cannot easily be recovered, thereby diminishing previously healthy transactional margins.
I was in Hong Kong recently, and had the opportunity to attend the Asia Pacific Financial Information Conference, courtesy of FISD. This annual event, the largest of its kind in the region, brings together stock exchanges, data vendors and financial institutions. It has been a few years since I last attended, so it was encouraging to see that delegate numbers have continued to grow, although of the many stock exchanges in the region only a few had taken exhibition stands, and representation from buy-side institutions and asset managers was still comparatively low. However, many major sell-side institutions and plenty of vendors were in attendance, along with a growing number of service providers across data networking, hosting and management.
Speaking to delegates, it was clear that there is a risk of regulation overload: not just the volume, but also the complexity and cost of compliance. It also felt like, despite frequent industry consultation, there is limited co-ordination between the various market regulators, resulting in overlap between jurisdictions and duplication across different regulatory functions. Are any of these regulations having the desired effect, or are they simply creating unforeseen outcomes?
One major post-GFC development has been the establishment of a common legal entity identifier (LEI) for issuers of securities and their counterparties. (This was in direct response to the Lehman collapse, where market participants failed, or were unable, to correctly and accurately identify the counterparty risk in their trading portfolios, especially for derivatives such as credit default swaps.) However, despite a coordinated international effort, a published standard for the common identifier, and a network of approved LEI issuers, progress in assigning LEIs has been slow (especially in Asia Pacific), and coverage does not reflect market depth. For example, one data manager estimated that of the 20,000 reportable entities his bank deals with, only 5,000 had so far been assigned LEIs.
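As a side note for anyone handling this data, the LEI itself is a 20-character alphanumeric code whose last two characters are check digits (ISO 7064 MOD 97-10), so a basic sanity check is straightforward. Here is a minimal sketch (the format rule is standard, but the code itself is just an illustration):

```python
# Minimal sanity check for an LEI: 20 alphanumeric characters, the last two
# being ISO 7064 MOD 97-10 check digits (letters map to 10..35, and the
# resulting number mod 97 must equal 1). Illustrative only.
import re

def is_valid_lei(lei: str) -> bool:
    if not re.fullmatch(r"[A-Z0-9]{18}[0-9]{2}", lei):
        return False
    numeric = "".join(str(int(c, 36)) for c in lei)  # 'A' -> 10 ... 'Z' -> 35
    return int(numeric) % 97 == 1

# A genuine LEI should return True; arbitrary strings almost never will.
```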
Financial institutions need to consume ever more market data, for more complex purposes, and at multiple stages of the securities trading life-cycle:
pre-trade analysis (especially to meet KYC obligations);
trade transaction (often using best execution forums);
post-trade confirmation, settlement and payment;
portfolio reconciliation;
asset valuation (in the absence of mark-to-market prices, this means evaluated pricing, often requiring more than one independent source);
processing corporate actions (in a consistent and timely fashion, and taking account of different taxation rules);
financial reporting and accounting standards (local and global); and
a requirement to provide more transparency around benchmarks (and other underlying data used in the creation and administration of market indices, and in constructing investable products).
Yet with lower trading volumes and increased compliance costs, operating margins are inevitably being squeezed. This is probably hitting data vendors hardest, since data is increasingly seen as a commodity, and the cost of acquiring new data sets has to be offset against both the onboarding and switching costs and the cost of moving data around to multiple users, applications and locations.
The overloaded data managers from the major financial institutions said they wished stock exchanges and vendors would adopt more common industry standards for data licensing and pricing. Which seems reasonable, until you hear the same data managers claim they each have their own particular requirements, and therefore a “one size fits all” approach won’t work for them. Besides, whereas in the past, data was either sold on an enterprise-wide basis, or on a per-user basis, now data usage is divided between:
human users and machine consumption;
full access versus non-display only;
internal and external use;
“as is” compared to derived applications; and
pre-trade and post-trade execution.
Oh, and then there’s the ongoing separation of real-time, intraday, end-of-day and static data.
This all raises the obvious question: if more data consumption does not necessarily mean better margins for data vendors (despite the need to use the same data for multiple purposes), who is making money from market data?
While the stock exchanges are the primary source of market data for listed equities and exchange-traded securities, pricing data for OTC securities and derivatives has to be sourced from dealers, inter-bank brokers, contributing traders and order confirmation platforms. The major data vendors have done a good job over the years of collecting, aggregating and distributing this data – but now, with a combination of cost pressures and advances in technology, new providers are offering to help clients to manage the sourcing, processing, transmission and delivery of data. One conference delegate commented that the next development will be in microbilling (i.e., pricing based on actual consumption of each data item by individual users for specific purposes) and suggested this was an opportunity for a disruptive newcomer.
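To make the microbilling idea more tangible (the rate card and usage figures here are entirely invented), a sketch of what consumption-based billing could look like:

```python
# Invented sketch of consumption-based ("micro") billing: a rate card keyed
# by data item and purpose, applied to a per-user usage log. All figures
# are made up for illustration.
from collections import defaultdict

RATE_CARD = {  # price per access (assumed values)
    ("equity_quote", "display"): 0.001,
    ("equity_quote", "non_display"): 0.005,
    ("index_level", "display"): 0.002,
}

usage_log = [
    {"user": "trader_1", "item": "equity_quote", "purpose": "display", "count": 1200},
    {"user": "algo_engine", "item": "equity_quote", "purpose": "non_display", "count": 50000},
    {"user": "trader_1", "item": "index_level", "purpose": "display", "count": 300},
]

invoice = defaultdict(float)
for record in usage_log:
    rate = RATE_CARD[(record["item"], record["purpose"])]
    invoice[record["user"]] += rate * record["count"]

for user, amount in sorted(invoice.items()):
    print(f"{user}: ${amount:,.2f}")  # algo_engine: $250.00, trader_1: $1.80
```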
Finally, other emerging developments included the use of social media in market sentiment analysis (e.g., for algo-based trading), data visualisation, and the deployment of dedicated apps to manage “big data” analytics.
Games and social media apps may currently be generating the most downloads and revenue, but the real innovation in the online economy comes from the “Three Pillars”: Health, Finance and Education.
What these pillars have in common is:
clearly defined market verticals
well-established business models
life-long customer engagement
highly regulated operating environments
They are also industries that are continuously innovating, which makes them interesting bellwethers for what might emerge in other sectors of the economy.
However, they are not closely integrated vertical markets, and despite the regulatory barriers to entry, they remain vulnerable to disruptive technologies and new business models.
I’m reminded of the proverb “early to bed, early to rise, makes a man healthy, wealthy and wise” – we cannot afford to ignore what is going on in these industries, nor can we fail to understand the implications for each of them of what is going on elsewhere.
From mobile payment systems to wearable fitness devices, from Apple’s new “Health” app to massive open online courses, from peer-to-peer lending to shared health alerts, these sectors are responsible for (and responding to) significant changes in the online economy. Over the next few weeks I’ll be offering some personal observations on the trends, threats and lessons for each of the three pillars.
I encourage readers to contribute to the debate via this blog….