The New Alchemy – Turning #BigData into Valuable Insights

Here’s the paradox facing the consumption and analysis of #BigData: the cost of data collection, storage and distribution may be decreasing, but the effort to turn data into unique, valuable and actionable insights is actually increasing – despite the expanding availability of data mining and visualisation applications.

One colleague has described the deluge of data that businesses are having to deal with as “the firehose of information”. We are almost drowning in data and most of us are navigating up river without a steering implement. At the risk of stretching the aquatic metaphor, it’s rather like the Sorcerer’s Apprentice: we wanted “easy” data, so the internet, mobile devices and social media granted our wish in abundance. But we got lazy/greedy, forgot how to turn the tap off and now we can’t find enough vessels to hold the stuff, let alone figure out what we are going to do with it. Switching analogies, it’s a case of “can’t see the wood for the trees”.

Perhaps it would be helpful to provide some terms of reference: what exactly is “big data”?

First, size definitely matters, especially when you are thinking of investing in new technologies to process more data more often. For any database smaller than, say, 0.5TB, the economies of scale may dissuade you from doing anything other than deploying more processing power and/or capacity, as opposed to paying for a dedicated, super-fast analytics engine. (Of course, the situation also depends on how fast the data is growing, how many transactions or records need to be processed, and how often those records change.)

Second, processing velocity, volume and data variety are also factors – for example, unless you are a major investment bank with a need for high-frequency, low-latency algorithmic market trading solutions, then you can probably make do with off-the-shelf order routing and processing platforms. Even “near real-time” data processing speeds may be overkill for what you are trying to analyse. Here’s a case in point:

Slick advertorial content, and I agree that the insights (and opportunities) are in the delta – what’s changed, what’s different? But do I really need to know what my customers are doing every 15 seconds? For a start, it might have been helpful to explain what APM is (I had to Google it, and CA did not come up in the Top 10 results). Then explain what it is about the resulting analytics that NAB is now using to drive business results. For instance, what does it really mean if peak mobile banking usage is 8-9am (and did I really need an APM solution to find this out?) Are NAB going to lease more mobile bandwidth to support client access on commuter trains? Has NAB considered push technology to give clients account balances at scheduled times? Is NAB adopting technology to shape transactional and service pricing according to peak demand? (Note: when discussing this example with some colleagues, we found it ironic that a simple inter-bank transfer can still take several days before the money reaches your account…)

Third, there are trade-offs when dealing with structured versus unstructured data. Buying dedicated analytics engines may make sense when you want to do deep mining of structured data (“tell me what I already know about my customers”), but that might only work if the data resides in a single location, or in multiple sites that can easily communicate with each other. Often, highly structured data is also highly siloed, meaning the efficiency gains may be marginal unless the analytics engine can do the data trawling and transformation more effectively than traditional data interrogation (e.g., query and matching tools). On the other hand, the real value may be in unstructured data (“tell me something about my customers I don’t know”), typically captured in a single location but usually monitored only for visitor volume or stickiness (e.g., a customer feedback portal or user bulletin board).

So, to data visualisation.

Put simplistically, if a picture can paint a thousand words, data visualisation should be able to unearth the nuggets of gold sitting in your data warehouse. Our “visual language” is capable of identifying patterns as well as discerning abstract forms, of describing subtle nuances of shade as well as defining stark tonal contrasts. But I think we are still working towards a visual taxonomy that can turn data into meaningful and actionable insights. A good example of this might be so-called sentiment analysis (e.g., derived from social media commentary), where content can be weighted and scored (positive/negative, frequency, number of followers, level of sharing, influence ranking) to show what your customers might be saying about your brand on Twitter or Facebook. The resulting heat map may reveal what topics are hot, but unless you can establish some benchmarks, or distinguish between genuine customers and “followers for hire”, or can identify other connections with this data (e.g., links with your CRM system), it’s an interesting abstract image but can you really understand what it is saying?
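
To make the weighting idea concrete, here is a minimal Python sketch of how posts might be scored per topic before anything is drawn on a heat map. Everything in it is an assumption for illustration: the `Post` fields, the log-scaled weighting in `post_weight`, and the idea that a polarity value between -1 and +1 arrives from some upstream classifier. It is not how any of the vendors mentioned in this post do it.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    topic: str
    polarity: float  # -1.0 (negative) to +1.0 (positive), from any upstream classifier
    followers: int   # audience size of the author
    shares: int      # how often the post was shared


def post_weight(post: Post) -> float:
    # Assumed weighting: log scaling stops one huge account drowning out everyone else.
    return 1.0 + math.log10(1 + post.followers) + math.log10(1 + post.shares)


def topic_scores(posts):
    """Return {topic: (weighted mean polarity, number of posts)}."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for p in posts:
        w = post_weight(p)
        bucket = totals[p.topic]
        bucket[0] += w * p.polarity  # weighted polarity sum
        bucket[1] += w               # total weight (always >= 1 per post)
        bucket[2] += 1               # raw frequency
    return {topic: (num / den, n) for topic, (num, den, n) in totals.items()}
```

Note that the score alone still says nothing about benchmarks or about whether the accounts are genuine customers; those checks would have to come from other data, such as a CRM match.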

Another area where data visualisation is being used is in targeted marketing based on customer profiles and sales history (e.g., location-based promotion using NFC solutions powered by data analytics). For example, with more self-serve check-outs, supermarkets have to re-think where they place the impulse-buy confectionery displays (and those magazine racks that were great for killing time while queuing up to pay…). What if they could scan your shopping items as you place them in your basket and, combined with what they already know about your shopping habits, map your journey around the store to predict what’s on your shopping list, thereby prompting you via your smart phone (or the basket itself?) towards your regular items, even saving you time in the process? And then they reward you with a special “in-store only” offer on your favourite chocolate. Sounds a bit spooky, but we know retailers already do something similar with their existing loyalty cards and reward programs.

Finally, what are some of the tools that businesses are using? Here are just a few that I have heard mentioned recently (please note I have not used any of these myself, although I have seen sales demos of some applications – these are definitely not personal recommendations, and you should obviously do your own research and due diligence):

For managing and distributing big data, Apache Hadoop was name-checked at a financial data conference I attended last month, along with kdb+ to process large time-series data, and GetGo to power faster download speeds. Python was cited for developing machine learning and even predictive tools, while DataWatch is taking its data transformation platform into real-time social media sentiment analysis (including heat and field map visualisation). YellowFin is an established dashboard reporting tool for BI analytics and monitoring, and of course Tableau is a popular visualisation solution for multiple data types. Lastly, ThoughtWeb combines deep data mining (e.g., finding hitherto unknown connections between people, businesses and projects via media coverage, social networks and company filings) with innovative visualisation and data display.

Next week: a few profundities (and many expletives) from Dave McClure of 500 Startups

Who’s making money from market data?

In recent years, market data vendors and their clients have been fixated on meeting the demand for low-latency feeds to support high-frequency, algorithmic and dark pool trading while simultaneously responding to the post-GFC regulatory environment. New regulations continue to place increased operating burdens and costs on market participants, with a current focus on know your customer (KYC), pre-trade analytics and benchmark transparency.

For banks and asset managers, the cost of managing data is now seen as being as big an issue as the cost of acquiring the data itself. Furthermore, the need to meet regulatory obligations at every stage of every client transaction is adding to operating expenses – costs which cannot easily be recovered, thereby diminishing previously healthy transactional margins.

I was in Hong Kong recently, and had the opportunity to attend the Asia Pacific Financial Information Conference, courtesy of FISD. This annual event, the largest of its kind in the region, brings together stock exchanges, data vendors and financial institutions. It has been a few years since I last attended this conference, so it was encouraging to see that delegate numbers have continued to grow, although of the many stock exchanges in the region, only a few had taken exhibition stands; and representation from among buy-side institutions and asset managers was still comparatively low. However, many major sell-side institutions and plenty of vendors were in attendance, along with a growing number of service providers across data networking, hosting and management.

Speaking to delegates, it was clear that there is a risk of regulation overload: not just the volume, but also the complexity and cost of compliance. Plus, it felt as though, despite frequent industry consultation, there is limited co-ordination between the various market regulators, resulting in overlap between jurisdictions and duplication across different regulatory functions. Are any of these regulations having the desired effect, or simply creating unforeseen outcomes?

One major post-GFC development has been the establishment of a common legal entity identifier (LEI) for issuers of securities and their counterparts. (This was in direct response to the Lehman collapse, which exposed a failure or inability to identify counterparty risk correctly and accurately in trading portfolios, especially for derivatives such as credit default swaps.) However, despite a coordinated international effort, a published standard for the common identifier, and a network of approved LEI issuers, progress in assigning LEIs has been slow (especially in Asia Pacific), and coverage does not reflect market depth. For example, one data manager estimated that of the 20,000 reportable entities that his bank deals with, only 5,000 had so far been assigned LEIs.
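
For readers who have not met the identifier itself, the sketch below shows how a data team might screen a counterparty file for well-formed LEIs. It assumes the published ISO 17442 layout as I understand it: 20 characters drawn from 0-9 and A-Z, with the last two being check digits computed under ISO 7064 MOD 97-10. The function names are hypothetical, and passing this check only shows that a code is well formed, not that it has actually been issued to anyone.

```python
import string

_ALLOWED = set(string.digits + string.ascii_uppercase)


def _to_number(text: str) -> int:
    # Each character becomes its base-36 value (digits unchanged, A=10 ... Z=35),
    # and the values are concatenated into one long decimal number.
    return int("".join(str(int(ch, 36)) for ch in text))


def check_digits(prefix18: str) -> str:
    """Compute the two check digits for an 18-character prefix."""
    if len(prefix18) != 18 or not set(prefix18) <= _ALLOWED:
        raise ValueError("expected 18 characters from 0-9 and A-Z")
    return f"{98 - _to_number(prefix18 + '00') % 97:02d}"


def is_valid_lei(lei: str) -> bool:
    """True if the string has the LEI shape and passes the MOD 97-10 check."""
    if len(lei) != 20 or not set(lei) <= _ALLOWED or not lei[18:].isdigit():
        return False
    return _to_number(lei) % 97 == 1
```

Running `is_valid_lei` over the bank's 20,000 reportable entities would at least separate mistyped codes from the entities that simply have no LEI yet.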

Financial institutions need to consume ever more market data, for more complex purposes, and at multiple stages of the securities trading life-cycle:

  • pre-trade analysis (especially to meet KYC obligations);
  • trade transaction (often using best execution forums);
  • post-trade confirmation, settlement and payment;
  • portfolio reconciliation;
  • asset valuation (which, in the absence of mark-to-market pricing, means evaluated pricing, often requiring more than one independent source);
  • processing corporate actions (in a consistent and timely fashion, and taking account of different taxation rules);
  • financial reporting and accounting standards (local and global); and
  • a requirement to provide more transparency around benchmarks (and other underlying data used in the creation and administration of market indices, and in constructing investable products).

Yet with lower trading volumes and increased compliance costs, operating margins are inevitably being squeezed. This is likely having most impact on data vendors, since data is increasingly seen as a commodity, and the cost of acquiring new data sets has to be offset against both the onboarding and switching costs and the costs of moving data around to multiple users, applications and locations.

The overloaded data managers from the major financial institutions said they wished stock exchanges and vendors would adopt more common industry standards for data licensing and pricing. Which seems reasonable, until you hear the same data managers claim they each have their own particular requirements, and therefore a “one size fits all” approach won’t work for them. Besides, whereas in the past, data was either sold on an enterprise-wide basis, or on a per-user basis, now data usage is divided between:

  • human users and machine consumption;
  • full access versus non-display only;
  • internal and external use;
  • “as is” compared to derived applications; and
  • pre-trade and post-trade execution.

Oh, and then there’s the ongoing separation of real-time, intraday, end-of-day and static data.

This all raises the obvious question: if more data consumption does not necessarily mean better margins for data vendors (despite the need to use the same data for multiple purposes), who is making money from market data?

While the stock exchanges are the primary source of market data for listed equities and exchange-traded securities, pricing data for OTC securities and derivatives has to be sourced from dealers, inter-bank brokers, contributing traders and order confirmation platforms. The major data vendors have done a good job over the years of collecting, aggregating and distributing this data – but now, with a combination of cost pressures and advances in technology, new providers are offering to help clients to manage the sourcing, processing, transmission and delivery of data. One conference delegate commented that the next development will be in microbilling (i.e., pricing based on actual consumption of each data item by individual users for specific purposes) and suggested this was an opportunity for a disruptive newcomer.
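
As a rough illustration of what microbilling might involve, the sketch below totals charges from individual consumption events. The event fields, the purpose categories and every price in `PRICE_PER_ACCESS` are hypothetical and chosen only to show the mechanics. A real scheme would presumably also price by data item and by the licence terms listed earlier; here the price depends on purpose alone.

```python
from collections import Counter
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UsageEvent:
    user: str
    item: str     # e.g. an instrument or field identifier
    purpose: str  # e.g. "display", "non-display", "derived"


# Hypothetical per-access prices keyed by purpose (illustrative numbers only).
PRICE_PER_ACCESS = {
    "display": Decimal("0.010"),
    "non-display": Decimal("0.002"),
    "derived": Decimal("0.050"),
}


def bill_by_user(events):
    """Sum charges per user from individual consumption events."""
    counts = Counter((e.user, e.purpose) for e in events)
    bills = {}
    for (user, purpose), n in counts.items():
        # Decimal avoids the rounding drift floats would add over many tiny charges.
        bills[user] = bills.get(user, Decimal("0")) + n * PRICE_PER_ACCESS[purpose]
    return bills
```

The hard part is unlikely to be the arithmetic. It is capturing a reliable event for every access by every human and machine consumer, which is perhaps why the delegate saw this as an opening for a newcomer.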

Finally, other emerging developments included the use of social media in market sentiment analysis (e.g., for algo-based trading), data visualisation, and the deployment of dedicated apps to manage “big data” analytics.

Next week: Australia 3.0

Online Pillar 2: #Finance

Along with the launch of the iPhone 6, Apple also announced a new mobile payments system. OK, so it’s not the first smart phone app that will help you manage (read: SPEND) your money, but it’s likely to be a market leader very quickly. After all, financial services mean big money in the interconnected online economy.

This week’s blog is #2 in my mini-series on the Three Pillars. Away from NFC solutions, digital wallets and virtual currencies, what else is helping to drive online innovation in financial services?

First, as with last week’s look at Health, it’s important to consider that despite being both a defined business vertical, and a highly regulated industry, the financial services sector is also vulnerable to market disintermediation, horizontal challengers and disruptive technologies.

Although most of us tend to stick with a single financial institution for the bulk of our banking products and services, we will likely use different providers across our credit cards, insurance policies, personal investments, retirement plans and foreign currency. The major banks don’t always do a good job of being a single provider of choice because they tend to manage their customers from a product perspective, and not always from the vantage point of a life-cycle of different needs.

Most retail banks have launched customer apps – mainly for account management and transaction purposes – and likewise, other platforms such as PayPal offer smart phone solutions. As with our other two pillars (Health and Education), Finance apps proliferate – e.g., calculators, account aggregators, budgeting tools and branded customer products from major financial institutions. But unlike Health apps, at least the Australian retail banks have to comply with consumer information requirements – although I suspect this is more a requirement of APRA than Apple. (Question: should apps offering stock market data, or enabling customers to plan investment strategies have to include product disclosure statements, or ensure customers have first completed a mandatory risk profile?)

Disruption in the banking and finance sector is coming from a variety of directions:

  • traditional retailers extending their existing credit card and insurance services into deposit accounts and investment products;
  • technology startups creating online payment systems;
  • trading platform Alibaba offering microfinance, trade finance services, deposit accounts and investment funds; and
  • online retailers and marketplaces collecting a lot of useful behavioural data on customer creditworthiness and implied financial risk – for example, platforms like eBay and PayPal are using transactional data to assign customers a quasi-credit rating score or ranking.

Elsewhere, the financial services sector drives the use of data and technology to streamline stock trading and settlement – across algorithmic trading strategies, low-latency trading, straight-through securities processing, transaction and security data matching, market identifiers and real-time data analytics. The use of social media sentiment and stock #hashtags is also creating new trading strategies among savvier investors – one major Australian bank I spoke to recently boasted of having a Media Control Centre, where they can monitor client engagement, customer activity and brand profiles across the social web.

Crowdsourcing services, along with other platforms for raising capital and early-stage funding (plus new online listing and share trading platforms) threaten to disintermediate established stock exchanges, investment banks and stock brokers. Yet I see a huge opportunity for traditional bank and non-bank lenders to use these techniques for themselves. For example, banks love asset-backed and secured lending, as opposed to overdraft or cashflow lending. However, most startups don’t have physical assets such as plant or machinery, and young entrepreneurs are less likely to own property that can be put up as collateral.

So, what if banks see startup clients as a new channel to market? By investing part of their marketing costs or R&D budgets to underwrite new business ventures, they could help fund early stage ideas, and gather valuable information on customers and suppliers. Some banks are sort of moving in this startup direction – NAB and RBS, for example – but they have yet to demonstrate new business models or innovative product solutions that align with the lean startup and new entrepreneurial generation. I have observed many founders bemoan the lack of support from banks when it comes to offering merchant services that align with the needs of startups.

On another level, banks could do more to connect ideas with capital, customers with vendors, and buyers with suppliers – as the increasingly online and highly networked economy introduces new supply chains and innovative business models. (Hint to my bank manager: referrals and recommendations are often the most cost-effective way to acquire new customers – so, maybe we can help each other?)

Of course, where financial institutions really need to lift their game is in coming to grips with the shared economy. If consumers no longer see the need to buy or own assets outright (thereby reducing the reliance on mortgages, personal loans, hire purchase agreements and even credit cards….) what are the implications for financial services? Maybe banks need to take more interest in these “shared” asset eco-systems. For example, if I have taken out an investment loan to buy an apartment, which I plan to list on Airbnb, wouldn’t it be in the bank’s best interest to make sure I am getting as many bookings as possible – by helping to market my property to their other customers, or by making it really easy for people to book and pay for the accommodation via their smart phone banking app, or by enabling me to run online credit checks on prospective customers?

It’s nearly ten years since the term “distributed economy” was coined to encapsulate the new approaches to innovation, collaboration and sustainable resource allocation. Apart from microfinance and some developments in CSR and ethical investing, I’m not sure that financial institutions really grasped the opportunities presented by the distributed economy – sure, they were quick to outsource and offshore back office operations, but this was largely a cost-cutting exercise. Innovation in financial products mainly resulted in complex (and risky) derivative instruments – and ultimately, led to the GFC.

In the current low/slow/no growth economic climate, banks have to look at new ways of generating a return on their capital. They can’t just keep paying out higher shareholder dividends (not when banking regulations require them to increase their risk-weighted capital allocation); so they must engage with the new business models and the people behind them, and they must be willing to do so with a new mindset, not one built on staid financing models. Sure, they need to maintain prudent lending standards, and undertake relevant due diligence, but not at the risk of stifling innovation in the markets where their customers increasingly operate.

(For a related article on this topic, see here. Since I drafted this blog, PayPal has launched an SME loan platform, and it has just been announced that the ex-CEO of bond fund PIMCO has taken a key equity stake in an online Peer-to-Peer lending platform.)

Next week: Online Pillar 3: #Education

If it seems too good to be true, then it must be!

As someone who commissioned one of the first books on advance fee fraud (sometimes called ‘Nigerian 419’ scams) nearly 20 years ago, I find it staggering that people are still being sucked into these ‘get rich quick’ schemes.

While ‘advance fee’ is a particular type of bank fraud, those spammy and ubiquitous e-mails offering you fantastic sums of money in return for simply providing your personal details (and/or a small upfront payment and the ‘loan’ of your bank account) are among the more common forms of financial scam on the Internet.

In most cases, the perpetrators (often posing as government officials, lawyers, bankers or accountants) claim to have unique access to enormous funds which need to be transferred out of their country of residence – usually in the context of foreign trade, bank deposits, bequests or international loan transactions. More recently, I have seen attempts to ‘liberate’ the proceeds of deceased estates where there is no legitimate heir.

Advance fee fraud scams should seem obvious by now, and hopefully recipients are wiser about these dubious offers to make them wealthy beyond their wildest dreams. Yet, I can’t help feeling these predatory fraudsters are merely an extreme version of the contemporary snake oil salesmen that inhabit the business world today.

This thought occurred to me, as I was reading about how one self-made millionaire had built his fortune – and, just by following his ‘system’, anyone could do it too. The images used to accompany the article emphasised the material trappings associated with this wealth, as if to reinforce the message: “You too can have a lifestyle like mine.”

Other variants of these ‘get rich in 10 easy lessons’ programmes are business ownership opportunities (mostly outsourced telesales operations), seminars on how to flip real estate (some of which are now illegal unless provided by licensed financial planners), and courses where you learn to build websites for clients (but your only customers end up being people who want to learn to build websites for their clients….).

Call me sceptical, but many of these ‘systems’ are merely pyramid sales schemes (sorry, MLM plans) masquerading as ways to “Be your own boss and kick the 9-5 routine”. Sure, some of these programmes may be free to access, but the likelihood is that the person offering ‘valuable’ business insights on how you can make your fortune is making their money from ‘selling’ the programme to you (via third-party advertising, sponsorship, speaking engagements, etc.).

If these insights are so valuable, why are they giving them away?