StartupVic’s Machine Learning / AI pitch night

Machine Learning and AI are such hot topics that I was really intrigued by the prospect of this particular StartupVic pitch night. First, this was a chance to visit inspire9’s recently established Dream Factory – a tech co-working facility, maker space, and VR lab in Melbourne’s western suburb of Footscray. Second, the Dream Factory, housed in a landmark building owned by Impact Investment Group, was a major beneficiary of LaunchVic funding, and this event could be seen as a showcase for Melbourne’s tech startup sector. Third, with so many buzzwords circling AI, it offered a great opportunity to help demystify some of the jargon and provide some practical insights.

Image sourced from StartupVic

Instead, the pitches felt underdone – probably not helped by the building’s acoustics, the poor PA system, and the fact that many in the audience could not read the presenters’ slides. I wasn’t expecting the founders to reveal the “secret sauce” of their algorithms, or to explain in detail how they program or train their “smart” applications. But I had hoped to hear some concrete evidence of how these emerging platforms actually work and how the resulting data is specifically analyzed and applied to client solutions.

Amelie.ai

With a tag line of “powering the future of mental health” the team at Amelie.ai are hoping to have a positive impact in helping to reduce suicide rates. Unfortunately, judging by the way some key statistics are presented on their home page, the data (and the methodology) are not as clear as the core message.

Using technology to help scale the provision of mental health and well-being services, combined with mixed delivery methods, the solution aims to offer continuity of care. It picks up on user dialogue and provides some semi-automated and curated intervention. The presentation was big on phrases like “triage packages”, “customer journey”, “technical architecture”, “chatbots” and of course, “AI” itself, but I would have liked a bit more explanation of how it worked.

I understand that the platform is designed to integrate with third-party providers, but how does this happen in practice?

Only when asked by the judges about their competitive advantage (as there are similar tools out there – see Limbr from a previous pitch night) did the presenters refer to their proprietary language models, developed with and based on user trials. This provides a structured taxonomy, which is currently English-only, but it can be translated.

There were also questions about data privacy (not fully explained?) and sales channels – which may include workplace EAPs and health insurers.

Businest

According to the founder, “dashboards and KPIs only diagnose pain, Businest fixes it”. In short, this is intelligent business analysis for SMEs.

With a focus on tracking working capital and cashflow, as far as I can tell, Businest applies some AI on top of existing third-party accounting software. It identifies key metrics for a specific business, then provides coaching and videos to change business behaviour and improve financial performance. There is a patent pending in the US for the underlying algorithm, which prioritizes the KPIs.

Again, I was not totally clear how the desired results are achieved. For example, are SMEs benchmarked against their peers (e.g., by size/industry/geography/maturity/risk profile)? Do clients know what incremental benefits they should be able to generate over a given time period? How does the financial spreadsheet analysis assist with improving structural or operational efficiencies that are outside the realm of financial accounting?

Available under a freemium SaaS model, Businest is sold direct and via accountants and bookkeepers. A key to success will be how fast the product can scale – via partnering and its integration with Xero, MYOB and QuickBooks.

AiHello

I must admit, I was initially curious, and then totally bemused, by this pitch. It started by asking some major philosophical and existentialist questions:

Q: How do we define “intelligence”?
Q: Are we alone? Or not alone?

No, this is not IBM’s Watson trained on the works of Jean-Paul Sartre (cf. Dark Star and the struggle with Cartesian Logic). Instead, it is an analytical and predictive app for Amazon sellers. It claims to know what products will sell, where and when. And with trading volumes worth $2.5m of goods per month, it must be doing something right. Serving Amazon sellers in the US and India (and Australia, once Amazon goes live here), AiHello charges fees based on fixed licences and transaction values. The apparent benefits to retailers are speed and savings.

Asked where the trading data is coming from, the presenter referred to existing trading platform APIs, and “big data and deep learning”. It also uses Amazon product IDs to make specific predictions – currently delivering 60% accuracy, but aiming for 90%. According to the founder, “Amazon focuses on buyers, we focus on sellers”. (Compare this, perhaps, to the approach by Etsy.)

C-SIGHT

A new service from the team at Pax Republic, this latest iteration is designed to avoid some of the policy and reputation issues involved with managing, supporting and protecting whistleblowers. Understanding that whistleblowers can pose an internal threat to brand value, and present a significant human risk, C-SIGHT provides a psychologically safe environment for the Board, C-suite and workforce alike, and can act as an early warning system before problems get out of hand.

Sold under a SaaS model, C-SIGHT analyses text-based and anonymous dialogue, with “real-time data sent to different AI apps”. I understood that C-SIGHT combines human and robot facilitation, while preserving anonymity, and also deploys natural language processing – but I didn’t fully understand how.

In one client use case, with the College of Surgeons, there were 1,000 “contributions” – again, it was not clear to me how this input was generated, captured, processed or analysed. Client pricing is based on the number of invitations sent and the number of these “contributions” – what the presenter referred to as an “instance” model (presumably he meant instance-based learning?).

Asked about privacy, the presenter said that C-SIGHT de-identifies contributions (to what degree was not clear), and operates outside the firewall. There was also a question from the judges about the use and analysis of idiom and the vernacular – I don’t believe this was addressed in much detail, although the presenter did suggest that the platform could be used as a way to drive “citizen engagement”.

Overall, I was rather underwhelmed by these presentations, although each of them revealed a kernel of a good idea – while in the case of AiHello (which was the winner on the night), sales traction is very promising; and in the case of Businest, industry recognition, especially in the US, has opened up some key opportunities.

Next week: Bitcoin – to fork or not to fork?

ANZ’s new CEO on #FinTech, CX and #digital disruption – 10 Key Takeaways

I went to the recent Q&A with the new CEO of ANZ, Shayne Elliott, organised by FinTech Melbourne. It was the first public speaking appearance by Shayne since becoming CEO (excluding his gig at the Australian Tennis Open), and followed a similar event last year with Patrick Maes, the bank’s CTO.

The key themes were:

  1. Improving the customer experience (CX) is paramount
  2. Maintaining the high level of trust customers place in their banks is key
  3. Being aware of FinTech disruption is important, but remaining focused on core strategy is even more important
  4. FinTech can coexist with traditional banks, but the latter will win out in the end
  5. The bigger opportunity for FinTech is probably in SME solutions, rather than B2C
  6. Increased process automation is in support of CX, not about reducing headcount
  7. Big data and customer analytics are all very well, but have to drive CX outcomes
  8. Customers still see the relationship with their main financial institution in terms of basic transaction accounts, which is why payment solutions (a high volume/low margin activity) are vital to the banks’ sustainability
  9. ANZ is about to appoint a head of digital banking who will report direct to the CEO
  10. ANZ has been rated as one of the top global banks in terms of its use of Twitter and social media (but from what I have seen, much of the Big 4 banks’ social media presence can be attributed to their sports sponsorship…)

There was also some discussion around ANZ’s Asian strategy, and the statement last year that the “new” strategy is about becoming a digital bank. Shayne was quick to point out that they are not abandoning the Asian strategy (it’s not either/or) but because they embarked on Asia 8 years ago, most of the work has been done. Now they need to consolidate and expand the platform they have built. He also placed ANZ’s Australian business as being a comparatively small part of the group’s portfolio, and also took the view that despite ANZ’s size, resources and reach, digital products have to be developed market by market – it’s not a one size fits all approach. (Several FinTech founders in the audience took a very different perspective on this.)

And, in a bid to appear entirely approachable, both Shayne and Patrick were happy for people to contact them direct by e-mail… So if any budding FinTech founders have an idea to pitch to a major bank, you know who to contact.

Next week: Making the most of the moment…

Personal vs Public: Rethinking Privacy

An incident I recently witnessed in my neighbourhood has caused me to rethink how we should be defining “privacy”. Data protection is one thing, but when our privacy can be compromised via the direct connection between the digital and analog worlds, all the cyber security in the world doesn’t protect us against unwanted nuisance, intrusion or even invasion of our personal space.

Scenario

As I was walking along the street, I saw another pedestrian stop outside a house, and from the pavement, use her smart phone to take a photograph through the open bedroom window. Regardless of who was inside, and irrespective of what they were doing (assuming nothing illegal was occurring), I would consider this to be an invasion of privacy.

For example, it would be very easy to share the picture via social media, along with date and location data. From there, it could be possible to search land registries and other public records to ascertain the identity of the owners and/or occupants. And with a little more effort, you might have enough information to stalk or even cyber-bully them.

Privacy Law

Photographing people on private property (e.g., in their home) from public property (e.g., on the street outside) is not an offence, although photographers must not cause a nuisance nor interfere with the occupants’ right of quiet enjoyment. Our current privacy laws largely exclude this breach of privacy (unless it relates to disclosure of personal data by a regulated entity). Even rules about the use of drones are driven by safety rather than privacy concerns.

Since the late 1990s, and the advent of spam and internet hacking, there have been court decisions that update the law of trespass to include what could be defined as “digital trespass”, although some judges have since tried to limit such actions to instances where actual harm or damage has been inflicted on the plaintiff. (Interestingly, in Australia, an act of trespass does not have to be “intentional”, merely “negligent”.)

Apart from the economic and financial loss that can arise from internet fraud and identity theft, invasion of privacy via public disclosure of personal data could lead to personal embarrassment, damage to reputation or even ostracism. (In legal terms, emotional stress falls within “pain and suffering”.)

Data Protection Law

The Australian Privacy Principles contained within the 1988 Privacy Act apply to government agencies, private companies with annual turnover of $3m or more, and any organisations trading in personal data, dealing with credit information or providing health services. There are specific provisions relating to the use and misuse of government-derived identifiers such as medical records and tax file numbers.

The main purpose of the privacy legislation is to protect “sensitive” information, and to prevent such data being used unlawfully to identify specific individuals. At a minimum, this means keeping personal data such as dates of birth, financial records or hospital files in a secure format.

Some Practical Definitions

The following are not legal definitions, but hopefully offer a practical framework to understand how we might categorise such data, and manage our obligations towards it:

“Confidential”

Secret information that must not be disclosed to anyone unless there is a legal obligation or permission to do so. (There are also specific issues and exceptions relating to “classified information”, public interest matters, whistleblower protection and Freedom of Information requests.)

“Private”

Information which is not for public or general consumption, although the data itself may not be “confidential”. May still be subject to legal protection or rights, such as the right of adopted children to discover the identity of their birth parents, or the right of someone not to be identified as a lottery winner.

“Personal”

Data that relates to, or can specifically identify, a particular individual. An increasing issue for Big Data, because data that otherwise resides in separate locations can now be re-connected using triangulation techniques – scrape enough websites and drill down into enough databases, and you could probably find my shoe size.
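To show how that re-connection can work, here is a minimal Python sketch. Everything in it is hypothetical: the two datasets, the field names and the choice of postcode plus birth year as the linking keys are my own illustrative assumptions, not a description of any real database. The point is simply that neither dataset links a name to a shoe size, but a unique match across both does.

```python
# Two hypothetical datasets: neither one links a name to a shoe size by itself.
loyalty_records = [
    {"postcode": "3011", "birth_year": 1975, "shoe_size": 10},
    {"postcode": "3011", "birth_year": 1982, "shoe_size": 8},
    {"postcode": "3000", "birth_year": 1975, "shoe_size": 9},
]
public_profiles = [
    {"name": "Alex Example", "postcode": "3011", "birth_year": 1975},
    {"name": "Sam Sample", "postcode": "3000", "birth_year": 1990},
]

# Assumed linking fields; each is harmless alone but identifying in combination.
QUASI_IDENTIFIERS = ("postcode", "birth_year")


def link_records(named, anonymous, keys=QUASI_IDENTIFIERS):
    """Attach anonymous attributes to a name only when the key combination is unique."""
    index = {}
    for row in anonymous:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    linked = []
    for person in named:
        matches = index.get(tuple(person[k] for k in keys), [])
        if len(matches) == 1:  # a unique match is what makes re-identification possible
            linked.append({**person, **matches[0]})
    return linked


reidentified = link_records(public_profiles, loyalty_records)
```

In this toy example only one profile finds a unique match, which is why removing names from a dataset does not, on its own, make the data anonymous.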

“Public”

Anything that has been published, or is easily discoverable through open search or public database retrieval (but, for example, does not include my past transactions on eBay unless I have chosen to disclose them to other users). My date of birth may be a matter of record, but unless you have authorised access to the relevant database or registry, you won’t be able to discover it and you certainly shouldn’t disclose it without my permission.

Copyright Law

One further dimension to the debate is copyright law – the ownership and related rights associated with any creative works, including photographs. All original content is copyright (except those works deemed to be in the “public domain”), and nearly all copyright vests with the person who created the work (unless they have legally assigned their copyright, or the material was created in the course of their employment).

In the scenario described above, the photographer would hold copyright in the picture they took. However, if the photograph included the image of an artwork or even a framed letter hanging on the wall, they could not reproduce the photograph without the permission of the person who owned the copyright in those original works. In some (limited) situations, a photograph of a building may be subject to the architect’s copyright in the design.

Curiosity is not enough justification to share

My personal view on all this is that unless there is a compelling reason to make something public, protecting our personal privacy takes precedence over the need to post, share or upload pictures of other people in their private residence, especially any images taken without the occupants’ knowledge or permission.

Just to clarify, I’m not referring to surveillance and monitoring by the security services and law enforcement agencies, for which there are understandable motives (and appropriate safeguards).

I’m saying that if we showed a little more respect for each other’s personal space and privacy (particularly within our homes, not just in cyberspace) then we might show a little more consideration to our neighbours and fellow citizens.

Next week: It’s OK to say “I don’t know”

The New Alchemy – Turning #BigData into Valuable Insights

Here’s the paradox facing the consumption and analysis of #BigData: the cost of data collection, storage and distribution may be decreasing, but the effort to turn data into unique, valuable and actionable insights is actually increasing – despite the expanding availability of data mining and visualisation applications.

One colleague has described the deluge of data that businesses are having to deal with as “the firehose of information”. We are almost drowning in data and most of us are navigating up river without a steering implement. At the risk of stretching the aquatic metaphor, it’s rather like the Sorcerer’s Apprentice: we wanted “easy” data, so the internet, mobile devices and social media granted our wish in abundance. But we got lazy/greedy, forgot how to turn the tap off and now we can’t find enough vessels to hold the stuff, let alone figure out what we are going to do with it. Switching analogies, it’s a case of “can’t see the wood for the trees”.

Perhaps it would be helpful to provide some terms of reference: what exactly is “big data”?

First, size definitely matters, especially when you are thinking of investing in new technologies to process more data more often. For any database smaller than, say, 0.5TB, the economies of scale may dissuade you from doing anything other than deploying more processing power and/or capacity, as opposed to paying for a dedicated, super-fast analytics engine. (Of course, the situation also depends on how fast the data is growing, how many transactions or records need to be processed, and how often those records change.)

Second, processing velocity, volume and data variety are also factors – for example, unless you are a major investment bank with a need for high-frequency, low-latency algorithmic market trading solutions, then you can probably make do with off-the-shelf order routing and processing platforms. Even “near real-time” data processing speeds may be overkill for what you are trying to analyze. Here’s a case in point:

Slick advertorial content, and I agree that the insights (and opportunities) are in the delta – what’s changed, what’s different? But do I really need to know what my customers are doing every 15 seconds? For a start, it might have been helpful to explain what APM is (I had to Google it, and CA did not come up in the Top 10 results). Then explain what it is about the resulting analytics that NAB is now using to drive business results. For instance, what does it really mean if peak mobile banking usage is 8-9am (and did I really need an APM solution to find this out?) Are NAB going to lease more mobile bandwidth to support client access on commuter trains? Has NAB considered push technology to give clients account balances at scheduled times? Is NAB adopting technology to shape transactional and service pricing according to peak demand? (Note: when discussing this example with some colleagues, we found it ironic that a simple inter-bank transfer can still take several days before the money reaches your account…)

Third, there are trade-offs when dealing with structured versus non-structured data. Buying dedicated analytics engines may make sense when you want to do deep mining of structured data (“tell me what I already know about my customers”), but that might only work if the data resides in a single location, or in multiple sites that can easily communicate with each other. Often, highly structured data is also highly siloed, meaning the efficiency gains may be marginal unless the analytics engine can do the data trawling and transformation more effectively than traditional data interrogation (e.g., query and matching tools). On the other hand, the real value may be in unstructured data (“tell me something about my customers I don’t know”), typically captured in a single location but usually monitored only for visitor volume or stickiness (e.g., a customer feedback portal or user bulletin board).

So, to data visualisation.

Put simplistically, if a picture can paint a thousand words, data visualisation should be able to unearth the nuggets of gold sitting in your data warehouse. Our “visual language” is capable of identifying patterns as well as discerning abstract forms, of describing subtle nuances of shade as well as defining stark tonal contrasts. But I think we are still working towards a visual taxonomy that can turn data into meaningful and actionable insights. A good example of this might be so-called sentiment analysis (e.g., derived from social media commentary), where content can be weighted and scored (positive/negative, frequency, number of followers, level of sharing, influence ranking) to show what your customers might be saying about your brand on Twitter or Facebook. The resulting heat map may reveal what topics are hot, but unless you can establish some benchmarks, or distinguish between genuine customers and “followers for hire”, or can identify other connections with this data (e.g., links with your CRM system), it’s an interesting abstract image but can you really understand what it is saying?
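To make the weighting-and-scoring idea a little more concrete, here is a minimal Python sketch. It is not how any particular vendor's tool works: the field names, the assumption that each post already carries a positive/negative polarity score, and the log-damped weighting by followers and shares are all my own illustrative choices.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    topic: str
    polarity: float  # -1.0 (negative) to +1.0 (positive); assumed to be pre-scored
    followers: int   # audience size of the author
    shares: int      # how many times the post was re-shared


def post_weight(post: Post) -> float:
    """Hypothetical weighting: damp reach with a log so one huge account does not dominate."""
    return 1.0 + math.log10(1 + post.followers) + math.log10(1 + post.shares)


def topic_scores(posts: list[Post]) -> dict[str, dict[str, float]]:
    """Return, per topic, the mention count (frequency) and the weighted average polarity."""
    totals = defaultdict(lambda: {"mentions": 0, "weight": 0.0, "weighted_polarity": 0.0})
    for p in posts:
        w = post_weight(p)
        t = totals[p.topic]
        t["mentions"] += 1
        t["weight"] += w
        t["weighted_polarity"] += w * p.polarity
    return {
        topic: {"mentions": t["mentions"], "sentiment": t["weighted_polarity"] / t["weight"]}
        for topic, t in totals.items()
    }


# Made-up sample data, purely for illustration.
sample_posts = [
    Post("fees", -0.8, followers=9_999, shares=99),
    Post("fees", 0.4, followers=9, shares=0),
    Post("app", 0.6, followers=99, shares=9),
]
scores = topic_scores(sample_posts)
```

Even in this toy version, the single widely shared complaint drags the "fees" topic well into negative territory despite a positive comment from a smaller account. That is exactly why benchmarks, and some way of separating genuine customers from "followers for hire", matter before you read too much into the heat map.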

Another area where data visualisation is being used is in targeted marketing based on customer profiles and sales history (e.g., location-based promotion using NFC solutions powered by data analytics). For example, with more self-serve check-outs, supermarkets have to re-think where they place the impulse-buy confectionery displays (and those magazine racks that were great for killing time while queuing up to pay…). What if they could scan your shopping items as you place them in your basket and, combined with what they already know about your shopping habits, map your journey around the store to predict what’s on your shopping list, thereby prompting you via your smart phone (or the basket itself?) towards your regular items, even saving you time in the process? And then they reward you with a special “in-store only” offer on your favourite chocolate. Sounds a bit spooky, but we know retailers already do something similar with their existing loyalty cards and reward programs.

Finally, what are some of the tools that businesses are using? Here are just a few that I have heard mentioned recently (please note I have not used any of these myself, although I have seen sales demos of some applications – these are definitely not personal recommendations, and you should obviously do your own research and due diligence):

For managing and distributing big data, Apache Hadoop was name-checked at a financial data conference I attended last month, along with kdb+ to process large time-series data, and GetGo to power faster download speeds. Python was cited for developing machine learning and even predictive tools, while DataWatch is taking its data transformation platform into real-time social media sentiment analysis (including heat and field map visualisation). YellowFin is an established dashboard reporting tool for BI analytics and monitoring, and of course Tableau is a popular visualisation solution for multiple data types. Lastly, ThoughtWeb combines deep data mining (e.g., finding hitherto unknown connections between people, businesses and projects via media coverage, social networks and company filings) with innovative visualisation and data display.

Next week: a few profundities (and many expletives) from Dave McClure of 500 Startups