Monthly Roundup - June 2018

We’ve had a busy few weeks and wanted to share some of the things we’ve been up to. Below you’ll find the events we’ve been a part of, the articles we’ve written, and what we’ve got planned for next month.

Events

Semi-Permanent

We were lucky to be invited by General Assembly to speak on AI and Art at Carriageworks in Sydney. In one of the sessions we also got to hear some incredibly insightful creatives, musicians and designers talk about the impact they believe AI will have on their industries and how it might be applied to their craft.

Breakfast Meetup

A brilliant talk on data cleaning from Geoff Pidcock of Jayride. If you weren’t able to get along, check out some of the materials here. And if you’re interested in talking data over a coffee and croissant, check out the meetup group.

Verge Labs Lunch and Learn Sessions

A big thanks to all the organisations that have participated in our lunch and learn sessions. The response has been fantastic since our initial call-out a few weeks ago. We have a few more slots available in the coming weeks, so if you and your team are interested in an exchange of ideas over lunch and are based in the Sydney CBD area, let us know.

Upcoming

UseR! 2018 - Brisbane, July 10-13

We’re looking forward to UseR! this year, and are super happy to be able to present and sponsor the event! Let us know if you’re going to be at UseR and we’ll make sure to say hi when we’re up there (we’ve got some funky hex stickers to give away).

From the Verge Labs Blog

Spatial Analysis in Australia

How to use geospatial information as part of your analysis and where you can find the tools and data required to get started.

Video: Respecting Privacy with Look-Alike Data Sets

Building “look-alike” data sets that have many of the same statistical properties as source data sets they are generated from, but no longer contain sensitive data.

Why a Professional Body for Data Science is a Bad Idea

A discussion on whether there should be institutional representation for data scientists.

Interesting articles we’ve read

Jeff Bezos’ Letter to Amazon Shareholders

In his latest letter to Amazon shareholders, Jeff Bezos talks about Amazon’s successes, growth, policies and high standards, and about the company’s relentless focus on customers as the main contributor to its number one status among online retailers. Bezos emphasises that holding high standards is crucial to running a good business, but thankfully they are also teachable. You learn to perform at a high standard from the team around you, by coaching and being coached, and by investing proper time in developing it as a skill.

Along with other milestones (including Amazon Prime, AWS, and Amazon Marketplace), Amazon’s CEO talks about the company’s AI and machine learning products: Alexa, Echo, and Fire TV Stick. He discusses continuous improvements to Alexa, such as the capacity for 30,000 skills and the ability to control 4,000 smart home devices, as well as an improvement in spoken language understanding of more than 25 percent through semi-supervised learning techniques.

It’s amazing to see the CEO of one of the world’s largest companies talk about machine learning in a letter to shareholders, and talk about it in a way that is knowledgeable and practical instead of high-level and aspirational.

Announcing Ursa Labs: an innovation lab for open source data science

Wes McKinney, the developer well known for creating pandas, has announced his latest project, Ursa Labs, created with the intention of advancing cross-language computational systems for data science.

The project will hire developers experienced in data science systems, in particular the Apache Arrow ecosystem, and partner with larger organisations for operational support (HR and finance, for example) and for funding (from larger corporations as well as smaller donations).

In the rationale for establishing this project, McKinney provides a brief history of his experience in OSS (Open Source Software) projects, giving the most common issues developers face when working on OSS, such as maintenance, innovation, and common funding traps. He also talks about his past experience with data science projects, such as DataPad, Cloudera, Two Sigma, and Apache Arrow, and how this experience can be applied in the current project.

In this new undertaking, based on Apache Arrow, RStudio will provide admin help, while Two Sigma will provide technical advice and employee contributions. Ursa Labs (“ursa” is Latin for “bear”) will soon announce open positions for developers.

Future Predictions - Revolutionising the Accuracy of Business Forecasting

Professor Rob J Hyndman, Associate Professor George Athanasopoulos, and Ph.D. student Shanika Wickramasuriya from the Monash Business School’s Department of Econometrics and Business Statistics have developed a new forecasting model which improves the predictive capacities of layered data for hierarchical forecasting.

They have significantly improved the accuracy of aggregated forecasts, which combine several sets of data that would each have given different numbers if forecast separately. Professor Hyndman gives the example of predicting sales of men’s and women’s clothing, where the joint sales prediction didn’t match the sum of the predictions for each item individually. Since 2011 they’ve been working on a mathematical algorithm that reconciles the individual forecasts so that they match the overall number. The model, which is free and available for download, has so far been used by Walmart, Nestle, SAP, Grand Vision, Huawei, and Bank of New York Mellon (for bank sheet balances), as well as for tourism and student enrolment forecasts.

The trio will continue working on the model to further improve it for quarterly data and to produce more accurate predictions for small-scale data and shorter time ranges.
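
To make the reconciliation idea concrete, here is a minimal sketch in Python. It is not the Monash team’s model: it uses plain least-squares reconciliation on a two-level hierarchy as a simplified stand-in, and the function name reconcile_two_level and the forecast numbers are made up for illustration.

```python
def reconcile_two_level(total_forecast, bottom_forecasts):
    """Adjust a total and its component forecasts so they add up.

    Simplified stand-in (ordinary least-squares reconciliation), not the
    model described in the article. It makes the smallest possible squared
    change to the forecasts such that total == sum(components).
    """
    n = len(bottom_forecasts)
    # How far the total forecast is from the sum of the parts.
    gap = total_forecast - sum(bottom_forecasts)
    # Least-squares solution: share the gap equally across all n + 1 series.
    share = gap / (n + 1)
    new_total = total_forecast - share
    new_bottom = [b + share for b in bottom_forecasts]
    return new_total, new_bottom


# Hypothetical forecasts: total clothing, then menswear and womenswear.
# The parts add up to 97, but the total was forecast separately as 100.
total, (mens, womens) = reconcile_two_level(100.0, [45.0, 52.0])
print(total, mens, womens)  # 99.0 46.0 53.0
```

Each forecast moves by the same small amount, so the total and its parts agree afterwards. More sophisticated approaches weight the adjustments by how reliable each forecast is, which this sketch does not attempt.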

Lords Committee on AI

The Lords Committee report on AI, “AI in the UK: ready, willing and able?”, argues that the UK should establish itself as a leader in AI. The Committee’s Chairman, Lord Clement-Jones, speaks about the importance of putting ethics centre-stage in the development of artificial intelligence to help the public see its benefits and to prevent its misuse. The Committee points to the UK’s existing resources for taking on this role, including expert, technical, and research resources, as well as a dynamic ecosystem. The report also covers issues that need to be solved to support the leadership role, such as creating a growth fund for SMEs and changing the immigration system.

The next steps need to be based on developing a cross-sector AI Code, which can later be adopted at a national and international level. The AI Code should be based on five principles: the common good of humanity; intelligibility, fairness and transparency; data privacy; wider AI education (including for children); and benevolent use.

The conclusions from the report come down to the following key actions:

  • Impending job changes and the need for retraining
  • Greater control individuals need to have over personal data
  • Review of the use of data by large tech companies in the UK
  • Avoiding past prejudices when creating new AI systems
  • Creating a national policy framework and investigating the sufficiency of current liability laws

Well, that’s it from us. Hopefully we’ll see you up in sunny Brisbane for UseR! or at one of our lunch and learn sessions. If you want to stay up to date with what we are doing, you can follow us on Twitter.
