In 2019, QuantumBlack, our AI firm, launched Kedro, its first open-source software tool for data scientists and data engineers. It’s a library of code that can be used to create data and machine-learning pipelines, the building blocks of any analytics project.
In the two-and-a-half years it has been available on the open-source platform GitHub, the Kedro community and user base have continued to grow, with more than 200,000 monthly downloads, over 100 contributors, and a growing number of enterprises that choose Kedro as their standard for data-science code. For example, a team at NASA used Kedro to model air-traffic patterns, and Telkomsel, Indonesia’s largest wireless network provider, uses Kedro as a standard across its data-science organization.
Today, we are taking the next step in our open-source journey and donating Kedro to the Linux Foundation. It will be hosted by LF AI & Data, a specialist Linux Foundation umbrella organization founded in 2018 to accelerate development and innovation in AI and data by supporting and connecting technical open-source projects, developer communities, and some 1,000 companies.
“We’re excited to welcome the Kedro project into LF AI & Data,” says Dr. Ibrahim Haddad, executive director of LF AI & Data. “It addresses the many challenges that exist in creating machine-learning products today, and it is a fantastic complement to our portfolio of hosted technical projects. We look forward to working with the community to grow the project’s footprint and to create new collaboration opportunities with our members, hosted projects, and the larger open-source community.”
Kedro is now in the hands of the data-science ecosystem. “This is the only way it can grow at this point: if it is improved by people around the world,” explains Yetunde Dada, the Kedro product lead. “Our cross-disciplinary team gets to see the increased development and validation of Kedro with this milestone. It establishes Kedro as a de facto industry tool, joining a collection of other cutting-edge open-source projects such as Kubernetes donated by Google, GraphQL by Facebook, and MLflow and Delta Lake by Databricks.”
Donating our intellectual property is new ground for McKinsey. But according to McKinsey senior partners and QuantumBlack co-leaders Alex Singla and Alex Sukharevsky, the move expresses our firm’s passion for developing innovative products, particularly in the field of AI, that help create sustainable and inclusive growth for our clients and society. It also represents our move from the age of experimentation into scaling AI solutions.
Jeremy Palmer, a McKinsey senior partner and co-leader of QuantumBlack Labs, the home of products like Kedro within the firm, agrees. “It makes sense to be sharing a tool which can become an industry standard, since the whole open-source community can now contribute to its improvement,” he says. “We will all learn more, and faster, and hence stay at the forefront of useful innovation.”
So what qualifies Kedro as a de facto “industry standard”? Ivan Danov, the Kedro technical lead, explains: “Kedro is a framework that borrows concepts from software-engineering best practices and brings them to the data-science world. It lays all the groundwork for taking a project from an idea to a finished product, allowing developers and engineers to focus entirely on solving the business problem at hand. Kedro is built to be the backbone of all machine-learning projects.”
Kedro will continue to be the foundation of all advanced analytics projects within McKinsey. “We’ve been building machine learning products for a long time and have accrued a fair amount of scar tissue and learned a lot of important lessons,” says Joel Schwarzmann, Kedro product manager. “The ideas and guardrails that exist in Kedro reflect that experience and are designed to help developers avoid common pitfalls and follow best practices.”
Now its future can be steered by a much wider range of stakeholders across different industries, geographies, and technologies, who bring different perspectives and can apply Kedro to many more use cases. An extended team of “maintainers,” including McKinsey’s own developers, can contribute to Kedro’s development: writing code, shaping product strategies, tracking use cases, and voting on decisions that affect the project.
McKinsey has gone from being a Kedro owner to a Kedro user and, in so doing, has created a pathway for firm innovation: building products and moving them to the open-source world, where they can help solve problems for anyone.
QuantumBlack is currently developing more open-source products. Please subscribe to the New At McKinsey blog for the next release.