- Core improvements to code, process, and documentation
- Simplified installation
- Dependencies, change logs, tracking issues and roadmaps
- Kubeflow 1.4 video update and tutorials
- What’s coming
- Join the community
The Kubeflow 1.4 release lays several important building blocks for the use of advanced metadata workflows. A quick summary of 1.4’s top deliveries includes:
- Advanced metadata workflows with improved metric visualization and pipeline step caching in Kubeflow Pipelines (KFP) via the KFP Software Development Kit (SDK)
- A new KFServing model user interface that displays ML model status, configuration, yaml, logs, and metrics
- New Optuna Suggestion Service with multivariate TPE algorithm and Sobol’s Quasirandom Sequence support for hyperparameter tuning
- A new, unified training operator that supports all deep learning frameworks with a Python SDK, enhanced monitoring and advanced scheduling support
Kubeflow 1.4 enables the use of metadata in advanced machine learning (ML) workflows, especially in the Kubeflow Pipelines SDK. With the Pipelines SDK and its new V2-compatible mode, users can create advanced ML pipelines from Python functions that use ML Metadata (MLMD) as input/output arguments, which simplifies metrics visualization.
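To make the idea of Python functions with metadata-tracked inputs and outputs concrete, here is a minimal sketch that uses only the Python standard library. It is not the KFP SDK: `Metrics`, `run_step`, and `evaluate` are hypothetical stand-ins written for this illustration, and the real SDK's decorators and types may differ. The sketch only shows the general shape, in which a step declares a metrics output as a typed argument and the runner collects what the step logged.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Metrics:
    """Hypothetical stand-in for a metrics artifact tracked in a metadata store."""
    name: str
    metadata: Dict[str, float] = field(default_factory=dict)

    def log_metric(self, key: str, value: float) -> None:
        self.metadata[key] = value


def run_step(func: Callable[..., Any], **inputs: Any) -> Dict[str, Dict[str, float]]:
    """Call func, injecting a fresh Metrics object for each parameter annotated as Metrics."""
    outputs: Dict[str, Metrics] = {}
    for param in inspect.signature(func).parameters.values():
        if param.annotation is Metrics:
            outputs[param.name] = Metrics(name=param.name)
    func(**inputs, **outputs)
    # A real system would persist this to its metadata store once the step finishes.
    return {name: artifact.metadata for name, artifact in outputs.items()}


def evaluate(threshold: float, metrics: Metrics) -> None:
    """Toy evaluation step: scores fixed predictions and logs the accuracy."""
    scores = [0.2, 0.7, 0.9, 0.4]
    labels = [0, 1, 1, 0]
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    metrics.log_metric("accuracy", correct / len(labels))


recorded = run_step(evaluate, threshold=0.5)
print(recorded)
```

Because the metric is attached to a named output rather than printed to a log, a user interface can look it up and plot it without parsing step output.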
Another enhancement to Pipelines is the option to use the Emissary executor on Kubernetes clusters whose container runtime is not Docker. In addition, 1.4 supports metadata-based workflows that streamline the creation of TensorBoard visualizations and the serving of ML models.
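The pipeline step caching mentioned in the summary can be pictured with a small standard-library sketch. This is a simplified illustration of the general technique (reuse a step's result when its definition and inputs are unchanged), not KFP's actual implementation. The `StepCache` class, the cache-key fields, and the image name `example/trainer:1` are all assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class StepCache:
    """Toy in-memory cache keyed on a step's name, image, and inputs (hypothetical)."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self.hits = 0

    @staticmethod
    def key(step_name: str, image: str, inputs: Dict[str, Any]) -> str:
        # Sorting keys makes the fingerprint independent of argument order.
        payload = json.dumps(
            {"step": step_name, "image": image, "inputs": inputs}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, step_name: str, image: str, inputs: Dict[str, Any],
            func: Callable[..., Any]) -> Any:
        fingerprint = self.key(step_name, image, inputs)
        if fingerprint in self._results:
            self.hits += 1
            return self._results[fingerprint]
        result = func(**inputs)
        self._results[fingerprint] = result
        return result


call_log = []


def train(learning_rate: float) -> dict:
    """Stand-in for an expensive training step; records each real execution."""
    call_log.append(learning_rate)
    return {"loss": round(1.0 / (1.0 + learning_rate), 3)}


cache = StepCache()
first = cache.run("train", "example/trainer:1", {"learning_rate": 0.1}, train)
second = cache.run("train", "example/trainer:1", {"learning_rate": 0.1}, train)  # reused
third = cache.run("train", "example/trainer:1", {"learning_rate": 0.5}, train)   # runs again
print(len(call_log), cache.hits)
```

Including the image in the key means that changing the step's code (a new image tag) invalidates earlier results, while rerunning an unchanged step with the same inputs skips the work.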
Core improvements to code, process, and documentation
For the Kubeflow Working Groups, 1.4 was primarily a maintenance release, which enabled the Community to concentrate on core improvements to code, process, and documentation. In the 2021 Kubeflow User Survey, users requested documentation improvements (please see the figure below). The Kubeflow 1.4 release cycle included the 1.4 Docs Sprint that generated nearly fifty (50) PRs. These PRs were tracked in this issue and this Kanban board, and we encourage more users to contribute by reading and improving the Kubeflow documentation.
The 1.4 release improvements simplify future feature development by reducing redundant code, expanding CI/CD, and automating testing. An important delivery was the new Unified Training Operator for TensorFlow, PyTorch, MXNet, and XGBoost (PR #1302). 1.4 also initiated the Community’s adoption of a defined release process in its new Kubeflow Release Handbook. The Handbook defines the stages of the release and contributors’ roles, which has helped to clarify responsibilities and improve quality.
Simplified installation
As shown in the Kubeflow User Survey (see the figure above), users have also asked for installation improvements. In Kubeflow 1.3, the Community refactored the Kubeflow deployment pattern to use manifest files (in YAML or JSON), which are stored in Git repositories and then deployed using the Kustomize installation tool. This flexible installation pattern simplifies customization by overlaying manifests. Kubeflow 1.4 builds on this pattern.
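The overlay idea can be sketched in a few lines of Python. This is only a simplified picture of how a patch is layered over a base manifest, not Kustomize's actual merge algorithm: Kustomize can merge list entries by key, for example, whereas this sketch replaces lists wholesale. The `apply_overlay` function and the `example-app` manifest are hypothetical and are not taken from the Kubeflow manifest repo.

```python
import copy
from typing import Any, Dict


def apply_overlay(base: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new manifest with patch recursively merged over base."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested mappings are merged field by field.
            merged[key] = apply_overlay(merged[key], value)
        else:
            # Scalars and lists from the patch replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base manifest, standing in for an upstream default.
base = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example-app", "labels": {"app": "example-app"}},
    "spec": {"replicas": 1},
}

# Hypothetical overlay, standing in for a site-specific customization.
overlay = {
    "metadata": {"labels": {"team": "ml-platform"}},
    "spec": {"replicas": 3},
}

merged = apply_overlay(base, overlay)
print(merged["spec"]["replicas"], merged["metadata"]["labels"])
```

The base is left untouched, so an overlay holds only the fields that differ from the base and can be maintained in a separate repository.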
In 1.4, the Community provides an upstream set of base manifests in the Kubeflow manifest repo. Third parties have built custom installation guides or distributions with overlays that extend the base manifests. As part of this release, the third-party overlays were removed from the Kubeflow manifest repo and moved to repositories of the third parties' choosing. This pattern gives third parties more flexibility to upgrade and document their overlays. You can see a full set of installation guides and distributions here.
In addition, on-prem Kubeflow users can use the base installation manifests, which rely on open source solutions such as Istio, Dex, and AuthService for authentication. The Community and the Manifests Working Group are actively working to provide extra overlays and patches to accommodate more advanced use cases and installations. For example, we recently configured Knative to work with the AuthService and Dex.
Dependencies, change logs, tracking issues and roadmaps
Kubeflow has many software dependencies. In 1.4, the top dependencies used in testing are defined below:
This chart provides links to important details from the Working Groups, including their 1.4 tracking issues, change logs, and roadmaps. Please note that the Working Groups use version numbers that are specific to their project. As a result, many Kubeflow components, which have been incorporated and tested in Kubeflow 1.4, may have a different version number than 1.4.
- Changelog / Release Notes
- Training Operator Changelog
- Training Operators Roadmap
- Katib Release Notes
- PR for v0.12
- Release Notes, Changelog
Kubeflow 1.4 video update and tutorials
The Kubeflow Working Group representatives have recorded a presentation on Kubeflow 1.4’s new features, which you can find on the Kubeflow YouTube channel. Additionally, Kubeflow 1.4’s new features are easy to try in these tutorials:
- AutoML Tutorial with metadata-based workflows to build TensorBoards and to serve models
- Run Katib from your local laptop by following this example.
- KFP Tutorial using Pipelines SDK v2 to orchestrate your ML workflow as a pipeline
- KFServing Tutorial
- Training Operator Tutorial
Join the community
We would like to thank everyone for their efforts on Kubeflow 1.4, especially the users, code contributors, and working group leads. As you can see from the extensive contributions to Kubeflow 1.4, the Kubeflow Community is vibrant and diverse, and it is solving real-world problems for organizations around the world.
Want to help? The Kubeflow Community Working Groups hold open meetings, maintain public lists, and are always looking for more volunteers and users to unlock the potential of machine learning. If you’re interested in becoming a Kubeflow contributor, please feel free to check out the resources below. We look forward to working with you!