Unlocking the Power of Open Source: How IBM Watson X Leverages Innovation
Discover how IBM Watson X leverages open-source innovation to power enterprise AI and data. Explore the open-source tools and technologies, including CodeFlare, PyTorch, KServe, and Presto, that drive model training, tuning, and data analytics at scale on OpenShift.
September 8, 2024
Discover how IBM's Watson X platform leverages the power of open source to deliver cutting-edge AI and data solutions. Explore the open-source technologies that enable efficient model training, tuning, and inferencing, as well as seamless data gathering and analytics. This blog post provides a comprehensive overview of how open source drives innovation within Watson X, empowering businesses to harness the best of AI and data.
The Benefits of Open Source in Watson X
Model Training and Validation with CodeFlare
Representing Models with PyTorch
Model Tuning and Inferencing with Open Source Technologies
Data Gathering and Analytics with Presto
Conclusion
The Benefits of Open Source in Watson X
IBM has a long history of contributing to open source and building on it in its offerings. This tradition continues with Watson X, IBM's new enterprise platform for AI and data. By building on open source, Watson X can draw on the AI technology, innovation, and models developed by the wider community.
The use of open source in Watson X spans three key aspects: model training and validation, model tuning and inferencing, and data gathering and analytics.
For model training and validation, Watson X leverages the open-source project CodeFlare. CodeFlare provides user-friendly abstractions for scaling, queuing, and deploying machine learning workloads, integrating with Ray, KubeRay, and PyTorch.
PyTorch, the open-source machine learning framework, is used to represent the models in Watson X. PyTorch offers key features such as tensor support, GPU acceleration, and distributed training, enabling efficient handling of large, complex models.
For model tuning and inferencing, Watson X uses the open-source projects KServe and ModelMesh. These technologies allow thousands of AI models to be served efficiently on the OpenShift platform. Additionally, the open-source project Caikit provides APIs for prompt tuning, which helps improve inferencing results.
Finally, for data gathering and analytics, Watson X leverages the open-source SQL query engine Presto. Presto's high performance, scalability, and ability to query data where it lives make it a valuable component of the Watson X data ecosystem.
By embracing open-source technologies, Watson X benefits from the best available AI, innovation, and models, empowering users to build and deploy intelligent applications at scale.
Model Training and Validation with CodeFlare
Training and validating models can consume a large amount of cluster resources, especially when the models are multi-billion-parameter foundation models. To use a cluster efficiently and make the work easier for data scientists, IBM has an open-source project called CodeFlare.
CodeFlare provides user-friendly abstractions for scaling, queuing, and deploying machine learning workloads. It integrates Ray, KubeRay, and PyTorch to provide these features. Ray provides the job abstraction, and KubeRay allows Ray to run on Kubernetes platforms like OpenShift.
In a typical use case, CodeFlare first spins up a Ray cluster. The data scientist can then submit training jobs to the cluster. If the OpenShift cluster is heavily used and resources are not available, CodeFlare can queue the jobs and wait until resources become available. In some cases, it can even scale up the cluster to accommodate the workload. When training and validation are complete, CodeFlare can delete the Ray jobs and remove them from the cluster.
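The queue-until-resources-are-free behavior described above can be illustrated with a small, dependency-free sketch. This is a toy model only, not the CodeFlare API: the `Job` and `ToyQueue` names, the GPU counts, and the strict first-in-first-out policy are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    gpus: int  # GPUs the job needs (hypothetical resource unit)


class ToyQueue:
    """Toy FIFO scheduler: run jobs while capacity allows, queue the rest."""

    def __init__(self, total_gpus):
        self.free = total_gpus
        self.running = {}       # job name -> Job
        self.pending = deque()  # jobs waiting for resources, oldest first

    def submit(self, job):
        self.pending.append(job)
        self._dispatch()

    def finish(self, name):
        # Release the finished job's GPUs, then try to start waiting jobs.
        self.free += self.running.pop(name).gpus
        self._dispatch()

    def _dispatch(self):
        # Start jobs from the head of the queue for as long as they fit.
        while self.pending and self.pending[0].gpus <= self.free:
            job = self.pending.popleft()
            self.free -= job.gpus
            self.running[job.name] = job


queue = ToyQueue(total_gpus=8)
queue.submit(Job("train-a", 4))     # fits, starts immediately
queue.submit(Job("train-b", 6))     # does not fit, waits
queue.submit(Job("validate-a", 2))  # waits behind train-b (strict FIFO)
waiting_before = [job.name for job in queue.pending]

queue.finish("train-a")             # frees 4 GPUs, so both waiting jobs start
running_after = sorted(queue.running)
```

A real scheduler would also handle priorities, preemption, and cluster scale-up, which this sketch leaves out.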
The key benefit of CodeFlare is that it lets data scientists use a cluster, or even multiple OpenShift clusters, efficiently without worrying about the underlying infrastructure.
Representing Models with PyTorch
PyTorch provides key features for representing models, including tensor support, GPU support, and distributed training.
Tensors are multi-dimensional arrays that store a model's weights, the numeric values that are adjusted over time to improve the model's predictive capabilities. PyTorch's tensor support enables efficient representation of these model parameters.
PyTorch's GPU support allows for highly efficient computation during model training, which is crucial for large, complex models. Additionally, PyTorch's distributed training capabilities enable training of models that are too large to fit on a single machine, by distributing the training across multiple machines.
Other key features of PyTorch for model representation include:
- Neural network creation: PyTorch makes it easy to create various types of neural networks.
- Data loading: PyTorch provides easy-to-use data loading capabilities.
- Training loops: PyTorch provides the building blocks for training loops, such as loss functions and optimizers, which update model parameters to improve predictive accuracy.
- Model adjustments: PyTorch's automatic gradient calculation (autograd) computes the gradients used to make small adjustments to the model and improve its performance.
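The features in the list above can be seen together in a minimal training sketch. The synthetic data (points on the line y = 3x + 2), the model size, the learning rate, and the epoch count are all assumptions chosen for illustration. The sketch is not taken from Watson X.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic data for y = 3x + 2 (illustrative only).
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)  # tensor of shape (64, 1)
y = 3.0 * x + 2.0

# Data loading: iterate over the dataset in shuffled mini-batches.
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

# GPU support: use a GPU when one is available, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Neural network creation: a single linear layer is enough here.
model = nn.Linear(1, 1).to(device)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Training loop: written explicitly from PyTorch's building blocks.
for epoch in range(200):
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()          # clear gradients from the last step
        loss = loss_fn(model(xb), yb)  # forward pass and loss
        loss.backward()                # autograd computes the gradients
        optimizer.step()               # adjust the parameters slightly

weight = model.weight.item()
bias = model.bias.item()
```

After training, `weight` and `bias` should be close to 3 and 2. Distributed training across machines is out of scope for this sketch.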
By leveraging these open-source capabilities provided by PyTorch, Watson X can efficiently represent and train complex AI models as part of its enterprise-grade AI and data platform.
Model Tuning and Inferencing with Open Source Technologies
We want to be able to serve a large number of AI models and do it at scale on OpenShift. The open source projects we leverage for this are KServe ModelMesh and Caikit.
KServe ModelMesh allows us to efficiently serve thousands of models in a single pod. Originally, KServe could only serve one model per pod, which was not very efficient. By merging KServe with the ModelMesh project, we can now serve large numbers of models efficiently on an OpenShift cluster.
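The text does not say how many models can share one pod. One common approach, assumed here purely for illustration, is to load models on demand and evict the least recently used one when memory is full. The sketch below is a toy model of that idea and does not use the KServe or ModelMesh APIs: `ModelCache`, `load_model`, and the capacity of two are hypothetical.

```python
from collections import OrderedDict


class ModelCache:
    """Toy in-memory model cache with least-recently-used eviction."""

    def __init__(self, capacity, loader):
        self.capacity = capacity     # max models kept loaded at once
        self.loader = loader         # hypothetical callable: name -> model
        self.loaded = OrderedDict()  # name -> model, least recent first
        self.loads = 0               # how many times a model was loaded

    def predict(self, name, x):
        if name in self.loaded:
            # Cache hit: mark the model as most recently used.
            self.loaded.move_to_end(name)
        else:
            # Cache miss: evict the least recently used model if full.
            if len(self.loaded) >= self.capacity:
                self.loaded.popitem(last=False)
            self.loaded[name] = self.loader(name)
            self.loads += 1
        return self.loaded[name](x)


def load_model(name):
    # Stand-in for loading real weights: the "model" multiplies its
    # input by the length of its name.
    scale = len(name)
    return lambda x: scale * x


cache = ModelCache(capacity=2, loader=load_model)
requests = ["ab", "abc", "ab", "abcd", "abc"]
results = [cache.predict(name, 10) for name in requests]
```

Here five requests against three distinct models are answered with only two models in memory at a time, at the cost of reloading `abc` once after it was evicted.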
To find these models, we leverage the Hugging Face repository, which has over 200,000 open source models. IBM has a partnership with Hugging Face, making it a great source for models to use in our Watson X offerings.
Additionally, we use Caikit, an open source project that provides APIs for prompt tuning. This allows us to tune the models on the inferencing side to improve the results.
Together, these open source technologies enable us to serve and tune a large number of AI models at scale on OpenShift, powering the model inferencing capabilities of Watson X.
Data Gathering and Analytics with Presto
Presto is the open-source project that IBM leverages for data gathering and analytics in Watson X. Presto is a high-performance SQL query engine that enables open data analytics and powers the open data lakehouse.
Key features of Presto include:
- High Performance: Presto is highly scalable and provides fast query execution, making it well-suited for large-scale data analytics.
- Federated Queries: Presto allows you to query data across multiple data sources, providing a unified view of your data.
- Query Data Where It Lives: Presto can query data directly in its source location, eliminating the need to move data to a centralized repository.
By using Presto, Watson X can efficiently gather and analyze data from various sources, enabling data-driven insights and powering the AI and machine learning capabilities of the platform.
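The idea of a federated query, one SQL statement that joins tables held in separate data sources, can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not Presto: two attached in-memory databases play the role of two sources, and the `crm` and `orders` names and their contents are made up for illustration. In Presto, tables are addressed in a similar qualified style, as `catalog.schema.table`.

```python
import sqlite3

# One connection with two attached databases standing in for two sources.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS orders")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders.purchases (customer_id INTEGER, amount REAL)")

conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO orders.purchases VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 75.5)],
)

# A single query joins data from both sources without copying either
# table into a central store first.
rows = conn.execute(
    """
    SELECT c.name, SUM(p.amount) AS total
    FROM crm.customers AS c
    JOIN orders.purchases AS p ON p.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()
```

The result is one row per customer with the summed purchase amount, even though customers and purchases live in different databases.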
Conclusion
IBM has a rich history of contributing to open source and leveraging it in its offerings, and Watson X continues that tradition. Watson X is IBM's new enterprise platform for AI and data, and it builds on open source to draw on the AI technology, innovation, and models developed by the wider community.
The open source projects used in Watson X span various aspects of the AI and data pipeline, including model training and validation, model representation, model tuning and inferencing, and data gathering and analytics.
For model training and validation, Watson X uses the open source project CodeFlare, which provides user-friendly abstractions for scaling, queuing, and deploying machine learning workloads. It integrates with Ray, KubeRay, and PyTorch to enable efficient use of cluster resources.
PyTorch is the open source project used to represent the models in Watson X, providing key features such as tensor support, GPU support, and distributed training capabilities.
For model tuning and inferencing, Watson X leverages the open source project KServe ModelMesh, which enables the efficient serving of thousands of AI models on an OpenShift cluster, and draws on the Hugging Face repository as a source of open source models. Additionally, the open source project Caikit provides APIs for prompt tuning to improve the results.
Finally, for data gathering and analytics, Watson X utilizes the open source project Presto, a high-performance SQL query engine for open data analytics and the open data lakehouse.
By embracing open source, Watson X continues IBM's tradition of driving innovation and providing the best AI and data solutions to its customers.
FAQ