
Securing and scaling AI and machine learning pipelines with AWS

July 29, 2020


Many AWS customers are building AI and machine learning pipelines on top of Amazon Elastic Kubernetes Service (Amazon EKS) using Kubeflow across many use cases, including computer vision, natural language understanding, speech translation, and financial modeling. In this post, we will describe AWS contributions to the Kubeflow project, which provide enterprise readiness for Kubeflow deployments.

Originally open sourced in December 2017, the Kubeflow project reached its 1.0 milestone in March 2020. With this release, Kubeflow has graduated key components of the build, train, optimize, and deploy user journey for machine learning. These components include the Kubeflow dashboard UI, multi-user Jupyter Notebooks, Kubeflow Pipelines, and KFServing, as well as distributed training operators for TensorFlow, PyTorch, and XGBoost.

Our contributions to Kubeflow help to democratize machine learning, streamline data science tasks, and allow customers to leverage the highly optimized, cloud-native, enterprise-ready AWS services with Kubeflow. Customers have a clear path to use Kubeflow with Amazon EKS for managed Kubernetes clusters, Amazon Simple Storage Service (Amazon S3) for object storage, Amazon Relational Database Service (Amazon RDS) for pipeline metadata, Amazon Elastic File System (Amazon EFS) for shared file access, Amazon FSx for Lustre for increased training performance, Amazon CloudWatch for logging/metrics, and Amazon SageMaker for AI and machine learning integration.

Kubeflow logo surrounded by AWS logos

Security

Because security is a top priority at AWS, we have integrated the Kubeflow security model directly with AWS security services, which operate under the AWS shared responsibility model. Integrations include IAM Roles for Service Accounts for fine-grained access control at the Kubernetes Pod level, Application Load Balancer (ALB) for external traffic management and authentication, AWS Shield for DDoS protection, AWS Certificate Manager (ACM) for in-transit encryption, AWS Key Management Service (AWS KMS) for at-rest encryption, and Amazon Cognito for user management.

Kubeflow users can configure an Application Load Balancer to securely authenticate users either through Amazon Cognito or through an identity provider (IdP) that is OpenID Connect (OIDC) compliant. When you create a Kubeflow profile using Kubeflow’s profile controller, an AWS Identity and Access Management (IAM) role is bound to a Kubernetes service account in the user’s namespace. This grants AWS permissions to the user's workloads without additional configuration. Istio and Kubernetes RBAC policies are also created along with the profile. RBAC authorizes users for specific Kubernetes resources and isolates them from one another. By deploying Kubeflow on Amazon EKS, customers can enable private cluster-endpoint access to keep traffic within their Virtual Private Cloud (VPC) and completely disable public access from the internet.
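To make the role binding concrete, here is a minimal Python sketch of what a service account looks like once an IAM role is attached through IAM Roles for Service Accounts. The profile controller does this for you; the sketch only shows the shape of the result. The annotation key `eks.amazonaws.com/role-arn` is the one Amazon EKS reads. The service account name `default-editor`, the namespace, and the role ARN are assumptions chosen for illustration.

```python
import json


def service_account_manifest(namespace: str, role_arn: str,
                             name: str = "default-editor") -> dict:
    """Build a ServiceAccount manifest annotated with an IAM role ARN.

    The default name is an assumption for illustration, not a value
    confirmed by this post.
    """
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Amazon EKS uses this annotation to map the service account
            # to an IAM role.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


if __name__ == "__main__":
    manifest = service_account_manifest(
        namespace="alice",  # hypothetical profile namespace
        role_arn="arn:aws:iam::111122223333:role/kubeflow-alice",  # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```

Pods that run under this service account in the user's namespace can then call AWS APIs with the permissions of the annotated role.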

Compute, autoscaling, and Spot Instances

Amazon CloudWatch logs and metrics allow customers to easily create dashboards and alerts to monitor Kubeflow resources, such as the health of Kubeflow Pipelines and performance of TensorFlow/PyTorch/MXNet models. Customers can choose from a variety of CPU and GPU instance types available on Amazon Elastic Compute Cloud (Amazon EC2) to power their Kubeflow workloads depending on their business needs. Kubeflow running on Amazon EKS will automatically detect GPUs and install the appropriate GPU device plugin on each instance.
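Once the GPU device plugin is running on a node, a workload asks for GPUs through a container resource limit. The following sketch builds such a container spec as a plain Python dictionary. `nvidia.com/gpu` is the resource name exposed by the NVIDIA device plugin; the container name and image are hypothetical.

```python
def gpu_container(name: str, image: str, gpus: int = 1) -> dict:
    """Return a container spec that requests `gpus` NVIDIA GPUs."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "name": name,
        "image": image,
        # GPUs are requested under limits; the scheduler will only place
        # the pod on a node that advertises this resource.
        "resources": {"limits": {"nvidia.com/gpu": gpus}},
    }


if __name__ == "__main__":
    # Hypothetical container name and image, for illustration only.
    print(gpu_container("trainer", "example.com/training-image:latest", gpus=2))
```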

Additionally, Amazon EKS supports Spot Instances and cluster autoscaling with Kubeflow. Using Spot Instances, customers can save up to 90% over on-demand instances. Cluster autoscaling will dynamically increase or decrease the number of nodes in your Kubeflow cluster based on resource utilization. AWS has committed a number of improvements around the GPU autoscaling and Spot Instance user experience with Cluster Autoscaler.
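As a rough illustration of what the Spot discount means for a training job, the sketch below compares on-demand and Spot cost for the same node-hours. The hourly price, node count, and discount values are made-up inputs; actual Spot pricing varies by instance type, Region, and time, and 90% is an upper bound rather than a typical figure.

```python
def training_cost(hours: float, nodes: int, on_demand_hourly: float,
                  spot_discount: float = 0.0) -> float:
    """Estimate job cost; spot_discount is a fraction between 0.0 and 1.0."""
    if not 0.0 <= spot_discount <= 1.0:
        raise ValueError("spot_discount must be between 0.0 and 1.0")
    return hours * nodes * on_demand_hourly * (1.0 - spot_discount)


if __name__ == "__main__":
    # Hypothetical job: 4 nodes for 10 hours at $3.00 per node-hour.
    on_demand = training_cost(10, 4, 3.00)
    best_case_spot = training_cost(10, 4, 3.00, spot_discount=0.9)
    print(f"on-demand:      ${on_demand:.2f}")
    print(f"spot (90% off): ${best_case_spot:.2f}")
```

The estimate ignores Spot interruptions; a job that is interrupted and has to repeat work will recover less of the discount, which is one reason to checkpoint training regularly.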

AI and machine learning pipelines

Kubeflow Pipelines can automate complex AI and machine learning pipelines using custom components available for many AWS services, including Amazon Athena, Ground Truth, Amazon EMR, and Amazon SageMaker. For example, a typical pipeline might include data ingestion with Amazon Athena, data labeling with Ground Truth, feature engineering with Apache Spark on Amazon EMR, and model training/deploying with SageMaker.
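The example pipeline above is a dependency graph: each step runs only after the steps it depends on have finished. The sketch below models that idea with Python's standard-library `graphlib` (Python 3.9 or later). It does not use the Kubeflow Pipelines SDK or the AWS components, and the step names are hypothetical labels for the stages described above.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it starts.
PIPELINE = {
    "ingest_athena": set(),
    "label_ground_truth": {"ingest_athena"},
    "features_emr_spark": {"label_ground_truth"},
    "train_sagemaker": {"features_emr_spark"},
    "deploy_sagemaker": {"train_sagemaker"},
}


def execution_order(pipeline: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(PIPELINE), start=1):
        print(position, step)
```

A real pipeline definition expresses the same ordering by passing one component's outputs to the next component's inputs, and the pipeline engine derives the graph from those references.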

In June 2020, we open sourced SageMaker Components for Kubeflow Pipelines to help customers create best-of-breed AI/ML pipelines to train, tune, and deploy models with Kubeflow and Amazon SageMaker.

Storage and distributed file systems

Kubeflow builds upon Kubernetes to provide a solid infrastructure for large-scale, distributed data processing, including AI/ML model training and tuning. Because distributed processing often requires a distributed file system, AWS provides multiple high-performance, cloud-native, distributed file systems, including Amazon Elastic File System (Amazon EFS) and Amazon FSx for Lustre. These POSIX-compliant file systems are optimized for large-scale and compute-intensive workloads, including high-performance computing (HPC), AI, and machine learning. Kubeflow leverages the Kubernetes-native Container Storage Interface (CSI) drivers for Amazon EFS and Amazon FSx.
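With a CSI driver installed, training pods typically reach the shared file system through a PersistentVolumeClaim. The sketch below builds such a claim as a Python dictionary using the standard Kubernetes `v1` fields. The claim name, namespace, and storage class name `efs-sc` are assumptions; the storage class must already exist in the cluster and point at the EFS or FSx CSI driver.

```python
import json


def shared_volume_claim(name: str, namespace: str, storage_class: str,
                        size: str = "100Gi") -> dict:
    """Build a PersistentVolumeClaim that many pods can mount read-write."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteMany lets distributed training workers on different
            # nodes share the same data set and checkpoints.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All three names below are hypothetical.
    claim = shared_volume_claim("training-data", "alice", "efs-sc")
    print(json.dumps(claim, indent=2))
```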

When performing Kubeflow experiments and hyper-parameter tuning jobs, customers can now store metadata and artifacts directly into Amazon RDS and Amazon S3 object storage for better performance, scalability, and durability. Storing data in Amazon RDS and Amazon S3 adds stability to your Kubeflow cluster—even across Kubeflow version upgrades. You can find more information in the AWS Storage Options, Configuring RDS, and Using S3 for Pipeline Artifacts sections of the Kubeflow documentation.

What’s next

We have a full roadmap of improvements to the Kubeflow experience on AWS. Highlights planned for future releases include:

  • Provide simple Kubeflow installation and management using AWS CloudFormation.
  • Streamline the end-to-end experience for building, training, tuning, and deploying AI/ML models.
  • Integrate Feast feature store with Kubeflow on Amazon EKS.
  • Graduate MXNet Operator to 1.0 and add more production-grade features to the TensorFlow and PyTorch operators.
  • Build more data-processing components to integrate with additional AWS services.

Summary

In this post, we highlighted how AWS customers can use Kubeflow with native AWS-managed services for secure, scalable, and enterprise-ready AI/ML workloads. We encourage you to set up Kubeflow on Amazon EKS by following our EKS + Kubeflow workshop and sample notebooks and pipelines.

For more information on Kubeflow on AWS, check out the Kubeflow documentation. For more information on Kubeflow and Amazon SageMaker, review the SageMaker Components for Kubeflow Pipelines documentation. You can also find us on the Kubeflow #AWS Slack channel, and we welcome your feedback there because it helps us prioritize the next features to contribute to the Kubeflow project. Lastly, please join us for the monthly Kubeflow and AWS community events online.


