3,042 questions
Best practices
0
votes
0
replies
11
views
AWS sagemaker Model Registry Vs. MLflow
In an AWS environment, which do you suggest: SageMaker Model Registry or MLflow?
If we want to use grid search to experiment with different configurations and log all experiments to then decide which ...
-1
votes
0
answers
14
views
How to deploy multiple containers to an endpoint on AWS SageMaker?
How to develop a deployment pipeline that deploys multiple containers to an endpoint (MCE) on AWS Sagemaker?
I need Python code or steps to set up in the AWS console.
-4
votes
1
answer
62
views
AWS Lambda Python script calling SageMaker: AlgorithmError: Framework Error
I am using lambda and Python and S3.
# lambda_bootstrap_train.py
import boto3
import time
import json
import os
sm = boto3.client('sagemaker', region_name='us-east-2')
s3 = boto3.client('s3', ...
1
vote
0
answers
59
views
Sagemaker Unified Studio overriding delta lake configuration to iceberg on EMR
I am connecting to an EMR cluster through SageMaker Unified Studio (JupyterLab).
My EMR cluster is configured with Delta Lake support, and I have the following Spark properties set on the cluster:
...
0
votes
1
answer
48
views
Restricting IAM user from accessing/downloading my .py files in AWS sagemaker
I want to share my Jupyter notebook publicly in AWS SageMaker. But when an IAM user logs in, all the files are listed in the file explorer and the user is allowed to download myfunctions.py files where all ...
0
votes
0
answers
34
views
AWS Sagemaker invoke endpoint error "Could not find variable lstm_model/dense/bias"
I trained a model and deployed it as an endpoint using AWS SageMaker, and when I tried to invoke it I got this error:
2025-09-09 14:58:25.724914: I external/org_tensorflow/tensorflow/core/framework/...
0
votes
0
answers
59
views
SageMaker PyTorch MME ignores entry_point and falls back to default handler, causing ModelLoadError
I'm trying to deploy a custom PyTorch model to a SageMaker Multi-Model Endpoint (MME). My model is saved as a state_dict using torch.save(), so it requires a custom inference.py script to load the ...
2
votes
0
answers
152
views
Unable to connect to EMR cluster from SageMaker Unified Studio using runtime role – credentials are null
I'm trying to connect to an existing EMR cluster from SageMaker Unified Studio to run SQL queries via JupyterLab.
SageMaker requires that the EMR cluster be runtime role-enabled to integrate with ...
0
votes
1
answer
51
views
JupyterLab SnowFlake External OAuth EntraID Client how to use it
I looked into "Request an access token with a client_secret" and "Connecting with OAuth" but cannot find details on how to programmatically pass the token into:
ctx = snowflake.connector.connect(
...
0
votes
2
answers
222
views
AWS SageMaker - Custom Inference With HuggingFace Model
For context, I'm currently working in a JupyterLab space in SageMaker studio.
My goal is to deploy a HuggingFace Llama model for batch transform inference. The data I will be passing in to the LLM is ...
0
votes
0
answers
36
views
Error Invoking Sagemaker Endpoint from API Gateway REST API Integration
I am having difficulty invoking a sagemaker endpoint from API gateway. The API gateway API is a REST API (POST method type) with Sagemaker Runtime as the Integration type. The HTTP method in the ...
0
votes
1
answer
55
views
S3 Torchconnector loads data as a list of tensors
I'm setting up a model and about to start training a dataset from an S3 bucket. To load the data from S3 I'm using s3torchconnector.S3MapDataset.from_prefix which loads the data into the Sagemaker ...
0
votes
1
answer
77
views
How to keep the same version number in AWS SageMaker ModelPackageGroup when updating model with evaluation metrics?
I’m working with AWS SageMaker Model Registry and have a training pipeline that creates and registers a new model package in a ModelPackageGroupName. After that, I have a separate evaluation pipeline ...
0
votes
2
answers
67
views
Loading and training data on Sagemaker with a Self-supervised pre-training model
I'm testing a self-supervised pre-training model; regardless of the model, I get the same error. My environment is JupyterLab on AWS SageMaker. The problem comes when, using args and trying ...
0
votes
1
answer
66
views
Using Torchvision ImageFolder on AWS S3
I'm working with an AWS S3 instance and trying to deploy an SSL model, loading a dataset from a bucket list I have defined on S3. The DL framework I'm using is PyTorch and more concretely to load the ...
0
votes
1
answer
47
views
AWS CDK SageMaker Pipeline Lambda Step
I've been trying to deploy an AWS CDK stack that builds a SageMaker Pipeline with a Lambda step; however, I keep getting "Invalid request provided: Step[xyz]: Lambda function ARN cannot be null.
No ...
0
votes
0
answers
30
views
SageMaker AutoML Model - Neo Compilation
I have trained a binary image classification model using AutoML as the predictions the AutoML training provides are better than the model I trained in SageMaker using MXNET and image-classification ...
0
votes
0
answers
77
views
SageMaker Real-Time Endpoint Timeout Issues with Lambda for Parallel Data Processing
I’m new to AWS and struggling with an architecture involving AWS Lambda and a SageMaker real-time endpoint. I’m trying to process large batches of data rows efficiently, but I’m running into timeout ...
0
votes
1
answer
42
views
tar.gz file in s3 bucket is not loaded in sagemaker notebook
I'm just testing AWS tools to serve using sagemaker. My current progress is that I've already created a trained model, and compressed pth, inference.py, requirement.txt, and other necessary files into ...
0
votes
1
answer
141
views
How to properly connect AWS SageMaker studio instance to OpenSearch Database
I am trying to connect an OpenSearch domain to the code editor running in SageMaker AI. I have created the OpenSearch instance in the same VPC and subnet as the SageMaker domain. Next, I have added ...
0
votes
0
answers
43
views
AWS - Pyspark - Spark version 3.4.1 - Input byte array has wrong 4-byte ending unit
I'm trying to execute a base64 pyspark sql function on AWS Sagemaker where the Spark version is 3.4.1:
def transform(df: DataFrame, colB64Name: str) -> DataFrame:
colUnderscore = df.select(&...
1
vote
1
answer
123
views
Is it possible to register a model in MLflow model registry without pickling the model?
We are using MLflow model registry, but we don't need the load_model / save_model functionality. Rather, we just want to be able to do something like this:
run_id = client.get_model_version_by_alias(...
2
votes
1
answer
158
views
How can I pass environment variables to a custom training script in Amazon SageMaker using the Python SDK?
I'm training a custom model using a script in Amazon SageMaker and launching the job with the Python SDK. I want to pass some environment variables (like API keys or config flags) to the training job ...
0
votes
2
answers
272
views
Why is AWS SageMaker not submitting a training job when I create an estimator object?
I'm trying to submit a training job to AWS SageMaker AI API via GitHub using the HuggingFace estimator. However, every time I try to create the job, the estimator creates an object of type None and ...
0
votes
1
answer
66
views
AWS SageMaker - What is Channel?
What exactly is a channel in AWS SageMaker?
SageMaker Documentation - Channel
A channel is a named input source that training algorithms can consume.
Input Data Configuration
For example, suppose ...
0
votes
1
answer
322
views
Create a public link for a Streamlit app in Sagemaker Studio
I built a chatbot app using Streamlit in AWS Sagemaker Studio. The app is working using the provided link. However, I need to share the app with others. How can I have a public link to this app?
1
vote
0
answers
121
views
Cannot register and deploy scikit-learn linear regression model on AWS Sagemaker AI
We are evaluating AWS Sagemaker AI pipeline for MLOps, using the step decorator approach as introduced here, and mostly following the example here.
Since we are just starting out we'd prefer to have ...
2
votes
1
answer
103
views
Deploy TPU TF Serving Model to AWS SageMaker
I have a couple of pre-trained and tested TensorFlow LSTM models, which have been trained on Google Colab. I want to deploy these models with AWS as our entire application is deployed there.
I've ...
0
votes
0
answers
43
views
Possibility to enable multi model endpoint with DeepAR on SageMaker
I am working with SageMaker and have deployed separate real-time endpoints and models for DEV and PROD.
To save costs, I am trying to put both models behind the same endpoint.
While this worked for ...
0
votes
0
answers
29
views
I am trying to use SageMaker with a deep learning model. Everything works fine except the training part. Nothing is running in training jobs
The training job keeps running indefinitely and never terminates. Additionally, the SageMaker "Training jobs" section does not show any active or ongoing training. My IAM role has full ...
0
votes
0
answers
118
views
Blank plot using plotly dash in jupyterlab (AWS Sagemaker)
When I run dash app using the code below in jupyterlab (AWS Sagemaker), the plot is blank.
from dash import Dash, html, dcc, Output, Input, callback
import plotly.express as px
import pandas as pd
...
1
vote
0
answers
40
views
Fixing Python import paths for a package in src used both from entry points outside of src and as a dependency in other packages?
I have a model package that runs in SageMaker. Its structure looks something like this (domain-specific stuff redacted):
<My project root>
--> src
|--> potato
|---->...
1
vote
0
answers
35
views
increase the time resolution of cloudwatch in sagemaker
I am trying to increase the time resolution of cloudwatch in the sagemaker script I am using to train a model. I am trying to get it under a minute. I know that the high resolution for cloudwatch is a ...
0
votes
0
answers
32
views
list the files in sagemaker estimator or processor
Is there a way to have the SageMaker estimator or processor print out a list of files that it has in a certain directory without introducing custom scripts to them? Or if a custom script has to be ...
0
votes
1
answer
172
views
"You are not authorized to use the Amazon SageMaker project templates" error despite having necessary permissions
I am trying to create a SageMaker Project in SageMaker Studio, but I keep getting the following error:
You are not authorized to use the Amazon SageMaker project templates.
Please contact your ...
0
votes
0
answers
84
views
filter amazon jumpstart models
I found this block of code that allows me to list all of the available JumpStart models in AWS. I wanted to find a list of the keywords that I could filter by and the values. Does this exist ...
0
votes
0
answers
40
views
Is it possible to output a Word document from a SageMaker pipeline processing step?
I have a SageMaker pipeline where I output a summary of clustering data as a chart in a CSV file. I am trying to update my pipeline to export the chart in a Word document instead.
Using the python ...
1
vote
1
answer
110
views
Working of custom docker image in AWS Sagemaker
I am new to using the AWS SageMaker services as well as Docker. I have a few questions about designing the architecture to deploy my setup from a test AWS account to production.
I have a single domain ...
1
vote
0
answers
34
views
Training script unable to load preprocessing model
I am new to SageMaker. I am trying to create an inference pipeline, and for that I am creating two models: one for preprocessing and another for training. I am using SKLearn to create both of those ...
0
votes
1
answer
91
views
deploying a Tensorflow model artifact to Sagemaker
I am trying to deploy a TensorFlow model to a Sagemaker endpoint. I have the model artifact at generic_graph.pb and the model's labels at labels.txt.
I started by creating a tar file with the ...
0
votes
0
answers
40
views
Training a SageMaker KMeans Model with Pipe Mode Results in InternalServerError
I am trying to train a SageMaker built-in KMeans model on data stored in RecordIO-Protobuf format, using the Pipe input mode. However, the training job fails with the following error:
...
0
votes
1
answer
67
views
How to deploy a model trained in AWS CANVAS using serverless inference?
As far as I know, AWS Sagemaker supports two approaches for the model inference: Provisioned and Serverless. This is specified in the endpoint configuration. My question is that when I train a model ...
0
votes
0
answers
82
views
How to format requirements.txt files for processing step in sagemaker pipelines?
I am trying to install a package in a processing job in my sagemaker pipeline through a requirements.txt file. I am able to create the .txt file, compress it to a tar.gz along with the processing ...
0
votes
0
answers
41
views
Is Sagemaker built-in LightGBM compatible with join_source="Input"
When doing a batch transform job for inference with the following code
transformer = model_package.transformer(
instance_count=1,
instance_type=inference_instance_type,
output_path=...
0
votes
0
answers
70
views
How to solve the module not found error for python packages in PySpark in AWS Sagemaker?
I have a business requirement to use Spark with SageMaker Processing. I need to run distributed code using pandas, numpy, gensim and sklearn. I generated a .zip file of all the packages installed ...
0
votes
0
answers
79
views
How to find the arn on the console
I have an ARN: arn:aws:sagemaker:us-east-2:XXXXXXX:automl-job/auto-insurence-test12345
It was created using the Lambda function below:
import json
import boto3
def lambda_handler(event, context):
# ...
1
vote
0
answers
72
views
My Sagemaker Asynchronous Endpoint goes down to zero instances after 10 minutes
I'm creating an Asynchronous Endpoint for a model that takes up to 40 minutes to process the input. I want to configure it so it can scale in to zero instances, and once it receives an input, ...
0
votes
0
answers
49
views
How to write custom inference script when deploying TF2 models in sagemaker
I have a model trained in TF2 and was trying to deploy it using TensorFlowModel using sagemaker. When using the model, I also need to load other utilities, such as json to map inputs and a tokenizer (...
0
votes
0
answers
46
views
Automation of Python Scripts in AWS Sagemaker
I have built pipelines to do ETL and run a DeepAR model in SageMaker, but I realised that automating the process is very complicated for data science projects.
I have looked into StepFunction as well but ...
1
vote
1
answer
240
views
How to run a sagemaker pipeline with multiple data inputs in the preprocessing script?
I am trying to run a sagemaker pipeline with multiple data inputs in my data preprocessing script. When I run the pipeline with just one data input it runs, but when I introduce a second data input ...