I am trying to version a dataset: I read a CSV file into a pandas DataFrame and then create a new version of an Azure ML Dataset. I am running the code below in an Azure CLI job within Azure DevOps.
    df = pd.read_csv(blob_sas_url)

At this line, I get a 404 error. Error message:

    urllib.error.HTTPError: HTTP Error 404: The specified resource does not exist
When I run the same code locally, I can read the CSV file into a DataFrame. The SAS URL and token have not expired either.
How can I solve this issue?
Edit - Code
    import argparse

    import pandas as pd
    from azureml.core import Run


    class Pipeline:
        def __init__(self, args):
            self.args = args
            self.run = Run.get_context()
            self.workspace = self.run.experiment.workspace

        def get_Dataframe(self):
            # Log the URL exactly as the script received it
            print(self.args.blob_sas_url)
            df = pd.read_csv(self.args.blob_sas_url)
            return df

        def create_pipeline(self):
            print("Creating Pipeline")
            print(self.args.blob_sas_url)
            # dataset_to_update() is defined in the part of the script not shown here
            dataframe = self.dataset_to_update()
            # Rest of Code


    if __name__ == "__main__":
        parser = argparse.ArgumentParser(description='Azure ML Dataset Versioning pipeline')
        parser.add_argument('--blob_sas_url', type=str, help='SAS URL to the Data File in Blob Storage')
        args = parser.parse_args()
        ds_versioner = Pipeline(args)
        ds_versioner.create_pipeline()
In both places where the script prints the SAS URL with print(self.args.blob_sas_url), the printed URL is shortened. I can see this in the std_log.txt file.
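A shortened URL in the log suggests the argument may be cut before it reaches argparse, for example at the first & of the query string. That is only one plausible cause, not a confirmed diagnosis. A minimal sketch of a check that could run before pd.read_csv is shown below. The helper name missing_sas_params and both URLs are hypothetical; the only assumption about the SAS format is that a token carries the sv and sig query parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Query parameters every SAS token is assumed to carry (version and signature).
REQUIRED_SAS_PARAMS = ("sv", "sig")


def missing_sas_params(url):
    """Return the required SAS query parameters that are absent from url."""
    query = parse_qs(urlsplit(url).query)
    return [name for name in REQUIRED_SAS_PARAMS if name not in query]


# Hypothetical URLs: cut_url is what remains if the argument stops at the first '&'.
full_url = (
    "https://account.blob.core.windows.net/container/data.csv"
    "?sv=2020-08-04&se=2030-01-01&sp=r&sig=abc123"
)
cut_url = full_url.split("&", 1)[0]

print(missing_sas_params(full_url))  # []
print(missing_sas_params(cut_url))   # ['sig']
```

If a check like this reports a missing sig inside the job but not locally, the URL is being altered on the way into the script rather than by pandas.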
Can you print blob_sas_url in your CLI job to see what's actually being parsed? With python yourscript.py --blob_sas_url $VARIABLE, how does the $VARIABLE come in?
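If the value is pasted into a shell command line as plain text, the & characters in a SAS URL need quoting to survive as a single argument. Here is a small sketch using the standard library's shlex, assuming a POSIX shell (shlex.quote does not apply to cmd.exe or PowerShell). The URL is hypothetical, and yourscript.py is the placeholder name from the comment above.

```python
import shlex

# Hypothetical SAS URL containing '&', which a shell treats as a control operator.
url = "https://account.blob.core.windows.net/container/data.csv?sv=2020-08-04&sig=abc123"

# shlex.quote wraps the value so a POSIX shell passes it on as one argument.
command = "python yourscript.py --blob_sas_url " + shlex.quote(url)
print(command)

# Splitting the command the way a POSIX shell would recovers the URL unchanged.
recovered = shlex.split(command)[-1]
print(recovered == url)
```

In a pipeline definition, the equivalent is usually to wrap the variable reference in quotes where the script is invoked.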