I'm new to AWS and Lambda and would like an uploaded messy file to trigger its conversion to JSON using Python and pandas. The file has no file extension, but it can be read in Notepad. I have the Python script and it's working properly, but now I'm struggling with how to incorporate it into AWS. Do I transform the file within Lambda itself, or do I use another service?
I'd like to be able to run my Python script and clean up the data within Lambda or another service (whichever is easiest).
Here's a copy of what the file looks like (columns are all over the place and it has no headers):
option19971675181 ACHILLE BLA BLA BLA1 randomblablalba blabla 88 498
option19971675182 ACHILLE BLA BLA BLA 1 blabla 176498
option19971675183 ACHILLE BLA BLA BLA1 blabla 191 498
option19971675184 ACHILLE BLA BLA BLA1 randomblablalba blabla 521 498
option19971675185 ACHILLE BLA BLA BLA1 blabla 919 498
option19971675186 ACHILLEBLABLABLA134234531 randomblablalba blabla 10 498
option19971675187 ACHILLEBLABLABLA134234531 7 65 blabla 0 176498
option19971675188 ACHILLE BLA BLA BLA1342 90345 31 blabla 1764980
option19971675189 ACHILLEBLABLABLA13423N09487OP531 randomblablalba blabla 1764980
option19971675190 ACHILLE BLA BLA BLA 134 23N 094 87 OP53 blabla 0 0
In Lambda I have the following. (I've also added a layer so that AWS Lambda can import pandas. I've tested this with dummy data and it's working :) )
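import json
import io
from urllib.parse import unquote_plus

import boto3
import pandas as pd

# Create the client once, outside the handler, so warm invocations reuse it
s3_client = boto3.client('s3')

def lambda_handler(event, context):
    print(event)
    # Bucket and key of the object that triggered this invocation
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Object keys in S3 event notifications are URL-encoded
    key = unquote_plus(event['Records'][0]['s3']['object']['key'])
    response = s3_client.get_object(Bucket=bucket, Key=key)
    data = response['Body'].read().decode('utf-8')
    buf = io.StringIO(data)
    fileRow = buf.readline()
    # continue python script to extract the data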
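A minimal sketch of how the rest could look if the transformation stays inside Lambda: read the object, convert the text, and write the result back to S3 with `put_object`. The names `to_json_text`, `convert_object`, and the `out_bucket` argument are hypothetical, and `to_json_text` is only a placeholder standing in for the real pandas cleanup. The sketch assumes the output goes to a different bucket (or at least a prefix the trigger does not watch), because writing into the location that fires the trigger would invoke the function again.

```python
import json


def to_json_text(text):
    """Placeholder for the real cleanup: one JSON object per non-empty line."""
    rows = [{"line": line.strip()} for line in text.splitlines() if line.strip()]
    return json.dumps(rows)


def convert_object(s3_client, bucket, key, out_bucket):
    """Read a text object, convert it, and store the result as <key>.json."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response["Body"].read().decode("utf-8")
    out_key = key + ".json"  # hypothetical naming scheme for the output object
    s3_client.put_object(
        Bucket=out_bucket,
        Key=out_key,
        Body=to_json_text(text).encode("utf-8"),
        ContentType="application/json",
    )
    return out_key
```

Inside the handler this would be called with the boto3 client and the bucket and key taken from the event, for example `convert_object(s3_client, bucket, key, "my-clean-bucket")`, where the output bucket name is made up. The function's execution role would also need permission to read the source object and write to the output bucket.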
I'd like my data to be in JSON format.
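To make the target concrete, here is a rough illustration of the kind of JSON meant, using one line from the sample above. The splitting rule (first token is an id, trailing all-digit tokens are numbers, everything in between is free text) and the field names `id`, `text`, and `numbers` are assumptions made up for this example, not the actual cleanup logic. It also assumes each line has at least one token.

```python
import json


def line_to_record(line):
    """Split one raw line into an id, a free-text part, and trailing numbers."""
    tokens = line.split()
    record_id, rest = tokens[0], tokens[1:]
    numbers = []
    # Peel purely numeric tokens off the end of the line
    while rest and rest[-1].isdigit():
        numbers.insert(0, int(rest.pop()))
    return {"id": record_id, "text": " ".join(rest), "numbers": numbers}


sample = "option19971675181 ACHILLE BLA BLA BLA1 randomblablalba blabla 88 498"
print(json.dumps([line_to_record(sample)], indent=2))
```

The output is a JSON array with one object per input line. With the cleaned data in a pandas DataFrame, `df.to_json(orient="records")` gives the same array-of-objects layout.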