I have a Lambda function that queries a Redshift database. The goal is to save the query output as a DataFrame and email the results to multiple recipients when the output meets a certain condition. I connected an SNS topic to the function and attached the necessary policies to it.

Here is the last half of the function as the first half is just credentials and the query itself:

con = psycopg2.connect(conn_string)    
filename = '/tmp/Processlist.csv'
with con.cursor() as cur:
    # Enter the query that you want to execute
    cur.execute(sql_query)
    for row in cur:
      df = pd.DataFrame.from_records(cur.fetchall(), columns = [desc[0] for desc in cur.description])
      df['Time_Stamp'] = pd.to_datetime('now')
      if df['ca_active_hosts'] > 0:
        client2 = boto3.client('sns')
        response = client2.publish(
        TopicArn = 'arn:aws:sns:us-west-1:151316834390:email-data-lake',
        Message = 'Warning User has ' + df['ca_active_hosts'] + 'at ' + df['Time_Stamp'],
        Subject = 'User Warning'
      )

The error I get after running is this:

Response
{
  "errorMessage": "The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().",
  "errorType": "ValueError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 175, in lambda_handler\n    if df['ca_active_hosts'] > 0:\n",
    "  File \"/opt/python/pandas/core/generic.py\", line 1527, in __nonzero__\n    raise ValueError(\n"
  ]
}

Do I need to convert the df['ca_active_hosts'] field to numeric or another type? Not sure how to resolve this.

Thanks for any help!

1 Answer

The problem is here:

df['ca_active_hosts'] > 0

Probably, you have something like:

df['ca_active_hosts'] = [1, 18, 7, -1, 0, ...]

Which means the result of df['ca_active_hosts'] > 0 is

[True, True, True, False, False, ...]

Is that sequence True or False? It's ambiguous, which is why pandas raises the ValueError. To get one single boolean value, specify whether all of the elements must be True or whether any one of them is enough.
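As a minimal sketch of the point above, the following reproduces the error on made-up data and then reduces the comparison to a single boolean. The sample values are assumptions for illustration only; they are not the asker's real query output.

```python
import pandas as pd

# Hypothetical sample data standing in for the query result.
df = pd.DataFrame({"ca_active_hosts": [1, 18, 7, -1, 0]})

# The comparison is element-wise, so the result is a boolean Series.
mask = df["ca_active_hosts"] > 0

# Using the Series directly in an if statement raises ValueError.
error_text = None
try:
    if mask:
        pass
except ValueError as exc:
    error_text = str(exc)

# Reduce the Series to one boolean before branching on it.
any_positive = bool(mask.any())  # True: at least one value is > 0
all_positive = bool(mask.all())  # False: -1 and 0 are not > 0

print(error_text)
print(any_positive, all_positive)
```

Which reduction to use depends on the intent: any() fits "warn if at least one row trips the condition", while all() fits "warn only if every row does".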

2 Comments

You're absolutely right, as those values correspond to different names. Is there a way to implement that 'all' check in the "if" line, like df['ca_active_hosts'].all() > 0?
@Dinho For all, you can use either (df['ca_active_hosts'] > 0).all() or df['ca_active_hosts'].gt(0).all(), whichever you find more readable. You can also use Python's built-in all function: all(df['ca_active_hosts'] > 0).
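A short sketch of the forms mentioned in this comment, on made-up data. It also shows why the order matters: df['ca_active_hosts'].all() > 0 reduces first and compares afterwards, so it only tests that no value is zero. The build_warning helper is hypothetical and not part of the original function; it assumes that, because the rows belong to different names, one warning line per offending row is wanted. The SNS publish call is deliberately left out.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from the Redshift query.
sample = pd.DataFrame({"ca_active_hosts": [0, 3, -1]})
sample["Time_Stamp"] = pd.Timestamp("2024-01-01 00:00:00")

s = sample["ca_active_hosts"]

# Three equivalent ways to ask "is every value greater than zero?"
forms = [(s > 0).all(), s.gt(0).all(), all(s > 0)]

# Reducing before comparing tests something else: .all() only checks
# that no value is zero, so a negative value slips through.
naive_check = bool(pd.Series([1, -1]).all() > 0)         # True
correct_check = bool((pd.Series([1, -1]) > 0).all())     # False


def build_warning(df):
    """Hypothetical helper: one warning line per row with active hosts."""
    flagged = df[df["ca_active_hosts"].gt(0)]
    if flagged.empty:
        return None  # nothing to report, so no message should be sent
    lines = [
        f"Warning: user has {row.ca_active_hosts} active hosts at {row.Time_Stamp}"
        for row in flagged.itertuples(index=False)
    ]
    return "\n".join(lines)


message = build_warning(sample)
print(message)
```

The returned string could then be passed as the Message argument of the existing publish call, and the call skipped when the helper returns None.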