
I am trying to set up an AWS Lambda function that calls a Databricks notebook (in the event of an S3 trigger). I understand I have to use the Databricks Jobs API in my Lambda function (Python) code to make a POST request with the JSON payload of the runs-submit endpoint.

Although the documentation is not very clear, I was able to call a test script. On checking the response text, I see the HTML of the Databricks login page, which means the request is not being authenticated.

I have read about user tokens, but I am not sure how to incorporate them for authentication.

Any help making this work another way, or using the user tokens to authenticate so that the flow reaches the execution of the notebook rather than getting stopped at the login page, would be appreciated.

Thanks in advance.

Code Sample:

import requests
import json

job_payload = {
  "run_name": 'just_a_run',
  "existing_cluster_id": '****',
  "notebook_task": 
    {
      "notebook_path": 'https://databricks.cloud.company.com/****'
    }
}

resp = requests.post('https://databricks.cloud.company.com/2.0/jobs/runs/submit', json=job_payload)
print(resp.status_code)
print(resp.text)

200


<!DOCTYPE html>

<html>
<head>
    <meta charset="utf-8"/>
    <meta http-equiv="Content-Language" content="en"/>
    <title>Databricks - Sign In</title>
    <meta name="viewport" content="width=960">
    <link rel="stylesheet" href="/login/bootstrap.min.css">
    <link rel="icon" type="image/png" href="login/favicon.ico" />

    <meta http-equiv="content-type" content="text/html; charset=UTF8">
<link rel="shortcut icon" href="favicon.ico"><link href="login/login.e555bb48.css" rel="stylesheet"></head>
<body>
<div id="login-page"></div>
<script type="text/javascript" src="login/login.dabd48fd.js"></script></body>
</html>

1 Answer

SOLVED:

1) You will need to create a user token (personal access token) in Databricks for authorization and send it in the 'headers' parameter of the REST request.

2) headers={'Authorization': 'Bearer token'} — in place of token put the actual token you generate in Databricks.

3) The API path must start with /api.

4) The path to the Databricks notebook must be an absolute workspace path, i.e. "/Users/$USER_NAME/notebook_name", not a URL.

Final Working Code:

import requests

job_payload = {
    "run_name": "just_a_run",
    "existing_cluster_id": "id_of_cluster",
    "notebook_task": {
        # Absolute workspace path, not a URL
        "notebook_path": "/Users/username/notebook_name"
    }
}

# The URL must include /api, and the personal access token goes in the headers
resp = requests.post(
    'https://databricks.cloud.company.com/api/2.0/jobs/runs/submit',
    json=job_payload,
    headers={'Authorization': 'Bearer token'}
)
print(resp.status_code)
print(resp.text)
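Since the question is about running this from Lambda, a minimal handler might look like the sketch below. This is an assumption-laden illustration, not part of the accepted answer: the workspace URL, cluster ID, and notebook path are placeholders, and the token is read from a Lambda environment variable (`DATABRICKS_TOKEN`) rather than hard-coded.

```python
import os

import requests

# Placeholder workspace URL -- replace with your Databricks host
DATABRICKS_HOST = "https://databricks.cloud.company.com"


def build_submit_payload(run_name, cluster_id, notebook_path):
    """Build the runs/submit payload for a one-time notebook run."""
    return {
        "run_name": run_name,
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
    }


def lambda_handler(event, context):
    # Keep the token out of the code, e.g. in a Lambda environment variable
    token = os.environ["DATABRICKS_TOKEN"]
    payload = build_submit_payload(
        "s3_triggered_run",           # hypothetical run name
        "id_of_cluster",              # placeholder cluster ID
        "/Users/username/notebook_name",  # placeholder notebook path
    )
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/jobs/runs/submit",
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return {"statusCode": 200, "body": resp.text}
```

Reading the token from the environment also makes rotation easier than redeploying code with an embedded secret.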

3 Comments

I'm triggering the job using the run_now API and I'm getting a 200 response, but the job is not triggering. I tried it with my other account and it worked there, using username and password for authentication.
This solution works partially. It will create a new job, but won't run it. A subsequent run-job call with the appropriate job ID is required.
Needed some adjustments to your code, like the token part, and changed databricks.cloud.company.com to our Databricks server. Then it works! Thanks!
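Regarding the first two comments: a 200 response only means the API accepted the request, not that the notebook ran to completion. One way to confirm the run actually executed is to poll the runs/get endpoint with the run_id returned by runs/submit until the run reaches a terminal state. A sketch, assuming the same placeholder workspace URL and token as above:

```python
import time

import requests

HOST = "https://databricks.cloud.company.com"      # placeholder workspace URL
HEADERS = {"Authorization": "Bearer token"}         # placeholder token


def is_terminal(state):
    """A run is finished when its life_cycle_state is one of these values."""
    return state.get("life_cycle_state") in (
        "TERMINATED", "SKIPPED", "INTERNAL_ERROR",
    )


def wait_for_run(run_id, poll_seconds=15):
    """Poll jobs/runs/get until the run reaches a terminal state."""
    while True:
        resp = requests.get(
            f"{HOST}/api/2.0/jobs/runs/get",
            params={"run_id": run_id},
            headers=HEADERS,
            timeout=30,
        )
        resp.raise_for_status()
        state = resp.json()["state"]
        if is_terminal(state):
            # state also carries result_state, e.g. SUCCESS or FAILED
            return state
        time.sleep(poll_seconds)
```

Note that runs/submit creates and starts a one-time run directly; it is the jobs/create endpoint that only defines a job and needs a separate run-now call.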
