I've been trying to find a way to run a simple command against one of my existing Azure VMs using Azure Data Factory V2.

Options so far:

  • Custom Activity/Azure Batch won't let me add existing VMs to the pool
  • Azure Functions - I have not tried this yet, and I have not found any documentation on doing this with Azure Functions.
  • Azure Cloud Shell - I've tried this using the browser UI and it works; however, I cannot find a way of doing this via ADF V2

The use case is the following:

A few tasks currently run locally on an Azure VM through Task Scheduler, and I'd like to orchestrate them from ADF because everything else is already in ADF. These tasks are usually Python applications that restore a SQL backup and/or purge some folders.

e.g. sqldb-restore -r myDatabase

where sqldb-restore is a command that is recognized locally after installing my local Python library. Unfortunately, the Python app needs to live locally on the VM.

Any suggestions? Thanks.

2 Answers

Thanks to @martin-esteban-zurita: his answer helped me get to what I needed, and this was a beautiful and fun experiment.

It is important to understand that Azure Automation is used for many things related to resource orchestration in Azure (VMs, services, DevOps). This automation can be written in PowerShell and/or Python.

In this particular case I did not need to modify, maintain, or orchestrate any Azure resource. I needed to run a Bash/PowerShell command on one of my existing VMs, where multiple PowerShell/Bash commands run on a schedule in Task Scheduler. Task Scheduler was adding unnecessary overhead to my data pipelines because it could not talk to ADF.

In addition, Azure Automation natively runs PowerShell/Python runbooks only in an Azure-hosted sandbox. That is very useful for orchestrating resources (turning Azure VMs on/off, adding/removing permissions on other Azure services, running maintenance or purge processes, etc.), but it still left me unable to run commands locally on an existing VM. This is where the Hybrid Runbook Worker came into the picture. A Hybrid worker group

These are the steps to accomplish this use case.

1. Create an Azure Automation Account

2. Install the Windows Hybrid Runbook Worker on my existing VM. In my case this was tricky because my proxy was giving me errors, so I ended up downloading the NuGet package and installing it manually.

.\New-OnPremiseHybridWorker.ps1 -AutomationAccountName <NameofAutomationAccount> -AAResourceGroupName <NameofResourceGroup> `
-OMSResourceGroupName <NameofOResourceGroup> -HybridGroupName <NameofHRWGroup> `
-SubscriptionId <AzureSubscriptionId> -WorkspaceName <NameOfLogAnalyticsWorkspace>

Keep in mind that you will need to look up your own values for the parameters in the command above. The only parameter that does not refer to something that already exists is HybridGroupName: it defines the name of the Hybrid Worker group, which the script creates.

3. Create a PowerShell Runbook

[CmdletBinding()]
Param
(
    # This parameter must be named WebhookData, otherwise the webhook does not work as expected.
    [object]$WebhookData
)
$VerbosePreference = 'Continue'

#region Verify that the runbook was started from a webhook.

# If the runbook was called from a webhook, WebhookData will not be null.
if ($WebhookData) {

    # Collect properties of WebhookData
    $WebhookName = $WebhookData.WebhookName
    $WebhookBody = $WebhookData.RequestBody

    # Convert the JSON request body into an object.
    # $Input is an automatic variable in PowerShell, so a different name is used here.
    $RequestData = ConvertFrom-Json -InputObject $WebhookBody
    Write-Verbose "Runbook started from webhook $WebhookName"
}
else {
    Write-Error -Message 'Runbook was not started from a webhook' -ErrorAction Stop
}
#endregion

# This is where I run the commands that were in Task Scheduler

# ADF adds callBackUri to the request body. Posting to it is extremely important for ADF:
# it tells the webhook activity that the runbook has finished.
# -UseBasicParsing avoids the Internet Explorer dependency in Windows PowerShell 5.1.
$callBackUri = $RequestData.callBackUri
Invoke-WebRequest -Uri $callBackUri -Method POST -UseBasicParsing
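
Since the scheduled tasks in the question are Python applications, the "run the commands" step can also be sketched in Python. This is a minimal sketch under stated assumptions, not the code used above: it assumes the task is a console command that signals failure through a non-zero exit code, and the helper name run_task is hypothetical. The StatusCode/Output/Error shape of the payload follows what the ADF Webhook activity documentation describes for reporting status on callback; verify the field names there before relying on them.

```python
import subprocess
import sys


def run_task(command):
    """Run a local command and describe the result as a callback payload.

    `command` is a list such as ["sqldb-restore", "-r", "myDatabase"].
    The payload shape (StatusCode / Output / Error) is an assumption based on
    the ADF Webhook activity's "report status on callback" option.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    payload = {
        "StatusCode": "200" if result.returncode == 0 else "500",
        "Output": {"exitCode": result.returncode, "stdout": result.stdout.strip()},
    }
    if result.returncode != 0:
        # Only include Error when the task failed, so ADF can fail the activity.
        payload["Error"] = {
            "ErrorCode": "TaskFailed",  # hypothetical error code
            "Message": result.stderr.strip() or f"exit code {result.returncode}",
        }
    return payload


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace with the real task.
    print(run_task([sys.executable, "-c", "print('restored')"]))
```

The resulting dictionary would be serialized to JSON and sent as the body of the POST to callBackUri, so that a failed task fails the ADF activity instead of only completing it.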

4. Create a Runbook Webhook pointing to the Hybrid Worker's VM

5. Create a webhook activity in ADF that calls the above PowerShell runbook via the POST method

Important Note: When I first created the webhook activity, it was timing out after 10 minutes (the default). I then noticed in the Azure Automation account that the runbook was receiving input data (WEBHOOKDATA) containing a JSON structure with the following elements:

  • WebhookName
  • RequestBody (This one contains whatever you add in the Body plus a default element called callBackUri)

All I had to do was invoke the callBackUri from Azure Automation, which is why I added Invoke-WebRequest -Uri $callBackUri -Method POST to the PowerShell runbook code. With this, the ADF activity succeeds or fails instead of timing out.
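
To make the callback handshake concrete, here is a minimal Python sketch of the same two steps: pulling callBackUri out of the request body and preparing the POST that completes the activity. The function names and the example.invalid URL are hypothetical, and the sample body assumes one user-defined field (database) next to the callBackUri element described above. The request is only built, not sent, so the sketch runs without network access.

```python
import json
import urllib.request


def extract_callback_uri(request_body):
    """Return the callBackUri element from the webhook request body (a JSON string)."""
    body = json.loads(request_body)
    uri = body.get("callBackUri")
    if not uri:
        raise ValueError("request body has no callBackUri; was this called from ADF?")
    return uri


def build_callback_request(callback_uri, payload=None):
    """Build (but do not send) the POST request that completes the activity."""
    data = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        callback_uri,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Example body: one user-defined field plus the callBackUri element.
    sample = '{"database": "myDatabase", "callBackUri": "https://example.invalid/callback"}'
    request = build_callback_request(extract_callback_uri(sample))
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would actually send the callback.
```

Raising an error when callBackUri is missing mirrors the runbook's check that it was started from a webhook: without that URI there is nothing to call back, and the ADF activity would run into its timeout.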

There are many other details that I struggled with when installing the hybrid worker in my VM but those are more specific to your environment/company.

1 Comment

Awesome tutorial saul!

This looks like a use case that is supported with Azure Automation, using a hybrid worker. Try reading here: https://learn.microsoft.com/en-us/azure/automation/automation-hybrid-runbook-worker

You can call runbooks with webhooks in ADFv2, using the web activity.

Hope this helped!

3 Comments

Thanks Martin, will give it a try, I've read about it before but I did not know about the webhooks in ADF V2, will try and get back to you.
I've only used it with PowerShell, but you should be able to create a runbook script that calls Python with PowerShell, create a webhook for this runbook, and call it with ADF. Don't be afraid to ask if you need more help :)
Thanks a lot, after a few hours understanding how to use a hybrid worker and runbooks I finally was able to reproduce the use case. I will provide the full answer in a separate Answer on this question, however your suggestion deserves to get the accepted answer. :)
