Parallel ForEach Iteration for Execute Pipeline activity - azure-data-factory-2

I want to execute a pipeline in parallel via ForEach Activity.
Below is the sample code for parent and child pipeline.
{
"name": "Parent",
"properties": {
"activities": [
{
"name": "ForEach1",
"type": "ForEach",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"items": {
"value": "@pipeline().parameters.Test",
"type": "Expression"
},
"isSequential": false,
"activities": [
{
"name": "Execute Pipeline1",
"type": "ExecutePipeline",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"pipeline": {
"referenceName": "Child",
"type": "PipelineReference"
},
"waitOnCompletion": true
}
}
]
}
}
],
"concurrency": 10,
"parameters": {
"Test": {
"type": "array",
"defaultValue": [
1,
2
]
}
},
"annotations": []
}
}
{
"name": "Child",
"properties": {
"activities": [
{
"name": "Wait1",
"type": "Wait",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"waitTimeInSeconds": 120
}
}
],
"concurrency": 10,
"annotations": []
}
}
After executing the pipeline, the ForEach iterations run sequentially rather than in parallel, so the child pipeline also executes sequentially.
Is there a configuration change I am missing that is causing the child pipeline to run sequentially?

My assumption is that you are executing the pipeline in debug mode.
When I ran the same code via a trigger, it behaved as expected: the Execute Pipeline activities ran in parallel.
Note: I updated the parameter to 7 iterations, which is why more child pipeline runs appear.

Related

Run dynamic stored procedure from azure logic app

I want to run multiple stored procedures from logic app for Azure SQL database. I want names of the stored procedure to be calculated based on a variable name.
I have a variable with values (API_test1_SP1, API_test2_SP1, API_test3_SP1).
In a for loop, I want to run these stored procedures API_test1, API_test2 and API_test3.
I want to remove _SP1 from the variable values and run the resulting stored procedures (API_test1, API_test2, API_test3) against the Azure SQL database.
I tried the following expression without luck:
@{concat(API_,slice(@{variables('variable_name')},1,lastIndexOf('_')))}
Is it possible to run stored procedure like this in logic app?
You can use the below expression to achieve your requirement.
first(split(variables('Array')?[iterationIndexes('Until')],'_SP1'))
I tested this with an Until loop in my logic app.
Below is the code view of that logic app:
{
"definition": {
"$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
"actions": {
"Initialize_variable_-_array": {
"inputs": {
"variables": [
{
"name": "Array",
"type": "array",
"value": [
"API_test1_SP1",
"API_test2_SP1",
"API_test3_SP1"
]
}
]
},
"runAfter": {},
"type": "InitializeVariable"
},
"Initialize_variable_-_loop": {
"inputs": {
"variables": [
{
"name": "loop",
"type": "integer",
"value": 0
}
]
},
"runAfter": {
"Initialize_variable_-_array": [
"Succeeded"
]
},
"type": "InitializeVariable"
},
"Until": {
"actions": {
"Compose": {
"inputs": "@first(split(variables('Array')?[iterationIndexes('Until')],'_SP1'))",
"runAfter": {},
"type": "Compose"
},
"Increment_variable": {
"inputs": {
"name": "loop",
"value": 1
},
"runAfter": {
"Compose": [
"Succeeded"
]
},
"type": "IncrementVariable"
}
},
"expression": "@equals(variables('loop'), length(variables('Array')))",
"limit": {
"count": 60,
"timeout": "PT1H"
},
"runAfter": {
"Initialize_variable_-_loop": [
"Succeeded"
]
},
"type": "Until"
}
},
"contentVersion": "1.0.0.0",
"outputs": {},
"parameters": {},
"triggers": {
"manual": {
"inputs": {
"schema": {}
},
"kind": "Http",
"type": "Request"
}
}
},
"parameters": {}
}
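To sanity-check what the first(split(...)) expression in the Compose action returns for each array item, the same logic can be mirrored outside Logic Apps. This is only a sketch in Python, not Logic Apps code; the function name strip_sp_suffix is hypothetical and exists purely for illustration.

```python
def strip_sp_suffix(name: str, suffix: str = "_SP1") -> str:
    """Mirror first(split(name, suffix)): keep the text before the first suffix."""
    # split() returns the whole string as the only element when the suffix
    # is absent, so taking element 0 is safe in both cases.
    return name.split(suffix)[0]


# Same values as the "Array" variable in the logic app above.
names = ["API_test1_SP1", "API_test2_SP1", "API_test3_SP1"]
procedures = [strip_sp_suffix(n) for n in names]
print(procedures)  # ['API_test1', 'API_test2', 'API_test3']
```

Each resulting value is the stored procedure name to use for that iteration.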

Function App with VNet Integration Failing Deployment When Setting WEBSITE_CONTENTAZUREFILECONNECTIONSTRING to Storage Behind Firewall

The following ARM template deploys: Virtual Network, Network Security Group, Storage Account, App Service Plan, Function App
When the settings for WEBSITE_CONTENTAZUREFILECONNECTIONSTRING and WEBSITE_CONTENTSHARE are omitted (commented out) the deployment succeeds but the function app configuration shows a warning.
When enabling the two settings, the deployment fails with a 403 Forbidden message.
New-AzResourceGroupDeployment : 17:04:05 - The deployment '20201209-170356' failed with error(s). Showing 1 out of 1 error(s).
Status Message: There was a conflict. The remote server returned an error: (403) Forbidden. (Code: BadRequest)
- There was a conflict. The remote server returned an error: (403) Forbidden. (Code:)
- (Code:BadRequest)
- (Code:)
CorrelationId: ec11767b-9f8f-4722-acca-e751e5c1bbe8
I have tried numerous settings on the NSG, adding service tags, allowing IPs associated with the function app. I have also tried allowing IPRules on the storage account firewall. The only setting that worked was to entirely disable the storage account firewall with 'Allow access from all networks', which is not an acceptable setting for the network.
The ARM template to demonstrate the error:
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
},
"variables": {
"vnetName": "vnet1a",
"addressPrefixVnet": "10.17.0.0/20",
"addressPrefixSubnet": "10.17.4.0/24",
"nsgName_sb_functionapp": "[concat(variables('vnetName'), '-sb-functionapp-nsg')]",
"storageAccountName": "[concat(uniquestring(resourceGroup().id), 'sa1a')]",
"appServicePlanName": "[concat(uniquestring(resourceGroup().id), 'asp1a')]",
"functionAppName": "[concat(uniquestring(resourceGroup().id), 'asp1a')]"
},
"resources": [
{
"type": "Microsoft.Network/networkSecurityGroups",
"apiVersion": "2019-11-01",
"name": "[variables('nsgName_sb_functionapp')]",
"location": "[resourceGroup().location]",
"tags": {
"Purpose": "Function App"
},
"properties": {
"securityRules": []
}
},
{
"type": "Microsoft.Network/virtualNetworks",
"apiVersion": "2019-11-01",
"name": "[variables('vnetName')]",
"location": "[resourceGroup().location]",
"dependsOn": [
"[resourceId('Microsoft.Network/networkSecurityGroups', variables('nsgName_sb_functionapp'))]"
],
"tags": {
"Purpose": "Debug Function App and Storage Account Connectivity"
},
"properties": {
"addressSpace": {
"addressPrefixes": [
"[variables('addressPrefixVnet')]"
]
},
"subnets": [
{
"name": "sb-functionapp",
"properties": {
"addressPrefix": "[variables('addressPrefixSubnet')]",
"networkSecurityGroup": {
"id": "[resourceId('Microsoft.Network/networkSecurityGroups', variables('nsgName_sb_functionapp'))]"
},
"serviceEndpoints": [
{
"service": "Microsoft.Storage",
"locations": [
"*"
]
}
],
"delegations": [
{
"name": "delegation",
"properties": {
"serviceName": "Microsoft.Web/serverFarms"
}
}
],
"privateEndpointNetworkPolicies": "Enabled",
"privateLinkServiceNetworkPolicies": "Enabled"
}
}
],
"enableDdosProtection": false,
"enableVmProtection": false
}
},
{
"type": "Microsoft.Storage/storageAccounts",
"apiVersion": "2019-04-01",
"name": "[variables('storageAccountName')]",
"location": "[resourceGroup().location]",
"tags": {
"Purpose": "Debug Function App and Storage Account Connectivity"
},
"kind": "StorageV2",
"sku": {
"name": "Standard_GRS",
"tier": "Standard"
},
"properties": {
"networkAcls": {
"defaultAction": "Deny",
"bypass": "AzureServices",
"supportsHttpsTrafficOnly": true,
"ipRules": [],
"encryption": {
"keySource": "Microsoft.Storage",
"services": {
"file": {
"enabled": true
},
"blob": {
"enabled": true
}
}
},
"accessTier": "Hot",
"virtualNetworkRules": [
{
"id": "[concat(resourceId('Microsoft.Network/virtualNetworks', variables('vnetName')), '/subnets/sb-functionapp')]",
"ignoreMissingVNetServiceEndpoint": false
}
]
}
}
},
{
"type": "Microsoft.Web/serverfarms",
"apiVersion": "2018-02-01",
"name": "[variables('appServicePlanName')]",
"location": "[resourceGroup().location]",
"tags": {
"Purpose": "Debug Function App and Storage Account Connectivity"
},
"sku": {
"name": "EP1",
"tier": "ElasticPremium",
"size": "EP1",
"family": "EP",
"capacity": 1
},
"kind": "elastic",
"properties": {
"perSiteScaling": false,
"maximumElasticWorkerCount": 20,
"isSpot": false,
"reserved": false,
"isXenon": false,
"hyperV": false,
"targetWorkerCount": 0,
"targetWorkerSizeId": 0
}
},
{
"type": "Microsoft.Web/sites",
"apiVersion": "2018-11-01",
"name": "[variables('functionAppName')]",
"location": "[resourceGroup().location]",
"dependsOn": [
"[resourceId('Microsoft.Web/serverfarms', variables('appServicePlanName'))]"
],
"tags": {
"Purpose": "Debug Function App and Storage Account Connectivity"
},
"kind": "functionapp",
"properties": {
"enabled": true,
"hostNameSslStates": [
{
"name": "[concat(variables('functionAppName'), '.azurewebsites.net')]",
"sslState": "Disabled",
"hostType": "Standard"
},
{
"name": "[concat(variables('functionAppName'), '.scm.azurewebsites.net')]",
"sslState": "Disabled",
"hostType": "Repository"
}
],
"serverFarmId": "[resourceId('Microsoft.Web/serverfarms', variables('appServicePlanName'))]",
"reserved": false,
"isXenon": false,
"hyperV": false,
"scmSiteAlsoStopped": false,
"clientAffinityEnabled": true,
"clientCertEnabled": false,
"hostNamesDisabled": false,
"containerSize": 1536,
"dailyMemoryTimeQuota": 0,
"httpsOnly": true,
"redundancyMode": "None",
"siteConfig": {
"appSettings": [
{
"name": "FUNCTIONS_EXTENSION_VERSION",
"value": "~1"
},
{
"name": "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING",
"value": "[concat('DefaultEndpointsProtocol=https;AccountName=', variables('storageAccountName'), ';AccountKey=', listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName')), '2019-04-01').keys[0].value)]"
},
{
"name": "WEBSITE_CONTENTSHARE",
"value": "[variables('functionAppName')]"
},
{
"name": "WEBSITE_DNS_SERVER",
"value": "168.63.129.16"
},
{
"name": "WEBSITE_VNET_ROUTE_ALL",
"value": "1"
}
]
}
},
"resources": [
{
"type": "networkConfig",
"apiVersion": "2018-11-01",
"name": "virtualNetwork",
"location": "[resourceGroup().location]",
"dependsOn": [
"[resourceId('Microsoft.Web/sites', variables('functionAppName'))]"
],
"properties": {
"subnetResourceId": "[concat(resourceId('Microsoft.Network/virtualNetworks', variables('vnetName')), '/subnets/sb-functionapp')]",
"swiftSupported": true
}
}
]
},
{
"type": "Microsoft.Web/sites/config",
"apiVersion": "2018-11-01",
"name": "[concat(variables('functionAppName'), '/web')]",
"location": "[resourceGroup().location]",
"dependsOn": [
"[resourceId('Microsoft.Web/sites', variables('functionAppName'))]"
],
"tags": {
"Purpose": "Debug Function App and Storage Account Connectivity"
},
"properties": {
"numberOfWorkers": 1,
"defaultDocuments": [
"Default.htm",
"Default.html",
"Default.asp",
"index.htm",
"index.html",
"iisstart.htm",
"default.aspx",
"index.php"
],
"netFrameworkVersion": "v4.0",
"phpVersion": "5.6",
"requestTracingEnabled": false,
"remoteDebuggingEnabled": false,
"remoteDebuggingVersion": "VS2019",
"httpLoggingEnabled": false,
"logsDirectorySizeLimit": 35,
"detailedErrorLoggingEnabled": false,
"publishingUsername": "[concat('$', variables('functionAppName'))]",
"scmType": "VSTSRM",
"use32BitWorkerProcess": true,
"webSocketsEnabled": false,
"alwaysOn": false,
"managedPipelineMode": "Integrated",
"virtualApplications": [
{
"virtualPath": "/",
"physicalPath": "site\\wwwroot",
"preloadEnabled": true
}
],
"loadBalancing": "LeastRequests",
"experiments": {
"rampUpRules": [
]
},
"autoHealEnabled": false,
"cors": {
"allowedOrigins": [],
"supportCredentials": false
},
"localMySqlEnabled": false,
"ipSecurityRestrictions": [],
"scmIpSecurityRestrictions": [
{
"ipAddress": "Any",
"action": "Allow",
"priority": 1,
"name": "Allow all",
"description": "Allow all access"
}
],
"scmIpSecurityRestrictionsUseMain": false,
"http20Enabled": false,
"minTlsVersion": "1.2",
"ftpsState": "AllAllowed",
"reservedInstanceCount": 1
}
}
]
}
Command to deploy to existing resource group:
New-AzResourceGroupDeployment -Name (Get-Date).ToString('yyyyMMdd-HHmmss') -ResourceGroupName 'Test-FunctionApp-Storage-VNet' -TemplateFile .\DebugFunctionApp.json -Verbose
I have seen the question/answer at Function App Deployment Failed - The remote server returned an error: (403) Forbidden but it doesn't solve the problem I see.
The solution is to add another setting named WEBSITE_CONTENTOVERVNET and to set the value to "1".
The updated appSettings section looks like:
"siteConfig": {
"appSettings": [
{
"name": "FUNCTIONS_EXTENSION_VERSION",
"value": "~1"
},
{
"name": "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING",
"value": "[concat('DefaultEndpointsProtocol=https;AccountName=', variables('storageAccountName'), ';AccountKey=', listKeys(resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName')), '2019-04-01').keys[0].value)]"
},
{
"name": "WEBSITE_CONTENTOVERVNET",
"value": "1"
},
{
"name": "WEBSITE_CONTENTSHARE",
"value": "[variables('functionAppName')]"
},
{
"name": "WEBSITE_DNS_SERVER",
"value": "168.63.129.16"
},
{
"name": "WEBSITE_VNET_ROUTE_ALL",
"value": "1"
}
]
}
The setting is documented at https://learn.microsoft.com/en-us/azure/azure-functions/functions-app-settings#website_contentovervnet
For Premium plans only. A value of 1 enables your function app to scale when you have your storage account restricted to a virtual network. You should enable this setting when restricting your storage account to a virtual network.

Add Bootstrap Actions while creating EMR cluster from AWS Step Functions

I'm creating EMR cluster from Step Functions using below code,
"spinning_emr_cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "CombineFiles",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.29.0",
"Applications": [
{
"Name": "Spark"
}
],
"ServiceRole": "EMR_DefaultRole",
"JobFlowRole": "EMR_EC2_DefaultRole",
"LogUri": "s3://awsmssqltos3/emr_logs/",
"Instances": {
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "Master",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m1.large"
}
]
},
{
"Name": "Slave",
"InstanceFleetType": "CORE",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m1.large"
}
]
}
]
}
},
"ResultPath": "$.CreateClusterResult",
"Next": "lambda"
I want to add bootstrap actions while creating the cluster from AWS Step Functions. I have tried searching online but could not find any syntax for that.
"BootstrapActions": [
{
"Name": "CustomBootStrapAction",
"ScriptBootstrapAction": {
"Path": "",
"Args": []
}
}
]
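Add the block above inside the Parameters block of the createCluster state, at the same level as Name, ReleaseLabel and Instances, and set Path to the location of your bootstrap script.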
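To make the placement concrete, here is a minimal Python sketch that builds the state definition as a dictionary and prints the resulting JSON. The helper with_bootstrap_actions and the S3 path s3://example-bucket/bootstrap.sh are hypothetical, and the state is trimmed to a few of the fields from the question; it only illustrates that BootstrapActions is a direct child of Parameters.

```python
import json
from copy import deepcopy


def with_bootstrap_actions(state: dict, actions: list) -> dict:
    """Return a copy of a createCluster Task state with BootstrapActions set."""
    updated = deepcopy(state)  # leave the caller's definition untouched
    updated["Parameters"]["BootstrapActions"] = actions
    return updated


# Trimmed version of the state from the question.
state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
    "Parameters": {"Name": "CombineFiles", "ReleaseLabel": "emr-5.29.0"},
}

actions = [
    {
        "Name": "CustomBootStrapAction",
        "ScriptBootstrapAction": {
            "Path": "s3://example-bucket/bootstrap.sh",  # hypothetical script location
            "Args": [],
        },
    }
]

new_state = with_bootstrap_actions(state, actions)
print(json.dumps(new_state, indent=2))
```

The printed JSON shows BootstrapActions next to Name and ReleaseLabel, which is where it belongs in the state machine definition.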

Data factory polybase - zero rows copied to sink

I have a Data Factory pipeline that uses the copy activity to move data from a source data warehouse -> staging blob storage -> sink data warehouse.
The copy from source to blob works as expected (rows are copied). The copy from staging to sink fails: 0 rows are copied.
Disabling PolyBase and using bulk insert works.
{
"name": "PI_TEST",
"properties": {
"activities": [
{
"name": "MaterializedEventIdFilter_Copy",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [
{
"name": "Destination",
"value": "[formigration].[MaterializedEventIdFilter]"
}
],
"typeProperties": {
"source": {
"type": "SqlDWSource",
"sqlReaderStoredProcedureName": "[formigration].[proc_GetStgMaterializedEventIdFilter]"
},
"sink": {
"type": "SqlDWSink",
"allowPolyBase": true,
"writeBatchSize": 100000,
"polyBaseSettings": {
"rejectValue": 0,
"rejectType": "value",
"useTypeDefault": false
}
},
"enableStaging": true,
"stagingSettings": {
"linkedServiceName": {
"referenceName": "riskstoreprd",
"type": "LinkedServiceReference"
},
"enableCompression": true
}
},
"inputs": [
{
"referenceName": "ioPrePrdMaterializedEventIdFilter",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "CloudPrdMaterializedEventIdFilter",
"type": "DatasetReference"
}
]
},
{
"name": "MaterialisedEvent",
"type": "SqlServerStoredProcedure",
"dependsOn": [
{
"activity": "MaterializedEventIdFilter_Copy",
"dependencyConditions": [
"Succeeded"
]
}
],
"policy": {
"timeout": "7.00:00:00",
"retry": 2,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"storedProcedureName": "[formigration].[proc_SetStgMaterializedEventIdFilter]"
},
"linkedServiceName": {
"referenceName": "cloud_prd",
"type": "LinkedServiceReference"
}
}
],
"annotations": []
},
"type": "Microsoft.DataFactory/factories/pipelines"
}
I expected the data from the blob to make it into the sink but no rows are copied.
Edit 1:
I checked the data warehouse (sink): a connection is made, and I can see the external tables etc. being created from the blob storage, all within a second, yet no data is copied in.
INSERT INTO [formigration].[MaterializedEventIdFilter] SELECT * FROM [ADFCopyGeneratedExternalTable_307e2c7f-a56f-4b75-86fb-10ab0cb94548]
In PolyBase, external tables are only a reference to a blob storage folder or file; they do not hold any rows themselves. If you want to actually copy data into your warehouse, create a regular table and use it as the sink in your copy activity.
Hope this helped!

Azure Container Groups and Later Swapping One of the Containers in the Container Group

When you deploy an Azure container group ("Microsoft.ContainerInstance/containerGroups"), can you replace just one of the containers at a later time?
Or does the container group have to include all of its containers at the time it is created?
https://learn.microsoft.com/en-us/azure/container-instances/container-instances-multi-container-group
"resources": [
{
"name": "[parameters('resourceGroupName')]",
"type": "Microsoft.ContainerInstance/containerGroups",
"apiVersion": "2018-06-01",
"location": "[resourceGroup().location]",
"properties": {
"containers": [
{
"name": "[parameters('loggingContainerName')]",
"properties": {
"image": "[parameters('loggingContainerImage')]",
"resources": {
"requests": {
"cpu": 1,
"memoryInGb": 1
}
},
"volumeMounts": [
{
"name": "[parameters('volumeName')]",
"mountPath": "/aci/logs/"
}
],
"ports": [
{
"port": 8080
}
]
}
},
{
"name": "[parameters('jobGeneratorContainerName')]",
"properties": {
"image": "[parameters('jobGeneratorContainerImage')]",
"resources": {
"requests": {
"cpu": 1,
"memoryInGb": 1
}
},
"ports": [
{
"port": 80
}
],
"volumeMounts": [
{
"name": "[parameters('volumeName')]",
"mountPath": "/aci/logs/"
}
],
"environmentVariables": [
{
"name": "ServiceBusConnectionString",
"value": "[parameters('serviceBusConnectionStringSend')]"
},
{
"name": "LoggingServiceUrl",
"value": "[parameters('loggingServiceUrl')]"
}
]
}
},
{
"name": "[parameters('jobProcessingContainerName')]",
"properties": {
"image": "[parameters('jobProcessingContainerImage')]",
"resources": {
"requests": {
"cpu": 1,
"memoryInGb": 1
}
},
"ports": [
{
"port": 8000
}
],
"environmentVariables": [
{
"name": "ServiceBusConnectionString",
"value": "[parameters('serviceBusConnectionStringListen')]"
},
{
"name": "LoggingServiceUrl",
"value": "[parameters('loggingServiceUrl')]"
}
]
}
}
],
"osType": "Linux",
"ipAddress": {
"type": "Public",
"ports": [
{
"protocol": "tcp",
"port": "80"
},
{
"protocol": "TCP",
"port": 443
}
],
"dnsNameLabel": "[uniqueString( resourceGroup().id )]"
},
As far as I know, you are right. The top-level resource in Azure Container Instances is the container group. Whether you want one container or several in a group, you have to create all of them at the same time.
Once the container group is created, you cannot change it, for example by adding containers or changing container images. If you really need to, you can only create a new one.
By the way, multi-container groups support only Linux containers.