I have a dataset in Azure Data Factory backed by a CSV file. I added an additional column to the dataset and want to populate its value from a dataset parameter, but the value never gets copied to the column.
"type": "AzureBlob",
"structure":
[
{
"name": "MyField",
"type": "String"
}
]
I have defined a parameter as well:
"parameters": {
"MyParameter": {
"type": "String",
"defaultValue": "ABC"
}
}
How can I copy the parameter value to the column? I tried the following:
"type": "AzureBlob",
"structure":
[
{
"name": "MyField",
"type": "String",
"value": "#dataset().MyParameter"
}
]
But this does not work: I get NULL in the destination even though the parameter value is set.
Based on the document Expressions and functions in Azure Data Factory, #dataset().XXX is not supported in Azure Data Factory so far. So you can't inject a parameter value as a custom column into the source or sink with the native copy activity directly.
However, you could adopt one of the workarounds below:
1. You could create a custom activity and write code to do whatever you need (a minimal sketch of this follows below).
2. You could stage the CSV file in Azure Data Lake, then execute a U-SQL script that reads the data from the file and appends the new column with the pipeline run ID. Then output it to a new area in the data lake so that the data can be picked up by the rest of your pipeline. To do this, you simply pass a parameter to U-SQL from ADF; please refer to the U-SQL Activity.
In this thread: use adf pipeline parameters as source to sink columns while mapping, the customer used the second approach.
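To illustrate the first workaround, here is a minimal Python sketch of the kind of logic a custom activity could run; the file paths, column name, and value are hypothetical placeholders (a real activity would read them from its configuration):

import csv

# Append a constant-valued column to a staged CSV file.
# All names below (paths, column, value) are placeholders.
def append_parameter_column(src_path, dst_path, column_name, value):
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        writer.writerow(header + [column_name])   # header plus the new column
        for row in reader:
            writer.writerow(row + [value])        # every row gets the constant value

append_parameter_column("input.csv", "output.csv", "MyField", "ABC")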
Is there a way to set, for example, the title and the team names in one operation instead of doing something like this?
[
    {
        "op": "add",
        "path": "/fields/System.Title",
        "from": null,
        "value": "Test of REST functionality"
    },
    {
        "op": "add",
        "path": "/fields/System.State",
        "from": null,
        "value": "New"
    }
]
If not, what would be the best way to go about creating a method/function/class that creates the object for the work item I'd like to create?
I'm afraid there's no way to do that in one operation. You have to specify the required field values based on your requirements.
You can reference the following samples to create or update the work items using the API.
Create a bug in Azure DevOps Services using .NET client libraries
WorkItemsSample
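For reference, here is a minimal Python sketch that sends the patch document from the question in a single POST using the requests library; the organization, project, work item type, and personal access token are placeholders:

import requests

ORG, PROJECT, PAT = "my-org", "my-project", "my-personal-access-token"  # placeholders
url = (f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wit/workitems/$Task"
       "?api-version=7.0")

# A single JSON Patch document can carry several field operations at once.
patch_document = [
    {"op": "add", "path": "/fields/System.Title",
     "value": "Test of REST functionality"},
    {"op": "add", "path": "/fields/System.State", "value": "New"},
]

resp = requests.post(
    url,
    json=patch_document,
    headers={"Content-Type": "application/json-patch+json"},
    auth=("", PAT),  # basic auth: empty username, PAT as password
)
resp.raise_for_status()
print(resp.json()["id"])  # id of the newly created work item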
I'm new to JSON.
I see in various examples of JSON, like the following, that complex values are prefixed with "type": "object" and "properties": { }.
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "motor" : {
        "type" : "object",
        "properties" : {
            "class" : "string",
            "voltage" : "number",
            "amperage" : "number"
        }
    }
}
I have written JSON without type, object, and properties, like the following.
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "motor" : {
        "class" : "string",
        "voltage" : "number",
        "amperage" : "number"
    }
}
and submitted it to an online JSON schema validator with no errors.
What is the purpose of "type": "object" and "properties": { }? Is it optional?
Yes, it is optional; try removing it and running the result through your validator:
{
    "$schema": "http://json-schema.org/draft-06/schema#",
    "foo": "bar"
}
You actually don't even need the $schema keyword; i.e., {} is valid JSON.
I would start by understanding what JSON is: https://www.json.org/ is the best place to start, but you may prefer something easier to read like https://www.w3schools.com/js/js_json_intro.asp.
A schema is just a template (or definition) to make sure you're producing valid JSON for the consumer.
As an example, let's say you have an application that parses some JSON, looks for a key named test_score, and saves the value (the score) in a database table/column. For this example we'll call the table tests and the column score. Since a database column requires a type, we'll choose a numeric type, i.e. integer, for our score column.
A valid JSON example for this may look like:
{
    "test_score": 100
}
Following this example, the application would parse the key test_score and save the value 100 to the tests.score database table/column.
But let's say a score is absent, so you put in a string, i.e. "NA":
{
    "test_score": "NA"
}
When the application attempts to save NA to the database, it will error, because NA is a string, not the integer the database expects.
If you put each of those examples into any online JSON validator, both are valid JSON. However, while "NA" and 100 are both valid JSON values, "NA" is not valid for the actual application that needs to consume the JSON.
So now you may understand that the author of the JSON may wonder:
What are the different valid types I can use as values for my test score?
The responsibility then falls on the writers of the application to provide some sort of definition (i.e. a schema) that the clients (authors) can reference, so the author knows exactly how to structure the JSON for the application to process it. Having a schema also lets you validate/test your JSON, so you know it can be processed by the application without actually having to send it through the application.
So, putting it all together, let's say in the schema you see:
"$test_score": {
"type": "integer",
"format": "tinyint"
},
The writer of the JSON now knows that they must pass an integer, and that the range is 0 to 255 because it's a tinyint. They no longer have to try different values by trial and error to see which ones the application processes. This is a big benefit of having a schema.
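Note that "tinyint" is not a standard JSON Schema format keyword, so as a rough illustration of the same idea, here is a small Python sketch using the jsonschema package (my choice of validator; any other works too) with the range spelled out explicitly:

from jsonschema import ValidationError, validate

# Schema expressing "test_score must be an integer between 0 and 255".
schema = {
    "$schema": "http://json-schema.org/draft-06/schema#",
    "type": "object",
    "properties": {
        "test_score": {"type": "integer", "minimum": 0, "maximum": 255},
    },
}

validate({"test_score": 100}, schema)  # passes silently

try:
    validate({"test_score": "NA"}, schema)
except ValidationError as err:
    print(err.message)  # 'NA' is not of type 'integer'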
I have a few identical parallel flows (as shown in the screenshot). I have a ConvertRecord in each of the identical flows, and in the Record Reader I have used "Schema Text Field Property" as the access strategy and specified the "Schema Text". For example:
{
    "type": "record",
    "name": "AVLRecord0",
    "fields" : [
        {"name": "TimeOfDay", "type": "string", "logicalType": "timestamp-millis"},
        {"name": "Field1", "type": "double"},
        {"name": "Field2", "type": "double"},
        {"name": "Field3", "type": "double"},
        {"name": "Filename", "type": "string"}
    ]
}
Let's say I have used the above schema in the ConvertRecord processors across various parallel flows, and now I want to update one field name from Field to Field_Name. Is there any way I can do it in one go across all the ConvertRecord schema texts?
If I want to change/update one of the fields in the schema text, do I have to change/update the field name in each processor manually? Or is there a global way to change the field name across all the parallel flows I have?
Is there any way I can update the schema text across the various processors in one go?
Any help is much appreciated! Thanks
As you are using the Schema Text Field Property, you would need to change the schema in every ConvertRecord processor manually.
Instead, try this approach:
In the ConvertRecord processor, set the Schema Access Strategy to
Use Schema Name Property
Then set up an AvroSchemaRegistry and define your schema by adding a new property.
I have added sch as the schema.name and defined the Avro schema there.
After the GetFile processor, use an UpdateAttribute processor to add the schema.name attribute (for example, with the value sch) to the flowfile.
Now, in the reader controller service, use the Schema Access Strategy "Use Schema Name Property" and the Schema Registry AvroSchemaRegistry that has already been set up.
This way we are not defining the schema in every ConvertRecord processor; instead, all of them refer to the same schema defined in the AvroSchemaRegistry. If you want to change one field name, it is easy to go into the registry and change the value.
Flow:
1. GetFile
2. UpdateAttribute //add schema.name attribute
3. ConvertRecord //define/use AvroSchemaRegistry and access strategy as schema name property
... other processors
Refer to this link for more details on defining/using the AvroSchemaRegistry.
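For example, if the field being renamed were Field1, the only edit needed would be to the schema text stored under the registry's sch property, which would then read:

{
    "type": "record",
    "name": "AVLRecord0",
    "fields" : [
        {"name": "TimeOfDay", "type": "string", "logicalType": "timestamp-millis"},
        {"name": "Field_Name", "type": "double"},
        {"name": "Field2", "type": "double"},
        {"name": "Field3", "type": "double"},
        {"name": "Filename", "type": "string"}
    ]
}

Every ConvertRecord that looks up sch picks the change up automatically.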
I want to create a SQL connection and import data from an app (the Shoutouts template) into a SQL database. I created a SQL connection and tried to import the data, but I got this error:
CreatedOnDateTime: The specified column is generated by the server and can't be specified
I do have the CreatedOnDateTime column created, but I guess its datatype is not the same, or something else is wrong.
Where can I see which fields and datatypes PowerApps imports into the SQL table via the SQL connection?
Thank you for your help!
Overall, there's no easy way to find out the structure of a data source in PowerApps (please create a new feature request on the PowerApps Ideas board for that). There is a convoluted way to find it out, however, which I'll go over here.
But for your specific problem, this is the schema of a SQL table that matches the schema of the data source in PowerApps:
CREATE TABLE PowerAppsTest.StackOverflow51847975 (
    PrimaryID BIGINT PRIMARY KEY,
    [Id] NVARCHAR(MAX),
    [Message] NVARCHAR(MAX),
    CreatedOnDateTime NVARCHAR(MAX),
    CreatorEmail NVARCHAR(MAX),
    CreatorName NVARCHAR(MAX),
    RecipientEmail NVARCHAR(MAX),
    RecipientName NVARCHAR(MAX),
    ShoutoutType NVARCHAR(MAX),
    [Image] IMAGE
)
Now for the generic case. You've been warned that this is convoluted, so proceed at your own risk :)
First, save the app locally to your computer.
The app will be saved with the .msapp extension, but it's basically a .zip file. If you're using Windows, you can rename it to change the extension to .zip and you'll be able to uncompress and extract the files that describe the app.
One of those files, Entities.json, contains, among other things, the definition of the schema of all data sources used in the app. It is a huge JSON file with all of its whitespace removed, so you may want to use an online tool to format (or pretty-print) the JSON to make it easier to read. Once this is done, you can open the file in your favorite text editor (anything better than Notepad should be able to handle it).
With the file opened, search for an entry in the JSON root with the property "Name" and the value equal to the name of the data source. For example, in the shoutouts app case, the data source is called "Shoutout", so search for
"Name": "Shoutout"
You'll have to remove the space if you didn't pretty-print the JSON file prior to opening it. This should be an object that describes the data source, and it has one property called DataEntityMetadataJson that has the data source schema, formatted as a JSON string. Again in the Shoutouts example, this is the value:
"{\"name\":\"Shoutout\",\"title\":\"Shoutout\",\"x-ms-permission\":\"read-write\",\"schema\":{\"type\":\"array\",\"items\":{...
Notice that it again is not pretty-printed. You'll first need to decode that string, then pretty-print it again, and you'll end up with something like this:
{
    "name": "Shoutout",
    "title": "Shoutout",
    "x-ms-permission": "read-write",
    "schema": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "PrimaryID": {
                    "type": "number",
                    "format": "double",
                    ...
                },
                "Message": {
                    "type": "string",
                    ...
                },
                "Image": {
                    "type": "string",
                    "format": "uri",
                    "x-ms-media-kind": "image",
                    ...
                },
                "Id": {
                    "type": "string",
                    ...
                },
                "CreatedOnDateTime": {
                    "type": "string",
                    ...
                },
                ...
And this is the schema for the data source. From that I recreated the schema in SQL, removed the reference to the Shoutout data source from the app (which caused many errors), then added a reference to my SQL table; since it has a different name, I went looking for all the places in the app that had errors and fixed them.
Hope this helps!
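If you'd rather script the convoluted steps above, here is a rough Python sketch; the package file name, the layout of Entities.json (assumed here to be an array of entries at the root), and the data source name are all assumptions based on the walkthrough:

import json
import zipfile

# A .msapp package is a zip archive; pull Entities.json out of it.
with zipfile.ZipFile("Shoutouts.msapp") as package:
    entities = json.loads(package.read("Entities.json"))

# Find the entry whose "Name" matches the data source.
source = next(e for e in entities if e.get("Name") == "Shoutout")

# DataEntityMetadataJson is itself a JSON string: decode it, then pretty-print.
metadata = json.loads(source["DataEntityMetadataJson"])
print(json.dumps(metadata, indent=2))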
I want to add field descriptions to a BigQuery table programmatically; I know how to do it in the UI.
I have this requirement because a few tables in my dataset are refreshed on a daily basis and we use "writeMode": "WRITE_TRUNCATE". This also deletes the descriptions of all the fields in the table.
I have also added the description in my schema file for the table, like this:
{
    "name" : "tax",
    "type" : "FLOAT",
    "description" : "Tax amount customer paid"
}
But I don't see the descriptions in the final table after running the scripts to load the data.
The Tables APIs (https://cloud.google.com/bigquery/docs/reference/v2/tables) allow you to set descriptions on the table and on the schema's fields.
You can set descriptions during table creation (https://cloud.google.com/bigquery/docs/reference/v2/tables/insert) or after the table is created, using one of the APIs below:
Patch - https://cloud.google.com/bigquery/docs/reference/v2/tables/patch
Update - https://cloud.google.com/bigquery/docs/reference/v2/tables/update
I think the Patch API is more suitable in your case.
The link below shows the table resource properties you can set with those APIs:
https://cloud.google.com/bigquery/docs/reference/v2/tables#resource
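If you are using the google-cloud-bigquery Python client rather than calling the REST API directly, a sketch of the patch approach could look like this (project, dataset, table, and field names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my_project.my_dataset.my_table")  # placeholder table ID

# Rebuild the schema, attaching a description to the field we care about.
new_schema = []
for field in table.schema:
    if field.name == "tax":
        field = bigquery.SchemaField(
            field.name, field.field_type, mode=field.mode,
            description="Tax amount customer paid",
        )
    new_schema.append(field)

table.schema = new_schema
client.update_table(table, ["schema"])  # issues a PATCH request under the hood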
BigQuery load jobs accept a schema that includes a "description" for each field.
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load
If you specify the description along with each field you are creating during your WRITE_TRUNCATE operation, the descriptions should be applied to the destination table.
Here's a snippet from the above link that includes the schema you are specifying:
"load": {
"sourceUris": [
string
],
"schema": {
"fields": [
{
"name": string,
"type": string,
"mode": string,
"fields": [
(TableFieldSchema)
],
"description": string
}
]
},
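As a concrete sketch with the google-cloud-bigquery Python client (the bucket, table, and full field list are placeholders), a load job that keeps the "tax" description through WRITE_TRUNCATE might look like:

from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    schema=[
        # Each field carries its description, so it survives the truncate.
        bigquery.SchemaField("tax", "FLOAT", description="Tax amount customer paid"),
        # ...the rest of the table's fields, each with its own description
    ],
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/data.csv",         # placeholder source URI
    "my_project.my_dataset.my_table",  # placeholder destination table
    job_config=job_config,
)
load_job.result()  # wait for the job to finish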