How to add an extension to a file copy activity with Azure Data Factory

The datasets that I ingest from a REST API endpoint do not include the .json extension on the files (even though they are JSON files). Can someone let me know where I can add a .json extension in the following scenarios?
Scenario 1: adding .json to the relative URL.
Scenario 2: adding .json to the sink.
Scenario 3: adding .json to the source (however, I don't think this is possible).
Can someone please take a look at the three scenarios and let me know whether I can add a .json extension with any of these methods?

Thanks to @Scott Mildenberger, we can provide the name of the file and its extension from the sink dataset.
The following is a demonstration of the same. I have a file called sample with no extension.
In the sink dataset, you can simply concat the extension onto your filename (if it is just a single file, you can give the required name with its extension directly). I have used the following dynamic content (the fileName parameter value is req_filename).
@concat(dataset().fileName,'.json')
The file is then generated in the sink with the .json extension (sample.json in this example).

Related

Parse file and patch schema on Publish

I'm trying to use @sanity/react-hooks to create a document action that parses a file (from the draft) and patches other fields.
Example
A user adds a .txt file to a file field. When the document is published I would like to parse the file and patch some readOnly fields using data from the .txt file.
This means I need to be able to read the new file.
I've managed to make Actions work in simple ways, like accessing a string field and patching another field. But I can't seem to access the file asset.
I've followed this tutorial but it doesn't seem to work for a file asset.
Is this possible? And if so, how can I parse a file from the draft to patch another field?
Or perhaps a custom input is the way to go here?

Azure Data Factory HTTP Connector Data Copy - Bank Of England Statistical Database

I'm trying to use the HTTP connector to read a CSV of data from the BoE statistical database.
Take the SONIA rate for instance.
There is a download button for a CSV extract.
I've converted this to the following URL, which downloads a CSV via a web browser.
[https://www.bankofengland.co.uk/boeapps/database/_iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y][1]
Putting this in the Base URL, it connects and pulls the data.
I'm trying to split this out so that I can parameterise some of it.
Base: https://www.bankofengland.co.uk/boeapps/database
Relative: _iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y
Split like this, it won't fetch the data; however, when it's all combined in the base URL, it does.
I've tried adding a "/" at the start of the relative URL as well, and that hasn't worked either.
According to the documentation, ADF puts the "/" in for you: "[Base]/[Relative]".
Does anyone know what I'm doing wrong?
Thanks,
Dan
[1]: https://www.bankofengland.co.uk/boeapps/database/_iadb-fromshowcolumns.asp?csv.x=yes&Datefrom=01/Dec/2021&Dateto=01/Dec/2021 &SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y
I don't see a way you could download that data directly as a CSV file. It seems the data has to be copied manually from the site, using their "Save as" option.
They have used read-only blocks and hidden elements, so I doubt there is any easy or out-of-the-box method within the ADF Web activity to help with this.
You can just copy and paste the data into a CSV file manually.
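If you want to see for yourself what the endpoint returns when the base and relative parts are combined, a minimal Python sketch like the one below reproduces the [Base]/[Relative] concatenation outside ADF. The requests library and the output file name are assumptions here; the URL pieces come from the question (the space before &SeriesCodes in the original post is omitted).

import requests

# URL pieces taken from the question.
BASE = "https://www.bankofengland.co.uk/boeapps/database"
RELATIVE = (
    "_iadb-fromshowcolumns.asp?csv.x=yes"
    "&Datefrom=01/Dec/2021&Dateto=01/Dec/2021"
    "&SeriesCodes=IUDSOIA&CSVF=TN&UsingCodes=Y"
)

# ADF combines these as [Base]/[Relative]; build the same URL here to compare behaviour.
url = f"{BASE}/{RELATIVE}"

response = requests.get(url, timeout=30)
response.raise_for_status()

# Inspect what actually comes back before wiring it into the pipeline.
print(response.headers.get("Content-Type"))
with open("sonia.csv", "wb") as f:
    f.write(response.content)

If this returns CSV content, the issue is more likely in how the relative URL is passed to the connector than in the endpoint itself.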

Airbyte ETL, connection between HTTP API source and BigQuery

I have a task in hand where I am supposed to create a Python-based HTTP API connector for Airbyte. The connector will return a response that contains links to zip files.
Each zip file contains a CSV file, which is supposed to be uploaded to BigQuery.
I have now made the connector, and it returns the URL of the zip file.
The main question is how to send the underlying CSV file to BigQuery.
I can certainly unzip or even read the CSV file in the Python connector, but I am stuck on the part of sending it to BigQuery.
P.S. If you can tell me about sending the CSV to Google Cloud Storage instead, that would be awesome too.
When you are building an Airbyte source connector with the CDK, your connector code must output records that will be sent to the destination, BigQuery in your case. This decouples the extraction logic from the loading logic and makes your source connector destination-agnostic.
I'd suggest this high-level logic in your source connector's implementation (a rough sketch follows below):
Call the source API to retrieve the zip file's URL.
Download and unzip the file.
Parse the CSV file with Pandas.
Output the parsed records.
This is under the assumption that all CSV files have the same schema. If not, you'll have to declare one stream per schema.
A great guide with more details on how to develop a Python connector is available here.
Once your source connector outputs AirbyteRecordMessages, you'll be able to connect it to BigQuery and choose the best loading method according to your needs (Standard or GCS staging).
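Putting those steps together, a rough sketch of the extraction part might look like the following. This is only the download/unzip/parse/yield logic; the function name is arbitrary and the surrounding Airbyte CDK stream class is omitted, since the real connector would wrap each yielded dict in an AirbyteRecordMessage for its stream.

import io
import zipfile

import pandas as pd
import requests


def fetch_records(zip_url: str):
    """Download a zip from the source API, unzip it and yield one dict per CSV row."""
    # zip_url is the link your connector already retrieves from the API.
    response = requests.get(zip_url, timeout=60)
    response.raise_for_status()

    with zipfile.ZipFile(io.BytesIO(response.content)) as archive:
        for name in archive.namelist():
            if not name.endswith(".csv"):
                continue
            with archive.open(name) as csv_file:
                df = pd.read_csv(csv_file)
            # Each row becomes one record; the destination handles the actual loading.
            for record in df.to_dict(orient="records"):
                yield record

The BigQuery destination (with standard inserts or GCS staging) then takes care of the load, so the source connector never needs to talk to BigQuery or Google Cloud Storage directly.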

Reading yaml properties file from S3

I have a YAML properties file stored in an S3 bucket. In Mule 4 I can read this file using the S3 connector. I need to use the properties defined in this file (reading dynamic values and using them in Mule 4) in DB connectors. I am not able to create properties from this file such that I can use them as ${dbUser} in a Mule configuration or flow, for example. Any guidance on how I can accomplish this?
You will not be able to use the S3 connector to do that. The connector can read the file in an operation at execution time, but property placeholders, like ${dbUser}, have to be defined earlier, at deployment time.
You might be able to read the value into a variable (for example #[vars.dbUser]) and use the variable in the database connector configuration. That is called a dynamic configuration, because it is evaluated dynamically at execution time.

File sink in GNU Radio

I am using a USRP1 along with GNU Radio. I want to store received data in a file using a file sink. I would like an idea of the flow graph, what extension I should use when storing the file, and how to read the data back from the file. Thanks in advance.
Did you search online? This is a pretty common task that many have documented. For example, check out Dynamic file names in GNU Radio, which links back to a page of examples, including writing I&Q samples to a file.
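For a concrete starting point, here is a minimal sketch of such a flow graph in Python. The sample rate, center frequency, gain and device arguments are placeholder values; the file sink writes raw interleaved I/Q samples with no header, so the extension (.dat here) is purely a naming convention, and the data can be read back with NumPy as complex64.

import numpy as np
from gnuradio import blocks, gr, uhd


class UsrpToFile(gr.top_block):
    """Receive complex samples from a USRP and write them to a raw binary file."""

    def __init__(self, filename="samples.dat", samp_rate=1e6, center_freq=100e6, gain=20):
        gr.top_block.__init__(self, "USRP to file")

        # USRP source producing complex float32 (fc32) samples.
        self.src = uhd.usrp_source(
            ",".join(("", "")),
            uhd.stream_args(cpu_format="fc32", channels=[0]),
        )
        self.src.set_samp_rate(samp_rate)
        self.src.set_center_freq(center_freq, 0)
        self.src.set_gain(gain, 0)

        # File sink: writes the raw I/Q stream, nothing depends on the file extension.
        self.sink = blocks.file_sink(gr.sizeof_gr_complex, filename, False)

        self.connect(self.src, self.sink)


if __name__ == "__main__":
    tb = UsrpToFile()
    tb.start()
    input("Recording... press Enter to stop.")
    tb.stop()
    tb.wait()

    # The file is just a stream of complex64 values (float32 I, float32 Q).
    samples = np.fromfile("samples.dat", dtype=np.complex64)
    print(samples[:10])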