Every time I run the API in the Android app, it runs the query again and retrieves the data from the website instead of using the stored data. How do I make it retrieve the stored data to save running time?
This isn't something you can do via the UI just yet, but it is coming!
If you have saved the results of your Extractor as a dataset, you can do this via the API:
To query a dataset, you need to query its "snapshot"...
First use the GetConnector API with the ID of your dataset:
http://api.docs.import.io/#!/Connector_Methods/getConnector
Note the snapshot ID
Use the ID of the dataset and the snapshot ID from the result and enter them here:
http://api.docs.import.io/#!/Connector_Methods/getDataSnapshot
This will return the data stored in your dataset.
When a user regenerates the data of a schema, will the regenerated test data results in the same sequence of data as it was generated earlier or will the data be shuffled?
When you use the regenerate functionality of HCL OneTest Data, it is possible to get the same sequence of test data. To get the same sequence on regeneration, you must enter a seed value when you generate the test data for the first time.
For more information, refer to https://help.hcltechsw.com/onetest/hclonetestserver/10.1.2/com.hcl.test.otd.help.doc/topics/t_help_regenerate_data.html
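The general idea of a seed can be illustrated outside the product. The sketch below is not how HCL OneTest Data is implemented; it uses Python's standard `random` module, and the `generate_rows` function and its row fields are made up for illustration. It only shows why a fixed seed reproduces the same sequence.

```python
import random

def generate_rows(count, seed=None):
    """Generate `count` pseudo-random rows; a fixed seed fixes the sequence."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    return [{"id": i, "amount": rng.randint(1, 1000)} for i in range(count)]

# Same seed on generation and regeneration gives the same data in the same order.
first = generate_rows(5, seed=42)
again = generate_rows(5, seed=42)
print(first == again)  # True
```

Without a seed (`seed=None`), Python seeds the generator from a system source, so each run would typically produce a different sequence.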
What I am trying to achieve is this:
1. Access a REST API to download hotel reservation data - the data output format is in JSON
2. Convert JSON data into the correct format to be uploaded into SQL table
3. Upload this table of data onto Google BigQuery existing table as additional rows
Do let me know if any further information is required and if I have been clear enough
Thanks in advance
1) pretty good REST API tutorial
2) You can use a local SQL DB or use Cloud SQL. The process would be the same (Parse JSON and insert to DB)
If you decide to use Cloud SQL, you can parse the JSON and save it as a CSV then follow this tutorial
or
simply parse the JSON and insert using one of the following APIs
3) You can easily load data into any BigQuery table by using the BigQuery API. You can also insert the JSON data directly into BigQuery
But as Tamir had mentioned, it would be best to ask questions if you encounter errors/issues. Since there are multiple ways to perform this type of scenario, we cannot provide an exact solution for you.
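As one possible shape for steps 2 and 3, here is a minimal sketch, not a definitive solution. The JSON layout (`reservations`, `guest`, `check_in`, `nights`) and the column names are assumptions, since the question does not show the actual payload. `insert_rows_json` is the streaming insert method of the `google-cloud-bigquery` client; it is imported inside the function so the parsing part runs without that package installed.

```python
import json

def reservations_to_rows(payload):
    """Flatten a JSON reservation payload into flat dicts, one per table row."""
    records = json.loads(payload)["reservations"]  # assumed top-level key
    rows = []
    for rec in records:
        rows.append({
            "reservation_id": rec["id"],
            "guest_name": rec["guest"]["name"],  # nested object flattened to a column
            "check_in": rec["check_in"],
            "nights": int(rec["nights"]),        # cast to match an INTEGER column
        })
    return rows

def append_to_bigquery(table_id, rows):
    """Append rows to an existing table, e.g. 'project.dataset.table'."""
    from google.cloud import bigquery  # needs the google-cloud-bigquery package
    client = bigquery.Client()
    errors = client.insert_rows_json(table_id, rows)  # list of per-row errors
    if errors:
        raise RuntimeError(f"BigQuery rejected some rows: {errors}")

# Made-up sample payload standing in for the REST API response.
sample = (
    '{"reservations": [{"id": "R1", "guest": {"name": "Ann"}, '
    '"check_in": "2018-01-02", "nights": "3"}]}'
)
rows = reservations_to_rows(sample)
print(rows)
```

Calling `append_to_bigquery` requires credentials and an existing table whose schema matches the row keys, so it is left out of the sample run.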
Is it possible to write an output parameter to a dataset?
I have a metadata activity that stores the file name of an Azure Blob dataset, and I would like to write that value into another Azure Blob dataset as an additional column via a copy activity.
Thanks
If you want to use the output of the previous activity as the input to the next one, you could proceed as follows.
Assuming the attribute you are retrieving is childItems, its values can be obtained in the next step using the following expression:
@activity('Name_of_activity').output.childItems
This would return an Array of your subfolders.
The following link should help you with the expression in ADF
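To make the result of that expression more concrete, the following sketch mimics in Python the shape of the array that `childItems` holds: a list of objects with a `name` and a `type`. The sample values and the `child_item_names` helper are made up for illustration and are not part of ADF.

```python
# Assumed sample of what output.childItems contains for a folder.
activity_output = {
    "childItems": [
        {"name": "folder_a", "type": "Folder"},
        {"name": "data.csv", "type": "File"},
    ]
}

def child_item_names(output, item_type=None):
    """Return the names in childItems, optionally only 'File' or 'Folder' entries."""
    return [
        item["name"]
        for item in output["childItems"]
        if item_type is None or item["type"] == item_type
    ]

print(child_item_names(activity_output))
print(child_item_names(activity_output, item_type="File"))
```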
Our challenge is the following:
In an Azure SQL database, we have multiple tables named table_num, where num is just an integer. These tables are created dynamically, so the number of tables can vary (table_1, table_2, up to table_N). All tables have the same columns.
As part of a U-SQL script file, we would like to execute the same query on all of these tables and generate an output csv file with the combined results of all these queries.
We tried several things:
U-SQL does not allow looping, so we thought about creating a View in our Azure SQL database that would combine all the tables using a cursor of some sort. The U-SQL file would then query this View (using an external source). However, a View in Azure SQL database can only be created via a function, and a function cannot execute dynamic SQL or even call a stored procedure...
We did not find a way to call a stored procedure of the external data source directly from U-SQL.
We don't want to update our U-SQL job each time a new table is added...
Is there a way to do that in U-SQL through a custom extractor for instance? Any other ideas?
One solution I can think of is to use Azure Data Factory (v2) to assist in this.
You could create a pipeline with the following activities:
1. A Lookup activity configured to execute a stored procedure (for example, one that returns the list of table names)
2. A For Each activity that uses the output of the Lookup activity as its source
2.1. As a child item, a U-SQL activity that executes your U-SQL script, which writes the output of a single table (the current item of the For Each activity) to Blob storage or Data Lake
3. A Copy activity that merges the blobs from step 2.1 into one final blob
If you have little or no experience with ADF v2, keep in mind that it takes some time to get to know it, but once you do, you won't regret it. Having a GUI to create the pipeline is a nice bonus.
Edit: as @wBob mentions, another (far easier) solution is to create a single table with all rows, since all the dynamically generated tables have the same schema. You can create a stored procedure to populate this table, for example.
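The "single table with all rows" idea comes down to discovering the table_N names at run time and stacking them with UNION ALL. The sketch below shows that logic only. It swaps in an in-memory SQLite database for Azure SQL so that it runs anywhere, and the column names `id` and `value` are assumptions. In Azure SQL the same statement would be built with dynamic SQL inside the stored procedure.

```python
import re
import sqlite3

def numbered_tables(conn):
    """Find tables named table_<integer>, sorted by that integer."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    matched = [n for n in names if re.fullmatch(r"table_\d+", n)]
    return sorted(matched, key=lambda n: int(n.split("_")[1]))

def build_union_all(table_names, columns):
    """Return one SELECT that stacks the rows of every table with UNION ALL."""
    col_list = ", ".join(columns)
    selects = [f"SELECT {col_list} FROM {name}" for name in table_names]
    return "\nUNION ALL\n".join(selects)

# Demo data: three dynamically named tables with the same schema.
conn = sqlite3.connect(":memory:")
for num in (1, 2, 10):
    conn.execute(f"CREATE TABLE table_{num} (id INTEGER, value TEXT)")
    conn.execute(f"INSERT INTO table_{num} VALUES (?, ?)", (num, f"row from {num}"))

tables = numbered_tables(conn)
query = build_union_all(tables, ["id", "value"])
combined = conn.execute(query).fetchall()
print(query)
print(combined)
```

Because the table names are validated against the `table_\d+` pattern before being placed in the SQL text, a new table is picked up automatically without opening the query to arbitrary names.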
I am trying to find the best way to copy yesterday's data from DocumentDB to Azure SQL.
I have a working DocumentDB database that is recording data gathered via a web service. I would like to routinely (daily) copy all new records from the DocumentDB to an Azure SQL DB table. In order to do so I have created and successfully executed an Azure Data Factory Pipeline that copies records with a datetime > '2018-01-01', but I've only ever been able to get it to work with an arbitrary date - never getting the date from a variable.
My research on DocumentDB SQL querying shows that it has Mathematical, Type checking, String, Array, and Geospatial functions but no date-time functions equivalent to SQL Server's getdate() function.
I understand that Data Factory Pipelines have some system variables that are accessible, including utcnow(). I cannot figure out, though, how to actually use those by editing the JSON successfully. If I try just including utcnow() within the query I get an error from DocumentDB that "'utcnow' is not a recognized built-in function name".
"query": "SELECT * FROM c where c.StartTimestamp > utcnow()",
If I try instead to build the string within the JSON using utcnow() I can't even save it because of a syntax error:
"query": "SELECT * FROM c where c.StartTimestamp > " + utcnow(),
I am willing to try a different technology than a Data Factory Pipeline, but I have a lot of data in our DocumentDB so I'm not interested in abandoning that, and I have much greater familiarity with SQL programming and need to move the data there for joining and other analysis.
What is the easiest and best way to copy those new entries over every day into the staging table in Azure SQL?
Are you using ADF V2 or V1?
For ADF V2.
I think you can follow the incremental copy approach recommended in the Microsoft documentation. For example, you could have a watermark table (it could be in your target Azure SQL database) and two Lookup activities. One Lookup obtains the watermark value of the previous run (it could be a date, an integer, whatever your audit value is), and the other obtains the MAX of the watermark column (i.e. the date) in your source documents. A Copy activity then gets all the records where c.StartTimestamp <= MaxWatermarkValueFromSource AND c.StartTimestamp > LastWatermarkValue.
I followed this example using the Python SDK and it worked for me.
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-powershell
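The watermark logic itself is small and can be sketched independently of ADF. This is a plain Python illustration, not the tutorial's code: the function names are made up, the source is an in-memory list standing in for DocumentDB, and timestamps are assumed to be ISO 8601 strings in UTC so that string comparison matches chronological order.

```python
def build_incremental_query(last_watermark, max_watermark):
    """Build the source query for the window (last_watermark, max_watermark]."""
    return (
        "SELECT * FROM c "
        f"WHERE c.StartTimestamp > '{last_watermark}' "
        f"AND c.StartTimestamp <= '{max_watermark}'"
    )

def incremental_copy(source_rows, last_watermark):
    """Return (rows to copy, new watermark) for one pipeline run."""
    if not source_rows:
        return [], last_watermark
    # Second lookup: the highest watermark value currently in the source.
    max_watermark = max(row["StartTimestamp"] for row in source_rows)
    batch = [
        row for row in source_rows
        if last_watermark < row["StartTimestamp"] <= max_watermark
    ]
    # Never move the stored watermark backwards.
    return batch, max(last_watermark, max_watermark)

# Made-up source documents.
source = [
    {"id": 1, "StartTimestamp": "2018-01-01T08:00:00Z"},
    {"id": 2, "StartTimestamp": "2018-01-02T09:30:00Z"},
    {"id": 3, "StartTimestamp": "2018-01-03T10:00:00Z"},
]

batch, watermark = incremental_copy(source, "2018-01-01T23:59:59Z")
print([row["id"] for row in batch], watermark)

# A second run with the stored watermark copies nothing new.
batch_again, watermark_again = incremental_copy(source, watermark)
print(batch_again, watermark_again)
```

Storing the new watermark only after the copy succeeds means a failed run is simply retried over the same window the next day.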