Is It Possible To Schedule Azure Data Factory Triggers In Synapse While Not in Live Mode - azure-data-factory-2

Is it possible to schedule Azure Data Factory Triggers in GitHub mode? My understanding is that scheduled triggers will only work in Synapse Live mode.
As you can see from the image, I'm in GitHub mode. Is it possible to schedule triggers while in this mode?
In an update to the question: I am attempting to create a trigger in GitHub mode, but I'm getting the message "Make sure to 'Publish' for the trigger to be activated after clicking 'Save'". However, when I view my triggers in Monitor, I can see the triggers have started even though I haven't published them. Can you let me know whether the trigger is actually working or not?

Yes, it is possible to schedule triggers in Azure Synapse while in GitHub mode.
Go to New trigger -> select the Schedule type, then set the start time and time zone.
You can then check in Monitor that the trigger fired and the pipeline executed successfully.
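For reference, a schedule trigger saved in Git mode is just a JSON artifact committed to your collaboration branch, and the "Make sure to Publish" message means it only becomes active in the live environment once you publish. Below is a minimal sketch of such a definition; the trigger name, pipeline name, and times are illustrative placeholders, not taken from your workspace:
{
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2023-01-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "MyPipeline",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}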

Related

How to make Dataproc detect Python-Hive connection as a Yarn Job?

I launch a Dataproc cluster and serve Hive on it. Remotely, from any machine, I use PyHive or PyODBC to connect to Hive and do things. It's not just one query; it can be a long session with intermittent queries. (The query itself has issues; I will ask about that separately.)
Even during a single, active query, the operation does not show up as a "Job" (I guess it's Yarn) on the dashboard. In contrast, when I "submit" tasks via PySpark, they show up as "Jobs".
Besides the lack of task visibility, I also suspect that, without a Job, the cluster may not reliably detect that a Python client is "connected" to it, and hence the cluster's auto-delete might kick in prematurely.
Is there a way to "register" a Job to accompany my Python session, and cancel/delete the job at times of my choosing? In my case, it would be a "dummy", "nominal" job that does nothing.
Or maybe there's a more proper way to let Yarn detect my Python client's connection and create a job for it?
Thanks.
This is not supported right now; you need to submit jobs via the Dataproc Jobs API to make them visible on the jobs UI page and to be taken into account by the cluster TTL feature.
If you cannot use the Dataproc Jobs API to execute your actual jobs, then you can submit a dummy Pig job that sleeps for the desired time (5 hours in the example below) to prevent cluster deletion by the max idle time feature:
gcloud dataproc jobs submit pig --cluster="${CLUSTER_NAME}" \
--execute="sh sleep $((5 * 60 * 60))"

BigQuery data transfer trigger on-demand

I have implemented a transfer job from S3 to BQ and configured it to be on-demand. I used this guide. How can I trigger a data transfer configured as on-demand, via either the console or a CLI/API call?
To run your job using the Console, click on your job, then click the 'More' option in the upper right corner and select 'Refresh Transfer'.
Also, it seems like there is currently some improper behavior with the new BigQuery UI as well as with transfer jobs. Most likely, you will see in the Console that your transfer job, even if configured as 'On-Demand', shows in the transfer details as scheduled to run every 24h; but by running command (1) below, you will see that the schedule option 'disableAutoScheduling' is set to true. Bear in mind this product is still in Beta, so there is still work in progress.
(1): bq show --format=prettyjson --transfer_config [resource_name]
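If you prefer an API call over the Console, the Data Transfer Service also exposes a start-manual-runs operation. Here is a rough sketch using the Python client, assuming the google-cloud-bigquery-datatransfer package is installed; the transfer config resource name is a placeholder (the same name you would pass to bq show):
from google.cloud import bigquery_datatransfer
from google.protobuf.timestamp_pb2 import Timestamp
client = bigquery_datatransfer.DataTransferServiceClient()
# Placeholder - projects/<project-number>/locations/<location>/transferConfigs/<config-id>
transfer_config_name = "projects/1234567890/locations/us/transferConfigs/abcd1234"
# Ask for a single run "now" against the on-demand transfer config.
run_time = Timestamp()
run_time.GetCurrentTime()
response = client.start_manual_transfer_runs(
    request={"parent": transfer_config_name, "requested_run_time": run_time}
)
for run in response.runs:
    print(f"Started transfer run: {run.name}")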

Terminated ACI not disappearing

I'm working on a new container image that runs my worker process to drain an Azure queue. Once the queue is empty, my app exits, and I'd like the ACI to be de-allocated and removed as well. What I am seeing is that the ACI sticks around. It is in a "Terminated" state with a restart count of 0, as I would expect (seen in the Azure Portal), but why is it not removed/deleted from the ACI list entirely?
I am using the Azure CLI to create these instances and am specifying the 'never' restart policy. Here is my command line (minus the image-specific details):
az container create --cpu 4 --memory 14 --restart-policy never --os-type windows --location eastus
I am, of course, also wondering where billing stops. Once I see the terminated state, I am hoping that billing has stopped, though this is unclear. I can of course manually delete the ACI and it is gone immediately; should exiting the app do the same?
If your container is in the terminated state, you are no longer being billed. The resource itself remains until you delete it, though, in the event you want to query the logs, events, or details of the container after termination. If you wish to delete existing container groups, writing some code in Azure Functions is a good route, so you can define when something should be deleted.
Check out this basic example of such a concept:
https://github.com/dgkanatsios/AzureContainerInstancesManagement/tree/master/functions/ACIDelete
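As a sketch of what that clean-up code could look like (inside a Function or anywhere else), the container group can be deleted through the management SDK. This assumes the azure-mgmt-containerinstance and azure-identity packages with a newer (track 2) SDK version; the subscription, resource group, and container group names are placeholders:
from azure.identity import DefaultAzureCredential
from azure.mgmt.containerinstance import ContainerInstanceManagementClient
# Placeholders - replace with your own values.
subscription_id = "00000000-0000-0000-0000-000000000000"
resource_group = "my-resource-group"
container_group_name = "my-worker-aci"
client = ContainerInstanceManagementClient(DefaultAzureCredential(), subscription_id)
# Inspect the current state first and only delete once the group has finished.
group = client.container_groups.get(resource_group, container_group_name)
state = group.instance_view.state if group.instance_view else None
if state in ("Succeeded", "Stopped"):  # state names may vary; adjust to what you observe
    client.container_groups.begin_delete(resource_group, container_group_name).result()
    print(f"Deleted container group {container_group_name}")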

How To: Get Merge replication status via C#

The project I am working on uses merge replication, which works fine. Behind the scenes I have a service that deletes the client-side DB, gets new records up to a particular mark, and checks for server connectivity.
The challenge I am currently facing is that when the service starts again after being stopped, it automatically executes what it is supposed to do. There is no way I can pause my service, because I am not able to get the status of merge replication.
The steps I have/can take from the client side are:
Stop the execution of my function if the merge replication status is 'running'.
Force the sync on merge replication from my C# code.
But I do not know how to get the status of merge replication.
I did go through a few links on Stack Overflow, but nothing got me positive results. I am stuck on this issue.
How do I check SQL replication status via T-SQL?
How to check if Merge Replication is really complete or not
How to get replication status from code
Another question: what will happen once the 5 GB data limit is reached in SQL Express and new data is pushed to it from the server? Will it follow a FIFO method, automatically deleting the data that came in first and filling itself up with the new data pushed from the server?
Hope to get some positive answers.
[EDIT: Title Corrected]
Please see How to: Synchronize a Push Subscription (RMO Programming) and How to: Synchronize a Pull Subscription (RMO Programming) for instructions on how to synchronize a Merge subscription programmatically using Replication Management Objects in C#.
This will give you the ability to synchronize subscription(s) on-demand and handle the Status event to inspect the Merge Agent progress.

ThinkingSphinx (sphinxd) on a remote database server with delta indexes?

I'm working on setting up a simple multi-tier Rails 3.1 setup -- web apps on one or more servers, with the PostgreSQL database and our Sphinx search indexes on a remote server.
On a single-server setup we're using ThinkingSphinx, and delta indexes (using delayed_job), then a nightly cron to update the main index. Works great.
So: user creates indexable content; app tells delayed_job to schedule an update; delta-indexer adds new content to delta-index; searches look at both to resolve the search query properly; nightly job recreates single main index.
The documentation for ThinkingSphinx says here, near the bottom:
The best approach is to have Sphinx, the database and the delayed job processing task all running on one machine.
But I am unclear how to send the information needed by the delayed job process to the single server to be run. I have read some stuff about having a shared file system (yuck -- really?). I haven't read the code yet, but maybe there's a simple way?
Here's hoping!
The delayed job worker (running on your DB/Sphinx server) references the database, within the context of your Rails app - so you'll need the app on your DB/Sphinx server as well, but just to run the DJ worker.
From the perspective of your app servers, TS will just add job records to the database as per normal.
You'll also want to set the following settings. This one goes at the end of your config/application.rb:
ThinkingSphinx.remote_sphinx = Rails.env.production?
And add the Sphinx version to your config/sphinx.yml:
production:
  version: 2.0.1-beta