Identifying idle events in Google BigQuery: is there an existing procedure?

How can we identify idle events in Google BigQuery, that is, events that have been pushed by a user but are not being queried?
I need to know whether there is an existing procedure that I can follow.

I'm not sure I understand your question. Are you asking about query jobs that have been created but have not yet started running? If so, you can call jobs.list() and specify the PENDING state as a filter. Query jobs generally stay in the pending state only very briefly, however, unless you have specified batch priority.
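As a rough sketch of the jobs.list() suggestion, assuming the google-cloud-bigquery Python client: its `Client.list_jobs` method accepts a `state_filter` argument, and each returned job exposes a `job_id`. The helper name `pending_job_ids` is made up for illustration, and the client is passed in so that the function does not depend on how you authenticate.

```python
def pending_job_ids(client, all_users=True):
    """Return the IDs of jobs that are still reported as pending.

    `client` is assumed to be a google.cloud.bigquery.Client, or any object
    with a compatible list_jobs(state_filter=..., all_users=...) method.
    """
    # state_filter="pending" asks the API to return only jobs in the PENDING state.
    jobs = client.list_jobs(state_filter="pending", all_users=all_users)
    return [job.job_id for job in jobs]
```

With the real library installed and credentials configured, this would be called as `pending_job_ids(bigquery.Client())` after `from google.cloud import bigquery`.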

Related

Best means to poll an Oracle table based on an asynchronous response

How can I handle an asynchronous response that eventually updates a status flag in an Oracle table?
I have a PL/SQL routine that makes a REST call using the APEX_WEB_SERVICE API.
The call is processed asynchronously and will eventually update a status flag in a table, which tells me whether the operation was OK or FAIL.
What is the best way to poll this table to check whether a response of OK or FAIL has been returned, using Oracle PL/SQL?
I was looking at DBMS_LOCK.sleep() but am unsure whether this is the best approach. Could DBMS_ALERT also work for this?

Rather than poll the table at an interval, I would recommend using Oracle Advanced Queues along with Oracle Scheduler. AQ is designed exactly for this sort of thing. You can create a "scheduled" job that is triggered by a message (sent by the asynchronous process at the same time it updates the table) being sent to the queue. Scheduler sees the message and runs the appropriate job or job chain to finish the processing.
See here for a basic example: https://pmdba.wordpress.com/2017/08/21/aq-basics/
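To illustrate the general idea of waiting on a message instead of polling a table, here is a small in-process analogy in Python. It is not Oracle AQ or Scheduler code: the dictionary stands in for the status table, `queue.Queue` stands in for the message queue, and all names are hypothetical.

```python
import queue
import threading

def async_operation(status_table, notifications, request_id):
    """Stand-in for the asynchronous process: update the flag, then notify."""
    status_table[request_id] = "OK"
    # The message is sent right after the table update, as in the answer above.
    notifications.put(request_id)

def wait_for_result(status_table, notifications, timeout=5.0):
    """Block until a message arrives rather than re-reading the table on a timer."""
    request_id = notifications.get(timeout=timeout)  # raises queue.Empty on timeout
    return request_id, status_table[request_id]

status_table = {}
notifications = queue.Queue()
worker = threading.Thread(
    target=async_operation, args=(status_table, notifications, 42)
)
worker.start()
result = wait_for_result(status_table, notifications)
worker.join()
```

The consumer does no work until the message arrives, which is the property that makes a queue preferable to a sleep-and-check loop.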

My SQL process keeps switching from 'suspended' to 'running' and back again

I have a query that takes a long, long time to run (relatively speaking). It retrieves many rows with multiple varbinary(max) columns. This query needs optimising, no doubt about it - but my question is very specific to the ever-changing 'task state' I'm witnessing in SQL activity monitor.
Every 5 seconds or so the task state changes from suspended to running, then back again. What does this imply?
Note: I may raise a separate question regarding optimisation of such a query - but I'm not asking that for now, I'm asking very specifically about the quick change in state.
NOT A DUPLICATE BECAUSE:
I'm asking about the change in quick succession of the task state, I'm not asking what suspended means. I'm asking (if suspended means a wait on I/O) why it would wait on I/O, then not, quickly, many times per query.

This is normal. SUSPENDED simply means that the session is waiting for an event, such as I/O, to complete. You will find that sessions flick in and out of this state rather frequently.
You can see the explanations of the different statuses in this document here:
sp_who (Transact-SQL)

Abort a table import stuck in 'pending'

Similar questions have been asked but not exactly what I am looking for.
The problem: on some occasions, importing a table from Google Cloud to BigQuery gets stuck in a 'pending' state for hours, if not days. Tables that get stuck in this state never seem to come out of it, or at least we didn't bother waiting that long. I know it's not a queue issue, since in the meantime we can import other tables just fine. No errors are returned by BigQuery.
My question: in this situation, and in general, how can we safely abort or cancel an import to BigQuery, so that the table does not quietly import without our knowing? This would apply to any table regardless of its state, as long as it hasn't finished importing.
Thanks.

You may be hitting load job rate limits. For example, if you try to start more than two load jobs per minute for the same table, the load jobs against that table will be deferred, while load jobs against other tables may continue at normal speed.
There are per-project limits on rate at which load jobs will be started and limits on the number of load jobs that can be running per project at any one time. If you send jobs faster than this, we'll queue, but as you've noticed, our queueing is not a fair queue, and can start newer jobs before older ones.
Aborting pending jobs is a commonly requested feature. If you file a feature request here, that will help us prioritize it.
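If the per-table rate limit is the cause, one option is to pace submissions on the client side. The sketch below is a plain sliding-window throttle, not part of any BigQuery API. The class name is hypothetical, and the defaults of two jobs per 60 seconds simply mirror the example figure given above, which may not match the service's actual limits.

```python
from collections import defaultdict, deque

class PerTableThrottle:
    """Allow at most `max_jobs` job starts per table in any `window_seconds` span."""

    def __init__(self, max_jobs=2, window_seconds=60.0):
        self.max_jobs = max_jobs
        self.window_seconds = window_seconds
        self._starts = defaultdict(deque)  # table name -> recent start times

    def try_start(self, table, now):
        """Record a start at time `now` and return True if `table` is under its limit."""
        starts = self._starts[table]
        # Forget starts that have aged out of the window.
        while starts and now - starts[0] >= self.window_seconds:
            starts.popleft()
        if len(starts) >= self.max_jobs:
            return False
        starts.append(now)
        return True
```

The caller passes the current time explicitly (for example `time.monotonic()`), and submits the load job only when `try_start` returns True.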

Suggestion to handle Deadlock

There are about six procedures that are called internally to get data from a transactional table, aggregate the retrieved data, format it as XML, and then send emails hourly.
During this process, a lot of logging is done, and the logs are also sent in HTML format (in the same email). There is one procedure where a deadlock occurs, so one section of the email is always missed, or we see a deadlock occurrence in the logs. To handle this, I am trying to use READ_COMMITTED_SNAPSHOT in that particular procedure. Can anyone say whether this has worked for them, or suggest the best way to handle this kind of deadlock?
Can I retry that particular procedure internally by checking whether the output is NULL?
I can't let the other process fail, as that is a transaction. But I need the HTML to show all the information without missing anything in the body.
EDIT: This used to occur very rarely, but the frequency is now increasing daily. I don't understand it, as the procedure is just trying to read from the transactional table, make some calculations, and format the result into XML, while the other transaction is writing to the transactional table. So how does a WRITE affect a READ?

You need to fix the deadlock in order to resolve this.
A deadlock occurs when one process holds a resource that the other requires in order to proceed, and vice versa. You'll get a deadlock when you have two processes that acquire the same set of resources in different orders. For instance, if process P1 acquires resources in the following order:
Resource A
Resource B
And a competing process, P2, requires the same resources in a different order:
Resource B
Resource A
P1 starts and acquires exclusive access to Resource A.
P2 starts and acquires exclusive access to Resource B.
In order for each to continue, P1 needs access to Resource B and P2 needs access to Resource A.
Neither of them can acquire the resource they need, thus causing the deadlock.
This is different than blocking, where one process is simply waiting for another process to release the needed resource. Given sufficient time, the blocking will be resolved. In a deadlock, the blocking cannot be resolved.
The SQL Engine can (and does) detect the deadlock situation. It resolves it by selecting one process or the other as the deadlock victim and rolling back.
Fix the deadlock by identifying the problem and resolving it, not by simply retrying and hoping it goes through. SQL Trace may help you identify the problem. You may need a DBA to help you.
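The P1/P2 scenario above can be sketched with two Python threads and two locks. This is only an analogy for database locking, with hypothetical names. It shows the usual fix of having every process acquire the shared resources in the same order, so the circular wait cannot form.

```python
import threading

lock_a = threading.Lock()  # stands in for Resource A
lock_b = threading.Lock()  # stands in for Resource B
completed = []

def worker(name, first, second, rounds=1000):
    """Acquire two locks in the order given, then record one unit of work."""
    for _ in range(rounds):
        with first:
            with second:
                completed.append(name)

# Both workers take A before B. If P2 were given (lock_b, lock_a) instead,
# each thread could end up holding one lock while waiting for the other.
p1 = threading.Thread(target=worker, args=("P1", lock_a, lock_b))
p2 = threading.Thread(target=worker, args=("P2", lock_a, lock_b))
p1.start()
p2.start()
p1.join()
p2.join()
```

In SQL terms, the equivalent is making the competing procedures touch the same tables and rows in the same sequence.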

A simpler (less dangerous) approach would be to change the six procedures in question so that they do dirty reads (i.e., WITH(NOLOCK)). This should work even in a deadlock, although you might get garbage data.

Database Job Scheduling

I have a procedure written in PLJava that sends out updates over JMS in my postgres database.
What I would like to do is have that function called on an interval (every 15 seconds) internally in the database (preferably not from an outside process). Is this possible? Any ideas?

If you need no external access, you are presumably able to modify the database design so that you don't need the update at all. Can you explain more about what the update is doing?
As depesz said, you could use either cron or pgAgent, but they only go down to one-minute granularity, not 15 seconds. Sleeping inside the stored procedure until the next iteration is not a good idea, because the transaction would stay open for that whole time.

Strict answer: it is not possible. Since you don't want an outside process, and PostgreSQL doesn't support jobs, you are out of luck.
If you reconsider using outside processes, then you most likely want something like cron or, better yet, pgAgent.
On the other hand, what do you need to do that has to happen every 15 seconds? This seems like a design problem.

First, you'll spend the least amount of effort if you just go with a cron job.
However, if you were starting from scratch: you are trying to periodically replicate rows from your database. I think you are looking at a replication queue.
The PGQ project (used for Londiste replication, both from Skype's SkyTools) has a queue that you can use independently. When configuring it, you set a maximum event count, and a loop delay, before batched events are generated. You can get batches spaced by no more than 15 seconds that way. You now have to produce the events that will be batched, using a trigger that calls pgq.insert_event; and consume the queues. The consumer can call your PL/Java stored proc; you'll have to rewrite the procedure to send everything in the batch instead of scanning the base table for new events.
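The "maximum event count plus loop delay" behaviour described above can be pictured with a small in-memory batcher. This is only an illustration of the batching rule, not PGQ's implementation or API. The class and its `insert_event` and `tick` methods are hypothetical names, and the caller supplies the clock.

```python
class EventBatcher:
    """Release a batch when it reaches max_count events or is max_delay seconds old."""

    def __init__(self, max_count, max_delay):
        self.max_count = max_count
        self.max_delay = max_delay
        self._pending = []
        self._batch_started = None  # time the oldest pending event arrived

    def insert_event(self, event, now):
        """Queue an event; return a finished batch if one is due, else None."""
        if not self._pending:
            self._batch_started = now
        self._pending.append(event)
        return self._flush_if_due(now)

    def tick(self, now):
        """Periodic check so that a small batch is still released after max_delay."""
        return self._flush_if_due(now)

    def _flush_if_due(self, now):
        if not self._pending:
            return None
        full = len(self._pending) >= self.max_count
        stale = now - self._batch_started >= self.max_delay
        if full or stale:
            batch, self._pending = self._pending, []
            return batch
        return None
```

With `max_delay=15.0`, no event waits more than about 15 seconds (plus the tick interval) before its batch is handed to the consumer.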

As far as I know, PostgreSQL doesn't support scheduled tasks. You'll need to use a script with cron or at (depending on your operating system).

Sounds like you're doing some sort of replication? Every 15 seconds sounds like a lot of updates. Could you set up a trigger (or a number of triggers) instead of polling?

If you are using JMS, why not just have the task wait for input on the queue?

Per your comment to depesz, you have a PL/Java stored procedure that "flushes out database tables (updates) as java objects". Since you want it to run at 15-second intervals, it must be processing a batch of updates each time. Rather than processing a batch of updates in a stored procedure every 15 seconds, why not process them one at a time as they happen, via an AFTER UPDATE trigger, and eliminate the need for a timed interval? If you are aggregating data from multiple tables to build your objects, then add the triggers to your uppermost tables only.

In my case, the problem was that the agent couldn't authenticate to the database. After I made all connections from localhost trusted, the service started successfully and the job works fine.
For more information about the error, check the Windows Event Viewer or its equivalent on Unix-based systems. See my config file: C:\Program Files\PostgreSQL\10\data\pg_hba.conf
