How to remove a tuple from an SQL table after a timeout? - sql

I am faced with a peculiar requirement which is as follows:
A network-intensive operation is triggered to a server by multiple clients, through a web-interface. However, only one operation is allowed at a time, and hence an entry(tuple) is made in an SQL table to indicate that the operation is in progress. Once the operation is complete (irrespective of success or failure), the appropriate result is displayed back to the client(s), and the corresponding tuple is removed from the SQL table.
Since the operation is network-intensive, a scenario where the operation needs to be "considered" to be cancelled, after some timeout (10 minutes) has to be introduced.
Is there ANY way the lifetime of a row in SQL be associated with a timeout value, so that is is deleted after certain time? My application is primarily written in Java 1.5 and EJB 3.0, using JPA/Hibernate to access Oracle 10g DB engine.
Thanks in advance.
Regards,
Nagendra U M

I would suggest that you try using a timestamp column containing the start time of the task.
A before trigger can be then made to delete the old column before a new one is inserted if the task timed out.
If you want to have multiple tasks with different timeouts, you can even add a column with the timeout in seconds. Just code your trigger accordingly.

I don't know that Oracle has this kind of facility but I think no db engine have this.
If you want to do it at DB level,
you must have a datetime column, e.g.; 'CreatedDate' in
table. This column will have
datetime when record was created.
Write a procedure and put it in a
schedule job. This job will run after every 10 minutes and remove the 10 minutes old records. The query will be like
this.
T-SQL: Please convert it according to your db engine.
DELETE FROM yourtable WHERE CreatedDate < DATEADD(mi, -10, GETDATE())
This will delete all records older than 10 minutes from table.
This is just to give you idea of schedule job. It is in SQL Server. I don't know about Oracle
step_by_step_guide_to_add_a_sql_job_in_sql_server_2005

It sounds like you're implementing a mutex using the database, take a look at this question and see if it helps? Sounds like transactional access to a flag table will solve this for you, as long as you catch both success & failure states in your server code.

Related

How can I schedule a script in BigQuery?

At last BigQuery supports using ; in the queries, so I can write more than one query in one "block", if I seperate them with semicolon.
If I run the code manually, it works. But I cannot schedule that.
When I want to schedule, I have two choices:
(New) Web UI: I must give a destination table. If I don't do it, I could not save the scheduled query. But all my queries are updates and inserts with different "destination tables". Like these:
UPDATE project.exampledataset.a
SET date = current_date()
WHEN TRUE
;
INSERT INTO project.otherdataset.b
SELECT c,d
FROM project.otherdataset.c
So I cannot even make a scheduling in the Web UI.
Classic UI: I tried this, because the official documentary states, that I should leave the "destination table" blank, and Classic UI allows it. I can setup the scheduling, but it doesn't run, when it should. I get the error message in email "Error status: Dataset specified in the query ('') is not consistent with Destination dataset 'exampledataset'."
AIK scripting (and using semicolon) is a very new feature in BigQuery, but I hope someone can help me.
Yes, I know that I could schedule every query one by one, but I would like to resolve it with one big script.
Looks like the scheduled query was defined earlier with destination dataset defined with APPEND/TRUNCATE type transaction. While updating the same scheduled query to a DML query, GUI doesn't show the dataset field / table name to update to NULL. Hence this error is coming considering the previously set dataset and table name in the scheduled query.
Hence the fix is to delete the scheduled query and create it from scratch with DML query option. It worked for me.
Scripting is supported in scheduled query now. However, scripting query, when being scheduled, doesn't support setting a destination table for now. You still need to use DDL/DML to make change to existing table.
E.g.:
CREATE OR REPLACE TABLE destinationTable AS
SELECT *
FROM sourceTable
WHERE date >= maxDate
As of 2022, the BQ Console UI will let you create a new scheduled query without a destination dataset, but it won't let you update a prior SELECT to use DDL/DML block syntax. However, you can use the BigQuery Data Transfer API to update the destinationDatasetId field, via transferconfigs/patch. Use transferconfigs/list to get the configId for a given scheduled query.
Note that you can either use the in-browser API Explorer, if you have the appropriate credentials, or write a programmatic solution. Also seems useful for setting/updating any other fields, including renaming scheduled queries.

How to solve S-IX deadlock without using the Snapshot isolation?

Ok, I got an weird problem: I got two server-side api, one is to select data from A table; the Other one is to Insert new record to B table with part of the data and PK from first one. They should not have any problem to each other with the Usual actions.
Somehow, I detected someone which called my select and insert function at less than 0.005 sec on my SQL monitor, which caused S-IX deadlock. So I searched the Internet and found an solution that told me to enable the Snapshot isolation. But I tried it on my test DB(which its total size is about 223MB), that Alter Database command does not show any sight of finishing after 1 hour execution, so this is intolerable to execute it on the Production DB with such long Downtime(Which its data size is bigger than the Test one).
So my question is: Does anyone know the Other way to solve the S-IX deadlock?(Without lower the Throughput.)
P.S.: My DB is SQL Server 2008

How do I update the column value after 4 days?

I am looking for your assistant in understanding the way and the code to get the column value to be changed after 4 days.
Currently, whenever I am inserting a new row in the database, the value of the flag will be "Y" by default and I will capture the created date & time by storing the sysdate by default. Now, I want the flag to be changed to "N" after 4 days from the creation.
So, please guide and instruct me on the way to do the above.
You should look into creating a Trigger on INSERT/UPDATE/DELETE IF your table is frequently accessed for those operations.
If not, then you would have to schedule a job to run SQL daily or how frequent you'd like to check.
how to schedule a job for sql query to run daily?

SQL Server : Replication Error - the row was not found at the Subscriber when applying the replicated command

I got the following error in Activity Monitor,
The row was not found at the Subscriber when applying the replicated command. (Source: SQL Server, Error number: 20598)
After some research, I found out the error occurs because it's trying to delete records which are not exists in the subscriber(Also are not exists in publisher)
{CALL [dbo].[sp_MSdel_testtable] (241)}
I can manually insert something to the record and the replication will move on. The problem I have right now is I don't how many bad records are there. Is there any fast way to do this? I already spends hours and inserted about 20 records.
Thank you
Re-initialize the subscription via Replication Monitor in SSMS, and generate a new snapshot during re-initialization. This should clear up the missing record issues.
Dude, I [literally] just solved this a few minutes ago. It pushed me about a fifty cent cab ride short of crazy. The replicated database was part of a third party solution to offload their app's reporting. As it turns out, one of their app servers was pointed to our reporting database rather than the live database. Hence, "row not found at subscriber." because they were inserting and deleting records in the subscriber table.
Use distribution.dbo.sp_helpsubscriptionerrors to find the xact_seqno, then you can use distribution.dbo.browsereplcmds and this query
SELECT *
FROM distribution.dbo.MSarticles
WHERE article_id in (
SELECT article_id
FROM MSrepl_commands
WHERE xact_seqno = 0x xact_seqno)
to get more info. Use distribution.dbo.sp_setsubscriptionxactseqno if you want to get that single transaction "un-stuck" or just re-initialize.
I ended up running SQL Profiler on both the publisher and subscriber, then keyed off the tables I found with the queries above. That's when I spotted the DML on the replicated table.
Good luck.
This query will give you all the pending commands for that article, which needs to be sent from distributor to subscriber.
Here you will get multiple records with same xact_seqno_start and xact_seqno_end but with different command ids.
You can check if all are delete commands, you can take primary key column value for all and insert them manually to subscriber server.
EXEC Sp_browsereplcmds
#article_id = 813,
#command_id = 1,
#xact_seqno_start = '0x00099979000038D60001',
#xact_seqno_end = '0x00099979000038D60001',
#publisher_database_id = 1

What problems may occur while querying SQL databases with big amount of data over internet

I am having this big database on one MSSQL server that contains data indexed by a web crawler.
Every day I want to update SOLR SearchEngine Index using DataImportHandler which is situated in another server and another network.
Solr DataImportHandler uses query to get data from SQL. For example this query
SELECT * FROM DB.Table WHERE DateModified > Config.LastUpdateDate
The ImportHandler does 8 selects of this types. Each select will get arround 1000 rows from database.
To connect to SQL SERVER i am using com.microsoft.sqlserver.jdbc.SQLServerDriver
The parameters I can add for connection are:
responseBuffering="adaptive/all"
batchSize="integer"
So my question is:
What can go wrong while doing this queries every day ? ( except network errors )
I want to know how is SQL Server working in this context ?
Further more I have to take a decicion regarding the way I will implement this importing and how to handle errors, but first I need to know what errors can arise.
Thanks!
Later edit
My problem is that I don't know how can this SQL Queries fail. When i am calling this importer every day it does 10 queries to the database. If 5th query fails I have to options:
rollback the entire transaction and do it again, or commit the data I got from the first 4 queries and redo somehow the queries 5 to 10. But if this queries always fails, because of some other problems, I need to think another way to import this data.
Can this sql queries over internet fail because of timeout operations or something like this?
The only problem i identified after working with this type of import is:
Network problem - If the network connection fails: in this case SOLR is rolling back any changes and the commit doesn't take place. In my program I identify this as an error and don't log the changes in the database.
Thanks #GuidEmpty for providing his comment and clarifying out this for me.
There could be issues with permissions (not sure if you control these).
Might be a good idea to catch exceptions you can think of and include a catch all (Exception exp).
Then take the overall one as a worst case and roll-back (where you can) and log the exception to include later on.
You don't say what types you are selecting either, keep in mind text/blob can take a lot more space and could cause issues internally if you buffer any data etc.
Though just a quick re-read and you don't need to roll-back if you are only selecting.
I think you would be better having a think about what you are hoping to achieve and whether knowing all possible problems will help?
HTH