I'm making an ETL using Talend, which should permit to create an Excel file with some informations resulting from a tNetezzaInput component in Talend, executing a dynamic query.
It works perfectly. However, some queries finish after 2 hours, depending of the table size ( I have more than 1000 queries to execute ).
I would like to set a timeout (30seconds/1 minute) on my tNetezzaInput.
Is that possible?
Thank you
Not sure about Talend/tNetezzaInput. But you can address this issue at Netezza side using query timeout limits along with runaway query event.
Secondly, how big tables are that query taking 2 hours. Hope its not some query/join issue.
Thanks,
Sanjit
A query on 200 millions records with 100+ columns tables should not take 2 hrs, unless there is some issue with data distribution, join, statistics or workload manager.
Have you checked the pg, dbos logs, planfile about this query?
Thanks,
Sanjit
Related
I'm using SQL Link server for fetching data from MariaDB.
But I fetching issue with slowness when i used MariaDB from link server.
I used below scenarios to fetch result (also describe time taken by query)
Please suggest if you have any solutions.
Total number of row in patient table : 62520
SELECT count(1) FROM [MariaDB]...[webimslt.Patient] -- 2.6 second
SELECT * FROM OPENQUERY([MariaDB], 'select count(1) from webimslt.patient') -- 47ms
SELECT * FROM OPENQUERY([MariaDB], 'select * from webimslt.patient') -- 20 second
This isn’t really a fair comparison...
SELECT COUNT(1) is only returning a single number and will probably be using an index to count rows.
SELECT * is returning ALL data from the table.
Returning data is an expensive (slow) process, so it will obviously take time to return your data. Then there is the question of data transfer, are the servers connected using a high speed connection? That is also a factor in this. It will never be as fast to query over a linked server as it is to query your database directly.
How can you improve the speed? I would start by only returning the data you need by specifying the columns and adding a where clause. After that, you can probably use indexes in Maria to try to speed things up.
Hi i connected Hive using DB visualizer and fired a simple join query to fetch two columns according to the filter applied. But the query was running for more than an hour with the status "Executing". I fired the same query in Hive logging through Putty and got the result in less than 20 seconds.
Can anyone help me to understand why the query in DB visualizer was running for a long time without producing any output.
Query used:
SELECT
A.ORDER,
B.ORDER1
FROM
ORDER A
INNER JOIN DUORDER B ON A.ORDER=B.ORDER1 AND A.TYPE ='50'
(The result set contain only 400 records)
To analyze why, we need more info. Please please open Tools->Debug Window in DbVisualizer and enable debugging (just for DbVisualizer, not JDBC). Execute the query again, stopping it after some time (say a few minutes). Then submit a support request using Help->Contact Support, and make sure that Attach Logs is enabled. This will give us the info we need to see what may be wrong.
Best Regards,
Hans (DbVisualizer team)
UPDATE
I have noticed that when I do a simple view from this table, results come back fairly quickly, but as soon as a I make WHERE clause restriction it slows way down.
I'm running a query that runs well, about 1.5 minutes. Yet when I include a single column from a linked Oracle DB the query takes 19 minutes. Is performance degradation like this normal?
I'm also not familiar with where I should start troubleshooting this type of issue, and am new Linked Servers.
Thank you,
UPDATE
Below is the code that creates the Oracle Linked Server DB view
SELECT Patient
,Account
,MR#
,Diagnosis
,ICD9
,TransferEMTALAFormsCmpltd
,Disposition
,AdmittingDxTranscribed
,AxisIPrimaryDx
,AgeDOB
,sex
,age
,EDRecordSentToEDM
,TimeRNSignature
,timemdsignature
,RNSgntr
,TriageByNameEntered as 'Triage_End'
,TriageStartTime as 'Triage_Start'
,AddedToAdmissionsTrack
,StatusAdmitConfirmed
,SUBSTRING(statusadmitconfirmed,1,4)+'-'+SUBSTRING(statusadmitconfirmed,7,2)+'-'+SUBSTRING(statusadmitconfirmed,1,4)+' '+SUBSTRING(statusadmitconfirmed,9,2)+':'+SUBSTRING(statusadmitconfirmed,11,2)+':00.000' as 'Admit_Cnrfm_String'
,AdmittingMD
,AreaOfCare
,EDMD
,EDMDID
,Specialty
,AccessProceduresED
,MDSgntr
,Arrival
,ArrivalED
,ChiefComplaint
,TransferringFacility
,ReferMD
,PrivateName
FROM [BMH-EDIS-CL]..[WELLUSER].[Patient_Chart] a
LEFT OUTER JOIN [BMH-EDIS-CL]..[WELLUSER].[Patient_Diagnoses] b
ON a.Master_Rec_Id=b.Master_Rec_Id
AND a.Slave_Rec_Id=b.Slave_Rec_Id
WHERE TriageStartTime > '201299999999'
Just getting a single result out of this view takes some time. When I try to add EDMD to the query I am working on, this is what causes the massive slow down, I have not yet been able to review Materialized Views as mentioned in the comments.
We have SQL and SSRS 2k5 on a Win 2k3 virtual server with 4Gb on the virt server. (The server running the virt server has > 32Gb)
When we run our comparison report, it calls a stored proc on database A. The proc pulls data from several tables, and from a view on database B.
If I run Profiler and monitor the calls, I see activity
SQL:BatchStarting SELECT
DATABASEPROPERTYEX(DB_NAME(),
'Collation'),
COLLATIONPROPERTY(CONVERT(char,
DATABASEPROPERTYEX(DB_NAME(),
'collation')), 'LCID')
then wait several minutes till the actual call of the proc shows up.
RPC:Completed exec sp_executesql
N'exec
[procGetLicenseSales_ALS_Voucher]
#CurrentLicenseYear,
#CurrentStartDate, #CurrentEndDate,
''Fishing License'',
#PreviousLicenseYear,
#OpenLicenseAccounts',N'#CurrentStartDate
datetime,#CurrentEndDate
datetime,#CurrentLicenseYear
int,#PreviousLicenseYear
int,#OpenLicenseAccounts
nvarchar(4000)',#CurrentStartDate='2010-11-01
00:00:00:000',#CurrentEndDate='2010-11-30
00:00:00:000',#CurrentLicenseYear=2010,#PreviousLicenseYear=2009,#OpenLicenseAccounts=NULL
then more time, and usually the report times out. It takes about 20 minutes if I let it run in Designer
This Report was working, albeit slowly but still less than 10 minutes, for months.
If I drop the query (captured from profiler) into SQL Server Management Studio, it takes 2 minutes, 8 seconds to run.
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Something has obviously changed, but what change broke the report? How can I test to find out why the SSRS part is taking forever and timing out, but the query runs in about 2 minutes?
Added: Please note, the stored proc returns 18 rows... any time. (We only have 18 products to track.)
The report takes those 18 rows, and groups them and does some sums. No matrix, only one page, very simple.
M Kenyon II
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Ensure that all indexes survived the changes to Database B. If they still exist, check how fragmented they are and reorganize or rebuild as necessary.
Indexes can have a huge impact on performance.
As far as the report taking far longer to run than your query, there can be many reasons for this. Some tricks for getting SSRS to run faster can be found here:
http://www.sqlservercentral.com/Forums/Topic859015-150-1.aspx
Edit:
Here's the relevant information from the link above.
AshMc
I recall some time ago we had the same issue where we were passing in the parameters within SSRS to a SQL Dataset and it would slow it all down compared to doing it in SSMS (minutes compared to seconds like your issue). It appeared that when SSRS was passing in the parameter it was possibly recalculating the value and not storing it once and that was it.
What I did was declare a new TSQL parameter first within the dataset and set it to equal the SSRS parameter and then use the new parameter like I would in SSMS.
eg:
DECLARE #X as int
SET #X = #SSRSParameter
janavarr
Thanks AshMc, this one worked for me. However my issue now is that it will only work with a single parameter and the query won’t run if I want to pass multiple parameter values.
...
AshMc
I was able to find how I did this previously. I created a Temp table placed the values that we wanted to filter on in it then did an inner join on the main query to it. We only use the SSRS Parameters as a filter on what to put in the temp table.
This saved a lot of report run time doing it this way
DECLARE #ParameterList TABLE (ValueA Varchar(20))
INSERT INTO #ParameterList
select ValueA
from TableA
where ValueA = #ValueB
INNER JOIN #ParameterList
ON ValueC = ValueA
Hope this helps,
--Dubs
Could be parameter sniffing. If you've changed some data or some of the tables then the cached plan that will have satisfied the sp for the old data model may not be valid any more.
Answered a very similar thing here:
stored procedure performance issue
Quote:
f you are sure that the sql is exactly the same and that the params are the same then you could be experiencing a parameter sniffing problem .
It's a pretty uncommon problem. I've only had it happen to me once and since then I've always coded away the problem.
Start here for a quick overview of the problem:
http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx
http://elegantcode.com/2008/05/17/sql-parameter-sniffing-and-what-to-do-about-it/
try declaring some local variables inside the sp and allocate the vales of the parameters to them. The use the local variables in place of the params.
It's a feature not a bug but it makes you go #"$#
I am faced with a peculiar requirement which is as follows:
A network-intensive operation is triggered to a server by multiple clients, through a web-interface. However, only one operation is allowed at a time, and hence an entry(tuple) is made in an SQL table to indicate that the operation is in progress. Once the operation is complete (irrespective of success or failure), the appropriate result is displayed back to the client(s), and the corresponding tuple is removed from the SQL table.
Since the operation is network-intensive, a scenario where the operation needs to be "considered" to be cancelled, after some timeout (10 minutes) has to be introduced.
Is there ANY way the lifetime of a row in SQL be associated with a timeout value, so that is is deleted after certain time? My application is primarily written in Java 1.5 and EJB 3.0, using JPA/Hibernate to access Oracle 10g DB engine.
Thanks in advance.
Regards,
Nagendra U M
I would suggest that you try using a timestamp column containing the start time of the task.
A before trigger can be then made to delete the old column before a new one is inserted if the task timed out.
If you want to have multiple tasks with different timeouts, you can even add a column with the timeout in seconds. Just code your trigger accordingly.
I don't know that Oracle has this kind of facility but I think no db engine have this.
If you want to do it at DB level,
you must have a datetime column, e.g.; 'CreatedDate' in
table. This column will have
datetime when record was created.
Write a procedure and put it in a
schedule job. This job will run after every 10 minutes and remove the 10 minutes old records. The query will be like
this.
T-SQL: Please convert it according to your db engine.
DELETE FROM yourtable WHERE CreatedDate < DATEADD(mi, -10, GETDATE())
This will delete all records older than 10 minutes from table.
This is just to give you idea of schedule job. It is in SQL Server. I don't know about Oracle
step_by_step_guide_to_add_a_sql_job_in_sql_server_2005
It sounds like you're implementing a mutex using the database, take a look at this question and see if it helps? Sounds like transactional access to a flag table will solve this for you, as long as you catch both success & failure states in your server code.