Transferring tables with an "insert from location" statement in Sybase IQ is very slow - sql

I am trying to transfer several tables from a Sybase IQ database on one machine, to the same database on another machine (exact same schema and table layout etc).
To do this I'm using an insert from location statement:
insert into <local table> location <other machine> select * from mytablex
This works fine, but the problem is that it is desperately slow. I have a 1 gigabit connection between both machines, but the transfer rate is nowhere near that.
With a 1 gigabyte test file, it takes only 1 or 2 minutes to transfer it via ftp (just as a file, nothing to do with IQ).
But I am only managing 100 gigabytes over 24 hours in IQ. That means the transfer rate is more like 14 or 15 minutes per gigabyte when the data goes through Sybase IQ.
Is there any way I can speed this up?
I saw there is an option to change the packet size, but would that make a difference? Surely if the transfer is 7 times faster for a plain file, the packet size can't be that much of a factor?
Thanks! :)

It appears from the documentation here and here that using insert into is a row-by-row operation, not a bulk operation. This could explain the performance issues you are seeing.
You may want to look at the bulk-loading LOAD TABLE operation instead.
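For reference, a bulk load of a previously extracted flat file might look something like this (the file path, delimiter and column list are illustrative, and the exact clauses vary between IQ versions):

-- hypothetical bulk load of an extracted ASCII file into the target table
LOAD TABLE mytablex
    ( col1, col2, col3 )
FROM '/data/extracts/mytablex.dat'
    FORMAT ascii
    DELIMITED BY '|'
    ESCAPES OFF
    QUOTES OFF
    NOTIFY 100000;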

If I recall correctly, IQ 15.x has known bugs where packetsize is effectively ignored for insert...location...select and the default 512 is always used.
The insert...location...select is typically a bulk TDS operation; however, we have found it to have limited value when working with gigabytes of data, and we built a process around extract/LOAD TABLE that is significantly faster.
I know it's not the answer you want, but performance appears to degrade as the data size grows. Some tables will actually never finish if they are large enough.
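For the extract side, IQ's temporary extract options can write a result set straight to a file; a rough sketch, assuming the option names match your IQ version and using an illustrative file name:

-- hypothetical extract of the source table to a file on the source host
SET TEMPORARY OPTION Temp_Extract_Name1 = 'mytablex.dat';
SET TEMPORARY OPTION Temp_Extract_Binary = 'ON';
SELECT * FROM mytablex;   -- the result set is written to the extract file instead of being returned
SET TEMPORARY OPTION Temp_Extract_Name1 = '';   -- switch extraction back off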
Just a thought: you might want to specify the exact columns and wrap the statement in an exec with dynamic SQL. Dynamic SQL is usually a no-no, but if you need the proc to run in dev/QA and prod environments, there really isn't another option. I'm assuming this will be called in a controlled environment anyway, but here's what I mean:
declare @cmd varchar(2500), @location varchar(255)
set @location = 'SOMEDEVHOST.database_name'
set @cmd = 'insert into localtablename (col1, col2, coln...) location ' +
           '''' + trim(@location) + '''' +
           ' { select col1, col2, coln... from remotetablename }'
select @cmd
execute(@cmd)
go

Related

How to fix MS SQL linked server slowness issue for MariaDB

I'm using a SQL linked server to fetch data from MariaDB.
But I'm running into slowness when I query MariaDB through the linked server.
I used the scenarios below to fetch results (the time taken by each query is noted).
Please suggest any solutions you may have.
Total number of rows in the patient table: 62520
SELECT count(1) FROM [MariaDB]...[webimslt.Patient] -- 2.6 seconds
SELECT * FROM OPENQUERY([MariaDB], 'select count(1) from webimslt.patient') -- 47 ms
SELECT * FROM OPENQUERY([MariaDB], 'select * from webimslt.patient') -- 20 seconds
This isn’t really a fair comparison...
SELECT COUNT(1) is only returning a single number and will probably be using an index to count rows.
SELECT * is returning ALL data from the table.
Returning data is an expensive (slow) process, so it will obviously take time to return all of your data. Then there is the question of data transfer: are the servers connected by a high-speed connection? That is also a factor. It will never be as fast to query over a linked server as it is to query your database directly.
How can you improve the speed? I would start by returning only the data you need, by specifying the columns and adding a where clause. After that, you can probably use indexes in MariaDB to speed things up.
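For example, a narrowed-down remote query might look something like this (the column names and the date filter are made up, since the real schema isn't shown):

-- push the column list and the filter down to MariaDB instead of pulling the whole table back
SELECT *
FROM OPENQUERY([MariaDB],
    'select PatientId, LastName, AdmitDate
     from webimslt.patient
     where AdmitDate >= ''2023-01-01''');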

Why is running a query on SQL Azure so much slower?

I created a trial account on Azure, and I deployed my database from SmarterAsp.
When I run a pivot query on SmarterAsp\MyDatabase, the results appeared in 2 seconds.
However, running the same query on Azure\MyDatabase took 94 seconds.
I use SQL Server 2014 Management Studio (trial) to connect to the servers and run the query.
Is this difference in speed because my account is a trial account?
Some information related to my question:
The query is:
ALTER procedure [dbo].[Pivot_Per_Day]
@iyear int,
@imonth int,
@iddepartment int
as
declare @columnName Nvarchar(max) = ''
declare @sql Nvarchar(max) = ''
select @columnName += quotename(iDay) + ','
from (
Select day(idate) as iDay
from kpivalues where year(idate)=@iyear and month(idate)=@imonth
group by idate
)x
set @columnName=left(@columnName,len(@columnName)-1)
set @sql ='
Select * from (
select kpiname, target, ivalues, convert(decimal(18,2),day(idate)) as iDay
from kpi
inner join kpivalues on kpivalues.idkpi=kpi.idkpi
inner join kpitarget on kpitarget.idkpi=kpi.idkpi
inner join departmentbscs on departmentbscs.idkpi=kpi.idkpi
where iddepartment='+convert(nvarchar(max),@iddepartment)+'
group by kpiname,target, ivalues,idate)x
pivot
(
avg(ivalues)
for iDay in (' + @columnName + ')
) p'
execute sp_executesql @sql
Running this query on 3 different servers gave me very different elapsed times until the pivot table appeared on the screen:
Azure - Elapsed time = 100.165 sec
Smarterasp.net - Elapsed time = 2.449 sec
LocalServer - Elapsed time = 1.716 sec
Regarding my trial account on Azure, I created it mainly to check whether I would get better speed than SmarterAsp when running stored procedures like the one above.
For my database I chose Service Tier: Basic, Performance Level: Basic (5 DTUs), and Max Size 2 GB.
My database has 16 tables, one table has 145284 rows, and the database size is 11 MB. It's a test database for my app.
My questions are:
What can I do, to optimize this query (sp)?
Is Azure recommended for small databases (100mb-1Gb)? I mean performance vs. cost!
Conclusions based on your inputs:
I made the suggested changes to the query and the performance improved by more than 50% - thank you Remus.
I tested my query on Azure S2 and the elapsed time for the updated query was 11 seconds.
I tested the query again on P1 and the elapsed time was 0.5 seconds :)
The same updated query on SmarterASP had an elapsed time of 0.8 seconds.
Now it's clear to me what the Azure tiers are and how important it is to have a well-written query (I even learned what an index is and its advantages/disadvantages).
Thank you all,
Lucian
This is first and foremost a question of performance. You are dealing with poorly performing code on your part, and you must identify the bottleneck and address it. I'm talking about the bad 2 seconds performance now. Follow the guidelines at How to analyse SQL Server performance. Once you get this query to execute locally in a time acceptable for a web app (less than 5 ms), then you can ask the question of porting it to Azure SQL DB. Right now your trial account is only highlighting the existing inefficiencies.
After update
...
@iddepartment int
...
iddepartment='+convert(nvarchar(max),@iddepartment)+'
...
so what is it? is the iddepartment column an int or an nvarchar? And why use (max)?
Here is what you should do:
parameterize @iddepartment in the inner dynamic SQL
stop doing the nvarchar(max) conversion; make the iddepartment column and the @iddepartment parameter types match
ensure there are indexes on iddepartment and all the idkpi columns
Here is how to parameterize the inner SQL:
set @sql = N'
Select * from (
select kpiname, target, ivalues, convert(decimal(18,2),day(idate)) as iDay
from kpi
inner join kpivalues on kpivalues.idkpi=kpi.idkpi
inner join kpitarget on kpitarget.idkpi=kpi.idkpi
inner join departmentbscs on departmentbscs.idkpi=kpi.idkpi
where iddepartment=@iddepartment
group by kpiname,target, ivalues,idate)x
pivot
(
avg(ivalues)
for iDay in (' + @columnName + N')
) p'
execute sp_executesql @sql, N'@iddepartment INT', @iddepartment;
The covering indexes are, by far, the most important fix. That obviously requires more info than is present here. Read Designing Indexes, including all sub-chapters.
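Without the schema it is only a guess where the columns live, but the suggested indexes would look roughly like this (table/column placement is assumed from the join and filter conditions):

-- assumed: iddepartment lives on departmentbscs and all joins are on idkpi
CREATE INDEX IX_departmentbscs_iddepartment ON departmentbscs (iddepartment, idkpi);
CREATE INDEX IX_kpivalues_idkpi ON kpivalues (idkpi) INCLUDE (idate, ivalues);
CREATE INDEX IX_kpitarget_idkpi ON kpitarget (idkpi) INCLUDE (target);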
As a more general comment: this sort of query befits columnstores more than rowstores, although I reckon the data size is basically tiny. Azure SQL DB supports updateable clustered columnstore indexes, which you can experiment with in anticipation of serious data sizes. They do require Enterprise/Developer edition on a local box, true.
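If you do experiment with that, the statement itself is a one-liner (shown against kpivalues purely as an example):

-- the table must not already have a clustered index (or convert it with DROP_EXISTING = ON)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_kpivalues ON kpivalues;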
(Update: the original question has been changed to also ask how to optimise the query - which is a good question as well. The original question, why the difference, is what this answer is about.)
The performance of individual queries is heavily affected by the performance tiers. I know the documentation implies the tiers are about load; that is not strictly true.
I would re-run your test with an S2 database as a starting point and go from there.
Being on a trial subscription does not in itself affect performance, but with the free account you are probably using a B level, which isn't really usable for anything real - certainly not for a query that takes 2 seconds to run locally.
Even moving between, say, S1 and S2 will show a noticeable difference in performance of an individual query.
If you want to experiment, do remember you are charged a day for "any part of a day", which is probably okay for S level but be careful when testing P level.
For background; when Azure introduced the new tiers last year, they changed the hosting model for SQL. It used to be that many databases would run on a shared sqlserver.exe. In the new model, each database effectively gets its own sqlserver.exe that runs in a resource constrained sandbox. That is how they control the "DTU usage" but also affects general performance.
It has nothing to do with your account being a trial; it's due to the lower performance level you have selected.
On the other service (SmarterAsp), and when running a local instance, you probably have size restrictions rather than performance restrictions.
At this point it's impossible to say exactly what a DTU means, or what DTU number corresponds to a SQL Server installed on your local machine or at any other hosting provider.
However, there is some good analysis of this (https://cbailiss.wordpress.com/2014/09/16/performance-in-new-azure-sql-database-performance-tiers/), though nothing official.

What SQL query is used when opening a table in SAS Guide via ODBC access

In SAS Enterprise Guide 6.1 I have a libname assigned to a database through ODBC.
If I select a table attached to the libname in the Server List panel and open it from the right-mouse menu, I get a table that I can browse.
Is it possible to see which SQL query the opening of the table sends to the ODBC interface?
Addition 1:
I would like to compare the performance when running a proc sql query:
proc sql;
select *
from temp.cases (obs=100);
quit;
And when opening the table with the right-mouse menu, with the Tools > Options > Data > Performance > "Maximum number of rows..." setting set to 100.
In order to explain the differences in performance, I would need to know which query the right-mouse-menu opening of the table uses. Does it read the full table and then show 100 rows, or read just 100 rows and then show those 100 rows? There could be a huge difference in performance between these two ways of showing data.
Or is the only way to find out the query used when opening the data to look at the log of the server that processed the ODBC query?
Addition 2:
The problem I had was caused by the string length of some fields, which had grown to the maximum of 32767. With 48 string fields that was 48 * 32767 = roughly 1.5 MB per row! Apparently no strings had an "end-of-record" mark, which caused the huge data traffic between the SAS server and the SAS client.
After the data was reformatted so that the string length was at most 255, one row took only 48 * 255 = about 12 KB, which made a tremendous difference in speed when viewing the data by "opening" the table in the SAS Guide viewer. A similar performance loss was not seen when outputting the same data to a "SAS Report".
I'm not sure that it is possible to see a SQL version of what is happening. Since it is SAS, it is probably using a data step or equivalent to populate the table browser (like using the obs= option on a data step set statement).
However, if you are simply looking for a proc sql equivalent, the outobs option may work for you.
proc sql outobs=10;
create table temp2 as select * from temp;
quit;

Copy lots of table records from SQL Server Dev to Prod

I have a table with about 5 million records, and I only need to move the last 1 million to production (the other 4 million are already there). What is the best way to do this so I don't have to recopy the entire table each time?
A little faster will probably be:
Insert into prod.dbo.table (column1, column2....)
Select column1, column2.... from dev.dbo.table d
where not exists (
select 1 from prod.dbo.table pc where pc.pkey = d.pkey
)
But you need to tell us whether these tables are on the same server or not.
Also how frequently is this run and how robust does it need to be? There are alternative solutions depending on your requirements.
Given this late-arriving gem from the OP ("no need to compare as I know the IDs > X"), you do not have to do an expensive comparison. You can just use:
Insert into prod.dbo.table (column1, column2....)
Select column1, column2.... from dev.dbo.table d
where ID > x
This will be far more efficient as you are only transferring the rows you need.
Edit: (Sorry for revising so much. I'm understanding your question better now)
Insert into TblProd
Select * from TblDev where
pkey not in (select pkey from tblprod)
This should only copy over records that aren't already in your target table.
Since they are on separate servers, that changes everything. In short: in order to know what isn't in DEV you need to compare everything in DEV to everything in PROD anyway, so there is no simple way to avoid comparing huge datasets.
Some different strategies used for replication between PROD and DEV systems:
A. Backup and restore the whole database across and apply scripts afterwards to clean it up
B. Implement triggers in the PROD database that record the changes, then copy only the changed records across (a rough trigger sketch follows after this list)
C. Identify some kind of partition or set of records that you know doesn't change (e.g. older than 12 months), and only refresh those that aren't in that dataset.
D. Copy ALL of prod into a staging table on the DEV server using SSIS. Use a very similar query to above to only insert new records across the database. Delete the staging table.
E. You might be able to find a third party SSIS component that does this efficiently. Out of the box, SSIS is inefficient at comparative updates.
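For option B, a rough sketch of the change-capture side might look like this (the table, key and trigger names are all made up):

-- hypothetical change-log table and trigger on the PROD side
CREATE TABLE dbo.YourTableChangeLog
(
    pkey      INT      NOT NULL,
    ChangedAt DATETIME NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER dbo.trg_YourTable_LogChanges
ON dbo.YourTable
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- record the keys of changed rows; the copy job later ships only these rows across
    INSERT INTO dbo.YourTableChangeLog (pkey)
    SELECT pkey FROM inserted;
END
GO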
Do you actually have an idea of what those last million records are? i.e. are they for a location or a date or something? Can you write a select to identify them?
Based on this comment:
no need to compare as I know the IDs > X will work
You can run this on the DEV server, assuming you have created a linked server called PRODSERVER on the DEV server
INSERT INTO DB.dbo.YOURTABLE (COL1, COL2, COL3...)
SELECT COL1, COL2, COL3...
FROM PRODSERVER.DB.dbo.YOURTABLE
WHERE ID > X
Look up 'SQL Server Linked Servers' for more information on how to create one.
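If the linked server doesn't exist yet, creating it on the DEV box looks roughly like this (the host name and the security mapping are placeholders; adjust to your environment):

-- run on the DEV server; 'prod-host-name' is a placeholder
EXEC sp_addlinkedserver
    @server     = 'PRODSERVER',
    @srvproduct = '',
    @provider   = 'SQLNCLI',
    @datasrc    = 'prod-host-name';

-- map logins; @useself = 'TRUE' uses the caller's own credentials
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = 'PRODSERVER',
    @useself    = 'TRUE';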
This is fine for a one-off, but if you do this regularly you might want to make something more robust.
For example, you could create a script that exports the data to a file using BCP.EXE, copies it across to DEV, and imports it again. This is more reliable, as it works in one batch rather than requiring a network connection the whole time.
If the tables are on the same server, you can do something like this
I am using MySQL, so maybe the syntax will be a little bit different, but in my opinion everything should be about the same.
INSERT INTO newTable (columnsYouWantToCopy)
SELECT columnsYouWantToCopy
FROM oldTable WHERE clauseWhichGivesYouOnlyRecordsYouNeed
If on another server, you can do something like this:
http://dev.mysql.com/doc/refman/5.0/en/select-into.html
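In rough terms, the export/import route that page describes looks like this (the file path and delimiters are only examples):

-- export only the rows you need to a flat file on the source server
SELECT columnsYouWantToCopy
INTO OUTFILE '/tmp/oldTable_export.csv'
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
FROM oldTable
WHERE clauseWhichGivesYouOnlyRecordsYouNeed;

-- then load the file into the target table on the other server
LOAD DATA INFILE '/tmp/oldTable_export.csv'
INTO TABLE newTable
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';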

Report on SQL/SSRS 2k5 takes > 10 minutes, query < 3 mins

We have SQL and SSRS 2k5 on a Win 2k3 virtual server with 4 GB of RAM on the virtual server. (The host running the virtual server has > 32 GB.)
When we run our comparison report, it calls a stored proc on database A. The proc pulls data from several tables, and from a view on database B.
If I run Profiler and monitor the calls, I see activity
SQL:BatchStarting   SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation'),
                    COLLATIONPROPERTY(CONVERT(char, DATABASEPROPERTYEX(DB_NAME(), 'collation')), 'LCID')
Then it waits several minutes until the actual call of the proc shows up:
RPC:Completed   exec sp_executesql N'exec [procGetLicenseSales_ALS_Voucher] @CurrentLicenseYear, @CurrentStartDate, @CurrentEndDate, ''Fishing License'', @PreviousLicenseYear, @OpenLicenseAccounts',
                N'@CurrentStartDate datetime, @CurrentEndDate datetime, @CurrentLicenseYear int, @PreviousLicenseYear int, @OpenLicenseAccounts nvarchar(4000)',
                @CurrentStartDate='2010-11-01 00:00:00:000', @CurrentEndDate='2010-11-30 00:00:00:000', @CurrentLicenseYear=2010, @PreviousLicenseYear=2009, @OpenLicenseAccounts=NULL
Then more time passes, and usually the report times out. It takes about 20 minutes if I let it run in Designer.
This report had been working for months, albeit slowly, but still in less than 10 minutes.
If I drop the query (captured from profiler) into SQL Server Management Studio, it takes 2 minutes, 8 seconds to run.
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Something has obviously changed, but what change broke the report? How can I test to find out why the SSRS part is taking forever and timing out, but the query runs in about 2 minutes?
Added: Please note, the stored proc returns 18 rows... any time. (We only have 18 products to track.)
The report takes those 18 rows, and groups them and does some sums. No matrix, only one page, very simple.
M Kenyon II
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Ensure that all indexes survived the changes to Database B. If they still exist, check how fragmented they are and reorganize or rebuild as necessary.
Indexes can have a huge impact on performance.
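A quick way to check fragmentation and tidy it up (the table name is a placeholder) might be:

-- list fragmentation for the indexes on the table in question
SELECT i.name, ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.YourTable'), NULL, NULL, 'LIMITED') ps
JOIN sys.indexes i ON i.object_id = ps.object_id AND i.index_id = ps.index_id;

-- light fragmentation: reorganize; heavy fragmentation: rebuild
ALTER INDEX ALL ON dbo.YourTable REORGANIZE;
-- ALTER INDEX ALL ON dbo.YourTable REBUILD;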
As far as the report taking far longer to run than your query, there can be many reasons for this. Some tricks for getting SSRS to run faster can be found here:
http://www.sqlservercentral.com/Forums/Topic859015-150-1.aspx
Edit:
Here's the relevant information from the link above.
AshMc
I recall some time ago we had the same issue, where we were passing parameters within SSRS to a SQL dataset and it would slow everything down compared to doing it in SSMS (minutes compared to seconds, like your issue). It appeared that when SSRS passed in the parameter it was possibly recalculating the value rather than storing it once, and that was it.
What I did was declare a new T-SQL variable first within the dataset, set it equal to the SSRS parameter, and then use the new variable like I would in SSMS.
eg:
DECLARE @X as int
SET @X = @SSRSParameter
janavarr
Thanks AshMc, this one worked for me. However my issue now is that it will only work with a single parameter and the query won’t run if I want to pass multiple parameter values.
...
AshMc
I was able to find how I did this previously. I created a temp table, placed the values that we wanted to filter on into it, then did an inner join from the main query to it. We only use the SSRS parameters as a filter on what to put in the temp table.
Doing it this way saved a lot of report run time.
DECLARE @ParameterList TABLE (ValueA Varchar(20))

INSERT INTO @ParameterList
select ValueA
from TableA
where ValueA = @ValueB

-- then, in the main query:
INNER JOIN @ParameterList
ON ValueC = ValueA
Hope this helps,
--Dubs
Could be parameter sniffing. If you've changed some data or some of the tables, then the cached plan that satisfied the sp for the old data model may not be valid any more.
Answered a very similar thing here:
stored procedure performance issue
Quote:
If you are sure that the SQL is exactly the same and that the params are the same, then you could be experiencing a parameter sniffing problem.
It's a pretty uncommon problem. I've only had it happen to me once and since then I've always coded away the problem.
Start here for a quick overview of the problem:
http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx
http://elegantcode.com/2008/05/17/sql-parameter-sniffing-and-what-to-do-about-it/
Try declaring some local variables inside the sp and assigning the values of the parameters to them. Then use the local variables in place of the params.
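A minimal sketch of that workaround, with a made-up procedure, table and parameters (on SQL 2005, DECLARE and SET have to be separate statements):

-- hypothetical proc showing the local-variable workaround for parameter sniffing
CREATE PROCEDURE dbo.GetSalesForRange
    @StartDateParam datetime,
    @EndDateParam   datetime
AS
BEGIN
    SET NOCOUNT ON

    -- copy the parameters into local variables so the optimizer
    -- compiles the plan without sniffing the caller's specific values
    DECLARE @StartDate datetime
    DECLARE @EndDate   datetime
    SET @StartDate = @StartDateParam
    SET @EndDate   = @EndDateParam

    SELECT LicenseType, COUNT(*) AS Sales   -- table and columns are illustrative
    FROM dbo.LicenseSales
    WHERE SaleDate BETWEEN @StartDate AND @EndDate
    GROUP BY LicenseType
END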
It's a feature not a bug but it makes you go #"$#