SQL query performance - SSIS

I have an UPDATE query that takes 15 hours to complete on the production server. What modifications can I make so that it runs faster?
UPDATE pos
SET pos.is_pub = 1
FROM A pos
WHERE pos.is_pub <> 1
and s_type <= (
SELECT TOP 1 month
FROM B with(nolock)
)
AND isnull(is_pub, 0) <> 1
AND isnull(is_adj, 0) <> 1
Here, s_type and month are actually integers holding a number of months.

Start by moving the subquery to the FROM clause:
UPDATE pos
SET pos.is_pub = 1
FROM A pos CROSS JOIN
(SELECT TOP 1 month
FROM B with (nolock) -- very strange, no `order by`
) b
WHERE pos.is_pub <> 1 AND
pos.s_type <= b.month AND -- very strange, comparing "type" to "month"
(pos.is_adj is null or pos.is_adj <> 1); -- same meaning as isnull(is_adj, 0) <> 1
Given this, there is not much that indexes can do because of the WHERE conditions. Perhaps you are updating essentially all rows in the table, which can be quite expensive. It is often cheaper to re-build the table rather than update it.
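A rough sketch of the "re-build instead of update" idea, run against an in-memory SQLite database as a stand-in for SQL Server. The sample rows, the `id` column, the `A_new` name and the `ORDER BY month` in the subquery are all assumptions for illustration (the original `TOP 1` has no ordering). On SQL Server the same shape would typically use `SELECT ... INTO` followed by a rename, and any indexes or constraints would have to be recreated on the new table.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, s_type INTEGER, is_pub INTEGER, is_adj INTEGER);
    CREATE TABLE B (month INTEGER);
    INSERT INTO B VALUES (6);
    INSERT INTO A (id, s_type, is_pub, is_adj) VALUES
        (1, 3, 0,    NULL),  -- qualifies: becomes published
        (2, 3, 1,    0),     -- already published
        (3, 9, 0,    0),     -- s_type above the cutoff
        (4, 6, 0,    1),     -- adjusted: skipped
        (5, 2, NULL, 0);     -- NULL is_pub fails "is_pub <> 1": skipped
""")

# Write every row once into a new table, computing is_pub on the way,
# then swap the new table in place of the old one.
conn.executescript("""
    CREATE TABLE A_new AS
    SELECT pos.id,
           pos.s_type,
           CASE WHEN pos.is_pub <> 1
                 AND pos.s_type <= (SELECT month FROM B ORDER BY month LIMIT 1)
                 AND (pos.is_adj IS NULL OR pos.is_adj <> 1)
                THEN 1 ELSE pos.is_pub END AS is_pub,
           pos.is_adj
    FROM A AS pos;
    DROP TABLE A;
    ALTER TABLE A_new RENAME TO A;
""")

rows = conn.execute("SELECT id, is_pub FROM A ORDER BY id").fetchall()
print(rows)
```

Whether this beats the UPDATE depends on how many rows actually change; it is only likely to pay off when most of the table is touched.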

Related

I just started learning SQL and I couldn't do the query, can you help me?

There is a column in my SQL query that I can't work out how to compute. A new column must be added to the result below, and its value should be the percent complete. For example, Cupboard = 1 has 7 shelves, and 3 of them are counted (IsCounted = 1). So every row with Cupboard = 1 should show the percentage value of 3/7 in the new column. If none of a cupboard's shelves are counted, the column should show zero percent. How can I do this?
My SQL code:
SELECT a.RegionName,
a.Cupboard,
a.Shelf,
(CASE WHEN ToplamSayım > 0 THEN 1 ELSE 0 END) AS IsCounted
FROM (SELECT p.RegionName,
r.Shelf,
r.Cupboard,
(SELECT COUNT(*)
FROM FAZIKI.dbo.PM_ProductCountingNew
WHERE RegionCupboardShelfTypeId = r.Id) AS ToplamSayım
FROM FAZIKI.dbo.DF_PMRegionType p
JOIN FAZIKI.dbo.DF_PMRegionCupboardShelfType r ON p.Id = r.RegionTypeId
WHERE p.WarehouseId = 45) a
ORDER BY a.RegionName;
The result is as in the picture below:

It looks like a windowed AVG should do the trick, although it's not entirely clear what the partitioning column should be.
The SELECT COUNT can also be simplified to an EXISTS:
SELECT a.RegionName,
a.Cupboard,
a.Shelf,
a.IsCounted,
AVG(a.IsCounted * 1.0) OVER (PARTITION BY a.RegionName, a.Cupboard) Percentage
FROM (
SELECT p.RegionName,
r.Shelf,
r.Cupboard,
CASE WHEN EXISTS (SELECT 1
FROM FAZIKI.dbo.PM_ProductCountingNew pcn
WHERE pcn.RegionCupboardShelfTypeId = r.Id
) THEN 1 ELSE 0 END AS IsCounted
FROM FAZIKI.dbo.DF_PMRegionType p
JOIN FAZIKI.dbo.DF_PMRegionCupboardShelfType r ON p.Id = r.RegionTypeId
WHERE p.WarehouseId = 45
) a
ORDER BY a.RegionName;
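To see what the windowed AVG does, here is a small sketch on an in-memory SQLite database (window functions need SQLite 3.25 or later). The `shelves` table and its rows are invented to match the 3-out-of-7 example from the question, and the result is multiplied by 100 to get a percentage rather than a fraction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened version of the question's derived table "a".
    CREATE TABLE shelves (RegionName TEXT, Cupboard INTEGER, Shelf INTEGER, IsCounted INTEGER);
    INSERT INTO shelves VALUES
        ('R1', 1, 1, 1), ('R1', 1, 2, 1), ('R1', 1, 3, 1), ('R1', 1, 4, 0),
        ('R1', 1, 5, 0), ('R1', 1, 6, 0), ('R1', 1, 7, 0),
        ('R1', 2, 1, 0), ('R1', 2, 2, 0);
""")

# AVG over a 0/1 flag is the fraction of counted shelves in each partition;
# every row of a cupboard receives the same value.
rows = conn.execute("""
    SELECT Cupboard, Shelf, IsCounted,
           100.0 * AVG(IsCounted) OVER (PARTITION BY RegionName, Cupboard) AS Percentage
    FROM shelves
    ORDER BY Cupboard, Shelf
""").fetchall()

percent_by_cupboard = {cupboard: round(pct, 2) for cupboard, _, _, pct in rows}
print(percent_by_cupboard)
```

Cupboard 1 comes out at roughly 42.86 percent (3 of 7) and cupboard 2 at 0.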

Condition for a SQL query to check if dates are between prior period and current month and execute subqueries based on given gate

DECLARE @Todaydate DATE
SET @Todaydate = '12/31/2017'
SELECT
CASE
WHEN DATEDIFF(dd,@Todaydate,getdate()) >= 31
THEN (SELECT a.CU, , b.abc
FROM histhold a, security b
WHERE T_QUANTITY_P <> 0
AND ACCOUNTING_DATE = '04/30/2018'
AND a.cu = b.CU)
ELSE ''
END

Just do the logic in the WHERE clause:
SELECT a.CU, , b.abc
FROM histhold a
INNER JOIN security b
ON a.cu = b.CU
WHERE T_QUANTITY_P <> 0
AND ACCOUNTING_DATE = '04/30/2018'
AND DATEDIFF(dd,@Todaydate,getdate()) >= 31;
Unless this is part of some larger procedure or script, I can't imagine a reason why you would want to execute a query only under certain conditions, as opposed to always executing it and restricting the results based on those conditions (potentially returning an empty recordset).
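A minimal sketch of the "gate in the WHERE clause" idea, using SQLite in memory. The sample rows, the ISO date strings and the `run` helper are assumptions for illustration, only two columns are selected, and `julianday()` stands in for SQL Server's `DATEDIFF`. The gate compares two parameters, so when it is false the query simply returns no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE histhold (cu TEXT, t_quantity_p INTEGER, accounting_date TEXT);
    CREATE TABLE security (cu TEXT, abc TEXT);
    INSERT INTO histhold VALUES ('X1', 10, '2018-04-30'), ('X2', 0, '2018-04-30');
    INSERT INTO security VALUES ('X1', 'alpha'), ('X2', 'beta');
""")

QUERY = """
    SELECT a.cu, b.abc
    FROM histhold AS a
    INNER JOIN security AS b ON a.cu = b.cu
    WHERE a.t_quantity_p <> 0
      AND a.accounting_date = '2018-04-30'
      -- the gate: at least 31 days between the two dates
      AND julianday(:run_date) - julianday(:today_date) >= 31
"""

def run(today_date, run_date):
    """Run the gated query; run_date plays the role of getdate()."""
    return conn.execute(QUERY, {"today_date": today_date, "run_date": run_date}).fetchall()

gate_open = run("2017-12-31", "2018-05-15")    # more than 31 days apart
gate_closed = run("2017-12-31", "2018-01-10")  # only 10 days apart
print(gate_open, gate_closed)
```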

SQL fill gaps with hold

I've encountered a problem I cannot solve with my knowledge, and I haven't found any solutions I understood well enough to solve my problem.
So here is what I try to achieve.
I have a database with the following structure:
node_id, source_time, value
1 , 10:13:15 , 1
2 , 10:13:15 , 1
2 , 10:13:16 , 2
1 , 10:13:19 , 2
1 , 10:13:25 , 3
2 , 10:13:28 , 3
I want to have a sql query to get the following output
time , value1, value2
10:13:15, 1 , 1
10:13:16, 1 , 2
10:13:19, 2 , 2
10:13:25, 3 , 2
10:13:28, 3 , 3
You see, the times are all times that occur from both nodes.
But the values have to be filled in the gaps since node1 has no value for the time :16 and :28.
I got it to the point where I get the 2 columns from one table. That was not the hard part.
SELECT T1.[value], T2.[value]
FROM [db1].[t_value_history] T1, [db1].[t_value_history] T2
WHERE ( T1.node_id = 1 AND T2.node_id = 2)
But the result doesn't look like the way I want it to be.
I found something with COALESCE and another table which holds the previous value. But that looked quite complicated for such an easy thing.
I guess there is an easy SQL solution, but I haven't had much time to get into the subject.
I would be happy to get any idea which function to use.
Thanks so far.
Edit: Changed the database, made a mistake on the last line.
Edit2: I am using SQL Server. Sorry for not clarifying this. Also the values are not necessarily increasing. I just used increasing numbers in this example here.

This works in SQL Server. If you are certain that there is a value for both nodes for the minimum time then you could change the OUTER APPLY to a CROSS APPLY, which would perform better.
WITH times
AS ( SELECT DISTINCT
source_time
FROM dbo.t_value_history
)
SELECT t.source_time ,
n1.value ,
n2.value
FROM times AS t
OUTER APPLY ( SELECT TOP 1
h.value
FROM dbo.t_value_history AS h
WHERE h.node_id = 1
AND h.source_time <= t.source_time
ORDER BY h.source_time DESC
) AS n1
OUTER APPLY ( SELECT TOP 1
h.value
FROM dbo.t_value_history AS h
WHERE h.node_id = 2
AND h.source_time <= t.source_time
ORDER BY h.source_time DESC
) AS n2;
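For anyone who wants to try the "latest value at or before each time" logic without a SQL Server instance, here is a sketch of the same idea in SQLite, loaded with the sample rows from the question. SQLite has no OUTER APPLY, so correlated scalar subqueries with `LIMIT 1` are used in its place; that substitution is mine, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_value_history (node_id INTEGER, source_time TEXT, value INTEGER);
    INSERT INTO t_value_history VALUES
        (1, '10:13:15', 1), (2, '10:13:15', 1), (2, '10:13:16', 2),
        (1, '10:13:19', 2), (1, '10:13:25', 3), (2, '10:13:28', 3);
""")

# For every distinct time, pick each node's most recent value at or before it.
rows = conn.execute("""
    WITH times AS (SELECT DISTINCT source_time FROM t_value_history)
    SELECT t.source_time,
           (SELECT h.value FROM t_value_history AS h
             WHERE h.node_id = 1 AND h.source_time <= t.source_time
             ORDER BY h.source_time DESC LIMIT 1) AS value1,
           (SELECT h.value FROM t_value_history AS h
             WHERE h.node_id = 2 AND h.source_time <= t.source_time
             ORDER BY h.source_time DESC LIMIT 1) AS value2
    FROM times AS t
    ORDER BY t.source_time
""").fetchall()

for row in rows:
    print(row)
```

The rows match the desired output listed in the question.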

You could use conditional aggregation to get the right set of rows:
select vh.source_time,
max(case when vh.node_id = 1 then value end) as value_1,
max(case when vh.node_id = 2 then value end) as value_2
from db1.t_value_history vh
group by vh.source_time;
If you want to fill in the values, then the best solution is lag() with ignore nulls. It is part of the ANSI standard, but SQL Server (which I'm guessing you are using) did not support it before SQL Server 2022. Your values appear to be increasing. If that is the case, you can use a cumulative max:
select vh.source_time,
max(max(case when vh.node_id = 1 then value end)) over (order by vh.source_time) as value_1,
max(max(case when vh.node_id = 2 then value end)) over (order by vh.source_time) as value_2
from db1.t_value_history vh
group by vh.source_time;
In your data, value is increasing, so this works for the data in your example. If that is not the case, a more complex query is needed to fill in the gaps.
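One possible shape for that "more complex query", sketched on SQLite (3.25 or later for window functions) with made-up, non-increasing values. The technique is a swapped-in one, not something from the answer above: a running COUNT of non-NULL values gives every gap the same group number as the last real value before it, and a MAX within that group then picks up the single non-NULL value. The CTE names `pivoted` and `grouped` are just illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_value_history (node_id INTEGER, source_time TEXT, value INTEGER);
    -- Same times as the question, but the values are deliberately not increasing.
    INSERT INTO t_value_history VALUES
        (1, '10:13:15', 5), (2, '10:13:15', 7), (2, '10:13:16', 4),
        (1, '10:13:19', 2), (1, '10:13:25', 9), (2, '10:13:28', 1);
""")

rows = conn.execute("""
    WITH pivoted AS (          -- one row per time, NULL where a node has no reading
        SELECT source_time,
               MAX(CASE WHEN node_id = 1 THEN value END) AS v1,
               MAX(CASE WHEN node_id = 2 THEN value END) AS v2
        FROM t_value_history
        GROUP BY source_time
    ),
    grouped AS (               -- COUNT ignores NULLs, so gaps inherit the previous group
        SELECT source_time, v1, v2,
               COUNT(v1) OVER (ORDER BY source_time) AS g1,
               COUNT(v2) OVER (ORDER BY source_time) AS g2
        FROM pivoted
    )
    SELECT source_time,
           MAX(v1) OVER (PARTITION BY g1) AS value_1,
           MAX(v2) OVER (PARTITION BY g2) AS value_2
    FROM grouped
    ORDER BY source_time
""").fetchall()

for row in rows:
    print(row)
```

The same pattern should carry over to SQL Server 2012 and later, since it only relies on windowed COUNT and MAX.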

This will do it in SQL Server. It is not 'nice' though:
SELECT DISTINCT
T1.source_time,
CASE WHEN T1.node_id = 1 THEN T1.[value] ELSE ISNULL(T2.[value], T3.[value]) END,
CASE WHEN T1.node_id = 1 THEN ISNULL(T2.[value], T3.[Value]) ELSE T1.[value] END
FROM
[db1].[t_value_history] T1
LEFT OUTER JOIN [db1].[t_value_history] T2 ON T2.source_time = T1.source_time
AND T2.node_id <> T1.node_id -- This join looks for a value for the other node at the same time.
LEFT OUTER JOIN [db1].[t_value_history] T3 ON T3.source_time < T1.source_time
AND T3.node_id <> T1.node_id -- If the previous join is empty, this looks for values for the other node at previous times
LEFT OUTER JOIN [db1].[t_value_history] T4 ON T4.source_time > T3.source_time
AND T4.source_time < T1.source_time
AND T4.node_id <> T1.node_id -- This join makes sure there aren't any more recent values
WHERE
T4.node_id IS NULL

How do I use the value from row above when a given column value is zero?

I have a table of items by date (each row is a new date). I am drawing out a value from another column D. I need it to replace 0s though. I need the following logic: when D=0 for that date, use the value in column D from the date prior.
Actually, truth be told, I need it to say, when D is 0, use the value from the latest date where D was not a 0, but the first will get me most of the way there.
Is there a way to build this logic? Maybe a CTE?
Thank you very much.
PS I'm using SSMS 2008.
EDIT: I wasn't very clear at first. The value I want to change is not the date. I want to change the value in D to the latest non-zero value from D, based on date.

Maybe the following query will help. It uses OUTER APPLY to fetch the latest earlier non-zero value. Screenshot #1 shows the sample data and the query output against that data. The query could probably be written better, but this is what I could come up with right now.
Hope that helps.
SELECT ITM.Id
, COALESCE(DAT.New_D, ITM.D) AS D
, ITM.DateValue
FROM dbo.Items ITM
OUTER APPLY (
SELECT
TOP 1 D AS New_D
FROM dbo.Items DAT
WHERE DAT.DateValue < ITM.DateValue
AND DAT.D <> 0
AND ITM.D = 0
ORDER BY DAT.DateValue DESC
) DAT
Screenshot #1:

UPDATE t
SET value = (SELECT t2.value
FROM table t2
WHERE t2.date = (SELECT MAX(t1.date)
FROM table t1
WHERE t1.value != 0
AND t1.date < t.date))
FROM table t
WHERE t.value = 0

You could maybe use something like this as part of an update script...
SET myTable.D = (
SELECT TOP 1 myTable2.D
FROM myTable2
WHERE myTable2.myDateField < myTable.myDateField
AND myTable2.D != 0
ORDER BY myTable2.myDateField DESC)
That's assuming that you want to actually update the data though rather than just replace the values for the purpose of a select query.
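If the data really should be updated in place, a complete statement might look roughly like the sketch below. It runs on SQLite in memory, uses an invented `Items` table with `Id`, `DateValue` and `D` columns, and replaces `TOP 1` with `LIMIT 1`. The `COALESCE` is an addition of mine so that a zero with no earlier non-zero value keeps its 0 instead of becoming NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (Id INTEGER PRIMARY KEY, DateValue TEXT, D INTEGER);
    INSERT INTO Items VALUES
        (1, '2011-01-01', 0),   -- no earlier non-zero value: stays 0
        (2, '2011-01-02', 5),
        (3, '2011-01-03', 0),
        (4, '2011-01-04', 0),
        (5, '2011-01-05', 8),
        (6, '2011-01-06', 0);

    -- Overwrite each zero with the latest earlier non-zero D.
    UPDATE Items
    SET D = COALESCE(
            (SELECT prev.D
               FROM Items AS prev
              WHERE prev.DateValue < Items.DateValue
                AND prev.D <> 0
              ORDER BY prev.DateValue DESC
              LIMIT 1),
            D)
    WHERE D = 0;
""")

rows = conn.execute("SELECT Id, D FROM Items ORDER BY Id").fetchall()
print(rows)
```

Consecutive zeros (rows 3 and 4) both pick up the 5 from row 2, because the subquery skips every row whose D is 0.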

How about:
SELECT
i.ID,
i.DateValue,
D = CASE WHEN I.D <> 0 THEN I.D ELSE X.D END
FROM
Items I
OUTER APPLY (
SELECT TOP 1 S.D
FROM Items S
WHERE S.DATEVALUE < I.DATEVALUE AND S.D <> 0
ORDER BY S.DATEVALUE DESC
) X

SELECT t.id,
CASE WHEN t.D = 0 THEN t0.D
ELSE t.D END
FROM table AS t
LEFT JOIN table AS t0
ON t0.time =
(
SELECT MAX(t1.time) FROM table AS t1
WHERE t1.time < t.time
AND t1.D != 0
)
or if you want to avoid aggregates entirely,
SELECT t.id,
CASE WHEN t.D = 0 THEN t0.D
ELSE t.D END
FROM table AS t
LEFT JOIN table AS t0
ON t0.time < t.time
AND t0.D != 0
LEFT JOIN table AS tx
ON tx.time > t0.time
AND tx.time < t.time
AND tx.D != 0
WHERE tx.id IS NULL -- i.e. there is no later non-zero row between t0 and t

How to re-write the following mysql query

I have a file upload site, and I want to run a maintenance script that will run every day and delete items that haven't been accessed in a week. I log views for each day, and each item into a table:
hit_itemid
hit_date
hit_views
The main table actually has the files that were uploaded. For the purposes of this example, it's just vid_id and vid_title that are in this table, and vid_id will be equal to hit_itemid.
I have a query as follows:
SELECT vid_id,
vid_title,
SUM(case when hit_date >= '2009-09-17' then hit_views else 0 end) as total_hits
FROM videos
LEFT JOIN daily_hits ON vid_id = hit_itemid
WHERE vid_posttime <= '$last_week_timestamp' AND vid_status != 3
GROUP BY hit_itemid
HAVING total_hits < 1
But this always returns a single record.
How can I rewrite this query?

An idea:
SELECT DISTINCT
vid_id, vid_title
FROM
videos v
LEFT JOIN daily_hits dh ON (
v.vid_id = dh.hit_itemid AND dh.hit_date >= '2009-09-17'
)
WHERE
v.vid_posttime <= '$last_week_timestamp' AND v.vid_status != 3
AND dh.hit_itemid IS NULL;
Alternatively (benchmark to see which is faster):
SELECT
vid_id, vid_title
FROM
videos v
WHERE
v.vid_posttime <= '$last_week_timestamp' AND v.vid_status != 3
AND NOT EXISTS (
SELECT 1 FROM daily_hits dh
WHERE v.vid_id = dh.hit_itemid AND dh.hit_date >= '2009-09-17'
)
I'm guessing the first form will be faster, but can't check (I don't
have access to your data). Haven't tested these queries either, for the
same reason.
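Since neither query above was tested, here is a small sanity check of both forms on SQLite in memory, as a stand-in for MySQL. The sample videos, the integer post times and the `:last_week` / `:cutoff` parameter names are invented for illustration. It only checks that the two forms agree on toy data; it says nothing about which is faster on a real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (vid_id INTEGER PRIMARY KEY, vid_title TEXT,
                         vid_posttime INTEGER, vid_status INTEGER);
    CREATE TABLE daily_hits (hit_itemid INTEGER, hit_date TEXT, hit_views INTEGER);
    INSERT INTO videos VALUES
        (1, 'old, never viewed',    100, 1),
        (2, 'old, viewed recently', 100, 1),
        (3, 'old, viewed long ago', 100, 1),
        (4, 'new upload',           900, 1),
        (5, 'old, status 3',        100, 3);
    INSERT INTO daily_hits VALUES
        (2, '2009-09-18', 4), (2, '2009-09-19', 1), (3, '2009-08-01', 7);
""")
params = {"last_week": 500, "cutoff": "2009-09-17"}

# Form 1: LEFT JOIN on recent hits only, keep the videos that found no match.
left_join = conn.execute("""
    SELECT DISTINCT v.vid_id
    FROM videos v
    LEFT JOIN daily_hits dh
           ON v.vid_id = dh.hit_itemid AND dh.hit_date >= :cutoff
    WHERE v.vid_posttime <= :last_week AND v.vid_status != 3
      AND dh.hit_itemid IS NULL
    ORDER BY v.vid_id
""", params).fetchall()

# Form 2: NOT EXISTS with the same condition inside the subquery.
not_exists = conn.execute("""
    SELECT v.vid_id
    FROM videos v
    WHERE v.vid_posttime <= :last_week AND v.vid_status != 3
      AND NOT EXISTS (SELECT 1 FROM daily_hits dh
                      WHERE dh.hit_itemid = v.vid_id AND dh.hit_date >= :cutoff)
    ORDER BY v.vid_id
""", params).fetchall()

print(left_join, not_exists)
```

Both forms return videos 1 and 3: old enough, not status 3, and with no hits on or after the cutoff date.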

First guess: maybe you have to do a
GROUP BY vid_id
instead of
GROUP BY hit_itemid

SELECT
vd.vid_id,
vd.vid_title,
sum(case when dh.hit_date >= '2009-09-17' then dh.hit_views else 0 end) as total_hits
FROM videos vd
LEFT JOIN daily_hits dh ON dh.hit_itemid = vd.vid_id
WHERE vd.vid_posttime <= '$last_week_timestamp' AND vd.vid_status != 3
GROUP BY vd.vid_id
HAVING total_hits < 1
This is how I would write the query, assuming vid_posttime and vid_status are columns of the videos table.
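A quick sketch of why the GROUP BY column matters here, again on SQLite in memory with invented rows. Every video without any hits gets a NULL hit_itemid from the LEFT JOIN, so grouping by hit_itemid folds all of them into a single NULL group, which would explain getting only one record back. The `{group_col}` placeholder and the repeated SUM in HAVING are just conveniences of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (vid_id INTEGER PRIMARY KEY, vid_title TEXT,
                         vid_posttime INTEGER, vid_status INTEGER);
    CREATE TABLE daily_hits (hit_itemid INTEGER, hit_date TEXT, hit_views INTEGER);
    INSERT INTO videos VALUES
        (1, 'never viewed A',  100, 1),
        (2, 'viewed recently', 100, 1),
        (3, 'never viewed B',  100, 1);
    INSERT INTO daily_hits VALUES (2, '2009-09-18', 5);
""")

QUERY = """
    SELECT v.vid_id,
           SUM(CASE WHEN dh.hit_date >= '2009-09-17' THEN dh.hit_views ELSE 0 END) AS total_hits
    FROM videos v
    LEFT JOIN daily_hits dh ON dh.hit_itemid = v.vid_id
    WHERE v.vid_posttime <= 500 AND v.vid_status != 3
    GROUP BY {group_col}
    HAVING SUM(CASE WHEN dh.hit_date >= '2009-09-17' THEN dh.hit_views ELSE 0 END) < 1
    ORDER BY v.vid_id
"""

# Grouping by the outer-joined column: videos 1 and 3 share the NULL group.
grouped_by_hit_itemid = conn.execute(QUERY.format(group_col="dh.hit_itemid")).fetchall()

# Grouping by the videos key: one group per video, as intended.
grouped_by_vid_id = conn.execute(QUERY.format(group_col="v.vid_id")).fetchall()

print(len(grouped_by_hit_itemid), [row[0] for row in grouped_by_vid_id])
```

With this data the first variant yields a single row, while grouping by vid_id yields both unviewed videos.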

Do you definitely have data that satisfies these criteria? You're only considering rows for videos created before a certain timestamp and with a certain status, so perhaps this is limiting your result set to the point where only one video matches.
