IsNull is slow, do I have another option? - sql

Having the following query:
select
tA.Name
,tA.Prop1
,tA.Prop2
,( select sum(tB.Values)
from tableB tB
where tA.Prop1 = tB.Prop1
and tA.Prop2 = tB.Prop2
) as Total
from tableA tA
This query takes 1 second to run, BUT it gives me the wrong SUM when Prop2 is null.
I changed the query to use ISNULL:
and ISNULL(tA.Prop2,-1) = ISNULL(tB.Prop2,-1)
The data is correct now, but the query takes almost 7 seconds.
Is there a faster way to do this?
Note: this is just a partial, simplified version of a more complex query, but the basic idea is here.

From your description, = is using an index, but the isnull() blocks the use of an index. This is a bit hard to get around in SQL Server.
One way is to break the logic into two sums:
select tA.Name, tA.Prop1, tA.Prop2,
(isnull((select sum(tB.Values)
from tableB tB
where tA.Prop1 = tB.Prop1 and tA.Prop2 = tB.Prop2
), 0) +
isnull((select sum(tB.Values)
from tableB tB
where tA.Prop1 = tB.Prop1 and
tA.Prop2 is null and tB.Prop2 is null
), 0)
) as Total
from tableA tA;

I ended up using
AND (
(tA.Prop2= tB.Prop2)
OR (tA.Prop2 IS NULL AND tB.Prop2 IS NULL )
)
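For anyone who wants to compare the three predicates side by side, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (so ISNULL becomes IFNULL), the sample rows are made up, and the column is called Val instead of Values to stay clear of the keyword. It only illustrates the results, not the timings.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (Name TEXT, Prop1 INTEGER, Prop2 INTEGER);
    CREATE TABLE tableB (Prop1 INTEGER, Prop2 INTEGER, Val INTEGER);
    INSERT INTO tableA VALUES ('a', 1, 10), ('b', 1, NULL);
    INSERT INTO tableB VALUES (1, 10, 5), (1, 10, 7), (1, NULL, 3), (1, NULL, 4);
""")

def totals(prop2_predicate):
    """Run the correlated-subquery SUM with the given Prop2 comparison."""
    sql = f"""
        SELECT tA.Name,
               (SELECT SUM(tB.Val)
                  FROM tableB tB
                 WHERE tA.Prop1 = tB.Prop1 AND {prop2_predicate}) AS Total
          FROM tableA tA
         ORDER BY tA.Name
    """
    return conn.execute(sql).fetchall()

# Plain equality: NULL = NULL is never true, so row 'b' gets no total.
plain = totals("tA.Prop2 = tB.Prop2")

# Explicit NULL handling, as in the self-answer above.
null_safe = totals(
    "(tA.Prop2 = tB.Prop2 OR (tA.Prop2 IS NULL AND tB.Prop2 IS NULL))"
)

# Wrapping both sides in a function (IFNULL here, ISNULL in T-SQL).
wrapped = totals("IFNULL(tA.Prop2, -1) = IFNULL(tB.Prop2, -1)")

print(plain, null_safe, wrapped)
```

Both NULL-aware predicates return the same totals. Which one the optimizer handles better has to be checked against the real execution plan.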

Related

postgres: COUNT, DISTINCT is not implemented for window functions

I am trying to use COUNT(DISTINCT column) OVER(PARTITION BY column), that is, COUNT(DISTINCT ...) combined with a window function (OVER).
I get an error like the one in the title and can't get it to work.
I have looked into how to deal with this error, but I have not found an example that handles a query as complex as the one below, and I am not sure how to proceed.
The COUNT in question is on the line marked "<--- here".
How can such a complex query be fixed without slowing it down?
WITH RECURSIVE "cte" AS((
SELECT
"videos_productvideocomment"."id",
"videos_productvideocomment"."user_id",
"videos_productvideocomment"."video_id",
"videos_productvideocomment"."parent_id",
"videos_productvideocomment"."text",
"videos_productvideocomment"."commented_at",
"videos_productvideocomment"."edited_at",
"videos_productvideocomment"."created_at",
"videos_productvideocomment"."updated_at",
"videos_productvideocomment"."id" AS "root_id"
FROM
"videos_productvideocomment"
WHERE
(
"videos_productvideocomment"."parent_id" IS NULL
AND "videos_productvideocomment"."video_id" = 'f264433c-c0af-49cc-8b40-84453da71b2d'
)
) UNION(
SELECT
"videos_productvideocomment"."id",
"videos_productvideocomment"."user_id",
"videos_productvideocomment"."video_id",
"videos_productvideocomment"."parent_id",
"videos_productvideocomment"."text",
"videos_productvideocomment"."commented_at",
"videos_productvideocomment"."edited_at",
"videos_productvideocomment"."created_at",
"videos_productvideocomment"."updated_at",
"cte"."root_id" AS "root_id"
FROM
"videos_productvideocomment"
INNER JOIN
"cte"
ON "videos_productvideocomment"."parent_id" = "cte"."id"
))
SELECT
*,
EXISTS(
SELECT
(1) AS "a"
FROM
"videos_productvideolikecomment" U0
WHERE
(
U0."comment_id" = t."id"
AND U0."user_id" = '3bd3bc86-0335-481e-9fd2-eb2fb1168f48'
)
LIMIT 1
) AS "liked"
FROM
(
SELECT DISTINCT
"cte"."id",
"cte"."created_at",
"cte"."updated_at",
"cte"."user_id",
"cte"."text",
"cte"."commented_at",
"cte"."edited_at",
"cte"."parent_id",
"cte"."video_id",
"cte"."root_id" AS "root_id",
COUNT(DISTINCT "cte"."root_id") OVER(PARTITION BY "cte"."root_id") AS "reply_count", -- <--- here
COUNT("videos_productvideolikecomment"."id") OVER(PARTITION BY "cte"."id") AS "liked_count"
FROM
"cte"
LEFT OUTER JOIN
"videos_productvideolikecomment"
ON (
"cte"."id" = "videos_productvideolikecomment"."comment_id"
)
) t
WHERE
t."id" = t."root_id"
ORDER BY
CASE
WHEN t."user_id" = '3bd3bc86-0335-481e-9fd2-eb2fb1168f48' THEN 0
ELSE 1
END ASC,
"liked_count" DESC
DISTINCT looks for duplicates and removes them, but on a large data set that takes a long time. I think it would be faster to do this intermediate processing of the records in application code. Thanks.
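A different approach from the one suggested above is to compute each count in its own grouped subquery and join the results, so no DISTINCT is needed inside a window function. The sketch below is only an illustration: it runs on SQLite through Python's sqlite3 rather than on Postgres, the cte and likes tables are tiny made-up stand-ins for the recursive CTE and the like table, and it assumes reply_count means the number of non-root comments in a thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flattened stand-in for the recursive CTE: one row per comment.
    CREATE TABLE cte (id INTEGER, root_id INTEGER);
    -- Stand-in for the like table.
    CREATE TABLE likes (id INTEGER, comment_id INTEGER);
    INSERT INTO cte VALUES (1, 1), (2, 1), (3, 1), (4, 4);
    INSERT INTO likes VALUES (10, 1), (11, 1), (12, 2), (13, 4);
""")

rows = conn.execute("""
    SELECT c.id,
           r.reply_count,
           COALESCE(l.liked_count, 0) AS liked_count
      FROM cte c
      JOIN (SELECT root_id, COUNT(DISTINCT id) - 1 AS reply_count
              FROM cte
             GROUP BY root_id) r      -- replies per thread, root excluded
        ON r.root_id = c.root_id
      LEFT JOIN (SELECT comment_id, COUNT(*) AS liked_count
                   FROM likes
                  GROUP BY comment_id) l   -- likes per comment
        ON l.comment_id = c.id
     WHERE c.id = c.root_id               -- keep root comments only
     ORDER BY c.id
""").fetchall()

print(rows)
```

Because each aggregate is computed before the join, the like rows no longer multiply the comment rows, which is what made DISTINCT necessary in the first place.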

Conditional statements in "WHERE" in SQL Server

What I want to achieve is a switch/case in the WHERE clause: test whether a statement returns something and, if it returns NULL, use a different statement instead.
Sample:
SELECT [THIS_COLUMN]
FROM [THIS_TABLE]
WHERE (IF THIS [ID] RETURNS NULL THEN DO THIS SUBQUERY)
What I mean is that it will do this query first.
SELECT [THIS_COLUMN]
FROM [THIS_TABLE]
WHERE [ID] = 'SOMETHING'
If this returns NULL, do this query instead:
SELECT [THIS_COLUMN]
FROM [THIS_TABLE]
WHERE ID = (SELECT [SOMETHING] FROM [OTHER_TABLE]
WHERE [SOMETHING_SPECIFIC] = 'SOMETHING SPECIFIC')
Note that the expected results from the intended query vary from 30 rows up to 15k rows. Hope it helps.
Adding more information:
The results for this query will be used for another query but will just focus on this query.
Providing a real case scenario:
[THIS_COLUMN] is expected to have a list of VALUES.
[THIS_TABLE] contains the latest data only(let's say 1 year's worth of data) while the [OTHER_TABLE] contains the historical data.
What I want to achieve is this: when I query for data that is not within the 1 year's worth of data, i.e. 'SOMETHING' is not within the 1-year scope (or in my case the query returns NULL), I will use the other query, where I look up 'SOMETHING_SPECIFIC' (or maybe 'SOMETHING' from the first statement makes more sense) in the historical table.
If I am reading between the lines correctly, this might work:
SELECT THIS_COLUMN
FROM dbo.THIS_TABLE TT
WHERE TT.ID = 'SOMETHING'
OR TT.ID = (SELECT OT.SOMETHING
FROM dbo.OTHER_TABLE OT
WHERE OT.SOMETHING_SPECIFIC = 'SOMETHING SPECIFIC'
AND NOT EXISTS (SELECT 1
FROM dbo.THIS_TABLE sq
WHERE sq.ID = 'SOMETHING'
AND THIS_COLUMN IS NOT NULL))
Note, however, that this could easily not be particularly performant.
You can use union all and not exists:
select this_column
from this_table
where id = 'something'
union all
select this_column
from this_table
where
not exists (select this_column from this_table where id = 'something')
and id = (select something from other_table where something_specific = 'something specific')
The first union member attempts to find rows that match the first condition, while the other one uses the subquery. The not exists prevents the second member from returning anything if the first member found a match.
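To see both branches of the union all in action, here is a small sketch. It runs on SQLite via Python's sqlite3 instead of SQL Server, and the rows and the values 'recent', 'archived' and 'key' are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE this_table (id TEXT, this_column TEXT);
    CREATE TABLE other_table (something TEXT, something_specific TEXT);
    INSERT INTO this_table VALUES
        ('recent', 'r1'), ('recent', 'r2'), ('archived', 'a1');
    INSERT INTO other_table VALUES ('archived', 'key');
""")

QUERY = """
    SELECT this_column FROM this_table WHERE id = :wanted
    UNION ALL
    SELECT this_column FROM this_table
     WHERE NOT EXISTS (SELECT 1 FROM this_table WHERE id = :wanted)
       AND id = (SELECT something FROM other_table
                  WHERE something_specific = :specific)
"""

def lookup(wanted):
    """Return the matching values, falling back to the other_table lookup."""
    params = {"wanted": wanted, "specific": "key"}
    return sorted(row[0] for row in conn.execute(QUERY, params))

found = lookup("recent")      # first member matches, second is suppressed
fallback = lookup("missing")  # first member is empty, second takes over

print(found, fallback)
```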
90% of the time you can use a query batch (i.e. a sequence of T-SQL statements) in a single SqlCommand object or SQL Server client session, so with that in mind you could do this:
DECLARE @foo nvarchar(50) = (
SELECT
[THIS_COLUMN]
FROM
[THIS_TABLE]
WHERE
[ID] = 'SOMETHING'
);
IF @foo IS NULL
BEGIN
SELECT
[THIS_COLUMN]
FROM
[THIS_TABLE]
WHERE
[ID] = (
SELECT
[SOMETHING]
FROM
[OTHER_TABLE]
WHERE
[SOMETHING_SPECIFIC] = 'SOMETHING SPECIFIC'
)
END
ELSE
BEGIN
SELECT @foo AS [THIS_COLUMN];
END
That said, SELECT ... FROM ... WHERE x IN ( SELECT y FROM ... ) is a code-smell in a query - you probably need to rethink your solution entirely.

Avoiding aggregation when selecting values from tables

I have the following code which selects value from table2 when 'some string' occurs more than once in 1990
SELECT a.value, COUNT(*) AS test
FROM table1 c
JOIN table2 a
ON c.value2 = a.value_2
JOIN table3 o
ON c.value3 = o.value_3
AND o.value4 = 1990
WHERE c.string = 'Some string'
GROUP BY a.value
HAVING COUNT(*) > 1
This works fine, but I am attempting to write a query that produces a similar result without using aggregation. I just need to select the values that have more than 1 matching c.string, rather than counting and selecting the count as well. I thought about searching for pairs of 'some string' occurring in 1990 for a value but am unsure how to execute this. Pointing me in the right direction would be appreciated! I am struggling to find any documentation on this. Thank you!
Use window function ROW_NUMBER() to assign a sequence number within the rows of each table2.value. And use window function FIRST_VALUE() to get the largest row number for each table2.value. Use DISTINCT to remove the duplicates:
select distinct value, first_value(rn) over (partition by value order by rn desc) as count
from
(
SELECT a.value , row_number() over (partition by a.value order by null) rn
FROM table1 c
JOIN table2 a
ON c.value2 = a.value_2
JOIN table3 o
ON c.value3 = o.value_3
AND o.value4 = 1990
WHERE c.string = 'Some string' ) t
where rn > 1;
To check for duplicates, you can use 'WHERE EXISTS', as a starting point. You could start by reading this:
https://www.w3schools.com/sql/sql_exists.asp
This will give you quite a long, cumbersome piece of code compared to using aggregation. But I expect that's the point of the task - to show how useful aggregation is.
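As a rough illustration of the EXISTS idea, the sketch below finds values that occur more than once by looking for a second row with the same value. It makes several assumptions: the three-way join is collapsed into a single made-up table called hits, each row has a unique id to tell the pair apart, and it runs on SQLite through Python's sqlite3. The DISTINCT only collapses repeated output rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per match of 'Some string' in 1990.
    CREATE TABLE hits (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO hits (value) VALUES ('x'), ('x'), ('y'), ('z'), ('z'), ('z');
""")

# No aggregate: a value qualifies if another row with the same value exists.
with_exists = conn.execute("""
    SELECT DISTINCT h.value
      FROM hits h
     WHERE EXISTS (SELECT 1
                     FROM hits other
                    WHERE other.value = h.value
                      AND other.id <> h.id)
     ORDER BY h.value
""").fetchall()

# The aggregate version from the question, for comparison.
with_having = conn.execute("""
    SELECT value FROM hits GROUP BY value HAVING COUNT(*) > 1 ORDER BY value
""").fetchall()

print(with_exists, with_having)
```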

SQL issue with NULL values on SUM

I'm currently working on some SQL, but I'm running into a bit of an issue.
I've got this query that looks for cash transactions and takes off the cashback, but sometimes there are no cash transactions, so that value turns into NULL, and you can't subtract from NULL.
I've tried to put an ISNULL around it, but it still turns into NULL.
Can anyone help me with this?
;WITH tran_payment AS
(
SELECT 1 AS payment_method, NULL AS payment_amount, null as tran_header_cid
UNION ALL
SELECT 998 AS payment_method, 2 AS payment_amount, NULL as tran_header_cid
),
paytype AS
(
SELECT 1 AS mopid, 2 AS mopshort
),
tran_header AS
(
SELECT 1 AS cid
)
SELECT p.mopid AS mopid,
p.mopshort AS descript,
payment_value AS PaymentValue,
ISNULL(DeclaredValue, 0.00) AS DeclaredValue
from paytype p
LEFT OUTER JOIN (SELECT CASE
When (tp.payment_method = 1)
THEN
(ISNULL(SUM(tp.payment_amount), 0)
- (SELECT ISNULL(SUM(ABS(tp.payment_amount)), 0)
FROM tran_payment tp
INNER JOIN tran_header th on tp.tran_header_cid = th.cid
WHERE payment_method = 998
) )
ELSE SUM(tp.payment_amount)
END as payment_value,
tp.payment_method,
0 as DeclaredValue
FROM tran_header th
LEFT OUTER JOIN tran_payment tp
ON tp.tran_header_cid = th.cid
GROUP BY payment_method) pmts
ON p.mopid = pmts.payment_method
Maybe COALESCE() can help you?
You can try this:
SUM(COALESCE(tp.payment_amount, 0))
or
COALESCE(SUM(tp.payment_amount), 0)
COALESCE(arg1, arg2, ..., argN) returns the first non-null argument from the list.
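The two placements are not equivalent when no rows match at all, which the following sketch tries to show. It uses SQLite through Python's sqlite3 rather than SQL Server, a cut-down version of the tran_payment sample data, and a payment_method of 5 that is simply assumed not to exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tran_payment (payment_method INTEGER, payment_amount INTEGER);
    INSERT INTO tran_payment VALUES (1, NULL), (998, 2);
""")

def scalar(expr, method):
    """Evaluate one aggregate expression over the rows of a payment method."""
    sql = f"SELECT {expr} FROM tran_payment WHERE payment_method = ?"
    return conn.execute(sql, (method,)).fetchone()[0]

INNER = "SUM(COALESCE(payment_amount, 0))"   # COALESCE inside the SUM
OUTER = "COALESCE(SUM(payment_amount), 0)"   # COALESCE around the SUM

# Method 1 has one row whose amount is NULL: both placements give 0.
only_null_rows = (scalar(INNER, 1), scalar(OUTER, 1))

# Method 5 has no rows: SUM over nothing is NULL, so only the outer
# COALESCE turns it into 0.
no_rows = (scalar(INNER, 5), scalar(OUTER, 5))

print(only_null_rows, no_rows)
```

So if the problem is that there are no cash transactions at all, wrapping the whole SUM is the safer placement.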
Try putting ISNULL inside SUM and ABS, i.e. around the actual field, like this:
SUM(ISNULL(tp.payment_amount, 0))
SUM(ABS(ISNULL(tp.payment_amount, 0)))
I don't have MS SQL to test here, but would it work to put the ISNULL around the SELECT? Maybe ISNULL isn't triggered at all if there are no matching rows...

Problem with sql select query

I'm having a little problem with the [PortfelID] column. I need its ID so I can pass it to a function that returns the name of the Type of Strategy per client. However, doing this means I have to put [PortfelID] in the GROUP BY, which complicates the results a lot.
I'm looking for a way to find the Type of Strategy and the Sum of Money that strategy has. However, if I use GROUP BY [PortfelID] I get multiple entries for each strategy: over 700 rows (because there are 700 [PortfelID] values). All I want is one row per strategy with the Sum of [WycenaWartosc] for that strategy, so in total I would get 15 rows or so.
Is there a way to use that function without having to add [PortfelID] to the GROUP BY?
DECLARE @data DateTime
SET @data = '20100930'
SELECT [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data)
,SUM([WycenaWartosc]) AS 'Wycena'
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
GROUP BY [PortfelID]
Where [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] is defined like this:
ALTER FUNCTION [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty]
(
@portfelID INT,
@data DATETIME
)
RETURNS NVARCHAR(MAX)
AS BEGIN
RETURN ( SELECT TOP 1
[TypyStrategiiNazwa]
FROM [dbo].[KlienciPortfeleUmowy]
INNER JOIN [dbo].[TypyStrategii]
ON dbo.KlienciPortfeleUmowy.TypyStrategiiID = dbo.TypyStrategii.TypyStrategiiID
WHERE [PortfelID] = @portfelID
AND ( [KlienciUmowyDataPoczatkowa] <= @data
AND ([KlienciUmowyDataKoncowa] >= @data
OR KlienciUmowyDataKoncowa IS NULL)
)
ORDER BY [KlienciUmowyID] ASC
)
END
EDIT:
As per suggestion (Roopesh Majeti) I've made something like this:
SELECT SUM(CASE WHEN [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) = 'portfel energetyka' THEN [WycenaWartosc] ELSE 0 END) AS 'Strategy 1'
,SUM(CASE WHEN [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) = 'banków niepublicznych' THEN [WycenaWartosc] ELSE 0 END) AS 'Strategy 2'
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
But this seems like a bit of overkill and requires too much manual work. AlexS's solution seems to do exactly what I need :-)
Here's an idea of how you can do this.
DECLARE @data DateTime
SET @data = '20100930'
SELECT
TypID,
SUM([WycenaWartosc]) AS 'Wycena'
FROM
(
SELECT [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty] ([PortfelID], @data) as TypID
,[WycenaWartosc]
FROM [dbo].[Wycena]
LEFT JOIN [KlienciPortfeleKonta]
ON [Wycena].[KlienciPortfeleKontaID] = [KlienciPortfeleKonta].[KlienciPortfeleKontaID]
WHERE [WycenaData] = @data
) as Q
GROUP BY [TypID]
So basically there's no need to group by PortfelID (as soon as you need to group by output of [dbo].[ufn_TypStrategiiDlaPortfelaDlaDaty]).
This query is not optimal, though. Join can be pushed to the outer query in case PortfelID and WycenaData are not in [KlienciPortfeleKonta] table.
UPDATE: fixed select list and aggregation function application
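The derived-table pattern can be tried out in miniature as follows. This is only a sketch under several assumptions: it runs on SQLite via Python's sqlite3, the scalar function strategy_for and its portfolio-to-strategy mapping are hypothetical stand-ins for ufn_TypStrategiiDlaPortfelaDlaDaty, and the join and date filter are left out.

```python
import sqlite3

# Hypothetical mapping that plays the role of the T-SQL scalar function.
STRATEGY_BY_PORTFOLIO = {1: "energy", 2: "energy", 3: "banks"}

def strategy_for(portfel_id):
    return STRATEGY_BY_PORTFOLIO.get(portfel_id)

conn = sqlite3.connect(":memory:")
conn.create_function("strategy_for", 1, strategy_for)
conn.executescript("""
    CREATE TABLE Wycena (PortfelID INTEGER, WycenaWartosc INTEGER);
    INSERT INTO Wycena VALUES (1, 100), (2, 50), (3, 70), (3, 30);
""")

# Grouping by PortfelID gives one row per portfolio.
per_portfolio = conn.execute("""
    SELECT strategy_for(PortfelID), SUM(WycenaWartosc)
      FROM Wycena
     GROUP BY PortfelID
""").fetchall()

# Evaluating the function in a derived table and grouping by its output
# gives one row per strategy.
per_strategy = conn.execute("""
    SELECT TypID, SUM(WycenaWartosc) AS Wycena
      FROM (SELECT strategy_for(PortfelID) AS TypID, WycenaWartosc
              FROM Wycena) AS Q
     GROUP BY TypID
     ORDER BY TypID
""").fetchall()

print(len(per_portfolio), per_strategy)
```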
How about using the CASE statement in SQL?
Check the below link for example :
http://www.1keydata.com/sql/sql-case.html
Hope this helps.