Select row even if a condition is not true - sql

I'm not sure how to phrase this, so here is an example.
Say I have 2 tables:
pages:
idpage title
0 first
1 second
2 third
reads:
idread idpage time
50 0 8:15
83 0 2:58
If I run SELECT * FROM pages,reads WHERE pages.idpage=reads.idpage AND pages.idpage<2
I get something like this:
idpage title idread time
0 first 50 8:15
0 first 83 2:58
Whereas I would like this:
idpage title idread time
0 first 50 8:15
0 first 83 2:58
1 second 0 0:00
Thanks

Always use explicit JOIN syntax. Never use commas in the FROM clause.
What you need is a LEFT JOIN. And, the way you are expressing the query makes this much harder to figure out. So:
SELECT p.idpage, p.title,
COALESCE(idread, 0) as idread,
COALESCE(time, cast('0:00' as time)) as time
FROM pages p LEFT JOIN
reads r
ON p.idpage = r.idpage
WHERE p.idpage < 2;
Note that when using LEFT JOIN, conditions on the first table should go in the WHERE clause. Conditions on the second table go in the ON clause.
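To see both points in action, here is a minimal sketch using Python's built-in sqlite3 module (an assumption, since the question does not name a database; SQLite stores time as plain text here, so no cast is needed). It loads the sample rows from the question, runs the LEFT JOIN, and then compares a filter on the second table placed in WHERE against the same filter placed in ON. The `idread > 60` filter is hypothetical and only there for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (idpage INTEGER, title TEXT);
    CREATE TABLE reads (idread INTEGER, idpage INTEGER, time TEXT);
    INSERT INTO pages VALUES (0, 'first'), (1, 'second'), (2, 'third');
    INSERT INTO reads VALUES (50, 0, '8:15'), (83, 0, '2:58');
""")

# LEFT JOIN keeps page 1 even though it has no reads; COALESCE fills the gaps.
left_join_rows = conn.execute("""
    SELECT p.idpage, p.title,
           COALESCE(r.idread, 0)    AS idread,
           COALESCE(r.time, '0:00') AS time
    FROM pages p
    LEFT JOIN reads r ON p.idpage = r.idpage
    WHERE p.idpage < 2
    ORDER BY p.idpage, r.idread
""").fetchall()

# Hypothetical filter on the second table, placed in WHERE:
# the NULL-extended row for page 1 fails the test and disappears.
where_filter_rows = conn.execute("""
    SELECT p.idpage, r.idread
    FROM pages p
    LEFT JOIN reads r ON p.idpage = r.idpage
    WHERE p.idpage < 2 AND r.idread > 60
    ORDER BY p.idpage
""").fetchall()

# Same filter placed in ON: page 1 survives with a NULL idread.
on_filter_rows = conn.execute("""
    SELECT p.idpage, r.idread
    FROM pages p
    LEFT JOIN reads r ON p.idpage = r.idpage AND r.idread > 60
    WHERE p.idpage < 2
    ORDER BY p.idpage
""").fetchall()

print(left_join_rows)
print(where_filter_rows)
print(on_filter_rows)
```

The first query returns the three rows asked for in the question. The WHERE variant returns only page 0, while the ON variant returns page 0 and page 1 (with a NULL idread).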

You need a left join and CASE expression to complete the values when they are null, like this:
SELECT p.*,
case when r.idread is null then 0 else r.idread end as idread,
case when r.time is null then '0:00' else r.time end as time
FROM pages p
LEFT OUTER JOIN reads r
ON(p.idpage = r.idpage)
WHERE p.idpage < 2
Note that I've changed your implicit comma join to explicit join syntax (LEFT OUTER JOIN). The implicit syntax can easily lead to problems, especially with left joins.

Use a full outer join:
SELECT column_name(s)
FROM table1
FULL OUTER JOIN table2
ON table1.column_name=table2.column_name;

Related

Performance Issue in Left outer join Sql server

In my project I need to find the tasks that differ between the old and new revisions in the same table.
id | task | latest_Rev
1 A N
1 B N
2 C Y
2 A Y
2 B Y
Expected Result:
id | task | latest_Rev
2 C Y
So I tried following query
Select nw.*
from Rev_tmp nw with (nolock)
left outer
join rev_tmp old with (nolock)
on nw.id -1 = old.id
and nw.task = old.task
and nw.latest_rev = 'y'
where old.task is null
When my table has more than 20k records, this query takes a long time.
How can I reduce the time?
My company doesn't allow subqueries.
Use LAG function to remove the self join
SELECT *
FROM (SELECT *,
CASE WHEN latest_Rev = 'y' THEN Lag(latest_Rev) OVER(partition BY task ORDER BY id) ELSE NULL END AS prev_rev
FROM Rev_tmp) a
WHERE latest_Rev = 'y' AND prev_rev IS NULL
My answer assumes
You can't change the indexes
You can't use subqueries
All fields are indexed separately
If you look at the query, the only condition that really reduces the result set is latest_rev='Y'. If you were to eliminate that condition, you'd definitely get a table scan. So we want that condition to be evaluated using an index. Unfortunately, a field that holds only 'Y' and 'N' is likely to be ignored because it will have terrible selectivity. You might get better performance if you coax SQL Server into using it anyway. If the index on latest_rev is called idx_latest_rev then try this:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Select nw.*
from Rev_tmp nw with (index(idx_latest_rev))
left outer
join rev_tmp old
on nw.id -1 = old.id
and nw.task = old.task
where old.task is null
and nw.latest_rev = 'y'
latest_Rev should be a bit type (the boolean equivalent); it is better for performance (details here).
Maybe you can add an index on the id, task, and latest_Rev columns.
You can also try this query (replacing the left outer join with NOT EXISTS):
Select *
from Rev_tmp nw
where nw.latest_rev = 'y' and not exists
(
select * from rev_tmp old
where nw.id -1 = old.id and nw.task = old.task
)
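As a quick check that the two formulations agree, here is a minimal sketch using Python's sqlite3 module (an assumption; the question targets SQL Server, where the plan and timing will differ). It loads the sample rows from the question and runs both the LEFT JOIN ... IS NULL form and the NOT EXISTS form. The index name `idx_id_task` and the aliases `cur`/`prev` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rev_tmp (id INTEGER, task TEXT, latest_Rev TEXT);
    INSERT INTO Rev_tmp VALUES
        (1, 'A', 'N'), (1, 'B', 'N'),
        (2, 'C', 'Y'), (2, 'A', 'Y'), (2, 'B', 'Y');
    -- Hypothetical composite index covering the join columns.
    CREATE INDEX idx_id_task ON Rev_tmp (id, task);
""")

# Anti-join written as LEFT JOIN plus an IS NULL test on the joined side.
anti_join_rows = conn.execute("""
    SELECT cur.id, cur.task, cur.latest_Rev
    FROM Rev_tmp cur
    LEFT JOIN Rev_tmp prev
           ON cur.id - 1 = prev.id AND cur.task = prev.task
    WHERE cur.latest_Rev = 'Y' AND prev.task IS NULL
""").fetchall()

# The same anti-join written with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT cur.id, cur.task, cur.latest_Rev
    FROM Rev_tmp cur
    WHERE cur.latest_Rev = 'Y'
      AND NOT EXISTS (SELECT 1
                      FROM Rev_tmp prev
                      WHERE cur.id - 1 = prev.id AND cur.task = prev.task)
""").fetchall()

print(anti_join_rows)
print(not_exists_rows)
```

Both queries return the single row (2, 'C', 'Y') from the expected result. Which one is faster on a large table depends on the engine and the available indexes, so it is worth comparing the actual execution plans.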

Tricky (MS)SQL query with aggregated functions

I have these three tables:
table_things: [id]
table_location: [id]
[location]
[quantity]
table_reservation: [id]
[quantity]
[location]
[list_id]
Example data:
table_things:
id
1
2
3
table_location
id location quantity
1 100 10
1 101 4
2 100 1
table_reservation
id quantity location list_id
1 2 100 500
1 1 100 0
2 1 100 0
They are connected by [id] being the same in all three tables and [location] being the same in table_location and table_reservation.
[quantity] in table_location shows how many ([quantity]) things ([id]) are in a certain place ([location]).
[quantity] in table_reservation shows how many ([quantity]) things ([id]) are reserved in a certain place ([location]).
There can be 0 or many rows in table_reservation that correspond to table_location.id = table_reservation.id, so I probably need to use an outer join for that.
I want to create a query that answers the question: How many things ([id]) are in this specific place (WHERE table_location=123), how many of those things are reserved (table_reservation.[quantity]) and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
I can't get the aggregate functions right to where the answer contains only the number of lines that are in table_location with the given WHERE clause and at the same time I get the correct number of table_reservation.quantity.
If I do this I get the correct number of lines in the answer:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity])
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_location.location = table_reservation.location
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location.[quantity]
But the problem with that query is that I (of course) get an incorrect value for SUM(table_reservation.[quantity]) since it sums up all the corresponding rows in table_reservation and posts the same value on each of the rows in the result.
The second part is trying to get the correct value for the number of table_reservation.[quantity] whose list_id > 0. I tried something like this for that, in the SELECT list:
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN table_reservation.[quantity] ELSE 0 END)) AS test
But that doesn't even parse... I'm just showing it to show my thinking.
Probably an easy SQL problem, but it's been too long since I was doing these kinds of complicated queries.
For your first two questions:
How many things ([id]) are in this specific place (WHERE table_location=123), how many of those things are reserved (table_reservation.[quantity])
I think you simply need a LEFT OUTER JOIN instead of RIGHT, and an additional join predicate for table_reservation
SELECT l.id,
l.quantity,
Reserved = SUM(ISNULL(r.quantity, 0))
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
N.B I have added ISNULL so that when nothing is reserved you get a result of 0 rather than NULL. You also don't actually need to reference table_things at all, but I am guessing this is a simplified example and you may need other fields from there so have left it in. I have also used aliases to make the query (in my opinion) easier to read.
For your 3rd question:
and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
Then you can use a conditional aggregate (CASE expression inside your SUM):
SELECT l.id,
l.quantity,
Reserved = SUM(r.quantity),
ReservedWithListOver0 = SUM(CASE WHEN r.list_id > 0 THEN r.[quantity] ELSE 0 END)
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
As a couple of side notes, unless you are doing it for the right reasons (so that different tables are queried depending on who is executing the query), it is a good idea to always use the schema prefix, i.e. dbo.table_reservation rather than just table_reservation. It is also quite antiquated to prefix your object names with the object type (i.e. dbo.table_things rather than just dbo.things). It is somewhat subjective, but this page gives a good example of why it might not be the best idea.
You can use a query like the following:
SELECT tt.[id],
tl.[quantity],
tr.[total_quantity],
tr.[partial_quantity]
FROM table_location AS tl
INNER JOIN table_things AS tt ON tl.[id] = tt.[id]
LEFT JOIN (
SELECT id, location,
SUM(quantity) AS total_quantity,
SUM(CASE WHEN list_id > 0 THEN quantity ELSE 0 END) AS partial_quantity
FROM table_reservation
GROUP BY id, location
) AS tr ON tl.id = tr.id AND tl.location = tr.location
WHERE tl.location = 100
The trick here is to do a LEFT JOIN to an already aggregated version of table table_reservation, so that you get one row per id, location. The derived table uses conditional aggregation to calculate field partial_quantity that contains the quantity where list_id > 0.
Output:
id quantity total_quantity partial_quantity
-----------------------------------------------
1 10 3 2
2 1 1 0
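The output above can be reproduced with a minimal sketch using Python's sqlite3 module (an assumption; the question targets SQL Server). It loads the example data from the question and joins table_location to the pre-aggregated reservations. The helper name `reserved_at` and the COALESCE wrappers (so that locations with no reservations show 0 instead of NULL) are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_things (id INTEGER);
    CREATE TABLE table_location (id INTEGER, location INTEGER, quantity INTEGER);
    CREATE TABLE table_reservation (id INTEGER, quantity INTEGER,
                                    location INTEGER, list_id INTEGER);
    INSERT INTO table_things VALUES (1), (2), (3);
    INSERT INTO table_location VALUES (1, 100, 10), (1, 101, 4), (2, 100, 1);
    INSERT INTO table_reservation VALUES
        (1, 2, 100, 500), (1, 1, 100, 0), (2, 1, 100, 0);
""")

QUERY = """
    SELECT tl.id,
           tl.quantity,
           COALESCE(tr.total_quantity, 0)   AS total_quantity,
           COALESCE(tr.partial_quantity, 0) AS partial_quantity
    FROM table_location AS tl
    INNER JOIN table_things AS tt ON tl.id = tt.id
    LEFT JOIN (
        -- Aggregate first, so each (id, location) pair yields one row.
        SELECT id, location,
               SUM(quantity) AS total_quantity,
               SUM(CASE WHEN list_id > 0 THEN quantity ELSE 0 END) AS partial_quantity
        FROM table_reservation
        GROUP BY id, location
    ) AS tr ON tl.id = tr.id AND tl.location = tr.location
    WHERE tl.location = ?
    ORDER BY tl.id
"""

def reserved_at(location):
    """Return (id, quantity, total reserved, reserved with list_id > 0) per thing."""
    return conn.execute(QUERY, (location,)).fetchall()

print(reserved_at(100))
print(reserved_at(101))
```

For location 100 this gives the two rows shown in the output table. For location 101, where nothing is reserved, it gives one row with both reserved columns at 0.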
This was a classic case of sitting with a problem for a few hours and getting nowhere, and then when you post to Stack Overflow, you suddenly come up with the answer. Here's the query that gets me what I want:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity]),
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN ISNULL(table_reservation.[quantity], 0) ELSE 0 END)) AS test
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_location.location = table_reservation.location AND table_things.[id] = table_reservation.[id]
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location.[quantity]
Edit: After having read GarethD's reply below, I made the changes he suggested (to my real code, not to the query above), which makes the (real) query correct.

Left outer join column name ambiguously defined

I have a table task
select
sts_id,
count(*) mycount
from
task
where
sts_id in (1, 8, 39)
group by sts_id;
output :
sts_id count
1 1
8 1
39 1
I have one more temp table with one column sts_id
which looks like this
sts_id
1
8
39
40
41
I am trying for a left join for both the tables
select
in_list.sts_id,
count(*) mycount
from
task
left outer join
in_list
on task.sts_id = in_list.sts_id
group by sts_id;
to get an output like
1 1
8 1
39 1
40 0
41 0
I am getting an error of column ambiguously defined.
You are using the left join the wrong way round (the table with all the rows you want to show must be on the left).
Use COUNT(task.sts_id) to get 0 for rows that have no occurrences in that table:
select
in_list.sts_id,
count(task.sts_id) mycount
from
in_list
left outer join
task
on in_list.sts_id = task.sts_id
AND task.sts_id in (1, 8, 39) -- Thanks Matt.
group by in_list.sts_id;
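The difference between COUNT(*) and COUNT(task.sts_id) is easy to see on the sample data. This is a minimal sketch using Python's sqlite3 module (an assumption; the (+) syntax elsewhere in the thread suggests Oracle, but the behaviour of COUNT is standard SQL). The `task_id` column is hypothetical, since the question does not show the full task table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (task_id INTEGER, sts_id INTEGER);  -- task_id is hypothetical
    CREATE TABLE in_list (sts_id INTEGER);
    INSERT INTO task VALUES (100, 1), (101, 8), (102, 39);
    INSERT INTO in_list VALUES (1), (8), (39), (40), (41);
""")

# COUNT(*) counts the NULL-extended row too, so unmatched ids report 1.
count_star = conn.execute("""
    SELECT in_list.sts_id, COUNT(*) AS mycount
    FROM in_list
    LEFT OUTER JOIN task ON in_list.sts_id = task.sts_id
    GROUP BY in_list.sts_id
    ORDER BY in_list.sts_id
""").fetchall()

# COUNT(task.sts_id) skips NULLs, so unmatched ids report 0.
count_column = conn.execute("""
    SELECT in_list.sts_id, COUNT(task.sts_id) AS mycount
    FROM in_list
    LEFT OUTER JOIN task ON in_list.sts_id = task.sts_id
    GROUP BY in_list.sts_id
    ORDER BY in_list.sts_id
""").fetchall()

print(count_star)
print(count_column)
```

With COUNT(*), ids 40 and 41 come back with a count of 1; with COUNT(task.sts_id) they come back with 0, which is the output the question asks for.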
You are missing the table alias in the GROUP BY clause.
However, your needed result says that you need to change your join logic: the starting table should be in_list, while task should be in left outer join:
select ...
from in_list
left outer join task
select
in_list.sts_id,
coalesce(count(task.sts_ID),0) mycount --changed this line
from
task
right outer join --changed this line
in_list
on task.sts_id = in_list.sts_id
group by in_list.sts_id; -- changed this line
Reasons:
Because in_list contains rows that task does not, we needed to either change the table order or make it a right join.
COUNT(*) would count every row, including the unmatched ones; you want the count of task.sts_id.
COUNT(task.sts_id) already returns 0 rather than NULL for unmatched rows, so the COALESCE is only a safeguard.
I got my answer with this query
select t2.sts_id, count(t.sts_id)
from task t, in_list t2
where t2.sts_id = t.sts_id(+)
group by t2.sts_id
Thanks,

SubQuery inner join with condition

I have two tables (REPORTDETAILS, REPORTITEMS), I want to query one of them to find a max value + 1 between two values, but I also want to find the max value + 1 of another table at the same time.
I am getting the correct value from the first table, but the second value is incorrect because of the condition that I have built in for the first table.
What is the correct way to write this statement?
Thanks for your help as always.
Here is my statement.
SELECT MAX(rd.REPNUMBER) + 1 as REPNUMBER,
MAX(ri.REPITEM) + 1 as REPITEM
FROM REPORTDETAILS rd INNER JOIN REPORTITEMS ri ON rd.REPNUMBER = ri.REPNUMBER
WHERE rd.REPNUMBER BETWEEN 11000000 and 11099999;
You could use CASE within your MAX statement and remove your WHERE criteria:
SELECT MAX(CASE WHEN rd.REPNUMBER BETWEEN 11000000 and 11099999
THEN rd.REPNUMBER END) + 1 as REPNUMBER,
MAX(ri.REPITEM) + 1 as REPITEM
FROM REPORTDETAILS rd
INNER JOIN REPORTITEMS ri ON rd.REPNUMBER = ri.REPNUMBER
Because of the INNER JOIN, this would still only show the max(repitem) based on the matching records between reportdetails and reportitems. If you just want the max(repitem) regardless of the reportdetails, then you could use a CROSS JOIN instead.
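To illustrate how moving the range test from WHERE into the MAX changes the second column, here is a minimal sketch using Python's sqlite3 module (an assumption; the question does not name a database). All rows are made up for the example: two reports fall inside the 11000000-11099999 range and one, carrying the highest REPITEM, falls outside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE REPORTDETAILS (REPNUMBER INTEGER);
    CREATE TABLE REPORTITEMS (REPNUMBER INTEGER, REPITEM INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO REPORTDETAILS VALUES (11000001), (11000005), (12000001);
    INSERT INTO REPORTITEMS VALUES (11000001, 3), (11000005, 7), (12000001, 42);
""")

# Range test in WHERE: both aggregates only see rows inside the range.
with_where = conn.execute("""
    SELECT MAX(rd.REPNUMBER) + 1 AS REPNUMBER,
           MAX(ri.REPITEM) + 1   AS REPITEM
    FROM REPORTDETAILS rd
    INNER JOIN REPORTITEMS ri ON rd.REPNUMBER = ri.REPNUMBER
    WHERE rd.REPNUMBER BETWEEN 11000000 AND 11099999
""").fetchone()

# Range test inside MAX: only REPNUMBER is restricted; REPITEM sees every joined row.
with_case = conn.execute("""
    SELECT MAX(CASE WHEN rd.REPNUMBER BETWEEN 11000000 AND 11099999
                    THEN rd.REPNUMBER END) + 1 AS REPNUMBER,
           MAX(ri.REPITEM) + 1                 AS REPITEM
    FROM REPORTDETAILS rd
    INNER JOIN REPORTITEMS ri ON rd.REPNUMBER = ri.REPNUMBER
""").fetchone()

print(with_where)
print(with_case)
```

On this data the WHERE version returns (11000006, 8), while the CASE version returns (11000006, 43), because the item attached to report 12000001 is no longer filtered out.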

Join Table on one record but calculate field based on other rows in the join

I am trying to write a query to identify my subscribers who have abandoned a shopping cart in the last day, but I also need a calculated field that represents whether or not they have received an incentive in the last 7 days.
I have the following tables
AbandonCart_Subscribers
Sendlog
The first part of the query is easy, get abandoners in the last day
select a.* from AbandonCart_Subscribers a
where DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Here is my attempt to calculate the incentive but I am fairly certain it is not correct as IncentiveRecieved is always 0 even when I know it should not be...
select a.*,
CASE
WHEN DATEDIFF(D,s.SENDDATE,GETDATE()) >= 7
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Here is a SQL fiddle with the objects and some data. I would really appreciate some help.
Thanks
http://sqlfiddle.com/#!3/f481f/1
Kishore is right in saying the main problem is that it should be <=7, not >=7. However, there is another problem.
As it stands, you could get multiple results. You don't want to do a left join on SendLog in case the same email address is in there more than once. Instead you should be getting a unique result from that table. There's a couple of ways of doing that; here is one such way which uses a derived table. The table I have called s will give you a unique list of emails that have been sent an incentive in the last week.
select a.*,
CASE
WHEN s.EmailAddress is not null
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join (select distinct EmailAddress
from SendLog s
where s.CampaignID IS NULL
and DATEDIFF(D,s.SENDDATE,GETDATE()) <= 7
) s on a.EmailAddress = s.EmailAddress
where DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
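The duplicate-row problem and the derived-table fix can be sketched with Python's sqlite3 module (an assumption; the question targets SQL Server). SQLite has no DATEDIFF, so the sketch uses a difference of julianday() values as a rough stand-in, which does not count day boundaries in exactly the same way. The sample rows, the fixed "today" date, and the reduced column set are all hypothetical.

```python
import sqlite3

TODAY = "2024-01-10"  # fixed reference date so the example is repeatable

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AbandonCart_Subscribers (EmailAddress TEXT, DateAbandoned TEXT);
    CREATE TABLE SendLog (EmailAddress TEXT, SendDate TEXT, CampaignID INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO AbandonCart_Subscribers VALUES
        ('a@example.com', '2024-01-10'),
        ('b@example.com', '2024-01-10'),
        ('c@example.com', '2024-01-02');
    INSERT INTO SendLog VALUES
        ('a@example.com', '2024-01-05', NULL),
        ('a@example.com', '2024-01-08', NULL),
        ('b@example.com', '2023-12-01', NULL);
""")

# Joining straight to SendLog repeats a subscriber once per matching log row.
naive_rows = conn.execute("""
    SELECT a.EmailAddress
    FROM AbandonCart_Subscribers a
    LEFT JOIN SendLog s
           ON a.EmailAddress = s.EmailAddress AND s.CampaignID IS NULL
    WHERE julianday(:today) - julianday(a.DateAbandoned) <= 1
    ORDER BY a.EmailAddress
""", {"today": TODAY}).fetchall()

# Joining to a DISTINCT list of recently incentivised emails gives one row each.
flag_rows = conn.execute("""
    SELECT a.EmailAddress,
           CASE WHEN s.EmailAddress IS NOT NULL THEN 1 ELSE 0 END AS IncentiveRecieved
    FROM AbandonCart_Subscribers a
    LEFT JOIN (SELECT DISTINCT EmailAddress
               FROM SendLog
               WHERE CampaignID IS NULL
                 AND julianday(:today) - julianday(SendDate) <= 7
              ) s ON a.EmailAddress = s.EmailAddress
    WHERE julianday(:today) - julianday(a.DateAbandoned) <= 1
    ORDER BY a.EmailAddress
""", {"today": TODAY}).fetchall()

print(naive_rows)
print(flag_rows)
```

The naive join lists a@example.com twice because it has two log entries. The derived-table version returns one row per subscriber, with the flag set to 1 for a@example.com and 0 for b@example.com, whose only send is older than a week.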
You can set a variable using a condition:
select a.*,(DATEDIFF(D,s.SENDDATE,GETDATE()) >= 7) as `IncentiveRecieved `
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Is this what you're looking for?
Shouldn't it be less than 7 instead of greater than 7?
select a.*,
CASE
WHEN DATEDIFF(D,s.SENDDATE,GETDATE()) <= 7 AND CampaignID is not null
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress --and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Hope this satisfies your need.