Get Count 0 if there are no entries in the RIGHT Table - sql

Websites:
website_Id  website_name
------------------------
1           website_a
2           website_b
3           website_c
4           website_d
5           website_e
Fixtures:
fixture_Id  website_id  fixture_details
---------------------------------------
1           1           a vs b
2           1           c vs d
3           2           e vs f
4           2           g vs h
5           4           i vs j
Expected Output:
website_Id  website_name  TotalRows
-----------------------------------
1           website_a     2
2           website_b     2
3           website_c     0
4           website_d     1
5           website_e     0
I would like to get 0 when there are no entries in the fixture table.
Select fx.website_id, ws.website_name, Count (*) as TotalRows
FROM fixtures fx
LEFT JOIN websites ws on ws.website_id = fx.website_id
WHERE date_of_entry = '16-01-2023'
GROUP BY
fx.website_id, ws.website_name
But this does not return 0 when there are no entries.
How can I change my SQL to reflect this?

You are very close. You cannot get the records with a 0 count because, when a website has no related fixture records, date_of_entry is NULL after the outer join, so WHERE date_of_entry = '16-01-2023' filters those rows out. The fix is either to move the condition into the LEFT JOIN or to add an extra condition to the WHERE clause. The other core problem is that you are grouping the count by website data: you must select FROM websites and LEFT JOIN fixtures (or RIGHT JOIN websites onto fixtures) so that every website row is kept in the result.
Solution A
Select ws.website_id, ws.website_name, Count (fx.fixture_id) as TotalRows
FROM websites ws
LEFT JOIN fixtures fx on ws.website_id = fx.website_id AND fx.date_of_entry = '16-01-2023'
GROUP BY
ws.website_id, ws.website_name
;
Solution B
-- Caveat: a website whose fixtures all fall on other dates is dropped by this WHERE clause
Select ws.website_id, ws.website_name, Count (fx.fixture_id) as TotalRows
FROM websites ws
LEFT JOIN fixtures fx on ws.website_id = fx.website_id
WHERE fx.date_of_entry IS NULL OR fx.date_of_entry = '16-01-2023'
GROUP BY
ws.website_id, ws.website_name
;
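To compare the two placements of the date filter, here is a minimal sketch that runs both variants with Python's sqlite3 module. The date_of_entry column, the ISO date format, and the extra fixture for website_e on a different date are assumptions added for illustration; they are not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE websites (website_id INTEGER PRIMARY KEY, website_name TEXT);
CREATE TABLE fixtures (
    fixture_id INTEGER PRIMARY KEY,
    website_id INTEGER,
    fixture_details TEXT,
    date_of_entry TEXT  -- assumed column, ISO dates chosen for this sketch
);
INSERT INTO websites VALUES
    (1, 'website_a'), (2, 'website_b'), (3, 'website_c'),
    (4, 'website_d'), (5, 'website_e');
INSERT INTO fixtures VALUES
    (1, 1, 'a vs b', '2023-01-16'),
    (2, 1, 'c vs d', '2023-01-16'),
    (3, 2, 'e vs f', '2023-01-16'),
    (4, 2, 'g vs h', '2023-01-16'),
    (5, 4, 'i vs j', '2023-01-16'),
    (6, 5, 'k vs l', '2023-01-17');  -- hypothetical row on another date
""")

# Variant 1: the date filter is part of the join condition.
on_filter = conn.execute("""
    SELECT ws.website_id, ws.website_name, COUNT(fx.fixture_id) AS TotalRows
    FROM websites ws
    LEFT JOIN fixtures fx
           ON fx.website_id = ws.website_id
          AND fx.date_of_entry = '2023-01-16'
    GROUP BY ws.website_id, ws.website_name
    ORDER BY ws.website_id
""").fetchall()

# Variant 2: the date filter is applied after the join, in WHERE.
where_filter = conn.execute("""
    SELECT ws.website_id, ws.website_name, COUNT(fx.fixture_id) AS TotalRows
    FROM websites ws
    LEFT JOIN fixtures fx ON fx.website_id = ws.website_id
    WHERE fx.date_of_entry IS NULL OR fx.date_of_entry = '2023-01-16'
    GROUP BY ws.website_id, ws.website_name
    ORDER BY ws.website_id
""").fetchall()

print(on_filter)     # all five websites, website_c and website_e with 0
print(where_filter)  # website_e is missing: its only fixture has another date
```

With this data the join-condition variant keeps every website, while the WHERE variant loses any website that has fixtures only on other dates.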

Try the following statement:
SELECT ws.website_id, ws.website_name, COUNT(fx.fixture_id) AS number_of_fixtures
FROM websites ws
LEFT JOIN fixtures fx ON fx.website_id = ws.website_id
WHERE TRUE -- or whatever condition you want; I do not know which table date_of_entry comes from
GROUP BY ws.website_id, ws.website_name
COUNT with an expression as its argument evaluates the expression for each row and does not count the row if the expression evaluates to NULL.
If you want to keep your original join order, you would need fixtures RIGHT JOIN websites.
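The difference between COUNT(*) and COUNT(column) on an outer join can be seen in a small sketch using Python's sqlite3 module. The two-website data set here is a trimmed-down assumption, not the full data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE websites (website_id INTEGER PRIMARY KEY, website_name TEXT);
CREATE TABLE fixtures (fixture_id INTEGER PRIMARY KEY, website_id INTEGER);
INSERT INTO websites VALUES (1, 'website_a'), (3, 'website_c');
INSERT INTO fixtures VALUES (1, 1), (2, 1);  -- website_c has no fixtures
""")

# COUNT(*) counts joined rows, including the NULL-extended row that the
# LEFT JOIN produces for website_c. COUNT(fx.fixture_id) skips NULLs.
counts = conn.execute("""
    SELECT ws.website_name,
           COUNT(*)              AS count_star,
           COUNT(fx.fixture_id)  AS count_fixture
    FROM websites ws
    LEFT JOIN fixtures fx ON fx.website_id = ws.website_id
    GROUP BY ws.website_id, ws.website_name
    ORDER BY ws.website_id
""").fetchall()

print(counts)
```

For website_c, COUNT(*) reports 1 (the NULL-extended row) while COUNT(fx.fixture_id) reports 0, which is the value the question asks for.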

The issue is that you're counting *, i.e. the number of rows regardless of which table they come from, so you get 1 when there's only a record from the fixtures table, since one row is still returned. You can get around this by counting rows from the websites table with count(ws.website_id) instead: where this table has a matching record the field is non-null and is counted, and where there's no record the field is null and is not counted.
Select fx.fixture_id, ws.website_name, Count (ws.website_id) as TotalRows
FROM fixtures fx
LEFT JOIN websites ws on ws.website_id = fx.website_id
WHERE fx.date_of_entry = '16-01-2023'
GROUP BY
fx.fixture_id, ws.website_name
;
CORRECTION
Apologies, I hadn't looked closely enough: the query above returns all fixtures, with websites only where those exist. Please try this instead (DB Fiddle):
Select ws.website_id
, ws.website_name
, Count (fx.website_id) as TotalRows
FROM websites ws
LEFT OUTER JOIN fixtures fx
on fx.website_id = ws.website_id
and fx.date_of_entry = '16-01-2023'
GROUP BY ws.website_id, ws.website_name
ORDER BY ws.website_id

Related

Optimization of Top-N and last-N in one query

In PostgreSQL 12.8 I have one table (for simplicity I've omitted several columns):
figure (~3.5 millions rows)
id
----
1
2
...
and another table
figure_step (~20 rows for one figure, in total ~70 millions rows)
id  figure_id  number  status
-----------------------------
1   1          1       'FINISHED'
2   1          2       'STARTED'
3   2          1       'FINISHED'
4   2          2       'DELAYED'
5   2          3       'CANCELLED'
...
I have a query that selects, for each figure, the first step by number in 'DELAYED' status and the last step by number in 'FINISHED' status:
SELECT * FROM figure f
LEFT JOIN LATERAL (SELECT * FROM figure_step fs
WHERE fs.status = 'DELAYED' AND f.id = fs.figure_id
ORDER BY number ASC
LIMIT 1) step_one
ON step_one.figure_id = f.id
LEFT JOIN LATERAL (SELECT * FROM figure_step fs
WHERE fs.status = 'FINISHED' AND f.id = fs.figure_id
ORDER BY number DESC
LIMIT 1) step_two
ON step_two.figure_id = f.id
WHERE ...
LIMIT ... OFFSET ...
The WHERE clause selects about 19,000 of the ~3.5 million rows, and the LIMIT is about 150 rows. The LATERAL keyword is used here to avoid joining the big tables figure and figure_step in full.
So, I have two questions:
Is it appropriate to use LATERAL here? I think so, because without it we would need to join figure_step twice using LEFT JOIN.
Here we join figure_step twice with a lateral join. Are there any ways to optimize this, for example by reusing part of the subquery?
I may be wrong about this, and may misunderstand the importance of that number column.
But one lateral join might be enough,
using conditional aggregation instead of the double ORDER BY and LIMIT.
...
LEFT JOIN LATERAL (
SELECT
MIN(CASE WHEN fs.status = 'DELAYED' THEN fs.id END) MinDelayedId
, MAX(CASE WHEN fs.status = 'FINISHED' THEN fs.id END) MaxFinishedId
FROM figure_step fs
WHERE f.id = fs.figure_id
) step
...
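As a rough illustration of the conditional aggregation idea, the sketch below runs it with Python's sqlite3 module. SQLite has no LATERAL, so a plain LEFT JOIN with GROUP BY stands in for it here, and the aggregates are taken over number (the column the question orders by) rather than id. Figure 3, which has no steps, is a hypothetical row added to show the NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE figure (id INTEGER PRIMARY KEY);
CREATE TABLE figure_step (
    id INTEGER PRIMARY KEY,
    figure_id INTEGER,
    number INTEGER,
    status TEXT
);
INSERT INTO figure VALUES (1), (2), (3);  -- figure 3 is hypothetical, no steps
INSERT INTO figure_step VALUES
    (1, 1, 1, 'FINISHED'),
    (2, 1, 2, 'STARTED'),
    (3, 2, 1, 'FINISHED'),
    (4, 2, 2, 'DELAYED'),
    (5, 2, 3, 'CANCELLED');
""")

# One pass over figure_step per figure: each CASE yields NULL for rows in
# other statuses, and MIN/MAX ignore NULLs.
steps = conn.execute("""
    SELECT f.id,
           MIN(CASE WHEN fs.status = 'DELAYED'  THEN fs.number END) AS first_delayed,
           MAX(CASE WHEN fs.status = 'FINISHED' THEN fs.number END) AS last_finished
    FROM figure f
    LEFT JOIN figure_step fs ON fs.figure_id = f.id
    GROUP BY f.id
    ORDER BY f.id
""").fetchall()

print(steps)
```

This returns only the step numbers. If the full step rows are needed, they would still have to be joined back on (figure_id, number), so whether this beats two lateral subqueries depends on the data and indexes.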

Tricky (MS)SQL query with aggregated functions

I have these three tables:
table_things: [id]
table_location: [id]
[location]
[quantity]
table_reservation: [id]
[quantity]
[location]
[list_id]
Example data:
table_things:
id
1
2
3
table_location
id location quantity
1 100 10
1 101 4
2 100 1
table_reservation
id quantity location list_id
1 2 100 500
1 1 100 0
2 1 100 0
They are connected by [id] being the same in all three tables and [location] being the same in table_location and table_reservation.
[quantity] in table_location shows how many ([quantity]) things ([id]) are in a certain place ([location]).
[quantity] in table_reservation shows how many ([quantity]) things ([id]) are reserved in a certain place ([location]).
There can be zero or many rows in table_reservation that correspond to a table_location row (table_location.id = table_reservation.id), so I probably need to use an outer join for that.
I want to create a query that answers the question: How many things ([id]) are in this specific place (WHERE table_location=123), how many of those things are reserved (table_reservation.[quantity]) and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
I can't get the aggregate functions right to where the answer contains only the number of lines that are in table_location with the given WHERE clause and at the same time I get the correct number of table_reservation.quantity.
If I do this I get the correct number of lines in the answer:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity])
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_location.location = table_reservation.location
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location.[quantity]
But the problem with that query is that I (of course) get an incorrect value for SUM(table_reservation.[quantity]) since it sums up all the corresponding rows in table_reservation and posts the same value on each of the rows in the result.
The second part is trying to get the correct value for the number of table_reservation.[quantity] whose list_id > 0. I tried something like this for that, in the SELECT list:
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN table_reservation.[quantity] ELSE 0 END)) AS test
But that doesn't even parse... I'm just showing it to show my thinking.
Probably an easy SQL problem, but it's been too long since I was doing these kinds of complicated queries.
For your first two questions:
How many things ([id]) are in this specific place (WHERE table_location=123), how many of those things are reserved (table_reservation.[quantity])
I think you simply need a LEFT OUTER JOIN instead of RIGHT, and an additional join predicate for table_reservation
SELECT l.id,
l.quantity,
Reserved = SUM(ISNULL(r.quantity, 0))
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
N.B. I have added ISNULL so that when nothing is reserved you get a result of 0 rather than NULL. You also don't actually need to reference table_things at all, but I am guessing this is a simplified example and you may need other fields from there, so I have left it in. I have also used aliases to make the query (in my opinion) easier to read.
For your 3rd question:
and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
Then you can use a conditional aggregate (CASE expression inside your SUM):
SELECT l.id,
l.quantity,
Reserved = SUM(ISNULL(r.quantity, 0)),
ReservedWithListOver0 = SUM(CASE WHEN r.list_id > 0 THEN r.[quantity] ELSE 0 END)
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
As a couple of side notes: unless you are omitting it deliberately (so that different tables are queried depending on who is executing the query), it is a good idea to always use the schema prefix, i.e. dbo.table_reservation rather than just table_reservation. It is also quite antiquated to prefix your object names with the object type (i.e. writing dbo.table_things rather than just dbo.things). It is somewhat subjective, but this page gives a good example of why it might not be the best idea.
You can use a query like the following:
SELECT tt.[id],
tl.[quantity],
tr.[total_quantity],
tr.[partial_quantity]
FROM table_location AS tl
INNER JOIN table_things AS tt ON tl.[id] = tt.[id]
LEFT JOIN (
SELECT id, location,
SUM(quantity) AS total_quantity,
SUM(CASE WHEN list_id > 0 THEN quantity ELSE 0 END) AS partial_quantity
FROM table_reservation
GROUP BY id, location
) AS tr ON tl.id = tr.id AND tl.location = tr.location
WHERE tl.location = 100
The trick here is to do a LEFT JOIN to an already aggregated version of table_reservation, so that you get one row per (id, location). The derived table uses conditional aggregation to calculate the field partial_quantity, which contains the quantity where list_id > 0.
Output:
id quantity total_quantity partial_quantity
-----------------------------------------------
1 10 3 2
2 1 1 0
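The pre-aggregated derived table can be checked against the question's sample data with a short sketch using Python's sqlite3 module. The helper name reserved_at is made up for this example, and the ORDER BY is added only to make the result order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_things (id INTEGER);
CREATE TABLE table_location (id INTEGER, location INTEGER, quantity INTEGER);
CREATE TABLE table_reservation (
    id INTEGER, quantity INTEGER, location INTEGER, list_id INTEGER
);
INSERT INTO table_things VALUES (1), (2), (3);
INSERT INTO table_location VALUES (1, 100, 10), (1, 101, 4), (2, 100, 1);
INSERT INTO table_reservation VALUES
    (1, 2, 100, 500), (1, 1, 100, 0), (2, 1, 100, 0);
""")

def reserved_at(location):
    """Return (id, quantity, total_quantity, partial_quantity) per thing."""
    return conn.execute("""
        SELECT tt.id, tl.quantity, tr.total_quantity, tr.partial_quantity
        FROM table_location AS tl
        INNER JOIN table_things AS tt ON tl.id = tt.id
        LEFT JOIN (
            -- one row per (id, location), so the outer join cannot fan out
            SELECT id, location,
                   SUM(quantity) AS total_quantity,
                   SUM(CASE WHEN list_id > 0 THEN quantity ELSE 0 END)
                       AS partial_quantity
            FROM table_reservation
            GROUP BY id, location
        ) AS tr ON tl.id = tr.id AND tl.location = tr.location
        WHERE tl.location = ?
        ORDER BY tt.id
    """, (location,)).fetchall()

print(reserved_at(100))
print(reserved_at(101))  # no reservations at 101, so the sums are NULL (None)
```

Location 101 shows the outer-join case: the location row is kept and both reservation columns come back as NULL, which could be wrapped in COALESCE if 0 is preferred.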
This was a classic case of sitting with a problem for a few hours and getting nowhere, and then when you post to Stack Overflow, you suddenly come up with the answer. Here's the query that gets me what I want:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity]),
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN ISNULL(table_reservation.[quantity], 0) ELSE 0 END)) AS test
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_location.location = table_reservation.location AND table_things.[id] = table_reservation.[id]
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location.[quantity]
Edit: After having read GarethD's reply below, I did the changes he suggested (to my real code, not to the query above) which makes the (real) query correct.

T-SQL cursor or if or case when

I have this table:
Table_NAME_A:
quotid itration QStatus
--------------------------------
5329 1 Assigned
5329 2 Inreview
5329 3 sold
4329 1 sold
4329 2 sold
3214 1 assigned
3214 2 Inreview
Result output should look like this:
quotid itration QStatus
------------------------------
5329 3 sold
4329 2 sold
3214 2 Inreview
I need a T-SQL query for this. Basically, I want the row with "sold" status; if there is none, then "Inreview"; if there is none, then "assigned". At the same time, if the chosen status has multiple iterations, I want the highest iteration.
Please help me, thanks in advance :)
This is a prioritization query. One way to do this is with successive comparisons in a union all:
-- Note: each branch returns every row of the winning status;
-- keep only MAX(itration) per quotid to get a single row per quote.
select a.*
from Table_NAME_A a
where a.QStatus = 'sold'
union all
select a.*
from Table_NAME_A a
where a.QStatus = 'Inreview' and
not exists (select 1 from Table_NAME_A a2 where a2.quotid = a.quotid and a2.QStatus = 'sold')
union all
select a.*
from Table_NAME_A a
where a.QStatus = 'assigned' and
not exists (select 1
from Table_NAME_A a2
where a2.quotid = a.quotid and a2.QStatus in ('sold', 'Inreview')
);
For performance on a larger set of data, you would want an index on Table_NAME_A(quotid, QStatus).
You want neither cursors nor if/then for this. Instead, you'll use a series of self-joins to get these results. I'll also use a CTE to simplify getting the max iteration at each step:
with StatusIterations As
(
SELECT quotID, MAX(itration) Iteration, QStatus
FROM table_NAME_A
GROUP BY quotID, QStatus
)
select q.quotID, coalesce(sold.Iteration,rev.Iteration,asngd.Iteration) Iteration,
coalesce(sold.QStatus, rev.QStatus, asngd.QStatus) QStatus
from
--initial pass for list of quotes, to ensure every quote is included in the results
(select distinct quotID from table_NAME_A) q
--one additional pass for each possible status
left join StatusIterations sold on sold.quotID = q.quotID and sold.QStatus = 'sold'
left join StatusIterations rev on rev.quotID = q.quotID and rev.QStatus = 'Inreview'
left join StatusIterations asngd on asngd.quotID = q.quotID and asngd.QStatus = 'assigned'
If you have a table that equates a status with a numeric value, you can further improve on this:
Table: Status
QStatus Sequence
'Sold' 3
'Inreview' 2
'Assigned' 1
And the code becomes:
select t.quotID, MAX(t.itration) itration, t.QStatus
from
(
select t.quotID, MAX(s.Sequence) As Sequence
from table_NAME_A t
inner join Status s on s.QStatus = t.QStatus
group by t.quotID
) seq
inner join Status s on s.Sequence = seq.Sequence
inner join table_NAME_A t on t.quotID = seq.quotID and t.QStatus = s.QStatus
group by t.quotID, t.QStatus
The above may look complicated at first, but it can be faster, and it will scale easily beyond three statuses without changing the code.
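The lookup-table approach can be tried out on the question's sample rows with Python's sqlite3 module, as in the sketch below. SQLite compares text case-sensitively by default, unlike a typical SQL Server collation, so the QStatus columns are declared COLLATE NOCASE here; that declaration and the ORDER BY are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_NAME_A (
    quotid INTEGER,
    itration INTEGER,
    QStatus TEXT COLLATE NOCASE  -- assumption: mimic case-insensitive matching
);
CREATE TABLE Status (QStatus TEXT COLLATE NOCASE, Sequence INTEGER);
INSERT INTO Table_NAME_A VALUES
    (5329, 1, 'Assigned'), (5329, 2, 'Inreview'), (5329, 3, 'sold'),
    (4329, 1, 'sold'),     (4329, 2, 'sold'),
    (3214, 1, 'assigned'), (3214, 2, 'Inreview');
INSERT INTO Status VALUES ('Sold', 3), ('Inreview', 2), ('Assigned', 1);
""")

best = conn.execute("""
    SELECT t.quotid, MAX(t.itration) AS itration, t.QStatus
    FROM (
        -- highest-priority status present for each quote
        SELECT a.quotid, MAX(s.Sequence) AS Sequence
        FROM Table_NAME_A a
        JOIN Status s ON s.QStatus = a.QStatus
        GROUP BY a.quotid
    ) seq
    JOIN Status s ON s.Sequence = seq.Sequence
    JOIN Table_NAME_A t ON t.quotid = seq.quotid AND t.QStatus = s.QStatus
    GROUP BY t.quotid, t.QStatus
    ORDER BY t.quotid DESC
""").fetchall()

print(best)
```

Adding a fourth status would only need a new row in the Status table; the query itself stays unchanged.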

How to get first entry with a value from an hierarchical setting structure?

I have a couple of tables. One table with Groups:
[ID]  [ParentGroupID]
1     NULL
2     1
3     1
4     2
And another with settings
[Setting]  [GroupId]  [Value]
Title      1          Hello
Title      2          World
Now I'd like to get "Hello" back if I'd query the Title for Group 3
And I'd like to get "World" back if I'd query the Title for Group 4 (And not "Hello" as well)
Is there any way to efficiently do this in MSSQL? At the moment I am resolving this recursively in code. But I was hoping that SQL could solve this problem for me.
I don't know the SQL Server syntax, but something like the following?
SELECT settings.value
FROM settings
JOIN groups ON settings.groupid = groups.parentgroupid
WHERE settings.setting = 'Title'
AND groups.id = 3
This is a problem we've encountered multiple times in our company. The following works in any case, including when settings are set only at some levels and not others (see SQL Fiddle: http://sqlfiddle.com/#!3/16af0/1/0):
With GroupSettings(group_id, parent_group_id, value, current_level)
As
(
Select g.id as group_id, g.parent_id, s.value, 0 As current_Level
From Groups As g
Join Settings As s On s.group_id = g.id
Where g.parent_id Is Null
Union All
Select g.id, g.parent_id, Coalesce((Select value From Settings s Where s.group_id=g.id), gs.value), current_level+1
From GroupSettings as gs
Join Groups As g On g.parent_id = gs.group_id
)
Select *
From GroupSettings
Where group_id=4
I believe the following is what you are seeking (see the sqlfiddle):
SELECT s.Value FROM
Groups g inner join Settings s
ON g.ParentGroupId = s.GroupID
WHERE g.ID = 3 -- returns Hello; setting ID = 4 returns World
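For comparison, here is a sketch of a bottom-up variant of the recursive idea, run with Python's sqlite3 module: it walks from the requested group up through its ancestors and returns the nearest value found. This is a swapped-in formulation, not the top-down CTE shown earlier. The table names grp and settings, the helper effective_setting, and group 5 (a child of group 4) are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grp (id INTEGER PRIMARY KEY, parent_group_id INTEGER);
CREATE TABLE settings (setting TEXT, group_id INTEGER, value TEXT);
INSERT INTO grp VALUES (1, NULL), (2, 1), (3, 1), (4, 2),
                       (5, 4);  -- hypothetical deeper group
INSERT INTO settings VALUES ('Title', 1, 'Hello'), ('Title', 2, 'World');
""")

def effective_setting(group_id, setting):
    """Return the value set on the group itself or on its nearest ancestor."""
    row = conn.execute("""
        WITH RECURSIVE chain(group_id, parent_group_id, depth) AS (
            SELECT id, parent_group_id, 0 FROM grp WHERE id = ?
            UNION ALL
            SELECT g.id, g.parent_group_id, c.depth + 1
            FROM chain c
            JOIN grp g ON g.id = c.parent_group_id
        )
        SELECT s.value
        FROM chain c
        JOIN settings s ON s.group_id = c.group_id AND s.setting = ?
        ORDER BY c.depth   -- depth 0 is the group itself, then its parent, ...
        LIMIT 1
    """, (group_id, setting)).fetchone()
    return row[0] if row else None

print(effective_setting(3, 'Title'))
print(effective_setting(4, 'Title'))
```

Unlike a single join on the parent ID, this also covers a value set on the group itself or several levels up. In SQL Server the same shape would be written with WITH (no RECURSIVE keyword) and TOP 1 instead of LIMIT.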

Why left join is not giving distinct result?

I have the following SQL query, and my LEFT JOIN is not giving me distinct results. Please help me track down the cause.
SELECT DISTINCT
Position.Date,
Position.SecurityId,
Position.PurchaseLotId,
Position.InPosition,
ISNULL(ClosingPrice.Bid, Position.Mark) AS Mark
FROM
Fireball_Reporting.dbo.Reporting_DailyNAV_Pricing POSITION WITH (NOLOCK, READUNCOMMITTED)
LEFT JOIN Fireball.dbo.AdditionalSecurityPrice ClosingPrice WITH (NOLOCK, READUNCOMMITTED) ON
ClosingPrice.SecurityID = Position.PricingSecurityID AND
ClosingPrice.Date = Position.Date AND
ClosingPrice.SecurityPriceSourceID = @SourceID AND
ClosingPrice.PortfolioID IN (5,6)
WHERE
DatePurchased > @NewPositionDate AND
Position.Date = @CurrentPositionDate AND
InPosition = 1 AND
Position.PortfolioId IN (
SELECT
PARAM
FROM
Fireball_Reporting.dbo.ParseMultiValuedParameter(@PortfolioId, ',')
) AND
(
Position > 1 OR
Position < - 1
)
When I use the LEFT JOIN together with ISNULL(ClosingPrice.Bid, Position.Mark) AS Mark, I get more records than expected for multiple portfolio IDs, e.g. (5,6):
If I pass portfolioID = 5, I get 120 records.
If I pass portfolioID = 6, I get 20 records.
When I pass portfolioID = (5,6), I should get 140 records,
but I get 350 records, which is wrong. :(
This happens because the LEFT JOIN has no condition on PurchaseLotID: the table Fireball.dbo.AdditionalSecurityPrice (ClosingPrice) has no PurchaseLotID column, so the join also returns other records that have the same PurchaseLotID but different prices.
I don't want those records.
How can I eliminate them?
You get one entry per DailyLoanAndCashPosition.PurchaseLotId = NAVImpact.PurchaseLotId match,
which means there must be more than one entry with the same PurchaseLotId.
The most likely cause is that the left join produces duplicated PurchaseLotIds. The best way to find out is to perform a select distinct(PurchaseLotId) on the left side of the join.
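The kind of row multiplication described here can be reproduced in a small sketch with Python's sqlite3 module. The tables positions and closing_price are simplified, hypothetical stand-ins for the real ones, and the idea that a price row should match the position's own portfolio is an assumption about the intended semantics, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical, simplified stand-ins for the tables in the question
CREATE TABLE positions (
    purchase_lot_id INTEGER, security_id INTEGER,
    portfolio_id INTEGER, mark REAL
);
CREATE TABLE closing_price (security_id INTEGER, portfolio_id INTEGER, bid REAL);
INSERT INTO positions VALUES (1, 10, 5, 99.0), (2, 10, 6, 99.0);
INSERT INTO closing_price VALUES (10, 5, 100.0), (10, 6, 101.0);
""")

# Each position matches the price rows of BOTH portfolios, and DISTINCT
# cannot collapse them because the bids differ.
fan_out = conn.execute("""
    SELECT DISTINCT p.purchase_lot_id, COALESCE(c.bid, p.mark) AS mark
    FROM positions p
    LEFT JOIN closing_price c
           ON c.security_id = p.security_id
          AND c.portfolio_id IN (5, 6)
    ORDER BY p.purchase_lot_id, mark
""").fetchall()

# Diagnostic: which lots come back more than once?
duplicated = conn.execute("""
    SELECT p.purchase_lot_id, COUNT(*) AS n
    FROM positions p
    LEFT JOIN closing_price c
           ON c.security_id = p.security_id
          AND c.portfolio_id IN (5, 6)
    GROUP BY p.purchase_lot_id
    HAVING COUNT(*) > 1
    ORDER BY p.purchase_lot_id
""").fetchall()

# Assumed fix: tie the price row to the position's own portfolio.
matched = conn.execute("""
    SELECT DISTINCT p.purchase_lot_id, COALESCE(c.bid, p.mark) AS mark
    FROM positions p
    LEFT JOIN closing_price c
           ON c.security_id = p.security_id
          AND c.portfolio_id = p.portfolio_id
    ORDER BY p.purchase_lot_id
""").fetchall()

print(fan_out, duplicated, matched)
```

The GROUP BY ... HAVING COUNT(*) > 1 query is a quick way to list which keys the join multiplies before deciding which extra join condition is missing.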