SQL Include zero rows in query - sql

I have derived a table with category, subcategory, date, time, and value columns (from a much larger table).
The dates are all dates with a particular property (sometimes the last 5 weeks, sometimes the last 5 Sundays, sometimes month to date), and likewise the times (sometimes every hour from midnight to now, sometimes every 2 minutes for the last 30 minutes).
The category and subcategory are uninteresting, but if they have no entry in the value field for the time parameters they're completely ignored (which is fine).
When there is no value which matches the combination of the other columns, the row is completely ignored.
My problem is that I need the 0's included. Is there some way I can fill in 0's to the rest of the table?
My thought was to do a cross join of the table against itself using only the first 4 columns, then left join with isnull() on the value column. But the cross join kept including the 5th column.

If I understand correctly, you want all combinations of date, time, category, and subcategory that your query returns. If so, you can use cross join to get the combinations and then left outer join to fill in the values or 0:
with t as (
<your query here>
)
select d.date, ti.time, sc.category, sc.subcategory, coalesce(t.value, 0) as value
from (select distinct date from t) d cross join
(select distinct time from t) ti cross join
(select distinct category, subcategory from t) sc left outer join
t
on t.date = d.date and t.time = ti.time and
t.category = sc.category and t.subcategory = sc.subcategory;
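As a rough illustration of the cross join plus left outer join approach above, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. The table contents are invented for the example, and SQLite is only an assumption made to keep it runnable; the question does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the derived table described in the question.
conn.execute(
    "CREATE TABLE t (category TEXT, subcategory TEXT, date TEXT, time TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        ("A", "x", "2024-01-01", "00:00", 5),
        ("A", "x", "2024-01-02", "01:00", 7),
        ("B", "y", "2024-01-01", "01:00", 3),
    ],
)

query = """
SELECT d.date, ti.time, sc.category, sc.subcategory,
       COALESCE(t.value, 0) AS value
FROM (SELECT DISTINCT date FROM t) d
CROSS JOIN (SELECT DISTINCT time FROM t) ti
CROSS JOIN (SELECT DISTINCT category, subcategory FROM t) sc
LEFT OUTER JOIN t
  ON t.date = d.date AND t.time = ti.time
 AND t.category = sc.category AND t.subcategory = sc.subcategory
ORDER BY d.date, ti.time, sc.category, sc.subcategory
"""

# Every (date, time, category/subcategory) combination appears once;
# combinations with no source row get a value of 0.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With 2 distinct dates, 2 distinct times, and 2 category/subcategory pairs, the sketch produces 8 rows, of which only the 3 original rows carry a non-zero value.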

Related

Difference between 2 queries?

The 1st query returns 27384 rows. The 2nd query returns 142899 rows. Can someone please explain what is happening with the RIGHT JOIN and LEFT JOIN that is causing the output difference?
1st query :
SELECT u.id AS id,
MIN(q.creation_date) AS q_creation_date,
MIN(a.creation_date) AS a_creation_date
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
FULL JOIN `bigquery-public-data.stackoverflow.posts_answers` AS a
ON q.owner_user_id = a.owner_user_id
LEFT JOIN `bigquery-public-data.stackoverflow.users` AS u
ON q.owner_user_id = u.id
WHERE u.creation_date >= '2019-01-01'
and u.creation_date < '2019-02-01'
GROUP BY id
2nd query :
SELECT u.id AS id,
MIN(q.creation_date) AS q_creation_date,
MIN(a.creation_date) AS a_creation_date
FROM `bigquery-public-data.stackoverflow.posts_questions` AS q
FULL JOIN `bigquery-public-data.stackoverflow.posts_answers` AS a
ON q.owner_user_id = a.owner_user_id
RIGHT JOIN `bigquery-public-data.stackoverflow.users` AS u
ON q.owner_user_id = u.id
WHERE u.creation_date >= '2019-01-01' and u.creation_date < '2019-02-01'
GROUP BY id
I expected the result from the 1st query to be 142899 rows but I don't know why the LEFT JOIN returns a massively different result.
In the 1st query, the LEFT JOIN by itself keeps every row produced by the FULL JOIN of 'q' and 'a', and where no user matches, the database fills the 'u' columns with nulls. The WHERE clause then filters on u.creation_date, and a null never satisfies that comparison, so every row without a matching 'u' is discarded. The recordset is therefore LIMITED TO records where both 'q' and 'u' have a match.
So, the 1st query returns only users created in January 2019 who have asked at least one question.
In the 2nd query, the RIGHT JOIN keeps ALL records of 'u' (and where 'q' and 'a' have no data to match, the database fills those cells with nulls). Rows of 'q' and 'a' without a matching 'u' are dropped. The WHERE clause filters on 'u' only, so the null-filled 'q' and 'a' columns do not remove any rows.
So, the 2nd query returns one row for every user created in January 2019, whether or not that user has asked a question.
When you use RIGHT JOIN, the table whose rows are always preserved is the one on the right. Similarly, LEFT JOIN preserves the table to the left of JOIN. The number of rows differs because the preserved table keeps its rows even when the other table has no matching row.
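The effect described in these answers can be reproduced on a small scale. The sketch below is an illustration only: it uses SQLite from Python instead of BigQuery, invented `users` and `questions` tables, and it writes the right join as a LEFT JOIN with the table order swapped, which is equivalent and works on SQLite versions that lack RIGHT JOIN. The answers table is left out because it does not affect which user ids survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, creation_date TEXT);
CREATE TABLE questions (id INTEGER PRIMARY KEY, owner_user_id INTEGER);
-- Users 1-3 were created in January 2019, user 4 earlier.
INSERT INTO users VALUES (1, '2019-01-05'), (2, '2019-01-10'),
                         (3, '2019-01-20'), (4, '2018-06-01');
-- Only users 1 and 4 asked questions; question 13 has no matching user.
INSERT INTO questions VALUES (10, 1), (11, 1), (12, 4), (13, 99);
""")

date_filter = "u.creation_date >= '2019-01-01' AND u.creation_date < '2019-02-01'"

# Shape of the 1st query: questions are preserved, but the WHERE clause on u
# removes every row whose u columns are NULL, so it behaves like an inner join.
left_join_ids = conn.execute(f"""
    SELECT u.id
    FROM questions q
    LEFT JOIN users u ON q.owner_user_id = u.id
    WHERE {date_filter}
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Shape of the 2nd query: users are preserved (same as questions RIGHT JOIN users),
# so users without questions survive the WHERE clause.
right_join_ids = conn.execute(f"""
    SELECT u.id
    FROM users u
    LEFT JOIN questions q ON q.owner_user_id = u.id
    WHERE {date_filter}
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

print(left_join_ids)
print(right_join_ids)
```

The first result contains only user 1, the single January 2019 user who asked a question, while the second contains users 1, 2, and 3.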

Sql Join and Sum

I'm trying to sum an amount from two different tables using a left join. I need all rows returned regardless of whether or not there is a match on the second table.
SELECT l.tender,
l.starting+SUM(t.amount) AS 'amount'
FROM label l
LEFT JOIN transfers t on l.tender=t.name
ORDER BY l.tender
This should work:
SELECT l.tender,
l.starting + ISNULL(t.amount,0) AS amount
FROM label l
LEFT JOIN ( SELECT name, SUM(amount) amount
FROM transfers
GROUP BY name) t
ON l.tender = t.name
ORDER BY l.tender
Wrap the SUM(amount) in a COALESCE function so that if there is no match and it is NULL, it will add 0 and won't mark the whole row as NULL.
Change line 2 of your query to look like this:
l.starting + COALESCE(SUM(t.amount), 0) AS amount
Edit:
As @Lamak noted, you also need to GROUP BY your results so that you can SUM it properly. The whole query can look like this (alternative to the other answer):
SELECT l.tender,
l.starting + COALESCE(SUM(t.amount), 0) AS amount
FROM label l
LEFT JOIN transfers t on l.tender=t.name
GROUP BY l.tender, l.starting
ORDER BY l.tender
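To see why the COALESCE matters, here is a small sketch run against SQLite from Python. The rows in `label` and `transfers` are made up for the example, and SQLite is an assumption; the COALESCE behaviour shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE label (tender TEXT, starting INTEGER);
CREATE TABLE transfers (name TEXT, amount INTEGER);
INSERT INTO label VALUES ('cash', 100), ('check', 50);
-- 'check' has no transfers at all.
INSERT INTO transfers VALUES ('cash', 20), ('cash', 5);
""")

# Without COALESCE: SUM over no matching rows is NULL, and starting + NULL is NULL.
without_coalesce = conn.execute("""
    SELECT l.tender, l.starting + SUM(t.amount) AS amount
    FROM label l
    LEFT JOIN transfers t ON l.tender = t.name
    GROUP BY l.tender, l.starting
    ORDER BY l.tender
""").fetchall()

# With COALESCE: a missing sum is treated as 0, so the starting amount survives.
with_coalesce = conn.execute("""
    SELECT l.tender, l.starting + COALESCE(SUM(t.amount), 0) AS amount
    FROM label l
    LEFT JOIN transfers t ON l.tender = t.name
    GROUP BY l.tender, l.starting
    ORDER BY l.tender
""").fetchall()

print(without_coalesce)
print(with_coalesce)
```

In the first result the 'check' row has a NULL amount (None in Python); in the second it keeps its starting value of 50.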

How to join multiple tables, without omitting values without a match

This small example is based on a question that I have run into countless times, but I have failed to find the best answer.
I would like to create a report on Incidents logged per type, per month. I have written the following query.
SELECT
d.MonthPeriod
,i.[Type]
,COUNT(*) AS [Count of Calls]
FROM
[dbo].[FactIncident] as i
LEFT JOIN
[dbo].[DimDate] as d on i.DateLoggedKey = d.DateKey
GROUP BY
d.[MonthPeriod],
i.[Type]
This results in the following:
Although correct, I would like to visualize earlier months with 0 logged calls. DimDate contains the following.
What is the best and/or most efficient way of showing the count of calls per month, per type, for all months. Even if the count is 0?
I thought of using Cross Apply, but the resulting query gets huge quickly. Just think of a dataset requiring the count of calls per customer, per category, per month over the last 3 years.
Any ideas?
Do the left join starting with the calendar table, so you keep all the months:
SELECT d.MonthPeriod, i.[Type], COUNT(i.type) AS [Count of Calls]
FROM [dbo].[DimDate] d LEFT JOIN
[dbo].[FactIncident] i
ON i.DateLoggedKey = d.DateKey
GROUP BY d.[MonthPeriod], i.[Type];
This will, of course, return the type as NULL for the months with no data.
If you want all types present, then use CROSS JOIN on the types. This example gets the data from the fact table, but you might have another reference table containing each type:
SELECT d.MonthPeriod, t.[Type], COUNT(i.type) AS [Count of Calls]
FROM [dbo].[DimDate] d CROSS JOIN
(select distinct type from factincident) t LEFT JOIN
[dbo].[FactIncident] i
ON i.DateLoggedKey = d.DateKey and i.type = t.type
GROUP BY d.[MonthPeriod], t.[Type];
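The CROSS JOIN version can be tried out with a sketch like the following, which uses SQLite from Python. The column names come from the question, but the rows are invented and the bracketed SQL Server identifiers are written without brackets, so treat it as an illustration under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DateKey INTEGER PRIMARY KEY, MonthPeriod TEXT);
CREATE TABLE FactIncident (DateLoggedKey INTEGER, Type TEXT);
INSERT INTO DimDate VALUES (20240115, '2024-01'), (20240210, '2024-02'),
                           (20240305, '2024-03');
-- No incidents were logged in January.
INSERT INTO FactIncident VALUES (20240210, 'Hardware'), (20240210, 'Hardware'),
                                (20240305, 'Software');
""")

# Build every month/type pair first, then attach the matching incidents.
# COUNT(i.Type) counts only matched rows, so unmatched pairs report 0.
counts = conn.execute("""
    SELECT d.MonthPeriod, t.Type, COUNT(i.Type) AS CountOfCalls
    FROM DimDate d
    CROSS JOIN (SELECT DISTINCT Type FROM FactIncident) t
    LEFT JOIN FactIncident i
      ON i.DateLoggedKey = d.DateKey AND i.Type = t.Type
    GROUP BY d.MonthPeriod, t.Type
    ORDER BY d.MonthPeriod, t.Type
""").fetchall()

for row in counts:
    print(row)
```

Each of the 3 months appears with both types, and pairs without any incident show a count of 0 instead of being omitted.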

Sum Distinct Rows Only In Sql Server

I have four tables, in which First has a one-to-many relation with each of the other three tables (named Second, Third, and Fourth). I want to sum only the distinct rows returned by the select query. Here is the query I have tried so far.
select count(distinct First.Order_id) as [No.Of Orders],sum( First.Amount) as [Amount] from First
inner join Second on First.Order_id=Second.Order_id
inner join Third on Third.Order_id=Second.Order_id
inner join Fourth on Fourth.Order_id=Third.Order_id
The outcome of this query is :
No.Of Orders Amount
7 69
But this Amount should be 49, because the sum of the Amount column in First is 49. Due to the inner joins and the one-to-many relationships, the sum also includes duplicate rows. How can I avoid this? Kindly guide me.
I think the problem is cartesian products in the joins (for a given id). You can solve this using row_number():
select count(t1234.Order_id) as [No.Of Orders], sum(t1234.Amount) as [Amount]
from (select First.*,
row_number() over (partition by First.Order_id order by First.Order_id) as seqnum
from First inner join
Second
on First.Order_id=Second.Order_id inner join
Third
on Third.Order_id=Second.Order_id inner join
Fourth
on Fourth.Order_id=Third.Order_id
) t1234
where seqnum = 1;
By the way, you could also express this using conditions in the where clause, because you appear to be using the joins only for filtering:
select count(First.Order_id) as [No.Of Orders], sum(First.Amount) as [Amount]
from First
where exists (select 1 from second where First.Order_id=Second.Order_id) and
exists (select 1 from third where First.Order_id=third.Order_id) and
exists (select 1 from fourth where First.Order_id=fourth.Order_id);
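The row multiplication and the EXISTS fix can be reproduced with a small sketch in SQLite from Python. The table names `orders`, `items`, and `shipments` are hypothetical stand-ins for First, Second, and Third, and the amounts are invented so that the correct total is 49.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Order_id INTEGER PRIMARY KEY, Amount INTEGER);
CREATE TABLE items (Order_id INTEGER);
CREATE TABLE shipments (Order_id INTEGER);
INSERT INTO orders VALUES (1, 10), (2, 20), (3, 19);   -- true total: 49
INSERT INTO items VALUES (1), (1), (2), (3);           -- order 1 has two items
INSERT INTO shipments VALUES (1), (2), (2), (3);       -- order 2 has two shipments
""")

# Joining repeats each order once per matching child-row combination,
# so SUM(Amount) counts some orders more than once.
joined = conn.execute("""
    SELECT COUNT(DISTINCT o.Order_id), SUM(o.Amount)
    FROM orders o
    INNER JOIN items i ON o.Order_id = i.Order_id
    INNER JOIN shipments s ON s.Order_id = i.Order_id
""").fetchone()

# EXISTS only filters the orders; each order contributes to the sum once.
filtered = conn.execute("""
    SELECT COUNT(o.Order_id), SUM(o.Amount)
    FROM orders o
    WHERE EXISTS (SELECT 1 FROM items i WHERE i.Order_id = o.Order_id)
      AND EXISTS (SELECT 1 FROM shipments s WHERE s.Order_id = o.Order_id)
""").fetchone()

print(joined)
print(filtered)
```

Here the joined query reports 3 orders but an inflated amount of 79 (orders 1 and 2 are each counted twice), while the EXISTS query reports 3 orders and the expected 49.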

Left and right joining in a query

A friend asked me for help on building a query that would show how many pieces of each model were sold on each day of the month, showing zeros when no pieces were sold for a particular model on a particular day, even if no items of any model are sold on that day. I came up with the query below, but it isn't working as expected. I'm only getting records for the models that have been sold, and I don't know why.
select days_of_months.`Date`,
m.NAME as "Model",
count(t.ID) as "Count"
from MODEL m
left join APPLIANCE_UNIT a on (m.ID = a.MODEL_FK and a.NUMBER_OF_UNITS > 0)
left join NEW_TICKET t on (a.NEW_TICKET_FK = t.ID and t.TYPE = 'SALES'
and t.SALES_ORDER_FK is not null)
right join (select date(concat(2009,'-',temp_months.id,'-',temp_days.id)) as "Date"
from temp_months
inner join temp_days on temp_days.id <= temp_months.last_day
where temp_months.id = 3 -- March
) days_of_months on date(t.CREATION_DATE_TIME) =
date(days_of_months.`Date`)
group by days_of_months.`Date`,
m.ID, m.NAME
I had created the temporary tables temp_months and temp_days in order to get all the days for any month. I am using MySQL 5.1, but I am trying to make the query ANSI-compliant.
You should CROSS JOIN your dates and models so that you have exactly one record for each day-model pair no matter what, and then LEFT JOIN other tables:
SELECT date, name, COUNT(t.id)
FROM (
SELECT ...
) AS days_of_months
CROSS JOIN
model m
LEFT JOIN
APPLIANCE_UNIT a
ON a.MODEL_FK = m.id
AND a.NUMBER_OF_UNITS > 0
LEFT JOIN
NEW_TICKET t
ON t.id = a.NEW_TICKET_FK
AND t.TYPE = 'SALES'
AND t.SALES_ORDER_FK IS NOT NULL
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
GROUP BY
date, name
The way you do it now, you get NULLs in model_id for the days with no sales, and those rows are grouped together.
Note the JOIN condition:
AND t.CREATION_DATE_TIME >= days_of_months.`Date`
AND t.CREATION_DATE_TIME < days_of_months.`Date` + INTERVAL 1 DAY
instead of
DATE(t.CREATION_DATE_TIME) = DATE(days_of_months.`Date`)
This will help make your query sargable (able to use an index on CREATION_DATE_TIME).
You need to use outer joins, as they do not require each record in the two joined tables to have a matching record.
http://dev.mysql.com/doc/refman/5.1/en/join.html
You're looking for an OUTER join. A left outer join creates a result set with a record from the left side of the join even if the right side does not have a record to be joined with. A right outer join does the same in the opposite direction: it creates a record for the right-side table even if the left side does not have a corresponding record. Any column projected from the table that does not have a record will have a NULL value in the join result.