How to join multiple tables without omitting values that have no match - sql

This small example is based on a question that I have run into countless times, but I have never found the best answer.
I would like to create a report on Incidents logged per type, per month, and have written the following query:
SELECT
    d.MonthPeriod
    ,i.[Type]
    ,COUNT(*) AS [Count of Calls]
FROM
    [dbo].[FactIncident] as i
LEFT JOIN
    [dbo].[DimDate] as d on i.DateLoggedKey = d.DateKey
GROUP BY
    d.[MonthPeriod],
    i.[Type]
This results in the following:
Although correct, I would also like to see earlier months with 0 logged calls. DimDate contains the following.
What is the best and/or most efficient way of showing the count of calls per month, per type, for all months, even if the count is 0?
I thought of using CROSS APPLY, but the resulting query gets huge quickly. Just think of a dataset that needs the count of calls per customer, per category, per month over the last 3 years.
Any ideas?

Do the left join starting with the calendar table, so you keep all the months:
SELECT d.MonthPeriod, i.[Type], COUNT(i.type) AS [Count of Calls]
FROM [dbo].[DimDate] d LEFT JOIN
[dbo].[FactIncident] i
ON i.DateLoggedKey = d.DateKey
GROUP BY d.[MonthPeriod], i.[Type];
This will, of course, return the type as NULL for the months with no data.
If you want all types present, then use CROSS JOIN on the types. This example gets the data from the fact table, but you might have another reference table containing each type:
SELECT d.MonthPeriod, t.[Type], COUNT(i.type) AS [Count of Calls]
FROM [dbo].[DimDate] d CROSS JOIN
(select distinct type from factincident) t LEFT JOIN
[dbo].[FactIncident] i
ON i.DateLoggedKey = d.DateKey and i.type = t.type
GROUP BY d.[MonthPeriod], t.[Type];
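If you do have such a reference table, the same pattern applies. Here is a sketch, assuming a hypothetical [dbo].[DimIncidentType] table that holds one row per [Type]:
SELECT d.MonthPeriod, t.[Type], COUNT(i.[Type]) AS [Count of Calls]
FROM [dbo].[DimDate] d CROSS JOIN
     [dbo].[DimIncidentType] t LEFT JOIN  -- hypothetical type dimension
     [dbo].[FactIncident] i
     ON i.DateLoggedKey = d.DateKey AND i.[Type] = t.[Type]
GROUP BY d.[MonthPeriod], t.[Type];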

Related

Aggregation of a single data type on a joined table

I have a manual report causing 2-3 hours of manual labor weekly for aggregation.
the "a" table gives me the length I want to sum and the "b" table join brings in the name of the activity I need to aggregate. The issue is that the query does not completely aggregate the value I wish. The output I am looking for is the a.dtlExcode_ and b.print_name_ with the total sum for the dates selected by a.dtlExcode_.
Can anyone provide some pointers? I am fairly new to writing queries but am determined to eliminate manual work from canned reports by using the database.
select sum(a.dtlLength_) as Total_Minutes,
a.dtlExcode_,
b.print_name_
from V_SCHEDULE a
left join V_EXCEPT b on a.dtlExcode_ = b.code_
where sched_date_ is between '2021-07-12' and '2021-07-18'
Group by
a.sched_date_,
a.dtlLength_,
a.dtlExcode_,
b.print_name_
Just a few small errors in the logic. Here are the adjustments:
select sum(a.dtlLength_) as Total_Minutes
, a.dtlExcode_
, b.print_name_
from V_SCHEDULE a
left join V_EXCEPT b
on a.dtlExcode_ = b.code_
where sched_date_ between '2021-07-12' and '2021-07-18'
Group by a.sched_date_, a.dtlExcode_, b.print_name_
;
You had included a.dtlLength_ in the GROUP BY terms, which produced a separate group for each distinct a.dtlLength_ value. That was the main problem. (The stray is in where sched_date_ is between was also dropped; BETWEEN is not preceded by is.)
The other GROUP BY terms could be fine, depending on your requirement.
If you want each date separately in the result, that's fine. If not, remove the date term from the GROUP BY clause, like so:
select sum(a.dtlLength_) as Total_Minutes
, a.dtlExcode_
, b.print_name_
from V_SCHEDULE a
left join V_EXCEPT b
on a.dtlExcode_ = b.code_
where sched_date_ between '2021-07-12' and '2021-07-18'
Group by a.dtlExcode_, b.print_name_
;
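And if you do want each date separately, you would normally also add it to the SELECT list so the rows are distinguishable. A sketch of that variant:
select a.sched_date_
, sum(a.dtlLength_) as Total_Minutes
, a.dtlExcode_
, b.print_name_
from V_SCHEDULE a
left join V_EXCEPT b
on a.dtlExcode_ = b.code_
where sched_date_ between '2021-07-12' and '2021-07-18'
Group by a.sched_date_, a.dtlExcode_, b.print_name_
;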

SQL Query to count the records

I am putting together a SQL query that gets all the transaction types from one table and, from the other table, counts how often each transaction type occurs.
My query is this:
with CTE as
(
select a.trxType,a.created,b.transaction_key,b.description,a.mode
FROM transaction_data AS a with (nolock)
RIGHT JOIN transaction_types b with (nolock) ON b.transaction_key = a.trxType
)
SELECT COUNT (trxType) AS Frequency, description as trxType,mode
from CTE where created >='2017-04-11' and created <= '2018-04-13'
group by trxType ,description,mode
The transaction_types table contains all the types of transactions only and transaction_data contains the transactions which have occurred.
The problem I am facing is that even though it's the RIGHT join, it does not select all the records from the transaction_types table.
I need to select all the transactions from the transaction_types table and show the number of counts for each transaction, even if it's 0.
Please help.
LEFT JOIN is so much easier to follow.
I think you want:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
transaction_data t
on tt.transaction_key = t.trxType and
t.created >= '2017-04-11' and t.created <= '2018-04-13'
group by tt.transaction_key, tt.description, t.mode;
Notes:
Use reasonable table aliases! a and b mean nothing. t and tt are abbreviations of the table name, so they are easier to follow.
t.mode will be NULL for non-matching rows.
The condition on dates needs to be in the ON clause. Otherwise, the outer join is turned into an inner join (see the example after these notes).
LEFT JOIN is easier to follow (at least for people whose native language reads left-to-right) because it means "keep all the rows in the table you have already read".
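For example, if the date filter were written in the WHERE clause instead, the transaction types with no matching rows would disappear, because their t.created is NULL and NULL fails the comparison:
select tt.transaction_key, tt.description, t.mode, count(t.trxType)
from transaction_types tt left join
     transaction_data t
     on tt.transaction_key = t.trxType
where t.created >= '2017-04-11' and t.created <= '2018-04-13'  -- filters out the NULLs, so zero-count types are lost
group by tt.transaction_key, tt.description, t.mode;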

Selecting the most frequent value in a column based on the value of another column in the same row?

So basically what I'm trying to do is generate a report for our stores. We have an incident report website where the employees can report an incident that takes place at any of our stores. So in the general report I'm trying to generate, I want to show the details for each store we have (Five stores). This would include the name of the store, number of incidents, oldest incident date, newest incident date, and then the most recurring type of incident at each store.
SELECT Store.Name AS [Store Name], COUNT(*) AS [No. Of Incidents], Min(CAST(DateNotified AS date)) AS [Oldest Incident], Max(CAST(DateNotified AS date)) AS [Latest Incident],
( SELECT TOP 1 IncidentType.Details
FROM IncidentDetails
INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
Group By IncidentType.Details, IncidentDetails.StoreID
Order By COUNT(IncidentType.Details) DESC) AS [Most Frequent Incident]
FROM IncidentDetails
INNER JOIN Store ON IncidentDetails.StoreID = Store.StoreID
INNER JOIN IncidentType On IncidentDetails.IncidentTypeID = IncidentType.IncidentTypeID
GROUP BY Store.Name
Just to make it clear, the IncidentDetails table stores all the details about the incident, including which store it occurred at, what the type of incident was, time/date, etc.
What this does though is it gives me 5 rows for each store, but the [Most Frequent Incident] value is the same for every row. Basically, it gets the most frequent incident value for the whole table, regardless of which store it came from, and then displays that for each store, even though different stores have different values for the column.
I've been trying to solve this for a while now but haven't been able to :-(
You have too many joins and no correlation clause.
There are several ways to approach this problem. You have already started with an aggregation in the outer query and then a nested subquery. So, this continues that approach. I think this does what you want:
SELECT s.Name AS [Store Name], COUNT(*) AS [No. Of Incidents],
Min(CAST(DateNotified AS date)) AS [Oldest Incident],
Max(CAST(DateNotified AS date)) AS [Latest Incident],
(SELECT TOP 1 it.Details
FROM IncidentDetails id2 INNER JOIN
IncidentType it
On id2.IncidentTypeID = it.IncidentTypeID
WHERE id2.StoreId = s.StoreId
Group By it.Details
Order By COUNT(*) DESC
) AS [Most Frequent Incident]
FROM IncidentDetails id INNER JOIN
Store s
ON id.StoreID = s.StoreID
GROUP BY s.Name, s.StoreId;
Notes:
Removed the IncidentType table from the joins in the outer query. It doesn't seem needed there (although it could be used for filtering).
Added s.StoreId to the group by clause. This is needed for the correlation in the subquery.
Added a where clause so the subquery is only processed once for each store in the outer query.
Removed the join to Store in the subquery. It seems unnecessary, if the queries can be correlated on StoreId.
Changed the group by in the subquery to use Details. That is the value being selected.
Added table aliases, which make queries easier to write and to read.
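As a side note, another way to express the same logic in SQL Server is OUTER APPLY, which makes the per-store correlation explicit. This is only an untested sketch of the equivalent query (it assumes DateNotified lives in IncidentDetails, as in the query above):
SELECT s.Name AS [Store Name], COUNT(*) AS [No. Of Incidents],
       MIN(CAST(id.DateNotified AS date)) AS [Oldest Incident],
       MAX(CAST(id.DateNotified AS date)) AS [Latest Incident],
       MAX(x.Details) AS [Most Frequent Incident]  -- x.Details is the same for every row of a store
FROM IncidentDetails id INNER JOIN
     Store s
     ON id.StoreID = s.StoreID OUTER APPLY
     (SELECT TOP 1 it.Details
      FROM IncidentDetails id2 INNER JOIN
           IncidentType it
           ON id2.IncidentTypeID = it.IncidentTypeID
      WHERE id2.StoreID = s.StoreID
      GROUP BY it.Details
      ORDER BY COUNT(*) DESC
     ) x
GROUP BY s.Name, s.StoreID;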

SQL Include zero rows in query

I have a derived table with category, subcategory, date, time, and value columns (derived from a much larger table).
The dates are all dates with a particular property (sometimes the last 5 weeks, sometimes the last 5 Sundays, sometimes month to date), as are the times (sometimes midnight to now every hour, sometimes the last 30 minutes every 2 minutes).
The category and subcategory are uninteresting, but if they have no entry in the value field for the time parameters they're completely ignored (which is fine).
When there is no value which matches the combination of the other columns, the row is completely ignored.
My problem is that I need the 0's included. Is there some way I can fill in 0's to the rest of the table?
My thought was to do a cross join of the table against itself using only the first 4 columns, then left join with isnull() on the value column. But the cross join kept including the 5th column.
If I understand correctly, you want all combinations of date, time, category, and subcategory that your query returns. If so, you can use cross join to get the combinations and then left outer join to fill in the values or 0:
with t as (
<your query here>
)
select d.date, ti.time, sc.category, sc.subcategory, coalesce(t.value, 0) as value
from (select distinct date from t) d cross join
(select distinct time from t) ti cross join
(select distinct category, subcategory from t) sc left outer join
t
on t.date = d.date and t.time = ti.time and
t.category = sc.category and t.subcategory = sc.subcategory;
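For instance, if the derived table came from a hypothetical base table called readings (the table name and filter are placeholders for whatever actually defines your derived table), the CTE would be filled in like this, and the rest of the query stays the same:
with t as (
    select category, subcategory, date, time, value
    from readings                  -- hypothetical source table
    where date >= '2021-07-01'     -- placeholder condition
)
select d.date, ti.time, sc.category, sc.subcategory, coalesce(t.value, 0) as value
from (select distinct date from t) d cross join
     (select distinct time from t) ti cross join
     (select distinct category, subcategory from t) sc left outer join
     t
     on t.date = d.date and t.time = ti.time and
        t.category = sc.category and t.subcategory = sc.subcategory;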

Flattening Join in PostgreSQL

Is it possible to join a table so that only a specific row at a specific ordered offset is joined, instead of every matching record in the table?
I have two tables, Customer and MonthlyRecommendation. MonthlyRecommendation points to Customer and tracks one product recommendation made by the customer on some day in each month.
I'm trying to write a query that retrieves each customer, along with the last 12-months of recommendations. Simply doing:
SELECT c.id, m.date, m.product
FROM Customer AS c
INNER JOIN MonthlyRecommendation AS m ON m.customer_id = c.id
will get me the data I want, but I need it flattened so that each customer's data is in one row, and the result signature looks like:
id, date_01, product_01, date_02, product_02, ..., date_12, product_12
Is there any way to do this in PostgreSQL? For similar problems, I would normally just make 12 separate JOINs, joining on a specific sub-condition for each one, but in this case the condition is relative to the order of the date values in the table. I'd like to be able to specify an ORDER BY, with maybe a LIMIT and OFFSET, but I don't believe any SQL dialect supports that.
Some databases support the pivot operation directly. In Postgres, you could use the crosstab() function from the tablefunc extension. But the aggregation method is simple enough:
SELECT c.id,
'2013-01' as date_01, max(case when m.date = '2013-01' then m.product end) as product_01,
'2013-02' as date_02, max(case when m.date = '2013-02' then m.product end) as product_02,
. . .
'2013-12' as date_12, max(case when m.date = '2013-12' then m.product end) as product_12
FROM Customer c INNER JOIN
MonthlyRecommendation m
ON m.customer_id = c.id
GROUP BY c.id;
Of course, the above query is just guessing at a format for date. You'll need to put the right comparison in for your data.
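If m.date is an actual date or timestamp column rather than a pre-formatted 'YYYY-MM' string, one option (just a sketch, using Postgres's to_char()) is to compare on the month it falls in:
SELECT c.id,
       '2013-01' as date_01, max(case when to_char(m.date, 'YYYY-MM') = '2013-01' then m.product end) as product_01,
       '2013-02' as date_02, max(case when to_char(m.date, 'YYYY-MM') = '2013-02' then m.product end) as product_02,
       -- . . . repeat for the months in between . . .
       '2013-12' as date_12, max(case when to_char(m.date, 'YYYY-MM') = '2013-12' then m.product end) as product_12
FROM Customer c INNER JOIN
     MonthlyRecommendation m
     ON m.customer_id = c.id
GROUP BY c.id;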