Include zero counts when grouping by multiple columns and setting filters - sql

I have a table (tbl) containing category (2 categories), impact (3 impacts), company name and date for example:
category | impact   | company | date       | number
---------+----------+---------+------------+-------
Animal   | Critical | A       | 12/31/1999 | 1
Book     | Critical | B       | 12/31/2000 | 2
Animal   | Minor    | C       | 12/31/2001 | 3
Book     | Minor    | D       | 12/31/2002 | 4
Animal   | Medium   | E       | 1/1/2003   | 5
I want to get the count of records for each category and impact and be able to add rows with zero count and also be able to filter by company and date.
In the example result set below, the count result is 1 for category = Animal and company = A. The rest is 0 records and only the Critical and Medium impacts appear
category | impact | count
---------+----------+-------
Animal | Critical | 1
Animal | Medium | 0
I've looked at the answers to similar questions that use joins; however, once I add a WHERE clause the zero-count rows are no longer included.
I also tried outer joins, but they don't produce the desired output. For example:
select a.impact, b.category, ISNULL(count(b.impact), 0) from tbl a
left outer join tbl b
on b.number = a.number
and (a.category = 'Animal' and a.company in ('A'))
group by a.impact, b.category
produces
impact | category | count
---------+------------+--------
Medium | NULL | 0
Medium | Animal | 1
Critical | NULL | 0
Minor | NULL | 0
but the desired output should be
category | impact | count
---------+----------+-------
Animal | Critical | 1
Animal | Medium | 0
Animal | Minor | 0
Any help will be appreciated. Answers to associated questions don't have filtering so I will appreciate if someone can help with a query to produce desired output.

You need a master table with all the possible combinations of Categories and Impacts for this. Then Left join your table with the master and do the aggregation. Something like below
;WITH CAT
AS
(
SELECT
category
FROM Tbl
GROUP BY category
),
IMP
AS
(
SELECT
Impact
FROM Tbl
GROUP BY Impact
),MST
AS
(
SELECT
*
FROM CAT
CROSS JOIN IMP
)
SELECT
MST.category,
MST.Impact,
COUNT(T.Number)
FROM MST
LEFT JOIN Tbl T
ON MST.category = T.category
AND MST.Impact = T.Impact
AND T.Company = 'A'
WHERE MST.Category = 'Animal'
GROUP BY MST.category,
MST.Impact
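The same idea can be tried end to end with a small script. This is only a sketch: it loads the question's sample rows into an in-memory SQLite database (not SQL Server), stores the dates as ISO strings, and adds a date range filter that the answer above does not show. The parameter names (company, category, start, end) are made up for the illustration. The point it demonstrates is that filters on the detail rows belong in the ON clause of the LEFT JOIN, while filters on the category/impact grid can go in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (category TEXT, impact TEXT, company TEXT, date TEXT, number INTEGER)"
)
# Sample rows from the question; dates rewritten as ISO strings (an assumption).
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?)",
    [
        ("Animal", "Critical", "A", "1999-12-31", 1),
        ("Book", "Critical", "B", "2000-12-31", 2),
        ("Animal", "Minor", "C", "2001-12-31", 3),
        ("Book", "Minor", "D", "2002-12-31", 4),
        ("Animal", "Medium", "E", "2003-01-01", 5),
    ],
)

query = """
WITH cat AS (SELECT DISTINCT category FROM tbl),
     imp AS (SELECT DISTINCT impact FROM tbl),
     mst AS (SELECT category, impact FROM cat CROSS JOIN imp)
SELECT mst.category, mst.impact, COUNT(t.number) AS cnt
FROM mst
LEFT JOIN tbl t
       ON t.category = mst.category
      AND t.impact = mst.impact
      AND t.company = :company            -- detail-row filters stay in ON
      AND t.date BETWEEN :start AND :end
WHERE mst.category = :category            -- grid filters can go in WHERE
GROUP BY mst.category, mst.impact
ORDER BY mst.category, mst.impact
"""
params = {
    "company": "A",
    "category": "Animal",
    "start": "1999-01-01",
    "end": "2003-12-31",
}
rows = conn.execute(query, params).fetchall()
for row in rows:
    print(row)
```

Moving the company or date condition from ON into WHERE would discard the unmatched grid rows again, which is the behaviour described in the question.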

Related

Comparing aggregated columns to non aggregated columns to remove matches

I have two separate tables from two different databases that are performing a matching check.
If the values match I want them out of the result set. The first table (A) has multiple entries that contain the same symbol matches for the matching columns in the second table (B).
The entries in table B, if added up will ideally equal the value of one of the matching rows of A.
The tables look like below when queried separately.
Underneath the tables is what my query currently looks like. I thought if I group the columns by the symbols I could use the SUM of B to add up to the value of A which would get rid of the entries. However, I think because I am summing from B and not from A, then the A doesn't count as an aggregated column so must be included in the group by and doesn't allow for the summing to work in the way I'm wanting it to calculate.
How would I be able to run this query so the values in B are all summed up. Then, if matching to the symbol/value from any of the entries in A, don't get included in the result set?
Table A
| Symbol | Value |
|--------|-------|
| A | 1000 |
| A | 1000 |
| B | 1440 |
| B | 1440 |
| C | 1235 |
Table B
| Symbol | Value |
|--------|-------|
| A | 750 |
| A | 250 |
| B | 24 |
| B | 1416|
| C | 1874|
SELECT DBA.A, DBB.B
FROM DatabaseA DBA
INNER JOIN DatabaseB DBB on DBA.Symbol = DBB.Symbol
and DBA.Value != DBB.Value
group by DBA.Symbol, DBB.Symbol, DBB.Value
having SUM(DBB.Value) != DBA.Value
order by Symbol, Value
Edited to add ideal results
Table C
| SymbolB| ValueB| SymbolA | ValueA |
|--------|-------|---------|--------|
| C | 1874 | C | 1235 |
Wherever B adds up to A remove both. If they don't add, leave number inside result set
I would use a common table expression (CTE) to sum the values of table B per symbol. Then join table A and table B on symbol and keep only the rows of A whose value does not match that total.
WITH tDBB as (
SELECT DBB.Symbol, SUM(DBB.Value) as total
FROM tableB as DBB
GROUP BY DBB.Symbol
)
SELECT distinct DBB.Symbol as SymbolB, DBB.Value as ValueB, DBA.Symbol as SymbolA, DBA.Value as ValueA
FROM tableA as DBA
INNER JOIN tableB as DBB on DBA.Symbol = DBB.Symbol
WHERE DBA.Symbol in (Select Symbol from tDBB)
AND DBA.Value <> (Select total from tDBB where tDBB.Symbol = DBA.Symbol)
Result:
|symbolB |valueB |SymbolA |ValueA |
|--------|-------|--------|-------|
| C | 1874 | C | 1235 |
with t3 as (
select symbol
,sum(value) as value
from t2
group by symbol
)
select *
from t3 join t on t.symbol = t3.symbol and t.value != t3.value
| symbol | value | Symbol | Value |
|--------|-------|--------|-------|
| C | 1874 | C | 1235 |
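For anyone who wants to run the pre-aggregation approach against the sample data, here is a minimal sketch using an in-memory SQLite database. The table names tableA and tableB and the CTE name totals are placeholders chosen for the illustration; the two real tables live in different databases, which this sketch does not model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tableA (Symbol TEXT, Value INTEGER);
    CREATE TABLE tableB (Symbol TEXT, Value INTEGER);
    INSERT INTO tableA VALUES ('A', 1000), ('A', 1000), ('B', 1440), ('B', 1440), ('C', 1235);
    INSERT INTO tableB VALUES ('A', 750), ('A', 250), ('B', 24), ('B', 1416), ('C', 1874);
    """
)

query = """
WITH totals AS (
    -- collapse table B to one row per symbol before comparing
    SELECT Symbol, SUM(Value) AS total
    FROM tableB
    GROUP BY Symbol
)
SELECT DISTINCT t.Symbol AS SymbolB, t.total AS ValueB,
                a.Symbol AS SymbolA, a.Value AS ValueA
FROM tableA a
JOIN totals t ON t.Symbol = a.Symbol
WHERE a.Value <> t.total        -- keep only symbols whose sum does not match
ORDER BY a.Symbol
"""
mismatches = conn.execute(query).fetchall()
print(mismatches)
```

Because the sum is computed per symbol before the join, table A never has to appear in a GROUP BY, which was the obstacle described in the question.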

SQL Server avoid repeat same joins

I'm running the query below, where I repeat the same joins multiple times. Is there a better way to do it? (SQL Server Azure)
Ex.
Table: [Customer]
[Id_Customer] | [CustomerName]
1 | Tomy
...
Table: [Store]
[Id_Store] | [StoreName]
1 | SuperMarket
2 | BestPrice
...
Table: [SalesFrutes]
[Id_SalesFrutes] | [FruteName] | [Fk_Id_Customer] | [Fk_Id_Store]
1 | Orange | 1 | 1
...
Table: [SalesVegetable]
[Id_SalesVegetable] | [VegetableName] | [Fk_Id_Customer] | [Fk_Id_Store]
1 | Pea | 1 | 2
...
Select * From [Customer] as C
left join [SalesFrutes] as SF on SF.[Fk_Id_Customer] = C.[Id_Customer]
left join [SalesVegetable] as SV on SV.[Fk_Id_Customer] = C.[Id_Customer]
left join [Store] as S1 on S1.[Id_Store] = SF.[Fk_Id_Store]
left join [Store] as S2 on S2.[Id_Store] = SV.[Fk_Id_Store]
In my real case, I have many [Sales...] tables to join with [Customer] and many other tables similar to [Store] to join to each [Sales...] table, so the number of repeated joins grows quickly. Is there a better way to do it?
Bonus question: I would also like FruteName, VegetableName, etc. to appear under one column, with the store name and the name of the source sales table each in their own column.
The Expected Result is:
[CustomerName] | [FoodName] | [SalesTableName] | [StoreName]
Tomy | Orange | SalesFrute | SuperMarket
Tomy | Pea | SalesVegetable | BestPrice
...
Thank you!!
Based on the information provided, I would suggest the query below, which uses a CTE to "fix" the data model and make the query easier to write.
Since you say your real-world scenario differs from the information provided, it might not work for you as-is. It could still apply if your sales tables share, say, 80% of their columns: use placeholder/null values where a column is missing when unioning the data sets, and you still minimise the number of joins, e.g. to your Store table.
with allSales as (
select Id_SalesFrutes as Id, FruteName as FoodName, 'Fruit' as SaleType, Fk_Id_Customer as Id_Customer, Fk_Id_Store as Id_Store
from SalesFrutes
union all
select Id_SalesVegetable, VegetableName, 'Vegetable', Fk_Id_Customer, Fk_Id_Store
from SalesVegetable
-- union all ... etc. for the remaining sales tables
)
select c.CustomerName, s.FoodName, s.SaleType, st.StoreName
from Customer c
join allSales s on s.Id_customer=c.Id_customer
join Store st on st.Id_Store=s.Id_Store
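A runnable sketch of this UNION ALL approach, using the sample rows from the question in an in-memory SQLite database rather than SQL Server. The SalesTableName literals are an assumption made to match the column in the expected result; everything else follows the table definitions in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (Id_Customer INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Store (Id_Store INTEGER PRIMARY KEY, StoreName TEXT);
    CREATE TABLE SalesFrutes (
        Id_SalesFrutes INTEGER PRIMARY KEY, FruteName TEXT,
        Fk_Id_Customer INTEGER, Fk_Id_Store INTEGER);
    CREATE TABLE SalesVegetable (
        Id_SalesVegetable INTEGER PRIMARY KEY, VegetableName TEXT,
        Fk_Id_Customer INTEGER, Fk_Id_Store INTEGER);
    INSERT INTO Customer VALUES (1, 'Tomy');
    INSERT INTO Store VALUES (1, 'SuperMarket'), (2, 'BestPrice');
    INSERT INTO SalesFrutes VALUES (1, 'Orange', 1, 1);
    INSERT INTO SalesVegetable VALUES (1, 'Pea', 1, 2);
    """
)

query = """
WITH allSales AS (
    -- one branch per sales table; each branch labels its own source
    SELECT FruteName AS FoodName, 'SalesFrutes' AS SalesTableName,
           Fk_Id_Customer AS Id_Customer, Fk_Id_Store AS Id_Store
    FROM SalesFrutes
    UNION ALL
    SELECT VegetableName, 'SalesVegetable', Fk_Id_Customer, Fk_Id_Store
    FROM SalesVegetable
)
SELECT c.CustomerName, s.FoodName, s.SalesTableName, st.StoreName
FROM Customer c
JOIN allSales s ON s.Id_Customer = c.Id_Customer
JOIN Store st ON st.Id_Store = s.Id_Store   -- Store is joined once, not once per sales table
ORDER BY s.FoodName
"""
sales = conn.execute(query).fetchall()
for row in sales:
    print(row)
```

Adding another sales table then means adding one more UNION ALL branch; the joins to Customer and Store stay as they are.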

PostgreSQL searching the query result based on defined conditions

I am a beginner and I am working on a fairly big query that returns about 50k rows.
I've spent many hours trying to figure out the issue. It looks like there is a gap in my knowledge, and I would be really grateful if you could help me.
In order to show you the main idea I decided to simplify and split the data. I am presenting the relevant tables here:
*company*
+----+----------+---------------+----------------+
| ID | name | classification | special_number |
+----+----------+---------------+----------------+
| 1 | companyX | 309 | 242 |
+----+----------+---------------+----------------+
*branch*
+----+---------------+-------+
| ID | name | color |
+----+---------------+-------+
| 1 | environmental | green |
| 2 | navy | blue |
+----+---------------+-------+
*company_branch*
+------------+-----------+
| ID_company | ID_branch |
+------------+-----------+
| 1 | 1 |
| 1 | 2 |
+------------+-----------+
Ok as we have all the needed data presented I need to create a query which will select all the companies along with the main color of the branches they are working in.
A companyX can belong to more than one branch but I need to show only the main branch which can be calculated based on the three conditions below:
* if classification = 309 and special_number is even, then show the relevant color and go to the next company (ignore the remaining conditions)
* if classification = 209 and special_number is even, then show the relevant color and go to the next company (ignore the remaining condition)
* else show grey
I created a query like that: (I know that the case phrase is not correct but I am keeping it as it shows in a better way what I am trying to accomplish)
SELECT c.ID, c.name, b.color, c.classification, c.special_number,
CASE
WHEN c.classification = 309 AND c.special_number % 2 = 0 THEN b.color
WHEN c.classification = 209 AND c.special_number % 2 = 0 THEN b.color
ELSE 'grey'
END AS 'case'
FROM company c INNER JOIN company_branch cb ON c.ID = cb.ID_company
INNER JOIN branch b ON b.ID = cb.ID_branch
then I get the following result
*result*
+----+----------+-------+----------------+----------------+------+
| ID | name | color | classification | special_number | case |
+----+----------+-------+----------------+----------------+------+
| 1 | companyX | green | 309 | 242 | green|
| 1 | companyX | blue | 309 | 242 | blue |
+----+----------+-------+----------------+----------------+------+
The problem is that if any company belongs to more than one branch then I always get many colors... what I would like to get is a list of companies with only one color of the main branch they are working in.
can you guys help the newbie ?
I think you want something like this:
SELECT DISTINCT ON (c.id) c.ID, c.name,
COALESCE(b.color, 'grey') as color,
c.classification, c.special_number
FROM company c LEFT JOIN
company_branch cb
ON c.ID = cb.ID_company LEFT JOIN
branch b
ON b.ID = cb.ID_branch
ORDER BY c.Id,
(CASE WHEN c.classification = 309 AND c.special_number % 2 = 0 THEN 1
WHEN c.classification = 209 AND c.special_number % 2 = 0 THEN 2
ELSE 3
END);
DISTINCT ON is a (useful) Postgres extension that returns one row per combination of the item(s) in parentheses. Which row is returned is determined by the ORDER BY: its first key(s) must be the DISTINCT ON item(s), and the remaining keys specify what "first" means within each group.
I switched the joins to being outer joins, so all companies are in the result set.
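DISTINCT ON is Postgres-specific. The "one row per company" idea can also be sketched with ROW_NUMBER(), which is what the script below does against an in-memory SQLite database (window functions need SQLite 3.25 or later). Several things here are assumptions and not part of the answer above: the second company (companyY) is invented test data, the lowest branch ID is used as the tie-breaker for "main branch", and the grey fallback is applied through a CASE on the company's classification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (ID INTEGER PRIMARY KEY, name TEXT,
                          classification INTEGER, special_number INTEGER);
    CREATE TABLE branch (ID INTEGER PRIMARY KEY, name TEXT, color TEXT);
    CREATE TABLE company_branch (ID_company INTEGER, ID_branch INTEGER);
    INSERT INTO company VALUES (1, 'companyX', 309, 242),
                               (2, 'companyY', 100, 7);   -- hypothetical extra row
    INSERT INTO branch VALUES (1, 'environmental', 'green'), (2, 'navy', 'blue');
    INSERT INTO company_branch VALUES (1, 1), (1, 2), (2, 2);
    """
)

query = """
SELECT ID, name, color
FROM (
    SELECT c.ID, c.name,
           CASE WHEN c.classification IN (309, 209) AND c.special_number % 2 = 0
                THEN b.color
                ELSE 'grey'
           END AS color,
           -- number the branches of each company; lowest branch ID first (assumed tie-breaker)
           ROW_NUMBER() OVER (PARTITION BY c.ID ORDER BY b.ID) AS rn
    FROM company c
    LEFT JOIN company_branch cb ON cb.ID_company = c.ID
    LEFT JOIN branch b ON b.ID = cb.ID_branch
) ranked
WHERE rn = 1          -- keep exactly one row per company
ORDER BY ID
"""
main_colors = conn.execute(query).fetchall()
for row in main_colors:
    print(row)
```

If the real data has a column that marks the main branch, that column should replace b.ID in the window's ORDER BY.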

Can i merge the result of two queries?

I'm trying to retrieve some statistics from my database. Concretely, I want to show how many todos are completed versus the total for a checklist.
The structure is as follows:
A category has many cards, each card has many checklists, and each checklist has many assessments.
I can get the number of assessments, or of completed assessments, with the following query.
SELECT count(a.id) AS completed_count, a.checklist_id, ca.category_id
FROM assessments a
JOIN checklists ch ON ch.id = a.checklist_id
JOIN cards ca ON ca.id = ch.card_id
WHERE a.complete
GROUP BY a.checklist_id, ca.category_id;
This will give me something like this.
completed_count | checklist_id | category_id
-----------------+--------------+-------------
2 | 3 | 2
1 | 2 | 2
2 | 5 | 3
I could then do a query, to get the total amount, by removing the WHERE a.complete, and write some code that matches the two results.
But what i really want, is a result like this.
completed_amount | total_amount | checklist_id | category_id
------------------+--------------+--------------+-------------
2 | 2 | 3 | 2
1 | 1 | 2 | 2
2 | 2 | 5 | 3
I just can't wrap my head around, how i can achieve that.
I think conditional aggregation does what you want:
SELECT sum( (a.complete)::int ) AS completed_count,
count(*) as total_count,
a.checklist_id, ca.category_id
FROM assessments a JOIN
checklists ch
ON ch.id = a.checklist_id JOIN
cards ca
ON ca.id = ch.card_id
GROUP BY a.checklist_id, ca.category_id;
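Here is a small sketch of the same conditional aggregation that can be run locally. It uses an in-memory SQLite database, so the Postgres cast (a.complete)::int is swapped for a portable CASE expression, and complete is stored as 0/1. The rows are invented test data, not the asker's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cards (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE checklists (id INTEGER PRIMARY KEY, card_id INTEGER);
    CREATE TABLE assessments (id INTEGER PRIMARY KEY, checklist_id INTEGER, complete INTEGER);
    -- hypothetical rows: checklists 2 and 3 belong to category 2, checklist 5 to category 3
    INSERT INTO cards VALUES (1, 2), (2, 3);
    INSERT INTO checklists VALUES (2, 1), (3, 1), (5, 2);
    INSERT INTO assessments (checklist_id, complete)
    VALUES (3, 1), (3, 1), (2, 1), (2, 0), (5, 1), (5, 1), (5, 0);
    """
)

query = """
SELECT SUM(CASE WHEN a.complete THEN 1 ELSE 0 END) AS completed_count,
       COUNT(*) AS total_count,
       a.checklist_id, ca.category_id
FROM assessments a
JOIN checklists ch ON ch.id = a.checklist_id
JOIN cards ca ON ca.id = ch.card_id
GROUP BY a.checklist_id, ca.category_id
ORDER BY a.checklist_id
"""
stats = conn.execute(query).fetchall()
for row in stats:
    print(row)
```

Both counts come from a single pass over assessments, so there is no need to run two queries and match the results in application code.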

How to find whether an unordered itemset exists

I am representing itemsets in SQL (SQLite, if relevant). My tables look like this:
ITEMS table:
| ItemId | Name |
| 1 | Ginseng |
| 2 | Honey |
| 3 | Garlic |
ITEMSETS:
| ItemSetId | Name |
| ... | ... |
| 7 | GinsengHoney |
| 8 | HoneyGarlicGinseng |
| 9 | Garlic |
ITEMSETS2ITEMS
| ItemsetId | ItemId |
| ... | .... |
| 7 | 1 |
| 7 | 2 |
| 8 | 2 |
| 8 | 1 |
| 8 | 3 |
As you can see, an Itemset may contain several Items, and this relationship is detailed in the Itemset2Items table.
How can I check whether a new itemset is already in the table, and if so, find its ID?
For instance, I want to check whether "Ginseng, Garlic, Honey" is an existing itemset. The desired answer would be "Yes", because there exists a single ItemsetId which contains exactly these three IDs. Note that the set is unordered: a query for "Honey, Garlic, Ginseng" should behave identically.
How can I do this?
I would recommend that you start by placing the item sets that you want to check into a table, with one row per item.
The question is now about the overlap of this "proposed" item set to other itemsets. The following query provides the answer:
select itemsetid
from (select coalesce(ps.itemid, is2i.itemid) as itemid, is2i.itemsetid,
max(case when ps.itemid is not null then 1 else 0 end) as inProposed,
max(case when is2i.itemid is not null then 1 else 0 end) as inItemset
from ProposedSet ps full outer join
ItemSets2items is2i
on ps.itemid = is2i.itemid
group by coalesce(ps.itemid, is2i.itemid), is2i.itemsetid
) t
group by itemsetid
having min(inProposed) = 1 and min(inItemSet) = 1
This joins all the proposed items with all the itemsets. It then groups by the items in each item set, giving a flag as to whether the item is in the set. Finally, it checks that all items in an item set are in both.
Sounds like you need to find an ItemSet that:
* contains all the Items in your wanted list
* doesn't contain any other Items
This example will return the ID of such an itemset if it exists.
Note: this solution is for MySQL, but it should work in SQLite once you change the @variables into something SQLite understands, e.g. bind variables.
-- these are the IDs of the items in the new itemset
-- if you add/remove some, make sure to change the IN clauses below
set @id1 = 1;
set @id2 = 2;
-- this is the count of items listed above
set @cnt = 2;
SELECT S.ItemSetId FROM ItemSets S
INNER JOIN
(SELECT ItemsetId, COUNT(*) as C FROM ItemSets2Items
WHERE ItemId IN (@id1, @id2)
GROUP BY ItemsetId
HAVING COUNT(*) = @cnt
) I -- included ingredients
ON I.ItemsetId = S.ItemSetId
LEFT JOIN
(SELECT ItemsetId, COUNT(*) as C FROM ItemSets2Items
WHERE ItemId NOT IN (@id1, @id2)
GROUP BY ItemsetId
GROUP BY ItemsetId
) A -- additional ingredients
ON A.ItemsetId = S.ItemSetId
WHERE A.C IS NULL
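Since the question is about SQLite, here is a sketch of the same two conditions (all wanted items present, no other items present) written with bind variables through Python's sqlite3 module. It folds both checks into a single GROUP BY, which is a variation on the query above and not the same statement. The helper name find_itemset is made up for the illustration, only the ItemSets2Items rows shown in the question are loaded, and the sketch assumes each (ItemsetId, ItemId) pair is stored at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ItemSets2Items (
        ItemsetId INTEGER, ItemId INTEGER,
        PRIMARY KEY (ItemsetId, ItemId));
    INSERT INTO ItemSets2Items VALUES (7, 1), (7, 2), (8, 2), (8, 1), (8, 3);
    """
)


def find_itemset(conn, item_ids):
    """Return the ID of the itemset holding exactly item_ids, or None."""
    wanted = sorted(set(item_ids))      # the set is unordered, so normalise it
    if not wanted:
        return None
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT ItemsetId
        FROM ItemSets2Items
        GROUP BY ItemsetId
        HAVING COUNT(*) = ?                                   -- no extra items
           AND SUM(CASE WHEN ItemId IN ({placeholders})
                        THEN 1 ELSE 0 END) = ?                -- every wanted item present
    """
    row = conn.execute(query, [len(wanted), *wanted, len(wanted)]).fetchone()
    return row[0] if row else None


print(find_itemset(conn, [1, 3, 2]))   # Ginseng, Garlic, Honey
print(find_itemset(conn, [2, 1]))      # Honey, Ginseng
print(find_itemset(conn, [1, 3]))      # no itemset holds exactly these two
```

Only the placeholders are interpolated into the SQL text; the item IDs themselves are always passed as bound parameters.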