How to do a count including not existing records? - sql

How to do a count including not existing records, which should have '0' as the count?
Here is my table:
CREATE TABLE SURVEY
(year CHAR(4),
cust CHAR(2));
INSERT INTO SURVEY VALUES ('2011', 'AZ');
INSERT INTO SURVEY VALUES ('2011', 'CO');
INSERT INTO SURVEY VALUES ('2012', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'CO');
INSERT INTO SURVEY VALUES ('2014', 'ME');
INSERT INTO SURVEY VALUES ('2014', 'CO');
I've tried this, but of course it is missing zero counts:
select cust, year, count(*) as count from SURVEY
group by cust, year
I want to have this result:
+------+---------+--------+
| cust | year | count |
+------+---------+--------+
| AZ | 2011 | 1 |
| AZ | 2012 | 0 |
| AZ | 2014 | 0 |
| CO | 2011 | 1 |
| CO | 2012 | 0 |
| CO | 2014 | 2 |
| ME | 2011 | 0 |
| ME | 2012 | 1 |
| ME | 2014 | 2 |
+------+---------+--------+
Please note:
My table has many records (~10k, with different 'cust' values).
Years may not be sequential (for example, 2013 is skipped).
Over time I may have 2015, 2016 and so on.
The actual query will be executed in MS Access 2010 (not sure if it matters).
Please help, thank you!

It sounds like you want a count for every cust x year combination with a zero when no survey record exists. If this is the case you will need two more tables: customers and years then do something like:
select leftside.cust, leftside.year, count(survey.cust) from
(select * from customers, years) as leftside left join survey
on leftside.cust = survey.cust and
leftside.year = survey.year
group by leftside.cust, leftside.year

select cust, year, (select count(cust) from survey) as count
from SURVEY
group by cust, year
But this query will return count of all records, without group condition.

If you have a domain table for years and customers:
select y.year, c.cust, count(s.year) as cnt
from customer as c
cross join year as y
left join survey as s
on s.year = y.year
and s.cust = c.cust
group by y.year, c.cust
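If MS Access doesn't support CROSS JOIN, you can get the same result with a join on an always-true condition (or by listing both tables separated by a comma in the FROM clause, as in the answer above):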
from customer as c
join year as y
on 1 = 1
If you don't have domain tables, you will somehow need to "invent" the domains, since you can't create something from nothing.

If you have domain tables as others said, well and good. If you have to depend only on data in your table, the below query will do that for you.
select cp.cust, cp.year, iif(isnull(sum(cnt)), 0, sum(cnt)) as count from
(select * from (
(select distinct cust from survey) as c cross join
(select distinct year from survey) as y)
) cp left join
(select *, 1 as cnt from survey) s on cp.cust=s.cust and cp.year=s.year
group by cp.cust, cp.year
order by cp.cust,cp.year
Instead of iif(isnull(sum(cnt)), 0, sum(cnt)) you can use coalesce(sum(cnt), 0) where it is available: MS Access needs the iif form, while most other databases support coalesce.
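The "derive the domains from the table itself" idea can be tried out end to end with a small script. This is only a sketch: it runs the SQL in SQLite through Python's built-in sqlite3 module (an assumption made for runnability, not MS Access), and it uses COUNT on a column from the left-joined table so that missing combinations come out as 0 without needing iif or coalesce.

```python
import sqlite3

# In-memory SQLite database standing in for the SURVEY table from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE survey (year CHAR(4), cust CHAR(2));
INSERT INTO survey VALUES ('2011', 'AZ');
INSERT INTO survey VALUES ('2011', 'CO');
INSERT INTO survey VALUES ('2012', 'ME');
INSERT INTO survey VALUES ('2014', 'ME');
INSERT INTO survey VALUES ('2014', 'CO');
INSERT INTO survey VALUES ('2014', 'ME');
INSERT INTO survey VALUES ('2014', 'CO');
""")

# Build every cust x year pair from the distinct values present in the table,
# then left join the real rows. COUNT(s.cust) ignores the NULLs produced by
# unmatched pairs, so those pairs get a count of 0.
query = """
SELECT c.cust, y.year, COUNT(s.cust) AS cnt
FROM (SELECT DISTINCT cust FROM survey) AS c
CROSS JOIN (SELECT DISTINCT year FROM survey) AS y
LEFT JOIN survey AS s
       ON s.cust = c.cust AND s.year = y.year
GROUP BY c.cust, y.year
ORDER BY c.cust, y.year
"""
zero_filled = conn.execute(query).fetchall()

for row in zero_filled:
    print(row)
```

With the sample data this yields the nine rows asked for in the question, including the zero counts such as ('AZ', '2012', 0).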

Related

Select rows where main value has disabled status and sub value is active

I have a table containing customer agreement numbers and a status field indicating whether that agreement is active or not - 1 for active, 0 for disabled.
A main customer number contains 5 digits, from which other subagreements can be made. These other agreements are characterized by a 10 digit number, the first 5 coming from the main number and the last 5 autogenerated.
Note that not all main agreements necessarily have subagreements.
Here's a simplified snippet of the table I currently get from my query:
+-------------+----------+------------+--+
| CustNumber| CustName | CustStatus | |
+-------------+----------+------------+--+
|12345 | Cust1 | 1 | |
|1234500001 | Cust1 | 1 | |
|1234500002 | Cust1 | 0 | |
|12346 | Cust2 | 0 | |<---
|1234600001 | Cust2 | 1 | |<---
|1234600002 | Cust2 | 0 | |
+-------------+----------+------------+--+
Query:
SELECT
custnumber,
custstatus,
custname
FROM table
WHERE LEFT(custnumber, 5) IN (
SELECT LEFT(custnumber, 5)
FROM table
GROUP BY LEFT(custnumber, 5)
HAVING Count(*) > 1
)
ORDER BY custnumber,
custstatus DESC;
From here I'm pretty lost. I'm thinking something along the lines of an inner join on a subquery but I'm really not sure.
What I'm looking for is a query that selects rows with subagreement numbers that are active but where the main agreement number is disabled.
I'm new to SQL and have spent a good while searching around for similar questions, but I actually don't know how to describe this problem in a Google-friendly manner.
Join the table with itself - I am using a WITH clause for readability, but that is not necessary - and check the statuses.
with main_rows as
(
select custnumber as main_number, custname, custstatus
from mytable
where length(custnumber) = 5
)
, sub_rows as
(
select
left(custnumber, 5) as main_number,
right(custnumber, 5) as sub_number,
custname,
custstatus
from mytable
where length(custnumber) = 10
)
select
main_number,
m.custname as main_name,
s.sub_number,
s.custname as sub_name
from main_rows m
join sub_rows s using (main_number)
where m.custstatus = 0 and s.custstatus = 1
order by main_number, s.sub_number;
And here is the same thing, but shorter and just not as talkative :-)
select *
from mytable m
join mytable s on s.custnumber like m.custnumber || '_____'
where m.custstatus = 0 and s.custstatus = 1
order by s.custnumber;
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=5873044787e5fd3f32f7648dbc54a7b0
with data (CustNumber, CustName, CustStatus) as(
Select '12345' ,'Cust1',1 union all
Select '1234500001' ,'Cust1',1 union all
Select '1234500002' ,'Cust1',0 union all
Select '12346' ,'Cust2',0 union all
Select '1234600001' ,'Cust2',1 union all
Select '1234600002' ,'Cust2',0
)
,subagg (k,CustNumber, CustName, CustStatus) as(
select Left(CustNumber,5) k,CustNumber, CustName, CustStatus
from data
where len(CustNumber)=10
and CustStatus = 1
)
select s.CustNumber ActiveSubCustomer, d.CustNumber InactivePrimaryCustomer
from subagg s
join data d on d.CustNumber=s.k and d.CustStatus = 0
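The self-join used in these answers can be checked against the sample rows with a short script. This is a minimal sketch under stated assumptions: the table name agreements is hypothetical, and the SQL runs in SQLite via Python's sqlite3 module, so it uses length() and substr() rather than the LEFT()/LEN() functions of other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agreements (custnumber TEXT, custname TEXT, custstatus INTEGER);
INSERT INTO agreements VALUES
  ('12345',      'Cust1', 1),
  ('1234500001', 'Cust1', 1),
  ('1234500002', 'Cust1', 0),
  ('12346',      'Cust2', 0),
  ('1234600001', 'Cust2', 1),
  ('1234600002', 'Cust2', 0);
""")

# m = main agreements (5 digits), s = subagreements (10 digits).
# A subagreement belongs to a main agreement when its first 5 digits match.
query = """
SELECT m.custnumber AS main_number, s.custnumber AS sub_number
FROM agreements AS m
JOIN agreements AS s
  ON length(s.custnumber) = 10
 AND substr(s.custnumber, 1, 5) = m.custnumber
WHERE length(m.custnumber) = 5
  AND m.custstatus = 0   -- main agreement disabled
  AND s.custstatus = 1   -- subagreement still active
ORDER BY s.custnumber
"""
active_subs_of_disabled_main = conn.execute(query).fetchall()
print(active_subs_of_disabled_main)
```

For the sample table only subagreement 1234600001 under main agreement 12346 qualifies, which matches the two rows marked with arrows in the question.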

Aggregate functions based on current Row value

I am working with data similar to below,
week | product | sale
1 | ABC | 2
1 | ABC | 1
2 | ABC | 1
3 | ABC | 5
4 | ABC | 1
2 | DEF | 5
Let us say that is my Orders table named tblOrders. Now, in each row, I want to aggregate the total sales from last week for that product - for instance, if I am on week 2 of product "ABC", I need to show the aggregated sales amount of week 1 for product ABC. so, the output should look something like below,
week | product | sale | ProductPreviousWeekSales
1 | ABC | 2 | 0
1 | ABC | 1 | 0
2 | ABC | 1 | 3
3 | ABC | 5 | 1
4 | ABC | 1 | 5
2 | DEF | 5 | 0
I was originally thinking I could solve this using aggregates and a window function, but that doesn't look to be the case. Another thought I had was to use a conditional aggregate, something like sum(case when x=currentRow.x then sale else 0 end), but that wouldn't work either.
Here is the SQLFiddle for above sample - http://sqlfiddle.com/#!18/890b7/2
Note: I need to calculate similar value for Last 4 weeks, so trying to avoid doing this as a sub-query or multiple joins (if possible), as the data set I am working with is very large, and don't want to add to much performance overhead trying to incorporate this change.
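Here is one approach, which first aggregates your table in a separate CTE and then uses LAG to find the previous week's amount for each week and product. Note that LAG returns the previous week present in the data for that product, which is the prior calendar week only when no weeks are missing: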
WITH cte AS (
SELECT week, product,
LAG(SUM(sale)) OVER (PARTITION BY product ORDER BY week) AS lag_total_sales
FROM yourTable
GROUP BY week, product
)
SELECT t1.week, t1.product, t1.sale,
COALESCE(t2.lag_total_sales, 0) AS ProductPreviousWeekSales
FROM yourTable t1
INNER JOIN cte t2
ON t2.week = t1.week AND
t2.product = t1.product
ORDER BY
t1.product,
t1.week;
DISCLAIMER
The query I am showing below doesn't work in SQL Server, unfortunately. Up to SQL Server version 2019 the DBMS lacks full support of the RANGE clause that is essential for the query to work. Running the query in SQL Server results in
Msg 4194 Level 16 State 1 Line 1 RANGE is only supported with UNBOUNDED and CURRENT ROW window frame delimiters.
I am not deleting this answer, because this is standard SQL and the approach may help future readers. It runs fine in a lot of DBMS, and maybe a future version of SQL Server will be able to deal with this, too. I've added demos to show that it runs in PostgreSQL, MySQL and Oracle, but fails in SQL Server 2019.
ORIGINAL ANSWER
Your query shown in the fiddle (select a.*, sum(sale) over(partition by product) ProductPreviousWeekSales from tblOrder a) is merely lacking the appropriate windowing clause. As you are dealing with ties here (more than one row per product and week) this needs to be a RANGE clause:
select a.*,
sum(sale) over(partition by product
order by week range between 1 preceding and 1 preceding
) as ProductPreviousWeekSales
from tblOrder a
order by product, week;
(Use COALESCE if you want to see a zero instead of NULL.)
Demos:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=149eddbff82500d539b2c615f4167cff
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=a8453970efac08ad69275914910bb13e
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=64ed21150142caa0acb7f8c7ca7d9022
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=149eddbff82500d539b2c615f4167cff
You can do it as follows:
; WITH cteorder AS
(
SELECT DISTINCT product, week FROM dbo.tblOrder
)
SELECT
cte.*,
SUM(ISNULL(b.sale,0)) ProductPreviousWeekSales
from tblOrder a
INNER JOIN cteorder cte ON cte.product = a.product AND cte.week = a.week
LEFT JOIN dbo.tblOrder b ON b.product = cte.product AND b.week = (a.week-1)
GROUP BY cte.product,
cte.week
You need to select from TblOrders twice: once grouping by week and product and summing the sales, and a second time as a row-by-row scan of TblOrders, left-joined with the grouping query on the same product and the week offset by 1.
If the join finds no match, the sales value of the joined grouping query is NULL. You can put in 0 instead of NULL using COALESCE() or, in SQL Server, ISNULL(). ISNULL() takes a fixed number of parameters while COALESCE() takes a variable argument list, so ISNULL() may be marginally faster:
WITH
tblorders(wk,product,sales) AS (
SELECT 1,'ABC',2
UNION ALL SELECT 1,'ABC',1
UNION ALL SELECT 2,'ABC',1
UNION ALL SELECT 3,'ABC',5
UNION ALL SELECT 4,'ABC',1
UNION ALL SELECT 2,'DEF',5
)
,
grp AS (
SELECT
wk
, product
, SUM(sales) AS sales
FROM tblorders
GROUP BY
wk
, product
)
SELECT
o.wk
, o.product
, o.sales
, ISNULL(g.sales,0) AS productpreviousweeksales
FROM tblorders o
LEFT
JOIN grp g
ON o.wk - 1 = g.wk
AND o.product= g.product
ORDER BY 2,1
;
wk | product | sales | productpreviousweeksales
----+---------+-------+--------------------------
1 | ABC | 2 | 0
1 | ABC | 1 | 0
2 | ABC | 1 | 3
3 | ABC | 5 | 1
4 | ABC | 1 | 5
2 | DEF | 5 | 0
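The question also mentions needing the same figure for the last 4 weeks. One way to sketch that is to widen the join condition against the pre-aggregated weekly totals from "week offset by 1" to "any of the 4 preceding weeks" and sum the matches. The example below is an illustration under assumptions: it runs in SQLite through Python's sqlite3 module rather than SQL Server, the column alias prev4 is made up, and rowid (SQLite-specific) is used only to keep the two identical-looking week 1 rows apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblorders (wk INTEGER, product TEXT, sales INTEGER);
INSERT INTO tblorders VALUES
  (1, 'ABC', 2), (1, 'ABC', 1), (2, 'ABC', 1),
  (3, 'ABC', 5), (4, 'ABC', 1), (2, 'DEF', 5);
""")

# grp holds one total per (week, product). Each order row is then joined to
# the totals of the 4 weeks before it, and those totals are summed per row.
query = """
WITH grp AS (
    SELECT wk, product, SUM(sales) AS sales
    FROM tblorders
    GROUP BY wk, product
)
SELECT o.wk, o.product, o.sales,
       COALESCE(SUM(g.sales), 0) AS prev4
FROM tblorders AS o
LEFT JOIN grp AS g
       ON g.product = o.product
      AND g.wk BETWEEN o.wk - 4 AND o.wk - 1
GROUP BY o.rowid, o.wk, o.product, o.sales
ORDER BY o.product, o.wk, o.rowid
"""
last_four_weeks = conn.execute(query).fetchall()

for row in last_four_weeks:
    print(row)
```

For product ABC the running figure is 0, 0, 3, 4 and 9 across the five rows: week 3 sees weeks 1 and 2 (3 + 1), and week 4 sees weeks 1 to 3 (3 + 1 + 5).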

How to join 2 tables without relationship in SQL

I am struggling to combine three tables with an outer join, but can't get it right.
I am using a PostgreSQL database.
My tables look like this:
My Orders table has foreign keys to Customers and Years (these two don't have any relation to each other).
Some years and customers are not present in my Orders table.
I am trying to write an SQL query so that each customer has every year. I can get this result by cross joining customers and years, but this approach doesn't work when I continue to join the result with other tables. (My Orders table also has other foreign keys that I have to join.) So can I get this result by using an outer join instead?
I have tried:
SELECT * FROM Orders
RIGHT JOIN Customers ON Orders.customer_id = Customers.id
That gives me all the customers (even those who don't have an order). I would then like to do the same with years, so that every customer has one row for each year (2015-2020). I have tried another right join with the Years table, but it doesn't work. Does anyone know how to fix this?
P.S. The table names are not the real ones; I just used these names to make it easier to understand!
If you want every customer paired with every year, together with their orders for that year where they exist:
SELECT CUS.ID AS CustomerID, CUS.Name AS CustomerName, YEA.year AS Year, ORD.ID AS OrderID
FROM Customers AS CUS
CROSS JOIN Years AS YEA
LEFT JOIN Orders AS ORD
ON CUS.ID = ORD.Customer_ID
AND ORD.YEAR_ID = YEA.ID;
This will give you the customers and years, with their orders as separate rows.
If you want the number of orders use this instead:
SELECT CUS.ID AS CustomerID, CUS.Name AS CustomerName, YEA.year AS Year, Count(ORD.ID) AS NrOfOrders
FROM Customers AS CUS
CROSS JOIN Years AS YEA
LEFT JOIN Orders AS ORD
ON CUS.ID = ORD.Customer_ID
AND ORD.YEAR_ID = YEA.ID
GROUP BY CUS.ID, CUS.Name, YEA.Year;
Which will show the number of orders and give each customer a row per year.
Try it out/see it in action here, I added some data to show the multiple orders.
A CROSS JOIN of customers and years, then a LEFT JOIN of that with orders seems to do exactly what you describe in your post. Or am I getting something wrong?
And don't put your data into an image next time. Just type or paste it into your post, so we can copy-paste it into our examples ... I had to re-type it.
And "name" and "year" are SQL key words (PostgreSQL treats them as non-reserved, but they are still best avoided as identifiers). I avoided them in my example.
\pset null NULL
WITH
-- your input ...
customers ( id, nm) AS (
SELECT 1,'alpha'
UNION ALL SELECT 2,'bravo'
UNION ALL SELECT 3,'charlie'
UNION ALL SELECT 4,'delta'
UNION ALL SELECT 5,'echo'
UNION ALL SELECT 6,'foxtrot'
)
,
years(id,yr) AS (
SELECT 1,2015
UNION ALL SELECT 2,2016
UNION ALL SELECT 3,2017
UNION ALL SELECT 4,2018
UNION ALL SELECT 5,2019
UNION ALL SELECT 6,2020
)
,
orders(id,cust_id,yr_id) AS (
SELECT 1,1,1
UNION ALL SELECT 2,2,3
UNION ALL SELECT 3,4,5
UNION ALL SELECT 4,5,6
)
-- end of your input ...
SELECT
cust.nm
, years.yr
, ord.id
FROM customers cust
CROSS JOIN years
LEFT JOIN orders ord ON ord.yr_id=years.id
AND ord.cust_id=cust.id
ORDER BY 1,2
;
-- out nm | yr | id
-- out ---------+------+------
-- out alpha | 2015 | 1
-- out alpha | 2016 | NULL
-- out alpha | 2017 | NULL
-- out alpha | 2018 | NULL
-- out alpha | 2019 | NULL
-- out alpha | 2020 | NULL
-- out bravo | 2015 | NULL
-- out bravo | 2016 | NULL
-- out bravo | 2017 | 2
-- out bravo | 2018 | NULL
-- out bravo | 2019 | NULL
-- out bravo | 2020 | NULL
-- out charlie | 2015 | NULL
-- out charlie | 2016 | NULL
-- out charlie | 2017 | NULL
-- out charlie | 2018 | NULL
-- out charlie | 2019 | NULL
-- out charlie | 2020 | NULL
-- out delta | 2015 | NULL
-- out delta | 2016 | NULL
-- out delta | 2017 | NULL
-- out delta | 2018 | NULL
-- out delta | 2019 | 3
-- out delta | 2020 | NULL
-- out echo | 2015 | NULL
-- out echo | 2016 | NULL
-- out echo | 2017 | NULL
-- out echo | 2018 | NULL
-- out echo | 2019 | NULL
-- out echo | 2020 | 4
-- out foxtrot | 2015 | NULL
-- out foxtrot | 2016 | NULL
-- out foxtrot | 2017 | NULL
-- out foxtrot | 2018 | NULL
-- out foxtrot | 2019 | NULL
-- out foxtrot | 2020 | NULL

How to return all rows with MAX value meeting a condition of another field in SQL?

I have the following costs table:
+------+----+--------+
| Year | ID | Amount |
+------+----+--------+
| 1960 | 1  | 100    |
| 1960 | 2  | 200    |
| 1960 | 3  | 200    |
| 1960 | 4  | 150    |
| 1961 | 1  | 300    |
| 1961 | 2  | 200    |
| 1961 | 3  | 100    |
| 1961 | 4  | 300    |
+------+----+--------+
I want all ID’s having the MAX Amount by Year. For example, for 1960, I want rows with ID's 2 and 3. For 1961, I want rows with ID's 1 and 4.
SELECT Year, ID, Amount FROM costs WHERE Amount = (SELECT MAX(Amount) FROM costs);
The above gets me the rows with the MAX Amount across all years. But I want only the rows with the max Amount within each year. How do I add a condition so that the maximum is taken per Year (for example, only among records with Year = 1960)?
Please try the query below, which I have tested. It uses a correlated subquery to compare each row with the maximum amount for its own year:
SELECT
t1.*
FROM
costs t1
WHERE
t1.amount = (
SELECT
MAX(t2.amount)
FROM
costs t2
WHERE
t2.`year` = t1.`year`
);
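Try this; it should work in databases that support row-value comparisons with IN (for example MySQL, PostgreSQL and Oracle; SQL Server does not accept this syntax):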
SELECT
*
FROM
costs
WHERE
(YEAR, amount) IN (
SELECT
YEAR,
max(amount)
FROM
costs
GROUP BY
YEAR
);
One option which should run on all major databases is to use a subquery which finds the max amounts for each year to select the records you want:
SELECT c1.*
FROM costs c1
INNER JOIN
(
SELECT Year, MAX(Amount) AS MaxAmount
FROM costs
GROUP BY Year
) c2
ON c1.Year = c2.Year AND
c1.Amount = c2.MaxAmount
Another way to do this would be to use a correlated subquery:
SELECT c1.*
FROM costs c1
WHERE c1.Amount = (SELECT MAX(c2.Amount) FROM costs c2 WHERE c2.Year = c1.Year)
I expect that joining (the first option) would be the fastest method for larger tables, especially if you have proper indices which could be used.
SELECT T1.Year, T1.ID, T1.Amount
FROM #Table T1
JOIN
(
SELECT MAX(Amount) Amount,Year
FROM #Table
GROUP BY Year
) A ON A.Year = T1.Year AND A.Amount = T1.Amount
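To see the per-year maximum join return tied rows, the sample costs table can be loaded into a throwaway database. The sketch below assumes SQLite via Python's sqlite3 module purely for convenience. The SQL follows the join-on-grouped-subquery pattern from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE costs (year INTEGER, id INTEGER, amount INTEGER);
INSERT INTO costs VALUES
  (1960, 1, 100), (1960, 2, 200), (1960, 3, 200), (1960, 4, 150),
  (1961, 1, 300), (1961, 2, 200), (1961, 3, 100), (1961, 4, 300);
""")

# The subquery computes one maximum per year. Joining back on both year and
# amount keeps every row that reaches that maximum, so ties are preserved.
query = """
SELECT c1.year, c1.id, c1.amount
FROM costs AS c1
JOIN (
    SELECT year, MAX(amount) AS max_amount
    FROM costs
    GROUP BY year
) AS c2
  ON c1.year = c2.year AND c1.amount = c2.max_amount
ORDER BY c1.year, c1.id
"""
max_rows_per_year = conn.execute(query).fetchall()

for row in max_rows_per_year:
    print(row)
```

This returns IDs 2 and 3 for 1960 and IDs 1 and 4 for 1961, as requested in the question.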

Grouping in SQL Table [closed]

Suppose I have a Table such that:
|ID | product |orderid | brand |number of product cust ord|
|----|---------|--------|-------|--------------------------|
| 1 | 123 | 111 | br | 1 |
|----|---------|--------|-------|--------------------------|
| 1 | 234 | 111 | br | 1 |
|----|---------|--------|-------|--------------------------|
| 1 | 345 | 333 | br | 1 |
|----|---------|--------|-------|--------------------------|
| 2 | 123 | 211 | br | 1 |
|----|---------|--------|-------|--------------------------|
| 2 | 456 | 212 | br | 2 |
|----|---------|--------|-------|--------------------------|
| 3 | 567 | 213 | br | 1 |
|----|---------|--------|-------|--------------------------|
What I'd like to do is group them as:
|ID | brand |number of product cust ord|
|----|---------|--------------------------|
| 1 | br | 3 |
|----|---------|--------------------------|
| 2 | br | 4 |
|----|---------|--------------------------|
Further to that, I'd like to classify them; I tried a CASE ... WHEN but can't seem to get it right.
If an ID purchases more than 3 unique products and orders more than twice, I'd like to call them a 'frequent buyer' (in the above example, ID '1' would be a 'frequent buyer'); if the average number of products they purchase is higher than the average number of that product sold, I'd like to call them a 'merchant'; otherwise just a 'purchaser'.
I've renamed the last field to qty for brevity and called the table test1.
To get frequent buyers, use the query below. Note that I used >= instead of >. I changed this based on your example, where ID 1 is a "frequent buyer" even though they bought exactly 3 products, not more than 3.
SELECT ID, count(distinct product) as DistinctProducts, count(distinct orderid) DistinctOrders
FROM test1
GROUP BY ID
HAVING count(distinct product) >= 3 and count(distinct orderid) >= 2
I'm not sure if I understood the merchant logic correctly. The query below returns customers whose average quantity for some product is higher than the overall average quantity for that product. There are none in the sample data.
SELECT DISTINCT c.ID
FROM
(select ID, product, avg(qty) as AvgQty
FROM test1
GROUP BY ID, product) as c
FULL OUTER JOIN
(select product, avg(qty) as AvgQty
FROM test1
GROUP BY product) p ON p.product = c.product
WHERE c.AvgQty > p.AvgQty;
To get "purchasers", do an EXCEPT between all customers and the UNION of merchants and frequent buyers:
select distinct ID from test1
EXCEPT
(SELECT ID FROM (
select ID, count(distinct product) as DistinctProducts, count(distinct orderid) DistinctOrders
FROM test1
GROUP BY ID
HAVING count(distinct product) >= 3 and count(distinct orderid) >= 2) t
UNION
SELECT DISTINCT c.ID
FROM
(select ID, product, avg(qty) as AvgQty
FROM test1
GROUP BY ID, product) as c
FULL OUTER JOIN
(select product, avg(qty) as AvgQty
FROM test1
GROUP BY product) p ON p.product = c.product
WHERE c.AvgQty > p.AvgQty
);
This is one way that you could do it. Note that according to the description you gave, buyers could be constantly being reclassified between 'Merchant' and 'Purchaser' as the average goes up and down. That might not be what you want.
With cte As (
Select ID,
Brand,
DistinctOrders = Count(Distinct OrderID), -- How many separate orders by this customer for the brand?
DistinctProducts = Count(Distinct Product), -- How many different products by this customer for the brand?
[number of product cust ord] = Sum(CountOfProduct), -- Total number of items by this customer for the brand.
AverageCountOfProductPerBuyer =
Sum(Sum(CountOfProduct)) Over () * 1.0 / (Select Count(*) From (Select Distinct ID, Brand From #table) As tbl)
-- Average number of items per customer (for all customers) for this brand
From #table
Group By ID, Brand)
Select ID, Brand, DistinctOrders, DistinctProducts, [number of product cust ord],
IsFrequentBuyer = iif(DistinctOrders > 1 And DistinctProducts > 2, 'Frequent Buyer', NULL),
IsMerchant = iif(AverageCountOfProductPerBuyer < [number of product cust ord], 'Merchant', 'Purchaser')
From cte;
This query could be written without the common-table expression, but was written this way to avoid defining expressions multiple times.
Note that I have the first ID as a 'Frequent Buyer' based on your description, so I'm assuming that when you say 'more than 3 unique products' you mean 3 or more. Likewise with two or more distinct orders.
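Since the question mentions a CASE ... WHEN attempt, here is one way the three labels could be produced in a single result. It is a sketch under several assumptions: the table is called test1 with the last column renamed qty (as in the first answer), the thresholds are read as "3 or more products" and "2 or more orders" (as both answers do), the merchant rule follows the first answer's interpretation, and 'frequent buyer' takes precedence over 'merchant' when both apply. It runs in SQLite via Python's sqlite3 module rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test1 (id INTEGER, product INTEGER, orderid INTEGER,
                    brand TEXT, qty INTEGER);
INSERT INTO test1 VALUES
  (1, 123, 111, 'br', 1), (1, 234, 111, 'br', 1), (1, 345, 333, 'br', 1),
  (2, 123, 211, 'br', 1), (2, 456, 212, 'br', 2), (3, 567, 213, 'br', 1);
""")

query = """
WITH per_cust AS (            -- distinct products and orders per customer
    SELECT id,
           COUNT(DISTINCT product) AS n_products,
           COUNT(DISTINCT orderid) AS n_orders
    FROM test1
    GROUP BY id
),
merchants AS (                -- customers above the overall average for some product
    SELECT DISTINCT c.id
    FROM (SELECT id, product, AVG(qty) AS avg_qty
          FROM test1 GROUP BY id, product) AS c
    JOIN (SELECT product, AVG(qty) AS avg_qty
          FROM test1 GROUP BY product) AS p
      ON p.product = c.product
    WHERE c.avg_qty > p.avg_qty
)
SELECT pc.id,
       CASE
           WHEN pc.n_products >= 3 AND pc.n_orders >= 2 THEN 'frequent buyer'
           WHEN pc.id IN (SELECT id FROM merchants)     THEN 'merchant'
           ELSE 'purchaser'
       END AS category
FROM per_cust AS pc
ORDER BY pc.id
"""
categories = conn.execute(query).fetchall()
print(categories)
```

With the sample rows, ID 1 comes out as a frequent buyer and IDs 2 and 3 as purchasers; nobody qualifies as a merchant, consistent with the first answer's observation.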