unexpected output with group by - sql

Here is the DDL and DML:
create table fb_customers (
customer_id int,
customer_name varchar,
product_bought varchar);
insert into fb_customers values
(1, 'james', 'A'),(2, 'james', 'B'), (3, 'james', 'A'), (4, 'james', 'C'), (5, 'ada', 'A'), (6, 'ada', 'A'), (7, 'Tom', 'B'), (8, 'Leo', 'C');
When I select customer_name and product_bought with GROUP BY:
select customer_name, product_bought from fb_customers
group by customer_name, product_bought;
Surprisingly, the result is not ordered by name. I expected GROUP BY to return the rows grouped by customer_name and product_bought, but the james rows are scattered through the output instead of appearing together.
Now I'd like to find the customers who bought both A and B, and here is my query:
select customer_name from fb_customers
group by customer_name, product_bought
having sum(case when product_bought = 'A' then 1
when product_bought = 'B' then 1
else 0
end) = 2;
and the result surprisingly includes ada, who should not be there.
Where am I going wrong in these two cases? Thanks

You can try the query below:
select customer_name from fb_customers
where product_bought in ('A','B')
group by customer_name
having count(distinct product_bought)=2
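As a quick check, here is a minimal sketch that runs this query against the sample data. It uses SQLite through Python's sqlite3 module as a stand-in for the asker's database, which is an assumption. It also shows that GROUP BY by itself does not promise any row order, so an explicit ORDER BY is added to keep each customer's rows together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fb_customers (customer_id INT, customer_name TEXT, product_bought TEXT);
INSERT INTO fb_customers VALUES
  (1,'james','A'),(2,'james','B'),(3,'james','A'),(4,'james','C'),
  (5,'ada','A'),(6,'ada','A'),(7,'Tom','B'),(8,'Leo','C');
""")

# Customers who bought both A and B: keep only A/B rows, then count distinct products.
both = conn.execute("""
SELECT customer_name FROM fb_customers
WHERE product_bought IN ('A','B')
GROUP BY customer_name
HAVING COUNT(DISTINCT product_bought) = 2
ORDER BY customer_name
""").fetchall()

# GROUP BY only collapses duplicates; ORDER BY is what keeps james's rows adjacent.
grouped = conn.execute("""
SELECT customer_name, product_bought FROM fb_customers
GROUP BY customer_name, product_bought
ORDER BY customer_name, product_bought
""").fetchall()

print(both)
print(grouped)
```

With SQLite's default binary collation, uppercase names sort before lowercase ones.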

Your CASE expression adds 1 for every row where product_bought = 'A' and 1 for every row where product_bought = 'B', so the sum counts rows, not distinct products. Look at ada:
ada A +1
ada A +1
Since ada has two 'A' rows, the sum is 2:
sum for ada => 1+1 = 2
so ada passes the = 2 test without ever buying 'B'. Now look at james:
james A +1
james B +1
james A +1
james C 0
Since james has two 'A' rows and one 'B' row, his sum per customer is 3:
sum for james => 1+1+1 = 3
Per customer, only ada has a sum of exactly 2. Because your query also groups by product_bought, james's two 'A' rows form a group of their own whose sum is 2, which is the only reason he appears in your result.
If you want to keep your approach, you can edit your query as below:
select customer_name from fb_customers
group by customer_name
having sum(case when product_bought = 'A' then 1
else 0
end) >= 1
and
sum(case when product_bought = 'B' then 1
else 0
end) >= 1
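To compare the two versions side by side, the sketch below runs the original query and the edited one on the sample data. SQLite via Python's sqlite3 module is assumed here purely as a convenient test bed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fb_customers (customer_id INT, customer_name TEXT, product_bought TEXT);
INSERT INTO fb_customers VALUES
  (1,'james','A'),(2,'james','B'),(3,'james','A'),(4,'james','C'),
  (5,'ada','A'),(6,'ada','A'),(7,'Tom','B'),(8,'Leo','C');
""")

# Original query: one group per (customer, product), so two 'A' rows alone reach 2.
original = conn.execute("""
SELECT customer_name FROM fb_customers
GROUP BY customer_name, product_bought
HAVING SUM(CASE WHEN product_bought = 'A' THEN 1
                WHEN product_bought = 'B' THEN 1
                ELSE 0 END) = 2
ORDER BY customer_name
""").fetchall()

# Edited query: one group per customer, with a separate test for each product.
fixed = conn.execute("""
SELECT customer_name FROM fb_customers
GROUP BY customer_name
HAVING SUM(CASE WHEN product_bought = 'A' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN product_bought = 'B' THEN 1 ELSE 0 END) >= 1
""").fetchall()

print(original)
print(fixed)
```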

Related

create pivot table with 2 dimensions

I have 2 tables:
CREATE TABLE loans
(
loan_id int,
client_id int,
loan_date date
);
CREATE TABLE clients
(
client_id int,
client_name varchar(20),
gender varchar(20)
);
INSERT INTO CLIENTS
VALUES (1, 'arnold', 'male'),
(2, 'lilly', 'female'),
(3, 'betty', 'female'),
(4, 'tom', 'male'),
(5, 'jim', 'male');
INSERT INTO loans
VALUES (1, 1, '20220522'),
(2, 2, '20220522'),
(3, 3, '20220525'),
(4, 4, '20220525'),
(5, 1, '20220527'),
(6, 2, '20220527'),
(7, 3, '20220601'),
(8, 1, '20220603'),
(9, 2, '20220603'),
(10, 1, '20220603');
I need to count the loan contracts per year, broken down by the client's contract serial number and by sex.
Output should be:
|sex    |1 contract, 2022|2 contract, 2022|3 contract, 2022|
|male   |2               |2               |1               |
|female |4               |1               |1               |
"1 contract", "2 contract" and so on are the serial number of each contract for a given client.
This probably needs crosstab, but I could not get it to work with a CTE.
I would like the serial-number and year columns to be generated automatically, because the period spans several years and a large number of contracts.
with cte as
(
select
l.client_id,
loan_date,
extract(year from loan_date) as year,
client_name,
gender,
row_number() over (partition by l.client_id order by loan_date asc) as serial_number_contact
from
loans l
inner join
clients c on l.client_id = c.client_id
)
select
gender,
year,
serial_number_contact,
count(*) as count_loan
from
cte
group by
gender, year, serial_number_contact
order by
serial_number_contact, year
;with cte as(
select gender,
row_number() over (partition by l.client_id order by l.client_id,l.loan_date) serial
from loans l
inner join clients c
on c.client_id=l.client_id)
select gender
,sum(case when serial=1 then 1 else 0 end) as "1 Contract 2022"
,sum(case when serial=2 then 1 else 0 end) as "2 Contract 2022"
,sum(case when serial=3 then 1 else 0 end) as "3 Contract 2022"
from cte group by gender
Test
http://sqlfiddle.com/#!17/0d0b93/3
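The query above hard-codes three columns and ignores the year. One possible way to get the automatically generated columns the question asks for is to build the SUM(CASE ...) list from the data first. The sketch below does that with Python and SQLite, both assumed stand-ins for the asker's PostgreSQL setup. The dates are rewritten in ISO form so that SQLite's strftime can read them, and loan_id is added as a tie-breaker for loans taken on the same day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (client_id INT, client_name TEXT, gender TEXT);
CREATE TABLE loans (loan_id INT, client_id INT, loan_date TEXT);
INSERT INTO clients VALUES (1,'arnold','male'),(2,'lilly','female'),
  (3,'betty','female'),(4,'tom','male'),(5,'jim','male');
INSERT INTO loans VALUES (1,1,'2022-05-22'),(2,2,'2022-05-22'),(3,3,'2022-05-25'),
  (4,4,'2022-05-25'),(5,1,'2022-05-27'),(6,2,'2022-05-27'),(7,3,'2022-06-01'),
  (8,1,'2022-06-03'),(9,2,'2022-06-03'),(10,1,'2022-06-03');
""")

# Number each client's loans in date order (needs SQLite 3.25+ for window functions).
numbered = """
SELECT c.gender,
       CAST(strftime('%Y', l.loan_date) AS INTEGER) AS year,
       row_number() OVER (PARTITION BY l.client_id
                          ORDER BY l.loan_date, l.loan_id) AS serial
FROM loans l JOIN clients c ON c.client_id = l.client_id
"""

# Step 1: find which (serial, year) pairs exist in the data.
pairs = conn.execute(
    f"SELECT DISTINCT serial, year FROM ({numbered}) ORDER BY year, serial"
).fetchall()

# Step 2: generate one conditional-aggregation column per pair (values are integers).
columns = ", ".join(
    f'SUM(CASE WHEN serial = {s} AND year = {y} THEN 1 ELSE 0 END) AS "{s} contract, {y}"'
    for s, y in pairs
)

cur = conn.execute(
    f"SELECT gender, {columns} FROM ({numbered}) GROUP BY gender ORDER BY gender"
)
headers = [d[0] for d in cur.description]
rows = cur.fetchall()
print(headers)
print(rows)
```

On the sample rows this produces a fourth column as well, because arnold has four loans.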

SQL query to fetch distinct records

Can someone help me with a PostgreSQL query that I have to write but can't come up with? I have simplified the problem as far as I can, down from 1 million records and more constraints. I know it looks easy, but I am still unable to solve it:
Table_name = t
Column_1_name = id
Column_2_name = st
Column_1_elements = [1,1,1,1,2,2,2,3,3]
Column_2_elements = [a,b,c,d,a,c,d,b,d]
Now I want to print the distinct ids that are missing 'a' or 'b' among their corresponding st values.
For the example above, the output should be [2,3], as 2 does not have a corresponding 'b' and 3 does not have 'a' (3 does not have 'c' either, but we are not concerned about 'c'). id=1 is not returned because it is related to both 'a' and 'b'.
Let me know if you need more clarity.
Thanks in advance for helping.
edit1: The number of rows for id = 1,2,3 could be anything. I just want those ids whose corresponding st values do not "contain" 'a' or 'b'.
Suppose there is an id=4 which has just one st, 'r', and an id=5 which contains 'a','b','c','d','e','f','k','z'.
Then we want id=4 in the output as well, since it does not contain 'a' or 'b'.
You might need to adjust the syntax a little for your SQL engine, but this is a working solution in Google BigQuery:
with temp as (
select 1 as id, 'a' as st union all
select 1 as id, 'b' as st union all
select 1 as id, 'c' as st union all
select 1 as id, 'd' as st union all
select 2 as id, 'a' as st union all
select 2 as id, 'c' as st union all
select 2 as id, 'd' as st union all
select 3 as id, 'b' as st union all
select 3 as id, 'd' as st union all
select 4 as id, 'e' as st union all
select 5 as id, 'g' as st union all
select 5 as id, 'h' as st
)
-- add 2 columns for is_a and is_b flags
, temp2 as (
select *
, case when st = 'a' then 1 else 0 end is_a
,case when st = 'b' then 1 else 0 end as is_b
from temp
)
-- IDs that have both the flags as 1 should be filtered out (like ID = 1)
select id
from temp2
group by 1
having max(is_a) + max(is_b) < 2
This solution takes care of the problem you mentioned with ID 4 . Let me know if this works for you.
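For readers without BigQuery, the same flag idea can be tried elsewhere. The sketch below assumes SQLite through Python's sqlite3 module and uses the rows from the question's edit (id 4 with only 'r', id 5 with both letters).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, st TEXT)")
rows = [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'd'),
        (3, 'b'), (3, 'd'), (4, 'r'),
        (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd'), (5, 'e'), (5, 'f'), (5, 'k'), (5, 'z')]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# MAX of each flag is 1 if the id has that letter at least once;
# a total below 2 means at least one of 'a' and 'b' is missing.
ids = [r[0] for r in conn.execute("""
SELECT id FROM t
GROUP BY id
HAVING MAX(CASE WHEN st = 'a' THEN 1 ELSE 0 END)
     + MAX(CASE WHEN st = 'b' THEN 1 ELSE 0 END) < 2
ORDER BY id
""")]
print(ids)
```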
See if this works:
create table t (id integer, st varchar);
insert into t values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'd'), (3, 'b'), (3, 'd'), (4, 'r');
insert into t values (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd'), (5, 'e'), (5, 'f'), (5, 'k'), (5, 'z');
select id, array['a', 'b'] <@ array_agg(st)::text[] as tf from t group by id;
id | tf
----+----
3 | f
5 | t
4 | f
2 | f
1 | t
select * from (select id, array['a', 'b'] <@ array_agg(st)::text[] as tf from t group by id) as agg where agg.tf = 'f';
id | tf
----+----
3 | f
4 | f
2 | f
In the first select query, array_agg(st) aggregates all the st values for an id via the group by id. array['a', 'b'] <@ array_agg(st)::text[] then asks whether a and b are both contained in that aggregated array.
The query is then turned into a sub-query, where the outer query selects the rows that were 'f' (false), in other words the ids that did not have both a and b in their aggregated values.
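The "is contained in" test is a plain subset check. As an illustration only, outside the database, the sketch below reproduces the tf column with Python sets; the variable names are hypothetical.

```python
from collections import defaultdict

rows = [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'd'),
        (3, 'b'), (3, 'd'), (4, 'r'),
        (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd'), (5, 'e'), (5, 'f'), (5, 'k'), (5, 'z')]

# Equivalent of array_agg(st) ... group by id: collect the st values per id.
by_id = defaultdict(set)
for id_, st in rows:
    by_id[id_].add(st)

required = {'a', 'b'}
# Subset test: True only when both 'a' and 'b' appear for the id.
tf = {id_: required <= sts for id_, sts in by_id.items()}
missing = sorted(id_ for id_, ok in tf.items() if not ok)
print(tf)
print(missing)
```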

Selecting rows from a table with specific values per id

I have the below table
Table 1
Id WFID data1 data2
1 12 'd' 'e'
1 13 '3' '4f'
1 15 'e' 'dd'
2 12 'f' 'ee'
3 17 'd' 'f'
2 17 'd' 'f'
4 12 'd' 'f'
5 20 'd' 'f'
From this table I want to select only the ids whose WFID values are exclusively 12 and/or 17. For this table that means the distinct ids 2, 3 and 4. 1 is excluded because it has 12 but also has 13 and 15. 5 is excluded because it has 20.
2 is included because it has just 12 and 17.
3 is included because it has just 17.
4 is included because it has just 12.
If you just want the list of distinct ids that satisfy the conditions, you can use aggregation and filter with a having clause:
select id
from mytable
group by id
having max(case when wfid not in (12, 17) then 1 else 0 end) = 0
This filters out groups that have any wfid other than 12 or 17.
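A minimal sketch of this HAVING filter on the question's rows, assuming SQLite via Python's sqlite3 module and the table name mytable used in the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT, wfid INT, data1 TEXT, data2 TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'), (2, 12, 'f', 'ee'),
    (3, 17, 'd', 'f'), (2, 17, 'd', 'f'), (4, 12, 'd', 'f'), (5, 20, 'd', 'f'),
])

# The flag is 1 for any row outside (12, 17); MAX = 0 means the id has no such row.
ids = [r[0] for r in conn.execute("""
SELECT id FROM mytable
GROUP BY id
HAVING MAX(CASE WHEN wfid NOT IN (12, 17) THEN 1 ELSE 0 END) = 0
ORDER BY id
""")]
print(ids)
```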
If you want the entire corresponding rows, then window functions are more appropriate:
select *
from (
select t.*,
max(case when wfid not in (12, 17) then 1 else 0 end) over(partition by id) flag
from mytable t
) t
where flag = 0
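The window-function variant can be tried the same way. The sketch below again assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module, and selects the original columns explicitly so that the helper flag column is left out of the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT, wfid INT, data1 TEXT, data2 TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'), (2, 12, 'f', 'ee'),
    (3, 17, 'd', 'f'), (2, 17, 'd', 'f'), (4, 12, 'd', 'f'), (5, 20, 'd', 'f'),
])

# The windowed MAX copies the per-id flag onto every row, so whole rows can be filtered.
full_rows = conn.execute("""
SELECT id, wfid, data1, data2
FROM (
  SELECT t.*,
         MAX(CASE WHEN wfid NOT IN (12, 17) THEN 1 ELSE 0 END)
           OVER (PARTITION BY id) AS flag
  FROM mytable t
) AS flagged
WHERE flag = 0
ORDER BY id, wfid
""").fetchall()
print(full_rows)
```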
You really need to start thinking in terms of sets. And it helps everyone if you provide a script that can be used to experiment and demonstrate. Here is another approach using the EXCEPT operator. The idea is to first generate a set of IDs that we want based on the filter. You then generate a set of IDs that we do not want. Using EXCEPT we can then remove the 2nd set from the 1st.
declare @x table (Id tinyint, WFID tinyint, data1 char(1), data2 varchar(4));
insert @x (Id, WFID, data1, data2) values
(1, 12, 'd', 'e'),
(1, 13, '3', '4f'),
(1, 15, 'e', 'dd'),
(2, 12, 'f', 'ee'),
(3, 17, 'd', 'f'),
(2, 17, 'd', 'f'),
(4, 12, 'd', 'f'),
(2, 12, 'z', 'ef'),
(5, 20, 'd', 'f');
select * from @x;
select id from @x where WFID not in (12, 17);
select id from @x where WFID in (12, 17)
except
select id from @x where WFID not in (12, 17);
Notice the added row to demonstrate what happens when there are "duplicates".
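The EXCEPT approach carries over to other engines. The sketch below assumes SQLite via Python's sqlite3 module, with an ordinary table named x standing in for the table variable, and includes the extra duplicate row. EXCEPT returns distinct values, so id 2 appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (Id INT, WFID INT, data1 TEXT, data2 TEXT)")
conn.executemany("INSERT INTO x VALUES (?, ?, ?, ?)", [
    (1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'), (2, 12, 'f', 'ee'),
    (3, 17, 'd', 'f'), (2, 17, 'd', 'f'), (4, 12, 'd', 'f'), (2, 12, 'z', 'ef'),
    (5, 20, 'd', 'f'),
])

# Set of wanted ids minus the set of ids that have any other WFID.
ids = [r[0] for r in conn.execute("""
SELECT Id FROM x WHERE WFID IN (12, 17)
EXCEPT
SELECT Id FROM x WHERE WFID NOT IN (12, 17)
ORDER BY Id
""")]
print(ids)
```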

SQL - Combination of Distinct & Count in a table

Need a simple query to summarize result from a table where 3 columns are present:
Order ID, Category & Brand.
The summary should contain order ID, distinct count of category and distinct count of brand belonging to the order ID.
Sample Data:
orderno product brand
1 A Z
1 A X
1 B Y
2 C X
2 B X
3 C X
3 B Y
Expected Result:
orderno product brand
1 2 3
2 2 1
3 2 2
Use COUNT(DISTINCT ...) to get your expected result:
select orderno, count(distinct product) as product, count(distinct brand) as brand
from testtable
group by orderno
Sample execution with the given data
declare @test1 table (orderno int, product varchar(2), brand varchar(2))
insert into @test1 (orderno, product, brand) values
(1, 'A', 'Z'),
(1, 'A', 'X'),
(1, 'B', 'Y'),
(2, 'C', 'X'),
(2, 'B', 'X'),
(3, 'C', 'X'),
(3, 'B', 'Y');
select orderno, count(distinct product) as product, count(distinct brand) as brand
from @test1
group by orderno
Result:
orderno product brand
1 2 3
2 2 1
3 2 2
Try this:
select orderno,count(DISTINCT product) product,COUNT(DISTINCT brand) brand
from data
GROUP by orderno
The output:
orderno product brand
----------- ----------- -----------
1 2 3
2 2 1
3 2 2
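Both answers rely on COUNT(DISTINCT ...) per group. A small sketch that reproduces the expected result, assuming SQLite via Python's sqlite3 module and the table name testtable from the first answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (orderno INT, product TEXT, brand TEXT)")
conn.executemany("INSERT INTO testtable VALUES (?, ?, ?)", [
    (1, 'A', 'Z'), (1, 'A', 'X'), (1, 'B', 'Y'),
    (2, 'C', 'X'), (2, 'B', 'X'),
    (3, 'C', 'X'), (3, 'B', 'Y'),
])

# Each COUNT(DISTINCT ...) is evaluated independently within the order's group.
summary = conn.execute("""
SELECT orderno,
       COUNT(DISTINCT product) AS product,
       COUNT(DISTINCT brand)   AS brand
FROM testtable
GROUP BY orderno
ORDER BY orderno
""").fetchall()
print(summary)
```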

ADD Specific values in SQL Column determined by other Column

I have a Database that determines different values based on a label.
Where the label determines whether it's an exempted value or not.
For instance, 2 = non exempted and 3 = exempted. If I run a query my results look something like this
|Name |ExemptionStatus |Total Value|
|X |2 |100 |
|X |3 |200 |
My Query is
SELECT NAME, EXEMPTIONSTATUS,
SUM(TOTAL_VALUE) AS 'TOTAL VALUE'
FROM ORDER_ACCOUNT JOIN ACCOUNT_INVOICE
WHERE ORDER_ACCOUNT.DATE BETWEEN 'M/D/YEAR' AND 'M/D/YEAR'
GROUP BY NAME, EXEMPTIONSTATUS
ORDER BY NAME ASC
How can I get my query to create a new column for the values, for example:
|Name |NON EXEMPT VALUE|EXEMPT VALUE|
|X |100 |200 |
I just don't know how I would split it out, or whether it belongs in my WHERE clause or not.
Use a CASE expression inside SUM to total only the NON EXEMPT rows, then only the EXEMPT rows, and select the two sums as separate columns. EXEMPTIONSTATUS has to come out of the GROUP BY, otherwise you still get one row per status. Something like the following:
SELECT
NAME
,SUM(CASE WHEN EXEMPTIONSTATUS = 2 THEN TOTAL_VALUE ELSE 0 END) AS 'NON EXEMPT VALUE'
,SUM(CASE WHEN EXEMPTIONSTATUS = 3 THEN TOTAL_VALUE ELSE 0 END) AS 'EXEMPT VALUE'
FROM ORDER_ACCOUNT JOIN ACCOUNT_INVOICE
WHERE ORDER_ACCOUNT.DATE BETWEEN 'M/D/YEAR' AND 'M/D/YEAR'
GROUP BY NAME
ORDER BY NAME ASC
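To see why the GROUP BY columns matter, here is a small sketch assuming SQLite via Python's sqlite3 module. The single table invoice_totals is hypothetical and stands in for the joined ORDER_ACCOUNT and ACCOUNT_INVOICE tables, whose join condition the question does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined tables in the question.
conn.execute("CREATE TABLE invoice_totals (name TEXT, exemptionstatus INT, total_value INT)")
conn.executemany("INSERT INTO invoice_totals VALUES (?, ?, ?)",
                 [('X', 2, 100), ('X', 3, 200)])

sums = """
SELECT name,
       SUM(CASE WHEN exemptionstatus = 2 THEN total_value ELSE 0 END) AS non_exempt_value,
       SUM(CASE WHEN exemptionstatus = 3 THEN total_value ELSE 0 END) AS exempt_value
FROM invoice_totals
"""

# Grouping by name alone folds both statuses into one row per name.
by_name = conn.execute(sums + "GROUP BY name ORDER BY name").fetchall()

# Keeping exemptionstatus in the GROUP BY leaves one row per status.
by_name_and_status = conn.execute(
    sums + "GROUP BY name, exemptionstatus ORDER BY name, exemptionstatus"
).fetchall()

print(by_name)
print(by_name_and_status)
```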
EDIT: The new code below adds the two columns to your existing rows. You will need to replace #Test with your own tables, but I believe this will get you what you're looking for. The outer table needs its own alias so that the subqueries compare against the outer row's NAME.
SELECT
o.NAME,
o.EXEMPTIONSTATUS
,o.[TOTAL_VALUE]
,(SELECT SUM(CASE WHEN t.EXEMPTIONSTATUS = 2 THEN t.TOTAL_VALUE ELSE 0 END) FROM #Test t WHERE t.NAME = o.NAME) 'NON EXEMPT VALUE'
,(SELECT SUM(CASE WHEN t.EXEMPTIONSTATUS = 3 THEN t.TOTAL_VALUE ELSE 0 END) FROM #Test t WHERE t.NAME = o.NAME) 'EXEMPT VALUE'
FROM #Test o
This gives me the following output
| NAME | EXEMPTIONSTATUS | TOTAL_VALUE | NON EXEMPT VALUE | EXEMPT VALUE |
| X | 2 | 100 | 100 | 200 |
| X | 3 | 200 | 100 | 200 |
Let's say your table structure is like this:
CREATE TABLE #tab(ID int, Name nvarchar(20), ExemptionStatus int, TotalValue int);
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (1, 'X', 2, 100);
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (2, 'X', 3, 200);
So your data looks like this:
ID Name ExemptionStatus TotalValue
1 X 2 100
2 X 3 200
Then the query you'd use is:
SELECT NotExempted.Name,
NotExempted.NonExemptValue,
Exempted.ExemptValue
FROM (SELECT Name,
CASE
WHEN ExemptionStatus = 2 THEN TotalValue
END
AS 'NonExemptValue'
FROM #tab
) NotExempted
INNER JOIN (SELECT Name,
CASE
WHEN ExemptionStatus = 3 THEN TotalValue
END
AS 'ExemptValue'
FROM #tab
) Exempted ON NotExempted.Name = Exempted.Name
WHERE NotExempted.NonExemptValue IS NOT NULL
AND Exempted.ExemptValue IS NOT NULL
GROUP BY NotExempted.Name,
NotExempted.NonExemptValue,
Exempted.ExemptValue
Your result will look like this:
Name NonExemptValue ExemptValue
X 100 200
You can see this here -> http://sqlfiddle.com/#!9/8902d3/2
Now, let's say you have data like this :
CREATE TABLE #tab(ID int, Name nvarchar(20), ExemptionStatus int, TotalValue int)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (1, 'X', 2, 100)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (2, 'X', 3, 200)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (3, 'X', 2, 1000)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (4, 'X', 3, 2000)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (5, 'X', 2, 1045)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (6, 'X', 3, 2045)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (7, 'X', 2, 1034)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (8, 'X', 3, 2023)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (9, 'X', 2, 1023)
INSERT INTO #tab(ID, Name, ExemptionStatus, TotalValue) values (10, 'X', 3, 2076)
which looks like this:
ID Name ExemptionStatus TotalValue
1 X 2 100
2 X 3 200
3 X 2 1000
4 X 3 2000
5 X 2 1045
6 X 3 2045
7 X 2 1034
8 X 3 2023
9 X 2 1023
10 X 3 2076
If you need to sum the total value up, then you can use the following query (which is a slight modification of the query above):
SELECT NotExempted.Name,
NotExempted.NonExemptValue,
Exempted.ExemptValue
FROM (SELECT Name,
CASE
WHEN ExemptionStatus = 2 THEN (SELECT SUM(TotalValue) FROM #tab WHERE ExemptionStatus = 2)
END
AS 'NonExemptValue'
FROM #tab
) NotExempted
INNER JOIN (SELECT Name,
CASE
WHEN ExemptionStatus = 3 THEN (SELECT SUM(TotalValue) FROM #tab WHERE ExemptionStatus = 3)
END
AS 'ExemptValue'
FROM #tab
) Exempted ON NotExempted.Name = Exempted.Name
WHERE NotExempted.NonExemptValue IS NOT NULL
AND Exempted.ExemptValue IS NOT NULL
GROUP BY NotExempted.Name,
NotExempted.NonExemptValue,
Exempted.ExemptValue
Your result will look like this :
Name NonExemptValue ExemptValue
X 4202 8344
You can see this here -> http://sqlfiddle.com/#!9/02c76/3
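The same totals can also be reached with a single pass of conditional aggregation grouped by Name, without the self-join. The sketch below assumes SQLite via Python's sqlite3 module and a table named tab. The two rows for a second name 'Y' are hypothetical and were added only to show that each name keeps its own totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INT, Name TEXT, ExemptionStatus INT, TotalValue INT)")

# The ten rows from the answer: status alternates 2, 3, 2, 3, ...
values = [100, 200, 1000, 2000, 1045, 2045, 1034, 2023, 1023, 2076]
rows = [(i + 1, 'X', 2 if i % 2 == 0 else 3, v) for i, v in enumerate(values)]
# Hypothetical extra customer, not part of the original data.
rows += [(11, 'Y', 2, 5), (12, 'Y', 3, 7)]
conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", rows)

totals = conn.execute("""
SELECT Name,
       SUM(CASE WHEN ExemptionStatus = 2 THEN TotalValue ELSE 0 END) AS NonExemptValue,
       SUM(CASE WHEN ExemptionStatus = 3 THEN TotalValue ELSE 0 END) AS ExemptValue
FROM tab
GROUP BY Name
ORDER BY Name
""").fetchall()
print(totals)
```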
Hope this helps!!!