SQLite - Count and group duplicates

I have a problem figuring out how to count duplicates in a table like below:
Campaign    Approved  Disqualified
campaign-1  1         null
campaign-1  null      2
campaign-2  5         null
campaign-2  null      3
My query:
select
"Campaign"
,case when "Status" = 'Approved' then count("Id") end as "Approved"
,case when "Status" = 'Disqualified' then count("Id") end as "Disqualified"
from "table"
group by "Campaign","Status"
having count(*) > 1
order by "Campaign"
I would like to get the result below:
Campaign    Approved  Disqualified
campaign-1  1         2
campaign-2  5         3
...

You must group by Campaign only.
Also, it is easier to use SUM() instead of COUNT(), because SQLite evaluates boolean expressions like Status = 'Approved' as 1 or 0:
SELECT Campaign,
SUM(Status = 'Approved') AS Approved,
SUM(Status = 'Disqualified') AS Disqualified
FROM "table"
GROUP BY Campaign
HAVING COUNT(*) > 1
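
As a quick check of the query above, here is a minimal sketch that runs it with Python's built-in sqlite3 module. The Id column and the sample rows are assumptions, chosen so that the counts match the desired result in the question; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory database with a hypothetical schema: one row per record, with a Status
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "table" (Id INTEGER PRIMARY KEY, Campaign TEXT, Status TEXT)')

# Assumed sample data: 1 + 2 rows for campaign-1, 5 + 3 rows for campaign-2
rows = (
    [("campaign-1", "Approved")] * 1
    + [("campaign-1", "Disqualified")] * 2
    + [("campaign-2", "Approved")] * 5
    + [("campaign-2", "Disqualified")] * 3
)
conn.executemany('INSERT INTO "table" (Campaign, Status) VALUES (?, ?)', rows)

# SQLite evaluates Status = 'Approved' to 1 or 0, so SUM() counts the matches
query = """
SELECT Campaign,
       SUM(Status = 'Approved')     AS Approved,
       SUM(Status = 'Disqualified') AS Disqualified
FROM "table"
GROUP BY Campaign
HAVING COUNT(*) > 1
ORDER BY Campaign
"""
result = conn.execute(query).fetchall()
print(result)  # [('campaign-1', 1, 2), ('campaign-2', 5, 3)]
```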

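If the table already stores the Approved and Disqualified counts per row, as in the sample in the question, you can sum them per campaign:
get_info = """
WITH query1 AS
(
    SELECT Campaign,
           SUM(Approved) AS sum_a,
           SUM(Disqualified) AS sum_d
    FROM Campaign
    GROUP BY Campaign
)
-- query1 already has one row per campaign, so no second GROUP BY is needed
SELECT Campaign, sum_a, sum_d FROM query1
"""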

Related

Grouping by id and looking at another column in a particular order to see if the id group satisfies a particular condition

customer_id  transaction success
1            Failed
2            Complete
1            Failed
1            Complete
3            Failed
2            Failed
3            Complete
3            Failed
3            Failed
3            Complete
Essentially, I want to write a statement that identifies whether a customer has had a completed transaction after having had a failed transaction at some earlier point. In this example, customer 1 and customer 3 satisfy this. Assume that there is an added timestamp column next to transaction success.
The resulting table should look like this:
customer_id  returning_success
1            True
2            False
3            True

Assuming it does not matter whether the Complete came before or after the failed transaction, you can LEFT JOIN the table to a subquery that selects only the completed transactions. If the joined value is NULL, the customer has no Complete transaction; otherwise the result is true.
Since you did not specify your DBMS (please read: Why should I "tag my RDBMS"?), the query below uses a standard CASE expression; NULL-handling functions such as IFNULL differ between DBMSs: https://www.w3schools.com/sql/sql_isnull.asp
SELECT DISTINCT
    yt.customer_id,
    -- NULL on the right side of the LEFT JOIN means no Complete row exists
    CASE WHEN completes.customer_id IS NULL THEN 'false' ELSE 'true' END AS returning_success
FROM
    yourtable yt
LEFT JOIN
    (
    SELECT DISTINCT
        customer_id
    FROM
        yourtable
    WHERE transaction_success = 'Complete') completes
ON completes.customer_id = yt.customer_id

If you just need customers that have had both successful and failed transactions, you can use this:
select customer_id,
       case when sum(case when transaction = 'Failed' then 1 else 0 end) > 0
             and sum(case when transaction = 'Complete' then 1 else 0 end) > 0
            then 'True'
            else 'False'
       end returning_success
from table_
group by customer_id
If you actually do have a timestamp column:
select nvl(c.customer_id, f.customer_id) customer_id,
       case when last_complete_time is null
              or first_fail_time is null
              or first_fail_time > last_complete_time
            then 'False'
            else 'True'
       end returning_success
from (
    -- latest completed transaction per customer
    select customer_id, max(time_) last_complete_time
    from table_
    where transaction = 'Complete'
    group by customer_id
) c
full join (
    -- earliest failed transaction per customer
    select customer_id, min(time_) first_fail_time
    from table_
    where transaction = 'Failed'
    group by customer_id
) f on c.customer_id = f.customer_id
You can also use this query to select only the True cases and then union or join the rest:
select f.customer_id, 'True'
from (
    select customer_id, max(time_) last_complete_time
    from table_
    where transaction = 'Complete'
    group by customer_id
) c
join (
    select customer_id, min(time_) first_fail_time
    from table_
    where transaction = 'Failed'
    group by customer_id
) f on c.customer_id = f.customer_id
where first_fail_time < last_complete_time
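
The same "first failure earlier than last completion" test can also be written without a join, using conditional aggregation. The sketch below is a variant, not the query from the answer above. It runs against SQLite through Python's sqlite3 module; the table name transactions, the column names status and ts, and the integer timestamps (assigned in the order the rows are listed in the question) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; ts stands in for the timestamp column mentioned in the question
conn.execute("CREATE TABLE transactions (customer_id INTEGER, status TEXT, ts INTEGER)")

# Rows from the question, timestamped 1..10 in listed order (an assumption)
events = [
    (1, "Failed"), (2, "Complete"), (1, "Failed"), (1, "Complete"), (3, "Failed"),
    (2, "Failed"), (3, "Complete"), (3, "Failed"), (3, "Failed"), (3, "Complete"),
]
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(cust, status, ts) for ts, (cust, status) in enumerate(events, start=1)],
)

# A CASE without ELSE yields NULL, which MIN/MAX ignore; if either side is NULL
# the comparison is NULL and the outer CASE falls through to 'False'.
query = """
SELECT customer_id,
       CASE WHEN MIN(CASE WHEN status = 'Failed'   THEN ts END)
               < MAX(CASE WHEN status = 'Complete' THEN ts END)
            THEN 'True' ELSE 'False' END AS returning_success
FROM transactions
GROUP BY customer_id
ORDER BY customer_id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 'True'), (2, 'False'), (3, 'True')]
```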

Roll up multiple rows into one

I have a table that looks like this:
USER_ID,ADDED_DATE,STATUS,COMPLETION_ID_TYPE,QA_OPTION,QA_OPTION_COUNT
12543,2020-06-01 00:00:00,qaComplete_L2,chart,Correct,3
12543,2020-06-01 00:00:00,qaComplete_L2,chart,Incorrect,3
12543,2020-06-12 00:00:00,qaComplete_L2,chart,Incorrect,1
12543,2020-06-12 00:00:00,qaComplete_L2,chart,Correct,1
I want to display the results as:
USER_ID,ADDED_DATE,STATUS,COMPLETION_ID_TYPE,L2 Correct,L2 Incorrect
8388,6/01/20 0:00,qaComplete_L2,chart,3,3
8388,6/12/20 0:00,qaComplete_L2,chart,1,1
I have tried this, but I am not getting the results I expect:
select distinct user_id,
added_date,
status,
completion_id_type,
max(case
when qa_option = 'Correct'
then qa_option_count
else 0
end) as L2_Correct,
max(case
when qa_option = 'Incorrect'
then qa_option_count
else 0
end) as L2_Incorrect
from qa_report2
where user_id = 12543
and status = 'qaComplete_L2'
group by user_id, status, added_date, completion_id_type,qa_option, qa_option_count
order by user_id, added_date;
This returns:
USER_ID,ADDED_DATE,STATUS,COMPLETION_ID_TYPE,L2_CORRECT,L2_INCORRECT
12543,2020-06-01 00:00:00,qaComplete_L2,chart,0,3
12543,2020-06-01 00:00:00,qaComplete_L2,chart,3,0
12543,2020-06-12 00:00:00,qaComplete_L2,chart,1,0
12543,2020-06-12 00:00:00,qaComplete_L2,chart,0,1

You were almost there :)
I only removed the distinct and the last two group by columns. Columns that are used inside an aggregate function (here qa_option and qa_option_count) should not appear in the group by clause, only inside the aggregate in the select clause.
So in the end, what I think you're looking for is:
select user_id,
added_date,
status,
completion_id_type,
max(case
when qa_option = 'Correct'
then qa_option_count
else 0
end) as L2_Correct,
max(case
when qa_option = 'Incorrect'
then qa_option_count
else 0
end) as L2_Incorrect
from qa_report2
where user_id = 12543
and status = 'qaComplete_L2'
group by user_id,
status,
added_date,
completion_id_type
--,qa_option
--,qa_option_count
order by user_id,
added_date;
Note: you are using max() here. If multiple records can exist for the same option, you may actually want sum(), but that depends on your use case.
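
To see the effect of grouping only by the non-aggregated columns, here is a minimal sketch that loads the four sample rows from the question into an in-memory SQLite database and runs the corrected query. The column types are assumptions; the data and the query follow the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the names follow the question
conn.execute("""
    CREATE TABLE qa_report2 (
        user_id INTEGER, added_date TEXT, status TEXT,
        completion_id_type TEXT, qa_option TEXT, qa_option_count INTEGER
    )
""")
conn.executemany(
    "INSERT INTO qa_report2 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (12543, "2020-06-01 00:00:00", "qaComplete_L2", "chart", "Correct", 3),
        (12543, "2020-06-01 00:00:00", "qaComplete_L2", "chart", "Incorrect", 3),
        (12543, "2020-06-12 00:00:00", "qaComplete_L2", "chart", "Incorrect", 1),
        (12543, "2020-06-12 00:00:00", "qaComplete_L2", "chart", "Correct", 1),
    ],
)

# qa_option and qa_option_count appear only inside max(), not in GROUP BY,
# so the Correct and Incorrect rows collapse into one row per date.
query = """
SELECT user_id, added_date, status, completion_id_type,
       MAX(CASE WHEN qa_option = 'Correct'   THEN qa_option_count ELSE 0 END) AS L2_Correct,
       MAX(CASE WHEN qa_option = 'Incorrect' THEN qa_option_count ELSE 0 END) AS L2_Incorrect
FROM qa_report2
WHERE user_id = 12543
  AND status = 'qaComplete_L2'
GROUP BY user_id, status, added_date, completion_id_type
ORDER BY user_id, added_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```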

You can use PIVOT to achieve it (Oracle syntax shown):
SELECT *
FROM (
SELECT USER_ID,
ADDED_DATE,
STATUS,
COMPLETION_ID_TYPE,
QA_OPTION_COUNT,
QA_OPTION
FROM QA_REPORT2
WHERE USER_ID = 12543
AND STATUS = 'qaComplete_L2'
) PIVOT (
MAX ( QA_OPTION_COUNT )
FOR QA_OPTION
IN ( 'Correct' AS L2_CORRECT, 'Incorrect' AS L2_INCORRECT )
);

SQL Select item which has same value in all rows

For example, if the below is the table
SupId ItemId Status
1 1 Available
1 2 OOS
2 3 Available
3 4 OOS
3 5 OOS
4 6 OOS
5 7 NULL
I am looking to fetch the distinct suppliers all of whose items are OOS or NULL.
One solution is to get all the suppliers who have at least one active item (active suppliers) and then add a NOT IN clause against those active suppliers to pick the non-active suppliers.
Is there any better way to achieve the same?

One option, using aggregation:
SELECT SupId
FROM yourTable
GROUP BY SupId
HAVING
SUM(CASE WHEN Status = 'OOS' OR Status IS NULL THEN 1 ELSE 0 END) = COUNT(*) AND
(MAX(Status) = 'OOS' OR COUNT(Status) = 0);
This assumes you want suppliers whose statuses are all NULL or all OOS. If you only need suppliers that have no status other than these two values, use this:
SELECT SupId
FROM yourTable
GROUP BY SupId
HAVING SUM(CASE WHEN Status <> 'OOS' AND Status IS NOT NULL THEN 1 ELSE 0 END) = 0;
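
As a rough check of the second query, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. The column types are assumptions, and an ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (SupId INTEGER, ItemId INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, 1, "Available"), (1, 2, "OOS"), (2, 3, "Available"),
        (3, 4, "OOS"), (3, 5, "OOS"), (4, 6, "OOS"),
        (5, 7, None),  # None is stored as SQL NULL
    ],
)

# Count the rows whose status is something other than OOS or NULL;
# a supplier qualifies only when that count is zero.
query = """
SELECT SupId
FROM yourTable
GROUP BY SupId
HAVING SUM(CASE WHEN Status <> 'OOS' AND Status IS NOT NULL THEN 1 ELSE 0 END) = 0
ORDER BY SupId
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # [3, 4, 5]
```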

Try:
SELECT DISTINCT SupId FROM my_table t
WHERE NOT EXISTS(SELECT 1 FROM my_table
WHERE SupId = t.SupId
AND [Status] IS NOT NULL
AND [Status] <> 'OOS')

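SELECT DISTINCT SupId
FROM yourTable
WHERE SupId NOT IN (
    -- suppliers with at least one item that is neither OOS nor NULL;
    -- a NULL Status never satisfies <>, so NULL rows are not picked up here
    SELECT SupId
    FROM yourTable
    WHERE Status <> 'OOS'
)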

I would use NOT EXISTS :
SELECT t.*
FROM table t
WHERE NOT EXISTS (SELECT 1 FROM table t1 WHERE t1.supid = t.supid and t1.status <> 'OOS');

I would use group by and having:
select SupId
from t
group by SupId
having (min(Status) = 'OOS' and max(Status) = 'OOS') or
       min(Status) is null;

SQL Server : do not Select all if true

I have these columns
Id Status
----------
1 pass
1 fail
2 pass
3 pass
How do I select only the Ids whose statuses are all pass? If an Id has at least one fail, it should not be selected at all.

If the same Id can have multiple passes:
SELECT id
from table
WHERE status = 'pass'
and id NOT IN (SELECT id FROM table WHERE status = 'fail')

You need to use GROUP BY with a HAVING clause:
SELECT Id
FROM yourtable
GROUP BY Id
HAVING Sum(case when status ='pass' then 1 else 0 end) = count(status)
The HAVING clause can also be written as:
HAVING Count(case when status ='pass' then 1 end) = count(status)
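
Here is a minimal sketch of the first form of the query, run against SQLite through Python's sqlite3 module rather than SQL Server. The sample rows are the four from the question; the column types and the added ORDER BY are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (Id INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "pass"), (1, "fail"), (2, "pass"), (3, "pass")],
)

# An Id is kept only when its number of 'pass' rows equals its number of
# non-NULL status rows, i.e. no row in the group has any other status.
query = """
SELECT Id
FROM yourtable
GROUP BY Id
HAVING SUM(CASE WHEN Status = 'pass' THEN 1 ELSE 0 END) = COUNT(Status)
ORDER BY Id
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # [2, 3]
```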

I just hate chatty CASE statements, so:
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(DISTINCT [Status]) = 1 AND MIN([Status]) = 'pass'
or
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(NULLIF([Status], 'fail')) = 1 AND COUNT(NULLIF([Status], 'pass')) = 0
The second query only works when Status has just the two values 'pass' and 'fail' and, because of the = 1 test, when each Id has exactly one 'pass' row.

Custom aggregation in GROUP BY clause

If I have a table with a schema like this
table(Category, SubCategory1, SubCategory2, Status)
I would like to group by Category, SubCategory1 and aggregate the Status such that
if not all Status values over the group have a certain value Status will be 0 otherwise 1.
So my result set will look like
(Category, SubCategory1, Status)
I don't want to write a function. I would like to do it inside the query.

Assuming that status is a numeric data type, use:
SELECT t.category,
t.subcategory1,
CASE WHEN MIN(t.status) = MAX(t.status) THEN 1 ELSE 0 END AS status
FROM dbo.TABLE_1 t
GROUP BY t.category, t.subcategory1

You can test that both the minimum and maximum status for each group are equal to your desired value:
SELECT
category,
subcategory1,
CASE WHEN MIN(status) = 42 AND MAX(status) = 42 THEN 1 ELSE 0 END AS Status
FROM table1
GROUP BY category, subcategory1
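
To illustrate the MIN/MAX test above, here is a small sketch using Python's sqlite3 module. The sample rows are invented for illustration, and 42 stands in for the desired status value, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (category TEXT, subcategory1 TEXT, status INTEGER)")
# Hypothetical rows: only group (A, x) has every status equal to 42
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("A", "x", 42), ("A", "x", 42), ("A", "y", 42), ("A", "y", 7), ("B", "x", 7)],
)

# If both the smallest and the largest status in a group equal 42,
# every non-NULL status in that group is 42.
query = """
SELECT category,
       subcategory1,
       CASE WHEN MIN(status) = 42 AND MAX(status) = 42 THEN 1 ELSE 0 END AS Status
FROM table1
GROUP BY category, subcategory1
ORDER BY category, subcategory1
"""
result = conn.execute(query).fetchall()
print(result)  # [('A', 'x', 1), ('A', 'y', 0), ('B', 'x', 0)]
```

Note that MIN and MAX skip NULLs, so a group containing 42s and NULLs would still be reported as 1.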

Let's say you want to find groups in which all status values are under 100:
SELECT category, subcategory1,
CASE WHEN MAX(status) < 100 THEN 0 ELSE 1 END AS Status
FROM table1
GROUP BY category, subcategory1
All groups with status under 100 will have Status set to 0, and all groups with at least one status >= 100 will be set to 1.
I think that's what you're asking for, but if not let me know.

"I would like to group by Category, SubCategory1 and aggregate the Status such that if not all Status values over the group have a certain value Status will be 0 otherwise 1."
I'm interpreting this as "If there exists a Status value in a given group not equal to a given parameter, the returned Status will be 0 otherwise 1".
Select T.Category, T.SubCategory1
, Case
When Exists(
Select 1
From Table As T2
Where T2.Category = T.Category
And T2.SubCategory1 = T.SubCategory1
And T2.Status <> @Param
) Then 0
Else 1
End As Status
From Table As T
Group By t.Category, T.SubCategory1

Something like this:
select
    Category,
    SubCategory1,
    case
        when good_record_count = all_record_count then 1
        else 0
    end as all_records_good
from (
    select
        t.Category,
        t.SubCategory1,
        -- count the rows whose Status has the desired value ('GOOD' is a placeholder)
        sum(case when t.Status = 'GOOD' then 1 else 0 end) as good_record_count,
        count(1) as all_record_count
    from
        table_name t
    group by
        t.Category, t.SubCategory1
) counts
