Get count With Distinct in SQL Server - sql

Select
Count(Distinct iif(t.HasReplyTask = 1, t.CustomerID, Null)) As Reply,
Count(Distinct iif(t.HasOverdueTask = 1, t.CustomerID, Null)) As Overdue,
Count(Distinct t.CustomerID) As Total
From
Table1 t
If a customer is counted in Reply, that customer must be excluded from the Overdue count. For example, if customer 123 is in both, the Overdue count should be one less. How can I do this?
I am adding some data here:
Customer 123 has HasReplyTask set, so that customer must be filtered out of the Overdue count (even though the customer also has one overdue task without HasReplyTask). Customer 234 counts as one, and the distinct count for 456 is one.
So the Overdue count should be 2, but the query above returns 3.

If I've got it right, this can be done using a subquery to get the numbers for each customer, and then get the summary information as follows:
Select Sum(HasReplyTask) As Reply,
Sum(HasOverdueTask) As Overdue,
Count(CustomerID) As Total
From (
Select CustomerID,
IIF(Max(Cast(HasReplyTask As TinyInt))<>0, 0, Max(Cast(HasOverdueTask As TinyInt))) As HasOverdueTask,
Max(Cast(HasReplyTask As TinyInt)) As HasReplyTask
From Table1
Group by CustomerID) As T
I don't know the column data types, so I cast the flags to TinyInt before applying Max (in SQL Server, Max cannot be applied directly to a bit column).
db<>fiddle
Reply Overdue Total
1 2 3
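To check the approach above, here is a minimal sketch that runs the same per-customer logic in SQLite through Python's sqlite3 module. The rows are hypothetical (the question's sample data is not shown here) and were chosen only to match its description; IIF and the casts are replaced with a plain CASE expression so the sketch does not depend on SQL Server.

```python
import sqlite3

# Hypothetical rows (CustomerID, HasReplyTask, HasOverdueTask); they only
# mirror the description in the question, not its real data.
ROWS = [
    (123, 1, 0),
    (123, 0, 1),
    (234, 0, 1),
    (456, 0, 1),
    (456, 0, 1),
]

QUERY = """
SELECT SUM(HasReplyTask)   AS Reply,
       SUM(HasOverdueTask) AS Overdue,
       COUNT(CustomerID)   AS Total
FROM (
    SELECT CustomerID,
           MAX(HasReplyTask) AS HasReplyTask,
           -- A customer with any reply task is excluded from the overdue count.
           CASE WHEN MAX(HasReplyTask) <> 0 THEN 0
                ELSE MAX(HasOverdueTask) END AS HasOverdueTask
    FROM Table1
    GROUP BY CustomerID
) AS T
"""


def summarize(rows):
    """Load the rows into an in-memory table and return (Reply, Overdue, Total)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Table1 (CustomerID INTEGER, HasReplyTask INTEGER, HasOverdueTask INTEGER)"
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result
```

With these rows, summarize(ROWS) gives one customer with a reply, two overdue customers and three customers in total, which matches the result table above.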

What would probably be more efficient for you is to pre-aggregate your table by customer ID and have counts per customer. Then your outer query can test for whatever you are really looking for. Something like
select
sum( case when PQ.ReplyCount > 0 then 1 else 0 end ) UniqReply,
sum( case when PQ.OverdueCount > 0 then 1 else 0 end ) UniqOverdue,
sum( case when PQ.OverdueCount - PQ.ReplyCount > 0 then 1 else 0 end ) PendingReplies,
count(*) as UniqCustomers
from
( select
yt.customerid,
count(*) CustRecs,
sum( case when yt.HasReplyTask = 1 then 1 else 0 end ) ReplyCount,
sum( case when yt.HasOverdueTask = 1 then 1 else 0 end ) OverdueCount
from
yourTable yt
group by
yt.customerid ) PQ
Now, to get the count you are REALLY looking for, you might need to test ReplyCount against OverdueCount in the prequery (PQ). For a single customer ID (hence the prequery), if the OverdueCount is GREATER than the ReplyCount, is the customer still considered overdue? Customer 123 had 3 overdue but only 2 replies; do you want that counted once? Customers 234 and 456 had only overdue entries and NO replies. So the total where Overdue - Reply > 0 is 3 distinct people.
Is that your ultimate goal?
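If the goal is instead the one stated in the question (an overdue customer only counts when they have no reply task at all), the outer test can be made stricter. The sketch below is one possible reading, run in SQLite through Python's sqlite3 module; the rows and the OverdueNoReply column name are hypothetical, and it keeps the Overdue - Reply > 0 column alongside for comparison.

```python
import sqlite3

# Hypothetical rows: (CustomerID, HasReplyTask, HasOverdueTask).
ROWS = [
    (123, 1, 0), (123, 0, 1), (123, 0, 1),
    (234, 0, 1),
    (456, 0, 1), (456, 0, 1),
]

QUERY = """
SELECT
    SUM(CASE WHEN PQ.ReplyCount > 0 THEN 1 ELSE 0 END) AS UniqReply,
    -- Difference test from the answer above.
    SUM(CASE WHEN PQ.OverdueCount - PQ.ReplyCount > 0 THEN 1 ELSE 0 END) AS PendingReplies,
    -- Stricter test: overdue tasks and no reply task at all.
    SUM(CASE WHEN PQ.OverdueCount > 0 AND PQ.ReplyCount = 0 THEN 1 ELSE 0 END) AS OverdueNoReply,
    COUNT(*) AS UniqCustomers
FROM (
    SELECT yt.CustomerID,
           SUM(CASE WHEN yt.HasReplyTask = 1 THEN 1 ELSE 0 END)   AS ReplyCount,
           SUM(CASE WHEN yt.HasOverdueTask = 1 THEN 1 ELSE 0 END) AS OverdueCount
    FROM yourTable yt
    GROUP BY yt.CustomerID
) PQ
"""


def customer_summary(rows):
    """Return (UniqReply, PendingReplies, OverdueNoReply, UniqCustomers)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (CustomerID INTEGER, HasReplyTask INTEGER, HasOverdueTask INTEGER)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result
```

On these rows the difference test counts customer 123 (two overdue tasks against one reply), while the stricter test leaves 123 out and counts only 234 and 456.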

Related

Add a Total Row and converting into percentages

I have a database with information about people's work condition and neighbourhood.
I have to display a chart of information in percentages like this:
Neighbourhood Total Employed Unemployed Inactive
Total 100 50 25 25
1 100 45 30 25
2 100 55 20 25
To do that, the code that I've made so far is:
select neighbourhood, Count (*) as Total,
Count(Case when (condition = 1) then 'employed' end) as employed,
Count (case when (condition = 2) then 'unemployed' end) as unemployed,
Count (Case when (condition =3) then 'Inactive' end) as Inactive
from table
group by neighbourhood
order by neighbourhood
The output of that code is (the absolute numbers are made up; they don't produce the percentages above):
Neighbourhood Total Employed Unemployed Inactive
1 600 300 200 100
2 450 220 159 80
So I have to turn the absolute numbers into percentages and add the Total row (summing the values from the neighbourhoods), but all my efforts have failed. I can't work out how to add that Total row, nor how to get the total for each neighbourhood in order to calculate the percentages.
I started studying SQL just two weeks ago, so I apologize for any inconvenience. I tried my best to keep it simple (my database has 15 neighbourhoods, and it's OK if they are labeled by numbers).
Thanks
You need a UNION to add the total row:
select 'All' as neighbourhood, Count (*) as Total,
Count(Case when (condition = 1) then 1 end) as employed,
Count (case when (condition = 2) then 1 end) as unemployed,
Count (Case when (condition =3) then 1 end) as Inactive
from table
UNION all
select neighbourhood, Count (*) as Total,
Count(Case when (condition = 1) then 1 end) as employed,
Count (case when (condition = 2) then 1 end) as unemployed,
Count (Case when (condition =3) then 1 end) as Inactive
from table
group by neighbourhood
order by neighbourhood
You can add the total rows using grouping sets:
select neighbourhood, Count(*) as Total,
sum((condition = 1)::int) as employed,
sum((condition = 2)::int) as unemployed,
sum((condition = 3)::int) as Inactive
from table
group by grouping sets ( (neighbourhood), () )
order by neighbourhood;
If you want averages within each row, then use avg() rather than sum().
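Neither query above returns percentages yet. As a rough sketch of the avg() idea, the following runs in SQLite through Python's sqlite3 module. SQLite has no GROUPING SETS, so the total row comes from UNION ALL; the table name people, the helper columns sort_key and nb, and the rows themselves are all made up for illustration.

```python
import sqlite3

# Made-up rows: (neighbourhood, condition),
# with 1 = employed, 2 = unemployed, 3 = inactive.
ROWS = [(1, 1), (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 2), (2, 3)]

QUERY = """
SELECT label, total, employed, unemployed, inactive
FROM (
    -- Total row: no GROUP BY, so the averages cover every person.
    SELECT 0 AS sort_key, NULL AS nb, 'Total' AS label, COUNT(*) AS total,
           AVG(CASE WHEN condition = 1 THEN 100.0 ELSE 0 END) AS employed,
           AVG(CASE WHEN condition = 2 THEN 100.0 ELSE 0 END) AS unemployed,
           AVG(CASE WHEN condition = 3 THEN 100.0 ELSE 0 END) AS inactive
    FROM people
    UNION ALL
    -- One row per neighbourhood; the label is cast to text to match 'Total'.
    SELECT 1, neighbourhood, CAST(neighbourhood AS TEXT), COUNT(*),
           AVG(CASE WHEN condition = 1 THEN 100.0 ELSE 0 END),
           AVG(CASE WHEN condition = 2 THEN 100.0 ELSE 0 END),
           AVG(CASE WHEN condition = 3 THEN 100.0 ELSE 0 END)
    FROM people
    GROUP BY neighbourhood
) AS t
ORDER BY sort_key, nb
"""


def percentage_report(rows):
    """Return the total row followed by one percentage row per neighbourhood."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (neighbourhood INTEGER, condition INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Averaging a column that is 100.0 for matching rows and 0 otherwise gives the percentage directly, and the sort_key column keeps the Total row first regardless of how the neighbourhoods are numbered.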

SQL aggregate rows with same id , specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) when certain values occur in the status column. The idea is to sum the amount column only for references whose sole status is 1. The query should not select a reference at all if it also has a status of 2, or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
1 100 1
2 120 1
2 -120 2
3 200 1
3 -200 2
4 450 1
Result:
amount | status
550 1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists to sum all rows where the status is 1 and other rows with the same reference and a non 1 status do not exist.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM table
WHERE reference NOT IN (
SELECT reference
FROM table
WHERE status<>1
)
The subquery selects all references that must be excluded; the main query then sums everything else.
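The two queries above can be checked against the sample data with a small sketch in SQLite through Python's sqlite3 module. It also shows a well-known difference between them: if the subquery of NOT IN ever returns a NULL reference, no row passes the filter, whereas NOT EXISTS is unaffected. The row with the NULL reference is hypothetical and is not part of the question's data.

```python
import sqlite3

# Sample rows from the question: (reference, amount, status).
SAMPLE = [(1, 100, 1), (2, 120, 1), (2, -120, 2),
          (3, 200, 1), (3, -200, 2), (4, 450, 1)]

# Hypothetical extra row with a NULL reference and a non-1 status.
WITH_NULL = SAMPLE + [(None, 50, 2)]

NOT_EXISTS = """
SELECT SUM(amount) FROM mytable t1
WHERE status = 1
  AND NOT EXISTS (SELECT 1 FROM mytable t2
                  WHERE t2.reference = t1.reference
                    AND t2.status <> 1)
"""

NOT_IN = """
SELECT SUM(amount) FROM mytable
WHERE reference NOT IN (SELECT reference FROM mytable WHERE status <> 1)
"""


def run_sum(query, rows):
    """Run one of the queries on a fresh in-memory table and return the sum."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (reference INTEGER, amount INTEGER, status INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total
```

Both queries return 550 on the sample rows. With the extra NULL row, run_sum(NOT_IN, WITH_NULL) returns None because every NOT IN comparison evaluates to unknown, so adding "AND reference IS NOT NULL" to the subquery is a common safeguard when the column is nullable.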
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
You could use a window function to count the occurrences of statuses other than 1 within each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;
Rextester Demo
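The last two answers rely on bool_or and FILTER, which are not available in every engine. A more portable variant of the same idea, sketched here in SQLite through Python's sqlite3 module, counts the non-1 statuses per reference with a CASE expression in the HAVING clause. The table name t follows the bool_or answer; the rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question: (reference, amount, status).
SAMPLE = [(1, 100, 1), (2, 120, 1), (2, -120, 2),
          (3, 200, 1), (3, -200, 2), (4, 450, 1)]

QUERY = """
SELECT SUM(amount) AS amount
FROM (
    SELECT SUM(amount) AS amount
    FROM t
    GROUP BY reference
    -- Keep a reference only if none of its rows has a status other than 1.
    HAVING SUM(CASE WHEN status <> 1 THEN 1 ELSE 0 END) = 0
) AS s
"""


def sum_status_one_only(rows):
    """Return the total amount over references whose only status is 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (reference INTEGER, amount INTEGER, status INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    (total,) = conn.execute(QUERY).fetchone()
    conn.close()
    return total
```

On the sample rows only references 1 and 4 survive the HAVING filter, so the result is 100 + 450 = 550.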

Calculate percentages of columns in Oracle SQL

I have three columns, all consisting of 1's and 0's. For each of these columns, how can I calculate the percentage of people (one person is one row/ id) who have a 1 in the first column and a 1 in the second or third column in oracle SQL?
For instance:
id marketing_campaign personal_campaign sales
1 1 0 0
2 1 1 0
1 0 1 1
4 0 0 1
So in this case, of all the people who were subjected to a marketing_campaign, 50 percent were subjected to a personal campaign as well, but zero percent is present in sales (no one bought anything).
Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.
This is a fictional example, so I realize that in this example there are many other ways to do this, but I hope anyone can help!
The outcome that I'm looking for is something like this:
percentage marketing_campaign/ personal campaign = 50 %
percentage marketing_campaign/sales = 0%
etc (for all the three column combinations)
Use COUNT, SUM and CASE expressions, together with the basic arithmetic operators +, / and *:
COUNT(*) gives the total number of people in the table.
SUM(column) gives the number of 1s in the given column.
CASE expressions make it possible to implement more complex conditions.
The common pattern is X / COUNT(*) * 100, which calculates a value as a percentage of the total (val / total * 100%).
An example:
SELECT
-- percentage of people that have 1 in marketing_campaign column
SUM( marketing_campaign ) / COUNT(*) * 100 As marketing_campaign_percent,
-- percentage of people that have 1 in sales column
SUM( sales ) / COUNT(*) * 100 As sales_percent,
-- complex condition:
-- percentage of people (one person is one row/ id) who have a 1
-- in the first column and a 1 in the second or third column
COUNT(
CASE WHEN marketing_campaign = 1
AND ( personal_campaign = 1 OR sales = 1 )
THEN 1 END
) / COUNT(*) * 100 As complex_condition_percent
FROM table;
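The query above divides by everyone in the table. The question asks for percentages among the people who were in a marketing campaign, so one option is to restrict the rows first and let COUNT(*) be that smaller total. Below is a minimal sketch in SQLite through Python's sqlite3 module; the table name campaigns is an assumption and the rows are the four from the question. It multiplies by 100.0 because SQLite truncates integer division, which Oracle's NUMBER arithmetic does not.

```python
import sqlite3

# Rows from the question: (id, marketing_campaign, personal_campaign, sales).
ROWS = [(1, 1, 0, 0), (2, 1, 1, 0), (1, 0, 1, 1), (4, 0, 0, 1)]

QUERY = """
SELECT 100.0 * SUM(personal_campaign) / COUNT(*) AS personal_pct,
       100.0 * SUM(sales)             / COUNT(*) AS sales_pct
FROM campaigns
-- Only people who were in a marketing campaign form the denominator.
WHERE marketing_campaign = 1
"""


def marketing_percentages(rows):
    """Return (personal_pct, sales_pct) among rows with marketing_campaign = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE campaigns (id INTEGER, marketing_campaign INTEGER, "
        "personal_campaign INTEGER, sales INTEGER)"
    )
    conn.executemany("INSERT INTO campaigns VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchone()
    conn.close()
    return result
```

For the sample rows this yields 50 percent for the personal campaign and 0 percent for sales, the figures the question expects. It works row by row, so it assumes one row per person.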
You can get your percentages like this:
SELECT COUNT(*),
ROUND(100*(SUM(personal_campaign) / sum(count(*)) over ()),2) perc_personal_campaign,
ROUND(100*(SUM(sales) / sum(count(*)) over ()),2) perc_sales
FROM (
SELECT ID,
CASE
WHEN SUM(personal_campaign) > 0 THEN 1
ELSE 0
end AS personal_campaign,
CASE
WHEN SUM(sales) > 0 THEN 1
ELSE 0
end AS sales
FROM the_table
WHERE ID IN
(SELECT ID FROM the_table WHERE marketing_campaign = 1)
GROUP BY ID
)
I have overcomplicated things a bit because your data is still unclear to me. The subquery ensures that all duplicates are cleaned up and that each person has only a 1 or a 0 in personal_campaign and sales.
About your second question :
Ultimately, I want to find out the order in which people get to the
sales moment. Do they first go from marketing campaign to a personal
campaign and then to sales, or do they buy anyway regardless of these
channels.
This is impossible with the table in its current state, because it has neither:
a unique row identifier that preserves the order in which the rows were inserted, nor
a timestamp column that records when the rows were inserted.
Without one of these, the order of rows returned from your table is unpredictable or, if you prefer, effectively random.

Getting the difference of two sums from the same table

I need to get the difference of a query result from the same table.
Ex: myTable
id code status amount
1 A active 250
2 A inactive 200
3 B active 300
4 B active 100
5 A active 100
6 C inactive 120
7 C active 200
I want to get the total (sum) of amount for each code, adding the amount if the status is active and subtracting it if inactive, so that I get a result like this:
code total amount
A 150
B 400
C 80
I have tried a lot already; one of my attempts is this:
SELECT code, amount1-amount2
FROM (
SELECT code, SUM(amount) AS amount1
FROM mytable
WHERE status='active'
GROUP BY code
UNION ALL
SELECT code, SUM(amount) AS amount2
FROM mytable
WHERE status='inactive'
GROUP BY code
) as T1
GROUP BY code
The simplest solution would be to use a case expression inside the sum function:
SELECT
code,
sum(case when status='active' then amount else amount * -1 end) as total_amount
FROM mytable
GROUP BY code
If there can be other statuses besides active/inactive you have to be more explicit:
sum(
case
when status = 'active' then amount
when status = 'inactive' then amount * -1
end
) AS total
WITH A AS
(SELECT
code,
CASE WHEN status='active' THEN amount ELSE -amount END AS signed_amount
FROM myTable)
SELECT code, sum(signed_amount) FROM A GROUP BY code;
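As a quick check of the signed-sum idea, here is a sketch in SQLite through Python's sqlite3 module. It assumes that rows 2 and 6 are 'inactive', which is what the expected totals of 150 for A and 80 for C imply, and it adds one hypothetical 'pending' row to show how the two CASE variants differ when a third status appears.

```python
import sqlite3

# (id, code, status, amount). Rows 2 and 6 are assumed to be inactive;
# row 8 is a hypothetical third status added for illustration.
ROWS = [
    (1, "A", "active", 250),
    (2, "A", "inactive", 200),
    (3, "B", "active", 300),
    (4, "B", "active", 100),
    (5, "A", "active", 100),
    (6, "C", "inactive", 120),
    (7, "C", "active", 200),
    (8, "C", "pending", 999),
]

# ELSE treats every non-active status as a subtraction.
SIMPLE = """
SELECT code, SUM(CASE WHEN status = 'active' THEN amount ELSE -amount END)
FROM mytable GROUP BY code ORDER BY code
"""

# Without ELSE, other statuses become NULL, which SUM ignores.
EXPLICIT = """
SELECT code, SUM(CASE WHEN status = 'active'   THEN amount
                      WHEN status = 'inactive' THEN -amount END)
FROM mytable GROUP BY code ORDER BY code
"""


def totals(query, rows):
    """Return a list of (code, total_amount) for the given query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, code TEXT, status TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

The explicit variant reproduces the expected totals because the pending row contributes nothing, while the simple variant subtracts it and gives a different total for C.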

Get Highest values on Different IDs in the same table?

I have a table of bids; each bid has an Amount and an AuctionID.
I want to SUM/SELECT the highest bid from each AuctionID.
Example:
The Result of:
SELECT AuctionID,Amount, Highest FROM Bids Where Burned=0 ORDER BY Amount DESC
AuctionID Amount Highest
1 44.4400 0
3 43.7800 0
2 42.3300 0
1 22.2200 0
4 21.2700 0
1 21.2600 0
4 21.2500 0
2 21.2400 0
1 12.6600 0
4 12.5200 0
It should return 44.44, 43.78, 42.33, 21.27.
'Highest' is a flag that I thought might be helpful, but it is not used yet.
I wanted to see if there is a way to do this without using a flag.
A simple group by clause will do the trick:
select AuctionID, MAX(Amount)
from table
group by AuctionID
SELECT AuctionId, MAX(Amount) FROM TableName GROUP BY AuctionID
To get all the highest bids:
Select auctionid, max(amount) from auctions group by auctionid
To get the overall sum of the highest bids:
select sum(v1.max_amount) from
(Select auctionid, max(amount) max_amount from auctions group by auctionid
) as v1
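None of the answers above carries over the Burned = 0 filter from the question's query. The sketch below, run in SQLite through Python's sqlite3 module, adds it to both the per-auction maximum and the overall sum. The first ten rows are the question's sample; the last row is a hypothetical burned bid included only to show that the filter matters.

```python
import sqlite3

# (AuctionID, Amount, Burned). The final row is a hypothetical burned bid.
BIDS = [
    (1, 44.44, 0), (3, 43.78, 0), (2, 42.33, 0), (1, 22.22, 0), (4, 21.27, 0),
    (1, 21.26, 0), (4, 21.25, 0), (2, 21.24, 0), (1, 12.66, 0), (4, 12.52, 0),
    (3, 99.99, 1),
]

PER_AUCTION = """
SELECT AuctionID, MAX(Amount) AS highest
FROM Bids
WHERE Burned = 0
GROUP BY AuctionID
ORDER BY highest DESC
"""

TOTAL = """
SELECT SUM(v1.highest)
FROM (SELECT AuctionID, MAX(Amount) AS highest
      FROM Bids
      WHERE Burned = 0
      GROUP BY AuctionID) AS v1
"""


def highest_bids(rows):
    """Return (list of (AuctionID, highest bid), sum of the highest bids)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Bids (AuctionID INTEGER, Amount REAL, Burned INTEGER)")
    conn.executemany("INSERT INTO Bids VALUES (?, ?, ?)", rows)
    per_auction = conn.execute(PER_AUCTION).fetchall()
    (total,) = conn.execute(TOTAL).fetchone()
    conn.close()
    return per_auction, total
```

The per-auction list comes back as 44.44, 43.78, 42.33 and 21.27 for auctions 1, 3, 2 and 4, and the burned 99.99 bid is ignored. The total is a floating-point sum, so compare it with a tolerance rather than for exact equality.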