Oracle SQL: Obtain count of distinct column values based on another column

Here is some sample data to explain my case:
CompanyInfo:
Name      Company  Location     Completed  Pending
Andy      AA       Home         4          2
Jim       AA       Office       3          3
Dahli     AA       Home         4          2
Monica    AA       Home         4          2
Chandler  AA       Home-Office  1          0
Ashley    AA       Home-Office  1          0
The last three columns contain repeated values, and Location, Completed and Pending are bound to each other. I am trying to get counts per company based on them, so the output would look something like this:
Company  Count(Locations)  Count(Completed+Pending > 0)
AA       3                 3
Why is Count(Completed+Pending > 0) equal to 3? Because there are just three unique locations (Home, Office and Home-Office) where Completed + Pending is > 0.
I tried the query below, but it gives me (AA, 3, 6) because it processes all 6 rows to obtain the count.
select Company,
       count(distinct Location),
       SUM (
           CASE
               WHEN (Completed + Pending) > 0 THEN 1
               ELSE 0
           END)
       AS Total
From CompanyInfo
group by Company;
Any pointers?

I think you want a conditional count(distinct):
select Company,
       count(distinct Location),
       count(distinct case when Completed + Pending > 0 then location end)
from CompanyInfo
group by Company;
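To see the difference between the two approaches, here is a minimal sketch that runs both queries on the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example is self-contained), and the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption);
# the rows are copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CompanyInfo (Name TEXT, Company TEXT, Location TEXT,"
    " Completed INTEGER, Pending INTEGER)"
)
conn.executemany(
    "INSERT INTO CompanyInfo VALUES (?, ?, ?, ?, ?)",
    [
        ("Andy", "AA", "Home", 4, 2),
        ("Jim", "AA", "Office", 3, 3),
        ("Dahli", "AA", "Home", 4, 2),
        ("Monica", "AA", "Home", 4, 2),
        ("Chandler", "AA", "Home-Office", 1, 0),
        ("Ashley", "AA", "Home-Office", 1, 0),
    ],
)

# SUM over a CASE adds 1 for every qualifying row, so duplicates are counted.
per_row = conn.execute(
    """
    SELECT Company,
           COUNT(DISTINCT Location),
           SUM(CASE WHEN Completed + Pending > 0 THEN 1 ELSE 0 END)
    FROM CompanyInfo
    GROUP BY Company
    """
).fetchall()

# COUNT(DISTINCT CASE ...) counts each qualifying location once. The CASE
# yields NULL for non-qualifying rows, and COUNT ignores NULLs.
per_location = conn.execute(
    """
    SELECT Company,
           COUNT(DISTINCT Location),
           COUNT(DISTINCT CASE WHEN Completed + Pending > 0 THEN Location END)
    FROM CompanyInfo
    GROUP BY Company
    """
).fetchall()

print(per_row)       # row-based count
print(per_location)  # distinct-location count
```

The first result reproduces the (AA, 3, 6) from the question, while the second gives the desired (AA, 3, 3).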

Related

SQL Group by Sales Rep - Select 2 counts

I would like to query a table of leads assigned to sales reps and return the number of unique leads per agent along with the number sold. There can be multiple leads from one buyer, so I would like to count distinct buyers so that each buyer is counted only once. Here is the layout of the data:
AgentId  BuyerEmail      Product  Category
1        lisa@gmail.com  Jeans    1
1        lisa@gmail.com  Hat      1
1        ryan@gmail.com  Shoes    3
2        mark@gmail.com  Jeans    1
2        mark@gmail.com  Socks    1
2        mark@gmail.com  Hat      1
4        john@gmail.com  Shirt    3
5        lou@gmail.com   Hat      3
5        tim@gmail.com   Shirt    3
I would like to return a dataset like the following:
AgentId  UniqueLeads  QtySold
1        2            1
2        1            0
4        1            1
5        2            2
I can query this individually but I can't get it to return in one result set. Here are the 2 separate queries:
SELECT COUNT(DISTINCT BuyerEmail) FROM SalesLeads GROUP BY InitialAgent
SELECT COUNT(DISTINCT BuyerEmail) FROM SalesLeads WHERE Category = 3 GROUP BY InitialAgent
How can I query the table and have both data points returned in one result set? Please note that Category = 3 means the lead is sold.
You can use conditional aggregation to calculate QtySold in the same statement:
select AgentId,
count(distinct BuyerEmail) as UniqueLeads,
count(case when Category = 3 then Category end) as QtySold
from SalesLeads
group by AgentId
When Category is anything other than 3, the case expression returns null, so that record isn't counted in the QtySold calculation.
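As a quick check, the sketch below runs the combined query on the sample rows using Python's built-in sqlite3 module (a stand-in engine chosen only to keep the example self-contained). The extra BuyersSold column is my own illustrative addition, not part of the answer: it counts distinct buyers with a sale, which may be closer to the second query in the question if one buyer can have several sold leads.

```python
import sqlite3

# In-memory SQLite table with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesLeads (AgentId INTEGER, BuyerEmail TEXT,"
    " Product TEXT, Category INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesLeads VALUES (?, ?, ?, ?)",
    [
        (1, "lisa@gmail.com", "Jeans", 1),
        (1, "lisa@gmail.com", "Hat", 1),
        (1, "ryan@gmail.com", "Shoes", 3),
        (2, "mark@gmail.com", "Jeans", 1),
        (2, "mark@gmail.com", "Socks", 1),
        (2, "mark@gmail.com", "Hat", 1),
        (4, "john@gmail.com", "Shirt", 3),
        (5, "lou@gmail.com", "Hat", 3),
        (5, "tim@gmail.com", "Shirt", 3),
    ],
)

rows = conn.execute(
    """
    SELECT AgentId,
           COUNT(DISTINCT BuyerEmail) AS UniqueLeads,
           -- counts every sold row (Category = 3)
           COUNT(CASE WHEN Category = 3 THEN Category END) AS QtySold,
           -- hypothetical variant: counts each buyer with a sale only once
           COUNT(DISTINCT CASE WHEN Category = 3 THEN BuyerEmail END) AS BuyersSold
    FROM SalesLeads
    GROUP BY AgentId
    ORDER BY AgentId
    """
).fetchall()

for row in rows:
    print(row)
```

On this sample data both sold counts agree; they would only differ if the same buyer had more than one sold lead under one agent.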

SQL: create another column that calculates ratio

So I have a table that looks like the following:
car owner  non car owner  have dog  num ppl
1          0              1         60
0          1              1         80
1          0              0         90
1          0              0         98
I am trying to add another column to find the ratios. For example, the total number of car owners is 110. If I want to find the ratio of people who own a car and have a dog, then I have to divide 60 by 110 for the first row. Also, the total number of non car owners is 98. Therefore, if I want to find that ratio, I need to divide 80 by 98 for the second row, and so on.
So far, I have tried the following code:
with a as (
select
id,
case when car_owner = 1 then 1 else 0 end car_owner,
case when non_car_owner = 1 then 1 else 0 end as non_car_owner
from `xyz_table`
),
b as (select
car_owner,
non_car_owner,
case when have_dog = 1 then 1 else 0 end have_dog,
count(distinct id) num_ppl
from `xyz_table`
join a using (id)
group by 1,2,3
order by 4 desc
)
select *, num_ppl/(select (case when dog_owner = 1 then 110 else 0 end) as ratio
from a)
from b
Unfortunately, it throws the following error:
Scalar subquery produced more than one element
Any help would be appreciated.
PS. I am running this code on google bigquery.
"If I want to find the ratio of people who own car and have dog"
You can use avg():
select avg(car_owner * have_dog)
from t;
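If the data is already aggregated as in the table in the question, another option is a correlated scalar subquery that returns exactly one total per row, which avoids the "more than one element" error. The following is only a sketch under assumptions: the table name t and the snake_case column names are made up for illustration, it runs on Python's built-in sqlite3 rather than BigQuery, and the totals are computed from the rows shown, so they differ from the 110 and 98 quoted in the question.

```python
import sqlite3

# In-memory SQLite table holding the aggregated rows from the question.
# Table and column names are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (car_owner INTEGER, non_car_owner INTEGER,"
    " have_dog INTEGER, num_ppl INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 0, 1, 60), (0, 1, 1, 80), (1, 0, 0, 90), (1, 0, 0, 98)],
)

# The inner SELECT is an aggregate correlated on car_owner, so it always
# yields a single value: the total num_ppl of the current row's group.
rows = conn.execute(
    """
    SELECT o.car_owner, o.non_car_owner, o.have_dog, o.num_ppl,
           1.0 * o.num_ppl / (SELECT SUM(i.num_ppl)
                              FROM t AS i
                              WHERE i.car_owner = o.car_owner) AS ratio
    FROM t AS o
    ORDER BY o.rowid
    """
).fetchall()

for row in rows:
    print(row)
```

The original error comes from the subquery returning one row per row of a; wrapping the inner query in an aggregate such as SUM is what guarantees a single value.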

SQL: calculation on two columns with multiple group by statements

I have a table which has the following columns:
user_id - includes duplicates
product_id - includes duplicates
purchases - number of purchases of given product_id
My table looks somewhat like this:
   user_id  date  product_id  purchases
0  1        1     1           4
1  1        2     1           0
2  1        3     2           0
3  1        4     2           0
4  2        1     1           1
5  2        2     1           0
6  2        3     1           1
7  3        1     2           0
8  3        2     3           0
9  4        1     5           1
My goal is to calculate the following metric:
% of products that were purchased at least once, grouped by user
For example: user 1 had 2 products, one of them got purchased at least once, the other one did not get purchased at all. So the metric would be the number of products that got purchased at least once / number of all products per user: 1/2 * 100 = 50%
I have little SQL experience so I do not have any legitimate code that could be corrected.
My desired output would be like this:
   user_id  total_products  products_with_purchases  metric
0  1        2               1                        50%
1  2        1               1                        100%
2  3        2               0                        0%
3  4        1               1                        100%
I would appreciate seeing a good practice solution to this problem. Many thanks!
select
user_id,
count(distinct product_id) as total_products,
count(distinct case when purchases > 0 then product_id end) as products_with_purchases,
100.00 * count(distinct case when purchases > 0 then product_id end)
/ count(distinct product_id) as metric
from T as t
group by user_id
https://rextester.com/EDSY39439
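For reference, here is a small sketch that runs this query on the sample rows from the question using Python's built-in sqlite3 module (an assumption made only so the example is self-contained; the table name T follows the query above).

```python
import sqlite3

# In-memory SQLite table with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (user_id INTEGER, date INTEGER,"
    " product_id INTEGER, purchases INTEGER)"
)
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 4), (1, 2, 1, 0), (1, 3, 2, 0), (1, 4, 2, 0),
        (2, 1, 1, 1), (2, 2, 1, 0), (2, 3, 1, 1),
        (3, 1, 2, 0), (3, 2, 3, 0),
        (4, 1, 5, 1),
    ],
)

# COUNT(DISTINCT CASE ...) counts a product once per user, no matter how
# many rows record purchases for it. Multiplying by 100.00 first keeps the
# division in floating point.
rows = conn.execute(
    """
    SELECT user_id,
           COUNT(DISTINCT product_id) AS total_products,
           COUNT(DISTINCT CASE WHEN purchases > 0 THEN product_id END)
               AS products_with_purchases,
           100.00 * COUNT(DISTINCT CASE WHEN purchases > 0 THEN product_id END)
               / COUNT(DISTINCT product_id) AS metric
    FROM T
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

for row in rows:
    print(row)
```

The rows match the desired output in the question: 50% for user 1, 100% for user 2, 0% for user 3 and 100% for user 4.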
You can do this all in one query, but this is the type of situation where sub-queries are easier to understand; the SQL optimizer should keep it fast.
select
    user_id,
    total_products,
    products_with_purchase,
    -- multiply by 100.0 first so integer division does not truncate the ratio to 0
    100.0 * products_with_purchase / total_products as metric
from (
    select -- group by user to get totals
        user_id,
        count(product_id) as total_products,
        sum(case when purchases > 0 then 1 else 0 end) as products_with_purchase
    from ( -- group by user and product to get purchases per product
        SELECT user_id, product_id, sum(purchases) as purchases
        FROM table -- placeholder: use your actual table name
        GROUP BY user_id, product_id
    ) X
    group by user_id
) X2
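One thing to watch in percentage expressions like the metric above: several engines, SQLite and SQL Server among them, truncate the result when both operands of a division are integers, so a ratio such as 1/2 becomes 0 before it is multiplied by 100. A tiny sketch using Python's built-in sqlite3 (chosen only so it can be run as-is) shows the effect:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# (1 / 2) is integer division and truncates to 0, so the product is 0.
# Starting with 100.0 forces floating-point arithmetic for the whole expression.
truncated, scaled = conn.execute(
    "SELECT (1 / 2) * 100, 100.0 * 1 / 2"
).fetchone()

print(truncated, scaled)
```

Multiplying by a decimal literal before dividing is a simple way to sidestep this without an explicit cast.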
I am Mohit Sahni.
You can solve the above problem with the SQL below:
select
    user_id,
    count(distinct product_id) as total_products,
    -- count each purchased product once, even if it appears in several rows
    count(distinct case when purchases > 0 then product_id end) as products_with_purchases,
    -- multiply by 100.0 first to avoid integer division
    100.0 * count(distinct case when purchases > 0 then product_id end) / count(distinct product_id) as metric
from
    table
group by
    user_id

SQL Server : how can I get difference between counts of total rows and those with only data

I have a table with data as shown below (the table is built every day with current date, but I left off that field for ease of reading).
This table keeps track of people and the doors they enter on a daily basis.
Table entrance_t:
id entrance entered
------------------------
1 a 0
1 b 0
1 c 0
1 d 0
2 a 1
2 b 0
2 c 0
2 d 0
3 a 0
3 b 1
3 c 1
3 d 1
My goal is to report on people and count the entrances they did not use (grouped by person), but ONLY for people who entered at all (entered=1).
So using the above table, I would like the results of query to be...
id count
----------
2 3
3 1
(id=2 did not use 3 of the entrances and id=3 did not use 1)
I tried several queries (some with inner joins on two instances of the same table) and I can get the entrances not used, but it's always for everybody. Like this...
id count
----------
1 4
2 3
3 1
How do I exclude id=1 from the results, since they did not enter at all?
Thank you,
You could use conditional aggregation:
SELECT id, count(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
HAVING count(CASE WHEN entered = 1 THEN 1 END) > 0;
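Here is a minimal sketch that runs this query against the sample table, using Python's built-in sqlite3 module in place of SQL Server (an assumption made only so the example is self-contained).

```python
import sqlite3

# In-memory SQLite table with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entrance_t (id INTEGER, entrance TEXT, entered INTEGER)")
conn.executemany(
    "INSERT INTO entrance_t VALUES (?, ?, ?)",
    [
        (1, "a", 0), (1, "b", 0), (1, "c", 0), (1, "d", 0),
        (2, "a", 1), (2, "b", 0), (2, "c", 0), (2, "d", 0),
        (3, "a", 0), (3, "b", 1), (3, "c", 1), (3, "d", 1),
    ],
)

# The SELECT counts unused entrances per person; the HAVING clause keeps
# only people with at least one used entrance, which drops id = 1.
rows = conn.execute(
    """
    SELECT id, COUNT(CASE WHEN entered = 0 THEN 1 END) AS cnt
    FROM entrance_t
    GROUP BY id
    HAVING COUNT(CASE WHEN entered = 1 THEN 1 END) > 0
    ORDER BY id
    """
).fetchall()

print(rows)
```

The filter has to live in HAVING rather than WHERE because it depends on an aggregate over the whole group, not on a single row.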

SQL: Average number of applications per customer for last x months

I have 3 tables: Customer, Applications and ApplicationHistory. I have to retrieve the following data:
Get the average number of applications per customer for the last 3 months
Get the number of customers with at least one application in the last 3 months
I have been trying group by, but I have the following issue:
The ApplicationHistory table has more than one entry for each application, and I am not sure how to eliminate the duplicates.
Note: I have included the Customer table because I need to filter the data by CustomerType.
Can you please suggest how I can get this right?
Many thanks,
My solution (does not work):
SELECT a.ApplicationId, a.CustomerId, count(*) count
from [application] a
inner join [applicationhistory] ah on a.ApplicationId = ah.ApplicationId
inner join Customer c on c.CustomerId = a.CustomerId
where ah.EventDate between @StartDateFilter and @EndDateFilter
--c.CustomerType in ( A, B)
group by a.ApplicationId, a.CustomerId
Table Structure:
Customer
Name CustomerId CustomerType
test1 1 A
test2 2 B
Applications
ApplicationId CustomerId
3 1
4 1
5 2
6 2
7 2
ApplicationHistory
ApplicationId EventDate EventType
3 2014-12-01 New
3 2014-12-01 Updated
3 2014-12-02 Withdrawn
4 2014-12-02 New
4 2014-12-03 Updated
5 2014-12-05 New
5 2014-12-06 Updated
5 2014-12-06 Updated
5 2014-12-07 Updated
6 2014-12-08 New
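First, your query doesn't need the join to Customer unless you filter on CustomerType. The join to application stays, because that is where CustomerId lives. So, this is a simpler version to get the total per customer:
select a.CustomerId, count(*) as cnt
from applicationhistory ah
inner join [application] a on a.ApplicationId = ah.ApplicationId
where ah.EventDate between @StartDateFilter and @EndDateFilter
group by a.CustomerId;
Note that the group by only has CustomerId and not ApplicationId. Also note that count(*) counts history rows; use count(distinct a.ApplicationId) if each application should be counted once.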
If you want only "New" applications, then use where:
select a.CustomerId, count(*) as cnt
from applicationhistory ah
inner join [application] a on a.ApplicationId = ah.ApplicationId
where ah.EventDate between @StartDateFilter and @EndDateFilter and
      ah.EventType = 'New'
group by a.CustomerId;
If you want net applications ("New" minus "Withdrawn"), then use conditional aggregation:
select a.CustomerId,
       sum(case when ah.EventType = 'New' then 1 else -1 end) as cnt
from applicationhistory ah
inner join [application] a on a.ApplicationId = ah.ApplicationId
where ah.EventDate between @StartDateFilter and @EndDateFilter and
      ah.EventType in ( 'New', 'Withdrawn' )
group by a.CustomerId;
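To tie this back to the two numbers asked for in the question, here is a rough sketch that counts each application once per customer and then derives the average and the customer count from that. It uses Python's built-in sqlite3 instead of SQL Server, and the date window and the ('A', 'B') customer-type filter are example values I picked for illustration, not something given in the question.

```python
import sqlite3

# In-memory SQLite tables with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (Name TEXT, CustomerId INTEGER, CustomerType TEXT);
    CREATE TABLE application (ApplicationId INTEGER, CustomerId INTEGER);
    CREATE TABLE applicationhistory (ApplicationId INTEGER, EventDate TEXT, EventType TEXT);
    INSERT INTO customer VALUES ('test1', 1, 'A'), ('test2', 2, 'B');
    INSERT INTO application VALUES (3, 1), (4, 1), (5, 2), (6, 2), (7, 2);
    INSERT INTO applicationhistory VALUES
        (3, '2014-12-01', 'New'), (3, '2014-12-01', 'Updated'),
        (3, '2014-12-02', 'Withdrawn'),
        (4, '2014-12-02', 'New'), (4, '2014-12-03', 'Updated'),
        (5, '2014-12-05', 'New'), (5, '2014-12-06', 'Updated'),
        (5, '2014-12-06', 'Updated'), (5, '2014-12-07', 'Updated'),
        (6, '2014-12-08', 'New');
    """
)

# COUNT(DISTINCT ApplicationId) removes the duplicates caused by several
# history rows per application.
per_customer_sql = """
    SELECT a.CustomerId, COUNT(DISTINCT a.ApplicationId) AS apps
    FROM application a
    JOIN applicationhistory ah ON ah.ApplicationId = a.ApplicationId
    JOIN customer c ON c.CustomerId = a.CustomerId
    WHERE ah.EventDate BETWEEN ? AND ?
      AND c.CustomerType IN ('A', 'B')
    GROUP BY a.CustomerId
"""
window = ("2014-12-01", "2014-12-31")  # hypothetical filter values

per_customer = conn.execute(
    per_customer_sql + " ORDER BY a.CustomerId", window
).fetchall()

# Number of customers with at least one application, and the average
# number of applications among those customers.
summary = conn.execute(
    f"SELECT COUNT(*), AVG(apps) FROM ({per_customer_sql})", window
).fetchone()

print(per_customer)
print(summary)
```

Note that customers with no application activity in the window do not appear in the inner result, so the average is taken only over customers who have at least one application.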