How to count occurrences in SQL

I have a list of items (e.g. shirts, tops, pants, adidas, nike, puma etc.) and years in a format like this 2017-01-01. I want to find out how many times each item was purchased per year and have it arranged by year.
How can I do this?
I have the following table called Items:
Year | Purchases |
------------------
2017-01-01 | makeup
2018-01-01 | clothing
2019-01-01 | makeup
2017-01-01 | shoes
2017-01-01 | clothing
2016-01-01 | shoes
2018-01-01 | clothing
2017-01-01 | clothing
2019-01-01 | makeup
The desired output is something like this:
Year | Purchases| Count
-----------------------
2016 | Shoes | 1
2017 | Makeup | 1
2017 | Clothing | 2
2017 | Shoes | 1
2018 | Clothing | 2
2019 | Makeup | 2
My code so far is this:
SELECT YEAR(d.date_format) AS Year
, Purchase = (CASE WHEN it.type IN ('shirts', 'tops', 'pants) THEN 'clothing'
WHEN it.type('nike','adidas', 'puma') THEN 'shoes'
WHEN it.type('facewash', 'lipstick') THEN 'makeup' END), COUNT(*)
FROM ....
INNER JOIN...
WHERE...
GROUP BY Year, Purchase
ORDER BY Year

select year, purchases, count(*)
from Items
group by year, purchases
order by 1, 2

You can use GROUP BY with ORDER BY:
select year, purchases, count(*)
from myTable
group by year, purchases
order by year

Is this what you want?
SELECT YEAR(d.date_format) AS Year,
(CASE WHEN it.type IN ('shirts', 'tops', 'pants') THEN 'clothing'
WHEN it.type IN ('nike','adidas', 'puma') THEN 'shoes'
WHEN it.type IN ('facewash', 'lipstick') THEN 'makeup'
END), COUNT(*)
FROM .... INNER JOIN...
WHERE...
GROUP BY YEAR(d.date_format),
(CASE WHEN it.type IN ('shirts', 'tops', 'pants') THEN 'clothing'
WHEN it.type IN ('nike','adidas', 'puma') THEN 'shoes'
WHEN it.type IN ('facewash', 'lipstick') THEN 'makeup'
END)
ORDER BY Year;
The use of = in the SELECT suggests that you are using SQL Server. SQL Server does not support the use of aliases in the GROUP BY, so you need to repeat the expression.
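To try the "repeat the expression in GROUP BY" pattern without a server, here is a minimal sketch using Python's built-in sqlite3 module. The original FROM/JOIN clauses are elided, so the single purchases(purchase_date, type) table and its rows are assumptions made up for illustration, and strftime('%Y', ...) stands in for YEAR():

```python
import sqlite3

# Hypothetical schema and rows: the real tables behind "FROM ...." are unknown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (purchase_date TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        ("2017-01-01", "lipstick"),
        ("2017-01-01", "shirts"),
        ("2017-01-01", "tops"),
        ("2017-01-01", "nike"),
        ("2018-01-01", "pants"),
        ("2018-01-01", "shirts"),
    ],
)

# The bucketing expression is kept in one string so SELECT and GROUP BY
# are guaranteed to use exactly the same text.
category = """CASE WHEN type IN ('shirts', 'tops', 'pants') THEN 'clothing'
                   WHEN type IN ('nike', 'adidas', 'puma') THEN 'shoes'
                   WHEN type IN ('facewash', 'lipstick') THEN 'makeup' END"""

query = f"""
    SELECT strftime('%Y', purchase_date) AS year,
           {category} AS purchase,
           COUNT(*) AS cnt
    FROM purchases
    GROUP BY strftime('%Y', purchase_date), {category}
    ORDER BY year, purchase
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

SQLite would also accept the aliases in GROUP BY; the expressions are repeated here only to mirror what SQL Server requires.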

Related

SQL monthly subscription rate

How to write a concise sql to get subscription rate by month.
formula: subscription rate = subscription count/ trial count
NOTE: The tricky part is that the subscription event should be attributed to the month in which the company started the trial.
| id | date | type |
|-------|------------|-------|
| 10001 | 2019-01-01 | Trial |
| 10001 | 2019-01-15 | Sub |
| 10002 | 2019-01-20 | Trial |
| 10002 | 2019-02-10 | Sub |
| 10003 | 2019-01-01 | Trial |
| 10004 | 2019-02-10 | Trial |
Based on the above table, the output should be:
2019-01-01 2/3
2019-02-01 0/1
One option is a self-join to identify whether each trial eventually subscribed, then aggregation and arithmetic:
select
date_trunc('month', t.date) date_month,
1.0 * count(s.id) / count(t.id) rate
from mytable t
left join mytable s on s.id = t.id and s.type = 'Sub'
where t.type = 'Trial'
group by date_trunc('month', t.date)
The syntax to truncate a date to the beginning of the month widely varies across databases. The above would work in Postgres. Alternatives are available in other databases, such as:
date_format(t.date, '%Y-%m-01') -- MySQL
trunc(t.date, 'mm') -- Oracle
datefromparts(year(t.date), month(t.date), 1) -- SQL Server
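As a quick check of the self-join approach against the sample data from the question, here is a sketch that runs it in SQLite through Python's sqlite3 module. SQLite has no date_trunc, so strftime('%Y-%m-01', ...) is used as the month truncation; the extra subs and trials columns are added only to make the ratio visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, date TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (10001, "2019-01-01", "Trial"),
        (10001, "2019-01-15", "Sub"),
        (10002, "2019-01-20", "Trial"),
        (10002, "2019-02-10", "Sub"),
        (10003, "2019-01-01", "Trial"),
        (10004, "2019-02-10", "Trial"),
    ],
)

# Each trial row is matched to its own Sub row (if any); the month always
# comes from the trial side, so a February Sub still counts for January.
query = """
    SELECT strftime('%Y-%m-01', t.date) AS date_month,
           COUNT(s.id) AS subs,
           COUNT(t.id) AS trials,
           1.0 * COUNT(s.id) / COUNT(t.id) AS rate
    FROM mytable t
    LEFT JOIN mytable s ON s.id = t.id AND s.type = 'Sub'
    WHERE t.type = 'Trial'
    GROUP BY strftime('%Y-%m-01', t.date)
    ORDER BY date_month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

January comes out as 2 subscriptions over 3 trials and February as 0 over 1, matching the expected output in the question.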
You can do this with window functions. Assuming that there are not duplicate trial/subs:
select date_trunc('month', date) as yyyymm,
count(*) filter (where num_subs > 0) * 1.0 / count(*)
from (select t.*,
count(*) filter (where type = 'Sub') over (partition by id) as num_subs
from t
) t
where type = 'Trial'
group by yyyymm;
If an id can have duplicate trials or subs, then I suggest that you ask a new question with more detail about the duplicates.
You can also do this with two levels of aggregation:
select date_trunc('month', trial_date) as trial_month,
count(sub_date) * 1.0 / count(*)
from (select id, min(date) filter (where type = 'Trial') as trial_date,
min(date) filter (where type = 'Sub') as sub_date
from t
group by id
) id
group by date_trunc('month', trial_date);

Query to Show Max Sales for Each Seller

I have a table like this (sample):
Name_Seller Month Value
---------------------------
Seller A Jan 200
Seller B Jan 100
Seller A Fev 300
Seller B Fev 100
Seller C Jan 400
Seller A Mar 200
Seller D Jan 300
SQL query:
SELECT Name_Seller, Month, Value
FROM SALES
WHERE Value = (SELECT MAX(Value) FROM SALES GROUP BY Name_Seller);
And I'd like to print for each seller which was his maximum sale and when it was.
Could you help me fix my query and explain why it does not work?
I tried:
select name_seller, month, max(value)
from sales
group by name_seller, month;
but this query returns:
+---------------+------------+------+
| NAME_SELLER | MAX(VALUE) | MONTH|
+---------------+------------+------+
| SELLER A | 4182.00 | Jan |
| SELLER A | 3261.00 | Fev |
| SELLER A | 4219.00 | Mar |
| SELLER B | 2123.00 | Jan |
| SELLER B | 2111.00 | Fev |
| SELLER B | 3918.00 | Mar |
| SELLER C | 3000.00 | Jan |
| SELLER C | 4000.00 | Fev |
| SELLER C | 1500.00 | Mar |
| SELLER D | 2819.00 | Jan |
| SELLER D | 3881.00 | Fev |
| SELLER D | 2012.00 | Mar |
+---------------+------------+------+
And I would like just THE TOP sale of each salesman and when it was.
So it should return just one sale for each salesman.
With ROW_NUMBER() window function:
SELECT t.Name_Seller, t.Month, t.Value
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name_Seller ORDER BY Value DESC) rn
FROM SALES
) t
WHERE t.rn = 1
Replace ROW_NUMBER() with RANK() if you want ties returned.
Or with a correlated subquery in the WHERE clause:
SELECT s.* FROM SALES s
WHERE s.Value = (SELECT MAX(VALUE) FROM SALES WHERE Name_Seller = s.Name_Seller)
Or if your database supports it:
SELECT * FROM SALES
WHERE (Name_Seller, Value) IN (SELECT Name_Seller, MAX(VALUE) FROM SALES GROUP BY Name_Seller)
Your query can also return the results, but the "=" operator in the WHERE clause needs to change to "IN", because the subquery returns more than one row. Your sample data happens to give the correct results, but be wary of using this in general: it compares each sale against the whole set of maximum values, so it also returns a row whose value merely equals another seller's maximum, as pointed out by @forpas.
With the operator changed, your query will work:
SELECT Name_Seller, Month, Value FROM SALES
WHERE Value IN (Select MAX(Value) FROM SALES GROUP BY Name_Seller);
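To see where the IN version goes wrong, here is a small sketch in SQLite via Python's sqlite3 module. It uses the sample rows from the question plus one hypothetical extra row (Seller C, Fev, 300), added only to trigger the problem: 300 is the maximum of Seller A and Seller D, but not of Seller C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES (Name_Seller TEXT, Month TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?)",
    [
        ("Seller A", "Jan", 200),
        ("Seller B", "Jan", 100),
        ("Seller A", "Fev", 300),
        ("Seller B", "Fev", 100),
        ("Seller C", "Jan", 400),
        ("Seller A", "Mar", 200),
        ("Seller D", "Jan", 300),
        ("Seller C", "Fev", 300),  # hypothetical row, not in the question
    ],
)

# Uncorrelated: a row matches if its value equals ANY seller's maximum.
in_rows = conn.execute("""
    SELECT Name_Seller, Month, Value FROM SALES
    WHERE Value IN (SELECT MAX(Value) FROM SALES GROUP BY Name_Seller)
""").fetchall()

# Correlated: a row matches only if its value equals ITS OWN seller's maximum.
correlated_rows = conn.execute("""
    SELECT s.Name_Seller, s.Month, s.Value FROM SALES s
    WHERE s.Value = (SELECT MAX(Value) FROM SALES WHERE Name_Seller = s.Name_Seller)
""").fetchall()

# Rows returned by the IN version that are not really a seller's top sale.
wrong_rows = set(in_rows) - set(correlated_rows)
print(wrong_rows)
```

The only difference between the two result sets is the extra Seller C row, which is why the correlated form (or the row-value IN with Name_Seller included) is the safer choice.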
You can also use the rank() window function
SELECT Name_Seller, Month, VALUE
FROM (SELECT Name_Seller, Month, VALUE,
RANK() OVER (PARTITION BY Name_Seller ORDER BY VALUE DESC ) as RN
FROM SALES
) A
WHERE A.RN = 1
It will look like this:
SELECT Name_Seller, Month, MAX(Value)
FROM SALES
GROUP BY Name_Seller, Month;
You can use the query below to get the maximum value per seller:
select name_seller, max(value) from sales group by name_seller;
If you need the month as well, join that result back to the table:
select s1.name_seller, s1.month, s1.value from sales s1
inner join
(select name_seller, max(value) as value from sales
group by name_seller) s2
on (s1.name_seller = s2.name_seller and s1.value = s2.value);

I'm getting unexpected results when I use SUM(CASE WHEN .... THEN "column_name") in sqlite3

I am trying to find the number of orders I got in the month of April. I have 3 orders but my query gets the result 0. What could be the problem?
Here's the table:
id | first | middle | last | product_name | numberOut | Date
1 | Muhammad | Sameer | Khan | Macbook | 1 | 2020-04-01
2 | Chand | Shah | Khurram | Dell Optiplex | 1 | 2020-04-02
3 | Sultan | | Chohan | HP EliteBook | 1 | 2020-03-31
4 | Express | Eva | Plant | Dell Optiplex | 1 | 2020-03-11
5 | Rana | Faryad | Ali | HP EliteBook | 1 | 2020-04-02
And here's the query:
SELECT SUM(CASE WHEN strftime('%m', oDate) = '04' THEN 'id' END) FROM orders;
If you want all Aprils, then you can just look at the month. For April 2020 specifically, I would recommend:
select count(*)
from orders o
where o.oDate >= '2020-04-01' and o.oDate < '2020-05-01';
Note that this compares the date column directly to valid date literals in the where clause.
The problem with your code is this:
THEN 'id'
You are using the aggregate function SUM() over the string literal 'id'. SQLite implicitly converts it to 0, because it cannot be converted to a number, so the result is 0.
Even if you remove the single quotes, you will not get the result you want, because you would then get the sum of the ids.
But if you used:
THEN 1 ELSE 0
then you would get the correct result.
But with SQLite you can write it simpler:
SELECT SUM(strftime('%m', oDate) = '04') FROM orders;
without the CASE expression.
Or since you just want to count the orders then COUNT() will do it:
SELECT COUNT(*) FROM orders WHERE strftime('%m', oDate) = '04';
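The behaviour described above can be reproduced with Python's sqlite3 module. This is a sketch on a cut-down version of the question's table; only the id, product_name and oDate columns are kept, which is an assumption about what matters here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, product_name TEXT, oDate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "Macbook", "2020-04-01"),
        (2, "Dell Optiplex", "2020-04-02"),
        (3, "HP EliteBook", "2020-03-31"),
        (4, "Dell Optiplex", "2020-03-11"),
        (5, "HP EliteBook", "2020-04-02"),
    ],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


is_april = "strftime('%m', oDate) = '04'"

# Summing the string literal 'id': each April row contributes 0.
sum_literal = scalar(f"SELECT SUM(CASE WHEN {is_april} THEN 'id' END) FROM orders")
# Summing the id column: adds up 1 + 2 + 5, which is not a count either.
sum_ids = scalar(f"SELECT SUM(CASE WHEN {is_april} THEN id END) FROM orders")
# The three equivalent ways of counting April orders.
sum_case = scalar(f"SELECT SUM(CASE WHEN {is_april} THEN 1 ELSE 0 END) FROM orders")
sum_bool = scalar(f"SELECT SUM({is_april}) FROM orders")
count_where = scalar(f"SELECT COUNT(*) FROM orders WHERE {is_april}")

print(sum_literal, sum_ids, sum_case, sum_bool, count_where)
```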
Edit.
If you want to count the orders for all the months then group by month:
SELECT strftime('%Y-%m', oDate) AS month,
COUNT(*) AS number_of_orders
FROM orders
GROUP BY month;
If you need to use SUM:
SELECT SUM(CASE WHEN strftime('%m', oDate) = '04' THEN 1 ELSE 0 END) FROM orders;
The problem is in your query: you do not need SUM here, a COUNT with a filter is enough.
SELECT COUNT(*) from table_name WHERE strftime('%m', Date) = '04';
I would use explicit date comparisons rather than date functions. This makes the query SARGable, i.e. it may benefit from an existing index.
The most efficient approach, with a filter in the where clause:
select count(*) cnt
from orders
where oDate >= '2020-04-01' and oDate < '2020-05-01'
Alternatively, you can do conditional aggregation, as you originally intended. This is handy when you want several such counts from a single pass over the table:
select sum(case when oDate >= '2020-04-01' and oDate < '2020-05-01' then 1 else 0 end) cnt
from orders

Select row that has max total value SQL Server

I have the following scheme (2 tables):
Customer (Id, Name) and
Sale (Id, CustomerId, Date, Sum)
How to select the following data ?
1) Best customer of all time (Customer, which has Max Total value in the Sum column)
For example, I have 2 tables (Customers and Sales respectively):
id CustomerName
---|--------------
1 | First
2 | Second
3 | Third
id CustomerId datetime Sum
---|----------|------------|-----
1 | 1 | 04/06/2013 | 50
2 | 2 | 04/06/2013 | 60
3 | 3 | 04/07/2013 | 30
4 | 1 | 03/07/2013 | 50
5 | 1 | 03/08/2013 | 50
6 | 2 | 03/08/2013 | 30
7 | 3 | 24/09/2013 | 20
Desired result:
CustomerName TotalSum
------------|--------
First | 150
2) Best customer of each month in the current year (the same as previous but for each month in the current year)
Thanks.
Try this for the best customer of all time:
SELECT Top 1 WITH TIES c.CustomerName, SUM(s.SUM) AS TotalSum
FROM Customer c JOIN Sales s ON s.CustomerId = c.Id
GROUP BY c.Id, c.CustomerName
ORDER BY SUM(s.SUM) DESC
One option is to use RANK() combined with the SUM aggregate. This will get you the overall values.
select customername, sumtotal
from (
select c.customername,
sum(s.sum) sumtotal,
rank() over (order by sum(s.sum) desc) rnk
from customer c
join sales s on c.id = s.customerid
group by c.id, c.customername
) t
where rnk = 1
Grouping this by month and year should be trivial at that point.
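For the per-month part, one possible sketch (run in SQLite through Python's sqlite3 module, not SQL Server) totals sales per customer and month, then keeps the customers whose total equals that month's maximum. The column names sale_date and amount are hypothetical replacements for Date and Sum, the sample dates are assumed to be dd/mm/yyyy and are stored in ISO format, and the "current year" filter is left out:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE sale (id INTEGER, customer_id INTEGER, sale_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (3, "Third")],
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2013-06-04", 50),
        (2, 2, "2013-06-04", 60),
        (3, 3, "2013-07-04", 30),
        (4, 1, "2013-07-03", 50),
        (5, 1, "2013-08-03", 50),
        (6, 2, "2013-08-03", 30),
        (7, 3, "2013-09-24", 20),
    ],
)

# Step 1 (CTE): total per customer per month.
# Step 2: keep rows whose total is the maximum for their month (ties are kept).
query = """
    WITH monthly AS (
        SELECT strftime('%Y-%m', s.sale_date) AS month,
               c.name AS name,
               SUM(s.amount) AS total
        FROM sale s
        JOIN customer c ON c.id = s.customer_id
        GROUP BY strftime('%Y-%m', s.sale_date), c.id, c.name
    )
    SELECT m.month, m.name, m.total
    FROM monthly m
    WHERE m.total = (SELECT MAX(total) FROM monthly WHERE month = m.month)
    ORDER BY m.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In SQL Server the same idea can be written by adding the month to the GROUP BY and to a PARTITION BY inside the RANK() shown above.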

Sql: Calc average times a customers ordered a product in a period

How would you calculate how many times, on average, a product is sold per week, month, or year?
I'm not interested in the Amount, only in how many times a customer has bought a given product.
OrderLine
OrderNo | ProductNo | Amount |
----------------------------------------
1 | 1 | 10 |
1 | 4 | 2 |
2 | 1 | 2 |
3 | 1 | 4 |
Order
OrderNo | OrderDate
----------------------------------------
1 | 2012-02-21
2 | 2012-02-22
3 | 2012-02-25
This is the output I'm looking for
ProductNo | Average Orders a Week | Average Orders a month |
------------------------------------------------------------
1 | 3 | 12 |
2 | 5 | 20 |
You would first have to pre-query the data, grouped and counted for each averaging period you want. To keep different years apart, I would include the year of the transaction in the grouping, so that sales in January 2010, 2011 and 2012 count as three distinct months (and week 1 of each year as three distinct weeks) rather than being merged into a single period.
The following could be done if you are using MySQL:
select
PreCount.ProductNo,
PreCount.TotalCount / PreCount.CountOfYrWeeks as AvgPerWeek,
PreCount.TotalCount / PreCount.CountOfYrMonths as AvgPerMonth,
PreCount.TotalCount / PreCount.CountOfYears as AvgPerYear
from
( select
OL.ProductNo,
count(*) TotalCount,
count( distinct YEARWEEK( O.OrderDate ) ) as CountOfYrWeeks,
count( distinct Date_Format( O.OrderDate, "%Y%M" )) as CountOfYrMonths,
count( distinct Year( O.OrderDate )) as CountOfYears
from
OrderLine OL
JOIN `Order` O
on OL.OrderNo = O.OrderNo
group by
OL.ProductNo ) PreCount
This is a copy of DRapp's answer, but coded for SQL Server (it's too big for a comment!)
-- 1.0 * forces decimal division; SQL Server truncates integer / integer
SELECT PreCount.ProductNo,
1.0 * PreCount.TotalCount / PreCount.CountOfYrWeeks AS AvgPerWeek,
1.0 * PreCount.TotalCount / PreCount.CountOfYrMonths AS AvgPerMonth,
1.0 * PreCount.TotalCount / PreCount.CountOfYears AS AvgPerYear
FROM (SELECT OL.ProductNo,
Count(*) TotalCount,
-- combine year and week/month so the same period in different years stays distinct
Count(DISTINCT Year(O.OrderDate) * 100 + Datepart(wk, O.OrderDate)) AS CountOfYrWeeks,
Count(DISTINCT Year(O.OrderDate) * 100 + Datepart(mm, O.OrderDate)) AS CountOfYrMonths,
Count(DISTINCT Year(O.OrderDate)) AS CountOfYears
FROM OrderLine OL JOIN [Order] O
ON OL.OrderNo = O.OrderNo
GROUP BY OL.ProductNo) PreCount
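The same pre-count idea can be tried on the question's sample data with Python's sqlite3 module. This is only a sketch: the table names order_line and orders are hypothetical (chosen to avoid the reserved word Order), and strftime('%Y-%W', ...) and strftime('%Y-%m', ...) play the role of the year-week and year-month keys:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, order_date TEXT)")
conn.execute(
    "CREATE TABLE order_line (order_no INTEGER, product_no INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2012-02-21"), (2, "2012-02-22"), (3, "2012-02-25")],
)
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 4, 2), (2, 1, 2), (3, 1, 4)],
)

# Orders per product divided by the number of distinct weeks / months
# in which that product was ordered at least once.
query = """
    SELECT ol.product_no,
           1.0 * COUNT(*) / COUNT(DISTINCT strftime('%Y-%W', o.order_date)) AS avg_per_week,
           1.0 * COUNT(*) / COUNT(DISTINCT strftime('%Y-%m', o.order_date)) AS avg_per_month
    FROM order_line ol
    JOIN orders o ON o.order_no = ol.order_no
    GROUP BY ol.product_no
    ORDER BY ol.product_no
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that this averages only over periods that contain at least one order for the product; weeks or months with no sales are not counted in the denominator, which may or may not be what you want.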