SQL count total number of days by customer - sql

I have a table customer with two columns: customer_id, and a date column named order_date that records the dates on which each customer purchased a product. I want to count how many days each customer came in and made a purchase. I tried the following, but only got an error message saying sum(date) doesn't exist.
select customer_id, sum(order_date)
from customer;
How can I do this correctly?
---- Edit, adding the query to create table:
CREATE TABLE sales (
"customer_id" VARCHAR(1),
"order_date" DATE
);
INSERT INTO sales
("customer_id", "order_date")
VALUES
('A', '2021-01-01'),
('A', '2021-01-01'),
('A', '2021-01-07'),
('A', '2021-01-10'),
('A', '2021-01-11'),
('A', '2021-01-11'),
('B', '2021-01-01'),
('B', '2021-01-02'),
('B', '2021-01-04'),
('B', '2021-01-11'),
('B', '2021-01-16'),
('B', '2021-02-01'),
('C', '2021-01-01'),
('C', '2021-01-01'),
('C', '2021-01-07');

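SUM adds values together and isn't defined for dates; what you need is a count of the distinct purchase dates per customer. You'll want just this: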
SELECT
customer_id,
COUNT( DISTINCT "order_date" ) AS count_days_they_bought_something
FROM
sales
GROUP BY
customer_id
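To check the answer against the sample rows, here is a minimal sketch that runs the same COUNT(DISTINCT ...) query in an in-memory SQLite database from Python. SQLite is only a stand-in for whatever database the question uses, and the function name count_purchase_days is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question's INSERT statement.
SALES = [
    ("A", "2021-01-01"), ("A", "2021-01-01"), ("A", "2021-01-07"),
    ("A", "2021-01-10"), ("A", "2021-01-11"), ("A", "2021-01-11"),
    ("B", "2021-01-01"), ("B", "2021-01-02"), ("B", "2021-01-04"),
    ("B", "2021-01-11"), ("B", "2021-01-16"), ("B", "2021-02-01"),
    ("C", "2021-01-01"), ("C", "2021-01-01"), ("C", "2021-01-07"),
]

def count_purchase_days():
    """Return (customer_id, distinct purchase days) pairs, ordered by customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer_id VARCHAR(1), order_date DATE)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", SALES)
    rows = conn.execute(
        """
        SELECT customer_id, COUNT(DISTINCT order_date) AS days
        FROM sales
        GROUP BY customer_id
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return rows

print(count_purchase_days())  # [('A', 4), ('B', 6), ('C', 2)]
```

Customer A has six rows but only four distinct dates, which is why DISTINCT matters here.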

Related

Need to get Average count

I would like to get the average count of product = A that a client has. Say the inner select returns 1, 2, 1, 4, 4, 4 for 6 clients.
I would like to see the result as 4, which means the avg product count a client can have is 4.
Can somebody please confirm the following?
E.g.
Select avg(count)
From (
Select count(*) as count
From Table1
Where product = A
Group by client)
as counts
Having sample data is important to getting assistance. It's still difficult to determine how your data looks. Let's assume it looks like this:
create table table1 (
client varchar(10),
product varchar(10)
);
insert into table1 values
('xxx', 'A'),
('bbb', 'A'),
('bbb', 'A'),
('ccc', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('tt', 'A'),
('tt', 'A'),
('tt', 'A'),
('tt', 'A'),
('bdad', 'A'),
('bdad', 'A'),
('bdad', 'A'),
('bdad', 'A');
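I don't have access to a DB2 database, but this query works as written in DBMSs that support LIMIT (MySQL, PostgreSQL, SQLite). You may need to tweak it for DB2, for example by replacing limit 1 with fetch first 1 row only. Note that the 4 you expect from 1, 2, 1, 4, 4, 4 is the most common value (the mode), not the arithmetic average, so the query returns the most frequent count.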
select purchased as most_common_value
from (
select client, count(*) as purchased
from table1
where product = 'A'
group by client
)z
group by purchased
order by count(client) desc
limit 1
Output of query is:
most_common_value
4
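The difference between the most common count and the arithmetic average can be seen by running both against the assumed sample data. This is a sketch using Python's sqlite3 as a stand-in for DB2; the function name mode_and_average is hypothetical.

```python
import sqlite3

# Rows per client for product 'A', matching the assumed sample data:
# xxx=1, bbb=2, ccc=1, ddd=4, tt=4, bdad=4.
CLIENT_ROWS = {"xxx": 1, "bbb": 2, "ccc": 1, "ddd": 4, "tt": 4, "bdad": 4}

def mode_and_average():
    """Return (most common per-client count, average per-client count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (client VARCHAR(10), product VARCHAR(10))")
    for client, n in CLIENT_ROWS.items():
        conn.executemany(
            "INSERT INTO table1 VALUES (?, 'A')", [(client,)] * n
        )
    # Most common count: group the per-client counts and keep the largest group.
    mode = conn.execute(
        """
        SELECT purchased
        FROM (SELECT client, COUNT(*) AS purchased
              FROM table1 WHERE product = 'A' GROUP BY client) z
        GROUP BY purchased
        ORDER BY COUNT(client) DESC
        LIMIT 1
        """
    ).fetchone()[0]
    # Arithmetic average of the same per-client counts.
    average = conn.execute(
        """
        SELECT AVG(purchased)
        FROM (SELECT COUNT(*) AS purchased
              FROM table1 WHERE product = 'A' GROUP BY client) z
        """
    ).fetchone()[0]
    conn.close()
    return mode, average

print(mode_and_average())
```

The mode is 4, while the average of 1, 2, 1, 4, 4, 4 is 16/6, roughly 2.67, so the AVG query from the question would not produce the 4 the asker expects.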

creating nested array presto

I have this table:
with cte (customer_id, product, sell) as (
values
(1, 'a', 100),
(1, 'b', 150),
(2, 'a', 90),
(2, 'b', 110)
)
select * from cte
I want a result like the following:
+----------------------------------------------------------+
| result |
+----------------------------------------------------------+
| {1: {"a": 100, "b": 150}, 2: {"a":90, "b": 110}} |
+----------------------------------------------------------+
Your result is not a nested array but a nested map. Unless this is part of some bigger query, it is quite strange to map a whole table to a single row, especially given the size of data usually handled by Athena. For this test data, though, you can use map_agg and nested grouping:
with cte (customer_id, product, sell) as (
values (1, 'a', 100),
(1, 'b', 150),
(2, 'a', 90),
(2, 'b', 110)
)
select map_agg(customer_id, m) as result
from (
select customer_id, map_agg(product, sell) m
from cte
group by customer_id
)
group by true -- fake grouping
Output:
result
{1={a=100, b=150}, 2={a=90, b=110}}
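To make the two aggregation levels concrete, here is the same nesting sketched in plain Python rather than Presto: the inner map_agg corresponds to building one product-to-sell dictionary per customer, and the outer map_agg corresponds to collecting those dictionaries under customer_id. The function name nested_map is made up for this illustration.

```python
# Rows from the question's CTE: (customer_id, product, sell).
ROWS = [
    (1, "a", 100),
    (1, "b", 150),
    (2, "a", 90),
    (2, "b", 110),
]

def nested_map(rows):
    """Group rows into {customer_id: {product: sell}}."""
    result = {}
    for customer_id, product, sell in rows:
        # Inner level: one product -> sell map per customer.
        # Outer level: those maps keyed by customer_id.
        result.setdefault(customer_id, {})[product] = sell
    return result

print(nested_map(ROWS))  # {1: {'a': 100, 'b': 150}, 2: {'a': 90, 'b': 110}}
```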

Nested case statement with different conditions in T-SQL

I have the data below:
CREATE TABLE #EmployeeData
(
EmpID INT,
Designation VARCHAR(100),
Grade CHAR(1)
)
INSERT INTO #EmployeeData (EmpID, Designation, Grade)
VALUES (1, 'TeamLead', 'A'),
(2, 'Manager', 'B'),
(3, 'TeamLead', 'B'),
(4, 'SeniorTeamLead', 'A'),
(5, 'TeamLead', 'C'),
(6, 'Manager', 'C'),
(7, 'TeamLead', 'D'),
(8, 'SeniorTeamLead', 'B')
SELECT Designation,CASE WHEN COUNT(DISTINCT GRADE)>1 THEN 'MultiGrade' ELSE Grade END FROM
#EmployeeData
GROUP BY Designation
Desired result:
Designation      Grade
--------------------------
Manager          MultiGrade
TeamLead         MultiGrade
SeniorTeamLead   A
Note:
If a designation has more than one grade, it is MultiGrade.
If it has a single grade, show that particular grade.
If the grades are a combination of A and B, it should be A only.
I tried with a query using case but I get this error:
Column '#EmployeeData.Grade' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Can anyone suggest the query to fetch the desired result?
As the error says, you need to aggregate the columns you are not grouping by. So use MAX and MIN (as Jeroen commented).
SELECT Designation
, CASE
WHEN MAX(Grade) = 'B' AND MIN(Grade) = 'A' THEN 'A'
WHEN MAX(Grade) <> MIN(Grade) THEN 'MultiGrade'
ELSE MIN(Grade)
END AS Grade
FROM #EmployeeData
GROUP BY Designation
ORDER BY Designation;
Your real world situation might be more complex, but the same principle applies.
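As a quick check of the MIN/MAX approach, the sketch below runs the same CASE expression in an in-memory SQLite database from Python. SQLite stands in for SQL Server here, so the temp-table name #EmployeeData becomes a plain EmployeeData table; the function name grade_per_designation is hypothetical.

```python
import sqlite3

# Sample rows from the question: (EmpID, Designation, Grade).
EMPLOYEES = [
    (1, "TeamLead", "A"), (2, "Manager", "B"), (3, "TeamLead", "B"),
    (4, "SeniorTeamLead", "A"), (5, "TeamLead", "C"), (6, "Manager", "C"),
    (7, "TeamLead", "D"), (8, "SeniorTeamLead", "B"),
]

def grade_per_designation():
    """Return (Designation, Grade label) pairs, ordered by Designation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EmployeeData "
        "(EmpID INT, Designation VARCHAR(100), Grade CHAR(1))"
    )
    conn.executemany("INSERT INTO EmployeeData VALUES (?, ?, ?)", EMPLOYEES)
    rows = conn.execute(
        """
        SELECT Designation,
               CASE
                   -- Only grades A and B present: report A.
                   WHEN MAX(Grade) = 'B' AND MIN(Grade) = 'A' THEN 'A'
                   -- Any other mix of grades.
                   WHEN MAX(Grade) <> MIN(Grade) THEN 'MultiGrade'
                   -- A single grade: MIN and MAX are the same value.
                   ELSE MIN(Grade)
               END AS Grade
        FROM EmployeeData
        GROUP BY Designation
        ORDER BY Designation
        """
    ).fetchall()
    conn.close()
    return rows

for row in grade_per_designation():
    print(row)
```

Every reference to Grade sits inside an aggregate, which is what removes the "not contained in either an aggregate function or the GROUP BY clause" error.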

Running total over duplicate column values and no other columns

I want to compute a running total, but there is no unique column or id column to use in the OVER clause.
CREATE TABLE piv2([name] varchar(5), [no] int);
INSERT INTO piv2
([name], [no])
VALUES
('a', 1),
('a', 2),
('a', 3),
('a', 4),
('b', 1),
('b', 2),
('b', 3);
There are only 2 columns: name, which has duplicate values, and no, on which I want to compute the running total in SQL Server 2017.
expected result:
a 1
a 3
a 6
a 10
b 11
b 13
b 16
Any help?
The following query would generate the output you expect, at least for the exact sample data you showed us:
SELECT
name,
SUM(no) OVER (ORDER BY name, no) AS no_sum
FROM piv2;
If the order you intend to use for the rolling sum is something other than the order given by the name and no columns, then you should reveal that logic along with sample data.
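A minimal sketch of the windowed SUM, run in an in-memory SQLite database from Python as a stand-in for SQL Server 2017. One addition that is not in the answer above: the frame is spelled out as ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With only ORDER BY, the default frame is RANGE-based, so two rows with identical name and no values would be treated as peers and receive the same total. The function name running_total is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (name, no).
PIV2 = [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("b", 1), ("b", 2), ("b", 3)]

def running_total():
    """Return (name, running sum of no) ordered by name, then no."""
    conn = sqlite3.connect(":memory:")
    # Column names are quoted because they can clash with SQL keywords.
    conn.execute('CREATE TABLE piv2 ("name" VARCHAR(5), "no" INT)')
    conn.executemany("INSERT INTO piv2 VALUES (?, ?)", PIV2)
    rows = conn.execute(
        """
        SELECT "name",
               SUM("no") OVER (
                   ORDER BY "name", "no"
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS no_sum
        FROM piv2
        ORDER BY "name", "no"
        """
    ).fetchall()
    conn.close()
    return rows

for row in running_total():
    print(row)
```

For the sample data the explicit ROWS frame and the default frame give the same totals, because no (name, no) pair repeats.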

SQL Select: Do rows matching id all have the same column value

I have a table like this
sub_id reference
1 A
1 A
1 A
1 A
1 A
1 A
1 C
2 B
2 B
3 D
3 D
I want to make sure all the rows in each sub_id group have the same reference.
Meaning, for example, all references in:
group 1 should be A
group 2 should be B
group 3 should be D
If they do not, I would like a list of the offending sub_ids returned.
So for the table above my result would be: 1
Ideally, under these conditions reference would live in a separate table with sub_id as PK, but I first need to fix this for a massive dataset before I can move on to restructuring the database.
You could use the following method:
select t.sub_id
from YourTable t
group by t.sub_id
having max(t.reference) <> min(t.reference)
Change YourTable to suit.
Are you looking for a simple aggregation?
select sub_id
from YourTable t
group by sub_id
having count(distinct reference) > 1;
The query you want:
SELECT sub_id
FROM test_sub
GROUP BY sub_id HAVING count(DISTINCT reference) > 1
;
Here is what I used to test it:
CREATE TABLE `test_sub` (
sub_id int(11) NOT NULL,
reference varchar(45) DEFAULT NULL
);
INSERT INTO test_sub (sub_id, reference) VALUES
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'C'),
(2, 'B'),
(2, 'B'),
(3, 'D'),
(3, 'D'),
(3, 'D'),
(4, 'E'),
(4, 'E'),
(4, 'E'),
(5, 'F'),
(5, 'G')
;
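The two HAVING variants suggested above can be compared on this test data with a short sketch, again using an in-memory SQLite database from Python as a stand-in for MySQL. The function name inconsistent_sub_ids is hypothetical.

```python
import sqlite3

# Test rows from the INSERT statement above: (sub_id, reference).
TEST_SUB = [
    (1, "A"), (1, "A"), (1, "A"), (1, "A"), (1, "C"),
    (2, "B"), (2, "B"),
    (3, "D"), (3, "D"), (3, "D"),
    (4, "E"), (4, "E"), (4, "E"),
    (5, "F"), (5, "G"),
]

def inconsistent_sub_ids(having_clause):
    """Return the sub_ids whose rows do not all share one reference."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_sub (sub_id INT NOT NULL, reference VARCHAR(45))"
    )
    conn.executemany("INSERT INTO test_sub VALUES (?, ?)", TEST_SUB)
    # having_clause is a fixed string chosen below, not user input.
    rows = conn.execute(
        "SELECT sub_id FROM test_sub GROUP BY sub_id "
        "HAVING " + having_clause + " ORDER BY sub_id"
    ).fetchall()
    conn.close()
    return [sub_id for (sub_id,) in rows]

by_distinct = inconsistent_sub_ids("COUNT(DISTINCT reference) > 1")
by_min_max = inconsistent_sub_ids("MAX(reference) <> MIN(reference)")
print(by_distinct, by_min_max)  # [1, 5] [1, 5]
```

Both conditions ignore NULL references, so a group containing one value plus NULLs is not reported by either variant.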