Determine the person with the fewest entries - sql

This seems quite simple, but I have not found a solution yet. I would like to determine from a table the person who has the fewest entries. If there are several, I just want to limit the result to the TOP 1.
In this example it would be Person 2 (or Person 5) with the fewest entries.
Id, Person
1, Person 1
2, Person 3
3, Person 4
4, Person 1
5, Person 1
6, Person 3
7, Person 2
8, Person 5
9, Person 6

Use GROUP BY and ORDER BY:
select top (1) person
from t
group by person
order by count(*);
The question specifically asks for one row in the result set. If you want all of the tied rows, use top (1) with ties.
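To see which rows count as ties, the question's sample data can be loaded into a throwaway database. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of SQL Server, and because SQLite has no TOP ... WITH TIES, it emulates that clause by comparing each person's count with the minimum count. The table name t follows the answer above; the helper name is made up for this example.

```python
import sqlite3

# Sample rows from the question: (Id, Person).
ROWS = [
    (1, "Person 1"), (2, "Person 3"), (3, "Person 4"),
    (4, "Person 1"), (5, "Person 1"), (6, "Person 3"),
    (7, "Person 2"), (8, "Person 5"), (9, "Person 6"),
]

def people_with_fewest_entries(conn):
    """Return every person tied for the lowest row count (hypothetical helper)."""
    sql = """
        SELECT person
        FROM t
        GROUP BY person
        HAVING COUNT(*) = (
            SELECT MIN(cnt)
            FROM (SELECT COUNT(*) AS cnt FROM t GROUP BY person)
        )
        ORDER BY person
    """
    return [row[0] for row in conn.execute(sql)]

# SQLite in-memory database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, person TEXT)")
conn.executemany("INSERT INTO t (id, person) VALUES (?, ?)", ROWS)

tied = people_with_fewest_entries(conn)
print(tied)  # ['Person 2', 'Person 4', 'Person 5', 'Person 6']
```

With this data four people have exactly one entry, so a plain top (1) returns an arbitrary one of them unless the ORDER BY includes a tie-breaker.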

Use aggregation and TOP:
SELECT TOP 1 person
FROM mytable
GROUP BY person
ORDER BY COUNT(*), person
This returns a single row with the name of the person who has the fewest entries. If there are ties, the person whose name sorts first alphabetically is returned.
Demo on DB Fiddle with your sample data:
| person |
| :------- |
| Person 2 |

You can also use
SELECT TOP 1 Person -- You can add WITH TIES too
FROM
(
SELECT Person, COUNT(Person) Cnt
FROM
(
VALUES
(1, 'Person 1'),
(2, 'Person 3'),
(3, 'Person 4'),
(4, 'Person 1'),
(5, 'Person 1'),
(6, 'Person 3'),
(7, 'Person 2'),
(8, 'Person 5'),
(9, 'Person 6')
) T(ID, Person)
GROUP BY Person
) TT
ORDER BY Cnt;
You could also use OFFSET and FETCH if you are on SQL Server 2012 or later:
SELECT Person
FROM
(
SELECT Person, COUNT(Person) Cnt
FROM
(
VALUES
(1, 'Person 1'),
(2, 'Person 3'),
(3, 'Person 4'),
(4, 'Person 1'),
(5, 'Person 1'),
(6, 'Person 3'),
(7, 'Person 2'),
(8, 'Person 5'),
(9, 'Person 6')
) T(ID, Person)
GROUP BY Person
) TT
ORDER BY Cnt
OFFSET 0 ROWS
FETCH NEXT 1 ROWS ONLY;

Related

Need to get Average count

I would like to get the average count of product = A that a client has. Say the inner select returns 1, 2, 1, 4, 4, 4 for 6 clients.
I would like to see 4 as the result, meaning the avg product count a client can have is 4.
Can somebody please confirm the following?
E.g.
Select avg(count)
From (
Select count(*) as count
From Table1
Where product = A
Group by client)
as counts
Having sample data is important to getting assistance. It's still difficult to determine how your data looks. Let's assume it looks like this:
create table table1 (
client varchar(10),
product varchar(10)
);
insert into table1 values
('xxx', 'A'),
('bbb', 'A'),
('bbb', 'A'),
('ccc', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('ddd', 'A'),
('tt', 'A'),
('tt', 'A'),
('tt', 'A'),
('tt', 'A'),
('bdad', 'A'),
('bdad', 'A'),
('bdad', 'A'),
('bdad', 'A');
I don't have access to a DB2 database, but this query works on many DBMSs. You may need to tweak it for DB2, for example by replacing LIMIT 1 with FETCH FIRST 1 ROW ONLY.
select purchased as most_common_value
from (
select client, count(*) as purchased
from table1
where product = 'A'
group by client
)z
group by purchased
order by count(client) desc
limit 1
Output of query is:
most_common_value
4
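The distinction between the average and the most common count can be checked with a small script. The sketch below assumes SQLite (through Python's sqlite3 module) in place of DB2 and rebuilds the sample rows from the per-client counts 1, 2, 1, 4, 4, 4; the variable names are illustrative only.

```python
import sqlite3

# Per-client purchase counts of product 'A', matching the sample data above.
PURCHASES = {"xxx": 1, "bbb": 2, "ccc": 1, "ddd": 4, "tt": 4, "bdad": 4}
rows = [(client, "A") for client, n in PURCHASES.items() for _ in range(n)]

# SQLite in-memory database as a stand-in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (client TEXT, product TEXT)")
conn.executemany("INSERT INTO table1 (client, product) VALUES (?, ?)", rows)

# Arithmetic mean of the per-client counts: (1+2+1+4+4+4) / 6.
average = conn.execute("""
    SELECT AVG(purchased)
    FROM (SELECT COUNT(*) AS purchased
          FROM table1 WHERE product = 'A' GROUP BY client)
""").fetchone()[0]

# Most common per-client count (the mode), as in the query above.
most_common = conn.execute("""
    SELECT purchased
    FROM (SELECT client, COUNT(*) AS purchased
          FROM table1 WHERE product = 'A' GROUP BY client) z
    GROUP BY purchased
    ORDER BY COUNT(client) DESC
    LIMIT 1
""").fetchone()[0]

print(round(average, 2), most_common)  # 2.67 4
```

AVG gives roughly 2.67 for these counts, so the expected result of 4 corresponds to the most frequent value rather than the mean.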

SQL count total number of days by customer

I have a table customer with 2 columns: a customer_id column and a date column named order_date that records the dates on which customers purchased a product. Now I want to count on how many days each customer came in and made a purchase. I tried the following but only got an error message saying sum(date) doesn't exist.
select customer_id, sum(order_date)
from customer;
How can I do this correctly?
---- Edit, adding the query to create table:
CREATE TABLE sales (
"customer_id" VARCHAR(1),
"order_date" DATE
);
INSERT INTO sales
("customer_id", "order_date")
VALUES
('A', '2021-01-01'),
('A', '2021-01-01'),
('A', '2021-01-07'),
('A', '2021-01-10'),
('A', '2021-01-11'),
('A', '2021-01-11'),
('B', '2021-01-01'),
('B', '2021-01-02'),
('B', '2021-01-04'),
('B', '2021-01-11'),
('B', '2021-01-16'),
('B', '2021-02-01'),
('C', '2021-01-01'),
('C', '2021-01-01'),
('C', '2021-01-07');
You'll want just this:
SELECT
customer_id,
COUNT( DISTINCT "order_date" ) AS count_days_they_bought_something
FROM
sales
GROUP BY
customer_id
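The effect of DISTINCT is easiest to see next to a plain row count. The following sketch assumes SQLite via Python's sqlite3 module (the question does not depend on a specific engine here) and stores the dates as ISO strings; the column aliases order_rows and purchase_days are made up for the example.

```python
import sqlite3

# Sample rows from the question: (customer_id, order_date).
SALES = [
    ("A", "2021-01-01"), ("A", "2021-01-01"), ("A", "2021-01-07"),
    ("A", "2021-01-10"), ("A", "2021-01-11"), ("A", "2021-01-11"),
    ("B", "2021-01-01"), ("B", "2021-01-02"), ("B", "2021-01-04"),
    ("B", "2021-01-11"), ("B", "2021-01-16"), ("B", "2021-02-01"),
    ("C", "2021-01-01"), ("C", "2021-01-01"), ("C", "2021-01-07"),
]

# SQLite in-memory database used purely for illustration (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id TEXT, order_date TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", SALES)

# COUNT(*) counts rows; COUNT(DISTINCT order_date) counts separate days.
result = conn.execute("""
    SELECT customer_id,
           COUNT(*) AS order_rows,
           COUNT(DISTINCT order_date) AS purchase_days
    FROM sales
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

print(result)  # [('A', 6, 4), ('B', 6, 6), ('C', 3, 2)]
```

Customer A has six rows but only four distinct dates, which is why DISTINCT is needed to count days rather than purchases.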

Nested case statement with different conditions in T-SQL

I have below data
CREATE TABLE #EmployeeData
(
EmpID INT,
Designation VARCHAR(100),
Grade CHAR(1)
)
INSERT INTO #EmployeeData (EmpID, Designation, Grade)
VALUES (1, 'TeamLead', 'A'),
(2, 'Manager', 'B'),
(3, 'TeamLead', 'B'),
(4, 'SeniorTeamLead', 'A'),
(5, 'TeamLead', 'C'),
(6, 'Manager', 'C'),
(7, 'TeamLead', 'D'),
(8, 'SeniorTeamLead', 'B')
SELECT Designation,CASE WHEN COUNT(DISTINCT GRADE)>1 THEN 'MultiGrade' ELSE Grade END FROM
#EmployeeData
GROUP BY Designation
Desired result:
Designation Grade
--------------------------
Manager MultiGrade
TeamLead MultiGrade
SeniorTeamLead A
Note:
If a designation has more than one grade, then it is MultiGrade
If it has a single grade, then that particular grade is returned
If the combination is A and B, then the result should be A only
I tried with a query using case but I get this error:
Column '#EmployeeData.Grade' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Can anyone suggest the query to fetch the desired result?
As the error says, you need to aggregate the columns you are not grouping by. So use MAX and MIN (as Jeroen commented).
SELECT Designation
, CASE
    WHEN MAX(Grade) = 'B' AND MIN(Grade) = 'A' THEN 'A' -- only A and B present
    WHEN MAX(Grade) <> MIN(Grade) THEN 'MultiGrade' -- more than one grade
    ELSE MIN(Grade) -- single grade
  END Grade
FROM #EmployeeData
GROUP BY Designation
ORDER BY Designation;
Your real world situation might be more complex, but the same principle applies.
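As a quick check of the MIN/MAX approach against the desired result, the query can be run on the question's rows. This sketch assumes SQLite through Python's sqlite3 module instead of SQL Server, so the temp table #EmployeeData becomes an ordinary table named EmployeeData; everything else follows the answer.

```python
import sqlite3

# Sample rows from the question: (EmpID, Designation, Grade).
EMPLOYEES = [
    (1, "TeamLead", "A"), (2, "Manager", "B"), (3, "TeamLead", "B"),
    (4, "SeniorTeamLead", "A"), (5, "TeamLead", "C"), (6, "Manager", "C"),
    (7, "TeamLead", "D"), (8, "SeniorTeamLead", "B"),
]

# SQLite in-memory database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeData (EmpID INTEGER, Designation TEXT, Grade TEXT)"
)
conn.executemany("INSERT INTO EmployeeData VALUES (?, ?, ?)", EMPLOYEES)

# Grade is wrapped in MIN/MAX, so it is valid alongside GROUP BY Designation.
result = conn.execute("""
    SELECT Designation,
           CASE
               WHEN MAX(Grade) = 'B' AND MIN(Grade) = 'A' THEN 'A'
               WHEN MAX(Grade) <> MIN(Grade) THEN 'MultiGrade'
               ELSE MIN(Grade)
           END AS Grade
    FROM EmployeeData
    GROUP BY Designation
    ORDER BY Designation
""").fetchall()

for designation, grade in result:
    print(designation, grade)
```

On this data Manager (B, C) and TeamLead (A to D) come out as MultiGrade, while SeniorTeamLead (A, B) comes out as A, matching the desired result.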

Summing Results in a Table for Repeated Values

I'm currently in a tricky situation that I have been unable to figure out, and I was hoping you all might be able to help me solve my issue below:
I have a data set that includes a large number of columns; however, I am only going to show the columns pertinent to my issue (I renamed them and put them in an Excel doc).
What I am trying to do is develop a SQL query to calculate the total number of PASS results and the number of FAIL results for a given House Name. Each Result corresponds to a specific Resident ID, and each Resident ID corresponds to a specific House Name/House ID. Unfortunately, the value Room ID needs to be in this data set, and each unique Room ID also corresponds to a specific House Name/House ID. Therefore, for every unique Room ID that exists for a given House Name, the Resident ID is repeated.
For Example, if there are 7 Room IDs associated with a specific House Name/House ID, each unique Resident ID associated with that specific House Name/House ID will be repeated 7 times, once for every unique Room ID. Therefore, the Results are also all repeated 7 times. I have attached an example of what the data looks like below.
Note: Not all the data is included here. There are a few more rows to the AAAAAA data not shown, and there are a number of other House Names/House IDs.
Any thoughts would be much appreciated!
What you are looking for is GROUP BY.
Without looking at your data it is hard to come up with the exact query, but I have created some test data.
create table House (HouseId int, HouseName nvarchar(max));
insert into House (HouseId, HouseName) values (1,'House A'), (2, 'House B'), (3,'House C');
create table Room (RoomId int, RoomName nvarchar(max), HouseId int);
insert into Room (RoomId, RoomName, HouseId)
values
(1,'Room 1 in house A', 1), (2,'Room 2 in house A', 1),
(3,'Room 3 in house B', 2),(4,'Room 4 in house B', 2),
(5,'Room 5 in house C', 3),(6,'Room 6 in house C', 3)
create table Resident (ResidentId int, ResidentName nvarchar(max), RoomId int, Result int);
insert into Resident (ResidentId, ResidentName, RoomId, Result)
values
-- House A = 4 passed, 0 failed
(1, 'Resident 1 in Room 1', 1, 82), (2, 'Resident 2 in Room 1', 1, 76),
(3, 'Resident 3 in Room 2', 2, 91), (4, 'Resident 4 in Room 2', 2, 67),
-- House B = 2 passed, 2 failed
(5, 'Resident 5 in Room 3', 3, 60), (6, 'Resident 6 in Room 3', 3, 64),
(7, 'Resident 7 in Room 4', 4, 28), (8, 'Resident 8 in Room 4', 4, 42),
-- House C = 3 passed, 1 failed
(9, 'Resident 9 in Room 5', 5, 99), (10, 'Resident 10 in Room 5', 5, 57),
(11, 'Resident 11 in Room 6', 6, 75), (12, 'Resident 12 in Room 6', 6, 38)
Then your query would look something like:
select
HouseName,
[Passed] = SUM(x.Passed),
[Failed] = SUM(x.Failed)
from
Resident re
outer apply (
--// Logic to determine if they passed or failed
--// I arbitrarily chose the number 50 to be the threshold
select [Passed] = case when re.Result >= 50 then 1 else 0 end,
[Failed] = case when re.Result < 50 then 1 else 0 end
) x
inner join Room r on r.RoomId = re.RoomId
inner join House h on h.HouseId = r.HouseId
group by
h.HouseName
Here is a fiddle: http://sqlfiddle.com/#!18/30894/1
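The same per-house totals can also be produced with plain conditional aggregation, SUM over a CASE expression, without OUTER APPLY. The sketch below is a variation rather than the answer's exact query: it assumes SQLite through Python's sqlite3 module, keeps the arbitrary pass threshold of 50, and uses unique resident ids 1 to 12.

```python
import sqlite3

HOUSES = [(1, "House A"), (2, "House B"), (3, "House C")]
ROOMS = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3)]  # (RoomId, HouseId)
# (ResidentId, RoomId, Result)
RESIDENTS = [
    (1, 1, 82), (2, 1, 76), (3, 2, 91), (4, 2, 67),    # House A
    (5, 3, 60), (6, 3, 64), (7, 4, 28), (8, 4, 42),    # House B
    (9, 5, 99), (10, 5, 57), (11, 6, 75), (12, 6, 38), # House C
]

# SQLite in-memory database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE House (HouseId INTEGER, HouseName TEXT)")
conn.execute("CREATE TABLE Room (RoomId INTEGER, HouseId INTEGER)")
conn.execute(
    "CREATE TABLE Resident (ResidentId INTEGER, RoomId INTEGER, Result INTEGER)"
)
conn.executemany("INSERT INTO House VALUES (?, ?)", HOUSES)
conn.executemany("INSERT INTO Room VALUES (?, ?)", ROOMS)
conn.executemany("INSERT INTO Resident VALUES (?, ?, ?)", RESIDENTS)

# Each CASE yields 1 or 0 per resident; SUM turns that into a count per house.
totals = conn.execute("""
    SELECT h.HouseName,
           SUM(CASE WHEN re.Result >= 50 THEN 1 ELSE 0 END) AS Passed,
           SUM(CASE WHEN re.Result <  50 THEN 1 ELSE 0 END) AS Failed
    FROM Resident re
    INNER JOIN Room r  ON r.RoomId = re.RoomId
    INNER JOIN House h ON h.HouseId = r.HouseId
    GROUP BY h.HouseName
    ORDER BY h.HouseName
""").fetchall()

print(totals)  # [('House A', 4, 0), ('House B', 2, 2), ('House C', 3, 1)]
```

Because each resident joins to exactly one room here, nobody is counted twice; data where a resident row is repeated per room would need de-duplicating before the SUM.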

SQL Select: Do rows matching id all have the same column value

I have a table like this
sub_id reference
1 A
1 A
1 A
1 A
1 A
1 A
1 C
2 B
2 B
3 D
3 D
I want to make sure all the references in each group have the same reference.
Meaning, for example, all references in:
group 1 should be A
group 2 should be B
group 3 should be D
If they are not, then I would like to have returned a list of sub_id's.
So for the table above my result would be: 1
Ideally, with these conditions reference would be in a separate table with sub_id as PK, but I need to fix first for a massive dataset before I can move on restructuring the database.
You could use the following method:
select t.sub_id
from YourTable t
group by t.sub_id
having max(t.reference) <> min(t.reference)
Change YourTable to suit.
Are you looking for simple aggregation?
select sub_id
from table t
group by sub_id
having count(distinct reference) > 1;
The query you want:
SELECT sub_id
FROM test_sub
GROUP BY sub_id HAVING count(DISTINCT reference) > 1
;
Here is what I used to test it:
CREATE TABLE `test_sub` (
sub_id int(11) NOT NULL,
reference varchar(45) DEFAULT NULL
);
INSERT INTO test_sub (sub_id, reference) VALUES
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'C'),
(2, 'B'),
(2, 'B'),
(3, 'D'),
(3, 'D'),
(3, 'D'),
(4, 'E'),
(4, 'E'),
(4, 'E'),
(5, 'F'),
(5, 'G')
;
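Both HAVING variants suggested in this thread can be compared on the test data above. The sketch assumes SQLite through Python's sqlite3 module rather than MySQL, so the column types are simplified; the variable names are illustrative.

```python
import sqlite3

# Test rows from the answer above: (sub_id, reference).
ROWS = [
    (1, "A"), (1, "A"), (1, "A"), (1, "A"), (1, "C"),
    (2, "B"), (2, "B"),
    (3, "D"), (3, "D"), (3, "D"),
    (4, "E"), (4, "E"), (4, "E"),
    (5, "F"), (5, "G"),
]

# SQLite in-memory database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_sub (sub_id INTEGER NOT NULL, reference TEXT)")
conn.executemany("INSERT INTO test_sub VALUES (?, ?)", ROWS)

# Variant 1: a group is inconsistent if it holds more than one distinct value.
by_distinct = [r[0] for r in conn.execute("""
    SELECT sub_id FROM test_sub
    GROUP BY sub_id
    HAVING COUNT(DISTINCT reference) > 1
    ORDER BY sub_id
""")]

# Variant 2: a group is inconsistent if its smallest and largest values differ.
by_min_max = [r[0] for r in conn.execute("""
    SELECT sub_id FROM test_sub
    GROUP BY sub_id
    HAVING MAX(reference) <> MIN(reference)
    ORDER BY sub_id
""")]

print(by_distinct, by_min_max)  # [1, 5] [1, 5]
```

Both forms flag sub_id 1 and 5. Note that COUNT(DISTINCT ...), MIN and MAX all ignore NULLs, so a group mixing one value with NULL references is not reported by either query.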