SQL: select three rows for each distinct value

For each distinct Name, I want to select the first three rows with the earliest time_stamp (or smallest number in UNIXTIME). What is the correct query?
Start Table:
Name Log-in Time
-------- -----------------
Don 05:30:00
Don 05:35:32
Don 07:12:43
Don 09:52:23
Don 05:32:43
James 03:30:00
James 03:54:23
James 09:51:54
James 14:43:34
James 43:22:11
James 59:43:33
James 20:12:11
Mindy 05:32:22
Mindy 15:14:44
Caroline 10:02:22
Rebecca 20:43:32
End Table:
Name Log-in Time
-------- -----------------
Don 05:30:00
Don 05:35:32
Don 07:12:43
James 03:30:00
James 03:54:23
James 09:51:54
Mindy 05:32:22
Mindy 15:14:44
Caroline 10:02:22
Rebecca 20:43:32

WITH Ranked (Name, LoginTime, RowNum) AS
(
    SELECT
        Name,
        LoginTime,
        ROW_NUMBER() OVER (PARTITION BY Name ORDER BY LoginTime)
    FROM SomeTable
)
SELECT
    Name,
    LoginTime
FROM Ranked
WHERE
    RowNum <= 3
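
To try the ROW_NUMBER() approach without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the table name logins, the column names, and storing the times as zero-padded text are all assumptions made for this illustration, and only part of the sample data is loaded.

```python
import sqlite3

# Hypothetical table and column names. Times are zero-padded text, so
# string order matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (name TEXT, login_time TEXT)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [
        ("Don", "05:30:00"), ("Don", "05:35:32"), ("Don", "07:12:43"),
        ("Don", "09:52:23"), ("Don", "05:32:43"),
        ("Mindy", "05:32:22"), ("Mindy", "15:14:44"),
        ("Caroline", "10:02:22"),
    ],
)

query = """
WITH ranked AS (
    SELECT name,
           login_time,
           -- number the rows of each name from earliest to latest
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY login_time) AS rn
    FROM logins
)
SELECT name, login_time
FROM ranked
WHERE rn <= 3
ORDER BY name, login_time
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

With this data, the three earliest rows for Don are 05:30:00, 05:32:43 and 05:35:32, because the ranking goes by time rather than by the order in which the rows were listed.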

An ANSI-standard approach that appears to work is the following:
http://www.sqlfiddle.com/#!2/b814d/15
SELECT
    NAME
    , LOGIN
FROM (
    SELECT
        test_first.NAME,
        test_first.LOGIN,
        COUNT(*) CNT
    FROM
        TABLE_NAME test_first
    LEFT OUTER JOIN
        TABLE_NAME test_second
        ON (test_first.NAME = test_second.NAME)
    WHERE
        -- count the rows of the same NAME at or before this LOGIN
        test_second.LOGIN <= test_first.LOGIN
    GROUP BY
        test_first.NAME, test_first.LOGIN) test_order
WHERE
    test_order.CNT <= 3
ORDER BY
    NAME ASC, LOGIN ASC
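
The counting idea behind this answer can be checked with a small sketch: a row is among the three earliest for its name when at most three rows of the same name have a login at or before it. The example below uses Python's sqlite3 module with a subset of the sample data; the lowercase table and column names are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (name TEXT, login TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        ("James", "03:30:00"), ("James", "03:54:23"), ("James", "09:51:54"),
        ("James", "14:43:34"), ("James", "20:12:11"),
        ("Rebecca", "20:43:32"),
    ],
)

query = """
SELECT name, login
FROM (
    SELECT a.name, a.login, COUNT(*) AS cnt
    FROM table_name a
    JOIN table_name b
      ON a.name = b.name
     AND b.login <= a.login      -- b ranges over rows at or before a
    GROUP BY a.name, a.login
) ranked
WHERE cnt <= 3                   -- keep rows with at most 3 such rows
ORDER BY name, login
"""
earliest = conn.execute(query).fetchall()
for row in earliest:
    print(row)
```

One caveat of this technique: rows with an identical name and login collapse into one group, so ties are not handled the way ROW_NUMBER() would handle them.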

Related

Find intersecting dates

Can somebody help me with the following problem? I have an MS Access table of my employees, and for each of them I have the start and end date of their vacation:
Name begin end
John 1.3.2021. 15.3.2021.
Robert 6.3.2021. 8.3.2021.
Lisa 13.3.2021. 16.3.2021.
John 1.4.2021. 3.4.2021.
Robert 2.4.2021. 2.4.2021.
Lisa 15.5.2021. 23.5.2021.
Lisa 5.6.2021. 15.6.2021.
How do I get the number of employees absent from work on each date in the table (that is, each date that falls within a begin-end interval)? For example:
1.3.2021. 1 '>>>only John
2.3.2021. 1 '>>>only John
3.3.2021. 1 '>>>only John
4.3.2021. 1 '>>>only John
5.3.2021. 1 '>>>only John
6.3.2021. 2 '>>>John and Robert
7.3.2021. 2 '>>>John and Robert
...
Thank you in advance!

You can use UNION to collect the distinct begin and end dates, and a correlated subquery to count the vacations that cover each of them:
select dte,
(select count(*)
from t
where d.dte between t.[begin] and t.[end]
) as cnt
from (select [begin] as dte
from t
union
select [end]
from t
) d;
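
As a rough check of this query, the sketch below runs it in SQLite through Python's sqlite3 module on the March rows of the sample data. The column names begin_date and end_date are assumptions (BEGIN and END are keywords in SQLite), and the dates are rewritten in ISO format so that text comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column names; ISO dates compare correctly as text.
conn.execute("CREATE TABLE t (name TEXT, begin_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("John", "2021-03-01", "2021-03-15"),
        ("Robert", "2021-03-06", "2021-03-08"),
        ("Lisa", "2021-03-13", "2021-03-16"),
    ],
)

query = """
SELECT d.dte,
       (SELECT COUNT(*)
        FROM t
        WHERE d.dte BETWEEN t.begin_date AND t.end_date) AS cnt
FROM (SELECT begin_date AS dte FROM t
      UNION
      SELECT end_date FROM t) d
ORDER BY d.dte
"""
absent = conn.execute(query).fetchall()
for dte, cnt in absent:
    print(dte, cnt)
```

Note that only dates that appear as a begin or end value are listed. A date such as 2021-03-02, which lies inside John's interval but is not a boundary, gets no row of its own; covering every day would need a separate list of calendar dates to join against.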

Count distinct over partition by

I am trying to do a distinct count of names partitioned over their roles. So, in the example below: I have a table with the names and the person's role.
I would like a role count column that gives the total number of distinct people in that role. For example, the role manager comes up four times but there are only 3 distinct people for that role - Sam comes up again on a different date.
If I remove the date column, it works fine using:
select
a.date,
a.Name,
a.Role,
count(a.Role) over (partition by a.Role) as Role_Count
from table a
group by a.date, a.name, a.role
Including the date column makes it count the total rows per role rather than the distinct names (which I know I haven't identified in the partition), giving 4 managers and 3 analysts.
How do I fix this?
Desired output:
Date   Name  Role     Role_Count
--------------------------------
01/01  Sam   Manager  3
02/01  Sam   Manager  3
01/01  John  Manager  3
01/01  Dan   Manager  3
01/01  Bob   Analyst  2
02/01  Bob   Analyst  2
01/01  Mike  Analyst  2
Current output:
Date   Name  Role     Role_Count
--------------------------------
01/01  Sam   Manager  4
02/01  Sam   Manager  4
01/01  John  Manager  4
01/01  Dan   Manager  4
01/01  Bob   Analyst  3
02/01  Bob   Analyst  3
01/01  Mike  Analyst  3

Unfortunately, SQL Server (like several other databases) doesn't support COUNT(DISTINCT) as a window function. Fortunately, there is a simple trick to work around this: the sum of two DENSE_RANK()s, one ascending and one descending, minus one:
select a.Name, a.Role,
(dense_rank() over (partition by a.Role order by a.Name asc) +
dense_rank() over (partition by a.Role order by a.Name desc) -
1
) as distinct_names_in_role
from table a
group by a.name, a.role
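
To see why the two ranks add up to the distinct count, here is a small sketch run in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name staff and the column name day are assumptions for the illustration. For any name, its ascending rank plus its descending rank is always one more than the number of distinct names in the role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (day TEXT, name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        ("01/01", "Sam", "Manager"), ("02/01", "Sam", "Manager"),
        ("01/01", "John", "Manager"), ("01/01", "Dan", "Manager"),
        ("01/01", "Bob", "Analyst"), ("02/01", "Bob", "Analyst"),
        ("01/01", "Mike", "Analyst"),
    ],
)

query = """
SELECT day, name, role,
       DENSE_RANK() OVER (PARTITION BY role ORDER BY name ASC)
     + DENSE_RANK() OVER (PARTITION BY role ORDER BY name DESC)
     - 1 AS role_count
FROM staff
"""
# Key each result row by (day, name, role) since row order is unspecified.
counts = {(d, n, r): c for d, n, r, c in conn.execute(query)}
for key, value in sorted(counts.items()):
    print(key, value)
```

Every Manager row gets 3 and every Analyst row gets 2, including the rows for the second date.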

Unfortunately, COUNT(DISTINCT) is not available as a window aggregate. But we can use a combination of DENSE_RANK and MAX to simulate it. Partition by Role only, since the count is wanted per role across all dates:
select
    a.date,
    a.Name,
    a.Role,
    MAX(rnk) OVER (PARTITION BY Role) as Role_Count
from (
    SELECT *,
        DENSE_RANK() OVER (PARTITION BY Role ORDER BY Name) AS rnk
    FROM table
) a
If Name may contain NULLs, we need to take that into account so that NULL is not counted as a name:
select
    a.date,
    a.Name,
    a.Role,
    MAX(CASE WHEN Name IS NOT NULL THEN rnk END) OVER (PARTITION BY Role) as Role_Count
from (
    SELECT *,
        DENSE_RANK() OVER (PARTITION BY Role, CASE WHEN Name IS NULL THEN 0 ELSE 1 END ORDER BY Name) AS rnk
    FROM table
) a
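
The NULL-aware variant can be sketched in SQLite through Python's sqlite3 module as follows. The table name staff and the sample rows, including the row with a missing name, are assumptions for this illustration. The count is taken per role here, and the NULL row is ranked in its own sub-partition so it cannot inflate the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [
        ("Sam", "Manager"), ("Sam", "Manager"), ("John", "Manager"),
        (None, "Manager"),          # a row with a missing name
        ("Bob", "Analyst"),
    ],
)

query = """
SELECT name, role,
       MAX(CASE WHEN name IS NOT NULL THEN rnk END)
           OVER (PARTITION BY role) AS role_count
FROM (
    SELECT name, role,
           -- rank NULL and non-NULL names in separate sub-partitions
           DENSE_RANK() OVER (
               PARTITION BY role, CASE WHEN name IS NULL THEN 0 ELSE 1 END
               ORDER BY name
           ) AS rnk
    FROM staff
) ranked
"""
result = set(conn.execute(query).fetchall())
for row in sorted(result, key=str):
    print(row)
```

The Manager rows all report 2 (John and Sam), and the row with the NULL name still receives the role's count without contributing to it.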

Not able to get exact latest records with two columns having same value - in SQL Server

I am trying to get distinct records for a specific department from the table employee.
I have tried with this code in SQL Server, and I'm getting this error:
Error: employeeId is invalid in the select list because it is not contained in either aggregate function or the GROUP BY clause.
My code:
SELECT
name, department, MAX(jointime) LatestDate, employeeId
FROM
employee
WHERE
department = 'Mechanical'
GROUP BY
name
Records in DB:
name department joinTime EmployeeId
-----------------------------------------------------------
Erik Mechanical 2019-07-06 11:59:59 456
Tom Mechanical 2019-07-06 11:59:59 789
Erik Computer 2019-07-05 11:59:59 222
Erik Computer 2019-07-04 11:59:59 111
Erik Mechanical 2019-07-01 11:59:59 123
When I query for 'Mechanical', I want the result below: the latest record for each name in that department.
name department joinTime EmployeeId
-----------------------------------------------------------
Erik Mechanical 2019-07-06 11:59:59 456
Tom Mechanical 2019-07-06 11:59:59 789

Assuming the key is [Name] and not [EmployeeId], one option is the WITH TIES clause, which avoids the need for aggregation.
Example
Select Top 1 with ties *
From employee
Where department='Mechanical'
Order By Row_Number() over (Partition By [Name] order by joinTime Desc)
Returns
name department joinTime EmployeeId
Erik Mechanical 2019-07-06 11:59:59.000 456
Tom Mechanical 2019-07-06 11:59:59.000 789

You can use EXISTS:
SELECT e.*
FROM employee e
WHERE e.department='Mechanical'
AND NOT EXISTS (
SELECT 1 FROM employee
WHERE department = e.department
AND name = e.name AND joinTime > e.joinTime
)
Results:
> name | department | joinTime | EmployeeId
> :--- | :--------- | :------------------ | ---------:
> Erik | Mechanical | 2019-07-06 11:59:59 | 456
> Tom | Mechanical | 2019-07-06 11:59:59 | 789

You can use ROW_NUMBER to mark the latest row for each employee, or CROSS APPLY to run a correlated subquery for each employee.
with q as
(
    SELECT name, department, jointime, employeeId,
        row_number() over (partition by name order by joinTime desc) rn
    FROM employee where department='Mechanical'
)
select name, department, jointime, employeeId
from q
where rn = 1
or
with emp as
(
    select distinct name from employee where department='Mechanical'
)
select e.*
from emp
cross apply
(
    select top 1 *
    from employee e2
    where e2.name = emp.name
    and e2.department='Mechanical'
    order by e2.joinTime desc
) e
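
As a quick cross-check, the sketch below runs the ROW_NUMBER form and the NOT EXISTS form on the sample rows in SQLite through Python's sqlite3 module. SQLite has no TOP or CROSS APPLY, so those variants are left out; the setup is an assumption for illustration and not SQL Server itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(name TEXT, department TEXT, joinTime TEXT, employeeId INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("Erik", "Mechanical", "2019-07-06 11:59:59", 456),
        ("Tom", "Mechanical", "2019-07-06 11:59:59", 789),
        ("Erik", "Computer", "2019-07-05 11:59:59", 222),
        ("Erik", "Computer", "2019-07-04 11:59:59", 111),
        ("Erik", "Mechanical", "2019-07-01 11:59:59", 123),
    ],
)

row_number_sql = """
WITH q AS (
    SELECT name, department, joinTime, employeeId,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY joinTime DESC) AS rn
    FROM employee
    WHERE department = 'Mechanical'
)
SELECT name, department, joinTime, employeeId
FROM q
WHERE rn = 1
ORDER BY name
"""

not_exists_sql = """
SELECT e.name, e.department, e.joinTime, e.employeeId
FROM employee e
WHERE e.department = 'Mechanical'
  AND NOT EXISTS (
      SELECT 1 FROM employee
      WHERE department = e.department
        AND name = e.name
        AND joinTime > e.joinTime   -- a later row for the same name exists
  )
ORDER BY e.name
"""

latest_rn = conn.execute(row_number_sql).fetchall()
latest_ne = conn.execute(not_exists_sql).fetchall()
print(latest_rn)
```

The two forms differ when one name has several rows with the same latest joinTime: ROW_NUMBER keeps exactly one of them, while NOT EXISTS keeps them all.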

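Just add department and employeeId to the GROUP BY. This clears the error, but note that it returns one row per distinct employeeId, so with the sample data both of Erik's Mechanical rows (456 and 123) come back: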
SELECT name , department, MAX(jointime) LatestDate , employeeId
FROM employee where department='Mechanical'
GROUP BY name, department, employeeId

You need to apply aggregate functions to every selected column that is not in the GROUP BY clause:
SELECT name
, MIN(department)
, MAX(jointime) LatestDate
, MIN(employeeId)
FROM employee where department='Mechanical'
GROUP BY name
SQL Server finds all records with the names Tom or Erik, but it does not know which single value to pick from multiple rows for columns such as department or employeeId. By using aggregate functions, you tell SQL Server to take the MIN, MAX, SUM, or COUNT of those columns. Note that each aggregate is evaluated independently, so MIN(employeeId) need not come from the row with the latest jointime.
Or add those columns to the GROUP BY clause to get all unique rows:
SELECT name
, department
, jointime
, employeeId
FROM employee where department='Mechanical'
GROUP BY name
, department
, jointime
, employeeId

How to count by two fields and join with other table Postgres?

I have two tables: user, and transaction, which holds the transactions made by each user. I want a query that returns the transaction count per name and date, together with the fields from the user table. How can I do it?
Table user:
Name Id Card
-----------------
Alex 01 N
James 02 Y
Table transaction:
Name Date
-----------------
Alex 01/07/2012
Alex 01/12/2012
James 01/08/2012
Alex 01/07/2012
Alex 01/12/2012
James 01/07/2012
James 01/07/2012
I want something like this:
Name Date Transactions ID Card
---------------------------------------------
Alex 01/07/2012 2 01 N
Alex 01/12/2012 2 01 N
James 01/08/2012 1 02 Y
James 01/07/2012 2 02 Y
First of all, I tried to count by two columns with something like this:
select name, date, count(name, date) from pm_transaction GROUP BY (name,date)
select count(distinct(machine, date)) from pm_transaction
But it does not work. I tried a lot of combinations, but none of them works.

Try this
select tb1.name, tb2.date, tb2.transactions, tb1.Id, tb1.card
from tbUser as tb1
inner join
    (select date,
        name,
        count(*) as transactions
    from tbTransaction group by date, name)
as tb2 on tb1.name = tb2.name

This looks like a simple aggregation task. Just check and correct the join condition and the table names:
select u.name, t.date, count(1) as transactions, u.id, u.card
from transaction t
join user_table u on u.name = t.name
group by u.name, t.date, u.id, u.card;
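
Here is a minimal sketch of the join-then-group query on the sample data, run in SQLite through Python's sqlite3 module rather than Postgres. The table names users and tx and the column name tx_date are assumptions chosen to avoid reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, id TEXT, card TEXT)")
conn.execute("CREATE TABLE tx (name TEXT, tx_date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("Alex", "01", "N"), ("James", "02", "Y")],
)
conn.executemany(
    "INSERT INTO tx VALUES (?, ?)",
    [
        ("Alex", "01/07/2012"), ("Alex", "01/12/2012"),
        ("James", "01/08/2012"), ("Alex", "01/07/2012"),
        ("Alex", "01/12/2012"), ("James", "01/07/2012"),
        ("James", "01/07/2012"),
    ],
)

query = """
SELECT u.name, t.tx_date, COUNT(*) AS transactions, u.id, u.card
FROM tx t
JOIN users u ON u.name = t.name
GROUP BY u.name, t.tx_date, u.id, u.card   -- one row per name and date
ORDER BY u.name, t.tx_date
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

Grouping by the user columns as well as the date lets them appear in the select list without aggregate functions.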

selecting a row using MIN or ROWNUM

I have an Oracle table similar to the one below, which stores people's last name, first name, and age. People with the same last name belong to the same family.
LastName FirstName Age
===========================
1 miller charls 20
2 miller john 30
3 anderson peter 45
4 Bates andy 50
5 anderson gary 60
6 williams mark 15
I need to write an Oracle SQL query to select the youngest person from each family. The output should contain rows 1, 3, 4 and 6.
How do I do this?

Another way, a bit shorter:
select lastname
, max(firstname) keep(dense_rank first order by age) as first_name
, max(age) keep(dense_rank first order by age) as age
from your_table_name
group by lastname
order by lastname
Result:
LASTNAME FIRST_NAME AGE
-------- ---------- ----------
Bates andy 50
anderson peter 45
miller charls 20
williams mark 15

DENSE_RANK() is a ranking function that generates sequential numbers and assigns the same number to ties. I prefer DENSE_RANK() here because a family can have twins, etc.
SELECT Lastname, FirstName, Age
FROM
(
SELECT Lastname, FirstName, Age,
DENSE_RANK() OVER (PARTITION BY LastName ORDER BY Age) rn
FROM tableName
) a
WHERE a.rn = 1

With standard SQL I would do it like this:
select *
from family f1
where (
select count(*)
from family f2
where
f2.lastname = f1.lastname
and
f2.age <= f1.age) <= 1
order by lastname;
This SQL lets you pick the x youngest or oldest people in a family. Just change f2.age <= f1.age to f2.age >= f1.age to get the oldest, and the <= 1 to e.g. <= 10 to get the top 10.
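
The counting subquery can be tried out with the sketch below, which runs it in SQLite through Python's sqlite3 module instead of Oracle. The helper youngest and its parameter n are hypothetical names introduced here to make the per-family limit adjustable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (lastname TEXT, firstname TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO family VALUES (?, ?, ?)",
    [
        ("miller", "charls", 20), ("miller", "john", 30),
        ("anderson", "peter", 45), ("Bates", "andy", 50),
        ("anderson", "gary", 60), ("williams", "mark", 15),
    ],
)

QUERY = """
SELECT f1.lastname, f1.firstname, f1.age
FROM family f1
WHERE (
    SELECT COUNT(*)
    FROM family f2
    WHERE f2.lastname = f1.lastname
      AND f2.age <= f1.age          -- family members no older than f1
) <= ?
ORDER BY f1.lastname
"""


def youngest(n):
    """Return the n youngest people of each family (hypothetical helper)."""
    return conn.execute(QUERY, (n,)).fetchall()


for row in youngest(1):
    print(row)
```

With n set to 1 this returns one row per family. Because the comparison is text-based and case-sensitive by default in SQLite, "Bates" sorts before the lowercase names.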
