Limiting records of combinations from 2 columns - sql

looking for some help limiting the results while querying combinations between 2 columns. Here's an example of the kind of table I am working with..
id name group state
1 Bob A NY
2 Jim A NY
3 Dan A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
8 Glen B FL
9 Ben C FL
I am trying to display all records of all combinations of "group" and "state", but limiting to displaying only up to 2 records for each combination. The result should look like the following..
id name group state
1 Bob A NY
2 Jim A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
9 Ben C FL
Thanks for the help.

Assuming you always want the two rows for each group and state combination with the lowest id
SELECT *
FROM (SELECT a.*,
row_number() over (partition by group, state
order by id asc) rnk
FROM your_table a)
WHERE rnk <= 2
Of course, since group is a reserved word, I assume your column is actually named something else... You'd need to adjust my query to use the correct column name.

Related

SQL for joining two tables and grouping by shared column

I want to join two tables and use the column they both share to group the results, including a null result for those accountIds which only appear in one table.
Table a
AccountId
productApurchases
Steve
1
Jane
5
Bill
10
Abed
2
Table b
AccountId
productApurchases
Allan
1
Jane
10
Bill
2
Abed
1
Mike
2
Desired output
AccountId
productApurchases
productBpurchases
Steve
1
0
Jane
5
10
Bill
10
2
Abed
2
1
Mike
0
2
I've been trying with various joins but cannot figure out how to group by all the account ids.
Any advice much appreciated, thanks.
Use full join:
select accountid,
coalesce(productApurchases, 0) as productApurchases,
coalesce(productBpurchases, 0) as productBpurchases
from a full join
b
using (accountid);

How do I display rows from a max count?

I want to return all the data, from max count query with hospital that has most number of patients. What I seem to be getting when I try to nest queries is display of all rows of hospital data. I've tried to look at similar questions in stack overflow and other sites it seems to be a simple query to do but i am not getting it.
select max(highest_hospital) as max_hospital
from (select count(hospital) as highest_hospital
from doctor
group by hospital)
highest_hospital
-------------
3
Doc ID Doctor Patient Hospital Medicine Cost
------ ------- ------ --------- ------ --------
1 Jim Bob Patient1 Town 1 Medicine 1 4000
2 Janice Smith Patient2 Town 2 Medicine 3 3000
3 Harold Brown Patient3 Town 2 Medicine 5 2000
4 Larry Owens Patient4 Town 2 Medicine 6 3000
5 Sally Brown Patient5 Town 3 Medicine 7 4000
6 Bob Jim Patient6 Town 4 Medicine 8 6000
Outcome should be return of 3 rows
Doc ID Doctor Patient Hospital Medicine Cost
------ ------- ------ --------- ------ --------
2 Janice Smith Patient2 Town 2 Medicine 3 3000
3 Harold Brown Patient3 Town 2 Medicine 5 2000
4 Larry Owens Patient4 Town 2 Medicine 6 3000
You can use window functions:
select d.*
from (select d.*, max(hospital_count) over () as max_hospital_count
from (select d.*, count(*) over (partition by hospital) as hospital_count
from doctor d
) d
) d
where hospital_count = max_hospital_count;
Edit:
Using GROUP BY is a pain. If you are only looking for a single hospital (even when there are ties), then in Oracle 12C you can do:
select d.*
from doctor d
where d.hospital = (select d2.hospital
from doctor d2
group by d2.hospital
order by count(*) desc
fetch first 1 row only
);
You can do this in earlier versions of Oracle using an additional subquery.

SQL query to get only rows match the condition based on two separated columns under one 'group by'

The simple SELECT query would return the data as below:
Select ID, User, Country, TimeLogged from Data
ID User Country TimeLogged
1 Samantha SCO 10
1 John UK 5
1 Andrew NZL 15
2 John UK 20
3 Mark UK 10
3 Mark UK 20
3 Steven UK 10
3 Andrew NZL 15
3 Sharon IRL 5
4 Andrew NZL 25
4 Michael AUS 5
5 Jessica USA 30
I would like to return a sum of time logged for each user grouped by ID
But for only ID numbers where both of these values Country = UK and User = Andrew are included within their rows.
So the output in the above example would be
ID User Country TimeLogged
1 John UK 5
1 Andrew NZL 15
3 Mark UK 30
3 Steven UK 10
3 Andrew NZL 15
First you need to identify which IDs you're going to be returning
SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew';
and based on that, you can then filter to aggregate the expected rows.
SELECT ID,
[User],
Country,
SUM(Timelogged) as Timelogged
FROM mytable
WHERE (Country='UK' OR [User]='Andrew')
AND ID IN( SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew')
GROUP BY ID, [User], country;
So, you have described what you need to write almost perfectly but not quite. Your result table indicates that you want Country = UK OR User = Andrew, rather than AND
You need to select and group by, then include a WHERE:-
Select ID, User, Country, SUM(Timelogged) as Timelogged from mytable
WHERE Country='UK' OR User='Andrew'
Group by ID, user, country

get extra rows for each group where date doesn't exist

I've been playing with this for days, and can't seem to come up with something. I have this query:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals
Which gives me these results:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
etc....
What I need to do is use a JOIN or something, so that I get a NULL value for each month (1-12), for each name:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Smith 5 NULL
Smith 6 NULL
Smith ... NULL
Smith 12 NULL
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
Jones 5 NULL
Jones ... NULL
Jones 12 NULL
etc....
I have a "Numbers" table, and have tried doing:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals
FULL JOIN Number n on n.Number = MONTH(v.YearMonth) and n in(1,2,3,4,5,6,7,8,9,10,11,12)
But that only gives me 6 additional NULL rows, where what I want is actually 6 NULL rows for each group of names. I've tried using Group By, but not sure how to use it in a JOIN statement like that, and not even sure if that's the correct route to take.
Any advice or direction is much appreciated!
Here's one way to do it:
select
s.emp_name as Name
,s.Number as m
,st.salestotal as Amount
from (
select distinct emp_name, number
from salestotals, numbers
where number between 1 and 12) s left join salestotals st on
s.emp_name = st.emp_name and s.number = month(st.yearmonth)
Condensed SQL Fiddle
You could do:
SELECT EN.emp_name Name,
N.Number M,
ST.SalesTotal Amount
FROM ( SELECT Number
FROM NumberTable
WHERE Number BETWEEN 1 AND 12) N
CROSS JOIN (SELECT DISTINCT emp_name
FROM SalesTotals) EN
LEFT JOIN SalesTotals ST
ON N.Number = MONTH(ST.YearMonth)
AND EN.emp_name = ST.emp_name

Replicate recent location

I have table like below
ID User Date Location
1 Tom 6-Mar-2012 US
2 Tom 4-Feb-2012 UK
3 Tom 6-Jan-2012 Uk
4 Bob 6-Mar-2012 UK
5 Bob 4-Feb-2012 UK
6 Bob 6-Jan-2012 AUS
7 Dev 6-Mar-2012 US
8 Dev 4-Feb-2012 AUS
9 Nic 6-Jan-2012 US
I have to get each employee recent location in same table.
ID User Date Location CurrentLocation
1 Tom 6-Mar-2012 US US
2 Tom 4-Feb-2012 UK US
3 Tom 6-Jan-2012 Uk US
4 Bob 6-Mar-2012 UK UK
5 Bob 4-Feb-2012 UK UK
6 Bob 6-Jan-2012 AUS UK
7 Dev 6-Mar-2012 US US
8 Dev 4-Feb-2012 AUS US
9 Nic 6-Jan-2012 US US
I have tired with temp tables. can I get this done using single query. This is in middle of implementation. I have already created many temp tables.
Thanks in Advance.
Try this:
select *, CurrentLocation
from tbl x
outer apply
(
select top 1 location as CurrentLocation
from tbl
where [user] = x.[user]
and id <= x.id
order by id
) y
order by id
Output:
ID USER DATE LOCATION CURRENTLOCATION
1 Tom 2012-03-06 US US
2 Tom 2012-02-04 UK US
3 Tom 2012-01-06 Uk US
4 Bob 2012-03-06 UK UK
5 Bob 2012-02-04 UK UK
6 Bob 2012-01-06 AUS UK
7 Dev 2012-03-06 US US
8 Dev 2012-02-04 AUS US
9 Nic 2012-01-06 US US
Live test: http://www.sqlfiddle.com/#!3/83a6a/7
select
id,
user,
date,
location,
first_value(location) over(partition by user order by date desc) as current_location
from your_table s;
The above is valid only in Oracle.
This is my try in absence of first_value analytic function:
select
a.id,
a.usr,
a.date,
a.location,
m.location as current_location
from
a
join
(select usr, location
from
(select usr,
location,
row_number() over(partition by usr order by date desc) as rnk
from a
)b
where rnk = 1) m
on m.usr = a.usr;
The m inner query contains the records with most recent user entries.
After this I make a join with this view, to obtain users's location.
See here sqlfiddle test.