Selecting Max Count on another Column - sql

I've been stuck trying to come up with a query that groups by CODE, so that CCC, for example, appears only once, while also selecting the Name with the highest Count. Can someone point me in the right direction? Thanks.
So I want my query to return:
AAA Lee, Albert
BBB Robert, Steven
CCC Jones, Albert
DDD Lim, Kevin
EEE Zhang, Wil
OR
AAA Lee, Albert 12
BBB Robert, Steven 4
CCC Jones, Albert 3
DDD Lim, Kevin 21
EEE Zhang, Wil 11
Using this sample data:
CODE NAME Count
AAA Lee, Albert 12
BBB Robert, Steven 4
CCC Robert, Steven 2
CCC Jones, Albert 3
DDD Lim, Kevin 21
EEE Zhang, Wil 11
EEE Wil Zhang 5

The standard SQL method uses the ANSI standard row_number() function:
select s.*
from (select s.*,
             row_number() over (partition by code order by count desc) as seqnum
      from sample s
     ) s
where seqnum = 1;
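As a rough way to try the row_number() approach against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the column name `cnt` (used instead of `count`) and the alias `ranked` are choices made for this illustration, not part of the original answer.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample data; the table and
# column names (sample, code, name, cnt) are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (code TEXT, name TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [
        ("AAA", "Lee, Albert", 12),
        ("BBB", "Robert, Steven", 4),
        ("CCC", "Robert, Steven", 2),
        ("CCC", "Jones, Albert", 3),
        ("DDD", "Lim, Kevin", 21),
        ("EEE", "Zhang, Wil", 11),
        ("EEE", "Wil Zhang", 5),
    ],
)

# Number the rows within each code from highest to lowest count,
# then keep only the first row of each partition.
top_per_code = conn.execute(
    """
    SELECT code, name, cnt
    FROM (SELECT s.*,
                 row_number() OVER (PARTITION BY code ORDER BY cnt DESC) AS seqnum
          FROM sample s) ranked
    WHERE seqnum = 1
    ORDER BY code
    """
).fetchall()

for row in top_per_code:
    print(row)
```

Because row_number() assigns distinct numbers even to tied rows, this returns exactly one row per code; which of several tied rows wins is not specified unless the ORDER BY is extended with a tie-breaker.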

You could LEFT JOIN the table to itself and keep only the rows for which no row with the same CODE has a higher Count:
SELECT t1.*
FROM Data AS t1
LEFT JOIN Data AS t2
  ON t1.CODE = t2.CODE AND t1.Count < t2.Count
WHERE t2.Count IS NULL
http://sqlfiddle.com/#!6/cc2ea/4
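The self-join can be checked the same way. The sketch below runs it in an in-memory SQLite database via Python's sqlite3 module; the column name `cnt` stands in for `Count`, and the extra "Hypothetical, Tie" row is invented purely to show how this approach behaves when two names share the highest count.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (CODE TEXT, NAME TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [
        ("AAA", "Lee, Albert", 12),
        ("BBB", "Robert, Steven", 4),
        ("CCC", "Robert, Steven", 2),
        ("CCC", "Jones, Albert", 3),
        ("DDD", "Lim, Kevin", 21),
        ("EEE", "Zhang, Wil", 11),
        ("EEE", "Wil Zhang", 5),
    ],
)

# A row survives only if no other row with the same CODE has a larger cnt,
# i.e. the LEFT JOIN found no match and t2's columns are NULL.
ANTI_JOIN = """
    SELECT t1.CODE, t1.NAME, t1.cnt
    FROM Data AS t1
    LEFT JOIN Data AS t2
           ON t1.CODE = t2.CODE AND t1.cnt < t2.cnt
    WHERE t2.cnt IS NULL
    ORDER BY t1.CODE, t1.NAME
"""

winners = conn.execute(ANTI_JOIN).fetchall()

# Hypothetical extra row that ties with the current CCC maximum of 3.
conn.execute("INSERT INTO Data VALUES ('CCC', 'Hypothetical, Tie', 3)")
ccc_after_tie = [row for row in conn.execute(ANTI_JOIN) if row[0] == "CCC"]

print(winners)
print(ccc_after_tie)
```

Unlike the row_number() version, this query returns every row that ties for the maximum, so a code can appear more than once when counts are equal.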

Related

Google Sheets "=QUERY()" JOIN ON or equivalent

I have one large spreadsheet with names, addresses, phone numbers, emails, etc. Some records have a second address, stored in a column named "Address 2". I was hoping to write a query whose output duplicates those rows, the only difference being that the "Address 2" value appears in the main Address column.
Data:
  | A      | B            | C                | D            | E                       | F             | G
1 | Status | Name         | Address          | Phone        | Email                   | Address2      | Hire Date
2 |        | Joe Smith    | 123 Smith St     | 201 555 3099 | Joe#stackoverflow.com   | 7th Avenue Sq |
4 | Q      | Jane Smith   | 321 Not Smith St |              |                         |               | 12/15/1980
5 |        | Robert Smith |                  | 818 555 4321 | Robert#googlesheets.com |               | 12/13/1981
I'm looking for a query output like this:
  | A      | B            | C                | D            | E                       | F
1 | Status | Name         | Address          | Phone        | Email                   | Hire Date
2 |        | Joe Smith    | 123 Smith St     | 201 555 3099 | Joe#stackoverflow.com   |
3 |        | Joe Smith    | 7th Avenue Sq    | 201 555 3099 | Joe#stackoverflow.com   |
4 | Q      | Jane Smith   | 321 Not Smith St |              |                         | 12/15/1980
5 |        | Robert Smith |                  | 818 555 4321 | Robert#googlesheets.com | 12/13/1981
I was trying something like:
=QUERY({Sheet1!$A2:$G,Sheet1!$B2:$B,Sheet1!$F2:$J },"SELECT Col1, Col2, Col3, Col4, Col5, Col7 JOIN Col6 ON Col2 = Col2")
I think that is more or less how it would look in SQL, but the Google Sheets query language doesn't have a JOIN clause.
Is there any way to get this done?
The simplest thing you can do is:
=QUERY({A1:E, G1:G; A2:B, F2:F, D2:E, G2:G}, "where Col3 is not null", )
Something like this?
You can stack ranges with the same number of columns using {}.
This sample builds two QUERY results and stacks them together:
=ArrayFormula(
LAMBDA(DATA_1,DATA_2,
QUERY({DATA_1;DATA_2},"WHERE Col2 IS NOT NULL ORDER BY Col2",1)
)(
QUERY({A1:G4},"SELECT "&JOIN(",","Col"&{1,2,3,4,5,7}),1),
QUERY({A1:G4},"SELECT "&JOIN(",","Col"&{1,2,6,4,5,7})&" WHERE Col6 IS NOT NULL LABEL "&JOIN(",","Col"&{1,2,6,4,5,7}&"''"),1)
)
)
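The logic behind both formulas (emit every row once, then emit a second copy with Address2 moved into Address wherever Address2 is filled in) may be easier to see outside of Sheets. The following is only a plain-Python sketch of that idea, not a Sheets feature; the function name `expand_second_address` and the dictionary layout are hypothetical, and the empty strings stand for blank cells as inferred from the question's table.

```python
# Hypothetical rows mirroring the question's sheet (blank cells as "").
rows = [
    {"Status": "", "Name": "Joe Smith", "Address": "123 Smith St",
     "Phone": "201 555 3099", "Email": "Joe#stackoverflow.com",
     "Address2": "7th Avenue Sq", "Hire Date": ""},
    {"Status": "Q", "Name": "Jane Smith", "Address": "321 Not Smith St",
     "Phone": "", "Email": "", "Address2": "", "Hire Date": "12/15/1980"},
    {"Status": "", "Name": "Robert Smith", "Address": "",
     "Phone": "818 555 4321", "Email": "Robert#googlesheets.com",
     "Address2": "", "Hire Date": "12/13/1981"},
]


def expand_second_address(records):
    """Return the records without Address2, plus one extra copy per record
    that has an Address2, with that value placed in the Address column."""
    expanded = []
    for record in records:
        primary = {k: v for k, v in record.items() if k != "Address2"}
        expanded.append(primary)
        if record["Address2"]:
            expanded.append({**primary, "Address": record["Address2"]})
    return expanded


result = expand_second_address(rows)
for record in result:
    print(record["Name"], "|", record["Address"])
```

This keeps each duplicated row directly after its original, which matches the layout asked for in the question; the stacked QUERY formulas instead rely on a filter or an ORDER BY to arrange the combined rows.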

SQL multiple columns value into one column

Here is my table design:
Location
ID Name
1  AAA
2  BBB
3  CCC
Inventory
HostName LID1 LID2 LID3
Peter    1    2    3
Betty    2
Charlie  1    2
My expected result is shown below:
HostName Name
Peter AAA
BBB
CCC
Betty BBB
Charlie AAA
BBB
But when I run the SQL statement below, the result does not look like that:
SELECT HostName,location.Name AS Department
FROM inventory
INNER JOIN location ON inventory.LID1 = location.ID
UNION
SELECT HostName,location.Name AS Department
FROM inventory
INNER JOIN location ON inventory.LID2 = location.ID
UNION
SELECT HostName,location.Name AS Department
FROM inventory
INNER JOIN location ON inventory.LID3 = location.ID
Actual result:
HostName Name
Peter AAA
Peter BBB
Peter CCC
Betty BBB
Charlie AAA
Charlie BBB
Can anyone help me solve this problem?
Thanks.

finding manager id from employee table

I have table data like the following:
Emp_id Emp_name Dept_id
111 aaa 1
222 bbb 2
333 ccc 3
444 ddd 4
555 eee 5
I want to populate a new column, Manager_id, with the next Emp_id from the employee table (the last row wraps around to the first), as shown below:
Emp_id Emp_name Dept_id Manager_id
111 aaa 1 222
222 bbb 2 333
333 ccc 3 444
444 ddd 4 555
555 eee 5 111
Thanks for your help in advance!
You can return the value as:
select t.*,
       coalesce(lead(emp_id) over (order by emp_id),
                min(emp_id) over ()
               ) as manager_id
from t;
Perhaps a select query is sufficient. Actually modifying the table is a bit tricky and the best syntax depends on the database.
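To make the lead()/coalesce() idea concrete, here is a minimal sketch run in an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name `t` and the lower-case column names are assumptions. The UPDATE at the end is only one possible way to persist the value, written with correlated subqueries; it is an illustration that works in SQLite, not a claim about the best syntax for any other database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (emp_id INTEGER, emp_name TEXT, dept_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(111, "aaa", 1), (222, "bbb", 2), (333, "ccc", 3),
     (444, "ddd", 4), (555, "eee", 5)],
)

# lead() yields the next emp_id in order; it is NULL for the last row,
# so coalesce() falls back to the smallest emp_id (wrap-around).
selected = conn.execute(
    """
    SELECT t.*,
           coalesce(lead(emp_id) OVER (ORDER BY emp_id),
                    min(emp_id) OVER ()) AS manager_id
    FROM t
    ORDER BY emp_id
    """
).fetchall()

# One way to store the result (assumption: SQLite-style correlated UPDATE).
conn.execute("ALTER TABLE t ADD COLUMN manager_id INTEGER")
conn.execute(
    """
    UPDATE t
    SET manager_id = coalesce(
        (SELECT min(t2.emp_id) FROM t AS t2 WHERE t2.emp_id > t.emp_id),
        (SELECT min(emp_id) FROM t)
    )
    """
)
stored = conn.execute(
    "SELECT emp_id, manager_id FROM t ORDER BY emp_id"
).fetchall()

print(selected)
print(stored)
```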

Limiting records of combinations from 2 columns

I'm looking for some help limiting the results when querying combinations of 2 columns. Here's an example of the kind of table I am working with:
id name group state
1 Bob A NY
2 Jim A NY
3 Dan A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
8 Glen B FL
9 Ben C FL
I am trying to display the records for every combination of "group" and "state", but showing at most 2 records for each combination. The result should look like the following:
id name group state
1 Bob A NY
2 Jim A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
9 Ben C FL
Thanks for the help.
Assuming you always want the two rows with the lowest id for each group and state combination:
SELECT *
FROM (SELECT a.*,
             row_number() over (partition by group, state
                                order by id asc) rnk
      FROM your_table a) ranked
WHERE rnk <= 2
Of course, since group is a reserved word, I assume your column is actually named something else... You'd need to adjust my query to use the correct column name.
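Here is a small sketch of that query run against the question's data in an in-memory SQLite database via Python's sqlite3 module (SQLite 3.25 or later for window functions). Since group is reserved, the column is called `grp` here; that name and the alias `ranked` are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER, name TEXT, grp TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "A", "NY"), (2, "Jim", "A", "NY"), (3, "Dan", "A", "NY"),
        (4, "Mike", "A", "FL"), (5, "Tim", "B", "NY"), (6, "Sam", "B", "FL"),
        (7, "Brad", "B", "FL"), (8, "Glen", "B", "FL"), (9, "Ben", "C", "FL"),
    ],
)

# Rank rows inside each (grp, state) combination by id and keep the first two.
limited = conn.execute(
    """
    SELECT id, name, grp, state
    FROM (SELECT a.*,
                 row_number() OVER (PARTITION BY grp, state
                                    ORDER BY id ASC) AS rnk
          FROM your_table a) ranked
    WHERE rnk <= 2
    ORDER BY id
    """
).fetchall()

kept_ids = [row[0] for row in limited]
print(kept_ids)
```

Rows 3 (Dan) and 8 (Glen) are dropped because they are the third row of their combination, which matches the result table in the question.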

sql "group by" same PersonID, different PersonNames. Eliminate duplicates

I have a (rather dirty) datasource (excel) that looks like this:
ID | Name | Subject | Grade
123 | Smith, Joe R. | MATH | 2.0
123 | Smith, Joe Rodriguez | FRENCH | 3.0
234 | Doe, Mary Jane D.| BIOLOGY | 2.5
234 | Doe, Mary Jane Dawson| CHEMISTRY | 2.5
234 | Doe, Mary Jane | FRENCH | 3.5
My application's output should look like this:
Smith, Joe R.
123
MATH | 2.0
FRENCH | 3.0
So basically I want to do query (just for the ID/Person parent 'container') something like:
SELECT DISTINCT ID, Name FROM MyTable
or
SELECT ID, Name FROM MyTable GROUP BY ID
Of course both of the above are invalid and won't work.
I would like to 'combine' the same ID's and ignore/truncate the other records with the same ID/different Name (because we all know they're the same person since ID is our identifier and clearly it's just a typo/dirty data).
Can this be done by a single SELECT query?
If you don't really care which value shows up in the name field, use MAX() or MIN():
SELECT ID,
MAX(Name) AS Name
FROM [YourTable]
GROUP BY ID
Here's a working example to play with: https://data.stackexchange.com/stackoverflow/q/116699/
You can take the MIN or MAX value of Name:
SELECT ID, Max(Name)
FROM MyTable
GROUP BY ID
SELECT A.ID, A.NAME, T.Subject, T.Grade
FROM (SELECT ID, MIN(NAME) AS NAME
FROM MyTable
GROUP BY ID) A
LEFT JOIN MyTable T on A.ID = T.ID
This will give you something like:
123 Smith, Joe R. MATH 2.0
123 Smith, Joe R. FRENCH 3.0
234 Doe, Mary Jane BIOLOGY 2.5
234 Doe, Mary Jane CHEMISTRY 2.5
234 Doe, Mary Jane FRENCH 3.5
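The grouped subquery and the join back to the detail rows can be tried with the question's data as follows. This is a sketch using Python's sqlite3 module with an in-memory table; the cleaned-up name strings and the added ORDER BY (for a predictable row order) are assumptions of the example. MIN() here compares names with SQLite's default byte-wise text ordering, so other databases or collations could pick a different name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Name TEXT, Subject TEXT, Grade REAL)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (123, "Smith, Joe R.", "MATH", 2.0),
        (123, "Smith, Joe Rodriguez", "FRENCH", 3.0),
        (234, "Doe, Mary Jane D.", "BIOLOGY", 2.5),
        (234, "Doe, Mary Jane Dawson", "CHEMISTRY", 2.5),
        (234, "Doe, Mary Jane", "FRENCH", 3.5),
    ],
)

# One row per ID, keeping an arbitrary but deterministic name (the smallest).
people = conn.execute(
    "SELECT ID, MIN(Name) AS Name FROM MyTable GROUP BY ID ORDER BY ID"
).fetchall()

# Join the chosen name back to every subject/grade row of the same ID.
report = conn.execute(
    """
    SELECT A.ID, A.Name, T.Subject, T.Grade
    FROM (SELECT ID, MIN(Name) AS Name
          FROM MyTable
          GROUP BY ID) A
    LEFT JOIN MyTable T ON A.ID = T.ID
    ORDER BY A.ID, T.Subject
    """
).fetchall()

print(people)
for line in report:
    print(line)
```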
If you don't care which name you keep, you can use a MAX() or MIN() aggregate to pick just one name:
SELECT ID, MAX(Name) as Name
FROM MyTable GROUP BY ID