SQL: select from same table and same column, just different counts

I have a table called names. I want to select 2 names from the distinct values (after grouping with count(*), aliased as uniq), and then another 2 names from the entire sample pool.
firstname
John
John
Jessica
Mary
Jessica
John
David
Walter
So the first 2 names would be drawn from the distinct pool (John, Jessica, Mary, etc.), giving each name an equal chance of being selected, while the second 2 names would be drawn from the entire pool, so names with multiple rows, such as John and Jessica, are more likely to be picked.
I'm sure there's a way to do this but I just can't figure it out. I want to do something like
SELECT uniq.firstname
FROM (SELECT firstname, count(*) as count from names GROUP BY firstname) uniq
limit 2
AND
SELECT firstname
FROM (SELECT firstname from names) limit 2
Is this possible? Appreciate any pointers.

I think you are close but you need some randomness for the sampling:
(SELECT uniq.firstname
FROM (SELECT firstname, count(*) as count from names GROUP BY firstname) uniq
ORDER BY rand()
limit 2
)
UNION ALL
(SELECT firstname
FROM names
ORDER BY rand()
limit 2
)
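As a rough way to try this out, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory copy of the question's names table. It is an illustration under assumptions, not the answer's exact query: SQLite spells the function RANDOM() rather than rand(), each half is wrapped in a derived table because SQLite does not accept ORDER BY or LIMIT on the individual arms of a UNION ALL, and the source column is added here only to label which pool a row came from.

```python
import sqlite3

# Hypothetical in-memory copy of the question's `names` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (firstname TEXT)")
rows = ["John", "John", "Jessica", "Mary", "Jessica", "John", "David", "Walter"]
conn.executemany("INSERT INTO names VALUES (?)", [(r,) for r in rows])

# First arm: 2 random names from the distinct values (equal chances).
# Second arm: 2 random rows from the whole table (weighted by row count).
query = """
SELECT firstname, 'distinct pool' AS source FROM (
    SELECT firstname FROM names GROUP BY firstname
    ORDER BY RANDOM() LIMIT 2
)
UNION ALL
SELECT firstname, 'entire pool' AS source FROM (
    SELECT firstname FROM names
    ORDER BY RANDOM() LIMIT 2
)
"""
sample = conn.execute(query).fetchall()
for firstname, source in sample:
    print(source, firstname)
```

The two names from the distinct pool are always different from each other, while the two from the entire pool can both be John, since John occupies three of the eight rows.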

You can use RAND() or a similar function to achieve this, depending on the database.
MySQL:
SELECT firstname
FROM (SELECT firstname, COUNT(*) AS count FROM names GROUP BY firstname) AS uniq
ORDER BY RAND()
LIMIT 2
PostgreSQL:
SELECT firstname
FROM (SELECT firstname, COUNT(*) AS count FROM names GROUP BY firstname) AS uniq
ORDER BY RANDOM()
LIMIT 2
Microsoft SQL Server:
SELECT TOP 2 firstname
FROM (SELECT firstname, COUNT(*) AS count FROM names GROUP BY firstname) AS uniq
ORDER BY NEWID()
IBM DB2:
SELECT firstname , RAND() as IDX
FROM (SELECT firstname, COUNT(*) AS count FROM names GROUP BY firstname) AS uniq
ORDER BY IDX FETCH FIRST 2 ROWS ONLY
Oracle:
SELECT firstname
FROM (SELECT firstname, COUNT(*) AS count FROM names GROUP BY firstname ORDER BY dbms_random.value) uniq
WHERE rownum <= 2
Follow a similar approach for selecting from the entire pool: drop the GROUP BY subquery and select directly from names.

Related

SQL Server : finding duplicates based on first few characters on column

I want to find duplicates based on the first three characters of the surname. Is there a way to do that in SQL? I can compare the whole name, but how do we compare only the first few characters?
Below is my table
custid  forename  surname  dateofbirth
--------------------------------------
1       David     John     16-09-1985
2       David     Jon      16-09-1985
3       Sarah     Smith    10-08-2015
4       Peter     Proca    11-06-2011
5       Peter     Proka    11-06-2011
This is my query that I am currently running to compare
SELECT
y.id, y.forename, y.surname
FROM
customers y
INNER JOIN
(SELECT
forename, surname, COUNT(*) AS CountOf
FROM customers
GROUP BY forename, surname
HAVING COUNT(*) > 1) dt ON y.forename = dt.forename

You can use left():
select c.*
from (select c.*, count(*) over (partition by left(surname, 3)) as cnt
from customers c
) c
where cnt > 1
order by surname;
You can include the forename as well in the partition by if you mean forename and first three letters of surname.

You can use exists as follows:
select t.* from t
Where exists
(select 1 from t tt
Where left(t.surname, 3) = left(tt.surname, 3) and t.custid <> tt.custid
)
order by t.surname;
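To check both approaches against the sample data, here is a small sketch using Python's sqlite3 module. It rests on a few assumptions: SQLite has no left() function, so substr(surname, 1, 3) stands in for left(surname, 3); the window version needs SQLite 3.25 or later; and the window query filters on cnt > 1 so that only rows sharing a prefix with another row are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(custid INTEGER, forename TEXT, surname TEXT, dateofbirth TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "David", "John", "16-09-1985"),
        (2, "David", "Jon", "16-09-1985"),
        (3, "Sarah", "Smith", "10-08-2015"),
        (4, "Peter", "Proca", "11-06-2011"),
        (5, "Peter", "Proka", "11-06-2011"),
    ],
)

# Window-function version: count rows per 3-character surname prefix.
window_query = """
SELECT custid FROM (
    SELECT c.*, COUNT(*) OVER (PARTITION BY substr(surname, 1, 3)) AS cnt
    FROM customers c
)
WHERE cnt > 1
ORDER BY custid
"""

# EXISTS version: keep a row if another row shares the same prefix.
exists_query = """
SELECT t.custid FROM customers t
WHERE EXISTS (
    SELECT 1 FROM customers tt
    WHERE substr(t.surname, 1, 3) = substr(tt.surname, 1, 3)
      AND t.custid <> tt.custid
)
ORDER BY t.custid
"""

window_ids = [r[0] for r in conn.execute(window_query)]
exists_ids = [r[0] for r in conn.execute(exists_query)]
print(window_ids, exists_ids)  # [4, 5] [4, 5]
```

Note that John and Jon are not reported as duplicates here, because their first three characters (Joh and Jon) differ. Only Proca and Proka share the prefix Pro.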

How to combine SELECT DISTINCT and ROWNUM in Oracle Query

I need to combine the two MySQL statements below into a single ORACLE query if possible.
The initial query is
SELECT DISTINCT FIRST_NAME FROM PEOPLE WHERE LAST_NAME IN ("Smith","Jones","Gupta")
then based on each FIRST_NAME returned I query
SELECT *
FROM PEOPLE
WHERE FIRST_NAME = {FIRST_NAME}
AND LAST_NAME IN ("Smith","Jones","Gupta")
ORDER BY FIELD(LAST_NAME, "Smith","Jones","Gupta") DESC
LIMIT 1
The "List of last names" serves as a "default / override" indicator, so I only have one person for each first name, and where multiple rows for the same first name exist, only the Last match from the list of "Last Names" is used.
I need a SQL query that returns the last row from the "in" clause based on the order of the values in the IN(a,b,c). Here is a sample table, and the results I need from the query.
For the Table PEOPLE, with values
LAST_NAME FIRST_NAME
.....
Smith Mike
Smith Betty
Smith Jane
Jones Mike
Jones Sally
....
I need a query based on DISTINCT FIRST_NAME and LAST_NAME IN ('Smith','Jones') that returns
Betty Smith
Jane Smith
Mike Jones
Sally Jones

You can do it like this:
select first_name, last_name
from (
select p.first_name,
p.last_name,
row_number() over (partition by p.first_name
order by case p.last_name
when 'Smith' then 1
when 'Jones' then 2
when 'Gupta' then 3
end desc) as rn
from people p
where p.last_name in ('Smith','Jones','Gupta')
)
where rn = 1;
EDIT
It's not hard to get more columns. I'm sure you could have figured it out with a bit more effort:
select *
from (
select p.*,
row_number() over (partition by p.first_name
order by case p.last_name
when 'Smith' then 1
when 'Jones' then 2
when 'Gupta' then 3
end desc) as rn
from people p
where p.last_name in ('Smith','Jones','Gupta')
)
where rn = 1;
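The same pattern can be tried outside Oracle. The sketch below runs it with Python's sqlite3 module (window functions need SQLite 3.25 or later) against only the five sample rows shown in the question. The outer ORDER BY first_name is an addition here to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (last_name TEXT, first_name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("Smith", "Mike"),
        ("Smith", "Betty"),
        ("Smith", "Jane"),
        ("Jones", "Mike"),
        ("Jones", "Sally"),
    ],
)

# Rank each person's rows by the position of last_name in the list,
# highest position first, and keep only the top-ranked row per first_name.
query = """
SELECT first_name, last_name FROM (
    SELECT p.first_name, p.last_name,
           ROW_NUMBER() OVER (
               PARTITION BY p.first_name
               ORDER BY CASE p.last_name
                            WHEN 'Smith' THEN 1
                            WHEN 'Jones' THEN 2
                            WHEN 'Gupta' THEN 3
                        END DESC
           ) AS rn
    FROM people p
    WHERE p.last_name IN ('Smith', 'Jones', 'Gupta')
)
WHERE rn = 1
ORDER BY first_name
"""
result = conn.execute(query).fetchall()
for first_name, last_name in result:
    print(first_name, last_name)
```

Mike appears under both Smith and Jones. Jones ranks later in the list, so the Jones row is the one kept, which matches the result the question asks for.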

Or like this:
select first_name,
max(last_name)
keep (dense_rank first order by decode(last_name,
'Smith', 1,
'Jones', 2,
'Gupta', 3) desc) as last_name
from people
where last_name in ('Smith', 'Jones', 'Gupta')
group by first_name
Oracle's "FIRST"/"LAST" functions let you get values from other columns of the row with the maximum/minimum value (for example, the last_name of the employee with the maximum salary or, as in this case, the last_name from the row with the maximum rank).
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions056.htm

sql command to display highest repeated field in a column

How can I display the most frequently repeated value in a column in SQL?
For example, if a column contains:
jack
jack
john
john
john
how can I display the most repeated value (i.e. john) from the above column?

select chairman
from mytable
group by chairman
HAVING COUNT(*) = (
select TOP 1 COUNT(*)
from mytable
group by chairman
ORDER BY COUNT(*) DESC)

select name from persons
group by name
having count(*) = (
select count(*) from persons
group by name
order by count(*) desc
limit 1)
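As a quick check of the LIMIT-based version, here is a minimal sketch using Python's sqlite3 module and the five sample values from the question. The table name persons and column name come from the query above; everything else is setup for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?)",
    [(n,) for n in ["jack", "jack", "john", "john", "john"]],
)

# The subquery finds the largest group size; the outer query returns
# every name whose group has that size.
query = """
SELECT name FROM persons
GROUP BY name
HAVING COUNT(*) = (
    SELECT COUNT(*) FROM persons
    GROUP BY name
    ORDER BY COUNT(*) DESC
    LIMIT 1
)
"""
most_common = [r[0] for r in conn.execute(query)]
print(most_common)  # ['john']
```

Because the outer query compares against the maximum count rather than taking a single row, it returns all tied names if two values are repeated equally often.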

Need a hand with a simple query

I need help with a query. I think it is not too difficult.
I need to do a select with distinct and, at the same time, a count(*) of how many rows there are for each distinct value.
One example:
Table names:
Id  Name
1   john
2   john
3   mary
I need a query that returns:
Name  Total
john  2
mary  1

select name, count(*) from names group by name;

SELECT name, COUNT(*) FROM names GROUP BY name

SELECT name, count(*) as occurrences FROM names GROUP BY name
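A short sketch with Python's sqlite3 module shows the GROUP BY query producing the requested output for the sample table. The alias total and the ORDER BY name are additions here, used only to label the column and fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "john"), (2, "john"), (3, "mary")],
)

# One output row per distinct name, with the number of rows in each group.
totals = conn.execute(
    "SELECT name, COUNT(*) AS total FROM names GROUP BY name ORDER BY name"
).fetchall()
print(totals)  # [('john', 2), ('mary', 1)]
```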

In a SQL GROUP BY query, what value is used for the non-aggregate columns?

Say I've got the following data back from a SQL query:
Lastname Firstname Age
Anderson Jane 28
Anderson Lisa 22
Anderson Jack 37
If I want to know the age of the oldest person with the last name Anderson, I can select MAX(Age) and GROUP BY Lastname. But I also want to know the first name of that oldest person. How can I make sure that, when the Firstname values are collapsed into one row by the GROUP BY, I get the Firstname value from the same row where I got the max age?

For those RDBMS that support it (e.g., SQL Server 2005+), you can use a window function:
select t.Lastname, t.Firstname, t.Age
from (select Lastname, Firstname, Age,
row_number() over (partition by Lastname order by Age desc) as RowNum
from YourTable
) t
where t.RowNum = 1
For others, you'd need a subquery on Lastname and a join to get Firstname:
select yt.Lastname, yt.Firstname, yt.Age
from YourTable yt
inner join (select LastName, max(Age) as MaxAge
from YourTable
group by LastName) q
on yt.Lastname = q.Lastname
and yt.Age = q.MaxAge

You have to join back to the table from your grouped results - i.e. create a view or a nested query to contain the group by.

The main thing to watch out for, whatever your approach, is that there might be more than one firstname with the same age for a given lastname.
This query will return just 1 row, but if your data set had more than one 'Anderson' aged 37, it could return either one:
select firstname, age
from yourtable
where lastname = 'Anderson'
order by age desc limit 1
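The tie behaviour can be seen in a small sketch using Python's sqlite3 module. The first three rows are the question's sample data; the fourth row (Joan, 37) is a hypothetical addition made here purely to create a tie. The join query is the subquery-and-join approach from the earlier answer, with an ORDER BY added for predictable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (lastname TEXT, firstname TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        ("Anderson", "Jane", 28),
        ("Anderson", "Lisa", 22),
        ("Anderson", "Jack", 37),
        ("Anderson", "Joan", 37),  # hypothetical row, added to force a tie
    ],
)

# Joining back on MAX(age) returns every row that ties for the maximum.
join_query = """
SELECT yt.firstname, yt.age
FROM yourtable yt
INNER JOIN (SELECT lastname, MAX(age) AS maxage
            FROM yourtable
            GROUP BY lastname) q
   ON yt.lastname = q.lastname
  AND yt.age = q.maxage
ORDER BY yt.firstname
"""

# ORDER BY ... LIMIT 1 returns exactly one row; which tied row is unspecified.
limit_query = """
SELECT firstname, age
FROM yourtable
WHERE lastname = 'Anderson'
ORDER BY age DESC LIMIT 1
"""

tied = conn.execute(join_query).fetchall()
single = conn.execute(limit_query).fetchall()
print(tied)    # [('Jack', 37), ('Joan', 37)]
print(single)  # one row with age 37
```

If exactly one row per lastname is required, adding a tie-breaker column to the ORDER BY (for example the firstname) makes the choice deterministic.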
