Count function to find people with ten references - sql

I need a bit of help with some university work (I am using SQL Developer).
Basically, the question is 'Find the different first keywords associated with references by prolific authors who have more than ten references'
So far I have only been able to do:
select surname, count(surname) from librarian.readings
group by surname
having count(surname) > 10
which gives me
SURNAME COUNT(SURNAME)
---------- --------------
White 16
Marble 11
Peuquet 14
Robinson 12
Rhind 15
However, this doesn't give me the keywords associated with those authors. I also tried:
select distinct surname,key1
from librarian.readings
but this just gives me too much information.
How do I do this?

This assumes that surname uniquely identifies an author:
SELECT
r.surname
,r.key1
FROM librarian.readings r
INNER JOIN (select
surname,
count(surname) surname_count
from librarian.readings
group by surname) sub
ON r.surname = sub.surname
WHERE sub.surname_count > 10
If Surname isn't unique then you need to do this by the primary key on librarian.readings.

If all you need is the distinct keywords associated with .... then:
select distinct key1
from librarian.readings
where surname in (select surname
from librarian.readings
group by surname
having count(*) > 10)
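As a rough check of the "filter authors first, then read their keywords" idea, here is a minimal sketch using Python's built-in sqlite3 module. The table is a stand-in for librarian.readings with only the two columns discussed, and the rows (including the keyword values) are invented for illustration; it still assumes surname identifies an author.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for librarian.readings (only the columns used here).
conn.execute("CREATE TABLE readings (surname TEXT, key1 TEXT)")

# Invented data: White has 11 references, Rhind only 3.
rows = [("White", "gis")] * 6 + [("White", "maps")] * 5 + [("Rhind", "census")] * 3
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

query = """
SELECT DISTINCT key1
FROM readings
WHERE surname IN (SELECT surname
                  FROM readings
                  GROUP BY surname
                  HAVING COUNT(*) > 10)
ORDER BY key1
"""
# Only keywords from authors with more than ten references survive the filter.
keywords = [row[0] for row in conn.execute(query)]
print(keywords)
```

The HAVING clause filters whole groups after aggregation, which is why it sits in the subquery rather than in the outer WHERE.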

How to combine two select statements into one

my table is,
ID First_Name Last_name manager_ID Unique_ID
12 Jon Doe 25 CN=Jon Doe, DC=test,DC=COM
25 Steve Smith 39 CN=steve smith, DC=test,dc=com
I want to write a SQL query that gives me the unique ID of a user's manager.
select manager_id from test where ID = '12'
This gives me the user's manager_ID, and
select unique_id from test where ID = '25'
gives me the manager's unique_id. Can I combine the two queries into one statement that returns the unique_id of the user's manager?
You are looking for a self-join:
select m.unique_id
from test t join
test m
on t.manager_id = m.id
where t.ID = 12;
Note that I removed the single quotes around 12. Presumably, id is an integer, and you should not compare an integer to a string.
Instead of joining it to the same table, you can also make a nested subquery statement like this.
SELECT unique_id FROM test WHERE ID =(SELECT manager_id FROM test WHERE ID = 12);
The inner query outputs the manager_id where id of person equals 12 and the outer query gives the unique_id of the related manager.
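To see that the self-join and the nested subquery return the same row, here is a small sketch with Python's sqlite3 module. The column types are assumptions (the question does not state them), and the data is the two sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only shows the column names.
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, manager_id INTEGER, unique_id TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?, ?)",
    [
        (12, "Jon", "Doe", 25, "CN=Jon Doe, DC=test,DC=COM"),
        (25, "Steve", "Smith", 39, "CN=steve smith, DC=test,dc=com"),
    ],
)

# Self-join: t is the user row, m is the manager row.
join_result = conn.execute(
    "SELECT m.unique_id FROM test t JOIN test m ON t.manager_id = m.id "
    "WHERE t.id = ?",
    (12,),
).fetchall()

# Nested subquery: the inner query looks up the manager's id first.
subquery_result = conn.execute(
    "SELECT unique_id FROM test "
    "WHERE id = (SELECT manager_id FROM test WHERE id = ?)",
    (12,),
).fetchall()

print(join_result, subquery_result)
```

Both forms return nothing when the manager's row is missing, for example for id 25 in this sample, whose manager 39 is not in the table.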

Hive - How to combine a group by over columns A and B and a distinct over column C

I need to create a query which selects from a particular table the users which have more than one different email. To distinguish users, I group them based on two fields: name and age. Let's see this with an example.
So I have a table like this:
name age email phone
----------------------------------
Andy 20 Andy#du 1234
Berni 21 Berni#du 2345
Carol 22 Carol#du 3456
Andy 20 Andy#du 4321
Berni 21 Berni#et 2345
Dody 28 Dodi#du 7869
Carol 22 Carol#pt 3456
What I want to get is:
Berni 21 Berni#du, Berni#et
Carol 22 Carol#du, Carol#pt
Note that Andy also appears twice in the table, but with the same email (only the phone number changes). Because of this user I need a distinct over email, so that only users with two different emails are selected.
With this query I am able to solve the issue and I have the desired result.
select * from
(
select aux.name,
aux.age,
concat_ws(',',collect_set(email)) as email
FROM
(select a.name, a.age, a.email
FROM TestUsers a
RIGHT JOIN
(select name,
age
FROM TestUsers
GROUP BY
name,
age
having count(*) > 1
)b
ON a.name = b.name
AND a.age = b.age
)aux
GROUP BY aux.name,
aux.age
)tr
where locate(",",tr.email) > 0;
But I am sure there has to be a more efficient way than checking whether there is a comma in the email field (which means more than one email).
Has anyone in mind a better approach?
If I understand correctly, you should be able to do this using a having clause:
select tu.name, tu.age,
concat_ws(',', collect_list(tu.email)) as emails
from (select distinct tu.name, tu.age, tu.email
from TestUsers tu
) tu
group by tu.name, tu.age
having count(*) > 1;
Actually, because collect_set() removes duplicates, this should work without a subquery:
select tu.name, tu.age,
concat_ws(',', collect_set(tu.email)) as emails
from testusers tu
group by tu.name, tu.age
having min(tu.email) <> max(tu.email);
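The HAVING condition can be tried outside Hive. The sketch below uses Python's sqlite3 module with the sample rows from the question; SQLite has no collect_set(), so it only returns the matching users and compares the min/max test against the more explicit COUNT(DISTINCT email) > 1, which is an alternative not mentioned in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestUsers (name TEXT, age INTEGER, email TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO TestUsers VALUES (?, ?, ?, ?)",
    [
        ("Andy", 20, "Andy#du", "1234"),
        ("Berni", 21, "Berni#du", "2345"),
        ("Carol", 22, "Carol#du", "3456"),
        ("Andy", 20, "Andy#du", "4321"),
        ("Berni", 21, "Berni#et", "2345"),
        ("Dody", 28, "Dodi#du", "7869"),
        ("Carol", 22, "Carol#pt", "3456"),
    ],
)

# min <> max is true only when a group holds at least two different emails.
min_max = conn.execute(
    "SELECT name, age FROM TestUsers GROUP BY name, age "
    "HAVING MIN(email) <> MAX(email) ORDER BY name"
).fetchall()

# Equivalent, more explicit formulation.
count_distinct = conn.execute(
    "SELECT name, age FROM TestUsers GROUP BY name, age "
    "HAVING COUNT(DISTINCT email) > 1 ORDER BY name"
).fetchall()

print(min_max)
```

Andy is excluded by both conditions because his two rows carry the same email.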

Display a blank row between every unique row?

I have a simple query like:
SELECT employee, ITEM_TYPE, COUNT(ITEM_TYPE)
FROM hr_database
GROUP BY employee, ITEM_TYPE
So the output may look like
BOB MUGS 4
BOB PENCILS 10
CAT MUGS 2
CAT PAPERCLIPS 7
SAL MUGS 11
But for readability, I want to put a blank row between each user in the output, like this:
BOB MUGS 4
BOB PENCILS 10

CAT MUGS 2
CAT PAPERCLIPS 7

SAL MUGS 11
Is there a way to do this in Oracle SQL? So far, I found this link but it doesn't match what I need. I'm thinking of using a WITH clause in the query.
You can do it in the database, but this type of processing should really be done at the application layer.
But, it is kind of an amusing trick to figure out how to do it in the database, and that is your specific question:
WITH e AS (
SELECT employee, ITEM_TYPE, COUNT(ITEM_TYPE) as cnt
FROM hr_database
GROUP BY employee, ITEM_TYPE
)
SELECT (case when cnt is not null then employee end) as employee,
item_type, cnt
FROM (select employee, item_type, cnt, 1 as x from e union all
select distinct employee, NULL, NULL, 2 as x from e
) e
ORDER BY e.employee, x;
I emphasize, though, that this is really for amusement and perhaps for understanding better how SQL works. In the real world, you do this type of work at the application layer.
A summary of how this works: the union all brings in one additional row for each employee. The x column is a sort priority, because you have to sort the result set to put each extra row after its employee's real rows. The case expression is needed to keep the employee name out of the first column of the extra rows; cnt is NULL only on those rows and should never be NULL for the valid rows.
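The trick described above can be checked with a small sketch in Python's sqlite3 module (SQLite rather than Oracle, so treat it as an approximation). The rows are invented, the subquery alias u and the output column shown_employee are names chosen here, and item_type is added as a final sort key to make the order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hr_database (employee TEXT, item_type TEXT)")
# Invented rows: BOB has 2 mugs and 1 pencil, CAT has 1 mug.
conn.executemany(
    "INSERT INTO hr_database VALUES (?, ?)",
    [("BOB", "MUGS"), ("BOB", "MUGS"), ("BOB", "PENCILS"), ("CAT", "MUGS")],
)

query = """
WITH e AS (
    SELECT employee, item_type, COUNT(item_type) AS cnt
    FROM hr_database
    GROUP BY employee, item_type
)
SELECT CASE WHEN u.cnt IS NOT NULL THEN u.employee END AS shown_employee,
       u.item_type, u.cnt
FROM (
    SELECT employee, item_type, cnt, 1 AS x FROM e
    UNION ALL
    -- one extra all-NULL row per employee, sorted after the real rows
    SELECT DISTINCT employee, NULL, NULL, 2 AS x FROM e
) AS u
ORDER BY u.employee, u.x, u.item_type
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that the last employee also gets a trailing separator row; dropping it would need extra logic, which is one more reason to do this formatting in the application.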
You can also try it like this, with a plain union and distinct:
select emp,item_type,cnt from
(select distinct ' ' as emp,' ' as item_type ,' ' as cnt, employee
from hr_database
union
select employee as emp,item_type ,to_char(count(item_type)) as cnt, employee
from hr_database
group by employee,item_type)a
order by a.employee, case when a.emp = ' ' then 1 else 0 end

write a query to identify discrepancy

I have a table with student IDs and student names. There have been issues with assigning unique student IDs to students, and hence I want to find the duplicates.
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display count of duplicate records.
For example, "Duplicate_Count" would display "3" for Student ID = 1, and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc;
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The virtual table gives the count for each StudentId, this is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold dupcount, this query can be used in an update statement to update that column in the table
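Here is a minimal sketch of the join-back approach, run through Python's sqlite3 module on the sample data from the question. The table name students and the snake_case column names are placeholders chosen here, since the original uses Table as a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "students" is a placeholder name; the answers call it Table / mytable.
conn.execute("CREATE TABLE students (student_id INTEGER, student_name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [
        (1, "Jack"), (1, "John"), (1, "Bill"),
        (2, "Amanda"), (2, "Molly"),
        (3, "Ron"), (4, "Matt"), (5, "James"),
        (6, "Kathy"), (6, "Will"),
    ],
)

query = """
SELECT s.student_id, s.student_name, c.dup_count
FROM students AS s
JOIN (SELECT student_id, COUNT(*) AS dup_count
      FROM students
      GROUP BY student_id) AS c
  ON s.student_id = c.student_id
ORDER BY s.student_id, s.student_name
"""
result = conn.execute(query).fetchall()
# Map each name to the number of rows sharing its student_id.
counts = {name: dup for _, name, dup in result}
print(counts)
```

Students with a unique ID get a count of 1; adding HAVING COUNT(*) > 1 to the derived table would restrict the output to the actual duplicates.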
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by @HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.

how to find people with same family name?

You have a table with 4 columns:
Primary key / name / surname / middle name
How do you write a SQL query to find people who have the same family name?
1 / Ivan / Ivanov / Ivanovich
2 / Petr / Levinsky / Aleksandrovich
3 / Alex / Ivanov / albertovich
Should return Ivan and Alex
Thanks
In standard SQL you can simply join the table with itself:
select a.name, b.name
from t as a, t as b
where a.surname = b.surname and a.id < b.id
where t is your table and id is the primary key column.
This returns all distinct pairs of first names for every surname that has multiple entries.
You might want to add surname to the list of selected columns.
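A quick way to try the self-join is Python's sqlite3 module with the three sample rows from the question. The table name t and key column id follow the answer's wording; middle_name is an assumed column name, and the second query shows the alternative of listing individual names via a grouped subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# middle_name is an assumed column name for the fourth column.
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, middle_name TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "Ivan", "Ivanov", "Ivanovich"),
        (2, "Petr", "Levinsky", "Aleksandrovich"),
        (3, "Alex", "Ivanov", "albertovich"),
    ],
)

# a.id < b.id keeps each pair once and stops a row pairing with itself.
pairs = conn.execute(
    "SELECT a.name, b.name, a.surname FROM t AS a JOIN t AS b "
    "ON a.surname = b.surname AND a.id < b.id"
).fetchall()

# Alternative: one row per person whose surname occurs more than once.
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM t WHERE surname IN "
        "(SELECT surname FROM t GROUP BY surname HAVING COUNT(*) > 1) "
        "ORDER BY name"
    )
]
print(pairs, names)
```

With three or more people sharing a surname, the self-join returns every pair, while the IN form still returns each person once.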
If you want the individual names rather than pairs, first find all surnames that appear more than once and then select every name with one of those surnames:
select name
from t
where surname in (select surname from t group by surname having count(surname) > 1);
To me, the easiest way is to group the records by surname and then select the groups with a count greater than 1.
You want to GROUP BY the surname and then use a HAVING clause to find any groups that have > 1.
Untested:
SELECT
name
FROM
theTable
WHERE Surname IN (
SELECT
Surname
FROM
theTable
GROUP BY
Surname
HAVING
COUNT(Surname) > 1)
select surname,group_concat(firstname)
from people
group by surname
having count(firstname)> 1;