How to handle duplicates created by LEFT JOIN - sql

LEFT TABLE:
+------+---------+--------+
| Name | Surname | Salary |
+------+---------+--------+
| Foo | Bar | 100 |
| Foo | Kar | 300 |
| Fo | Ba | 35 |
+------+---------+--------+
RIGHT TABLE:
+------+-------+
| Name | Bonus |
+------+-------+
| Foo | 10 |
| Foo | 20 |
| Foo | 50 |
| Fo | 10 |
| Fo | 100 |
| F | 1000 |
+------+-------+
DESIRED OUTPUT:
+------+---------+--------+-------+
| Name | Surname | Salary | Bonus |
+------+---------+--------+-------+
| Foo | Bar | 100 | 80 |
| Foo | Kar | 300 | 0 |
| Fo | Ba | 35 | 110 |
+------+---------+--------+-------+
The closest I get is this:
SELECT
a.Name,
Surname,
sum(Salary),
sum(Bonus)
FROM (SELECT
Name,
Surname,
sum(Salary) as Salary
FROM input
GROUP BY 1,2) a LEFT JOIN (SELECT Name,
SUM(Bonus) as Bonus
FROM input2
GROUP BY 1) b
ON a.Name = b.Name
GROUP BY 1,2;
Which gives:
+------+---------+-------------+------------+
| Name | Surname | sum(Salary) | sum(Bonus) |
+------+---------+-------------+------------+
| Fo | Ba | 35 | 110 |
| Foo | Bar | 100 | 80 |
| Foo | Kar | 300 | 80 |
+------+---------+-------------+------------+
I can't figure out how to get rid of the Bonus duplication. The ideal solution for me would be as shown in 'DESIRED OUTPUT': add the Bonus to only one row per Name, and add 0 for the other rows with the same Name.

You can use row_number():
select l.*, (case when l.seqnum = 1 then r.bonus else 0 end) as bonus
from (select l.*, row_number() over (partition by name order by salary) as seqnum
from "left" l
) l left join
(select r.name, sum(bonus) as bonus
from "right" r
group by r.name
) r
on r.name = l.name
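
To check the row_number() approach against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The table names left_t and right_t are stand-ins chosen for this example (left and right are reserved words), and SQLite 3.25 or later is assumed for window-function support. A COALESCE is added as an assumption so that a name without any bonus rows would get 0 rather than NULL.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t (name TEXT, surname TEXT, salary INTEGER);
    CREATE TABLE right_t (name TEXT, bonus INTEGER);
    INSERT INTO left_t VALUES ('Foo', 'Bar', 100), ('Foo', 'Kar', 300), ('Fo', 'Ba', 35);
    INSERT INTO right_t VALUES ('Foo', 10), ('Foo', 20), ('Foo', 50),
                               ('Fo', 10), ('Fo', 100), ('F', 1000);
""")

query = """
    SELECT l.name, l.surname, l.salary,
           -- only the first row of each name receives the summed bonus
           CASE WHEN l.seqnum = 1 THEN COALESCE(r.bonus, 0) ELSE 0 END AS bonus
    FROM (SELECT name, surname, salary,
                 ROW_NUMBER() OVER (PARTITION BY name ORDER BY salary) AS seqnum
          FROM left_t) l
    LEFT JOIN (SELECT name, SUM(bonus) AS bonus
               FROM right_t
               GROUP BY name) r
           ON r.name = l.name
    ORDER BY l.name DESC, l.surname
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ordering the window by salary makes the choice of the "first" row deterministic: the lowest-paid row of each name carries the bonus.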

Try ROW_NUMBER() partitioned by Name. This numbers the duplicate rows within each Name. You can then return the bonus when this number is 1 and 0 otherwise. The code can look something like this:
SELECT
a.Name,
a.Surname,
a.Salary,
CASE WHEN a.Duplicate_Order = 1
THEN COALESCE(b.Bonus, 0)
ELSE 0
END AS Bonus
FROM (SELECT
Name,
Surname,
SUM(Salary) AS Salary,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Surname) AS Duplicate_Order
FROM input
GROUP BY Name, Surname) a
LEFT JOIN (SELECT Name,
SUM(Bonus) AS Bonus
FROM input2
GROUP BY Name) b
ON a.Name = b.Name;
Hope that helps!

You can use a correlated subquery with sum() to compute the bonus column, and then apply the lag() window function to zero out the bonus on every row after the first one for each name:
select Name, Surname, Salary,
bonus - lag(bonus::int,1,0) over (partition by name order by salary) as bonus
from
(
select i1.*,
( select sum(Bonus)
from input2 i2
where i1.Name = i2.Name
group by i2.Name ) as bonus
from input i1
) ii
order by name desc, surname;

Related

Join Table From Minimum Value and Specific Name

I have:
Table id
+--------+
| number |
+--------+
| 1 |
| 2 |
| 3 |
+--------+
Table data
+-------+--------------+
| name | phone_number |
+-------+--------------+
| Bob | 111 |
| John | 333 |
| Alice | 555 |
+-------+--------------+
How can I join the tables to get this result (the minimum number, together with the row where name = 'John')?
+--------+-------+--------------+
| number | name | phone_number |
+--------+-------+--------------+
| 1 | John | 333 |
+--------+-------+--------------+
You can try the query below:
select
(select min(number) from id) as number, name, phone_number
from data
where name = 'John'
You can use a cross join:
select min(number) as number, name, phone_number
from Table_Id
cross join Table_Data
where name = 'John'
group by name, phone_number
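Depending on the RDBMS you're using, this query should get you close. Most systems require an alias on the derived table, and the string comparison may be case-sensitive.
SELECT
M.MIN_NUMBER, NAME, PHONE_NUMBER
FROM
DATA LEFT JOIN (SELECT MIN(NUMBER) AS MIN_NUMBER FROM ID) M ON 1=1
WHERE NAME = 'John'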
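
As a quick check of the idea shared by these answers (compute the minimum once, then attach it to the filtered row), here is a small sketch using Python's built-in sqlite3 module. The aliases d and m are introduced only for this example.

```python
import sqlite3

# In-memory database with the two sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE id (number INTEGER);
    CREATE TABLE data (name TEXT, phone_number INTEGER);
    INSERT INTO id VALUES (1), (2), (3);
    INSERT INTO data VALUES ('Bob', 111), ('John', 333), ('Alice', 555);
""")

query = """
    SELECT m.min_number AS number, d.name, d.phone_number
    FROM data d
    -- the derived table has exactly one row, so the cross join adds no duplicates
    CROSS JOIN (SELECT MIN(number) AS min_number FROM id) m
    WHERE d.name = 'John'
"""
result = conn.execute(query).fetchall()
print(result)
```

Because the aggregate subquery always yields a single row, the join cannot multiply the rows of data.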

SQL - finding the value between rows

I have 2 tables:
Employee table with 2 columns Name and Sales
Rewards table with 2 columns Bonus and Range
Sample data:
Employee Rewards
| Name | Sales | | Bonus | Range |
+------+-------+ +-------+-------+
| John | 112 | | 2 | 200 |
| Mary | 201 | | 3 | 300 |
| Joe | 400 | | 5 | 500 |
| Jack | 300 |
Each employee earns the bonus from the Rewards table for the smallest Range that satisfies sales <= Rewards.Range.
I want to select Employee.Name and Rewards.Bonus.
In this case the result should be:
| Name | Bonus |
+------+-------+
| John | 2 |
| Mary | 3 |
| Joe | 5 |
| Jack | 3 |
Any idea what this SQL query will be?
Thanks,
zb
I suggest this approach. There might be a syntax error or two.
select name
, (select bonus from rewards
where range =
(select min(range)
from rewards
where range >= sales)
)
from employee
I just tested this one out and it retrieved the right results with your test tables:
SELECT a.name,
MIN(b.bonus) AS bonus
FROM db.employee a
INNER JOIN db.rewards b
ON a.sales <= b.range
GROUP BY a.name;
I'd use lag to get the bottom of each range and join it to the employee table. The lower bound has to be exclusive; otherwise sales that sit exactly on a boundary (Jack's 300) match two ranges:
SELECT name, bonus
FROM employee e
JOIN (SELECT bonus,
range AS top_range,
COALESCE(LAG(range) OVER (ORDER BY range ASC), 0) AS bottom_range
FROM rewards) r ON e.sales > r.bottom_range AND e.sales <= r.top_range
Another option is to rank the qualifying ranges per employee and keep the smallest one:
;with OrderedBonuses as (
select e.name, r.bonus, row_number() over (partition by e.name order by r.Range asc) as ord
from Employee e
JOIN Rewards r on e.Sales <= r.Range
)
select name, bonus
from OrderedBonuses
where ord = 1;
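
To confirm which range each employee should land in, the following sketch runs a row_number() lookup over the sample data with Python's built-in sqlite3 module (SQLite 3.25 or later assumed). The column name range is quoted because it is a keyword in SQLite; the CTE name ordered is an assumption made for this example. The ranges are sorted ascending so that ord = 1 is the smallest range that still covers the employee's sales.

```python
import sqlite3

# In-memory database with the sample Employee and Rewards rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, sales INTEGER);
    CREATE TABLE rewards (bonus INTEGER, "range" INTEGER);
    INSERT INTO employee VALUES ('John', 112), ('Mary', 201), ('Joe', 400), ('Jack', 300);
    INSERT INTO rewards VALUES (2, 200), (3, 300), (5, 500);
""")

query = """
    WITH ordered AS (
        SELECT e.name, r.bonus,
               -- rank every qualifying range for the employee, smallest first
               ROW_NUMBER() OVER (PARTITION BY e.name ORDER BY r."range") AS ord
        FROM employee e
        JOIN rewards r ON e.sales <= r."range"
    )
    SELECT name, bonus FROM ordered WHERE ord = 1
"""
bonuses = dict(conn.execute(query).fetchall())
print(bonuses)
```

Jack's sales of exactly 300 qualify for both the 300 and 500 ranges; the ascending order picks 300, which matches the expected bonus of 3.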

SQL Server aggregate functions - how to?

Input table contains 2 columns i.e. name and dept
+------+------+
| name | dept |
+------+------+
| A | 123 |
| B | 456 |
| A | 789 |
| C | 123 |
| A | 456 |
| B | 789 |
+------+------+
Output is
name
-----
A
Here A works in all 3 depts (123, 456, 789). How can I retrieve the names of those who work in all of the depts?
This might help you.
SELECT NAME
FROM TABLE1
GROUP BY NAME
HAVING COUNT(DISTINCT DEPT) =
(
SELECT COUNT(DISTINCT DEPT)
FROM TABLE1
)
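Here's one option using a derived table. (SQL Server does not allow DISTINCT inside a windowed COUNT, so the overall count comes from a scalar subquery.)
select name
from (
select name, count(distinct dept) as cnt,
(select count(distinct dept) from yourtable) as overallcnt
from yourtable
group by name
) t
where cnt = overallcnt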
Try this:
SELECT NAME
FROM TABLE1
GROUP BY NAME
HAVING COUNT(DISTINCT DEPT)=(SELECT COUNT(DISTINCT DEPT) FROM TABLE1 )
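
The HAVING comparison can be tried against the sample rows with a short sketch using Python's built-in sqlite3 module. The table name table1 follows the answers above; nothing else is assumed beyond the question's data.

```python
import sqlite3

# In-memory database with the sample name/dept rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, dept INTEGER);
    INSERT INTO table1 VALUES ('A', 123), ('B', 456), ('A', 789),
                              ('C', 123), ('A', 456), ('B', 789);
""")

query = """
    SELECT name
    FROM table1
    GROUP BY name
    -- keep names whose distinct dept count equals the overall distinct dept count
    HAVING COUNT(DISTINCT dept) = (SELECT COUNT(DISTINCT dept) FROM table1)
"""
names = [row[0] for row in conn.execute(query)]
print(names)
```

Comparing against the overall count, rather than the literal 3, keeps the query correct when departments are added or removed.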

string aggregate group and count on a value

I have table like this.
| table |
| class_id| name | gender |
+---------+---------+----------+
| 1 | Jane | F |
| 1 | John | M |
| 1 | Tom | M |
| 1 | Bob | M |
| 2 | Jack | M |
| 2 | Kate | F |
I have a query like this.
select class_id, array_to_string(array_agg(name), ' - '::text) as name_list from table
group by class_id
My result is
| 1 | Jane-John-Tom-Bob |
But I'd also like to count the genders in each group. For the first group (class 1), I need a column like 1F + 3M.
What I want is something like this, and I'd like to get it with a single GROUP BY.
| 1 | Jane-John-Tom-Bob | 1F + 3M |
You can do that with a filtered aggregate:
select class_id,
string_agg(name, ' - ') as name_list,
concat(
count(*) filter (where gender = 'F'),
'F + ',
count(*) filter (where gender = 'M'),
'M') as gender_count
from table
group by class_id;
If you are on an older Postgres version (before 9.4, which introduced FILTER), you need to replace
count(*) filter (where gender = 'F')
with
count(case when gender = 'F' then 1 end)
(and the same for 'M')
There is also another solution that does not use a filtered aggregate:
select tt.class_id, string_agg ( t, ','::text) as gender, string_agg(distinct y,','::text) as name
from
(
select class_id, count(gender)::text|| string_agg( distinct gender, ',' ) as t
from string_test
group by class_id , gender
) tt ,
(
select class_id, string_agg( distinct name::text, ','::text ) as y
from string_test
group by class_id
) yy
where tt.class_id=yy.class_id
group by tt.class_id
Result:
+==========+========+===================+
| class_id | gender | name |
+==========+========+===================+
| 1 | 1F,3M | Bob,Jane,John,Tom |
+----------+--------+-------------------+
| 2 | 1F,1M | Jack,Kate |
+==========+========+===================+
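
For a single-GROUP BY version that also runs outside Postgres, here is a sketch using Python's built-in sqlite3 module. It uses the portable CASE form of conditional counting, and SQLite's group_concat stands in for string_agg. The table name students is an assumption for this example, since table itself is a reserved word.

```python
import sqlite3

# In-memory database with the sample class roster.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (class_id INTEGER, name TEXT, gender TEXT);
    INSERT INTO students VALUES
        (1, 'Jane', 'F'), (1, 'John', 'M'), (1, 'Tom', 'M'),
        (1, 'Bob', 'M'), (2, 'Jack', 'M'), (2, 'Kate', 'F');
""")

query = """
    SELECT class_id,
           group_concat(name, ' - ') AS name_list,
           -- COUNT ignores NULLs, so each CASE counts only the matching gender
           COUNT(CASE WHEN gender = 'F' THEN 1 END) || 'F + ' ||
           COUNT(CASE WHEN gender = 'M' THEN 1 END) || 'M' AS gender_count
    FROM students
    GROUP BY class_id
    ORDER BY class_id
"""
summary = {class_id: (names, counts)
           for class_id, names, counts in conn.execute(query)}
for class_id, (names, counts) in summary.items():
    print(class_id, names, counts)
```

The order of names inside each concatenated list is not guaranteed by SQLite, so treat it as arbitrary unless the aggregate is given an explicit ordering.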

Sql two table query most duplicated foreign key

I have these two tables, sport and student:
First table sport:
|idsport | name |
_______________________
| 1 | bobsled |
| 2 | skating |
| 3 | boarding |
| 4 | iceskating |
| 5 | skiing |
Second table student:
foreign key
|idstudent | name | sport_idsport
__________________________________________
| 1 | john | 3 |
| 2 | pauly | 2 |
| 3 | max | 1 |
| 4 | jane | 2 |
| 5 | nico | 5 |
So far I have this query, which outputs the most frequently used sport id, but I can't get it to work with two tables:
SELECT sport_idsport
FROM (SELECT sport_idsport FROM student GROUP BY sport_idsport ORDER BY COUNT(*) desc)
WHERE ROWNUM<=1;
I need to output the name of the most popular sport; in this case it would be skating.
I use Oracle SQL.
with counter as (
Select sport_idsport,
count(*) as cnt,
dense_rank() over (order by count(*) desc) as rn
from student
group by sport_idsport
)
select s.*, c.cnt
from sport s
join counter c on c.sport_idsport = s.idsport and c.rn = 1;
SQLFiddle example: http://sqlfiddle.com/#!4/b76e21/1
select cnt, sport_idsport from (
select count(*) cnt, sport_idsport
from student
group by sport_idsport
order by count(*) desc
)
where rownum = 1
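
To see the ranking approach return the sport name rather than just the id, here is a sketch over the sample data using Python's built-in sqlite3 module (SQLite 3.25 or later assumed, so this is not Oracle syntax). The CTE names counts and ranked are assumptions for this example; the counting and ranking are split into two steps to keep each one easy to read.

```python
import sqlite3

# In-memory database with the sample sport and student rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sport (idsport INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (idstudent INTEGER PRIMARY KEY, name TEXT, sport_idsport INTEGER);
    INSERT INTO sport VALUES (1, 'bobsled'), (2, 'skating'), (3, 'boarding'),
                             (4, 'iceskating'), (5, 'skiing');
    INSERT INTO student VALUES (1, 'john', 3), (2, 'pauly', 2), (3, 'max', 1),
                               (4, 'jane', 2), (5, 'nico', 5);
""")

query = """
    WITH counts AS (
        SELECT sport_idsport, COUNT(*) AS cnt
        FROM student
        GROUP BY sport_idsport
    ),
    ranked AS (
        -- DENSE_RANK gives every sport tied for the top count the rank 1
        SELECT sport_idsport, cnt, DENSE_RANK() OVER (ORDER BY cnt DESC) AS rn
        FROM counts
    )
    SELECT s.name, r.cnt
    FROM sport s
    JOIN ranked r ON r.sport_idsport = s.idsport
    WHERE r.rn = 1
"""
top = conn.execute(query).fetchall()
print(top)
```

Unlike a rownum = 1 filter, the rank-based filter returns all sports when several are tied for first place.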