How does one 'group' 2 ids on one row to filter data on another table? - sql

Probably a newbie question, but most of my SQL Server experience is basic reporting, with all of my formatting and grouping being made somewhat manually in Excel. Now I am tackling a homework problem that I must solve everything within SQL...
I have a database with 2 tables:
Employees(id, job title, partnerID)
Bugs_Fixed(employeeID, bugs2010, bugs2011, bugs2012)
Each employee has one partner, who is on the same table (Like if an employee with ID 34 had partner ID 201, then ID 201 would have partnerID 34)
I need to essentially group those 2 together and calculate the combined total number of bugs they fixed (each year combined) without repeating the data for the inverse partner/employee relationship.
For example:
| Team | AMT |
| 34, 20 | 717 |
| 76, 16 | 576 |
| 102, 3 | 901 |
I've gotten the query to select based on id, then sum the # of bugs, but that is for each individual employee and it needs to be represented as a group.
SELECT employeeID, partnerID, SUM (bugs2010 + bugs2011 + bugs2012) as 'AMT'
FROM Bugs_Fixed
JOIN Employees on Employees.id = Bugs_Fixed.employeeID
GROUP BY employeeID, partnerID
It calculates the yearly bugfixes correctly, but obviously doesn't partner up the 2 ids and their combined total.
Edit: Clarified SQL Server

You might be able to address this by generating a concatenated key made of the partnerID and the employeeID. The trick is to order the IDs, like:
SELECT
CONCAT(GREATEST(employeeID, partnerID), ',', LEAST(employeeID, partnerID)) as team,
SUM(bugs2010 + bugs2011 + bugs2012) as 'AMT'
FROM Bugs_Fixed
JOIN Employees on Employees.id = Bugs_Fixed.employeeID
GROUP BY team
Notes - you did not tag the RDBMS that you are using:
LEAST() and GREATEST() are not supported by all RDBMS (notably, SQL Server only added them in SQL Server 2022, while MySQL, Oracle and Postgres have long supported them).
using a column alias in the GROUP BY clause (here, team) is not allowed in SQL Server, while MySQL, Postgres and sqlite do support it
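Where LEAST() and GREATEST() are not available, the same ordered-pair key can be built with CASE expressions, repeated in the GROUP BY instead of relying on the alias. The following is only a sketch: it runs the query through Python's sqlite3 module rather than SQL Server, the bug counts are invented sample data, and the job titles and the hi_id/lo_id column names are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (id INTEGER PRIMARY KEY, jobTitle TEXT, partnerID INTEGER);
CREATE TABLE Bugs_Fixed (employeeID INTEGER, bugs2010 INTEGER, bugs2011 INTEGER, bugs2012 INTEGER);
-- Invented sample data: 34 and 20 are partners, 76 and 16 are partners.
INSERT INTO Employees VALUES (34, 'dev', 20), (20, 'dev', 34), (76, 'qa', 16), (16, 'qa', 76);
INSERT INTO Bugs_Fixed VALUES
    (34, 100, 120, 130), (20, 110, 125, 132),
    (76,  90,  95, 100), (16,  96,  97,  98);
""")

# Both rows of a pair produce the same (hi_id, lo_id) key, so they fall into one group.
PAIR_TOTALS = """
SELECT
    CASE WHEN e.id > e.partnerID THEN e.id ELSE e.partnerID END AS hi_id,
    CASE WHEN e.id < e.partnerID THEN e.id ELSE e.partnerID END AS lo_id,
    SUM(b.bugs2010 + b.bugs2011 + b.bugs2012) AS amt
FROM Bugs_Fixed AS b
JOIN Employees AS e ON e.id = b.employeeID
GROUP BY
    CASE WHEN e.id > e.partnerID THEN e.id ELSE e.partnerID END,
    CASE WHEN e.id < e.partnerID THEN e.id ELSE e.partnerID END
ORDER BY hi_id
"""

# Format the pair as a "Team" label on the client side.
team_totals = [(f"{hi}, {lo}", amt) for hi, lo, amt in conn.execute(PAIR_TOTALS)]
for team, amt in team_totals:
    print(team, amt)
```

Each pair appears once because the key no longer depends on which of the two employees the row came from.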

SELECT
ARRAY[e1.id, e2.id] AS team,
SUM(b.bugs2010) + SUM(b.bugs2011) + SUM(b.bugs2012) AS amt
FROM employees e1
INNER JOIN employees e2 ON e2.partnerID = e1.id AND e1.id < e2.id
INNER JOIN bugs_fixed b ON b.employeeId IN (e1.id, e2.id)
GROUP BY e1.id, e2.id

Hope this is the solution you are looking for:
select concat(e.id, ', ', e.partnerID) as TEAM,
AMT = (select sum(bugs2010 + bugs2011 + bugs2012) from Bugs_Fixed
where employeeID in (e.id, e.partnerID))
from Employees e
where e.id < e.partnerID -- list each pair only once

Related

How to get dates based on months that appear more than once?

I'm trying to get months of Employees' birthdays that are found in at least 2 rows
I've tried to union the birthday information table with itself, supposing that I could iterate through the rows and get the months that appear multiple times
There's the question: how to get birthdays with months that repeat more than once?
SELECT DISTINCT e.EmployeeID, e.City, e.BirthDate
FROM Employees e
GROUP BY e.BirthDate, e.City, e.EmployeeID
HAVING COUNT(MONTH(b.BirthDate))=COUNT(MONTH(e.BirthDate))
UNION
SELECT DISTINCT b.EmployeeID, b.City, b.BirthDate
FROM Employees b
GROUP BY b.EmployeeID, b.BirthDate, b.City
HAVING ...
Given table:
| 1 | City1 | 1972-03-26|
| 2 | City2 | 1979-12-13|
| 3 | City3 | 1974-12-16|
| 4 | City3 | 1979-09-11|
Expected result :
| 2 | City2 |1979-12-13|
| 3 | City3 |1974-12-16|
Think of it in steps.
First, we'll find the months that have more than one birthday in them. That's the sub-query, below, which I'm aliasing as i for "inner query". (Substitute MONTH(i.Birthdate) into the SELECT list for the 1 if you want to see which months qualify.)
Then, in the outer query (o), you want all the fields, so I'm cheating and using SELECT *. Theoretically, a WHERE IN would work here, but IN can have unfortunate side effects if a NULL comes back, so I never use it. Instead, there's a correlated sub-query, which is to say we look for any results where the month from the outer query is equal to the months that make the cut in the inner (correlated sub-) query.
When using a correlated sub-query in the WHERE clause, the SELECT list doesn't matter. You could put 1/0 and it won't throw an error. But I always use SELECT 1 to show that the inner query isn't actually returning any results to the outer query. It's just there to look for, well, the correlation between the two data sets.
SELECT
*
FROM
#table AS o
WHERE
EXISTS
(
SELECT
1
FROM
#table AS i
WHERE
MONTH(i.Birthdate) = MONTH(o.Birthdate)
GROUP BY
MONTH(i.Birthdate)
HAVING
COUNT(*) > 1
);
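To try the EXISTS approach against the rows from the question, here is a small sketch using Python's sqlite3 module. It assumes the table is called Employees with columns EmployeeID, City and BirthDate, and it swaps SQL Server's MONTH() for SQLite's strftime('%m', ...), which is a substitution made only so the example runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, City TEXT, BirthDate TEXT);
INSERT INTO Employees VALUES
    (1, 'City1', '1972-03-26'),
    (2, 'City2', '1979-12-13'),
    (3, 'City3', '1974-12-16'),
    (4, 'City3', '1979-09-11');
""")

# Outer row o is kept only if its birth month occurs more than once in the table.
SHARED_MONTHS = """
SELECT o.EmployeeID, o.City, o.BirthDate
FROM Employees AS o
WHERE EXISTS (
    SELECT 1
    FROM Employees AS i
    WHERE strftime('%m', i.BirthDate) = strftime('%m', o.BirthDate)
    GROUP BY strftime('%m', i.BirthDate)
    HAVING COUNT(*) > 1
)
ORDER BY o.EmployeeID
"""

rows = conn.execute(SHARED_MONTHS).fetchall()
for row in rows:
    print(row)
```

Only the two December birthdays come back, which matches the expected result in the question.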
Seems to be an odd requirement.
This might help with some tweaks. Works in Oracle.
SELECT DATE FROM TABLE WHERE EXTRACT(MONTH FROM DATE)=EXTRACT(MONTH FROM SOMEDATE);
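Give this a try and you may be able to dispense with your UNION. The count is taken per month in a subquery, because grouping by EmployeeId would leave one row in every group:
SELECT
EmployeeId
, City
, BirthDate
FROM Employees
WHERE MONTH(BirthDate) IN
(SELECT MONTH(BirthDate)
FROM Employees
GROUP BY MONTH(BirthDate)
HAVING COUNT(*) >= 2)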
Here is another approach using GROUP_CONCAT. It's not exactly what you're looking for but it might do the job. Eric's approach is better though. (Note: This is for MySQL)
SELECT GROUP_CONCAT(EmployeeID) EmployeeID, BirthDate, COUNT(*) DupeCount
FROM Employees
GROUP BY MONTH(BirthDate)
HAVING DupeCount> 1;

Sum of two columns alongside separate select statement

I am working on an SQL task and I cannot figure out how to get the sum of two columns from the same table while displaying information from another table.
I have tried multiple things and have spent probably about two hours trying to figure this out.
I have two tables: Employees and Fuel. I displayed all of the employees' information. The first SQL statement I had to make:
SELECT firstname, lastname, title, registrationyear, make, model FROM Employees ORDER BY make;
My Employees table has the following columns: firstname, lastname, employeeid, make, model, registrationyear, title
My Fuel table has the following columns: currentprice, fueltype, fuelcost, mileage, mileagecount, fuelamount, employeeid, date
My instructions state: "A list that shows what cars the employees currently use (first SQL statement I made, so this one is DONE!)
Like the above report but also the total amount of kilometers that the employees have driven and the total fuel cost." (this is the task that I am trying to make a statement for)
I have tried using LIKE, UNION, UNION ALL, etc. and the best that I have been able to do is listing the employee information and the totals ON TOP of the information instead of in two separate columns of their own alongside the other data in the query.
I am really stuck here. Could anyone please help me?
This second task is much more complex than the first one.
First of all, combining columns from two or more tables in a single row is what a join is for, so you will have to join the two tables on employeeid. This will return a table like this
employeeid | other emp fields | fuel date | other fuel fields
1 | ... | 01/01/2017 | ...
1 | ... | 01/02/2017 | ...
2 | ... | 01/01/2017 | ...
2 | ... | 02/01/2017 | ...
2 | ... | 04/03/2017 | ...
From here, you want the data from each employee combined with the sum of the rows from fuel related to that employee, and that's what group by is for.
When using group by you define a set of columns that defines the grouping criteria; everything else in your select statement will have to be grouped somehow (in your case with a sum), so that the columns in the group by stay unique.
Your final query would look like this
select t1.firstname, t1.lastname, t1.title, t1.registrationyear, t1.make, t1.model,
sum(t2.mileage) as total_mileage,
sum(t2.fuelcost * t2.fuelamount) as total_fuel_cost
from Employees t1
join Fuel t2
on t1.employeeid = t2.employeeid
group by t1.employeeid, t1.firstname, t1.lastname, t1.title, t1.registrationyear, t1.make, t1.model
Note: I don't know the difference between mileage and mileagecount, so the part of my query involving those fields may need some tweaking.
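The join followed by group by can be tried on a few rows with Python's sqlite3 module. This is a sketch under assumptions: the employee names, cars and fuel rows are invented, only some of the columns are included, and fuelcost is treated as the cost of one fill-up, which may not match the real meaning of that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (employeeid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, make TEXT);
CREATE TABLE Fuel (employeeid INTEGER, date TEXT, mileage INTEGER, fuelcost INTEGER);
-- Invented rows: two employees with two and three fuel entries.
INSERT INTO Employees VALUES (1, 'Ann', 'Lee', 'Volvo'), (2, 'Bo', 'Ray', 'Saab');
INSERT INTO Fuel VALUES
    (1, '2017-01-01', 100, 20), (1, '2017-02-01', 150, 30),
    (2, '2017-01-01',  80, 15), (2, '2017-01-02', 120, 25), (2, '2017-03-04', 200, 40);
""")

# The join yields one row per fuel entry; GROUP BY collapses them to one row per employee.
TOTALS = """
SELECT e.firstname, e.lastname, e.make,
       SUM(f.mileage)  AS total_mileage,
       SUM(f.fuelcost) AS total_fuel_cost
FROM Employees AS e
JOIN Fuel AS f ON f.employeeid = e.employeeid
GROUP BY e.employeeid, e.firstname, e.lastname, e.make
ORDER BY e.employeeid
"""

totals = conn.execute(TOTALS).fetchall()
for row in totals:
    print(row)
```

Including employeeid in the group by keeps two employees with identical names and cars from being merged into one row.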
You can use Inner join & Group By clause as mentioned below. Let me know if you mean something else.
SELECT A.firstname, A.lastname, A.title, A.registrationyear, A.make, A.model,
SUM(B.Column_Having_Kilometer_Driven_Value)
FROM
Employees A
INNER JOIN Fuel B ON A.EmployeeID = B.EmployeeID
Group By A.EmployeeID, A.firstname, A.lastname, A.title, A.registrationyear, A.make, A.model

sql merge tables side-by-side with nothing in common

I'm looking for an sql answer on how to merge two tables without anything in common.
So let's say you have these two tables without anything in common:
Guys Girls
id name id name
--- ------ ---- ------
1 abraham 5 sarah
2 isaak 6 rachel
3 jacob 7 rebeka
8 leah
and you want to merge them side-by-side like this:
Couples
id name id name
--- ------ --- ------
1 abraham 5 sarah
2 isaak 6 rachel
3 jacob 7 rebeka
8 leah
How can this be done?
You can do this by creating a key, which is the row number, and joining on it.
Most dialects of SQL support the row_number() function. Here is an approach using it:
select gu.id, gu.name, gi.id, gi.name
from (select g.*, row_number() over (order by id) as seqnum
from guys g
) gu full outer join
(select g.*, row_number() over (order by id) as seqnum
from girls g
) gi
on gu.seqnum = gi.seqnum;
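The row-number key can be tried with Python's sqlite3 module. The sketch below assumes SQLite 3.25 or later for ROW_NUMBER(). Because older SQLite builds have no FULL OUTER JOIN, it emulates one by left joining both numbered sets onto the union of their sequence numbers; the seqs name is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE guys (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE girls (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO guys VALUES (1, 'abraham'), (2, 'isaak'), (3, 'jacob');
INSERT INTO girls VALUES (5, 'sarah'), (6, 'rachel'), (7, 'rebeka'), (8, 'leah');
""")

# Number each table independently, then pair rows that share a sequence number.
SIDE_BY_SIDE = """
WITH gu AS (SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS seqnum FROM guys),
     gi AS (SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS seqnum FROM girls),
     seqs AS (SELECT seqnum FROM gu UNION SELECT seqnum FROM gi)
SELECT gu.id, gu.name, gi.id, gi.name
FROM seqs AS s
LEFT JOIN gu ON gu.seqnum = s.seqnum
LEFT JOIN gi ON gi.seqnum = s.seqnum
ORDER BY s.seqnum
"""

couples = conn.execute(SIDE_BY_SIDE).fetchall()
for row in couples:
    print(row)
```

The fourth row has NULLs on the guys side, since the shorter table runs out of rows first.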
Just because I wrote it up anyway, an alternative using CTEs;
WITH guys2 AS ( SELECT id,name,ROW_NUMBER() OVER (ORDER BY id) rn FROM guys),
girls2 AS ( SELECT id,name,ROW_NUMBER() OVER (ORDER BY id) rn FROM girls)
SELECT guys2.id guyid, guys2.name guyname,
girls2.id girlid, girls2.name girlname
FROM guys2 FULL OUTER JOIN girls2 ON guys2.rn = girls2.rn
ORDER BY COALESCE(guys2.rn, girls2.rn);
Assuming, you want to match guys up with girls in your example, and have some sort of meaningful relationship between the records (no pun intended)...
Typically you'd do this with a separate table to represent the association (relationship) between the two.
This wouldn't give you a physical table, but it would enable you to write an SQL query representing the final results:
SELECT Girls.ID AS GirlId, Girls.Name AS GirlName, Guys.ID AS GuyId, Guys.Name AS GuyName
FROM Couples INNER JOIN
Girls ON Couples.GirlId = Girls.ID INNER JOIN
Guys ON Couples.GuyId = Guys.ID
which you could then use to create a table on the fly using the Select Into syntax
SELECT Girls.ID AS GirlId, Girls.Name AS GirlName, Guys.ID AS GuyId, Guys.Name AS GuyName
INTO MyNewTable
FROM Couples INNER JOIN
Girls ON Couples.GirlId = Girls.ID INNER JOIN
Guys ON Couples.GuyId = Guys.ID
(But standard Normalization rules would say it's best to keep them in distinct tables rather than creating a temp table, unless there's a performance reason not to do so.)
I need this all the time when creating templates in Excel from my tables. This pulls from one table that holds my regions and another that holds the quarters in a year. Listing both tables with no join condition is a cross join, so the result gives me one row per region for each quarter/period.
SELECT b.quarter_qty, a.mkt_name FROM TBL_MKTS a, TBL_PERIODS b

How to concatenate rows delimited with comma using standard SQL?

Let's suppose we have a table T1 and a table T2. There is a relation of 1:n between T1 and T2. I would like to select all T1 along with all their T2, every row corresponding to T1 records with T2 values concatenated, using only SQL-standard operations.
Example:
T1 = Person
T2 = Popularity (by year)
for each year a person has a certain popularity
I would like to write a selection using SQL-standard operations, resulting something like this:
Person.Name Popularity.Value
John Smith 1.2,5,4.2
John Doe NULL
Jane Smith 8
where there are 3 records in the popularity table for John Smith, none for John Doe and one for Jane Smith, their values being the values represented above. Is this possible? How?
I'm using Oracle but would like to do this using only standard SQL.
Here's one technique, using recursive Common Table Expressions. Unfortunately, I'm not confident on its performance.
I'm sure that there are ways to improve this code, but it shows that there doesn't seem to be an easy way to do something like this using just the SQL standard.
As far as I can see, there really should be some kind of STRINGJOIN aggregate function that would be used with GROUP BY. That would make things like this much easier...
This query assumes that there is some kind of PersonID that joins the two relations, but the Name would work too.
WITH cte (id, Name, Value, ValueCount) AS (
SELECT id,
Name,
CAST(Value AS VARCHAR(MAX)) AS Value,
1 AS ValueCount
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS id,
Name,
Value
FROM Person AS per
INNER JOIN Popularity AS pop
ON per.PersonID = pop.PersonID
) AS e
WHERE id = 1
UNION ALL
SELECT e.id,
e.Name,
cte.Value + ',' + CAST(e.Value AS VARCHAR(MAX)) AS Value,
cte.ValueCount + 1 AS ValueCount
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS id,
Name,
Value
FROM Person AS per
INNER JOIN Popularity AS pop
ON per.PersonID = pop.PersonID
) AS e
INNER JOIN cte
ON e.id = cte.id + 1
AND e.Name = cte.Name
)
SELECT p.Name, agg.Value
FROM Person p
LEFT JOIN (
SELECT Name, Value
FROM (
SELECT Name,
Value,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ValueCount DESC)AS id
FROM cte
) AS p
WHERE id = 1
) AS agg
ON p.Name = agg.Name
This is an example result:
--------------------------------
| Name | Value |
--------------------------------
| John Smith | 1.2,5,4.2 |
--------------------------------
| John Doe | NULL |
--------------------------------
| Jane Smith | 8 |
--------------------------------
In Oracle you can use listagg to achieve this -
select t1.Person_Name, listagg(t2.Popularity_Value, ',')
within group(order by t2.Popularity_Value)
from t1, t2
where t1.Person_Name = t2.Person_Name (+)
group by t1.Person_Name
I hope this will solve your problem.
But as for the comment you gave after @DavidJashi's question: this is not standard SQL, and I think he is correct. I agree with David that you cannot achieve this in a pure SQL statement.
I know that I'm SUPER late to the party, but for anyone else that might find this, I don't believe that this is possible using pure SQL92. As I discovered in the last few months fighting with NetSuite to try to figure out what Oracle methods I can and cannot use with their ODBC driver, I discovered that they only "support and guarantee" SQL92 standard.
I discovered this, because I had a need to perform a LISTAGG(). Once I found out I was restricted to SQL92, I did some digging through the historical records, and LISTAGG() and recursive queries (common table expressions) are NOT supported in SQL92, at all.
LISTAGG() was added in Oracle SQL version 11g Release 2 (2009 – 11 years ago: reference https://oracle-base.com/articles/misc/string-aggregation-techniques#listagg) , CTEs were added to Oracle SQL in version 9.2 (2007 – 13 years ago: reference https://www.databasestar.com/sql-cte-with/).
VERY frustrating that it's completely impossible to accomplish this kind of effect in pure SQL92, so I had to solve the problem in my C# code after I pulled a ton of extra unnecessary data. Very frustrating.
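For readers in the same position, the client-side fallback might look roughly like the sketch below. It is written in Python rather than C#, it uses the sqlite3 module as a stand-in for the real data source, and the PersonID and Year columns and the sample rows are assumptions. The query itself uses only a LEFT JOIN and ORDER BY; the concatenation happens in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Popularity (PersonID INTEGER, Year INTEGER, Value REAL);
INSERT INTO Person VALUES (1, 'John Smith'), (2, 'John Doe'), (3, 'Jane Smith');
INSERT INTO Popularity VALUES (1, 2008, 1.2), (1, 2009, 5), (1, 2010, 4.2), (3, 2010, 8);
""")

# One row per (person, value); people without values come back with Value = NULL.
rows = conn.execute("""
SELECT p.PersonID, p.Name, pop.Value
FROM Person AS p
LEFT JOIN Popularity AS pop ON pop.PersonID = p.PersonID
ORDER BY p.PersonID, pop.Year
""").fetchall()

# Collect the values per person in application code.
values_by_person = {}
for person_id, name, value in rows:
    bucket = values_by_person.setdefault((person_id, name), [])
    if value is not None:
        bucket.append(format(value, "g"))  # "g" drops the trailing ".0" of whole numbers

result = [
    (name, ",".join(vals) if vals else None)
    for (_, name), vals in values_by_person.items()
]
for name, joined in result:
    print(name, joined)
```

The cost is the one described above: every detail row travels to the client before it is collapsed.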

How do I write a standard SQL GROUP BY that includes columns not in the GROUP BY clause

Let's say I have a table called Customer, defined like this:
Id Name DepartmentId Hired
1 X 101 2001/01/01
2 Y 102 2002/01/01
3 Z 102 2003/01/01
And I want to retrieve the date of the last hiring in each department.
Obviously I would do this
SELECT c.DepartmentId, MAX(c.Hired)
FROM Customer c
GROUP BY c.DepartmentId
Which returns:
101 2001/01/01
102 2003/01/01
But what do I do if I want to return the name of the guy hired? I.e. I would want this result set:
101 2001/01/01 X
102 2003/01/01 Z
Note that the following does not work, as it would return three rows rather than the two I'm looking for:
SELECT c.DepartmentId, c.Name, MAX(c.Hired)
FROM Customer c
GROUP BY c.DepartmentId
I can't remember seeing a query that achieves this.
NOTE: It's not acceptable to join on the Hired field, as that would not be guaranteed to be accurate.
A subselect would do the job and would handle the case where more than one person was hired in the same department on the same day:
SELECT c.DepartmentId, c.Name, c.Hired from Customer c,
(SELECT DepartmentId, MAX(Hired) as MaxHired
FROM Customer
GROUP BY DepartmentId) as sub
WHERE c.DepartmentId = sub.DepartmentId AND c.Hired = sub.MaxHired
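The subselect can be checked against the three rows from the question with Python's sqlite3 module. This is only a sketch: it stores Hired as text in the question's YYYY/MM/DD form, which happens to sort correctly as a string, and it writes the join with explicit JOIN ... ON syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT, DepartmentId INTEGER, Hired TEXT);
INSERT INTO Customer VALUES
    (1, 'X', 101, '2001/01/01'),
    (2, 'Y', 102, '2002/01/01'),
    (3, 'Z', 102, '2003/01/01');
""")

# sub holds the latest hire date per department; joining back recovers the full row.
LAST_HIRED = """
SELECT c.DepartmentId, c.Hired, c.Name
FROM Customer AS c
JOIN (SELECT DepartmentId, MAX(Hired) AS MaxHired
      FROM Customer
      GROUP BY DepartmentId) AS sub
  ON c.DepartmentId = sub.DepartmentId AND c.Hired = sub.MaxHired
ORDER BY c.DepartmentId
"""

last_hired = conn.execute(LAST_HIRED).fetchall()
for row in last_hired:
    print(row)
```

If two people in a department share the latest hire date, both rows are returned.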
Standard Sql:
select *
from Customer C
where exists
(
-- Linq to Sql puts NULL here instead ;-)
-- In fact, you can even put 1/0 here and it would not cause a division by zero error
-- An RDBMS does not evaluate the select list of an EXISTS subquery
SELECT NULL
FROM Customer
where c.DepartmentId = DepartmentId
GROUP BY DepartmentId
having c.Hired = MAX(Hired)
)
If your RDBMS supports tuple (row value) testing, this is the most succinct. SQL Server does not support it; MySQL, Postgres and Oracle do:
select *
from Customer
where (DepartmentId, Hired) in
(select DepartmentId, MAX(Hired)
from Customer
group by DepartmentId)
SELECT a.*
FROM Customer AS a
JOIN
(SELECT DepartmentId, MAX(Hired) AS Hired
FROM Customer GROUP BY DepartmentId) AS b
USING (DepartmentId,Hired);