How to make a DISTINCT CONCAT statement? - sql

New SQL developer here, how do I make a DISTINCT CONCAT statement?
Here is my statement without the DISTINCT key:
COLUMN Employee FORMAT a25;
SELECT CONCAT(CONCAT(EMPLOYEEFNAME, ' '), EMPLOYEELNAME) AS "Employee", JOBTITLE "Job Title"
FROM Employee
ORDER BY EMPLOYEEFNAME;
Here is its output:
Employee                  Job Title
------------------------- -------------------------
Bill Murray               Cable Installer
Bill Murray               Cable Installer
Bob Smith                 Project Manager
Bob Smith                 Project Manager
Frank Herbert             Network Specilist
Henry Jones               Technical Support
Homer Simpson             Programmer
Jane Doe                  Programmer
Jane Doe                  Programmer
Jane Doe                  Programmer
Jane Fonda                Project Manager
John Jameson              Cable Installer
John Jameson              Cable Installer
John Carpenter            Technical Support
John Carpenter            Technical Support
John Jameson              Cable Installer
John Carpenter            Technical Support
John Carpenter            Technical Support
Kathy Smith               Network Specilist
Mary Jane                 Project Manager
Mary Jane                 Project Manager
21 rows selected
If I were to use the DISTINCT keyword, I should get only 11 rows. However,
when I use SELECT DISTINCT CONCAT I get an error.

One option is to use GROUP BY:
SELECT CONCAT(CONCAT(EMPLOYEEFNAME, ' '), EMPLOYEELNAME) AS "Employee",
JOBTITLE AS "Job Title"
FROM Employee
GROUP BY CONCAT(CONCAT(EMPLOYEEFNAME, ' '), EMPLOYEELNAME),
JOBTITLE
ORDER BY "Employee"
Another option, if you really want to use DISTINCT, would be to subquery your current query:
SELECT DISTINCT t."Employee",
t."Job Title"
FROM
(
SELECT CONCAT(CONCAT(EMPLOYEEFNAME, ' '), EMPLOYEELNAME) AS "Employee",
JOBTITLE AS "Job Title"
FROM Employee
) t
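Both options can be tried out quickly. The following is a minimal sketch, not the original Oracle setup: it assumes SQLite (through Python's sqlite3 module) as a stand-in, uses the || operator in place of the nested CONCAT calls, and loads only a few made-up sample rows. It shows that DISTINCT over the concatenated expression and GROUP BY on the same expressions return the same de-duplicated rows.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (EMPLOYEEFNAME TEXT, EMPLOYEELNAME TEXT, JOBTITLE TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        ("Bill", "Murray", "Cable Installer"),
        ("Bill", "Murray", "Cable Installer"),
        ("Bob", "Smith", "Project Manager"),
        ("Jane", "Doe", "Programmer"),
        ("Jane", "Doe", "Programmer"),
    ],
)

# DISTINCT applies to the whole select list, including the concatenated name.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT EMPLOYEEFNAME || ' ' || EMPLOYEELNAME AS "Employee",
           JOBTITLE AS "Job Title"
    FROM Employee
    ORDER BY "Employee"
    """
).fetchall()

# GROUP BY on the same expressions collapses the duplicates as well.
grouped_rows = conn.execute(
    """
    SELECT EMPLOYEEFNAME || ' ' || EMPLOYEELNAME AS "Employee",
           JOBTITLE AS "Job Title"
    FROM Employee
    GROUP BY EMPLOYEEFNAME || ' ' || EMPLOYEELNAME, JOBTITLE
    ORDER BY "Employee"
    """
).fetchall()

print(distinct_rows)
print(grouped_rows)
```

With these five sample rows, both queries return three rows, one per distinct employee and job title.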

Related

SQL join manager from same table onto a row with their employees

To start, here's a dummy table I've made to show the data I'm working with:
employee       title       division  email
Boss Person    boss        o         bp#email
John Smith     supervisor  a         jos#email
Jane Smith     supervisor  b         jas#email
Leo Messi      employee    a         lm#email
Amanda Kessel  employee    a         ak#email
Derek Jeter    employee    b         dj#email
I want to end up with the following info:
employee       title       division  email      supervisor_name  supervisor_email
Boss Person    boss        o         bp#email   NULL             NULL
John Smith     supervisor  a         jos#email  Boss Person      bp#email
Jane Smith     supervisor  b         jas#email  Boss Person      bp#email
Leo Messi      employee    a         lm#email   John Smith       jos#email
Amanda Kessel  employee    a         ak#email   John Smith       jos#email
Derek Jeter    employee    b         dj#email   Jane Smith       jas#email
I've looked through and tried documentation at:
https://www.sqltutorial.org/sql-self-join/
SQL Server : LEFT JOIN EMPLOYEE MANAGER relationship
One of the big differences here is I don't have any employee or manager id column to work with.
If you're the supervisor for a division (for example, John Smith is the supervisor in division a), then you manage all the employees in that division. Meanwhile, all the supervisors answer to the boss in division o, and the boss answers to no one.
Here is the best code I've tried so far:
select e.*, b.employee as supervisor, b.email as supervisor_email
from employees e, employees b
where b.division = e.division
and
b.title like '%supervisor%'
This got me close, it returned:
employee       title       division  email      supervisor_name  supervisor_email
John Smith     supervisor  a         jos#email  John Smith       jos#email
Jane Smith     supervisor  b         jas#email  Jane Smith       jas#email
Leo Messi      employee    a         lm#email   John Smith       jos#email
Amanda Kessel  employee    a         ak#email   John Smith       jos#email
Derek Jeter    employee    b         dj#email   Jane Smith       jas#email
So, it got the employee info right, but left out the Boss record and placed the supervisors as their own supervisor. I think I need some kind of case or if statement here, but I'm not sure.
Please let me know if this makes sense or if any further clarification is needed.
You could try using a LEFT JOIN and work with two conditions:
when division is the same and we're dealing with the relationship employee < supervisor
when the relationship is supervisor < boss
Here's how I did it:
SELECT t1.*,
t2.employee,
t2.email
FROM tab t1
LEFT JOIN tab t2
ON (t1.division = t2.division AND
t2.title = 'supervisor' AND
t1.title = 'employee')
OR (t2.title = 'boss' AND
t1.title = 'supervisor')
You'll find an SQL fiddle here.
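For readers who want to run the LEFT JOIN approach locally, here is a minimal sketch. It assumes SQLite via Python's sqlite3 module, reuses the table name tab from the answer, and loads the six sample rows from the question. The ORDER BY t1.rowid is an addition that relies on SQLite's implicit rowid to keep insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (employee TEXT, title TEXT, division TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        ("Boss Person", "boss", "o", "bp#email"),
        ("John Smith", "supervisor", "a", "jos#email"),
        ("Jane Smith", "supervisor", "b", "jas#email"),
        ("Leo Messi", "employee", "a", "lm#email"),
        ("Amanda Kessel", "employee", "a", "ak#email"),
        ("Derek Jeter", "employee", "b", "dj#email"),
    ],
)

# LEFT JOIN keeps the boss row even though nothing matches it on the right.
rows = conn.execute(
    """
    SELECT t1.employee,
           t1.title,
           t2.employee AS supervisor_name,
           t2.email    AS supervisor_email
    FROM tab t1
    LEFT JOIN tab t2
           ON (t1.division = t2.division AND
               t2.title = 'supervisor' AND
               t1.title = 'employee')
           OR (t2.title = 'boss' AND
               t1.title = 'supervisor')
    ORDER BY t1.rowid
    """
).fetchall()

for row in rows:
    print(row)
```

The boss comes back with NULL (None in Python) in both supervisor columns, supervisors are paired with the boss, and employees are paired with the supervisor of their own division.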
If you want to update the current table (if the columns are available), you can do the following (more or less the same as @lemon):
UPDATE testing t1 JOIN testing t2 ON t2.`division`=t1.division OR t2.division="o" SET
t1.supervisor_name=t2.`employee`, t1.supervisor_email=t2.email
WHERE (CASE
WHEN t1.`title`="employee" THEN t2.title="supervisor"
WHEN t1.`title`="supervisor" THEN t2.title="boss"
END);
SELECT * FROM testing;

Assign value to a new column for all rows associate with unique value in another column

I need to assign a name to a new 'responsible' column for all rows associated with a customer.
If the 'codes' string of any of the customer's rows contains 'manager', that manager's name should be assigned to the 'responsible' column for all of the customer's rows. If there is no 'manager' in the codes column, the 'responsible' column should be populated with the 'empl_name' of the row itself.
I assume CASE and GROUP BY should be used?
table looks like:
cust_name  empl_name  codes
john       mike       empl, office
liza       nick       manager_1, remote
john       kate       empl, remote
john       mike       empl, remote
liza       mike       empl, office
david      kate       empl, remote
john       mike       empl, remote
liza       mike       empl, office
david      mike       empl, remote
chris      jennifer   manager_2, office
output should be:
cust_name  empl_name  codes              responsible
john       mike       empl, office       mike
liza       nick       manager_1, remote  nick
john       kate       empl, remote       kate
john       mike       empl, remote       mike
liza       mike       empl, office       nick
david      kate       empl, remote       kate
john       mike       empl, remote       mike
liza       mike       empl, office       nick
david      mike       empl, remote       mike
chris      jennifer   manager_2, office  jennifer
My code (googled everything):
SELECT
c.cust_name,
e.emp_name,
a.codes
FROM Billing as b
--- Code Labels in 1 single row, separated by comma
OUTER APPLY (
SELECT STUFF((
(SELECT ', ' + y.CodeLabelName
FROM CodeToLabelBridge x
JOIN CodeLabel y
ON y.CodeLabelId = x.CodeLabelId
WHERE x.CodeId = b.billing_code_id
FOR XML PATH(''), TYPE).value('.', 'varchar(max)')),1,1,''
) AS codes
) AS a
--- JOINS
JOIN Client as c
ON (b.billing_cust_id = c.cust_id)
JOIN Employer as e
ON (b.billing_emp_id = e.emp_id)
JOIN Code as sc
ON (b.billing_code_id = sc.codes_id)
--- Table with Client and associate Manager
WITH cte AS (
SELECT * ,
row_number() OVER(PARTITION BY t.cust_name, t.empl_name ORDER BY t.cust_name desc) AS [rn]
FROM t
WHERE t.codes LIKE '%manager%'
)
Select cust_name, empl_name from cte WHERE [rn] = 1
Then I'm stuck. I thought of joining the cte to the main table on the 'cust_name' field, but I'm having issues with that.
It sounds like you want to get who is 'ultimately' responsible for a customer, if the data has a row for each contact/rep the customer has, and showing the manager, if exists. This (assuming that your table is Tbl) would do that:
select
a.*,
Responsible=coalesce((select min(b.empl_name)
from Tbl b
where a.cust_name=b.cust_name
and b.codes like '%manager%'), a.empl_name)
from Tbl a
I used min() to avoid the error that would occur if a customer had more than one row with 'manager' in codes.
Coalesce falls back to the current row's empl_name when the customer has no row with 'manager', because the subquery then returns NULL.
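The correlated subquery with COALESCE can be checked against the sample data from the question. This is a minimal sketch that assumes SQLite via Python's sqlite3 module and the table name Tbl from the answer. The column alias is written with AS instead of the SQL Server style Responsible=..., and the ORDER BY a.rowid is an addition that keeps the rows in insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (cust_name TEXT, empl_name TEXT, codes TEXT)")
conn.executemany(
    "INSERT INTO Tbl VALUES (?, ?, ?)",
    [
        ("john", "mike", "empl, office"),
        ("liza", "nick", "manager_1, remote"),
        ("john", "kate", "empl, remote"),
        ("john", "mike", "empl, remote"),
        ("liza", "mike", "empl, office"),
        ("david", "kate", "empl, remote"),
        ("john", "mike", "empl, remote"),
        ("liza", "mike", "empl, office"),
        ("david", "mike", "empl, remote"),
        ("chris", "jennifer", "manager_2, office"),
    ],
)

# For each row, look up a manager for the same customer; MIN() over an empty
# set yields NULL, so COALESCE then falls back to the row's own empl_name.
rows = conn.execute(
    """
    SELECT a.cust_name,
           a.empl_name,
           a.codes,
           COALESCE((SELECT MIN(b.empl_name)
                     FROM Tbl b
                     WHERE b.cust_name = a.cust_name
                       AND b.codes LIKE '%manager%'),
                    a.empl_name) AS responsible
    FROM Tbl a
    ORDER BY a.rowid
    """
).fetchall()

for row in rows:
    print(row)
```

The responsible column matches the expected output in the question: every liza row resolves to nick, while customers without a manager row keep the employee of each row.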

SQL Query: How to select multiple instances of a single item without collapsing into a group?

I'm trying to do the following with an SQL query in Impala. I've got a single data table that has (among other things) two columns with values that intersect multiple times. For example, let's say we have a table with two columns for related names and phone numbers:
Names Phone Numbers
John Smith (123) 456-7890
Rob Johnson (123) 456-7890
Greg Jackson (123) 456-7890
Tom Green (123) 456-7890
Jack Mathis (123) 456-7890
John Smith (234) 567-8901
Rob Johnson (234) 567-8901
Joe Wolf (234) 567-8901
Mike Thomas (234) 567-8901
Jim Moore (234) 567-8901
John Smith (345) 678-9012
Rob Johnson (345) 678-9012
Toby Ellis (345) 678-9012
Sam Wharton (345) 678-9012
Bob Thompson (345) 678-9012
John Smith (456) 789-0123
Rob Johnson (456) 789-0123
Kelly Howe (456) 789-0123
Hank Rehms (456) 789-0123
Jim Fellows (456) 789-0123
What I need to get from this table is a selection of each item from the Name column that has multiple entries from the Phone Numbers column associated with it, like this:
Names Phone Numbers
John Smith (123) 456-7890
John Smith (234) 567-8901
John Smith (345) 678-9012
John Smith (456) 789-0123
Rob Johnson (123) 456-7890
Rob Johnson (234) 567-8901
Rob Johnson (345) 678-9012
Rob Johnson (456) 789-0123
This is the query I've got so far, but it's not quite giving me the results I'm looking for:
SELECT a.name, a.phone_number, b.phone_number, b.count1
FROM databasename a
INNER JOIN (
SELECT phone_number, COUNT(phone_number) as count1
FROM databasename
GROUP BY phone_number
) b
ON a.phone_number = b.phone_number;
Any ideas on how to improve my query to get the results I'm looking for?
Thank you.
Working with your query...
The inline view builds the set of names that have more than one phone number. The outer query then joins back to the full table on name, returning every phone number for those names. However, a name that has the same phone number listed more than once would also be returned. To eliminate those if needed, add DISTINCT to the count in the inline view.
SELECT a.name, a.phone_number
FROM databasename a
INNER JOIN (
SELECT name, COUNT(phone_number) as count1
FROM databasename
GROUP BY name
having COUNT(phone_number) > 1
) b
on a.name = b.name
Order by a.name, a.phone_Number
One method is to use exists:
select t.*
from tablename t
where exists (select 1 from tablename t2 where t2.name = t.name and t2.phonenumber <> t.phonenumber)
SELECT DISTINCT x.*
FROM my_table x
JOIN my_table y
ON y.name = x.name
AND y.phone <> x.phone;
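The join-back and EXISTS approaches can be compared side by side. The sketch below assumes SQLite via Python's sqlite3 module rather than Impala, a hypothetical table name contacts, and a handful of made-up rows, including one name whose single number is listed twice. The join variant uses COUNT(DISTINCT ...) so that the repeated number does not count as a second number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone_number TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [
        ("John Smith", "(123) 456-7890"),
        ("Rob Johnson", "(123) 456-7890"),
        ("Greg Jackson", "(123) 456-7890"),
        ("Greg Jackson", "(123) 456-7890"),  # same number listed twice
        ("John Smith", "(234) 567-8901"),
        ("Rob Johnson", "(234) 567-8901"),
        ("Joe Wolf", "(234) 567-8901"),
    ],
)

# Variant 1: find names with more than one distinct number, then join back.
via_join = conn.execute(
    """
    SELECT a.name, a.phone_number
    FROM contacts a
    INNER JOIN (
        SELECT name
        FROM contacts
        GROUP BY name
        HAVING COUNT(DISTINCT phone_number) > 1
    ) b ON a.name = b.name
    ORDER BY a.name, a.phone_number
    """
).fetchall()

# Variant 2: keep a row if the same name also appears with a different number.
via_exists = conn.execute(
    """
    SELECT t.name, t.phone_number
    FROM contacts t
    WHERE EXISTS (SELECT 1
                  FROM contacts t2
                  WHERE t2.name = t.name
                    AND t2.phone_number <> t.phone_number)
    ORDER BY t.name, t.phone_number
    """
).fetchall()

print(via_join)
print(via_exists)
```

On this data both variants return the two John Smith rows and the two Rob Johnson rows, and neither returns Greg Jackson.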

How do I transpose multiple rows to columns in SQL

My first time reading a question on here.
I am working at a university and I have a table of student IDs and their supervisors, some of the students have one supervisor and some have two or three depending on their subject.
The table looks like this
ID Supervisor
1 John Doe
2 Peter Jones
2 Sarah Jones
3 Peter Jones
3 Sarah Jones
4 Stephen Davies
4 Peter Jones
4 Sarah Jones
5 John Doe
I want to create a view that turns that into this:
ID  Supervisor 1    Supervisor 2  Supervisor 3
1   John Doe
2   Peter Jones     Sarah Jones
3   Peter Jones     Sarah Jones
4   Stephen Davies  Peter Jones   Sarah Jones
5   John Doe
I have looked at PIVOT functions, but don't think it matches my needs.
Any help is greatly appreciated.
PIVOT was the right clue, it only needs a little 'extra' :)
DECLARE @tt TABLE (ID INT,Supervisor VARCHAR(128));
INSERT INTO @tt(ID,Supervisor)
VALUES
(1,'John Doe'),
(2,'Peter Jones'),
(2,'Sarah Jones'),
(3,'Peter Jones'),
(3,'Sarah Jones'),
(4,'Stephen Davies'),
(4,'Peter Jones'),
(4,'Sarah Jones'),
(5,'John Doe');
SELECT
*
FROM
(
SELECT
ID,
'Supervisor ' + CAST(ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Supervisor) AS VARCHAR(128)) AS supervisor_id,
Supervisor
FROM
@tt
) AS tt
PIVOT(
MAX(Supervisor) FOR
supervisor_id IN ([Supervisor 1],[Supervisor 2],[Supervisor 3])
) AS piv;
Result:
ID  Supervisor 1  Supervisor 2  Supervisor 3
1   John Doe      NULL          NULL
2   Peter Jones   Sarah Jones   NULL
3   Peter Jones   Sarah Jones   NULL
4   Peter Jones   Sarah Jones   Stephen Davies
5   John Doe      NULL          NULL
You will notice that the assignment to Supervisor X is done by ordering on the Supervisor VARCHAR value. If you want the ordering done differently, you might want to include an [Ordering] column and then change to ROW_NUMBER() OVER(PARTITION BY ID ORDER BY [Ordering]). E.g., an [Ordering] column could be an INT IDENTITY(1,1). I'll leave that as an exercise to you if that's what's really needed.
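The same numbering idea also works on engines without a PIVOT operator. The sketch below swaps in a different technique, conditional aggregation (MAX over CASE), and assumes SQLite 3.25 or later (for ROW_NUMBER) through Python's sqlite3 module. The table name tt and the CTE name numbered are chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tt (ID INTEGER, Supervisor TEXT)")
conn.executemany(
    "INSERT INTO tt VALUES (?, ?)",
    [
        (1, "John Doe"),
        (2, "Peter Jones"),
        (2, "Sarah Jones"),
        (3, "Peter Jones"),
        (3, "Sarah Jones"),
        (4, "Stephen Davies"),
        (4, "Peter Jones"),
        (4, "Sarah Jones"),
        (5, "John Doe"),
    ],
)

# Number each student's supervisors alphabetically, then spread the numbered
# rows into columns with one MAX(CASE ...) per target column.
rows = conn.execute(
    """
    WITH numbered AS (
        SELECT ID,
               Supervisor,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Supervisor) AS rn
        FROM tt
    )
    SELECT ID,
           MAX(CASE WHEN rn = 1 THEN Supervisor END) AS "Supervisor 1",
           MAX(CASE WHEN rn = 2 THEN Supervisor END) AS "Supervisor 2",
           MAX(CASE WHEN rn = 3 THEN Supervisor END) AS "Supervisor 3"
    FROM numbered
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

As with the PIVOT version, the number of output columns is fixed in the query, so a student with a fourth supervisor would need a fourth CASE expression.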

Identifying duplicates within a partition with different IDs

I am new to SQL and data analysis.
I have a scenario I am trying to identify using SQL partitions.
Basically, I want to find duplicates (same first_name, last_name, suffix code and zip code), but only if the IDs are different.
This query gives me only partial results, which is not correct. I know I am missing a filter here and there.
SELECT i.party_id,
I.FIRST_NM,
I.LAST_NM,
I.SFFX_CD,
A.ZIP_CD,
ROW_NUMBER() OVER (PARTITION BY I.FIRST_NM,
I.LAST_NM,
I.SFFX_CD,
A.ZIP_CD
ORDER BY I.PARTY_ID) AS RN
FROM INDVDL I,
PARTY_ADDR A
WHERE I.PARTY_ID = A.PARTY_ID
I should only get the rows marked with **, not the rest:
PARTY_ID  FIRST_NM  LAST_NM  SFFX_CD  ZIP_CD  RN
886874    John      Doe      Jr.      45402   1
886874    John      Doe      Jr.      45406   1
934635    John      Doe      Jr.      45406   2
886874    John      Doe      Jr.      45415   1
886874    John      Doe      Jr.      45415   2
886874    John      Doe      Jr.      45415   3
886874    John      Doe      Jr.      45415   4
886874    John      Doe      Jr.      45415   5
886874    John      Doe      Jr.      45415   6
**886874  John      Doe      Jr.      45415   7
**934635  John      Doe      Jr.      45415   8
934635    John      Doe      Jr.      45415   9
934635    John      Doe      Jr.      45415   10
Here is my suggestion. Use window functions to get the minimum and maximum values of PARTY_ID for the groups you have in mind. Then, filter to return only rows where these are different:
SELECT *
FROM (SELECT i.*, a.*,
MIN(I.PARTY_ID) OVER (PARTITION BY I.FIRST_NM, I.LAST_NM, I.SFFX_CD, A.ZIP_CD) as min_pi,
MAX(I.PARTY_ID) OVER (PARTITION BY I.FIRST_NM, I.LAST_NM, I.SFFX_CD, A.ZIP_CD) as max_pi
FROM INDVDL I JOIN
PARTY_ADDR A
ON I.PARTY_ID = A.PARTY_ID
) ia
WHERE min_pi <> max_pi;
Note: I fixed your join syntax to use explicit joins. Simple rule: never use commas in the from clause.
Also, I replaced the column lists with * for convenience. Add in the columns you want.
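To see the MIN/MAX window filter in action, here is a minimal sketch. It assumes SQLite 3.25 or later via Python's sqlite3 module, simplified versions of the INDVDL and PARTY_ADDR tables with only the columns from the question, and a few made-up rows. The columns are listed explicitly instead of using *, and the final ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE INDVDL (PARTY_ID INTEGER, FIRST_NM TEXT, LAST_NM TEXT, SFFX_CD TEXT)"
)
conn.execute("CREATE TABLE PARTY_ADDR (PARTY_ID INTEGER, ZIP_CD TEXT)")
conn.executemany(
    "INSERT INTO INDVDL VALUES (?, ?, ?, ?)",
    [(886874, "John", "Doe", "Jr."), (934635, "John", "Doe", "Jr.")],
)
conn.executemany(
    "INSERT INTO PARTY_ADDR VALUES (?, ?)",
    [
        (886874, "45402"),  # only one party at this zip: not a duplicate
        (886874, "45406"),
        (934635, "45406"),
        (886874, "45415"),
        (934635, "45415"),
    ],
)

# Within each name/suffix/zip group, differing MIN and MAX party ids mean
# that at least two different parties share the same details.
rows = conn.execute(
    """
    SELECT PARTY_ID, FIRST_NM, LAST_NM, SFFX_CD, ZIP_CD
    FROM (SELECT i.PARTY_ID, i.FIRST_NM, i.LAST_NM, i.SFFX_CD, a.ZIP_CD,
                 MIN(i.PARTY_ID) OVER (PARTITION BY i.FIRST_NM, i.LAST_NM,
                                                    i.SFFX_CD, a.ZIP_CD) AS min_pi,
                 MAX(i.PARTY_ID) OVER (PARTITION BY i.FIRST_NM, i.LAST_NM,
                                                    i.SFFX_CD, a.ZIP_CD) AS max_pi
          FROM INDVDL i
          JOIN PARTY_ADDR a
            ON i.PARTY_ID = a.PARTY_ID
         ) ia
    WHERE min_pi <> max_pi
    ORDER BY ZIP_CD, PARTY_ID
    """
).fetchall()

for row in rows:
    print(row)
```

The 45402 row drops out because only one party id appears in that group, while both parties are returned for 45406 and 45415.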