Concatenate distinct strings and numbers - sql

I am trying to get a distinct concatenated list of employee_ids and sum their employee_allowance. However, I do not want to sum duplicate employee_id's employee_allowance.
My expected result:
name  employee_ids          allowance  explanation (not part of output)
Bob   11Bob532, 11Bob923    26         13+13=26 because the ids are different, so we sum both
Sara  12Sara833             93
John  18John243, 18John823  64         21+43=64 because we got rid of the duplicate 18John243's allowance
Table creation/dummy data
CREATE TABLE emp (
name varchar2(100) NOT NULL,
employee_id varchar2(100) NOT NULL,
employee_allowance number not null
);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Bob', '11Bob923', 13);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Bob', '11Bob532', 13);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Sara', '12Sara833', 93);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John243', 21);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John243', 21);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John823', 43);
My attempt
My output gives me the distinct, concatenated employee_ids but still sums up the duplicate employee_allowance row.
SELECT
name,
LISTAGG(DISTINCT employee_id, ', ') WITHIN GROUP (ORDER BY employee_id) "ids",
SUM(employee_allowance)
FROM emp
GROUP BY
name

Find the DISTINCT rows first and then aggregate:
SELECT name,
LISTAGG(employee_id, ', ') WITHIN GROUP (ORDER BY employee_id) AS employee_ids,
SUM(employee_allowance) AS allowance
FROM (
SELECT DISTINCT *
FROM emp
)
GROUP BY name
Which, for the sample data, outputs:
NAME  EMPLOYEE_IDS          ALLOWANCE
Bob   11Bob532, 11Bob923    26
John  18John243, 18John823  64
Sara  12Sara833             93
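The same "deduplicate first, then aggregate" idea can be tried without an Oracle instance. The following is only a sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in: group_concat replaces LISTAGG, and because SQLite does not guarantee the order inside group_concat, the ids are sorted in Python.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (name TEXT NOT NULL, employee_id TEXT NOT NULL, "
    "employee_allowance INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("Bob", "11Bob923", 13),
        ("Bob", "11Bob532", 13),
        ("Sara", "12Sara833", 93),
        ("John", "18John243", 21),
        ("John", "18John243", 21),  # exact duplicate row
        ("John", "18John823", 43),
    ],
)

# Remove fully identical rows in the subquery, then aggregate per name.
# group_concat is SQLite's counterpart to LISTAGG (an assumption of this sketch).
query = """
SELECT name,
       group_concat(employee_id, ', ') AS employee_ids,
       SUM(employee_allowance)         AS allowance
FROM (SELECT DISTINCT name, employee_id, employee_allowance FROM emp)
GROUP BY name
"""

result = {}
for name, ids, allowance in conn.execute(query):
    # Sort here because SQLite leaves the concatenation order unspecified.
    result[name] = (sorted(ids.split(", ")), allowance)

for name in sorted(result):
    print(name, result[name])
```

Note that this only removes rows that are identical in every column. If the same employee_id could appear with different allowances, both rows would survive and both would be summed.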

Related

Pivoting a table with SQL

I have a table with position (junior, senior), salary, and an ID. I have done the following to find the highest salary for each position.
SELECT position, MAX(salary) FROM candidates GROUP BY position;
This returns one row per position with its maximum salary. I want to transpose the outcome so that 'junior' and 'senior' become the columns, without using crosstab. I have looked at many pivot examples, but they are built on cases much more complex than mine.
I am not proficient in PostgreSQL, but I believe there is a practical workaround solution since this is a simple table:
SELECT
max(case when position = 'senior' then salary else null end) senior,
max(case when position = 'junior' then salary else null end) junior
FROM payments
It worked with this example:
create table payments (id integer, position varchar(100), salary int);
insert into payments (id, position, salary) values (1, 'junior', 1000);
insert into payments (id, position, salary) values (1, 'junior', 2000);
insert into payments (id, position, salary) values (1, 'junior', 5000);
insert into payments (id, position, salary) values (1, 'junior', 3000);
insert into payments (id, position, salary) values (2, 'senior', 3000);
insert into payments (id, position, salary) values (2, 'senior', 8000);
insert into payments (id, position, salary) values (2, 'senior', 9000);
insert into payments (id, position, salary) values (2, 'senior', 7000);
insert into payments (id, position, salary) values (2, 'senior', 4000);
select
max(case when position = 'junior' then salary else 0 end) junior,
max(case when position = 'senior' then salary else 0 end) senior
from payments;
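The two queries above differ only in their ELSE branch, and that matters when a position has no rows at all. Below is a small sketch, assuming SQLite via Python's sqlite3 as a stand-in for PostgreSQL; the 'intern' position is hypothetical and only there to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, position TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        (1, "junior", 1000), (1, "junior", 2000), (1, "junior", 5000),
        (1, "junior", 3000), (2, "senior", 3000), (2, "senior", 8000),
        (2, "senior", 9000), (2, "senior", 7000), (2, "senior", 4000),
    ],
)

# Conditional aggregation: each CASE keeps only the salaries of one position,
# so each MAX produces one pivoted column.
pivot_row = conn.execute(
    """
    SELECT MAX(CASE WHEN position = 'junior' THEN salary END) AS junior,
           MAX(CASE WHEN position = 'senior' THEN salary END) AS senior
    FROM payments
    """
).fetchone()
print(pivot_row)

# For a position with no rows ('intern' is hypothetical), the NULL variant
# yields NULL (None in Python) while the ELSE 0 variant yields 0.
missing = conn.execute(
    """
    SELECT MAX(CASE WHEN position = 'intern' THEN salary END),
           MAX(CASE WHEN position = 'intern' THEN salary ELSE 0 END)
    FROM payments
    """
).fetchone()
print(missing)
```

Which variant is preferable depends on whether "no such position" should be distinguishable from a maximum salary of 0.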
Here is my attempt at teaching myself crosstab:
CREATE EXTENSION IF NOT EXISTS tablefunc;
select Junior
, Senior
from
(
select *
from crosstab
(
'select 1, position, max(salary)
from candidates
group by position
'
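, $$VALUES ('junior'), ('senior')$$ -- category values must match the stored position values exactly (the comparison is case-sensitive)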
)
as ct(row_number integer, Junior integer, Senior integer) --I don't know your actual data types, so you will need to update this as needed
) q
Edit: Below is no longer relevant as this appears to be PostgreSQL
Based on your description, it sounds like you probably want a pivot like this:
select p.* -- after PIVOT, only the pivot alias p is in scope
from
(
select position
, salary
from candidates
) q
pivot (
max(salary) for position in ([Junior], [Senior])
) p
This example was written for SQL Server, since the DBMS was not specified.
It depends on which SQL dialect you are running. It also depends on the complexity of your table. In SQL Server, I believe you can use the solutions provided in this question for relatively simple tables: Efficiently convert rows to columns in sql server

How to replace the NULL value of the last row of a column with 'Grand total' while retaining 'Total' replacing NULL value in the same column?

Below is the table I created and the values I inserted into it:
CREATE TABLE Employees
(
Id INTEGER IDENTITY(1,1),
Name VARCHAR(50),
Gender VARCHAR(50),
Salary INTEGER,
Country VARCHAR(50)
)
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Mark', 'Male', 5000, 'USA')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('John', 'Male', 4500, 'India')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Pam', 'Female', 5500, 'USA')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Sara', 'Female', 4000, 'India')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Todd', 'Male', 3500, 'India')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Mary', 'Female', 5000, 'UK')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Ben', 'Male', 6500, 'UK')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Elizabeth', 'Female', 7000, 'USA')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Tom', 'Male', 5500, 'UK')
INSERT INTO Employees (Name, Gender, Salary, Country)
VALUES ('Ron', 'Male', 5000, 'USA')
SELECT * FROM Employees
Now I ran the following query:
SELECT
COALESCE(Country, '') AS [Country],
COALESCE(Gender, 'Total') AS [Gender],
SUM(Salary) AS [Total Salary]
FROM
Employees
GROUP BY
ROLLUP(Country, Gender)
In the query result, the last row of the Gender column has the value 'Total'.
I want to replace 'Total' with 'Grand Total' only in the last row of the Gender column, while keeping the 'Total' text in the other rows of that column.
Is there any way to achieve that?
If so, what is the simplest way to do it?
You can use GROUPING_ID() for it:
SELECT
COALESCE(Country,'') AS [Country],
CASE WHEN GROUPING_ID(Country)=1 THEN 'Grand Total' ELSE COALESCE(Gender,'Total') END as [Gender],
SUM(Salary) AS [Total Salary]
FROM Employees
GROUP BY ROLLUP(Country,Gender)
EDIT: A comment on the question notes that the order of the result should be specified to make sure it is correct.
The query can be ordered like this, so that the totals appear below the details:
SELECT
COALESCE(Country,'') AS [Country],
CASE WHEN GROUPING_ID(Country)=1 THEN 'Grand Total' ELSE COALESCE(Gender,'Total') END as [Gender],
SUM(Salary) AS [Total Salary],
GROUPING_ID(Country),
GROUPING_ID(Gender)
FROM Employees
GROUP BY ROLLUP(Country,Gender)
ORDER BY COALESCE(Country,'ZZZ'),GROUPING_ID(Country),
Gender,GROUPING_ID(Gender)
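To see the expected shape of the ordered result without SQL Server, here is a rough sketch in SQLite (via Python's sqlite3). SQLite has no ROLLUP or GROUPING_ID, so this swaps in a different technique: the rollup is emulated with UNION ALL, and an explicit lvl column (a name invented for this sketch) plays the role that GROUPING_ID plays above. The Id identity column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (Name TEXT, Gender TEXT, Salary INTEGER, Country TEXT)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        ("Mark", "Male", 5000, "USA"), ("John", "Male", 4500, "India"),
        ("Pam", "Female", 5500, "USA"), ("Sara", "Female", 4000, "India"),
        ("Todd", "Male", 3500, "India"), ("Mary", "Female", 5000, "UK"),
        ("Ben", "Male", 6500, "UK"), ("Elizabeth", "Female", 7000, "USA"),
        ("Tom", "Male", 5500, "UK"), ("Ron", "Male", 5000, "USA"),
    ],
)

# lvl marks the aggregation level: 0 = detail, 1 = per-country subtotal,
# 2 = grand total. The label is chosen per level, as the CASE does above.
report = conn.execute(
    """
    SELECT country, gender, total
    FROM (
        SELECT Country AS country, Gender AS gender,
               SUM(Salary) AS total, 0 AS lvl
        FROM Employees GROUP BY Country, Gender
        UNION ALL
        SELECT Country, 'Total', SUM(Salary), 1
        FROM Employees GROUP BY Country
        UNION ALL
        SELECT '', 'Grand Total', SUM(Salary), 2
        FROM Employees
    )
    ORDER BY lvl = 2, country, lvl, gender
    """
).fetchall()

for row in report:
    print(row)
```

Sorting on lvl = 2 first pushes the grand total to the end, and sorting on lvl within each country keeps the subtotal below its detail rows.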
Another easy way is to build the label by concatenating the country name, using isnull (which is preferable in SQL Server when there are just two values), such as:
select
isnull(Country,'') Country,
isnull(Gender, Concat(IsNull(Country, 'Grand'), ' Total')) Gender,
Sum(Salary) [Total Salary]
from Employees
group by rollup(Country,Gender);

Group functions with Hierarchical Query

I have the following table employees(emp_id, name,salary, manager_id)
I want to write a query that retrieves each manager_id together with the sum of the salaries of all employees managed by that manager, directly or indirectly (that is, including employees managed by a manager who in turn reports to this manager).
I wrote query like this:
Select manager_id , sum(salary)
from employees
connect by prior emp_id = manager_id
start with manager_id = 100
group by manager_id;
but that doesn't retrieve sum salary as I want.
Build the hierarchy first, remembering the ROOT, then group by the root. E.g. the salary of the manager whose emp_id=100 plus that of all employees he/she manages:
SELECT manager_id, SUM(salary) "Total_Salary"
FROM (
SELECT CONNECT_BY_ROOT emp_id as manager_id, Salary
FROM employees
START WITH emp_id=100
CONNECT BY PRIOR emp_id = manager_id )
GROUP BY manager_id
ORDER BY manager_id;
@Serg's solution is fine, but for a single manager an even simpler query works:
select 21 as id, sum(salary) as summed
from employees e
start with emp_id = 21
connect by prior emp_id = manager_id;
If you don't want the manager's own salary in the sum, add where level <> 1.
Test data:
create table employees(emp_id number(4), name varchar2(10),
salary number(6), manager_id number(4));
insert into employees values ( 1, 'King', 10000, null);
insert into employees values ( 11, 'Smith', 8000, 1);
insert into employees values ( 21, 'Jones', 9000, 1);
insert into employees values ( 211, 'Brown', 7500, 21);
insert into employees values ( 212, 'Adams', 6200, 21);
insert into employees values (2111, 'White', 5000, 211);
Output:
ID SUMMED
------ ----------
21 27700
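For readers without Oracle, the "remember the root, then group by it" idea can be sketched with a recursive CTE, which plays the role of CONNECT BY with CONNECT_BY_ROOT. This is a sketch only, assuming SQLite via Python's sqlite3 and the test data above; the CTE name tree and column root_id are made up for the example. Every employee is used as a root here, so the totals for all managers come out in one pass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER, name TEXT, "
    "salary INTEGER, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "King", 10000, None),
        (11, "Smith", 8000, 1),
        (21, "Jones", 9000, 1),
        (211, "Brown", 7500, 21),
        (212, "Adams", 6200, 21),
        (2111, "White", 5000, 211),
    ],
)

# Anchor member: every employee starts a subtree with itself as the root.
# Recursive member: attach each employee to the root of its manager's row.
totals = dict(
    conn.execute(
        """
        WITH RECURSIVE tree(root_id, emp_id, salary) AS (
            SELECT emp_id, emp_id, salary FROM employees
            UNION ALL
            SELECT t.root_id, e.emp_id, e.salary
            FROM employees e
            JOIN tree t ON e.manager_id = t.emp_id
        )
        SELECT root_id, SUM(salary)
        FROM tree
        GROUP BY root_id
        ORDER BY root_id
        """
    ).fetchall()
)

for root_id, total in totals.items():
    print(root_id, total)
```

As in the queries above, each total includes the root's own salary; excluding it would mean filtering out the rows where root_id equals emp_id.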

Retrieving rows randomly in pl/sql query

I have a table (t1). I know how to retrieve a random percentage of its rows.
What I want is to insert 30% of randomly selected rows into t2, and insert the remaining 70% into table t3.
Is there any other way besides inserting 30% into table t2, then comparing t2 with t1 and inserting the rest into t3? That method is not good for me since the table is huge.
ps. oracle version - 11g
Look into ORA_HASH. Generate a hash from the table's PK (or a similar column combination) with a max_bucket of 9, which yields bucket values 0 to 9; rows with 0-6 go into one table, and rows with 7, 8 or 9 go into the other. The split is approximately, not exactly, 70/30.
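The bucketing idea can be illustrated outside the database. The sketch below is an assumption-laden stand-in: zlib.crc32 takes the place of ORA_HASH, and the function names bucket and split_rows are invented for the example. It shows the property that matters: each row is routed by its own key in a single pass, so t2 never has to be compared against t1.

```python
import zlib


def bucket(pk, buckets=10):
    """Map a primary-key value to a stable bucket number in 0..buckets-1."""
    return zlib.crc32(str(pk).encode("utf-8")) % buckets


def split_rows(rows, threshold=7):
    """Route rows by the bucket of their first column (assumed to be the PK).

    Buckets 0..threshold-1 (roughly 70% here) go to t3, the rest to t2.
    """
    t2, t3 = [], []
    for row in rows:
        if bucket(row[0]) < threshold:
            t3.append(row)
        else:
            t2.append(row)
    return t2, t3


# Hypothetical source table with 1000 rows keyed by an integer PK.
t1 = [(i, f"row{i}") for i in range(1, 1001)]
t2, t3 = split_rows(t1)
print(len(t2), len(t3))
```

Because the bucket depends only on the key, the split is repeatable rather than freshly random on each run, and the proportions are only approximate.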
Would an INSERT ALL work? Here is one I did with the HR employees table. I ordered the rows randomly and took 30 percent of them; those rows got an indicator of 1. I then did a UNION ALL with the whole table, giving those rows an indicator of 0, took the MAX of the indicator per row, and did an INSERT ALL: rows with indicator 1 go into the first table, and the remaining 70% go into the second.
INSERT ALL
WHEN (table_one_ind = 1) THEN
INTO table_one
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
VALUES
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
ELSE
INTO table_two
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
VALUES
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
SELECT MAX (table_one_ind) table_one_ind,
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
FROM
(SELECT t.*,
1 AS table_one_ind
FROM
( SELECT * FROM employees ORDER BY dbms_random.value
) t
WHERE rownum <=
( SELECT ceil(COUNT(*)*.3) FROM employees
)
UNION ALL
SELECT t.*, 0 FROM employees t
)
GROUP BY employee_id,
first_name,
last_name,
email,
hire_date,
job_id

SQL: how to find maximum value items according a attribute

I am a beginner of SQL, having this table instructor:
ID name dept_name salary
001 A d01 1000
002 B d02 2000
003 C d01 3000
...
I am writing a query to find the people who have the highest salary in each department, like:
name dept_name
C d01
B d02
I know how to find a maximum value,
but I have no idea how to apply it per dept_name so that it works for every department.
This will ensure that only records which are the highest salary for each department are returned to the result set.
SELECT name, dept_name, salary
FROM tbl t
WHERE NOT EXISTS(SELECT salary FROM tbl t2 WHERE t2.salary>t.salary AND t2.dept_name=t.dept_name)
Using SELECT name, MAX(salary) like other answerers have used won't work. Using MAX() will return the highest salary for each department, but the name will not necessarily be related to that salary value.
For example, SELECT MIN(salary), MAX(salary) is most likely going to pull values from different records. That's how aggregate functions work.
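The NOT EXISTS query can be checked against the sample rows from the question. This is a minimal sketch, assuming SQLite via Python's sqlite3 and a table named instructor (the answer calls it tbl).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("001", "A", "d01", 1000),
        ("002", "B", "d02", 2000),
        ("003", "C", "d01", 3000),
    ],
)

# Keep a row only if no other row in the same department has a higher salary.
winners = conn.execute(
    """
    SELECT name, dept_name
    FROM instructor t
    WHERE NOT EXISTS (
        SELECT 1
        FROM instructor t2
        WHERE t2.dept_name = t.dept_name
          AND t2.salary > t.salary
    )
    ORDER BY dept_name
    """
).fetchall()

for row in winners:
    print(row)
```

If two people in a department share the top salary, this form returns both of them, since neither has anyone above them.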
select name, max(dept_name)
from tbl
group by name
I assume it is a requirement to not include the salary in the result:
WITH INSTRUCTOR
AS
(
SELECT *
FROM (
VALUES ('001', 'A', 'd01', 1000),
('002', 'B', 'd02', 2000),
('003', 'C', 'd01', 3000)
) AS T (ID, name, dept_name, salary)
),
INSTRUCTOR_DEPT_HIGHEST_SALARY
AS
(
SELECT dept_name, MAX(salary) AS highest_salary
FROM INSTRUCTOR
GROUP
BY dept_name
)
SELECT ID, name, dept_name
FROM INSTRUCTOR AS T
WHERE EXISTS (
SELECT *
FROM INSTRUCTOR_DEPT_HIGHEST_SALARY AS H
WHERE H.dept_name = T.dept_name
AND H.highest_salary = T.salary
);
You can use the group by clause. Check this w3Schools link
SELECT NAME,DEPT_NAME,max(SALARY) FROM table_name group by DEPT_NAME