Pivoting a table with SQL - sql

I have a table with position (junior, senior), salary, and an ID. I have done the following to find the highest salary for each position.
SELECT position, MAX(salary) FROM candidates GROUP BY position;
What I am getting:
How I want it:
I want to transpose the outcome so that 'junior' and 'senior' are the columns without using crosstab. I have looked at many pivot examples but they are done on examples much more complex than mine.

I am not proficient in PostgreSQL, but I believe there is a practical workaround solution since this is a simple table:
SELECT
max(case when position = 'senior' then salary else null end) senior,
max(case when position = 'junior' then salary else null end) junior
FROM payments
It worked with this example:
create table payments (id integer, position varchar(100), salary int);
insert into payments (id, position, salary) values (1, 'junior', 1000);
insert into payments (id, position, salary) values (1, 'junior', 2000);
insert into payments (id, position, salary) values (1, 'junior', 5000);
insert into payments (id, position, salary) values (1, 'junior', 3000);
insert into payments (id, position, salary) values (2, 'senior', 3000);
insert into payments (id, position, salary) values (2, 'senior', 8000);
insert into payments (id, position, salary) values (2, 'senior', 9000);
insert into payments (id, position, salary) values (2, 'senior', 7000);
insert into payments (id, position, salary) values (2, 'senior', 4000);
select
max(case when position = 'junior' then salary else 0 end) junior,
max(case when position = 'senior' then salary else 0 end) senior
from payments;
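The conditional-aggregation pivot above can be tried without a PostgreSQL server. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in database and a reduced version of the sample rows; the name pivot_row is only illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, position TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO payments (id, position, salary) VALUES (?, ?, ?)",
    [(1, "junior", 1000), (1, "junior", 5000), (2, "senior", 3000), (2, "senior", 9000)],
)

# One MAX(CASE ...) per output column: rows of the other position yield NULL,
# and MAX ignores NULLs, so each column keeps only its own position's maximum.
pivot_row = conn.execute(
    """
    SELECT MAX(CASE WHEN position = 'junior' THEN salary END) AS junior,
           MAX(CASE WHEN position = 'senior' THEN salary END) AS senior
    FROM payments
    """
).fetchone()
print(pivot_row)
```

Omitting the ELSE branch is equivalent to ELSE NULL; using ELSE 0 instead gives the same result only as long as every salary is positive.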

Here is my attempt at teaching myself crosstab:
CREATE EXTENSION IF NOT EXISTS tablefunc;
select Junior
, Senior
from
(
select *
from crosstab
(
'select 1, position, max(salary)
from candidates
group by position
'
, $$VALUES('Junior'), ('Senior')$$
)
as ct(row_number integer, Junior integer, Senior integer) --I don't know your actual data types, so you will need to update this as needed
) q
Edit: Below is no longer relevant as this appears to be PostgreSQL
Based on your description, it sounds like you probably want a pivot like this:
select q.*
from
(
select position
, salary
from candidates
) q
pivot (
max(salary) for position in ([Junior], [Senior])
) p
This example was written for SQL Server, since the DBMS was not specified.

It depends on which SQL dialect you are running. It also depends on the complexity of your table. In SQL Server, I believe you can use the solutions provided in this question for relatively simple tables: Efficiently convert rows to columns in sql server

Related

Concatenate distinct strings and numbers

I am trying to get a distinct concatenated list of employee_ids and sum their employee_allowance. However, I do not want to sum duplicate employee_id's employee_allowance.
My expected result
name | employee_ids         | allowance | this column is for explanation (not part of output)
Bob  | 11Bob532, 11Bob923   | 26        | 13+13=26 because the id's are different, so we sum both
Sara | 12Sara833            | 93        |
John | 18John243, 18John823 | 64        | 21+43=64 because we got rid of the duplicate 18John243's allowance
Table creation/dummy data
CREATE TABLE emp (
name varchar2(100) NOT NULL,
employee_id varchar2(100) NOT NULL,
employee_allowance number not null
);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Bob', '11Bob923', 13);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Bob', '11Bob532', 13);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('Sara', '12Sara833', 93);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John243', 21);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John243', 21);
INSERT INTO emp (name, employee_id, employee_allowance) VALUES ('John', '18John823', 43);
My attempt
My output gives me the distinct, concatenated employee_ids but still sums up the duplicate employee_allowance row.
SELECT
name,
LISTAGG(DISTINCT employee_id, ', ') WITHIN GROUP (ORDER BY employee_id) "ids",
SUM(employee_allowance)
FROM emp
GROUP BY
name
Find the DISTINCT rows first and then aggregate:
SELECT name,
LISTAGG(employee_id, ', ') WITHIN GROUP (ORDER BY employee_id) AS employee_ids,
SUM(employee_allowance) AS allowance
FROM (
SELECT DISTINCT *
FROM emp
)
GROUP BY name
Which, for the sample data, outputs:
NAME | EMPLOYEE_IDS         | ALLOWANCE
Bob  | 11Bob532, 11Bob923   | 26
John | 18John243, 18John823 | 64
Sara | 12Sara833            | 93
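The "de-duplicate first, then aggregate" idea can be checked with a small sketch. It assumes Python's built-in sqlite3 module instead of Oracle, so GROUP_CONCAT stands in for LISTAGG; because GROUP_CONCAT does not guarantee element order, the ids are sorted in Python. The name result is only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (name TEXT NOT NULL, employee_id TEXT NOT NULL, "
    "employee_allowance INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("Bob", "11Bob923", 13),
        ("Bob", "11Bob532", 13),
        ("Sara", "12Sara833", 93),
        ("John", "18John243", 21),
        ("John", "18John243", 21),  # exact duplicate row
        ("John", "18John823", 43),
    ],
)

# Remove duplicate rows in the inner query, then aggregate per name.
rows = conn.execute(
    """
    SELECT name,
           GROUP_CONCAT(employee_id, ', ') AS employee_ids,
           SUM(employee_allowance)         AS allowance
    FROM (SELECT DISTINCT name, employee_id, employee_allowance FROM emp)
    GROUP BY name
    ORDER BY name
    """
).fetchall()

# Sort the concatenated ids in Python to get a deterministic order.
result = {name: (sorted(ids.split(", ")), total) for name, ids, total in rows}
print(result)
```

Note that SELECT DISTINCT only removes rows that are identical in every column; two rows with the same employee_id but different allowances would both be kept and summed.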

Optimize Simple SQL Query

Suppose you have a table of students and their GPAs. The idea is to return the student or students with the highest GPA. If there is only one student, the prize is $1000. Otherwise, the amount is split among the students sharing the highest GPA. The result below is what I would expect: 3 students and an amount of 333. I'm wondering if this is the best way of writing the query.
CREATE TABLE Test (
PersonID int,
Name varchar(255),
GPA DECIMAL(3,2)
);
INSERT INTO Test(personid, name, gpa) VALUES(1, 'Frank', 2.7)
INSERT INTO Test(personid, name, gpa) VALUES(2, 'Barb', 3.7)
INSERT INTO Test(personid, name, gpa) VALUES(3, 'Tammy', 3.7)
INSERT INTO Test(personid, name, gpa) VALUES(4, 'Edward', 3.7)
Select name, gpa,
(Select Case When Count(*) = 1 Then '1000'
Else 1000/COUNT(*)
End
FROM Test
WHERE gpa = (SELECT MAX(gpa) FROM test)
) As 'Prize Amount'
FROM Test
Where gpa = (SELECT MAX(gpa) FROM test)
Results of query
I feel like it isn't efficient because of having to query twice. I'd like to just divide by the number of rows. Something like the query below doesn't work (GROUP BY issue), and adding a GROUP BY on gpa, name would always display 1000, since each name/gpa group has one record.
Select name, gpa,
Case When Count(*) = 1 Then '1000'
Else 1000/COUNT(*)
End As 'Prize Amount'
FROM Test
Where gpa = (SELECT MAX(gpa) FROM test)
I think you want window functions:
select t.*,
1000.0 / count(*) over ()
from test t
where t.gpa = (select max(t2.gpa) from test t2);
With an index on gpa, this is probably the fastest solution.
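As a quick check of the window-function approach, here is a minimal sketch that assumes Python's built-in sqlite3 module (window functions need SQLite 3.25 or later) rather than SQL Server. The column alias prize and the variable winners are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (personid INTEGER, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "Frank", 2.7), (2, "Barb", 3.7), (3, "Tammy", 3.7), (4, "Edward", 3.7)],
)

# COUNT(*) OVER () counts the rows that survive the WHERE clause,
# so every top student is divided by the same number of winners.
winners = conn.execute(
    """
    SELECT t.name, t.gpa, 1000.0 / COUNT(*) OVER () AS prize
    FROM test t
    WHERE t.gpa = (SELECT MAX(t2.gpa) FROM test t2)
    ORDER BY t.personid
    """
).fetchall()
for name, gpa, prize in winners:
    print(name, gpa, round(prize, 2))
```

With a single top student the window count is 1, so the prize is 1000 without any CASE expression.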

Retrieving rows randomly in pl/sql query

I have a table (t1). I know how to retrieve a percentage of its rows at random.
What I want is to insert a randomly selected 30% of the rows into t2 and the remaining 70% into t3.
Is there any way other than inserting 30% into t2, then comparing t2 with t1 and inserting the rest into t3? That method is not good for me since the table is huge.
P.S. Oracle version: 11g
Look into ORA_HASH. Generate a hash from the table's PK (or some similar column combination) with a maximum bucket of 9, which yields buckets 0 through 9; rows in buckets 0-6 go into one table, and rows in buckets 7, 8 or 9 go into the other.
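The bucket idea can be illustrated outside the database. This is only a sketch under stated assumptions: the helper bucket is hypothetical and uses SHA-256 as a stand-in for ORA_HASH, and the keys are made-up primary-key values. It shows that a single pass assigns each row to exactly one target table, with roughly (not exactly) a 30/70 split.

```python
import hashlib

def bucket(key, buckets=10):
    """Map a key to a stable bucket in 0..buckets-1 (stand-in for ORA_HASH)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

keys = range(1, 10001)  # hypothetical primary-key values of t1

# Buckets 0-6 (about 70%) go to t3, buckets 7-9 (about 30%) go to t2.
t3_keys = [k for k in keys if bucket(k) <= 6]
t2_keys = [k for k in keys if bucket(k) >= 7]

print(len(t2_keys) / len(keys), len(t3_keys) / len(keys))
```

Because the hash is deterministic, the same row always lands in the same table, which also means the split is repeatable rather than freshly random on every run.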
Would an INSERT ALL work? Here is one I did with the HR employees table. I ordered the rows randomly and took 30 percent of them; those rows got an indicator of 1. I did a UNION ALL with the whole table, giving every row an indicator of 0, then took the MAX of the indicator per row and did an INSERT ALL: rows with an indicator of 1 go into the first table, and the remaining 70% go into the second.
INSERT ALL
WHEN (table_one_ind = 1) THEN
INTO table_one
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
VALUES
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
ELSE
INTO table_two
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
VALUES
(
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
)
SELECT MAX (table_one_ind) table_one_ind,
employee_id,
first_name,
last_name,
email,
hire_date,
job_id
FROM
(SELECT t.*,
1 AS table_one_ind
FROM
( SELECT * FROM employees ORDER BY dbms_random.value
) t
WHERE rownum <=
( SELECT ceil(COUNT(*)*.3) FROM employees
)
UNION ALL
SELECT t.*, 0 FROM employees t
)
GROUP BY employee_id,
first_name,
last_name,
email,
hire_date,
job_id

Calculating percentage SQL [duplicate]

I have a SQL Server table that contains users & their grades. For simplicity's sake, let's just say there are 2 columns - name & grade. So a typical row would be Name: "John Doe", Grade:"A".
I'm looking for one SQL statement that will find the percentages of all possible answers. (A, B, C, etc...) Also, is there a way to do this without defining all possible answers (open text field - users could enter 'pass/fail', 'none', etc...)
The final output I'm looking for is A: 5%, B: 15%, C: 40%, etc...
The most efficient (using over()).
select Grade, count(*) * 100.0 / sum(count(*)) over()
from MyTable
group by Grade
Universal (any SQL version).
select Grade, count(*) * 100.0 / (select count(*) from MyTable)
from MyTable
group by Grade;
With CTE, the least efficient.
with t(Grade, GradeCount)
as
(
select Grade, count(*)
from MyTable
group by Grade
)
select Grade, GradeCount * 100.0/(select sum(GradeCount) from t)
from t;
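The "universal" subquery form above also handles free-text grades, since nothing in it enumerates the possible values. A minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server and a handful of made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Name TEXT, Grade TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("u1", "A"), ("u2", "B"), ("u3", "B"), ("u4", "pass"), ("u5", "B")],
)

# Each group's count divided by the overall row count; 100.0 keeps the
# arithmetic in floating point instead of integer division.
percentages = dict(
    conn.execute(
        """
        SELECT Grade, COUNT(*) * 100.0 / (SELECT COUNT(*) FROM MyTable)
        FROM MyTable
        GROUP BY Grade
        """
    ).fetchall()
)
print(percentages)
```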
I have tested the following and it does work. The answer by gordyii was close but had the multiplication by 100 in the wrong place and was missing some parentheses.
Select Grade, (Count(Grade)* 100 / (Select Count(*) From MyTable)) as Score
From MyTable
Group By Grade
Instead of using a separate CTE to get the total, you can use a window function without the "partition by" clause.
If you are using:
count(*)
to get the count for a group, you can use:
sum(count(*)) over ()
to get the total count.
For example:
select Grade, 100. * count(*) / sum(count(*)) over ()
from MyTable
group by Grade;
It tends to be faster in my experience, but I think it might internally use a temp table in some cases (I've seen "Worktable" when running with "set statistics io on").
EDIT:
I'm not sure if my example query is what you are looking for, I was just illustrating how the windowing functions work.
I simply use this whenever I need to work out a percentage:
ROUND(CAST((Numerator * 100.0 / Denominator) AS FLOAT), 2) AS Percentage
Note that 100.0 forces decimal arithmetic, whereas 100 on its own makes the division an integer division when Numerator and Denominator are integers, so the result is truncated to a whole number, even with the ROUND(...,2) function!
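The difference between 100 and 100.0 is easy to see in isolation. The sketch below assumes Python's built-in sqlite3 module; SQLite, like SQL Server, performs integer division when both operands are integers, so it shows the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Left: integer arithmetic, 100 / 3 is truncated before ROUND ever sees it.
# Right: 100.0 makes the whole expression floating point.
int_pct, dec_pct = conn.execute(
    "SELECT ROUND(1 * 100 / 3, 2), ROUND(1 * 100.0 / 3, 2)"
).fetchone()
print(int_pct, dec_pct)
```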
You have to calculate the total number of grades first.
If it is SQL Server 2005 or later, you can use a CTE:
WITH Tot(Total) AS (
SELECT COUNT(*) FROM MyTable
)
SELECT Grade, COUNT(*) * 100.0 / Total
--, CONVERT(VARCHAR, COUNT(*) * 100.0 / Total) + '%' -- With percentage sign
--, CONVERT(VARCHAR, ROUND(COUNT(*) * 100.0 / Total, 2)) + '%' -- With Round
FROM MyTable CROSS JOIN Tot
GROUP BY Grade, Total
You need to group on the grade field. This query should give you what you're looking for in any database that supports window functions.
Select Grade, CountofGrade * 100.0 / sum(CountofGrade) over ()
from
(
Select Grade, Count(*) as CountofGrade
From Grades
Group By Grade) as sub
You should specify the system you're using.
The following should work
ID - Key
Grade - A,B,C,D...
EDIT: Moved the * 100 and added the 1.0 to ensure that it doesn't do integer division
Select
Grade, Count(ID) * 100.0 / ((Select Count(ID) From MyTable) * 1.0)
From MyTable
Group By Grade
This is, I believe, a general solution, though I tested it using IBM Informix Dynamic Server 11.50.FC3. The following query:
SELECT grade,
ROUND(100.0 * grade_sum / (SELECT COUNT(*) FROM grades), 2) AS pct_of_grades
FROM (SELECT grade, COUNT(*) AS grade_sum
FROM grades
GROUP BY grade
)
ORDER BY grade;
gives the following output on the test data shown below the horizontal rule. The ROUND function may be DBMS-specific, but the rest (probably) is not. (Note that I changed 100 to 100.0 to ensure that the calculation occurs using non-integer - DECIMAL, NUMERIC - arithmetic; see the comments, and thanks to Thunder.)
grade pct_of_grades
CHAR(1) DECIMAL(32,2)
A 32.26
B 16.13
C 12.90
D 12.90
E 9.68
F 16.13
CREATE TABLE grades
(
id VARCHAR(10) NOT NULL,
grade CHAR(1) NOT NULL CHECK (grade MATCHES '[ABCDEF]')
);
INSERT INTO grades VALUES('1001', 'A');
INSERT INTO grades VALUES('1002', 'B');
INSERT INTO grades VALUES('1003', 'F');
INSERT INTO grades VALUES('1004', 'C');
INSERT INTO grades VALUES('1005', 'D');
INSERT INTO grades VALUES('1006', 'A');
INSERT INTO grades VALUES('1007', 'F');
INSERT INTO grades VALUES('1008', 'C');
INSERT INTO grades VALUES('1009', 'A');
INSERT INTO grades VALUES('1010', 'E');
INSERT INTO grades VALUES('1011', 'A');
INSERT INTO grades VALUES('1012', 'F');
INSERT INTO grades VALUES('1013', 'D');
INSERT INTO grades VALUES('1014', 'B');
INSERT INTO grades VALUES('1015', 'E');
INSERT INTO grades VALUES('1016', 'A');
INSERT INTO grades VALUES('1017', 'F');
INSERT INTO grades VALUES('1018', 'B');
INSERT INTO grades VALUES('1019', 'C');
INSERT INTO grades VALUES('1020', 'A');
INSERT INTO grades VALUES('1021', 'A');
INSERT INTO grades VALUES('1022', 'E');
INSERT INTO grades VALUES('1023', 'D');
INSERT INTO grades VALUES('1024', 'B');
INSERT INTO grades VALUES('1025', 'A');
INSERT INTO grades VALUES('1026', 'A');
INSERT INTO grades VALUES('1027', 'D');
INSERT INTO grades VALUES('1028', 'B');
INSERT INTO grades VALUES('1029', 'A');
INSERT INTO grades VALUES('1030', 'C');
INSERT INTO grades VALUES('1031', 'F');
SELECT Grade, GradeCount * 1.0 / SUM(GradeCount) OVER ()
FROM (SELECT Grade, COUNT(*) As GradeCount
FROM myTable
GROUP BY Grade) Grades
In any SQL Server version you could use a variable for the total of all grades like this:
declare @countOfAll decimal(18, 4)
select @countOfAll = COUNT(*) from Grades
select
Grade, COUNT(*) / @countOfAll * 100
from Grades
group by Grade
You can use a subselect in your FROM clause (untested and not sure which is faster):
SELECT Grade, COUNT(*) * 1.0 / TotalRows
FROM myTable
CROSS JOIN (SELECT COUNT(*) As TotalRows
FROM myTable) Totals
GROUP BY Grade, TotalRows
Or
SELECT Grade, SUM(PartialCount)
FROM (SELECT Grade, 1.0 / (SELECT COUNT(*) FROM myTable) AS PartialCount
FROM myTable) Grades
GROUP BY Grade
Or
SELECT Grade, GradeCount * 1.0 / SUM(GradeCount) OVER ()
FROM (SELECT Grade, COUNT(*) As GradeCount
FROM myTable
GROUP BY Grade) Grades
You can also use a stored procedure (apologies for the Firebird syntax):
SELECT COUNT(*)
FROM myTable
INTO :TotalCount;
FOR SELECT Grade, COUNT(*)
FROM myTable
GROUP BY Grade
INTO :Grade, :GradeCount
DO
BEGIN
Percent = :GradeCount / :TotalCount;
SUSPEND;
END
This one works well in MS SQL. It converts the counts to float and casts the result to a decimal with two decimal places.
Select field1, cast(Try_convert(float,(Count(field2)* 100) /
Try_convert(float, (Select Count(*) From table1))) as decimal(10,2)) as new_field_name
From table1
Group By field1, field2;
I had a similar issue. You should be able to get the correct result by multiplying by 1.0 instead of 100; note that this returns a fraction (for example 0.05) rather than a percentage.
Select Grade, (Count(Grade)* 1.0 / (Select Count(*) From MyTable)) as Score From MyTable Group By Grade

find MIN without using min()

I am trying to find the student who has the minimum score, which is the result of the query below. However, I was asked to write the query without using MIN(). I have spent several hours on it but can't find an alternative solution :'(.
select s.sname
from student s
where s.score =
(select min(s2.score)
from score s2)
This is one way, which will work even if two students have same lowest score.
SELECT distinct s1.sname
FROM student s1
LEFT JOIN student s2
ON s2.score < s1.score
WHERE s2.score IS NULL
The method below uses LIMIT; it returns the student with the lowest score, but only one of them if several students share that score.
select sname
from student
order by score asc
limit 1
Here's a possible alternative to the JOIN approach:
select sname from student where score in
(select score from student order by score asc limit 1)
create table student (name varchar(10), score int);
insert into student (name, score) values('joe', 30);
insert into student (name, score) values('jim', 88);
insert into student (name, score) values('jack', 22);
insert into student (name, score) values('jimbo', 15);
insert into student (name, score) values('jo bob',15);
/* folks with lowest score */
select name, score from student where not exists(select 1 from student s where s.score < student.score);
/* the actual lowest score */
select distinct score from student
where not exists(select 1 from student s where s.score < student.score);
Note that not exists can be brutally inefficient, but it'll do the job on a small set.
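To confirm that the NOT EXISTS form and the LEFT JOIN anti-join return the same students, including ties, here is a small sketch. It assumes Python's built-in sqlite3 module as the database and reuses the sample rows above; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("joe", 30), ("jim", 88), ("jack", 22), ("jimbo", 15), ("jo bob", 15)],
)

# Keep a row only if no other row has a strictly lower score.
by_not_exists = conn.execute(
    """
    SELECT o.name, o.score
    FROM student o
    WHERE NOT EXISTS (SELECT 1 FROM student s WHERE s.score < o.score)
    ORDER BY o.name
    """
).fetchall()

# Anti-join: rows with no match on "someone scored lower" have NULL on the right.
by_anti_join = conn.execute(
    """
    SELECT s1.name, s1.score
    FROM student s1
    LEFT JOIN student s2 ON s2.score < s1.score
    WHERE s2.score IS NULL
    ORDER BY s1.name
    """
).fetchall()

print(by_not_exists, by_anti_join)
```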
One way of doing it would be to order the results in ascending order and take the first row.
A more generic solution is needed if a student can have more than one mark: in that case you need to find the total marks for each student and then find the student with the lowest total.
This is the first scenario, where a student has only one row in the table.
CREATE TABLE Student
(
SLNO INT,
MARKS FLOAT,
NAME NVARCHAR(MAX)
)
INSERT INTO Student VALUES(1, 80, 't1')
INSERT INTO Student VALUES(2, 90, 't2')
INSERT INTO Student VALUES(3, 76, 't3')
INSERT INTO Student VALUES(4, 98, 't4')
INSERT INTO Student VALUES(5, 55, 't5')
SELECT TOP 1 * From Student ORDER BY MARKS ASC
In the second scenario, a student has multiple rows in the table, so we insert two more rows for existing students.
Then we sum the marks grouped by name and order the results by the total:
INSERT INTO Student VALUES(6, 55, 't1')
INSERT INTO Student VALUES(6, 90, 't5')
SELECT SUM(MARKS) AS TOTAL, NAME FROM Student
GROUP BY NAME
ORDER BY TOTAL
Hope the above is what you are looking for.
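The second scenario can be reduced to a single row by limiting the ordered totals. The sketch below assumes Python's built-in sqlite3 module, so LIMIT 1 replaces SQL Server's TOP 1, and it reuses the sample rows above; lowest_total is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (SLNO INTEGER, MARKS REAL, NAME TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        (1, 80, "t1"), (2, 90, "t2"), (3, 76, "t3"), (4, 98, "t4"), (5, 55, "t5"),
        (6, 55, "t1"), (6, 90, "t5"),  # extra marks for existing students
    ],
)

# Total marks per student, smallest total first, keep only the first row.
lowest_total = conn.execute(
    """
    SELECT NAME, SUM(MARKS) AS TOTAL
    FROM Student
    GROUP BY NAME
    ORDER BY TOTAL ASC
    LIMIT 1
    """
).fetchone()
print(lowest_total)
```

As with any LIMIT 1 approach, only one student is returned if several share the lowest total.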
You can try a stored procedure to find the student with the minimum score.