I have below data
CREATE TABLE #EmployeeData
(
EmpID INT,
Designation VARCHAR(100),
Grade CHAR(1)
)
INSERT INTO #EmployeeData (EmpID, Designation, Grade)
VALUES (1, 'TeamLead', 'A'),
(2, 'Manager', 'B'),
(3, 'TeamLead', 'B'),
(4, 'SeniorTeamLead', 'A'),
(5, 'TeamLead', 'C'),
(6, 'Manager', 'C'),
(7, 'TeamLead', 'D'),
(8, 'SeniorTeamLead', 'B')
SELECT Designation,CASE WHEN COUNT(DISTINCT GRADE)>1 THEN 'MultiGrade' ELSE Grade END FROM
#EmployeeData
GROUP BY Designation
Desired result:
Designation Grade
--------------------------
Manager MultiGrade
TeamLead MultiGrade
SeniorTeamLead A
Note:
If designation has more than one grade then it is multigrade
If single grade is there then the particular grade
In case there is a combination with A and B then it should be A only
I tried with a query using case but I get this error:
Column '#EmployeeData.Grade' is invalid in the select list because it is not contained in either` an aggregate function or the GROUP BY clause.
Can anyone suggest the query to fetch the desired result?
As the error says, you need to aggregate the columns you are not grouping by. So use MAX and MIN (as Jeroen commented).
SELECT Designation
, CASE WHEN MAX(Grade) = 'B' AND MIN(Grade) = 'A' THEN 'A' WHEN MAX(Grade) <> MIN(Grade) THEN 'MultiGrade' ELSE MIN(Grade) END Grade
FROM #EmployeeData
GROUP BY Designation
ORDER BY Designation;
Your real world situation might be more complex, but the same principle applies.
Related
INSERT INTO customer_data
(cus_id, cust_name, age, city, salary)
VALUES
(1, 'Sam', 'Delhi',9000),
(2, 'ROhit', 'Banglore', 5000),
(3, 'Rahul', 'Mumbai', 12000),
(4, 'Sunny', 'Odisha',15000);
This is my Code
Your SQL query should contain same number of columns as same number of values
you column count in this
INSERT INTO customer_data (**cus_id**, **cust_name**, **age**, **city**, **salary**)
part of query is 5 whereas value count is 4 in this part
VALUES (**1**, **'Sam'**, **'Delhi'**,**9000**) ...
the values part is missing the age value
I have a 2 tables with the following structures:
Table_1: id name age
Table_2: id table_1_id city state
I need to perform bulk insert into Table 1:
INSERT INTO Table_1(name, age)
VALUES ('A', 12), ('B', 13), ('C', 14)
RETURNING id
The ids returned from the first insert are to be used in the below query:
INSERT INTO Table_2(table_1_id, city, state)
VALUES (first_row_id, 'Austin', 'Texas'),
(second_row_id, 'Dallas', 'Texas'),
(third_row_id, 'Houston', 'Texas')
How can I achieve the same using CTE or any other efficient way with least amount of code in Postgres?
I have to admit, I've never considered a construct like this, but for what it's worth I believe this will do what you seek in a single statement, unless I've missed something:
with ids as (
INSERT INTO Table_1(name, age)
VALUES ('A', 12), ('B', 13), ('C', 14)
RETURNING id
),
rowz as (
select id, row_number() over (order by id) as rn
from ids
)
insert into table_2 (table_1_id, city, state)
select
r.id, v.city, v.state
from
rowz r
join (values (1, 'Austin', 'Texas'), (2, 'Houston', 'Texas'), (3, 'Dallas', 'Texas')) v (id, city, state) on
r.rn = v.id
Just my $0.02, but I think it would be easier to follow and more scalable if instead you wrapped this in some code and did it that way (pick your favorite programming language or use PLPGSQL).
The flaw I see with this is ideally you want to provide this:
A 12 Austin Texas
B 13 Houston Texas
And have the script do the rest. Here you have to mess with the individual values, and unless it's always 3 you can't really take advantage of parameters (at least not easily).
I'm pretty new to SQL, but Excel has become far too slow to continue working with, so I'm trying SQLiteStudio. I'm looking to create a column in a query showing running total over time (characterized as Schedule Points, marking each percent through a project's run time). Complete marks whether a Location has completed the install (Y/NULL), and is used simply to filter out incomplete locations from further calculations.
I currently have
With cte as(
Select [Location]
,[HW/NonHW]
,[Obligation/Actual]
,[Schedule Point]
,[CY20$]
,[Vendor Name]
,[Vendor Zip Code]
,[Complete]
,[System Rollup (Import)]
,IIf([Complete] = "Y", [CY20$], 0) As [Completed Costs]
FROM data)
Select [Location]
,[HW/NonHW]
,[Obligation/Actual]
,[Schedule Point]
,[CY20$]
,[Vendor Name]
,[Vendor Zip Code]
,[Complete]
,[System Rollup (Import)]
,[Completed Costs]
,SUM([Completed Costs]) OVER (PARTITION BY [Obligation/Actual], [Normalized Schedule Location 1%],[System Rollup (Import)], [HW/NonHW]) As [CY20$ Summed]
FROM cte
At this point, what I'm looking to do is a sum not for each Schedule Point, but all prior Schedule Points (i.e. the <= operator in an Excel sumifs statement)
For reference, here is the sumifs I am trying to replicate:
=SUMIFS($N$2:$N$541790,$AU$2:$AU$541790,"Y",$AQ$2:$AQ$541790,AQ2,$AI$2:$AI$541790,AI2,$AH$2:$AH$541790,AH2,$AJ$2:$AJ$541790, "<=" & AJ2)
N is CY20$, AU is Complete, AQ is System, AI is Obligation/Actual, AH is HW/NonHW, AJ is Schedule Point.
Any help would be appreciated!
The equivalent to SUMIFS is a combination of SUM and CASE-WHEN in SQL.
Abstract example:
SELECT
SUM(
CASE
WHEN <condition1> AND <condition2> AND <condition3> THEN 1
ELSE 0
END
)
FROM yourtable
In the above, condition1, condition2 and condition3 are logical expressions, the < and > are just notifying that you have an expression there, it is not part of the syntax. It is also unnecessary to have exactly 3 conditions, you can have as many as you like. Also, it is unnecessary to use AND as the operator, you can construct your own expression as you like. The reason for which I have used the AND operator was that you intend to have a disjunction, presumably, based on the fact that you used SUMIFS.
A more concrete example:
CREATE TABLE person(
number int,
name text,
age int
);
INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);
SELECT
SUM(
CASE
WHEN age <= 16 AND name <> 'Joe' THEN 1
ELSE 0
END
) AS MySUMIFS
FROM person;
EDIT
If we are interested to know how many people have a smaller age than the current person, then we can do a join:
SELECT
SUM(
CASE
WHEN p2.age <= p1.age THEN 1
ELSE 0
END
) AS MySUMIFS, name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;
EDIT2
Created a Fiddle based on the ideas described above, you can reach it at https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=3cb0232e5d669071a3aa5bb1df68dbca
The code in the fiddle:
CREATE TABLE person(
number int,
name text,
age int
);
INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);
SELECT
SUM(
CASE
WHEN p2.age <= p1.age THEN 1
ELSE 0
END
) AS MySUMIFS, p1.name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;
I have a table like this
sub_id reference
1 A
1 A
1 A
1 A
1 A
1 A
1 C
2 B
2 B
3 D
3 D
I want to make sure all the references in each group have the same reference.
Meaning, for example, all references in:
group 1 should be A
group 2 should be B
group 3 should be D
If they are not, then I would like to have returned a list of sub_id's.
So for the table above my result would be: 1
Ideally, with these conditions reference would be in a separate table with sub_id as PK, but I need to fix first for a massive dataset before I can move on restructuring the database.
You could use the following method:
select t.sub_id
from YourTable t
group by t.sub_id
having max(t.reference) <> min(t.reference)
Change YourTable to suit.
Are you looking for simple aggregation ?
select sub_id
from table t
group by sub_id
having count(distinct reference) > 1;
The query you want:
SELECT sub_id
FROM test_sub
GROUP BY sub_id HAVING count(DISTINCT reference) > 1
;
Here is what I used to test it:
CREATE TABLE `test_sub` (
sub_id int(11) NOT NULL,
reference varchar(45) DEFAULT NULL
);
INSERT INTO test_sub (sub_id, reference) VALUES
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'A'),
(1, 'C'),
(2, 'B'),
(2, 'B'),
(3, 'D'),
(3, 'D'),
(3, 'D'),
(4, 'E'),
(4, 'E'),
(4, 'E'),
(5, 'F'),
(5, 'G')
;
I have a two tables
Student
--------
Id Name
1 John
2 David
3 Will
Grade
---------
Student_id Mark
1 A
2 B
2 B+
3 C
3 A
Is it possible to make native Postgresql SELECT to get results like below:
Name Array of marks
-----------------------
'John', {'A'}
'David', {'B','B+'}
'Will', {'C','A'}
But not like below
Name Mark
----------------
'John', 'A'
'David', 'B'
'David', 'B+'
'Will', 'C'
'Will', 'A'
Use array_agg: http://www.sqlfiddle.com/#!1/5099e/1
SELECT s.name, array_agg(g.Mark) as marks
FROM student s
LEFT JOIN Grade g ON g.Student_id = s.Id
GROUP BY s.Id
By the way, if you are using Postgres 9.1, you don't need to repeat the columns on SELECT to GROUP BY, e.g. you don't need to repeat the student name on GROUP BY. You can merely GROUP BY on primary key. If you remove the primary key on student, you need to repeat the student name on GROUP BY.
CREATE TABLE grade
(Student_id int, Mark varchar(2));
INSERT INTO grade
(Student_id, Mark)
VALUES
(1, 'A'),
(2, 'B'),
(2, 'B+'),
(3, 'C'),
(3, 'A');
CREATE TABLE student
(Id int primary key, Name varchar(5));
INSERT INTO student
(Id, Name)
VALUES
(1, 'John'),
(2, 'David'),
(3, 'Will');
What I understand you can do something like this:
SELECT p.p_name,
STRING_AGG(Grade.Mark, ',' ORDER BY Grade.Mark) As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;
EDIT
I am not sure. But maybe something like this then:
SELECT p.p_name,
array_to_string(ARRAY_AGG(Grade.Mark),';') As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;
Reference here
You could use the following:
SELECT Student.Name as Name,
(SELECT array(SELECT Mark FROM Grade WHERE Grade.Student_id = Student.Id))
AS ArrayOfMarks
FROM Student
As described here: http://www.mkyong.com/database/convert-subquery-result-to-array/
Michael Buen got it right. I got what I needed using array_agg.
Here just a basic query example in case it helps someone:
SELECT directory, ARRAY_AGG(file_name)
FROM table
WHERE type = 'ZIP'
GROUP BY directory;
And the result was something like:
| parent_directory | array_agg |
+-------------------------+----------------------------------------+
| /home/postgresql/files | {zip_1.zip,zip_2.zip,zip_3.zip} |
| /home/postgresql/files2 | {file1.zip,file2.zip} |
This post also helped me a lot: "Group By" in SQL and Python Pandas.
It basically says that it is more convenient to use only PSQL when possible, but that Python Pandas can be useful to achieve extra functionalities in the filtering process.