Related
I'm pretty new to SQL, but Excel has become far too slow to continue working with, so I'm trying SQLiteStudio. I'm looking to create a column in a query showing running total over time (characterized as Schedule Points, marking each percent through a project's run time). Complete marks whether a Location has completed the install (Y/NULL), and is used simply to filter out incomplete locations from further calculations.
I currently have
With cte as(
Select [Location]
,[HW/NonHW]
,[Obligation/Actual]
,[Schedule Point]
,[CY20$]
,[Vendor Name]
,[Vendor Zip Code]
,[Complete]
,[System Rollup (Import)]
,IIf([Complete] = "Y", [CY20$], 0) As [Completed Costs]
FROM data)
Select [Location]
,[HW/NonHW]
,[Obligation/Actual]
,[Schedule Point]
,[CY20$]
,[Vendor Name]
,[Vendor Zip Code]
,[Complete]
,[System Rollup (Import)]
,[Completed Costs]
,SUM([Completed Costs]) OVER (PARTITION BY [Obligation/Actual], [Normalized Schedule Location 1%],[System Rollup (Import)], [HW/NonHW]) As [CY20$ Summed]
FROM cte
At this point, what I'm looking to do is a sum not for each Schedule Point, but all prior Schedule Points (i.e. the <= operator in an Excel sumifs statement)
For reference, here is the sumifs I am trying to replicate:
=SUMIFS($N$2:$N$541790,$AU$2:$AU$541790,"Y",$AQ$2:$AQ$541790,AQ2,$AI$2:$AI$541790,AI2,$AH$2:$AH$541790,AH2,$AJ$2:$AJ$541790, "<=" & AJ2)
N is CY20$, AU is Complete, AQ is System, AI is Obligation/Actual, AH is HW/NonHW, AJ is Schedule Point.
Any help would be appreciated!
The equivalent to SUMIFS is a combination of SUM and CASE-WHEN in SQL.
Abstract example:
SELECT
SUM(
CASE
WHEN <condition1> AND <condition2> AND <condition3> THEN 1
ELSE 0
END
)
FROM yourtable
In the above, condition1, condition2 and condition3 are logical expressions, the < and > are just notifying that you have an expression there, it is not part of the syntax. It is also unnecessary to have exactly 3 conditions, you can have as many as you like. Also, it is unnecessary to use AND as the operator, you can construct your own expression as you like. The reason for which I have used the AND operator was that you intend to have a disjunction, presumably, based on the fact that you used SUMIFS.
A more concrete example:
CREATE TABLE person(
number int,
name text,
age int
);
INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);
SELECT
SUM(
CASE
WHEN age <= 16 AND name <> 'Joe' THEN 1
ELSE 0
END
) AS MySUMIFS
FROM person;
EDIT
If we are interested to know how many people have a smaller age than the current person, then we can do a join:
SELECT
SUM(
CASE
WHEN p2.age <= p1.age THEN 1
ELSE 0
END
) AS MySUMIFS, name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;
EDIT2
Created a Fiddle based on the ideas described above, you can reach it at https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=3cb0232e5d669071a3aa5bb1df68dbca
The code in the fiddle:
CREATE TABLE person(
number int,
name text,
age int
);
INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);
SELECT
SUM(
CASE
WHEN p2.age <= p1.age THEN 1
ELSE 0
END
) AS MySUMIFS, p1.name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;
I have the following two tables:
CREATE TABLE empleados (
id INTEGER PRIMARY KEY,
nombre VARCHAR(255) NOT NULL,
gerenteId INTEGER,
FOREIGN KEY (gerenteId) REFERENCES empleados(id)
);
CREATE TABLE ventas (
id INTEGER PRIMARY KEY,
empleadoId INTEGER NOT NULL,
valorOrden INTEGER NOT NULL,
FOREIGN KEY (empleadoId) REFERENCES empleados(id)
);
With the following data:
INSERT INTO empleados(id, nombre, gerenteId) VALUES(1, 'Roberto', null);
INSERT INTO empleados(id, nombre, gerenteId) VALUES(2, 'Tomas', null);
INSERT INTO empleados(id, nombre, gerenteId) VALUES(3, 'Rogelio', 1);
INSERT INTO empleados(id, nombre, gerenteId) VALUES(4, 'Victor', 3);
INSERT INTO empleados(id, nombre, gerenteId) VALUES(5, 'Johnatan', 4);
INSERT INTO empleados(id, nombre, gerenteId) VALUES(6, 'Gustavo', 2);
INSERT INTO ventas(id, empleadoId, valorOrden) VALUES(1, 3, 400);
INSERT INTO ventas(id, empleadoId, valorOrden) VALUES(2, 4, 3000);
INSERT INTO ventas(id, empleadoId, valorOrden) VALUES(3, 5, 3500);
INSERT INTO ventas(id, empleadoId, valorOrden) VALUES(4, 2, 40000);
INSERT INTO ventas(id, empleadoId, valorOrden) VALUES(5, 6, 3000);
I'm trying to get a query to obtain the sum of all the "Orders" which belong directly or inderectly to the main managers. The main managers are the ones whose doesn't report to anybody else. In this calse, Roberto and Tomas are the main managers but there could be other ones. The result must to take into account not just the sales (ventas) made directly by him but also by any of their employees (direct employees or employees of their employees).
So in this case I'm expecting the following result:
-- Id TotalVentas
-- ----------------
-- 1 6900
-- 2 43000
Where the Id column refers to the employees' id which are "main" managers and TotalVentas column is the sum of all the ventas (valorOrden) made by them and their employees.
So Roberto has no records for orders but Rogelio (his employee) has one of 400, Victor (Rogelio's employee) has one for 3000 and Johnatan (Victor's employee) has another for 3500. So the sum of all of them is 6900. And it is the same case with Tomas which has one venta directly made by him plus another one made by Gustavo who is his employee.
The query that I have so far is the following:
WITH cte_org AS (
SELECT
id,
nombre,
gerenteId,
0 as EmpLevel
FROM
dbo.empleados
WHERE gerenteId IS NULL
UNION ALL
SELECT
e.id,
e.nombre,
e.gerenteId,
o.EmpLevel + 1
FROM
dbo.empleados e
INNER JOIN cte_org o
ON o.id = e.gerenteId
WHERE e.gerenteId IS NOT NULL
)
SELECT cte.id, SUM(s.orderValue)
FROM cte_org cte, dbo.sales s
WHERE (cte.id = s.employeeId AND cte.gerenteId is null)
OR
(cte.id = s.employeeId AND cte.EmpLevel <> 0 AND
cte.gerenteId in (select ee.id from dbo.empleados ee where ee.gerenteId is null)
)
--AND
--(cte.gerenteId in (select ee.id from dbo.empleados ee where ee.gerenteId is null)
--OR
--cte.gerenteId is null)
--AND cte.gerenteId = NULL
group by cte.id
;
Could anybody help me with this?
This is traversing a hierarchy, starting with the highest level managers, and then joining in the sales:
with cte as (
select id, nombre, id as manager
from empleados e
where gerenteid is null
union all
select e.id, e.nombre, cte.manager
from cte join
empleados e
on cte.id = e.gerenteid
)
select cte.manager, sum(valororden)
from cte join
ventas v
on cte.id = v.empleadoid
group by cte.manager;
Here is a db<>fiddle. The Fiddle uses SQL Server, because that is consistent with the syntax that you are using.
In oracle you can do this using below query:
SQLFiddle
select manager, sum(amount) as total_amount from
(
select level, CONNECT_BY_ROOT employeeid Manager, a.* from
(
SELECT a.id as employeeid, a.nombre as name , a.GERENTEID as manager_id,
b.EMPLEADOID as sales_id, b.VALORORDEN amount
from empleados a left outer join ventas b
on (a.id = b.empleadoId)
) a
start with manager_id is null
connect by prior employeeid = manager_id) x
group by manager;
The prompt is to form a SQL query.
That finds the students name and ID who attend all lectures having ects more than 4.
The tables are
CREATE TABLE CLASS (
STUDENT_ID INT NOT NULL,
LECTURE_ID INT NOT NULL
);
CREATE TABLE STUDENT (
STUDENT_ID INT NOT NULL,
STUDENT_NAME VARCHAR(255),
PRIMARY KEY (STUDENT_ID)
)
CREATE TABLE LECTURE (
LECTURE_ID INT NOT NULL,
LECTURE_NAME VARCHAR(255),
ECTS INT,
PRIMARY KEY (LECTURE_ID)
)
I came up with this query but this didn't seem to work on SQLFIDDLE. I'm new to SQL and this query has been a little troublesome for me. How would you query this?
SELECT STUD.STUDENT_NAME FROM STUDENT STUD
INNER JOIN CLASS CLS AND LECTURE LEC ON
CLS.STUDENT_ID = STUD.STUDENT_ID
WHERE LEC.CTS > 4
How do I fix this query?
UPDATE
insert into STUDENT values(1, 'wick', 20);
insert into STUDENT values(2, 'Drake', 25);
insert into STUDENT values(3, 'Bake', 42);
insert into STUDENT values(4, 'Man', 5);
insert into LECTURE values(1, 'Math', 6);
insert into LECTURE values(2, 'Prog', 6);
insert into LECTURE values(3, 'Physics', 1);
insert into LECTURE values(4, '4ects', 4);
insert into LECTURE values(5, 'subj', 4);
insert into SCLASS values(1, 3);
insert into SCLASS values(1, 2);
insert into SCLASS values(2, 3);
insert into SCLASS values(3, 1);
insert into SCLASS values(3, 2);
insert into SCLASS values(3, 3);
insert into SCLASS values(4, 4);
insert into SCLASS values(4, 5);
The following approach might get the job done.
It works by generating two subqueries :
one that counts how many lectures whose ects is greater than 4 were taken by each user
another that just counts the total number of lectures whose ects is greater than 4
Then, the outer query filters in users whose count reaches the total :
SELECT x.student_id, x.student_name
FROM
(
SELECT s.student_id, s.student_name, COUNT(DISTINCT l.lecture_id) cnt
FROM
student s
INNER JOIN class c ON c.student_id = s.student_id
INNER JOIN lecture l ON l.lecture_id = c.lecture_id
WHERE l.ects > 4
GROUP BY s.student_id, s.student_name
) x
CROSS JOIN (SELECT COUNT(*) cnt FROM lecture WHERE ects > 4 ) y
WHERE x.cnt = y.cnt ;
As GMB already said in their answer: count required lections and compare with those taken per student. Here is another way to write such query. We outer join classes to all lectures with ECTS > 4. Analytic window functions allow us to aggregate by two different groups at the same time (here: all rows and student's rows).
select *
from student
where (student_id, 0) in -- 0 means no gap between required and taken lectures
(
select
student_id,
count(distinct lecture_id) over () -
count(distinct lecture_id) over (partition by c.student_id) as gap
from lecture l
left join class c using (lecture_id)
where l.ects > 4
);
Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=74371314913565243863c225847eb044
You can try the following query.
SELECT distinct
STUD.STUDENT_NAME,
STUD.STUDENT_ID
FROM STUDENT STUD
INNER JOIN CLASS CLS ON CLS.STUDENT_ID = STUD.STUDENT_ID
INNER JOIN LECTURE LEC ON LEC.LECTURE_ID=CLS.LECTURE_ID
where LEC.ECTS > 4 group by STUD.STUDENT_ID,STUD.STUDENT_NAME
having COUNT(STUD.STUDENT_ID) =(SELECT COUNT(*) FROM LECTURE WHERE ECTS > 4)
I hope everyone is fine and learning more. I need some advice on select statement and on its fine tuning. I am using Oracle 11gR2. Please find below table and data scripts.
create table employee (emp_id number, emp_name varchar2(50), manager_id number);
create table department (dept_id number, dept_name varchar2(50), emp_name varchar2(50), manager_level varchar2(20));
create table manager_lookup (manager_level_id number, manager_level varchar2(20));
insert into employee values (1, 'EmpA',3);
insert into employee values (2, 'EmpB',1);
insert into employee values (3, 'EmpC',1);
insert into employee values (4, 'EmpD',2);
insert into employee values (5, 'EmpE',1);
insert into employee values (6, 'EmpF',3);
insert into department values (1, 'DeptA','EmpD','Level3');
insert into department values (2, 'DeptB','EmpC','Level2');
insert into department values (3, 'DeptC','EmpA','Level1');
insert into department values (4, 'DeptD','EmpF','Level1');
insert into department values (5, 'DeptD','EmpA','Level3');
insert into department values (6, 'DeptA',NULL,'Level3');
insert into manager_lookup values (1, 'Level1');
insert into manager_lookup values (2, 'Level2');
insert into manager_lookup values (3, 'Level3');
commit;
Below query is returning me dept_id by passing some emp_name. I need those dept_id where manager_level is same as emp_name passed but don't need to have that same emp_name in result data set.
SELECT b.dept_id
FROM (SELECT DISTINCT manager_level
FROM department dpt
WHERE emp_name = 'EmpA'
and emp_name is not null) a,
department b
WHERE a.manager_level = b.manager_level
AND NVL (b.emp_name, 'ABC') <> 'EmpA';
Above query is returning me data set as below:
dept_id
--------
1
4
6
I want same result set but need to rewrite above query in a way to avoid department table scan two times. This is just sample query but in real time scanning big table two times is giving performance issues. I want to re-write this query in a way to perform better and avoid same table scan two times.
Can you please help on providing your wonderful suggestions or solutions? I will really appreciate all responses.
Thank you for going over the question.
If you want your query to be more efficient, then use indexes:
create index idx_department_name_level on department(emp_name, manager_level)
and
create index idx_department_name_level on department(manager_level, emp_name, dept_id)
also, you have a redundant null check which may be avoiding indexes...
SELECT b.dept_id
FROM (SELECT manager_level
FROM department dpt
WHERE emp_name = 'EmpA') a,
department b
WHERE a.manager_level = b.manager_level
AND NVL (b.emp_name, 'ABC') <> 'EmpA';
post your explain plan for more help
This should work:
SELECT a.*
FROM
(
SELECT d.*,
SUM(CASE WHEN emp_name = 'EmpA' THEN 1 ELSE 0 END)
OVER (PARTITION BY manager_level) AS hits
FROM department d
) a
WHERE hits > 0
AND NVL(emp_name, 'Dummy') <> 'EmpA'
ORDER BY dept_id
;
The query does the following:
Calculate how many times EmpA appears in a given manager_level
Keep all records with a manager_level that has at least one occurrence of EmpA in it
Excluding the EmpA records themselves
SQL Fiddle of the query in action: http://sqlfiddle.com/#!4/a9e03/7
You can verify that the execution plan contains only one full table scan.
I have an interesting query I'm trying to figure out. I have a view which is getting a column added to it. This column is pivoted data coming from other tables, to form into a single row. Now, I need to wipe out duplicate entries in this pivoted data. Listagg is great for getting the data to a single row, but I need to make it unique. While I know how to make it unique, I'm tripping up on the fact that correlated sub-queries only go 1 level deep. So... not really sure how to get a distinct list of values. I can get it to work if I don't do the distinct just fine. Anyone out there able to work some SQL magic?
Sample data:
drop table test;
drop table test_widget;
create table test (id number, description Varchar2(20));
create table test_widget (widget_id number, test_fk number, widget_type varchar2(20));
insert into test values(1, 'cog');
insert into test values(2, 'wheel');
insert into test values(3, 'spring');
insert into test_widget values(1, 1, 'A');
insert into test_widget values(2, 1, 'A');
insert into test_widget values(3, 1, 'B');
insert into test_widget values(4, 1, 'A');
insert into test_widget values(5, 2, 'C');
insert into test_widget values(6, 2, 'C');
insert into test_widget values(7, 2, 'B');
insert into test_widget values(8, 3, 'A');
insert into test_widget values(9, 3, 'C');
insert into test_widget values(10, 3, 'B');
insert into test_widget values(11, 3, 'B');
insert into test_widget values(12, 3, 'A');
commit;
Here is an example of the query that works, but shows duplicate data:
SELECT A.ID
, A.DESCRIPTION
, (SELECT LISTAGG (WIDGET_TYPE, ', ') WITHIN GROUP (ORDER BY WIDGET_TYPE)
FROM TEST_WIDGET
WHERE TEST_FK = A.ID) widget_types
FROM TEST A
Here is an example of what does NOT work due to the depth of where I try to reference the ID:
SELECT A.ID
, A.DESCRIPTION
, (SELECT LISTAGG (WIDGET_TYPE, ', ') WITHIN GROUP (ORDER BY WIDGET_TYPE)
FROM (SELECT DISTINCT WIDGET_TYPE
FROM TEST_WIDGET
WHERE TEST_FK = A.ID))
WIDGET_TYPES
FROM TEST A
Here is what I want displayed:
1 cog A, B
2 wheel B, C
3 spring A, B, C
If anyone knows off the top of their head, that would fantastic! Otherwise, I can post up some sample create statements to help you with dummy data to figure out the query.
You can apply the distinct in a subquery, which also has the join - avoiding the level issue:
SELECT ID
, DESCRIPTION
, LISTAGG (WIDGET_TYPE, ', ')
WITHIN GROUP (ORDER BY WIDGET_TYPE) AS widget_types
FROM (
SELECT DISTINCT A.ID, A.DESCRIPTION, B.WIDGET_TYPE
FROM TEST A
JOIN TEST_WIDGET B
ON B.TEST_FK = A.ID
)
GROUP BY ID, DESCRIPTION
ORDER BY ID;
ID DESCRIPTION WIDGET_TYPES
---------- -------------------- --------------------
1 cog A, B
2 wheel B, C
3 spring A, B, C
I was in a unique situation using the Pentaho reports writer and some inconsistent data. The Pentaho writer uses Oracle to query data, but has limitations. The data pieces were unique but not classified in a consistent manner, so I created a nested listagg inside of a left join to present the data the way I wanted to:
left join
(
select staff_id, listagg(thisThing, ' --- '||chr(10) ) within group (order by this) as SCHED_1 from
(
SELECT
staff_id, RPT_STAFF_SHIFTS.ORGANIZATION||': '||listagg(
RPT_STAFF_SHIFTS.DAYS_OF_WEEK
, ',' ) within group (order by BEGIN_DATE desc)
as thisThing
FROM "RPT_STAFF_SHIFTS" where "RPT_STAFF_SHIFTS"."END_DATE" is null
group by staff_id, organization)
group by staff_id
) schedule_1 on schedule_1.staff_id = "RPT_STAFF"."STAFF_ID"
where "RPT_STAFF"."STAFF_ID" ='555555'
This is a different approach than using the nested query, but it some situations it might work better by taking into account the level issue when developing the query and taking an extra step to fully concatenate the results.