PostgreSQL: Adjust column value based on criteria

Imagine the following data:
student category exam_id adjusted_category
Carl A 44 A
Carl A 55 A
Carl A 88 A
Carl A 1 A
Carl A 2 A
Carl A 3 A
Carl B 1 B
Carl B 2 B
Carl B 3 B
John C 100 C
John C 200 C
John C 300 C
If, for the same student, both categories A and B occur and any of the specific exam_ids (44, 55, 88) is found, I'd like to adjust the category to A for all of that student's rows. Otherwise, keep the original category.
The output I'm aiming for is:
student adjusted_category
Carl A
John C
Code I'm currently attempting to modify:
with my_table (student, category, exam_id)
as (values
('Carl', 'A', 44),
('Carl', 'A', 55),
('Carl', 'A', 88),
('Carl', 'A', 1),
('Carl', 'A', 2),
('Carl', 'A', 3),
('Carl', 'B', 1),
('Carl', 'B', 2),
('Carl', 'B', 3),
('John', 'C', 100),
('John', 'C', 200),
('John', 'C', 300)
)
select *,
case
when category in ('A','B') and exam_id in (44, 55, 88) then 'A'
else category
end as adjusted_category
from my_table
The code above is not what I'm after because the adjusted category becomes A only on the rows where the exam_id is 44, 55, or 88. I'd like all of Carl's entries to have A as the adjusted category.
How can I achieve the desired output?

The bool_or() expression over a window partitioned by student returns true if any of that student's rows has an exam_id in (44, 55, 88). I may have misinterpreted your requirement; if the condition must hold for every one of the student's rows rather than for any one of them, change bool_or() to bool_and().
with my_table (student, category, exam_id)
as (values
('Carl', 'A', 44),
('Carl', 'A', 55),
('Carl', 'A', 88),
('Carl', 'A', 1),
('Carl', 'A', 2),
('Carl', 'A', 3),
('Carl', 'B', 1),
('Carl', 'B', 2),
('Carl', 'B', 3),
('John', 'C', 100),
('John', 'C', 200),
('John', 'C', 300)
)
select *,
case
when category in ('A','B')
and bool_or(exam_id in (44, 55, 88)) over (partition by student) then 'A'
else category
end as adjusted_category
from my_table;
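Since the desired output lists one row per student, the per-row result can be collapsed with DISTINCT. The following is a minimal sketch, assuming PostgreSQL and the same sample data; it reuses the bool_or() window test and is one possible way to get the summary, not the only one:

```sql
with my_table (student, category, exam_id)
as (values
('Carl', 'A', 44),
('Carl', 'A', 55),
('Carl', 'A', 88),
('Carl', 'A', 1),
('Carl', 'A', 2),
('Carl', 'A', 3),
('Carl', 'B', 1),
('Carl', 'B', 2),
('Carl', 'B', 3),
('John', 'C', 100),
('John', 'C', 200),
('John', 'C', 300)
)
select distinct
    student,
    case
        -- bool_or() is true on every row of a student who has at least
        -- one of the listed exam_ids, so all A and B rows become 'A'
        when category in ('A','B')
         and bool_or(exam_id in (44, 55, 88)) over (partition by student) then 'A'
        else category
    end as adjusted_category
from my_table
order by student;
```

Window functions are evaluated before DISTINCT, so the duplicates are removed only after every row has its adjusted category. With the sample data this should leave one row for Carl (A) and one for John (C).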

Related

Find the 3rd top selling item postgres

I have a transaction table, and I want to get the product ID that ranks 3rd highest for sales. Please note that there can be multiple transactions for an item, and one transaction can have a qty greater than one, so I want to find which product id has the 3rd highest total qty.
I think I need to use something like rank, but my query returns odd ranks and I'm not sure what's wrong.
select distinct t.product_id,
sum(t.qty) over (partition by t.product_id) qty,
rank() over(partition by t.product_id order by t.qty desc) rnk
from transaction t
order by rnk
CREATE TABLE IF NOT EXISTS "transaction"
(DATE_ID BIGINT NOT NULL,
STORE_ID INT NOT NULL,
TRANSACTION_TYPE_ID CHAR(1) NOT NULL,
PRODUCT_ID INT NOT NULL,
QTY INT NOT NULL);
INSERT INTO "transaction"
(DATE_ID, STORE_ID, TRANSACTION_TYPE_ID, PRODUCT_ID, QTY)
VALUES
(1, 1, 'A', 1, 2),
(1, 1, 'B', 1, 1),
(1, 2, 'A', 4, 1),
(1, 6, 'A', 3, 1),
(1, 1, 'A', 1, 1),
(2, 1, 'B', 1, 1),
(2, 1, 'A', 1, 1),
(2, 2, 'A', 2, 5),
(3, 2, 'A', 2, 7),
(3, 3, 'A', 2, 1),
(3, 3, 'B', 1, 15),
(3, 3, 'A', 1, 1),
(4, 4, 'A', 1, 1),
(4, 4, 'A', 1, 5),
(4, 4, 'A', 1, 11),
(4, 5, 'A', 3, 2),
(4, 6, 'A', 3, 1),
(4, 6, 'A', 3, 1),
(4, 6, 'B', 2, 1),
(5, 2, 'A', 2, 2),
(5, 2, 'B', 1, 1),
(5, 2, 'A', 2, 1),
(5, 2, 'A', 4, 1),
(5, 2, 'A', 5, 1),
(6, 2, 'B', 4, 1),
(6, 2, 'A', 6, 1),
(6, 3, 'A', 3, 5),
(7, 3, 'A', 2, 7),
(7, 4, 'A', 2, 1),
(7, 4, 'B', 2, 15),
(7, 4, 'A', 2, 1),
(7, 5, 'A', 2, 1),
(7, 5, 'A', 2, 5),
(7, 5, 'A', 2, 11),
(7, 6, 'A', 2, 2),
(7, 1, 'A', 2, 1),
(8, 1, 'A', 2, 1),
(8, 1, 'B', 2, 1),
(8, 3, 'A', 3, 2),
(9, 3, 'B', 3, 1),
(9, 3, 'A', 3, 1),
(9, 3, 'A', 3, 1),
(9, 3, 'A', 3, 1),
(10, 3, 'B', 3, 1),
(10, 3, 'A', 3, 1),
(10, 4, 'A', 4, 5),
(10, 4, 'A', 4, 7),
(10, 5, 'A', 5, 1),
(10, 5, 'B', 5, 15),
(10, 5, 'A', 5, 1),
(10, 6, 'A', 6, 1),
(10, 6, 'A', 6, 5),
(10, 6, 'A', 6, 11),
(10, 1, 'A', 1, 2),
(10, 2, 'A', 2, 1),
(11, 2, 'A', 2, 1),
(11, 2, 'B', 2, 1),
(11, 3, 'A', 5, 2),
(11, 3, 'B', 5, 1),
(11, 3, 'A', 5, 1),
(12, 3, 'A', 5, 1),
(12, 3, 'A', 5, 1);
Yet another option is:
aggregate the "qty" values per "product_id" (SUM(qty) with GROUP BY product_id)
compute a ranking over the summed quantity of each product_id (DENSE_RANK() OVER(ORDER BY SUM(qty) DESC))
order the output rows by whether this ranking value equals 3, so that the matching rows come first (DENSE_RANK() ... = 3 DESC)
keep only the first row under that ordering, together with any rows tied with it (FETCH FIRST 1 ROWS WITH TIES)
SELECT product_id
FROM "transaction"
GROUP BY product_id
ORDER BY DENSE_RANK() OVER(ORDER BY SUM(qty) DESC) = 3 DESC
FETCH FIRST 1 ROWS WITH TIES
select product_id
from (
select product_id
,dense_rank() over(order by sum(qty) desc) as d_rnk
from transaction
group by product_id
) t
where d_rnk = 3
product_id
5
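To see why product 5 is the answer, it can help to look at the totals and the ranks side by side. A small sketch, assuming PostgreSQL and the "transaction" table and rows defined in the question; total_qty and d_rnk are just illustrative aliases:

```sql
-- One row per product with its summed quantity and its rank by that sum.
-- DESC makes rank 1 the best seller; DENSE_RANK leaves no gaps after ties.
SELECT product_id,
       SUM(qty) AS total_qty,
       DENSE_RANK() OVER (ORDER BY SUM(qty) DESC) AS d_rnk
FROM "transaction"
GROUP BY product_id
ORDER BY d_rnk, product_id;
```

The query in the question uses PARTITION BY product_id, which restarts the ranking inside each product and ranks individual rows rather than totals. Ranking the grouped sums without a partition avoids that. With the sample rows, products 3 and 6 tie on their totals and therefore share a rank.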

Only assign values to entries relevant to condition

I have the following code
with my_table (id, student, category, score)
as (values
(1, 'Alex', 'A', 11),
(2, 'Alex', 'D', 4),
(3, 'Bill', 'A', 81),
(4, 'Bill', 'B', 1),
(5, 'Bill', 'D', 22),
(6, 'Carl', 'C', 5),
(7, 'Carl', 'D', 10)
)
select
id, student, category, score,
case when
max(score) filter (where category in ('A', 'B', 'C')) over (partition by student) >
min(score) filter (where category = 'D') over (partition by student)
then 'Review'
else 'Pass'
end as conclusion
from my_table
order by student, id
which outputs
id student category score conclusion
1 Alex A 11 Review
2 Alex D 4 Review
3 Bill A 81 Review
4 Bill B 1 Review
5 Bill D 22 Review
6 Carl C 5 Pass
7 Carl D 10 Pass
How can I edit it so that 'Review' is assigned only to the entries where A, B, or C is larger than D? In this case, the desired output would be:
id student category score conclusion
1 Alex A 11 Review
2 Alex D 4 Review
3 Bill A 81 Review
4 Bill B 1 Pass
5 Bill D 22 Review
6 Carl C 5 Pass
7 Carl D 10 Pass
For Bill, A>D so Review is assigned to it; B<D so Pass is assigned to it.
Following your logic, you can use a subquery to get each student's D score, then count the rows that exceed it and compare:
with my_table (id, student, category, score)
as (values
(1, 'Alex', 'A', 11),
(2, 'Alex', 'D', 4),
(3, 'Bill', 'A', 81),
(4, 'Bill', 'B', 1),
(5, 'Bill', 'D', 22),
(6, 'Carl', 'C', 5),
(7, 'Carl', 'D', 10)
)
SELECT id, student, category, score,
CASE WHEN
COUNT(*) filter (where D_Score<score) over (partition by student) = 0 OR score < D_Score
THEN 'Pass'
ELSE 'Review' END AS conclusion
FROM (
SELECT *,
min(score) filter (where category = 'D') over (partition by student) D_Score
FROM my_table
) t1
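The same rule can also be spelled out branch by branch, which some readers may find easier to follow. This is only a sketch, assuming PostgreSQL and the sample data above; d_score and max_other are illustrative column names, not part of the original query:

```sql
with my_table (id, student, category, score)
as (values
(1, 'Alex', 'A', 11),
(2, 'Alex', 'D', 4),
(3, 'Bill', 'A', 81),
(4, 'Bill', 'B', 1),
(5, 'Bill', 'D', 22),
(6, 'Carl', 'C', 5),
(7, 'Carl', 'D', 10)
)
select id, student, category, score,
    case
        -- an A, B, or C row is flagged when it beats the student's D score
        when category <> 'D' and score > d_score then 'Review'
        -- the D row is flagged when any other row of the student beats it
        when category = 'D' and max_other > d_score then 'Review'
        else 'Pass'
    end as conclusion
from (
    select *,
        min(score) filter (where category = 'D') over (partition by student) as d_score,
        max(score) filter (where category in ('A', 'B', 'C')) over (partition by student) as max_other
    from my_table
) t
order by student, id;
```

If a student has no D row, d_score is NULL, neither comparison is true, and every row of that student falls through to 'Pass'.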

How to complete this specific question/query using SQL and an inline view?

I'm currently working on a hotel project where I need to make queries with SQL but I'm stuck on one question.
The question is:
How many employees have made at least 2 bookings for at least 3 customers?
I have figured out that I need to use an inline view but I have not gone any further because I'm stuck on the next part.
This is the table in the database:
bookingid | int | primary key
bookingdate | date| -
numOfGuests | int | -
customerId | int | foreign key
employeeId | int | foreign key
bookingid | bookingdate | numOfGuests | customerId | employeeId
1 2016-01-25 4 2 2
2 2016-06-12 1 3 2
3 2016-12-05 1 2 2
4 2016-04-01 2 3 2
5 2016-11-01 3 2 3
6 2016-11-03 1 8 2
7 2017-06-02 6 2 2
8 2016-02-07 2 8 2
9 2016-12-25 2 4 5
10 2017-06-21 1 10 2
11 2016-08-12 2 10 2
... ... ... ... ...
So does anyone know how to answer this question with a SQL query using an inline view?
The result I want is the number of employeeIds that satisfy the conditions of the question. Result based on sample data:
CountOfemployeeID |
1
Please check this script:
SELECT COUNT(DISTINCT C.employeeId)
FROM
(
SELECT A.employeeId,B.customerid,COUNT(B.bookingid) T
FROM (
--Employees who booked for at least 3 distinct customers
SELECT employeeId,COUNT(DISTINCT customerid) customerid
FROM Table1
GROUP BY employeeId
HAVING COUNT(DISTINCT customerid)> 2
)A
--Join back to the bookings so they can be counted per employee and customer
INNER JOIN (
SELECT bookingid,bookingdate,numOfGuests,customerId,employeeId
FROM Table1
) B
ON A.employeeId = B.employeeId
GROUP BY A.employeeId,B.customerid
HAVING COUNT(B.bookingid) > 1
)C
create table #userData(
bookingid int,
bookingdate date,
numOfGuests int,
customerId int,
employeeId int
)
insert into #userData
values
(1, '2016-01-25', 4, 2, 2),
(2, '2016-06-12', 1, 3, 3),
(3, '2016-12-05', 1, 2, 4),
(4, '2016-04-01', 2, 2, 3),
(5, '2016-11-12', 3, 2, 3),
(6, '2017-01-15', 1, 5, 5),
(6, '2017-01-15', 1, 5, 5),
(6, '2017-01-15', 1, 5, 5),
(6, '2017-01-15', 1, 5, 5),
(6, '2017-01-15', 1, 5, 5),
(6, '2017-01-15', 1, 5, 5),
(1, '2016-01-25', 4, 2, 2),
(2, '2016-06-12', 1, 3, 3),
(3, '2016-12-05', 1, 2, 4),
(4, '2016-04-01', 2, 2, 3),
(5, '2016-11-12', 3, 2, 3),
(6, '2017-01-15', 1, 2, 5),
(6, '2017-01-15', 1, 2, 5),
(6, '2017-01-15', 1, 3, 5),
(6, '2017-01-15', 1, 3, 5),
(6, '2017-01-15', 1, 4, 5),
(6, '2017-01-15', 1, 4, 5),
(1, '2016-01-25', 4, 2, 2),
(2, '2016-06-12', 1, 3, 3),
(3, '2016-12-05', 1, 2, 4),
(4, '2016-04-01', 2, 2, 3),
(5, '2016-11-12', 3, 2, 3),
(6, '2017-01-15', 1, 1, 5),
(6, '2017-01-15', 1, 2, 5),
(6, '2017-01-15', 1, 3, 5),
(6, '2017-01-15', 1, 4, 5),
(6, '2017-01-15', 1, 7, 5),
(6, '2017-01-15', 1, 6, 5),
(1, '2016-01-25', 4, 3, 2),
(1, '2016-01-25', 4, 3, 2),
(1, '2016-01-25', 4, 1, 2),
(1, '2016-01-25', 4, 1, 2)
select * from #userData
; with CTE as
(
select count(customerId) count, customerId, employeeId from #userData
group by customerId, employeeid having count(customerid) >= 2
), cte2 as
(
Select employeeId from CTE group by Employeeid having count(employeeId) >= 3
)
select a.count, a.customerId, a.employeeId from CTE as a
inner join CTE2 as b on a.employeeId = b.employeeId
OUTPUT
count customerId employeeId
2 1 2
3 2 2
2 3 2
3 2 5
3 3 5
3 4 5
6 5 5
If you need only the employeeId, replace the final select with:
Select employeeId from CTE2
output
Employeeid
2
5
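Because the question asks how many employees qualify, the two grouping steps can also be nested as inline views and counted. This is a sketch under assumptions: it inlines the eleven sample bookings from the question through a VALUES list (only the columns needed here), and the aliases b, pairs, and qualifying are made up for illustration:

```sql
SELECT COUNT(*) AS CountOfemployeeID
FROM (
    -- step 2: employees that have at least 3 such customers
    SELECT employeeId
    FROM (
        -- step 1: (employee, customer) pairs with at least 2 bookings
        SELECT employeeId, customerId
        FROM (VALUES
            (1, 2, 2), (2, 3, 2), (3, 2, 2), (4, 3, 2),
            (5, 2, 3), (6, 8, 2), (7, 2, 2), (8, 8, 2),
            (9, 4, 5), (10, 10, 2), (11, 10, 2)
        ) AS b (bookingid, customerId, employeeId)
        GROUP BY employeeId, customerId
        HAVING COUNT(*) >= 2
    ) AS pairs
    GROUP BY employeeId
    HAVING COUNT(*) >= 3
) AS qualifying;
```

Against a real table, the VALUES list would be replaced by the table name. On the eleven sample rows only employee 2 has three or more customers with at least two bookings each, so the count should be 1, matching the expected result in the question.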

How to get sum of a column values based on a specific value in another column which exists in different row

Here are the data details:
ID NAME STRVALUE INTVALUE
1 d_n d1 (null)
2 d_n d1 (null)
3 d_n d1 (null)
3 u_n u1 (null)
3 s_l 4 4
3 p_n A (null)
4 d_n d2 (null)
5 d_n d2 (null)
6 d_n d2 (null)
6 u_n u2 (null)
6 s_l 5 5
6 p_n A (null)
7 d_n d1 (null)
8 d_n d1 (null)
9 d_n d1 (null)
9 u_n u1 (null)
9 s_l 5 5
9 p_n A (null)
10 d_n d1 (null)
In my data, for each ID I have multiple rows. Each row has name, strvalue and intvalue.
Could someone please help me get the output in the following format?
STRVALUE INTVALUE_SUM
d1 9
d2 5
Query
SELECT
strvalue,
SUM(intvalue) AS `Sum`
FROM
d
WHERE id IN
(SELECT
id
FROM
e
WHERE TYPE = 'a_e'
AND USER = 1)
GROUP BY id
Creating the test data:
CREATE TABLE #Test
(
ID INT,
NAME VARCHAR(10),
STRVALUE VARCHAR(10),
INTVALUE INT NULL
)
INSERT INTO #Test
(ID, NAME, STRVALUE, INTVALUE)
VALUES
(1, 'd_n', 'd1', NULL),
(2, 'd_n', 'd1', null),
(3, 'd_n', 'd1', null),
(3, 'u_n', 'u1', null),
(3, 's_l', '4', 4),
(3, 'p_n', 'A', null),
(4, 'd_n', 'd2', null),
(5, 'd_n', 'd2', null),
(6, 'd_n', 'd2', null),
(6, 'u_n', 'u2', null),
(6, 's_l', '5', 5),
(6, 'p_n', 'A', null),
(7, 'd_n', 'd1', null),
(8, 'd_n', 'd1', null),
(9, 'd_n', 'd1', null),
(9, 'u_n', 'u1', null),
(9, 's_l', '5', 5),
(9, 'p_n', 'A', null),
(10, 'd_n', 'd1', null);
From the output you expect, I guess you want to match each d_n row with the s_l row that has the same ID, then sum the s_l values per d_n STRVALUE. Here is the query:
SELECT d1.STRVALUE, SUM(d2.INTVALUE) AS INTVALUE_SUM FROM #Test d1
INNER JOIN #Test d2
ON d1.ID = d2.ID
AND d1.NAME = 'd_n'
AND d2.NAME = 's_l'
GROUP BY D1.STRVALUE
Here is the output:
STRVALUE INTVALUE_SUM
d1 9
d2 5
Now just adjust the query to pull the correct ids based on the subquery you posted, and you should get your sums.

SQL: Find filled area [duplicate]

This question already has answers here:
Finding filled rectangles given x, y coordinates using SQL
(2 answers)
Closed 8 years ago.
I have this table:
CREATE TABLE coordinates (
x INTEGER NOT NULL,
y INTEGER NOT NULL,
color VARCHAR(1) NOT NULL,
PRIMARY KEY(x,y)
)
Here are some sample data:
INSERT INTO coordinates
(x, y, color)
VALUES
(0, 4, 'g'),
(1, 0, 'g'),
(1, 1, 'g'),
(1, 2, 'g'),
(1, 3, 'g'),
(1, 4, 'g'),
(2, 0, 'b'),
(2, 1, 'g'),
(2, 2, 'g'),
(2, 3, 'g'),
(2, 4, 'g'),
(4, 0, 'b'),
(4, 1, 'r'),
(4, 2, 'r'),
(4, 3, 'g'),
(4, 4, 'g'),
(6, 0, 'r'),
(6, 1, 'g'),
(6, 2, 'g'),
(6, 3, 'r'),
(6, 4, 'r')
;
I am trying to write a query that finds all the largest-area rectangles. This assumes a rectangle is defined by its bottom-left and top-right corners, and that 1/4 of it is r, 1/4 is b, 1/4 is g, and 1/4 is y.
So the result should look something like this:
x1 | y1 | x2 | y2 | area
-------------------------
0 1 6 9 58
1 2 4 7 58
Make a function that calculates the area and then query the function to get the biggest one.
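One way to read that suggestion is sketched below. It assumes PostgreSQL 13 or later (for FETCH FIRST ... WITH TIES), measures area in grid cells, and treats a rectangle as filled when the table has a row for every cell inside it. The function name rect_area and the aliases bl, tr, and c are hypothetical, and the colour condition from the question is deliberately left out:

```sql
-- Hypothetical helper: area of an axis-aligned rectangle counted in grid
-- cells, so the corners (1,1) and (2,3) give 2 * 3 = 6.
CREATE FUNCTION rect_area(x1 integer, y1 integer, x2 integer, y2 integer)
RETURNS integer
LANGUAGE sql IMMUTABLE
AS $$ SELECT (x2 - x1 + 1) * (y2 - y1 + 1) $$;

-- Pair every point (as bottom-left corner) with every point above and to
-- the right of it (as top-right corner), keep the pairs whose rectangle
-- has a row in the table for each of its cells, and return the largest.
SELECT bl.x AS x1, bl.y AS y1, tr.x AS x2, tr.y AS y2,
       rect_area(bl.x, bl.y, tr.x, tr.y) AS area
FROM coordinates AS bl
JOIN coordinates AS tr
  ON tr.x >= bl.x AND tr.y >= bl.y
WHERE rect_area(bl.x, bl.y, tr.x, tr.y) = (
        SELECT count(*)
        FROM coordinates AS c
        WHERE c.x BETWEEN bl.x AND tr.x
          AND c.y BETWEEN bl.y AND tr.y)
ORDER BY area DESC
FETCH FIRST 1 ROWS WITH TIES;
```

The filled check relies on the primary key on (x, y): since no cell can appear twice, the row count inside the rectangle equals the number of occupied cells. A colour rule could be added as a further condition on the rows matched by the inner query.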