Which query is more performant using groupBy in subquery?

I have the following task:
I have a table columns and a table tasks. One column has many tasks, and each task belongs to one column.
First query:
SELECT id,
       name,
       color,
       created_at,
       CASE
           WHEN jt.tc IS NULL THEN 0
           ELSE jt.tc
       END
FROM columns AS c1
LEFT JOIN
  (SELECT count(*) AS tc,
          column_id
   FROM tasks AS t
   GROUP BY column_id) AS jt ON c1.id = jt.column_id
WHERE board_id = 'some id here';
In this case the jt subquery groups every record in the tasks table, so with a large amount of data in tasks it will be very slow.
Second query:
SELECT id,
       name,
       color,
       created_at,
       CASE
           WHEN jt.tc IS NULL THEN 0
           ELSE jt.tc
       END
FROM columns AS c1
LEFT JOIN
  (SELECT count(*) AS tc,
          column_id
   FROM tasks AS t
   LEFT JOIN columns c ON t.column_id = c.id
   WHERE c.board_id = 'some id here'
   GROUP BY column_id) AS jt ON c1.id = jt.column_id
WHERE board_id = 'some id here';
In this case the jt subquery contains only the columns I need, so the WHERE clause should cut down the rows being grouped considerably.
Am I right?
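Before comparing speed, it can help to confirm that both variants return the same rows. The following is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the board ids 'b1' and 'b2' are made up for illustration, and the select list is shortened to id, name and the count. It says nothing about speed on its own: whether the extra filter helps depends on the database's planner and indexes, so the EXPLAIN output on real data is the thing to compare.

```python
import sqlite3

# First query from the question: aggregate all tasks, then join.
QUERY_ALL = """
SELECT id, name,
       CASE WHEN jt.tc IS NULL THEN 0 ELSE jt.tc END AS task_count
FROM columns AS c1
LEFT JOIN (SELECT count(*) AS tc, column_id
           FROM tasks AS t
           GROUP BY column_id) AS jt ON c1.id = jt.column_id
WHERE board_id = ?
ORDER BY id
"""

# Second query: restrict the tasks to the board before grouping.
QUERY_FILTERED = """
SELECT id, name,
       CASE WHEN jt.tc IS NULL THEN 0 ELSE jt.tc END AS task_count
FROM columns AS c1
LEFT JOIN (SELECT count(*) AS tc, column_id
           FROM tasks AS t
           LEFT JOIN columns c ON t.column_id = c.id
           WHERE c.board_id = ?
           GROUP BY column_id) AS jt ON c1.id = jt.column_id
WHERE board_id = ?
ORDER BY id
"""


def make_sample_db():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE columns (id INTEGER PRIMARY KEY, name TEXT, board_id TEXT);
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, column_id INTEGER);
        INSERT INTO columns VALUES (1, 'todo', 'b1'), (2, 'done', 'b1'), (3, 'other', 'b2');
        INSERT INTO tasks VALUES (1, 1), (2, 1), (3, 3);
    """)
    return conn


def task_counts(conn, board_id, filtered):
    """Run one of the two variants and return (id, name, task_count) rows."""
    if filtered:
        return conn.execute(QUERY_FILTERED, (board_id, board_id)).fetchall()
    return conn.execute(QUERY_ALL, (board_id,)).fetchall()


if __name__ == "__main__":
    conn = make_sample_db()
    print(task_counts(conn, "b1", filtered=False))
    print(task_counts(conn, "b1", filtered=True))
```

On this sample data both calls return the same rows, including the column with no tasks, which is reported with a count of 0.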

Related

Optimize a complex PostgreSQL Query

I am attempting to make a complex SQL join on several tables, as shown below. I have also included an image of the DB schema.
Consider table_1 -
e_id  name
1     a
2     b
3     c
4     d
and table_2 -
e_id  date
1     1/1/2019
1     1/1/2020
2     2/1/2019
4     2/1/2019
The issue here is performance. From tables 2 to 4 we only want the most recent entry for a given e_id, but because these tables contain historical data (over ~3.5M rows) it's quite slow. I've attached an example of how we're currently trying to achieve this, but it only includes one join of 'table_1' with 'table_x'. We group by e_id and get the max date for it. The other approach we've considered is creating a materialized view, pulling data from that, and refreshing it after some period of time. Any improvements welcome.
from fds.region as rg
inner join (
select e_id, name, p_id
from fds.table_1
where sec_type = 'S' AND active_flag = 1
) as table_1 on table_1.e_id = rg.e_id
inner join fds.table_2 table_2 on table_2.e_id = rg.e_id
inner join fds.sec sec on sec.p_id = table_1.p_id
inner join fds.entity ent on ent.int_entity_id = sec.int_entity_id
inner join (
SELECT int_1.e_id, int_1.date, int_1.int_price
FROM fds.table_4 int_1
INNER JOIN (
SELECT e_id, MAX(date) date
FROM fds.table_2
GROUP BY e_id
) int_2 ON int_1.e_id = int_2.fsym_id AND int_1.date = int_2.date
) as table_4 on table_4.e_id = rg.e_id
where rg.region_str like '%US' and ent.sec_type = 'P'
order by table_2.int_price
limit 500;

You can simplify this logic:
(
SELECT int_1.e_id, int_1.date, int_1.int_price
FROM fds.table_4 int_1
INNER JOIN (
SELECT e_id, MAX(date) date
FROM fds.table_2
GROUP BY e_id
) int_2 ON int_1.e_id = int_2.fsym_id AND int_1.date = int_2.date
) as table_4
To:
(SELECT DISTINCT ON (int_1.e_id) int_1.*
FROM fds.table_4 int_1
ORDER BY int_1.e_id, int_1.date DESC
) table_4
This can take advantage of an index on fds.table_4(e_id, date desc) -- and might be wicked fast with such an index.
You also want appropriate indexes for the joins and filtering. However, it is hard to be more specific without an execution plan.
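To make the expected result of the "latest entry per e_id" step concrete, here is a small sketch using Python's built-in sqlite3 module. DISTINCT ON is PostgreSQL-specific and not available in SQLite, so the sketch swaps in the grouped MAX(date) join from the question instead. The sample rows, the ISO date strings and the index name are assumptions made for illustration.

```python
import sqlite3

# Latest row per e_id, written as a join against the grouped MAX(date).
# This is the portable form; PostgreSQL's DISTINCT ON is not used here.
LATEST_PER_E_ID = """
SELECT t.e_id, t.date, t.int_price
FROM table_4 AS t
INNER JOIN (SELECT e_id, MAX(date) AS max_date
            FROM table_4
            GROUP BY e_id) AS latest
        ON t.e_id = latest.e_id AND t.date = latest.max_date
ORDER BY t.e_id
"""


def make_sample_db():
    """Build an in-memory table_4 with made-up rows and ISO date strings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_4 (e_id INTEGER, date TEXT, int_price REAL);
        -- Same column order as the index suggested in the answer.
        CREATE INDEX table_4_e_id_date ON table_4 (e_id, date DESC);
        INSERT INTO table_4 VALUES
            (1, '2019-01-01', 10.0),
            (1, '2020-01-01', 12.5),
            (2, '2019-02-01', 7.0),
            (4, '2019-02-01', 3.0);
    """)
    return conn


def latest_rows(conn):
    """Return one (e_id, date, int_price) row per e_id with the newest date."""
    return conn.execute(LATEST_PER_E_ID).fetchall()


if __name__ == "__main__":
    for row in latest_rows(make_sample_db()):
        print(row)
```

One difference worth keeping in mind: when two rows for the same e_id share the latest date, this join form returns both of them, while DISTINCT ON keeps exactly one row per e_id.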

Condition on count of associated records in SQL

I have the following tables (with given columns):
houses (id)
users (id, house_id, active)
custom_values (name, house_id, type)
I want to get all the (distinct) houses, together with the count of their associated users, that:
have at least 1 associated custom_value whose name column contains the string 'red' (case insensitive) AND whose type column is 'mandatory'.
have at least 100 associated users whose status column is 'active'
How can I run this query in PostgreSQL?
Right now I have this query (which was answered in Get records where associated records name contain a string AND associated record count is bigger than threshold), but I don't know how to select the count of users too:
select h.*
from houses h
where
    exists (
        select 1
        from custom_values cv
        where cv.house_id = h.house_id and cv.type = 'mandatory' and lower(cv.name) = 'red'
    )
    and (
        select count(*)
        from users u
        where u.house_id = h.house_id and u.status = 'active'
    ) >= 100

You can turn the subquery to a lateral join:
select h.*, u.no_users
from houses h
cross join lateral (
    select count(*) no_users
    from users u
    where u.house_id = h.house_id and u.status = 'active'
) u
where
    u.no_users >= 100
    and exists (
        select 1
        from custom_values cv
        where cv.house_id = h.house_id and cv.type = 'mandatory' and lower(cv.name) = 'red'
    )

Case statement for join condition

Hi, I want to use a CASE statement in a WHERE condition, or some similar logic.
I want to ignore the WHERE condition when there are no rows in the tmp_collaboration table, and apply it when the table has rows.
Select clbid from tmp_collaboration;
select customer_id, product_id ,clbid
from customers c
where hdr_id = 10
and clbid in (select clbid from tmp_collaboration)
and status = 'y';

Is this what you want?
select customer_id, product_id ,clbid
from customers c
where hdr_id = 10 and status = 'y' and
(clbid in (select clbid from tmp_collaboration) or
not exists (select 1 from tmp_collaboration)
);

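Why not use a JOIN? When tmp_collaboration has rows, only the matching customers are returned. Note, however, that an INNER JOIN returns no rows at all when tmp_collaboration is empty, so it does not skip the condition in that case.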
select c.customer_id, c.product_id, c.clbid
from customers c
INNER JOIN tmp_collaboration t on t.clbid = c.clbid
AND c.hdr_id = 10
AND c.status = 'y';
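The two answers above behave differently when tmp_collaboration is empty, which is the case the question cares about. Below is a small sketch that runs both against SQLite through Python's built-in sqlite3 module. The customers rows and the clbid values 'A' and 'B' are made up for illustration.

```python
import sqlite3

# First answer: apply the filter only when tmp_collaboration has rows.
FILTER_OR_EMPTY = """
SELECT customer_id, product_id, clbid
FROM customers c
WHERE hdr_id = 10 AND status = 'y'
  AND (clbid IN (SELECT clbid FROM tmp_collaboration)
       OR NOT EXISTS (SELECT 1 FROM tmp_collaboration))
ORDER BY customer_id
"""

# Second answer: inner join against tmp_collaboration.
INNER_JOIN = """
SELECT c.customer_id, c.product_id, c.clbid
FROM customers c
INNER JOIN tmp_collaboration t ON t.clbid = c.clbid
       AND c.hdr_id = 10
       AND c.status = 'y'
ORDER BY c.customer_id
"""


def make_sample_db(collaboration_ids):
    """Build made-up customers and fill tmp_collaboration with the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER, product_id INTEGER,
                                clbid TEXT, hdr_id INTEGER, status TEXT);
        CREATE TABLE tmp_collaboration (clbid TEXT);
        INSERT INTO customers VALUES
            (1, 100, 'A', 10, 'y'),
            (2, 200, 'B', 10, 'y'),
            (3, 300, 'A', 11, 'y');
    """)
    conn.executemany("INSERT INTO tmp_collaboration VALUES (?)",
                     [(clbid,) for clbid in collaboration_ids])
    return conn


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for ids in ([], ["A"]):
        conn = make_sample_db(ids)
        print(ids, run(conn, FILTER_OR_EMPTY), run(conn, INNER_JOIN))
```

With an empty tmp_collaboration, the OR NOT EXISTS version returns every customer with hdr_id = 10 and status = 'y', while the join returns nothing. Once the table has rows, both return the same matches.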

Catch multiple types of data in SQL Server

I have a table (Task) like this:
Task Table
and I need answer like this:
TaskResult
I am doing the first query like this:
select
StudentID, AdmissionID, EnquiryID, EnquiryDetailsID
from
Task
where
TaskUser = 0 and BranchID = 1
If I get a StudentID, I run a second query in a loop to look up the student's first name and last name.
Else if I get an EnquiryID, I run a second query in a loop to look up the enquiry's first name and last name.
Else if I get an AdmissionID, I run a second query in a loop to look up the admission's first name and last name.
Else if I get an EnquiryDetailsID, I run a second query in a loop to look up the enquiry details' first name and last name.
So this creates a loop within a loop, and the page takes a long time to load.
I need to combine both queries into one query so the page loads quickly.
I only have two inputs: TaskUser and BranchID.
Please help me! Thanks in advance!

So - it looks like you have an oddly organized task table, and as a result, you're going to have to do mildly weird things to query right. According to your description, a row in the task table contains either a studentId, an admissionId, an enquiryId, or an enquiryDetailId. This isn't an optimal way to do this...but I understand that sometimes you have to get by with what you have.
So, to get the names, you have to join to the source of the names...and assuming they're all over the place, in related tables, you could do something like:
select
t.StudentID,t.AdmissionID,t.EnquiryID,t.EnquiryDetailsID,s.FirstName,s.LastName
from Task t inner join Student s on t.StudentID = s.Id
union all
select
t.StudentID,t.AdmissionID,t.EnquiryID,t.EnquiryDetailsID,a.FirstName,a.LastName
from Task t inner join Admission a on t.AdmissionID = a.Id
union all
select
t.StudentID,t.AdmissionID,t.EnquiryID,t.EnquiryDetailsID,e.FirstName,e.LastName
from Task t inner join Enquiry e on t.EnquiryID = e.Id
union all
select
t.StudentID,t.AdmissionID,t.EnquiryID,t.EnquiryDetailsID,d.FirstName,d.LastName
from Task t inner join EnquiryDetail d on t.EnquiryDetailsID = d.Id
...or, you can accomplish the same thing kinda inside-out:
select
    t.StudentID,
    t.AdmissionID,
    t.EnquiryID,
    t.EnquiryDetailsID,
    x.FirstName,
    x.LastName
from
    Task t
    inner join
    (
        select 's' source, Id, FirstName, LastName from Student union all
        select 'a' source, Id, FirstName, LastName from Admission union all
        select 'e' source, Id, FirstName, LastName from Enquiry union all
        select 'd' source, Id, FirstName, LastName from EnquiryDetail
    ) as x
    on
        ( t.StudentID = x.Id and x.source = 's' )
        or
        ( t.AdmissionID = x.Id and x.source = 'a' )
        or
        ( t.EnquiryID = x.Id and x.source = 'e' )
        or
        ( t.EnquiryDetailsID = x.Id and x.source = 'd' )
where
    t.TaskUser=0 and t.BranchID=1

Use LEFT JOIN with COALESCE like this:
--not tested
select StudentID, AdmissionID, EnquiryID, EnquiryDetailsID,
COALESCE(s.name, e.name, d.name, ed.name) as name, etc.
from Task t
left join student s on s.id = t.studentID
left join Enquiry e on e.id = t.EnquiryID
left join Admission d on d.id = t.AdmissionID
left join EnquiryDetails ed on ed.id = t.EnquiryDetailsID
where TaskUser=0 and BranchID=1
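The LEFT JOIN with COALESCE idea can be tried out on a toy schema. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The TaskID column, the FirstName and LastName columns on every lookup table, and all sample rows are assumptions, since the real table definitions are not shown in the question.

```python
import sqlite3

# One pass over Task; each row picks its name from whichever lookup matched.
TASK_NAMES = """
SELECT t.StudentID, t.AdmissionID, t.EnquiryID, t.EnquiryDetailsID,
       COALESCE(s.FirstName, a.FirstName, e.FirstName, ed.FirstName) AS FirstName,
       COALESCE(s.LastName, a.LastName, e.LastName, ed.LastName) AS LastName
FROM Task t
LEFT JOIN Student s ON s.Id = t.StudentID
LEFT JOIN Admission a ON a.Id = t.AdmissionID
LEFT JOIN Enquiry e ON e.Id = t.EnquiryID
LEFT JOIN EnquiryDetails ed ON ed.Id = t.EnquiryDetailsID
WHERE t.TaskUser = 0 AND t.BranchID = 1
ORDER BY t.TaskID
"""


def make_sample_db():
    """Build a hypothetical schema with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Task (TaskID INTEGER, StudentID INTEGER, AdmissionID INTEGER,
                           EnquiryID INTEGER, EnquiryDetailsID INTEGER,
                           TaskUser INTEGER, BranchID INTEGER);
        CREATE TABLE Student (Id INTEGER, FirstName TEXT, LastName TEXT);
        CREATE TABLE Admission (Id INTEGER, FirstName TEXT, LastName TEXT);
        CREATE TABLE Enquiry (Id INTEGER, FirstName TEXT, LastName TEXT);
        CREATE TABLE EnquiryDetails (Id INTEGER, FirstName TEXT, LastName TEXT);
        INSERT INTO Student VALUES (5, 'Ann', 'Lee');
        INSERT INTO Admission VALUES (7, 'Bob', 'Ray');
        INSERT INTO Enquiry VALUES (9, 'Cy', 'Fox');
        INSERT INTO Task VALUES
            (1, 5, NULL, NULL, NULL, 0, 1),
            (2, NULL, 7, NULL, NULL, 0, 1),
            (3, NULL, NULL, 9, NULL, 0, 2);
    """)
    return conn


def task_names(conn):
    """Return the task rows for TaskUser 0 and BranchID 1 with resolved names."""
    return conn.execute(TASK_NAMES).fetchall()


if __name__ == "__main__":
    for row in task_names(make_sample_db()):
        print(row)
```

Because each task row carries only one of the four ids, at most one join matches per row and COALESCE simply returns the name from that match. The third sample task is filtered out by its BranchID.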

Query tables and columns based on table data

Without knowing the names of a table and its columns in advance, I want to query the database to retrieve the table and column names and then query those tables.
I have an Oracle database schema that is like the following:
Item table:
Item_id, Item_type
==================
1, box
2, book
3, box
Book table:
Item_id, title, author
======================
2, 'C# Programer', 'Joe'
Box table:
Item_id, Size
=============
1, 'Large'
3, 'X Large'
Column_mapping table:
Item_type, column_name, display_order
=====================================
box, Size, 1
book, title, 1
book, author, 2
Table_mapping table:
Item_type, Table_name
=====================
box, Box
book, Book
I would like a SQL statement that would give something like the following results:
Item_id, Item_type, column1, column2
====================================
1, box, 'Large', <null>
2, book, 'C# Programer', 'Joe'
3, box, 'X Large', <null>
When I tried the simplified query
select *
from
(select Table_name
from Table_mapping
where Item_type = 'box')
where
Item_id = 1;
I get an error that Item_id is an invalid identifier,
and if I try
select *
from
(select Table_name
from Table_mapping
where Item_type = 'box');
I just get
Table_name
===========
Box
I am not sure how to proceed.

One way is to join both tables and then use COALESCE on the column that can contain data from either table:
SELECT
i.Item_id,
i.Item_type,
COALESCE(b.title, bx.size) column1,
b.author column2
FROM
Item i
LEFT JOIN Book b
ON i.item_id = b.item_id
LEFT JOIN Box bx
ON i.item_id = bx.item_id
Depending on how large your datasets are you may want to add a filter on the join e.g.
LEFT JOIN Book b
ON i.item_id = b.item_id
and i.item_type = 'book'
LEFT JOIN Box bx
ON i.item_id = bx.item_id
and i.item_type = 'box'
If you wanted to do something based on the data in table_mapping or column_mapping, you'd need to use dynamic SQL.
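As a rough illustration of what "dynamic SQL" means here, the sketch below reads Table_mapping and Column_mapping, builds the LEFT JOIN query as a string, and then runs it. It uses Python's built-in sqlite3 module instead of Oracle, so it shows the idea rather than an Oracle implementation. The helper names checked and build_query are hypothetical, and the identifier check is a simple assumption about what names are acceptable.

```python
import re
import sqlite3

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def checked(name):
    """Refuse anything that is not a plain identifier before splicing it into SQL."""
    if not IDENTIFIER.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def build_query(conn):
    """Build the joined query text and its parameters from the mapping tables."""
    tables = conn.execute(
        "SELECT Item_type, Table_name FROM Table_mapping ORDER BY Item_type"
    ).fetchall()
    columns = conn.execute(
        "SELECT Item_type, column_name, display_order FROM Column_mapping "
        "ORDER BY display_order, Item_type"
    ).fetchall()
    table_of = {item_type: checked(name) for item_type, name in tables}

    # Group the mapped columns by display position: column1, column2, ...
    by_position = {}
    for item_type, column_name, position in columns:
        ref = f"{table_of[item_type]}.{checked(column_name)}"
        by_position.setdefault(position, []).append(ref)

    select_list = ["i.Item_id", "i.Item_type"]
    for position in sorted(by_position):
        refs = by_position[position]
        # COALESCE needs at least two arguments, so a single ref is used as-is.
        expr = refs[0] if len(refs) == 1 else f"COALESCE({', '.join(refs)})"
        select_list.append(f"{expr} AS column{position}")

    joins, params = [], []
    for item_type, table in table_of.items():
        joins.append(
            f"LEFT JOIN {table} ON {table}.Item_id = i.Item_id AND i.Item_type = ?"
        )
        params.append(item_type)

    sql = (f"SELECT {', '.join(select_list)} FROM Item i "
           f"{' '.join(joins)} ORDER BY i.Item_id")
    return sql, params


def make_sample_db():
    """Load the sample data from the question into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Item (Item_id INTEGER, Item_type TEXT);
        CREATE TABLE Book (Item_id INTEGER, title TEXT, author TEXT);
        CREATE TABLE Box (Item_id INTEGER, Size TEXT);
        CREATE TABLE Column_mapping (Item_type TEXT, column_name TEXT, display_order INTEGER);
        CREATE TABLE Table_mapping (Item_type TEXT, Table_name TEXT);
        INSERT INTO Item VALUES (1, 'box'), (2, 'book'), (3, 'box');
        INSERT INTO Book VALUES (2, 'C# Programer', 'Joe');
        INSERT INTO Box VALUES (1, 'Large'), (3, 'X Large');
        INSERT INTO Column_mapping VALUES ('box', 'Size', 1), ('book', 'title', 1), ('book', 'author', 2);
        INSERT INTO Table_mapping VALUES ('box', 'Box'), ('book', 'Book');
    """)
    return conn


if __name__ == "__main__":
    conn = make_sample_db()
    sql, params = build_query(conn)
    print(sql)
    for row in conn.execute(sql, params):
        print(row)
```

Table and column names cannot be passed as bind parameters, which is why they are validated and spliced into the string, while the Item_type values are still bound with ? placeholders.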

Basically it is two separate queries: one for boxes and one for books. You can use UNION to merge the result sets together.
select i.Item_id, i.Item_type, b.size, null
from Item i inner join Box b on i.Item_id=b.Item_id
where i.Item_type = 'box'
UNION
select i.Item_id, i.Item_type, b.title, b.author
from Item i inner join Book b on i.Item_id=b.Item_id
where i.Item_type = 'book'

Oracle actually stores the table and column names in its data dictionary, so there is no need for you to maintain that data separately. Try this to get the table names:
SELECT table_name FROM user_tables;
Then do this to get the columns for each table:
SELECT column_name FROM user_tab_columns WHERE table_name = 'MYTABLE';
Once you do that, you will need to create a stored procedure in order to execute dynamic SQL. I don't think you can do this in a plain-vanilla query.
