Many left joins on subqueries, need some way to increase performance

Below is an example of my query as it stands. I have up to about 10 of these joins/subqueries, all in basically the same format but with different joins and WHERE clauses.
SELECT DISTINCT mytable.label, tableA.counter, tableB.counter
FROM mytable
LEFT JOIN
(SELECT COUNT(id) as counter, label
FROM mytable
...joins...
...where...
GROUP BY label) tableA
ON tableA.label=mytable.label
LEFT JOIN
(SELECT COUNT(id) as counter, label
FROM mytable
...joins...
...where...
GROUP BY label) tableB
ON tableB.label=mytable.label
...
It's taking about 2-4 seconds and this is a high-traffic page, so that kind of speed isn't good enough. Can anyone recommend a way to improve performance here?

No need to GROUP here, as you're only returning 1 value. Try a subquery approach like this:
SELECT DISTINCT T.label,
(SELECT COUNT(id) as counter FROM tableA A WHERE A.blah = T.blah) as AValue,
(SELECT COUNT(id) as counter FROM tableB B WHERE B.blah = T.blah) as BValue
FROM mytable T
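To compare the two forms side by side, here is a minimal sketch that runs both against an in-memory SQLite database through Python's built-in sqlite3 module. The schema (a label column on mytable, tableA and tableB) and the sample rows are assumptions made up for illustration, and the elided joins and WHERE clauses from the question are left out. Note one behavioural difference: the LEFT JOIN form yields NULL for labels with no matching rows, while the correlated COUNT yields 0.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE tableA  (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE tableB  (id INTEGER PRIMARY KEY, label TEXT);
    INSERT INTO mytable (label) VALUES ('x'), ('x'), ('y'), ('z');
    INSERT INTO tableA  (label) VALUES ('x'), ('x'), ('y');
    INSERT INTO tableB  (label) VALUES ('x');
""")

# Form from the question: LEFT JOIN on grouped subqueries.
join_form = conn.execute("""
    SELECT DISTINCT t.label, a.counter, b.counter
    FROM mytable t
    LEFT JOIN (SELECT COUNT(id) AS counter, label FROM tableA GROUP BY label) a
           ON a.label = t.label
    LEFT JOIN (SELECT COUNT(id) AS counter, label FROM tableB GROUP BY label) b
           ON b.label = t.label
    ORDER BY t.label
""").fetchall()

# Form from the answer: one correlated scalar subquery per count.
subquery_form = conn.execute("""
    SELECT DISTINCT t.label,
           (SELECT COUNT(id) FROM tableA a WHERE a.label = t.label) AS AValue,
           (SELECT COUNT(id) FROM tableB b WHERE b.label = t.label) AS BValue
    FROM mytable t
    ORDER BY t.label
""").fetchall()

print(join_form)      # missing labels show up as None
print(subquery_form)  # missing labels show up as 0
```

Which form is faster depends on the engine and on the indexes available for the correlated column, so it is worth comparing the execution plans on the real data.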

In addition to Jon Tirjan's solution, here is another one that uses UNION ALL and PIVOT.
SELECT [A], [B]
FROM (
SELECT 'A' AS TableName, COUNT(id) as counter
FROM tableA
UNION ALL
SELECT 'B' AS TableName, COUNT(id) as counter
FROM tableB
) AS DT
PIVOT(SUM(counter) FOR TableName IN([A], [B])) AS PVT

Related

PostgreSQL: Error in left join

I am trying to join my master table to some sub-tables in PostgreSQL in a single select query. I am getting a syntax error and I have the feeling I am making a terrible mistake or doing something which is not allowed. The code:
Select
id,
length,
other_stuff
from my_table tbl1
Left join
(
Select
id,
height
from my_table2 tbl2) tbl2 using (id)
left join
-- I get syntax error here
(
With a as (select id from some_table),
b as (Select value from other_table)
Select id, value from a, b) tbl3 using (id)
order by tbl1.id
Can we use a WITH clause in a subquery or nested query inside a left join? And is there a better way to do this?
UPDATE1
Well, I would like to add some more details. I have three select queries like this (having unique ID) and I want to join them based on ID.
Query1:
With a as (Select id, my_other records... from postgres_table1)
b as (select id, my_records... from postgres_table2)
c as (select id, my_record.. from postgres_table3, b)
Select
id,
my_records
from a left join c on some_condtion_with_a
order by 1
Second query:
Select
id, my_records
from
(
multiple_sub_queries_by_getting_records_from_c
)
Third Query:
With d as (select id, records.. from b),
e as (select id, records.. from d),
f as (select id, records.. from e)
select
id,
records..
from f
I tried to join them using left joins. The first two queries joined successfully, but when joining the third query I got the syntax error. Maybe I am complicating things, which is why I asked whether there is a better way to do it.

You are overcomplicating things. There is no need to use a derived table to outer join my_table2. And there is no need for a CTE plus a derived table to join the tbl3 alias:
Select id,
length,
other_stuff
from my_table tbl1
Left join my_table2 tbl2 using (id)
left join (
select st.id, ot.value
from some_table st
cross join other_table ot
) tbl3 using (id)
order by tbl1.id;
This assumes that the cross join you create with Select id, value from a, b is intended.

Not tested, but I think you need this. Try:
with a as (select id from some_table),
b as (Select value from other_table)
Select
id,
length,
other_stuff
from my_table tbl1
Left join
(
Select
id,
height
from my_table2 tbl2
)
tbl2 using (id)
left join
(
Select id, value from a, b
)
tbl3 using (id)
order by tbl1.id

I've only ever seen/used WITH in the following format:
WITH
temptablename(columns) as (query),
temptablename2(columns) as (query),
...
temptablenameX(columns) as (query)
SELECT ...
i.e. they come first
You'll probably find it easier to write queries if you use indentation to describe nesting levels. I like to make my SELECT FROM WHERE GROUPBY ORDERBY at one indent level, and then tablename INNER JOIN ON etc more indented:
SELECT
column
FROM
table
INNER JOIN
(
SELECT subcolumn FROM subtable WHERE subclause
) myalias
ON
table.id = myalias.whatever
WHERE
blah
Organising your indents every time you nest down a layer really helps. By indenting everything that is "a table or a block of data like a table (i.e. a subquery)" by the same amount, you can easily see the notional order in which the DB should retrieve the data.
Move your WITHs to the top of the statement. You will still use the alias names in place in the nested subquery, of course.
Looking at your query, there isn't much point in your subqueries. You don't do any grouping or particularly complex processing of the data; you just select an ID and another column and then join it in. Your query will be simpler if you don't do this:
SELECT
column
FROM
table
INNER JOIN
(
SELECT subcolumn FROM subtable WHERE subclause
) myalias
ON
table.id = myalias.whatever
WHERE
blah
Instead, do this:
SELECT
column
FROM
table
INNER JOIN
subtable
ON
table.id = subtable.id
WHERE
blah
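As a rough check of the two points above (WITH goes at the top of the statement, and a pass-through subquery adds nothing), here is a small sketch. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, which is an assumption made so the example can run anywhere. The table names follow the question, while the sample rows and the CTE name t2 are invented.

```python
import sqlite3

# Hypothetical data; table names borrowed from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table  (id INTEGER PRIMARY KEY, length INTEGER);
    CREATE TABLE my_table2 (id INTEGER PRIMARY KEY, height INTEGER);
    INSERT INTO my_table  VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO my_table2 VALUES (1, 100), (3, 300);
""")

# CTE declared first, then referenced from a derived table in the join.
with_derived = conn.execute("""
    WITH t2 AS (SELECT id, height FROM my_table2)
    SELECT tbl1.id, tbl1.length, tbl2.height
    FROM my_table tbl1
    LEFT JOIN (SELECT id, height FROM t2) tbl2 ON tbl2.id = tbl1.id
    ORDER BY tbl1.id
""").fetchall()

# Same result without the CTE or the derived table.
direct = conn.execute("""
    SELECT tbl1.id, tbl1.length, tbl2.height
    FROM my_table tbl1
    LEFT JOIN my_table2 tbl2 ON tbl2.id = tbl1.id
    ORDER BY tbl1.id
""").fetchall()

print(with_derived == direct)
```

Both queries return the same rows here, so the shorter one is the easier one to maintain.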

Regarding your updated requirements, the same pattern applies.
Look for my -- comments.
With a as (Select id, my_other records... from postgres_table1),
b as (select id, my_records... from postgres_table2),
c as (select id, my_record.. from postgres_table3, b),
d as (select id, records.. from b),
e as (select id, records.. from d),
f as (select id, records.. from e)
SELECT * FROM
(
--your first
Select
id,
my_records
from a left join c on some_condtion_with_a
) Q1
LEFT OUTER JOIN
(
--your second
Select
id, my_records
from
(
multiple_sub_queries_by_getting_records_from_c
)
) Q2
ON Q1.XXXX = Q2.XXXX --fill this in !!!!!!!!!!!!!!!!!!!
LEFT OUTER JOIN
(
--your third
select
id,
records..
from f
) Q3
ON QX.XXXXX = Q3.XXXX --fill this in !!!!!!!!!!!!!!!!!!!
It'll work, but it might not be the prettiest or most necessary SQL arrangement. As both HWNN and I have said, you can rewrite a lot of these queries where you're just doing some simple selecting in your WITH. But it's likely that they're simple enough that the database optimizer can also see this and rewrite the query for you when it runs it.
Just remember to code clearly, and lay your indentation out nicely to stop it turning into a massive, unmaintainable, undebuggable spaghetti mess.

Written a subquery that can return more than one field without using the Exists

The query below is supposed to pull records for fields with the max date.
I am getting an error
You have written a subquery that can return more than one field without using EXISTS reserved word in the Main query's FROM clause. Revise the SELECT statement of the subquery to request only one column.
Code:
SELECT *
FROM TableName
WHERE (((([Project_Name], [Date])) IN (SELECT Project_Name, MAX(Date)
FROM TableName
GROUP BY Project)));

You're probably thinking of a nested subquery used as a table, like the one below:
select a.*, b.col1, b.col2
from FirstTable A
join (Select Id, firstcolumn as col1, secondcolumn as col2
from SecondTable) B on b.ID = a.ID
This works pretty much like a regular join, except that you are joining to a subquery. Hope that helps.

SELECT A.*
FROM TableName A
INNER JOIN (select Project_Name, max(Date) as MaxDate
from TableName
group by Project_Name) B
ON A.[Project_Name] = B.[Project_Name]
AND A.[Date] = B.MaxDate
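The join-to-the-grouped-maximum pattern can be tried out with the following sketch. It runs on SQLite through Python's sqlite3 module, not on the database from the question, and the Status column and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical rows; Status is an invented extra column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableName (Project_Name TEXT, Date TEXT, Status TEXT);
    INSERT INTO TableName VALUES
        ('Alpha', '2020-01-01', 'draft'),
        ('Alpha', '2020-03-01', 'final'),
        ('Beta',  '2020-02-15', 'review');
""")

# Join each row to the maximum date of its project; only the latest rows match.
latest = conn.execute("""
    SELECT A.Project_Name, A.Date, A.Status
    FROM TableName A
    INNER JOIN (SELECT Project_Name, MAX(Date) AS MaxDate
                FROM TableName
                GROUP BY Project_Name) B
            ON A.Project_Name = B.Project_Name
           AND A.Date = B.MaxDate
    ORDER BY A.Project_Name
""").fetchall()

print(latest)
```

If two rows of the same project share the maximum date, both are returned, which may or may not be what is wanted.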

A version using EXISTS() looks like this:
SELECT *
FROM TableName AS A
WHERE EXISTS(
SELECT * FROM (
SELECT B.Project_Name, MAX( B.Date ) AS MaxDate
FROM TableName AS B
GROUP BY B.Project_Name ) AS C
WHERE C.Project_Name = A.Project_Name AND C.MaxDate = A.Date
);
Although I have the feeling this will have poorer performance than a JOIN because the GROUP BY statement might have to be executed for each record and each call to the EXISTS() function...

Getting the distinct values for a column in four tables with SQL Server 2008 is very slow

I need to get the number of distinct values for a column in each of four tables in SQL Server 2008.
Each table has about 8 columns and 80,000 rows. All column values are int, varchar, or double.
The column being queried is int.
SELECT COUNT(distinct a.id) as a_num_distinc_id,
COUNT(distinct b.id) as b_num_distinc_id,
COUNT(distinct c.id) as c_num_distinc_id,
COUNT(distinct d.id) as d_num_distinc_id
FROM table1 as a, table2 as b,
table3 as c, table4 as d
If I get the distinct values for the column from each table one by one, it runs fast. But if I run them together, it runs very slowly, taking more than 20 minutes.
Why? Thanks!
UPDATE -------------------------------------------------
I have solved the above problem thanks to your answers.
Now I have a new one, which is related to the original question but different.
I have a very large table with 1 billion rows and 12 columns, which are int, double, or varchar.
I need to know the number of distinct values for each column.
Although I use
SELECT COUNT(distinct a.id) as num_dist_id
FROM my_large_table as a
it is very slow.
Are there better ways to do that?

You are doing a humongous cross join on all the tables. Simple rule: Never use a comma in the from clause.
You can get what you want with nested subqueries in the select clause:
SELECT (select COUNT(distinct a.id) from table1 a) as a_num_distinc_id,
(select COUNT(distinct b.id) from table2 b) as b_num_distinc_id,
(select COUNT(distinct c.id) from table3 c) as c_num_distinc_id,
(select COUNT(distinct d.id) from table4 d) as d_num_distinc_id;
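To get a feel for the size of the cross join, the sketch below builds three tiny tables in SQLite (through Python's sqlite3 module, an assumption made so it runs anywhere; the question uses SQL Server) and compares the number of rows in the comma-join product with the per-table scalar subqueries. The table contents are invented. With four tables of 80,000 rows each, the product would have 80,000^4, or roughly 4.1 x 10^19, rows.

```python
import sqlite3

# Three small hypothetical tables, 30 rows each with 10 distinct ids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
""")
for name in ("table1", "table2", "table3"):
    conn.executemany(
        f"INSERT INTO {name} VALUES (?)", [(i % 10,) for i in range(30)]
    )

# Comma joins without a join condition produce the full Cartesian product.
product_rows = conn.execute(
    "SELECT COUNT(*) FROM table1 a, table2 b, table3 c"
).fetchone()[0]

# Independent scalar subqueries scan each table once.
counts = conn.execute("""
    SELECT (SELECT COUNT(DISTINCT id) FROM table1),
           (SELECT COUNT(DISTINCT id) FROM table2),
           (SELECT COUNT(DISTINCT id) FROM table3)
""").fetchone()

print(product_rows, counts)
```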

That's because when you run them together, you're creating a Cartesian product of all the rows in all the tables.
Try
select
(select COUNT(distinct a.id) From table1 a) as a_num_distinc_id,
(select COUNT(distinct b.id) From table2 b) as b_num_distinc_id,
(select COUNT(distinct c.id) From table3 c) as c_num_distinc_id,
(select COUNT(distinct d.id) From table4 d) as d_num_distinc_id

ODBC Firebird Sql Query - Syntax

Trying to get a slightly more complex SQL statement structured but can't seem to get the syntax right. Trying to select counts of various columns in two different tables.
SELECT
SUM(ColumninTable1),
SUM(Column2inTable1),
COUNT(DISTINCT(Column3inTable1))
FROM TABLE1
This works; however, I can't for the life of me figure out what syntax to use to add in a COUNT(DISTINCT(Column1inTable2)) FROM TABLE2.

There are several solutions you can take:
Disjunct FULL OUTER JOIN
SELECT
SUM(MYTABLE.ID) as theSum,
COUNT(DISTINCT MYTABLE.SOMEVALUE) as theCount,
COUNT(DISTINCT MYOTHERTABLE.SOMEOTHERVALUE) as theOtherCount
FROM MYTABLE
FULL OUTER JOIN MYOTHERTABLE ON 1=0
UNION two queries and leave the column for the other table null
SELECT
MAX(theSum) as theSum,
MAX(theCount) as theCount,
MAX(theOtherCount) AS theOtherCount
FROM (
SELECT
SUM(ID) as theSum,
COUNT(DISTINCT SOMEVALUE) as theCount,
NULL as theOtherCount
FROM MYTABLE
UNION ALL
SELECT
NULL,
NULL,
COUNT(DISTINCT SOMEOTHERVALUE)
FROM MYOTHERTABLE
)
Query 'with a query per column' against a single record table (eg RDB$DATABASE)
SELECT
(SELECT SUM(ID) FROM MYTABLE) as theSum,
(SELECT COUNT(DISTINCT SOMEVALUE) FROM MYTABLE) as theCount,
(SELECT COUNT(DISTINCT SOMEOTHERVALUE) FROM MYOTHERTABLE) as theOtherCount
FROM RDB$DATABASE
CTE per table + cross join
WITH query1 AS (
SELECT
SUM(ID) as theSum,
COUNT(DISTINCT SOMEVALUE) as theCount
FROM MYTABLE
),
query2 AS (
SELECT
COUNT(DISTINCT SOMEOTHERVALUE) as theOtherCount
FROM MYOTHERTABLE
)
SELECT
query1.theSum,
query1.theCount,
query2.theOtherCount
FROM query1
CROSS JOIN query2
There are probably some more solutions. You might want to ask yourself whether it is worth the effort of coming up with a (convoluted, hard to understand) single query to get this data where two queries are sufficient and easier to understand. In the case of large datasets, two separate queries might also be faster.
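Two of the variants listed above (UNION ALL with MAX, and one CTE per table with a cross join) can be compared with this sketch. It runs on SQLite through Python's sqlite3 module rather than on Firebird, which is an assumption for portability, and the sample rows are invented.

```python
import sqlite3

# Hypothetical data for the two tables used in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MYTABLE (ID INTEGER, SOMEVALUE TEXT);
    CREATE TABLE MYOTHERTABLE (SOMEOTHERVALUE TEXT);
    INSERT INTO MYTABLE VALUES (1, 'a'), (2, 'a'), (3, 'b');
    INSERT INTO MYOTHERTABLE VALUES ('p'), ('q'), ('q'), ('r');
""")

# Variant: UNION ALL of two one-row queries, collapsed with MAX (NULLs are ignored).
union_form = conn.execute("""
    SELECT MAX(theSum), MAX(theCount), MAX(theOtherCount)
    FROM (
        SELECT SUM(ID) AS theSum,
               COUNT(DISTINCT SOMEVALUE) AS theCount,
               NULL AS theOtherCount
        FROM MYTABLE
        UNION ALL
        SELECT NULL, NULL, COUNT(DISTINCT SOMEOTHERVALUE)
        FROM MYOTHERTABLE
    )
""").fetchone()

# Variant: one single-row CTE per table, combined with CROSS JOIN.
cte_form = conn.execute("""
    WITH query1 AS (
        SELECT SUM(ID) AS theSum, COUNT(DISTINCT SOMEVALUE) AS theCount
        FROM MYTABLE
    ),
    query2 AS (
        SELECT COUNT(DISTINCT SOMEOTHERVALUE) AS theOtherCount
        FROM MYOTHERTABLE
    )
    SELECT query1.theSum, query1.theCount, query2.theOtherCount
    FROM query1 CROSS JOIN query2
""").fetchone()

print(union_form, cte_form)
```

The cross join is safe here only because each CTE returns exactly one row.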

In this case all "count" would return the same value.
Try to do the same using sub queries:
Select
(Select count (*) from Table1),
   (Select count (*) from table2)
from Table3

two SQL COUNT() queries?

I want to count both the total # of records in a table, and the total # of records that match certain conditions. I can do these with two separate queries:
SELECT COUNT(*) AS TotalCount FROM MyTable;
SELECT COUNT(*) AS QualifiedCount FROM MyTable
{possible JOIN(s) as well e.g. JOIN MyOtherTable mot ON MyTable.id=mot.id}
WHERE {conditions};
Is there a way to combine these into one query so that I get two fields in one row?
SELECT {something} AS TotalCount,
{something else} AS QualifiedCount
FROM MyTable {possible JOIN(s)} WHERE {some conditions}
If not, I can issue two queries and wrap them in a transaction so they are consistent, but I was hoping to do it with one.
edit: I'm most concerned about atomicity; if there are two sub-SELECT statements needed that's OK as long as if there's an INSERT coming from somewhere it doesn't make the two responses inconsistent.
edit 2: The CASE answers are helpful but in my specific instance, the conditions may include a JOIN with another table (forgot to mention that in my original post, sorry) so I'm guessing that approach won't work.

One way is to join the table against itself:
select
count(*) as TotalCount,
count(s.id) as QualifiedCount
from
MyTable a
left join
MyTable s on s.id = a.id and {some conditions}
Another way is to use subqueries:
select
(select count(*) from Mytable) as TotalCount,
(select count(*) from Mytable where {some conditions}) as QualifiedCount
Or you can put the conditions in a case:
select
count(*) as TotalCount,
sum(case when {some conditions} then 1 else 0 end) as QualifiedCount
from
MyTable
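The three approaches can be checked against each other with a small sketch. It uses SQLite through Python's sqlite3 module, and the status column and the condition status = 'open' are hypothetical stand-ins for {some conditions}.

```python
import sqlite3

# Hypothetical table: 4 rows, 3 of which satisfy the condition.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO MyTable (status) VALUES ('open'), ('closed'), ('open'), ('open');
""")

# 1. Self join: COUNT(s.id) skips rows where the join found no match.
self_join = conn.execute("""
    SELECT COUNT(*), COUNT(s.id)
    FROM MyTable a
    LEFT JOIN MyTable s ON s.id = a.id AND s.status = 'open'
""").fetchone()

# 2. Two scalar subqueries in one statement.
subqueries = conn.execute("""
    SELECT (SELECT COUNT(*) FROM MyTable),
           (SELECT COUNT(*) FROM MyTable WHERE status = 'open')
""").fetchone()

# 3. Conditional sum with CASE.
case_sum = conn.execute("""
    SELECT COUNT(*), SUM(CASE WHEN status = 'open' THEN 1 ELSE 0 END)
    FROM MyTable
""").fetchone()

print(self_join, subqueries, case_sum)
```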
Related:
SQL Combining several SELECT results

In SQL Server or MySQL, you can do that with a CASE expression:
select
count(*) as TotalCount,
sum(case when {conditions} then 1 else 0 end) as QualifiedCount
from MyTable
Edit: This also works if you use a JOIN in the condition:
select
count(*) as TotalCount,
sum(case when {conditions} then 1 else 0 end) as QualifiedCount
from MyTable t
left join MyChair c on c.TableId = t.Id
group by t.id, t.[othercolums]
The GROUP BY is there to ensure you only find one row from the main table.

If you are just counting rows, you could use nested queries:
select
(SELECT COUNT(*) AS TotalCount FROM MyTable) as a,
(SELECT COUNT(*) AS QualifiedCount FROM MyTable WHERE {conditions}) as b

In Oracle SQL Developer I had to add a * FROM to my select, or else I was getting a syntax error:
select * FROM
(select COUNT(*) as foo FROM TABLE1),
(select COUNT(*) as boo FROM TABLE2);

COUNT(expr) doesn't count NULLs in MySQL, so this should work too:
SELECT count(*) AS TotalCount,
count( if( field = value, field, null)) AS QualifiedCount
FROM MyTable {possible JOIN(s)} WHERE {some conditions}
That works well if the QualifiedCount field comes from a LEFT JOIN, and you only care whether it exists. To get the number of users, and the number of users that have filled in their address:
SELECT count( Users.id) as NumUsers, count( Address.id) as NumAddresses
FROM Users
LEFT JOIN Address on Users.address_id = Address.id;
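The LEFT JOIN variant relies on COUNT(column) skipping NULLs, which can be seen in this sketch. It uses SQLite through Python's sqlite3 module instead of MySQL, which is an assumption for portability (COUNT of a column ignores NULLs in both). The name and city columns and the sample rows are invented.

```python
import sqlite3

# Hypothetical users; bob has no address.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER);
    INSERT INTO Address VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO Users VALUES (1, 'ann', 1), (2, 'bob', NULL), (3, 'cy', 2);
""")

# Unmatched users get Address.id = NULL, which COUNT(Address.id) does not count.
num_users, num_addresses = conn.execute("""
    SELECT COUNT(Users.id), COUNT(Address.id)
    FROM Users
    LEFT JOIN Address ON Users.address_id = Address.id
""").fetchone()

print(num_users, num_addresses)
```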
