Many to many query with a count of condition > 0 - sql

I have 3 tables:
Emails
Foo
EmailFoos (The many to many join table)
Foo can be complete or not.
I need to find the set of emails where the count of completed foos > 0 (and the inverse, but I can probably do that ;)
I tried something like:
SELECT e.id, e.address, count(l.foo_id) as foo_count
FROM emails e
LEFT JOIN (
SELECT l.foo_id, l.email_id
FROM foo_emails l
JOIN foos f ON (l.foo_id = f.id AND f.status = 'complete')) l ON
(l.email_id = e.id)
GROUP BY e.id
HAVING foo_count > 0;
But it keeps telling me that foo_count does not exist. My SQL-fu is weak. Postgres is the DBMS. Any help is much appreciated.

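I'm pretty sure you just need to replace the foo_count in the HAVING with count(l.foo_id). Postgres does not let HAVING refer to an output-column alias from the SELECT list, so the aggregate has to be repeated.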
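A minimal sketch of the corrected query, run through Python's built-in sqlite3 module as a stand-in for Postgres. The sample rows are invented for illustration, and the inner alias is renamed to fe to keep it apart from the outer l. SQLite is more lenient about aliases in HAVING than Postgres, so this only demonstrates that the repeated aggregate gives the wanted sets, not the original error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emails (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE foos (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE foo_emails (foo_id INTEGER, email_id INTEGER);
    -- invented sample data
    INSERT INTO emails VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO foos VALUES (10, 'complete'), (11, 'pending'), (12, 'complete');
    INSERT INTO foo_emails VALUES (10, 1), (12, 1), (11, 2);
""")

# The aggregate is repeated in HAVING instead of using the alias foo_count.
query = """
    SELECT e.id, e.address, count(l.foo_id) AS foo_count
    FROM emails e
    LEFT JOIN (
        SELECT fe.foo_id, fe.email_id
        FROM foo_emails fe
        JOIN foos f ON fe.foo_id = f.id AND f.status = 'complete'
    ) l ON l.email_id = e.id
    GROUP BY e.id, e.address
    HAVING count(l.foo_id) {op} 0
    ORDER BY e.id
"""

with_completed = conn.execute(query.format(op=">")).fetchall()
without_completed = conn.execute(query.format(op="=")).fetchall()  # the inverse set

print(with_completed)
print(without_completed)
```

The LEFT JOIN matters only for the inverse case: emails with no completed foo still produce one row whose l.foo_id is NULL, and count() ignores NULLs, so their count is 0.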

Related

SQL : How can i join different tables from competition of football?

The problem
I have to make a SQL (postgres) query, to show the names of players, whose teams participate in the "Primeira Liga" and the "Taça de Portugal". Also, include the names of players of "Académica" club and all of teams founded in the 1940s.
Image of Database design
I'm really stuck and can't figure out, how to connect all tables. Does anyone know how to solve the problem?
Thanks in advance.
Updated (10/01/2021):
** I had to change the names to match my DB, so everyone can understand.
(SELECT DISTINCT j.nome
FROM jogador as j
INNER JOIN equipa as E
ON E.id = j.equipa_id
INNER JOIN competicao_equipa as eq
ON E.id = eq.equipa_id
WHERE (eq.competicao_id = 1 OR eq.competicao_id = 3))
UNION
(SELECT j.nome
FROM jogador as j
where equipa_id = 1)
UNION
(SELECT j.nome
FROM jogador as j
INNER JOIN equipa as E
ON E.id = j.equipa_id
where E.fundacao BETWEEN '01/01/1940' AND '31/12/1949')

Additional select if one select result is not null

I am interested in the results found in table g, which shares a key, sample_name, with tables s and l. In this question the tables are
s - samples,
p - projects,
l - analyses, and
g - analysis g,
all within schema a.
In the interest of optimization, I only want to look for table g after having confirmed that l.analysis_g is NOT NULL.
Given: The only information that I start out with is the project names. The project table, p is linked with other tables by the samples table s. s is linked to every table. Table l contains types of analysis and each column is either NULL or 1.
In the example below I am trying a case but I realize this may be totally incorrect.
SELECT s.sample_name,
s.project_name,
g.*
FROM a.samples s
JOIN a.analyses l
ON s.sample_name = l.sample_name
JOIN a.analysis_g g
ON s.sample_name = g.sample_name
WHERE s.project_name IN (SELECT p.project_name
FROM a.projects p
WHERE p.project_name_other
IN ('PROJ_1',
'PROJ_2'))
;
Then perhaps in the where clause? It's still really hard to understand what you want . . .
SELECT s.sample_name,
s.project_name,
g.*
FROM a.samples s
JOIN a.analyses l
ON s.sample_name = l.sample_name
JOIN a.analysis_g g
ON s.sample_name = g.sample_name
WHERE s.project_name IN (SELECT p.project_name
FROM a.projects p
WHERE p.project_name_other
IN ('PROJ_1',
'PROJ_2'))
and l.analysis_g IS NOT NULL
;
As a side note, I think you could join on p.project_name and avoid the subquery in the WHERE clause. And I think you might want some inner joins -- but I'm not sure.
SELECT s.sample_name,
s.project_name,
g.*
FROM a.samples s
JOIN a.analyses l ON s.sample_name = l.sample_name
JOIN a.analysis_g g ON s.sample_name = g.sample_name
JOIN a.projects p ON s.project_name = p.project_name
WHERE p.project_name_other IN ('PROJ_1', 'PROJ_2')
and l.analysis_g IS NOT NULL
Again: Please show an example! We can't help if we have to guess, but I'll give it a try...
If l.analysis_g contains an ID from table g, then you can just use:
SELECT * FROM g
JOIN l on g.id = l.analysis_g
WHERE blah, blah, blah...
I removed your WHERE clause because you haven't provided enough information to allow anyone to help optimize it (if needed).

Refactoring slow SQL query

I currently have this very very slow query:
SELECT generators.id AS generator_id, COUNT(*) AS cnt
FROM generator_rows
JOIN generators ON generators.id = generator_rows.generator_id
WHERE
generators.id IN (SELECT "generators"."id" FROM "generators" WHERE "generators"."client_id" = 5212 AND ("generators"."state" IN ('enabled'))) AND
(
generators.single_use = 'f' OR generators.single_use IS NULL OR
generator_rows.id NOT IN (SELECT run_generator_rows.generator_row_id FROM run_generator_rows)
)
GROUP BY generators.id;
And I'm trying to refactor/improve it with this query:
SELECT g.id AS generator_id, COUNT(*) AS cnt
from generator_rows gr
join generators g on g.id = gr.generator_id
join lateral(select case when exists(select * from run_generator_rows rgr where rgr.generator_row_id = gr.id) then 0 else 1 end as noRows) has on true
where g.client_id = 5212 and "g"."state" IN ('enabled') AND
(g.single_use = 'f' OR g.single_use IS NULL OR has.norows = 1)
group by g.id
For some reason it doesn't quite work as expected (it returns 0 rows). I think I'm pretty close to the end result but can't get it to work.
I'm running on PostgreSQL 9.6.1.
This appears to be the query, formatted so I can read it:
SELECT gr.generator_id, COUNT(*) AS cnt
FROM generators g JOIN
generator_rows gr
ON g.id = gr.generator_id
WHERE gr.generator_id IN (SELECT g.id
FROM generators g
WHERE g.client_id = 5212 AND
g.state = 'enabled'
) AND
(g.single_use = 'f' OR
g.single_use IS NULL OR
gr.id NOT IN (SELECT rgr.generator_row_id FROM run_generator_rows rgr)
)
GROUP BY gr.generator_id;
I would be inclined to do most of this work in the FROM clause:
SELECT gr.generator_id, COUNT(*) AS cnt
FROM generators g JOIN
generator_rows gr
ON g.id = gr.generator_id JOIN
generators gg
on g.id = gg.id AND
gg.client_id = 5212 AND gg.state = 'enabled' LEFT JOIN
run_generator_rows rgr
ON gr.id = rgr.generator_row_id
WHERE g.single_use = 'f' OR
g.single_use IS NULL OR
rgr.generator_row_id IS NULL
GROUP BY gr.generator_id;
This does make two assumptions that I think are reasonable:
generators.id is unique
run_generator_rows.generator_row_id is unique
(It is easy to avoid these assumptions, but the duplicate elimination is more work.)
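One way to drop the uniqueness assumption on run_generator_rows.generator_row_id is to test for matching rows with NOT EXISTS instead of joining to them. The sketch below is an illustration only: it uses Python's built-in sqlite3 module rather than PostgreSQL, stores single_use as 0/1 instead of 'f'/'t', and the sample rows are invented. Row 101 deliberately appears twice in run_generator_rows to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE generators (id INTEGER PRIMARY KEY, client_id INTEGER, state TEXT, single_use INTEGER);
    CREATE TABLE generator_rows (id INTEGER PRIMARY KEY, generator_id INTEGER);
    CREATE TABLE run_generator_rows (id INTEGER PRIMARY KEY, generator_row_id INTEGER);
    -- invented sample data
    INSERT INTO generators VALUES
        (1, 5212, 'enabled', 0), (2, 5212, 'enabled', 1),
        (3, 5212, 'disabled', 0), (4, 9999, 'enabled', NULL);
    INSERT INTO generator_rows VALUES
        (101, 1), (102, 1), (201, 2), (202, 2), (203, 2), (204, 2), (301, 3), (401, 4);
    INSERT INTO run_generator_rows VALUES (1, 101), (2, 101), (3, 201);
""")

# NOT EXISTS never multiplies rows, so duplicates in run_generator_rows are harmless.
not_exists_counts = conn.execute("""
    SELECT g.id AS generator_id, COUNT(*) AS cnt
    FROM generators g
    JOIN generator_rows gr ON gr.generator_id = g.id
    WHERE g.client_id = 5212 AND g.state = 'enabled'
      AND (g.single_use = 0 OR g.single_use IS NULL
           OR NOT EXISTS (SELECT 1 FROM run_generator_rows rgr
                          WHERE rgr.generator_row_id = gr.id))
    GROUP BY g.id
    ORDER BY g.id
""").fetchall()

# The LEFT JOIN form counts row 101 twice because it matches two run rows.
left_join_counts = conn.execute("""
    SELECT g.id AS generator_id, COUNT(*) AS cnt
    FROM generators g
    JOIN generator_rows gr ON gr.generator_id = g.id
    LEFT JOIN run_generator_rows rgr ON rgr.generator_row_id = gr.id
    WHERE g.client_id = 5212 AND g.state = 'enabled'
      AND (g.single_use = 0 OR g.single_use IS NULL OR rgr.generator_row_id IS NULL)
    GROUP BY g.id
    ORDER BY g.id
""").fetchall()

print(not_exists_counts)
print(left_join_counts)
```

Whether NOT EXISTS or the LEFT JOIN form is faster on the real data depends on the planner and the indexes, so it is worth comparing both with EXPLAIN ANALYZE.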
Then, some indexes could help:
generators(client_id, state, id)
run_generator_rows(generator_row_id)
generator_rows(generator_id)
Generally avoid inner selects as in
WHERE ... IN (SELECT ...)
as they are usually slow.
As was already shown for your problem, it's a good idea to think of SQL in terms of set theory.
You do NOT join tables on their identity alone:
Conceptually, SQL takes the set of all rows of the first table and "multiplies" it with the set of all rows of the second table, ending up with n times m rows.
Then the ON clause is used to reduce the result (often strongly) by evaluating its condition for each of those combinations: true keeps the combination, false drops it. This way you can use any arbitrary logic to select the combinations you want.
Things get trickier with LEFT JOIN and RIGHT JOIN, but one can think of them as taking one side for granted. For each row of that side:
output the combinations of that row IF the logic yields true (at least once), exactly like JOIN does
otherwise output exactly ONE row, with 'the other side' (right side on LEFT JOIN and vice versa) consisting of NULL for every column.
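The description above can be checked on two tiny tables. This is a sketch using Python's built-in sqlite3 module; the tables a and b and their rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (a_id INTEGER, tag TEXT);
    INSERT INTO a VALUES (1, 'one'), (2, 'two');   -- n = 2 rows
    INSERT INTO b VALUES (1, 'x'), (1, 'y');       -- m = 2 rows
""")

# Without an ON condition every row of a pairs with every row of b: n * m combinations.
cross_count = conn.execute("SELECT COUNT(*) FROM a CROSS JOIN b").fetchone()[0]

# The ON condition keeps only the combinations for which it is true.
inner_rows = conn.execute(
    "SELECT a.id, b.tag FROM a JOIN b ON b.a_id = a.id ORDER BY a.id, b.tag"
).fetchall()

# LEFT JOIN additionally keeps a.id = 2, which has no match, padded with NULL.
left_rows = conn.execute(
    "SELECT a.id, b.tag FROM a LEFT JOIN b ON b.a_id = a.id ORDER BY a.id, b.tag"
).fetchall()

print(cross_count)
print(inner_rows)
print(left_rows)
```

The NULL-padded row is why COUNT(b.tag) and COUNT(*) differ after a LEFT JOIN: the former skips the padding, the latter counts it.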
Count(*) is great too, but if things get complicated don't stick to it: use sub-selects for the keys only, and once all the hard work is done join the fun stuff to it. Like in
SELECT sub.id, sub.valid_count, details.*
FROM (
SELECT id, SUM(valid) AS valid_count
FROM (
SELECT CASE WHEN x THEN 1 ELSE 0 END AS valid, id
FROM ...
) AS flagged
GROUP BY id
) AS sub
JOIN ... AS details ON sub.id = details.id
The difference is: the inner query is executed only once. The outer query usually has no indexes left to work with and will be slow, but if the inner select doesn't make the data explode, this is usually many times faster than SELECT ... WHERE ... IN (SELECT ...) constructs.

SQL filtering, select specific values count when condition is met

Maybe I am just being dumb. This is working, and it works great, but we want to add a condition on a sub-table while maintaining the same format.
BEGIN
SELECT
v.[id]
,v.[Vacante]
,v.[deptoId]
,v.[StatusId]
,v.[scholarYearId]
,v.[tipoVacanteId]
,v.[detalle]
,v.[createdDate]
,v.[createdBy]
,d.nombre as DeptoNombre
,s.nombre as statusNombre
,y.nombre as scholarYearNombre
,t.nombre as tipoVacanteNombre
,count(uv.id) as totalCandidatos
FROM
[dbo].[tbl_vacantes] v
LEFT JOIN
tbl_usuariosPorVacante uv on v.id = uv.vacanteId
--LEFT JOIN
-- dbo.[tbl_user] u on uv.userId=u.id
INNER JOIN
dbo.[tbl_depto] d ON d.Id = v.[deptoId]
INNER JOIN
dbo.[tbl_status] s ON s.Id = v.[statusId]
LEFT JOIN
tbl_scholarYear y ON v.scholarYearId=y.Id
LEFT JOIN
tbl_tipoVacante t ON v.tipoVacanteId=t.Id
--WHERE
-- u.progressId =3 OR u.progressId is null -- Only users who have already finished their process.
GROUP BY
v.[id]
,v.[Vacante]
,v.[deptoId]
,v.[StatusId]
,v.[scholarYearId]
,v.[tipoVacanteId]
,v.[detalle]
,v.[createdDate]
,v.[createdBy]
,d.nombre
,s.nombre
,y.nombre
,t.nombre
ORDER BY
v.id DESC
END
What we want is for totalCandidatos (the count) to count only the candidates whose dbo.[tbl_user] u row has progressId 3 or 4. Right now, it is counting every kind of progressId.
I know, it may be dumb. But I'm stuck on this one.
Thanks!
You can use sum(case when <condition> then 1 else 0 end) to count the number of records returned that meet a certain criteria.
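A reduced sketch of that idea applied to the question's tables, assuming "progressId = 3 and 4" means progressId IN (3, 4). It runs on Python's built-in sqlite3 module rather than SQL Server, keeps only the columns needed for the count, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_vacantes (id INTEGER PRIMARY KEY, Vacante TEXT);
    CREATE TABLE tbl_user (id INTEGER PRIMARY KEY, progressId INTEGER);
    CREATE TABLE tbl_usuariosPorVacante (id INTEGER PRIMARY KEY, vacanteId INTEGER, userId INTEGER);
    -- invented sample data
    INSERT INTO tbl_vacantes VALUES (1, 'Teacher'), (2, 'Librarian'), (3, 'Coach');
    INSERT INTO tbl_user VALUES (1, 3), (2, 4), (3, 1), (4, 2);
    INSERT INTO tbl_usuariosPorVacante VALUES (1, 1, 1), (2, 1, 3), (3, 1, 2), (4, 2, 4);
""")

# Both joins stay LEFT JOINs so vacancies without candidates are kept;
# the CASE decides per joined row whether it adds 1 or 0 to the total.
totals = conn.execute("""
    SELECT v.id,
           SUM(CASE WHEN u.progressId IN (3, 4) THEN 1 ELSE 0 END) AS totalCandidatos
    FROM tbl_vacantes v
    LEFT JOIN tbl_usuariosPorVacante uv ON v.id = uv.vacanteId
    LEFT JOIN tbl_user u ON uv.userId = u.id
    GROUP BY v.id
    ORDER BY v.id DESC
""").fetchall()

print(totals)
```

Putting the condition in the CASE rather than in a WHERE clause is what keeps vacancies with zero qualifying candidates in the result.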

SQL Query Filtering COUNT without using HAVING

I'm having trouble with this SQL query. My goal is to retrieve the ContactIDs of Contacts who live in CT or MA and have had more than 2 events.
Here is the query I'm trying to use:
SELECT `Contacts`.`ContactID`
FROM (`Contacts`)
JOIN `Events` ON `Contacts`.`ContactID` = `Events`.`ContactID`
JOIN `Contact_Addresses` ON `Contacts`.`ContactID` = `Contact_Addresses`.`ContactID`
WHERE `Contact_Addresses`.`State` IN ('CT', 'MA') AND COUNT(Events.EventID) > 2
I know I could use the group by statement HAVING. Like so:
...WHERE `Contact_Addresses`.`State` IN ('CT', 'MA')
HAVING COUNT(Events.EventID) > 2
But this doesn't give me the correct results that I'm looking for. I know I'm close, I think maybe I need a subquery added in? Any guidance in the direction I should go would be a huge help.
"...this doesn't give me the correct results that I'm looking for."
That doesn't give us any idea what the problem is.
Use:
SELECT c.contactid
FROM CONTACTS c
WHERE EXISTS(SELECT NULL
FROM EVENTS e
WHERE e.contactid = c.contactid
GROUP BY e.contactid
HAVING COUNT(e.eventid) > 2)
AND EXISTS(SELECT NULL
FROM CONTACT_ADDRESSES ca
WHERE ca.contactid = c.contactid
AND ca.state IN ('CT', 'MA'))
Try changing your query to:
SELECT `Contacts`.`ContactID`
FROM (`Contacts`)
JOIN `Events` ON `Contacts`.`ContactID` = `Events`.`ContactID`
JOIN `Contact_Addresses` ON `Contacts`.`ContactID` = `Contact_Addresses`.`ContactID`
WHERE `Contact_Addresses`.`State` IN ('CT', 'MA')
GROUP BY `Contacts`.`ContactID`
HAVING COUNT(DISTINCT Events.EventID) > 2
You need to group on ContactID to count the number of events per contact, and you need to count distinct EventID values in case a Contact has more than one address in CT or MA.
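The effect of DISTINCT here can be seen on a small example. This is a sketch on Python's built-in sqlite3 module instead of MySQL, with invented rows: contact 2 has only two events but two qualifying addresses, so the join produces four rows for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contacts (ContactID INTEGER PRIMARY KEY);
    CREATE TABLE Events (EventID INTEGER PRIMARY KEY, ContactID INTEGER);
    CREATE TABLE Contact_Addresses (ContactID INTEGER, State TEXT);
    -- invented sample data
    INSERT INTO Contacts VALUES (1), (2), (3);
    INSERT INTO Events VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3), (8, 3);
    INSERT INTO Contact_Addresses VALUES (1, 'CT'), (2, 'CT'), (2, 'MA'), (3, 'NY');
""")

query = """
    SELECT Contacts.ContactID
    FROM Contacts
    JOIN Events ON Contacts.ContactID = Events.ContactID
    JOIN Contact_Addresses ON Contacts.ContactID = Contact_Addresses.ContactID
    WHERE Contact_Addresses.State IN ('CT', 'MA')
    GROUP BY Contacts.ContactID
    HAVING COUNT({count_arg}) > 2
    ORDER BY Contacts.ContactID
"""

# Counting distinct events ignores the row duplication caused by multiple addresses.
distinct_ids = conn.execute(query.format(count_arg="DISTINCT Events.EventID")).fetchall()
# A plain count sees 2 events x 2 addresses = 4 rows for contact 2.
plain_ids = conn.execute(query.format(count_arg="Events.EventID")).fetchall()

print(distinct_ids)
print(plain_ids)
```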
You should be using a GROUP BY clause as well; otherwise COUNT(Events.EventID) is computed once over the whole result set instead of once per contact:
SELECT ...
FROM ...
JOIN ....
WHERE ...
GROUP BY Events.ContactID
You only need to use backticks on your field names if they happen to be a reserved word, fyi.