Query not working as expected - sql

Goal
Select distinct ids from blog_news where
active = 1
title is not empty
has at least one picture unless picture is logo, or at least one video
The statement so far
select distinct n.id from blog_news n
left join blog_pics p ON n.id = p.blogid and active = '1' and trim(n.title) != ''
left join blog_vdos v ON n.id = v.blogid
where (p.islogo = '0' and p.id is not null) OR (v.id is not null)
order by `newsdate` desc, `createdate` desc
The issue
selects blog_news ids that have pictures, unless they're logos [correct]
selects blog_news ids that have both videos and pictures [correct]
does not select blog_news ids that have only videos [wrong]

How about this:
SELECT DISTINCT n.id
FROM blog_news n
WHERE n.active = '1'
AND trim(n.title) != ''
AND (EXISTS (SELECT 1
FROM blog_pics p
WHERE p.blogid = n.id
AND p.islogo = 0)
OR EXISTS (SELECT 1
FROM blog_vdos v
WHERE v.blogid = n.id)
)
ORDER BY n.newsdate desc, n.createdate desc
Where you are just interested in the existence (or not) of child rows then it is often clearer and easier to use EXISTS.

I can't see any problem in your query.
I expect active is column in blog_news table, you should call it n.active. If this column is in blog_pics table, then this is the problem.
I would add the condition (n.active, n.title) to WHERE, as it's not related to left join (blog_pics) - but that's just for better readability, the result would be the same.
You can write the query using sub selects as well:
SELECT n.id FROM blog_news n
WHERE n.active = 1 AND TRIM(n.title) != '' AND n.id IN (
SELECT DISTINCT p.blogid FROM blog_pics p WHERE p.islogo = 0 UNION
SELECT DISTINCT v.blogid FROM blog_vdos
);

Related

Match specific or default value on multiple columns

raw_data :
name
account_id
type
element_id
cost
First
1
type1
element1
0.1
Second
2
type2
element2
0.2
First
11
type2
element11
0.11
components:
name
account_id (default = -1)
type (default = null)
element_id (default = null)
cost
First
-1
null
null
0.1
Second
2
type2
null
0.2
First
11
type2
element11
0.11
I seek to check whether the cost logged in raw_data is the same as that in components for a given combination. They need to be joined on column name.
Remaining fields in raw_data are always populated. In components, any row can be a combination of specific values and the default values.
I seek to match the columns from raw_data to components wherever I find a match and otherwise need to use the default value to get the cost.
I failed with left join and union and IN.
E.g. For the first row in raw_data table with name "First", I do not have account_id = 1 in the components table. So I need to go with account_id = -1.
Match as many specific values as found in components, Otherwise resort to default values.
I think one way you could do this is something like:
SELECT *
FROM
(
SELECT rd.name, rd.account_id, rd.type, rd.element_id, rd.cost raw_cost, c.account_id component_account_id, c.type component_type, c.element_id component_element_id, c.cost component_cost,
row_number() OVER (PARTITION BY rd.name, rd.account_id, rd.type, rd.element_id
ORDER BY
CASE WHEN c.account_id <> -1 THEN 1 END
+ CASE WHEN c.type IS NOT NULL THEN 1 END
+ CASE WHEN c.element_id IS NOT NULL THEN 1 END DESC) rd
FROM raw_data rd LEFT OUTER JOIN components c
ON rd.name = c.name
AND (rd.account_id = c.account_id or c.account_id = -1)
AND (rd.type = c.type OR c.type IS NULL)
AND (rd.element_id = c.element_id OR c.element_id IS NULL)
) iq
WHERE rd = 1
The idea here is to match on an actual match or the default. Then the row_number window function is used to prioritize the matches based on a count of how many columns actually matched (you said you don't care about ties, so this doesn't handle that). The outer query throws away the matches that aren't the best.
With the sample data above, this could be an inner join, but I left it as a left join since that's what was mentioned.
Here's a fiddle of it working. Hopefully this is close to what you want.
If the ratio of records in the tables is not 1 to 1, then an unambiguous sample cannot be made.
Also, if the selection condition is "Coincidence of at least one parameter", then it will also not work to make an unambiguous selection.
Below is an example that will bring you closer to solving the problem. It selects data that matches one of the selection criteria, however there may be duplicates!
Try this variant and report the result, possibly on a larger variant of samples and records
Maybe this will get you closer to the solution.
select rd.name, rd.account_id, rd.type, rd.element_id, rd.cost, c.cost
from raw_data rd
left join components c on rd.name = c.name
where (c.account_id = rd.account_id or c.account_id = -1) OR
(c.type = rd.type OR c.type is null) OR
(c.element_id = rd.element_id OR c.element_id = null)
You can build the priority of checking values through union
select rd.name, rd.account_id, rd.type, rd.element_id, rd.cost, c.cost
from raw_data rd
left join components c on rd.name = c.name
where c.account_id = rd.account_id and
c.type = rd.type
c.element_id = rd.element_id
union
select rd.name, rd.account_id, rd.type, rd.element_id, rd.cost, c.cost
from raw_data rd
left join components c on rd.name = c.name
where c.account_id = rd.account_id and
c.type = rd.type
union
select rd.name, rd.account_id, rd.type, rd.element_id, rd.cost, c.cost
from raw_data rd
join components c on rd.name = c.name
where c.account_id = rd.account_id
etc
Without seeing all the problems, all the data options in the tables, it is difficult to give the right solution, which may not be...

Selecting ONLY Duplicates from a joined tables query

I have the following query that I'm trying to join two tables matching their ID so I can get the duplicated values in "c.code". I've tried a lot of queries but nothing works. I have a 500k rows in my database and with this query I only get 5k back, which is not right. Im positive it's at least 200K. I also tried to use Excel but it's too much for it to handle.
Any ideas?
Thanks in advance, everyone.
SELECT c.code, c.name as SCT_Name, t.name as SYNONYM_Name, count(c.code)
FROM database.Terms as t
join database.dbo.Concepts as c on c.ConceptId = t.ConceptId
where t.TermTypeCode = 'SYNONYM' and t.ConceptTypeCode = 'NAME_Code' and c.retired = '0'
Group by c.code, c.name, t.name
HAVING COUNT(c.code) > = 1
Order by c.code
with data as (
select c.code, c.name as SCT_Name, t.name as SYNONYM_Name
from database.Terms as t inner join database.dbo.Concepts as c
on c.ConceptId = t.ConceptId
where
t.TermTypeCode = 'SYNONYM'
and t.ConceptTypeCode = 'NAME_Code'
and c.retired = '0'
)
select *
--, (select count(*) from data as d2 where d2.code = data.code) as code_count
--, count(*) over (partition by code) as code_count
from data
where code in (select code from data group by code having count(*) > 1)
order by code
If you want just duplicates of c.code, your Group By is wrong (and so is your Having clause). Try this:
SELECT c.code
FROM database.Terms as t
join database.dbo.Concepts as c on c.ConceptId = t.ConceptId
where t.TermTypeCode = 'SYNONYM' and t.ConceptTypeCode = 'NAME_Code' and c.retired = '0'
Group by c.code
HAVING COUNT(c.code) > 1
This will return all rows where you have more than one c.code value.
You need to use INTERSECT instead of JOIN. Basically you perform the select on the first table then intersect with the second table. The result is the duplicate rows.
Only select the id column, though, otherwise the intersect won't work as expected.

Display Y/N column if record found in detail table

I'm trying to create a query so that I can have a column show Y/N if a particular item was ordered for a group of orders. The item I'm looking for would be OLI.id = '538'.
So my results would be:
Order#, Customer#, FreightPaid
12345, 00112233, Y
12346, 00112233, N
I cannot figure out if I need to use a subquery or the where exists function ?
Here's my current query:
SELECT distinct
OrderID,
Accountuid as Customerno
FROM [SMILEWEB_live].[dbo].[OrderLog] OL
inner join Orderlog_item OLI on OLI.orderlogkey = OL.[key]
inner join Account A on A.uid = OL.Accountuid
where A.GroupId = 'X9955'
and OL.CreateDate >= GETDATE() - 60
I would suggest an exists clause instead of a join:
select ol.OrderID, ol.Accountuid as Customerno,
(case when exists (select 1
from Orderlog_item OLI join
Account A
on A.uid = OL.Accountuid
where OLI.orderlogkey = OL.[key] and A.GroupId = 'X9955'
)
then 1 else 0
end) as flag
from [SMILEWEB_live].[dbo].[OrderLog] OL
where OL.CreateDate >= GETDATE() - 60;
This prevents a couple of problems. First, duplicate rows which are caused when there are multiple matching rows (and select distinct add unnecessary overhead). Second, missing rows, which happen when you use inner join instead of an outer join.

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...

MySql scoping problem with correlated subqueries

I'm having this Mysql query, It works:
SELECT
nom
,prenom
,(SELECT GROUP_CONCAT(category_en) FROM
(SELECT DISTINCT category_en FROM categories c WHERE id IN
(SELECT DISTINCT category_id FROM m3allems_to_categories m2c WHERE m3allem_id = 37)
) cS
) categories
,(SELECT GROUP_CONCAT(area_en) FROM
(SELECT DISTINCT area_en FROM areas c WHERE id IN
(SELECT DISTINCT area_id FROM m3allems_to_areas m2a WHERE m3allem_id = 37)
) aSq
) areas
FROM m3allems m
WHERE m.id = 37
The result is:
nom prenom categories areas
Man Multi Carpentry,Paint,Walls Beirut,Baalbak,Saida
It works correclty, but only when i hardcode into the query the id that I want (37).
I want it to work for all entries in the m3allem table, so I try this:
SELECT
nom
,prenom
,(SELECT GROUP_CONCAT(category_en) FROM
(SELECT DISTINCT category_en FROM categories c WHERE id IN
(SELECT DISTINCT category_id FROM m3allems_to_categories m2c WHERE m3allem_id = m.id)
) cS
) categories
,(SELECT GROUP_CONCAT(area_en) FROM
(SELECT DISTINCT area_en FROM areas c WHERE id IN
(SELECT DISTINCT area_id FROM m3allems_to_areas m2a WHERE m3allem_id = m.id)
) aSq
) areas
FROM m3allems m
And I get an error:
Unknown column 'm.id' in 'where
clause'
Why?
From the MySql manual:
13.2.8.7. Correlated Subqueries
[...]
Scoping rule: MySQL evaluates from inside to outside.
So... do this not work when the subquery is in a SELECT section? I did not read anything about that.
Does anyone know? What should I do? It took me a long time to build this query... I know it's a monster query but it gets what I want in a single query, and I am so close to getting it to work!
Can anyone help?
You can only correlate one level deep.
Use:
SELECT m.nom,
m.prenom,
x.categories,
y.areas
FROM m3allens m
LEFT JOIN (SELECT m2c.m3allem_id,
GROUP_CONCAT(DISTINCT c.category_en) AS categories
FROM CATEGORIES c
JOIN m3allems_to_categories m2c ON m2c.category_id = c.id
GROUP BY m2c.m3allem_id) x ON x.m3allem_id = m.id
LEFT JOIN (SELECT m2a.m3allem_id,
GROUP_CONCAT(DISTINCT a.area_en) AS areas
FROM AREAS a
JOIN m3allems_to_areas m2a ON m2a.area_id = a.id
GROUP BY m2a.m3allem_id) y ON y.m3allem_id = m.id
WHERE m.id = ?
The reason for the error is that in the subquery m is not defined. It is defined later in the outer query.