Hi, I'm having trouble writing a query and would be very glad to receive some advice!
My table is called TagsObjects; it has 5 columns, but the 3 important to the problem are: tob_tag, tob_object and tob_objecttype.
What I need to achieve with the query is the following: given an unknown number of (tob_object, tob_objecttype) pairs, I need to find every tob_tag that all the pairs have in common.
I have tried with this query (the numbers are just as an example):
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE TRUE
AND ("TagsObjects"."tob_object" = 8 AND "TagsObjects"."tob_objecttype" = 1)
AND ("TagsObjects"."tob_object" = 9 AND "TagsObjects"."tob_objecttype" = 1)
GROUP BY "TagsObjects"."tob_tag";
The WHERE TRUE is there because I'm building the query dynamically. This query works for one pair (one AND in the WHERE clause), but when I tried it with two pairs (like the example I posted above) it doesn't return any rows (and the data is there!).
If someone knows what I'm doing wrong, or a way to do this, it will be a BIG HELP!
Using PostgreSQL 9.0.1.
When building your optional clauses, use this pattern:
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE FALSE
OR ("TagsObjects"."tob_object" = 8 AND "TagsObjects"."tob_objecttype" = 1)
OR ("TagsObjects"."tob_object" = 9 AND "TagsObjects"."tob_objecttype" = 1)
GROUP BY "TagsObjects"."tob_tag";
However, if there are no conditions at all, then add OR TRUE to the list, so it becomes
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE FALSE
OR TRUE
GROUP BY "TagsObjects"."tob_tag";
As for this part
I need to know all the tob_tag all the pairs have in common.
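If you only want tob_tags that have both 8/1 and 9/1 (or more combinations), then you need a GROUP BY and HAVING clause. The number in HAVING must equal the number of pairs, and the count is only reliable if each tag is linked to a given pair at most once.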
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE FALSE
OR ("TagsObjects"."tob_object" = 8 AND "TagsObjects"."tob_objecttype" = 1)
OR ("TagsObjects"."tob_object" = 9 AND "TagsObjects"."tob_objecttype" = 1)
GROUP BY "TagsObjects"."tob_tag"
HAVING COUNT(*) = 2;
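The GROUP BY / HAVING approach can be tried end to end. What follows is only a minimal sketch that uses Python's built-in sqlite3 module instead of PostgreSQL; the sample rows and the helper name common_tags are made up for illustration. It assumes each (tob_tag, tob_object, tob_objecttype) combination is stored at most once, which is what allows COUNT(*) to be compared with the number of pairs.

```python
import sqlite3

def common_tags(conn, pairs):
    """Return the tob_tag values shared by every (tob_object, tob_objecttype) pair."""
    if not pairs:
        return []  # no pairs given: nothing to intersect (a choice made for this sketch)
    # One "(tob_object = ? AND tob_objecttype = ?)" group per pair, joined with OR.
    where = " OR ".join(["(tob_object = ? AND tob_objecttype = ?)"] * len(pairs))
    params = [value for pair in pairs for value in pair]
    sql = (
        'SELECT tob_tag FROM "TagsObjects" '
        f"WHERE {where} "
        "GROUP BY tob_tag "
        "HAVING COUNT(*) = ? "  # a tag qualifies only if it matched every pair
        "ORDER BY tob_tag"
    )
    return [row[0] for row in conn.execute(sql, params + [len(pairs)])]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "TagsObjects" '
    "(tob_tag INTEGER, tob_object INTEGER, tob_objecttype INTEGER)"
)
# Made-up sample data: only tag 100 is attached to both (8, 1) and (9, 1).
conn.executemany(
    'INSERT INTO "TagsObjects" VALUES (?, ?, ?)',
    [(100, 8, 1), (100, 9, 1), (200, 8, 1), (300, 9, 1), (300, 8, 2)],
)

shared = common_tags(conn, [(8, 1), (9, 1)])
print(shared)
```

Binding the values as parameters rather than concatenating them into the string keeps the dynamically built query safe, whatever number of pairs is passed in.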
Let's start with what's wrong. You are trying to find a row where both "TagsObjects"."tob_object" = 8 and "TagsObjects"."tob_object" = 9. The "TagsObjects"."tob_object" cannot be both at the same time, so no rows can be returned.
What should you do then?
From your specification I gather that there are several pairs of ("TagsObjects"."tob_object", "TagsObjects"."tob_objecttype") where neither field is a constant. You want to create a union of all rows that are returned for each pair.
WITH matchingTagsObjects AS (
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE ("TagsObjects"."tob_object" = 8 AND "TagsObjects"."tob_objecttype" = 1)
UNION ALL
SELECT "TagsObjects"."tob_tag"
FROM "TagsObjects"
WHERE ("TagsObjects"."tob_object" = 9 AND "TagsObjects"."tob_objecttype" = 1)
)
SELECT tob_tag
FROM matchingTagsObjects
GROUP BY tob_tag;
The named subquery matchingTagsObjects lists all tob_tags that are found for pairs (8,1) and (9,1). The actual tags are selected in the main query, and distinct tob_tags are selected using the GROUP BY clause, as in your solution. I used UNION ALL because the grouping is done in the main query and I didn't see any reason to prune duplicate rows in the subquery at this point. If you do want that, leave out the ALL from UNION ALL.
You can also include the subquery directly in the from part instead of using a named subquery.
Alternatively, you can use OR conditions in the where clause, as in WHERE (matches pair A) OR (matches pair B). A co-worker of mine would go ballistic if he saw that used: when an OR is needed for matching in this kind of scenario, it suggests that something might need to change in the actual model.
I am getting different number of results when I have the script like the following:
select count(distinct(t1.ticketid)),t1.TicketStatus from ticket as t1
inner join Timepoint as t2 on t1.TicketID=t2.ticketid
where
t2.BuilderAnalystID=10 and t1.SubmissionDT >='04-01-2018' AND
(t1.TicketBuildStatusID<>12 OR
t1.TicketBuildStatusID<>11 OR
t1.TicketBuildStatusID<>10
)
And when I use it like this:
select count(distinct(t1.ticketid)),t1.TicketStatus from ticket as t1
inner join Timepoint as t2 on t1.TicketID=t2.ticketid
where
t2.BuilderAnalystID=10 and t1.SubmissionDT >='04-01-2018' AND
t1.TicketBuildStatusID<>12 AND
t1.TicketBuildStatusID<>11 AND
t1.TicketBuildStatusID<>10
Can someone tell me why there is a difference? To me the logic is the same!
Thanks,
In your example, it won't matter because you have all AND clauses. That said, you need to be aware of precedence (i.e. order of operations), where NOT comes before AND, AND comes before OR, and so on.
So just like 3 + 3 x 0 means 3 + (3 x 0), A or B and C means A or (B and C), even if that's not what you meant.
So in cases where you have mixed AND and OR clauses, it matters a lot.
Consider this example:
select *
from A, B
where A.id = B.id and A.family_code = 'ABC' or A.family_code = 'DEF'
It's horrible code, I admit, but for illustrative purposes, bear with me.
You may have meant this:
select *
from A, B
where A.id = B.id and (A.family_code = 'ABC' or A.family_code = 'DEF')
but you said this:
select *
from A, B
where (A.id = B.id and A.family_code = 'ABC') or A.family_code = 'DEF'
Which in the construct above completely blows away your join, resulting in a cartesian product for all cases where the family code is DEF.
So bottom line: when you mix clauses (AND, OR, NOT), it's best to use parentheses to be explicit about what you mean, even when it's not necessary.
Food for thought.
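The precedence rule can be checked directly. This is a small sketch that uses Python's built-in sqlite3 module as a stand-in engine (an assumption made for convenience; AND binds tighter than OR in standard SQL generally), with 1 and 0 standing for true and false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite represents TRUE as 1 and FALSE as 0.
# A OR B AND C, with A = true, B = false, C = false:
unparenthesized = conn.execute("SELECT 1 OR 0 AND 0").fetchone()[0]
and_first = conn.execute("SELECT 1 OR (0 AND 0)").fetchone()[0]
or_first = conn.execute("SELECT (1 OR 0) AND 0").fetchone()[0]

# The unparenthesized form agrees with the "AND first" reading.
print(unparenthesized, and_first, or_first)
```

The first two values are equal and the third differs, which is why explicit parentheses are worth writing whenever AND and OR are mixed.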
-- EDIT --
The question was changed after I wrote this so that the queries were NOT the same (ands were changed to ors).
Hopefully my explanation still helps.
After the edit to your question there will now be a difference.
t2.BuilderAnalystID=10 and t1.SubmissionDT >='04-01-2018' AND
(t1.TicketBuildStatusID<>12 OR
t1.TicketBuildStatusID<>11 OR
t1.TicketBuildStatusID<>10
)
This condition will still return rows where t1.TicketBuildStatusID is 10, 11 or 12. It states that the value should not be 10 (which allows 11 and 12), or not be 11 (which allows 10 and 12), or not be 12 (which allows 10 and 11).
Yes, those queries will produce different results. In fact, the first query will return every value of TicketBuildStatusID unless it has a value of NULL.
When TicketBuildStatusID has a value of 12 it doesn't have a value of 11 or 10, so the expression (t1.TicketBuildStatusID<>12 OR t1.TicketBuildStatusID<>11 OR t1.TicketBuildStatusID<>10) is true. If it has a value of 11, then the same applies again, and likewise for every other possible value apart from NULL (as {expression}<>NULL = NULL, which is not true).
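To see the effect on actual rows, here is a minimal sketch using Python's built-in sqlite3 module. The ticket table is cut down to two columns and the five rows are invented for illustration; the helper ticket_ids is likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket (ticketid INTEGER, TicketBuildStatusID INTEGER)")
# Invented sample rows: statuses 10, 11, 12, 13 and one NULL.
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?)",
    [(1, 10), (2, 11), (3, 12), (4, 13), (5, None)],
)

def ticket_ids(condition):
    """Return the ticketids whose row satisfies the given WHERE condition."""
    sql = f"SELECT ticketid FROM ticket WHERE {condition} ORDER BY ticketid"
    return [row[0] for row in conn.execute(sql)]

# Inequalities joined with OR: every non-NULL status passes.
with_or = ticket_ids(
    "TicketBuildStatusID <> 12 OR TicketBuildStatusID <> 11 OR TicketBuildStatusID <> 10"
)
# Inequalities joined with AND: only statuses outside 10, 11, 12 pass.
with_and = ticket_ids(
    "TicketBuildStatusID <> 12 AND TicketBuildStatusID <> 11 AND TicketBuildStatusID <> 10"
)
print(with_or, with_and)
```

The row with a NULL status is dropped by both versions, because each comparison against NULL yields NULL rather than true.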
When you do this:
AND
(t1.TicketBuildStatusID<>12 OR
t1.TicketBuildStatusID<>11 OR
t1.TicketBuildStatusID<>10)
you are effectively not filtering at all, because any one condition that evaluates to true makes the whole parenthesised expression true, i.e.
true AND (true or false or false) = true
When you do this, all conditions have to match, i.e. the status must not be 12, 11 or 10:
AND
t1.TicketBuildStatusID<>12 AND
t1.TicketBuildStatusID<>11 AND
t1.TicketBuildStatusID<>10
OR isn't the logic that you want. If x = 12, then it is not 11, so every value satisfies x <> 12 or x <> 11.
So, just simplify the logic and use not in:
select count(distinct t1.ticketid), t1.TicketStatus
from ticket t1 inner join
Timepoint t2
on t1.TicketID = t2.ticketid
where t2.BuilderAnalystID = 10 and
t1.SubmissionDT >= '2018-04-01' and
t1.TicketBuildStatusID not in (12, 11, 10)
Notes:
distinct is not a function, so there is no need to place the following expression in parentheses.
Use standard date formats. Either 'YYYYMMDD' or 'YYYY-MM-DD'.
I have the following table. I'm trying to get the rows that meet a specific condition.
The table looks as follows:
account|transactiontypecode|
-------|-------------------|
1000058| 8|
1000067| 2|
1000067| 8|
The query should retrieve only account 1000058, because its only transactiontypecode is 8. The other account has code 8 too, but it also has another transactiontypecode that does not apply.
So the requirement is to get the accounts that have specific transaction codes, and to exclude accounts that have the required code but also have unwanted codes.
This was my attempt at the problem, among others, but I think another pair of eyes may point me in a better direction.
with cte1 as (
select
gp.account,
case
when gp.transactiontypecode in (2,8,17) then TRUE
else false
end as txcheck
from
gp.t2001 gp
group by
1, 2)
select
account,
txcheck
from
cte1
where
txcheck is true and txcheck is not false;
If anyone can help me achieve above requirement, would be great!
Just use not exists if you want the entire rows:
select t.*
from gp.t2001 t
where t.transactiontypecode = 8 and
not exists (select 1
from gp.t2001 t2
where t2.account = t.account and
t2.transactiontypecode <> 8
);
Or aggregation if you just want the account:
select t.account
from gp.t2001 t
group by t.account
having min(transactiontypecode) = max(transactiontypecode) and
min(transactiontypecode) = 8;
You can use aggregation in a HAVING clause checking the count of codes to be exactly one and the code is 8 -- wrap it in e.g. max(), if there's only one value the maximum is that one value:
SELECT gp.account
FROM gp.t2001 gp
GROUP BY gp.account
HAVING count(gp.transactiontypecode) = 1
AND max(gp.transactiontypecode) = 8;
Or, if it is allowed that the code of 8 can occur multiple times for an account and you want all of them not having any other code, change it using conditional aggregation to count the codes of 8 and compare it to the overall count of codes. If they match they're all 8:
...
HAVING count(CASE
WHEN gp.transactiontypecode = 8 THEN
1
END) = count(*);
Another option, if the code may occur more than once is to use NOT EXISTS to check for other rows with another code:
SELECT DISTINCT
gp1.account
FROM gp.t2001 gp1
WHERE gp1.transactiontypecode = 8
AND NOT EXISTS (SELECT *
FROM gp.t2001 gp2
WHERE gp2.account = gp1.account
AND gp2.transactiontypecode <> 8);
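The conditional-aggregation and NOT EXISTS variants can be compared on the question's sample data. This is a sketch under a few assumptions: it runs on Python's built-in sqlite3 module, the table is named t2001 without the gp schema, and account 1000099 (two rows, both with code 8) is an invented extra case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2001 (account INTEGER, transactiontypecode INTEGER)")
conn.executemany(
    "INSERT INTO t2001 VALUES (?, ?)",
    [
        (1000058, 8),                # only code 8: wanted
        (1000067, 2), (1000067, 8),  # has an unwanted code: excluded
        (1000099, 8), (1000099, 8),  # invented account: code 8 twice, still wanted
    ],
)

# Conditional aggregation: the number of code-8 rows must equal the number of all rows.
aggregated = [row[0] for row in conn.execute("""
    SELECT account
    FROM t2001
    GROUP BY account
    HAVING count(CASE WHEN transactiontypecode = 8 THEN 1 END) = count(*)
    ORDER BY account
""")]

# NOT EXISTS: a code-8 row for which no row with another code exists.
not_exists = [row[0] for row in conn.execute("""
    SELECT DISTINCT t1.account
    FROM t2001 t1
    WHERE t1.transactiontypecode = 8
      AND NOT EXISTS (SELECT 1
                      FROM t2001 t2
                      WHERE t2.account = t1.account
                        AND t2.transactiontypecode <> 8)
    ORDER BY t1.account
""")]
print(aggregated, not_exists)
```

Both queries agree on this data; the stricter count(...) = 1 version would drop the invented account 1000099 because its code 8 appears twice.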
Try something like this:
SELECT account
FROM [Table]
GROUP BY account
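HAVING
COUNT(transactiontypecode) = 1 AND
MAX(transactiontypecode) = 8
The COUNT inside the HAVING clause gives you the accounts with only one transaction type code row. Because transactiontypecode is not in the GROUP BY, it has to be wrapped in an aggregate such as MAX before it can be compared in HAVING. Then you can apply any other condition.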
I want the following SQL statement to return different results depending on whether it finds a match for the specific (child) string or has to fall back to searching the whole parent group. For example, the table contains:
column 1, column 2
a - 1
a - 2
a - 3
b - 5
b - 7
b - 1
If I search for 1 and it can be found, it should display:
a - 1
b - 1
The problem is that both the parent group and the child group appear in the WHERE clause.
I have tried using CASE, and also simulating an IF with ANDs and ORs, but it didn't work:
select 1,
aapv.aapv_keyext1,
aapv.aapv_area,
aapv.aapv_valuecharmax,
aapv.aapv_valuechardefault,
aapv.aapv_valuecharmin, aap.aap_ident
from a_parameter_value aapv,
a_parameter aap
where aap.aap_ident in (string1,string2,string3)
and aap.aap_ref = aapv.aap_ref
and aap.aap_idento = string4
and ((aapv.Aapv_Keyext1 = 'LaD1' --child clause
and aapv.aapv_keyext1 is not null)
or aapv.Aapv_Area = 'LSDe' --parent clause
and aapv.Aapv_Area is null)
I expect that if the aapv_keyext1 value finds any results then the aapv_area is not used at all. Instead, with the code above only the child clause is ever used, or both clauses are used if I remove the IS NULL condition.
Okay, you need to provide more information for us to give you a real answer, but I wanted to point out that this section has some logic problems:
and ((aapv.Aapv_Keyext1 = 'LaD1' --child clause
and aapv.aapv_keyext1 is not null)
or aapv.Aapv_Area = 'LSDe' --parent clause
and aapv.Aapv_Area is null)
The first part says aapv_keyext1 = 'LaD1' AND aapv_keyext1 IS NOT NULL; whenever the first condition is true the second is true as well, so the IS NOT NULL check is redundant. The second part says aapv_area = 'LSDe' AND aapv_area IS NULL, which can never be true. So this whole section is equivalent to:
and (aapv.aapv_keyext1 = 'LaD1')
Which probably isn't what you want. You say you want "if the aapv_keyext1 value finds any results then the aapv_area is not used at all". I suspect what you mean is "if any results exist for aapv_keyext1 in any rows then don't use aapv_area", which is more complicated: you need a subquery (or analytic/aggregate functions) to look at what other rows are doing.
select 1,
aapv.aapv_keyext1,
aapv.aapv_area,
aapv.aapv_valuecharmax,
aapv.aapv_valuechardefault,
aapv.aapv_valuecharmin, aap.aap_ident
from a_parameter_value aapv,
a_parameter aap
where aap.aap_ident in (string1,string2,string3)
and aap.aap_ref = aapv.aap_ref
and aap.aap_idento = string4
and (-- prefer keyext1
aapv.Aapv_Keyext1 = 'LaD1'
OR
-- if keyext1 doesn't find results...
(NOT EXISTS (select 1 from a_parameter_value aapv2
-- assumes aap_ref is what links a value row to its parameter, as in the outer join
where aapv2.aap_ref = aap.aap_ref
and aapv2.Aapv_Keyext1 = 'LaD1')
AND
-- ... use aapv_area
aapv.Aapv_Area = 'LSDe')
);
You can also do this kind of conditional logic with CASE expressions, but you're still going to need a subquery or something similar if you want your logic to depend on the values in rows other than the one currently being looked at.
Let me know if I've misunderstood your question and I'll try to update with a better answer.
I'm having a problem using the AND operator in SQL: it returns an empty result set.
I have the following table structure:
idcompany, cloudid, cloudkey, idsearchfield, type, userValue
Now I execute the following statement:
SELECT *
FROM filter_view
WHERE
(idsearchfield = 4 and compareResearch(userValue,200) = true)
AND (idsearchfield = 6 and compareResearch(userValue,1) = true)
compareResearch is just a function that casts userValue, compares it to the other value, and returns true if userValue is equal or greater. userValue is actually stored as a string (that's a decision made 6 years ago).
Okay, I get an empty result set because the two criteria in parentheses are combined with AND, and one row can only have one idsearchfield, so one of the criteria will never match.
How do I get around this? I NEED the AND Comparison, but it won't work out this way.
I hope my problem is obvious :-)
If you've recognised that both conditions can't ever both be true, in what way can the AND comparison be the correct one?
select *
from filter_view
where (idsearchfield = 4 and compareResearch(userValue,200) = true)
OR (idsearchfield = 6 and compareResearch(userValue,1) = true)
This will return 2 rows (or more). Or are you looking for some way to correlate these two rows so that they appear as a single row?
Okay, so making a tonne of assumptions, because you haven't included enough information in your question.
filter_view returns a number of columns, one of which is some form of record identifier (let's call that ID). It also includes the aforementioned idsearchfield and userValue columns.
What you actually want to find is those id values, for which one row of filter_view has idsearchfield = 4 and compareResearch(userValue,200) = true and another row of filter_view has idsearchfield = 6 and compareResearch(userValue,1) = true
The general term for this is "relational division". In this simple case, and assuming that id/idsearchfield are unique in this view, we can answer it with:
select id,COUNT(*)
from filter_view
where (idsearchfield = 4 and compareResearch(userValue,200) = true)
OR (idsearchfield = 6 and compareResearch(userValue,1) = true)
group by id
having COUNT(*) = 2
If this doesn't answer your question, you're going to have to add more info to your question, including sample data, and expected results.
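For what it's worth, the relational-division query can be exercised as follows. Everything about the setup is an assumption: the sketch uses Python's built-in sqlite3 module, models filter_view as a plain table with an id column, registers a stand-in for compareResearch (the real function's body is not shown in the question), and uses invented rows.

```python
import sqlite3

def compare_research(user_value, threshold):
    """Stand-in for compareResearch: cast the stored string, test value >= threshold."""
    return int(float(user_value) >= threshold)

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL can call it with two arguments.
conn.create_function("compareResearch", 2, compare_research)

conn.execute("CREATE TABLE filter_view (id INTEGER, idsearchfield INTEGER, userValue TEXT)")
conn.executemany(
    "INSERT INTO filter_view VALUES (?, ?, ?)",
    [
        (1, 4, "250"), (1, 6, "3"),  # id 1 satisfies both criteria
        (2, 4, "250"), (2, 6, "0"),  # id 2 fails the idsearchfield = 6 test
        (3, 4, "100"), (3, 6, "5"),  # id 3 fails the idsearchfield = 4 test
    ],
)

# Keep the rows matching either criterion, then keep ids that matched both.
matching_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM filter_view
    WHERE (idsearchfield = 4 AND compareResearch(userValue, 200) = 1)
       OR (idsearchfield = 6 AND compareResearch(userValue, 1) = 1)
    GROUP BY id
    HAVING COUNT(*) = 2
    ORDER BY id
""")]
print(matching_ids)
```

As in the answer, the HAVING COUNT(*) = 2 test relies on each id having at most one row per idsearchfield.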
I'm trying to retrieve the "Best" possible entry from an SQL table.
Consider a table containing tv shows:
id, title, episode, is_hidef, is_verified
eg:
id title ep hidef verified
1 The Simpsons 1 True False
2 The Simpsons 1 True True
3 The Simpsons 1 True True
4 The Simpsons 2 False False
5 The Simpsons 2 True False
There may be duplicate rows for a single title and episode, which may or may not have different values for the boolean fields. There may be more columns containing additional info, but that's unimportant.
I want a result set that gives me the best row (so is_hidef and is_verified are both "true" where possible) for each episode. For rows considered "equal" I want the most recent row (natural ordering, or order by an arbitrary datetime column).
3 The Simpsons 1 True True
5 The Simpsons 2 True False
In the past I would have used the following query:
SELECT * FROM shows WHERE title='The Simpsons' GROUP BY episode ORDER BY is_hidef, is_verified
This works under MySQL and SQLite, but goes against the SQL spec (GROUP BY requiring aggregates, etc.). I'm not really interested in hearing again why MySQL is so bad for allowing this, but I'm very interested in finding an alternative solution that will work on other engines too (bonus points if you can give me the django ORM code for it).
Thanks =)
In some way similar to Andomar's but this one really works.
select C.*
FROM
(
select min(ID) minid
from (
select distinct title, ep, max(hidef*1 + verified*1) ord
from tbl
group by title, ep) a
inner join tbl b on b.title=a.title and b.ep=a.ep and b.hidef*1 + b.verified*1 = a.ord
group by a.title, a.ep, a.ord
) D inner join tbl C on D.minid = C.id
The first level tiebreak converts bits (SQL Server) or MySQL boolean to an integer value using *1, and the columns are added to produce the "best" value. You can give them weights, e.g. if hidef > verified, then use hidef*2 + verified*1 which can produce 3,2,1 or 0.
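The second level looks only at the rows that reach the "best" value and extracts the minimum ID (or some other tie-break column; use max(ID) instead if a higher ID means a more recent row, which is what the expected output in the question shows). This is essential to reduce a multi-match result set to just one record.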
In this particular case (table schema), the outer select uses the direct key to retrieve the matched records.
This is basically a form of the groupwise-maximum-with-ties problem. I don't think there is a SQL standard compliant solution. A solution like this would perform nicely:
SELECT s2.id
, s2.title
, s2.episode
, s2.is_hidef
, s2.is_verified
FROM (
select distinct title
, episode
from shows
where title = 'The Simpsons'
) s1
JOIN shows s2
ON s2.id =
(
select id
from shows s3
where s3.title = s1.title
and s3.episode = s1.episode
order by
s3.is_hidef DESC
, s3.is_verified DESC
limit 1
)
But given the cost of readability, I would stick with your original query.
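The correlated-subquery approach can be run against the question's sample rows. This is a sketch with a few assumptions: it uses Python's built-in sqlite3 module, stores the booleans as 1/0, and adds s3.id DESC as a final sort key (not part of the query above) so that ties go to the most recent row, treating a higher id as more recent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT, episode INTEGER, "
    "is_hidef INTEGER, is_verified INTEGER)"
)
# The sample rows from the question, with True/False stored as 1/0.
conn.executemany(
    "INSERT INTO shows VALUES (?, ?, ?, ?, ?)",
    [
        (1, "The Simpsons", 1, 1, 0),
        (2, "The Simpsons", 1, 1, 1),
        (3, "The Simpsons", 1, 1, 1),
        (4, "The Simpsons", 2, 0, 0),
        (5, "The Simpsons", 2, 1, 0),
    ],
)

# For each (title, episode), pick the single best row id via a correlated subquery.
best_ids = [row[0] for row in conn.execute("""
    SELECT s2.id
    FROM (SELECT DISTINCT title, episode
          FROM shows
          WHERE title = 'The Simpsons') s1
    JOIN shows s2
      ON s2.id = (SELECT s3.id
                  FROM shows s3
                  WHERE s3.title = s1.title
                    AND s3.episode = s1.episode
                  ORDER BY s3.is_hidef DESC,
                           s3.is_verified DESC,
                           s3.id DESC  -- added tie-break: newest row wins
                  LIMIT 1)
    ORDER BY s2.id
""")]
print(best_ids)
```

With the extra sort key the result matches the rows the question lists as the desired output, ids 3 and 5.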