Something like GROUP BY HAVING ALL IN [closed] - sql

I want to select a ProductConfig that has exactly the given variants. Because this is a many-to-many relationship, I have an association table, and I have been trying to GROUP BY productconfig_id on it so that I can test the other column.
The problem is that I need an operator inside the HAVING clause that tests for exact equality with a set of values. Something like HAVING variant_id = (1, 2, 3, 99).
For now I have the following query, which has some problems:
SELECT productconfig_id
FROM association_productconfig_elementvariant
GROUP BY productconfig_id
HAVING variant_id IN (1, 2, 3, 99);
This will match if productconfig_id has variant_id equal to ANY subset of {1, 2, 3, 99}, like {1, 2} or {1, 3}, but I only want it to match the exact set {1, 2, 3, 99}.
I also have the opposite problem. If productconfig_id has variant_id values {1, 2, 50}, it will also match, because the first two are in the list even though the last is not.
Basically, I want to test equality between a column and a set of values. The second problem would be solved by something like HAVING ALL IN.

This is probably closer to what you need. Here, I am doing both a sum() over a CASE that flags the variant_ids in question and a count(*) of every variant per configuration. The sum() makes sure that the records that DO qualify add up to 4, and the count(*) makes sure the configuration has no other variants.
So a product with variants (1, 2, 3, 5, 12, 99, 102, 150) would have count(*) = 8, but the specific match would be 4, based on the variants in question.
Now, if you can ignore the overall count of 8, just remove the AND portion below; you still know that the primary 4 in consideration are accounted for.
SELECT
    productconfig_id
FROM
    association_productconfig_elementvariant
GROUP BY
    productconfig_id
HAVING
    sum( case when variant_id in ( 1, 2, 3, 99 )
              then 1 else 0 end ) = 4
    AND count(*) = 4
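A quick way to sanity-check this HAVING logic is to run it against a few made-up configurations. The sketch below uses Python's built-in sqlite3 module rather than the asker's database; the sample rows and the function name exact_match are hypothetical, and it assumes each (productconfig_id, variant_id) pair appears at most once.

```python
import sqlite3


def exact_match(rows, wanted):
    """Return productconfig_ids whose variant set equals `wanted` exactly."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE association_productconfig_elementvariant "
        "(productconfig_id INTEGER, variant_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO association_productconfig_elementvariant VALUES (?, ?)", rows
    )
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT productconfig_id
        FROM association_productconfig_elementvariant
        GROUP BY productconfig_id
        HAVING SUM(CASE WHEN variant_id IN ({placeholders}) THEN 1 ELSE 0 END) = ?
           AND COUNT(*) = ?
        ORDER BY productconfig_id
    """
    # First the IN list, then the two "= n" comparisons.
    params = list(wanted) + [len(wanted), len(wanted)]
    result = [r[0] for r in conn.execute(sql, params)]
    conn.close()
    return result


# Hypothetical sample data: (productconfig_id, variant_id)
sample = (
    [(10, v) for v in (1, 2, 3, 99)]         # exact match
    + [(11, v) for v in (1, 2)]              # subset only
    + [(12, v) for v in (1, 2, 50)]          # partial overlap
    + [(13, v) for v in (1, 2, 3, 99, 100)]  # superset
)

print(exact_match(sample, [1, 2, 3, 99]))
```

Only configuration 10 should come back: the subset fails the sum() test, and the superset fails the count(*) test.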

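Could you try this:
SELECT productconfig_id
FROM (
    SELECT productconfig_id, count(1) AS _count
    FROM association_productconfig_elementvariant
    -- filter rows before grouping; variant_id cannot be tested in HAVING without an aggregate
    WHERE variant_id IN (1, 2, 3, 99)
    GROUP BY productconfig_id
) AS s
WHERE _count = 4;
Basically, only a productconfig that has all four variants reaches _count = 4. Note that a productconfig with additional variants beyond these four still matches.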

Related

Checking for occurrence of value in postgres - NodeJS

Within my database my data can look one of two ways
1 -
hh_match_count: 5,
hh_total_fhc_0: 6,
hh_total_fhc_1: 5,
hh_total_fhc_2: 3,
hh_total_fhc_3: 2,
hh_total_fhc_4: 4
2 -
hh_match_count: 3,
hh_total_fhc_0: 6,
hh_total_fhc_1: 5,
hh_total_fhc_2: 3,
hh_total_fhc_3: null,
hh_total_fhc_4: null
What I want to do is calculate the number of times a value is >= 1 (will want to expand this to >= 2, >= 3 etc) from each of hh_total_fhc_0, hh_total_fhc_1, hh_total_fhc_2, hh_total_fhc_3, hh_total_fhc_4 and then divide that by hh_match_count. So basically getting the % of occurrences.
What query should I be looking at executing here? Slowly getting more involved with SQL statements.
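coalesce returns the first non-null value it's passed, so coalesce(hh_total_fhc_0, 0) turns your null values into zeroes, which is what you need since they should count as zero for the average. The next step is to add least to the mix:
SELECT least(1, coalesce(hh_total_fhc_0, 0)) FROM fixtures gives you a 0 if the value is zero and a 1 if it's a positive value. (The coalesce is crucial: in Postgres, least ignores null arguments, so least(1, null) is 1!) Apply that to each of your columns, add the results up, and divide by hh_match_count to get the hit percentage exactly as you were thinking. Note that least(1, ...) only covers the >= 1 case; for >= 2 and up you would need a comparison such as a CASE expression instead.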
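To make the calculation concrete, here is a minimal sketch that generalises the threshold. It runs on Python's built-in sqlite3 module rather than Postgres, and it swaps least(...) for a CASE expression so the same query handles >= 2, >= 3 and so on. The table name fixtures comes from the answer; the function name hit_rates and the two sample rows (copied from the question) are only for illustration.

```python
import sqlite3

COLUMNS = [f"hh_total_fhc_{i}" for i in range(5)]


def hit_rates(rows, threshold=1):
    """Return, per row, (number of columns >= threshold) / hh_match_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fixtures (hh_match_count INTEGER, "
        + ", ".join(f"{c} INTEGER" for c in COLUMNS)
        + ")"
    )
    conn.executemany("INSERT INTO fixtures VALUES (?, ?, ?, ?, ?, ?)", rows)
    # One 0/1 flag per column; coalesce makes NULL count as 0.
    flags = " + ".join(
        f"(CASE WHEN coalesce({c}, 0) >= :t THEN 1 ELSE 0 END)" for c in COLUMNS
    )
    # Multiply by 1.0 to force floating-point division.
    sql = f"SELECT ({flags}) * 1.0 / hh_match_count FROM fixtures ORDER BY rowid"
    result = [r[0] for r in conn.execute(sql, {"t": threshold})]
    conn.close()
    return result


rows = [
    (5, 6, 5, 3, 2, 4),
    (3, 6, 5, 3, None, None),
]
print(hit_rates(rows, threshold=1))
print(hit_rates(rows, threshold=3))
```

With a threshold of 3, the first row has four qualifying columns out of a match count of 5, and the second has three out of 3.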

How to get the most recent rows in a group

I have a Rails 4.2.5.x project running PostgreSQL. I have a table with a structure similar to this:
id, contact_id, date, domain, f1, f2, f3, etc
1, ABC, 01-01-16, abc.com, 1, 2, 3, ...
2, ABC, 01-01-15, abc.com, 1, 2, 3, ...
3, ABC, 01-01-14, abc.com, 1, 2, 3, ...
4, DEF, 01-01-15, abc.com, 1, 2, 3, ...
5, DEF, 01-01-14, abc.com, 1, 2, 3, ...
6, GHI, 01-11-16, abc.com, 1, 2, 3, ...
7, GHI, 01-01-16, abc.com, 1, 2, 3, ...
8, GHI, 01-01-15, abc.com, 1, 2, 3, ...
9, GHI, 01-01-14, abc.com, 1, 2, 3, ...
...
...
99, ZZZ, 01-01-16, xyz.com, 1, 2, 3, ...
I need to query to find:
The most recent rows by date
filtered by domain
for a distinct contact_id (grouped by?)
row-limited result. In this example, I'm not adding this complication but this needs to be factored in. If there are 50 distinct contacts, I am only interested in the top 3 by date.
ID is the primary key.
there are indexes on the other columns
the fX columns indicate other data in the model that is needed (such as contact email, for example).
In MySQL, this would be a simple SELECT * FROM table WHERE domain='abc.com' GROUP BY contact_id ORDER BY date DESC; however, PostgreSQL complains, in this case, that:
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "table.id" must appear in the GROUP BY clause or be used in an aggregate function
I expect to get back 3 rows; 1, 4 and 6. Ideally, I'd like to get back the full rows in a single query... but I accept that I may need to do one query to get the IDs first, then another to find the items I want.
This is the closest I have got:
ExampleContacts
.select(:contact_id, 'max(date) AS max_date')
.where(domain: 'abc.com')
.group(:contact_id)
.order('max_date desc')
.limit(3)
However... this returns the contact_id, not the id. I cannot add the ID for the row.
EDIT:
Essentially, I need to get the primary key back for the row which is grouped on the non-primary key and sorted by another field.
If you want the rows, you don't need grouping. It's simply Contact.select('DISTINCT ON (contact_id)').where(domain: 'abc.com').order(date: :desc).limit(3)
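Just to clarify @murad-yusufov's accepted answer, I ended up doing this:
# Inner query: keep one row per contact_id, the newest by date.
# DISTINCT ON requires the ORDER BY to start with the same expression.
subquery = ExampleContacts.select('DISTINCT ON (contact_id) *')
  .where(domain: 'abc.com')
  .order(:contact_id)
  .order(date: :desc)
# Outer query: re-sort the per-contact rows by date.
ExampleContacts.from("(#{subquery.to_sql}) example_contacts")
  .order(date: :desc)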
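For comparison, the same "newest row per contact, then top N by date" result can be sketched without DISTINCT ON, which is Postgres-specific. The version below is an assumption-laden illustration on Python's built-in sqlite3 module: it joins each row to its contact's MAX(date), stores the question's dates as ISO strings so they sort correctly (reading 01-11-16 as 1 November 2016), and uses a hypothetical function name latest_per_contact. If a contact had two rows sharing the same latest date, both would be returned.

```python
import sqlite3


def latest_per_contact(rows, domain, limit):
    """Return ids of the newest row per contact_id, newest first, capped at `limit`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example_contacts "
        "(id INTEGER PRIMARY KEY, contact_id TEXT, date TEXT, domain TEXT)"
    )
    conn.executemany("INSERT INTO example_contacts VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT c.id
        FROM example_contacts c
        JOIN (
            -- newest date per contact within the domain
            SELECT contact_id, MAX(date) AS max_date
            FROM example_contacts
            WHERE domain = :domain
            GROUP BY contact_id
        ) m ON m.contact_id = c.contact_id AND m.max_date = c.date
        WHERE c.domain = :domain
        ORDER BY c.date DESC
        LIMIT :limit
    """
    result = [r[0] for r in conn.execute(sql, {"domain": domain, "limit": limit})]
    conn.close()
    return result


# Rows from the question, with dates rewritten as ISO strings (an assumption).
contacts = [
    (1, "ABC", "2016-01-01", "abc.com"),
    (2, "ABC", "2015-01-01", "abc.com"),
    (3, "ABC", "2014-01-01", "abc.com"),
    (4, "DEF", "2015-01-01", "abc.com"),
    (5, "DEF", "2014-01-01", "abc.com"),
    (6, "GHI", "2016-11-01", "abc.com"),
    (7, "GHI", "2016-01-01", "abc.com"),
    (8, "GHI", "2015-01-01", "abc.com"),
    (9, "GHI", "2014-01-01", "abc.com"),
    (99, "ZZZ", "2016-01-01", "xyz.com"),
]
print(latest_per_contact(contacts, "abc.com", 3))
```

This returns the ids the question expects (1, 4 and 6), ordered newest first.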

Can SQL Server perform an update on rows with a set operation on the aggregate max or min value?

I am a fairly experienced SQL Server developer but this problem has me REALLY stumped.
I have a FUNCTION. The function is referencing a table that is something like this...
PERFORMANCE_ID, JUDGE_ID, JUDGING_CRITERIA, SCORE
--------------------------------------------------
101, 1, 'JUMP_HEIGHT', 8
101, 1, 'DEXTERITY', 7
101, 1, 'SYNCHRONIZATION', 6
101, 1, 'SPEED', 9
101, 2, 'JUMP_HEIGHT', 6
101, 2, 'DEXTERITY', 5
101, 2, 'SYNCHRONIZATION', 8
101, 2, 'SPEED', 9
101, 3, 'JUMP_HEIGHT', 9
101, 3, 'DEXTERITY', 6
101, 3, 'SYNCHRONIZATION', 7
101, 3, 'SPEED', 8
101, 4, 'JUMP_HEIGHT', 7
101, 4, 'DEXTERITY', 6
101, 4, 'SYNCHRONIZATION', 5
101, 4, 'SPEED', 8
In this example there are 4 judges (with IDs 1, 2, 3, and 4) judging a performance (101) against 4 different criteria (JUMP_HEIGHT, DEXTERITY, SYNCHRONIZATION, SPEED).
(Please keep in mind that in my real data there are 10+ criteria and at least 6 judges.)
I want to aggregate the results in a score BY JUDGING_CRITERIA and then aggregate those into a final score by summing...something like this...
SELECT SUM (Avgs) FROM
(SELECT AVG(SCORE) Avgs
FROM PERFORMANCE_SCORES
WHERE PERFORMANCE_ID=101
GROUP BY JUDGING_CRITERIA) result
BUT... that is not quite what I want IN THAT I want to EXCLUDE from the AVG the highest and lowest values for each JUDGING_CRITERIA grouping. That is the part that I can't figure out. The AVG should be applied only to the MIDDLE values of the GROUPING FOR EACH JUDGING_CRITERIA. The HI value and the LO value for JUMP_HEIGHT should not be included in the average. The HI value and the LO value for DEXTERITY should not be included in the average. ETC.
I know this could be accomplished with a cursor to set the hi and lo for each criteria to NULL. But this is a FUNCTION and should be extremely fast.
I am wondering if there is a way to do this as a SET operation but still automatically exclude HI and LO from the aggregation?
Thanks for your help. I have a feeling it can probably be done with some advanced SQL syntax but I don't know it.
One last thing. This example is actually a simplification of the problem I am trying to solve. I have other constraints not mentioned here for the sake of simplicity.
Seth
EDIT: -Moved the WHERE clause to inside the CTE.
-Removed JudgeID from the partition
This would be my approach
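;WITH Agg1 AS
(
    SELECT PERFORMANCE_ID
          ,JUDGE_ID
          ,JUDGING_CRITERIA
          ,SCORE
          -- rank each score within its criterion from the low end and from the high end
          ,MinFind = ROW_NUMBER() OVER ( PARTITION BY PERFORMANCE_ID
                                                     ,JUDGING_CRITERIA
                                         ORDER BY SCORE ASC )
          ,MaxFind = ROW_NUMBER() OVER ( PARTITION BY PERFORMANCE_ID
                                                     ,JUDGING_CRITERIA
                                         ORDER BY SCORE DESC )
    FROM PERFORMANCE_SCORES
    WHERE PERFORMANCE_ID=101
)
-- drop the single lowest and single highest row per criterion, then average the rest;
-- the CAST avoids integer averaging if SCORE is an INT column
SELECT AVG(CAST(SCORE AS DECIMAL(10,2)))
FROM Agg1
WHERE MinFind > 1
  AND MaxFind > 1
GROUP BY JUDGING_CRITERIA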
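A different way to get the same trimmed average, shown here only as a sketch, is plain arithmetic on aggregates: (SUM - MAX - MIN) / (COUNT - 2) per criterion. It needs no window functions, but it assumes every criterion has at least three scores. The code below runs the question's sample data through Python's built-in sqlite3 module rather than SQL Server; the function name trimmed_scores is hypothetical.

```python
import sqlite3

CRITERIA = ("JUMP_HEIGHT", "DEXTERITY", "SYNCHRONIZATION", "SPEED")


def trimmed_scores(rows, performance_id):
    """Return ({criterion: trimmed average}, sum of those averages)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PERFORMANCE_SCORES (PERFORMANCE_ID INTEGER, "
        "JUDGE_ID INTEGER, JUDGING_CRITERIA TEXT, SCORE INTEGER)"
    )
    conn.executemany("INSERT INTO PERFORMANCE_SCORES VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT JUDGING_CRITERIA,
               -- remove one highest and one lowest score, average the rest
               (SUM(SCORE) - MAX(SCORE) - MIN(SCORE)) * 1.0 / (COUNT(*) - 2)
        FROM PERFORMANCE_SCORES
        WHERE PERFORMANCE_ID = ?
        GROUP BY JUDGING_CRITERIA
        HAVING COUNT(*) > 2
    """
    per_criterion = dict(conn.execute(sql, (performance_id,)).fetchall())
    conn.close()
    return per_criterion, sum(per_criterion.values())


# Scores from the question: judge -> (JUMP_HEIGHT, DEXTERITY, SYNCHRONIZATION, SPEED)
by_judge = {1: (8, 7, 6, 9), 2: (6, 5, 8, 9), 3: (9, 6, 7, 8), 4: (7, 6, 5, 8)}
score_rows = [
    (101, judge, criterion, score)
    for judge, scores in by_judge.items()
    for criterion, score in zip(CRITERIA, scores)
]

averages, total = trimmed_scores(score_rows, 101)
print(averages, total)
```

For JUMP_HEIGHT the scores are 8, 6, 9 and 7, so dropping 9 and 6 leaves an average of 7.5; the four trimmed averages sum to 28.5.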

Finding matching parents where all children also match

I'm trying to write a sql query in MS SQL Server 2008 that will match parent rows where the parents match and all their children match.
Assuming I have this basic table structure:
ParentTable:
ParentID, Item, Price
ChildTable:
ChildID, ParentID, Accessory, Price
I want to get a grouping of ParentIDs where the parents match on Item and Price and they have the same number of children, each of which match on Accessory and Price.
For example:
ParentTable:
---------------------
1, "Computer", 1000
2, "Stereo", 500
3, "Computer", 500
4, "Computer", 1000
ChildTable:
---------------------
1, 1, "Mouse", 10
2, 1, "Keyboard", 10
3, 2, "Speakers", 50
4, 3, "Keyboard", 10
4, 3, "Mouse", 10
5, 4, "Keyboard", 10
6, 4, "Mouse", 10
The expected results would be something like
ParentID, Grouping
---------------------
1, 1
2, 2
3, 3
4, 1
This would imply that ParentID 1 and 4 are exactly the same and 2 and 3 are unique. I dont really care about the format of the result, as long as I get a list of parents that match.
I'm not opposed to doing (some or all of) this in .net either.
Your question is a little ambiguous, but I thought I'd give it a shot anyway.
Here goes: free-form SQL. It's hard to get it exactly right without access to some DML.
This would be my general approach. It should work in SQL Server, and probably Oracle as well. I'm not claiming this is perfect. My mental schema doesn't match the one above exactly; I'll leave that as an exercise for the reader. I typed it straight in.
SELECT DISTINCT p.id, p.name, p.dte, q.cnt
FROM parent p
JOIN
(
    -- number of children per parent
    SELECT p.id, p.dte, COUNT(*) cnt
    FROM parent p
    JOIN child ch
      ON ch.pid = p.id
    GROUP BY p.id, p.dte
) q
  ON p.id = q.id AND p.dte = q.dte
-- every non-aggregated column in the SELECT list must also be grouped
GROUP BY p.id, p.name, p.dte, q.cnt
ORDER BY p.id, p.name, q.cnt
btw: your question is a little ambiguous.
UPDATE:
This function looks promising for converting the child rows to CSV:
http://sql-ution.com/function-to-convert-rows-to-csv/
OK, if you can do this with temp tables, then this may give you ideas. It is off the top of my head, so the syntax is not checked. It is also limited: bigint is a signed 64-bit type (maximum 2^63 - 1), so this only works for up to 63 distinct accessory/price pairs.
First put each unique child accessory and price into #child
-- column sizes are placeholders
create table #child ( accessory varchar(50), price decimal(10,2), id int identity(1,1),
bnumber bigint null)
insert into #child(accessory, price)
select accessory, price from childtable group by accessory, price
assuming id will be 1,2,3,4 etc
update #child set bnumber = power(cast(2 as bigint), id - 1)
sets bnumber to 1,2,4,8 etc (this is where the bigint limitation kicks in). So now you have
mouse, 10,1,1
keyboard,10,2,2
speakers,50,3,4
Now you can sum these numbers by parent
select p.parentid, sum(ctemp.bnumber)
from parenttable p, childtable c, #child ctemp
where p.parentid = c.parentid
and c.accessory = ctemp.accessory
and c.price = ctemp.price
group by p.parentid
giving
1, 3
2, 4
3, 3
4, 3
..which I think is the answer you want. This is a bit clunky, and it's been a long day(!), but it might help.
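Since the asker is open to doing part of this outside SQL (they mention .net; Python is used here purely as a sketch language), the same idea can be expressed without the bitmask and its 63-value limit: build a signature per parent from its item, price and sorted child rows, then number the distinct signatures. The function name group_parents is hypothetical, and the data is copied from the question as posted.

```python
def group_parents(parents, children):
    """Map each parent_id to a group number; equal signatures share a number.

    parents:  iterable of (parent_id, item, price)
    children: iterable of (child_id, parent_id, accessory, price)
    """
    # Collect each parent's children as (accessory, price) pairs.
    kids = {}
    for _child_id, parent_id, accessory, price in children:
        kids.setdefault(parent_id, []).append((accessory, price))

    group_of_signature = {}
    result = {}
    for parent_id, item, price in sorted(parents):
        # Sorting the children makes the signature independent of row order.
        signature = (item, price, tuple(sorted(kids.get(parent_id, []))))
        # First time a signature is seen it gets the next group number.
        group = group_of_signature.setdefault(signature, len(group_of_signature) + 1)
        result[parent_id] = group
    return result


parent_rows = [
    (1, "Computer", 1000),
    (2, "Stereo", 500),
    (3, "Computer", 500),
    (4, "Computer", 1000),
]
child_rows = [
    (1, 1, "Mouse", 10),
    (2, 1, "Keyboard", 10),
    (3, 2, "Speakers", 50),
    (4, 3, "Keyboard", 10),
    (4, 3, "Mouse", 10),
    (5, 4, "Keyboard", 10),
    (6, 4, "Mouse", 10),
]
print(group_parents(parent_rows, child_rows))
```

Parents 1 and 4 end up in the same group, while parent 3 stays separate because its own price differs even though its children match.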

SQL COUNT of COUNT

I have some data I am querying. The table is composed of two columns - a unique ID, and a value. I would like to count the number of times each unique value appears (which can easily be done with a COUNT and GROUP BY), but I then want to be able to count that. So, I would like to see how many items appear twice, three times, etc.
So for the following data (ID, val)...
1, 2
2, 2
3, 1
4, 2
5, 1
6, 7
7, 1
The intermediate step would be (val, count)...
1, 3
2, 3
7, 1
And I would like to have (count_from_above, new_count)...
3, 2 -- since three appears twice in the previous table
1, 1 -- since one appears once in the previous table
Is there any query which can do that? If it helps, I'm working with Postgres. Thanks!
Try something like this:
select
    times,
    count(*) as new_count
from ( select
           val,
           count(*) as times   -- how often each value appears
       from your_table         -- placeholder table name
       group by val ) a
group by times
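The count-of-counts idea can be checked end to end against the question's sample data. This is a minimal sketch on Python's built-in sqlite3 module rather than Postgres; the table name items and the function name count_of_counts are made up for the example.

```python
import sqlite3


def count_of_counts(rows):
    """Return (times, new_count) pairs: how many values occur `times` times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    sql = """
        SELECT times, COUNT(*) AS new_count
        FROM (
            -- intermediate step: occurrences of each value
            SELECT val, COUNT(*) AS times
            FROM items
            GROUP BY val
        ) AS a
        GROUP BY times
        ORDER BY times DESC
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


data = [(1, 2), (2, 2), (3, 1), (4, 2), (5, 1), (6, 7), (7, 1)]
print(count_of_counts(data))
```

Values 1 and 2 each appear three times and value 7 appears once, which gives the (3, 2) and (1, 1) rows the question asks for.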