Join top n records from child table - sql

I am struggling to figure out how to select only the first 4 records from a child table per record in the parent table in a master-detail relationship.
Tables example:
Product
---------
Id | Name
---------
1 | Apple
2 | Banana
3 | Cranberry

ProductImage
-------------------
PKeyFld1 | PKeyFld2
-------------------
1 | 1
1 | 2
1 | 3
1 | 4
1 | 5
2 | 1
2 | 2
2 | 3
3 | 1
3 | 3
3 | 4
3 | 8
3 | 9
The primary key for ProductImage is a combination of the two shown fields. I need to get the first 4 images per product, ordered by PKeyFld1, PKeyFld2, which would result in:
ProductImage
-------------------
PKeyFld1 | PKeyFld2
-------------------
1 | 1
1 | 2
1 | 3
1 | 4
2 | 1
2 | 2
2 | 3
3 | 1
3 | 3
3 | 4
3 | 8
The nicest solution would be to have only one query with 1 record per product, but I can also deal with two queries; 1 for the products and 1 for the images. In C#, I can fetch them and add the image data to the model before further processing it.
Can somebody help me with the query for the productImages? The hard part is in getting only the top 4 images per product, without limiting the whole ProductImage table to only 4 records. I have done this with Postgres in the past, but cannot find how to do this in SQL Server.

You can use row_number() to solve this greatest-n-per-group problem. As far as the images are concerned, you don't need to involve the Product table.
select PKeyFld1, PKeyFld2
from (
    select t.*, row_number() over(partition by PKeyFld1 order by PKeyFld2) rn
    from ProductImage t
) t
where rn <= 4
order by PKeyFld1, PKeyFld2
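To check the query against the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite stands in for SQL Server here, which is an assumption made only so the example is self-contained; it needs SQLite 3.25 or newer for window functions. The variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductImage ("
    "PKeyFld1 INTEGER, PKeyFld2 INTEGER, PRIMARY KEY (PKeyFld1, PKeyFld2))"
)
conn.executemany(
    "INSERT INTO ProductImage VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
     (2, 1), (2, 2), (2, 3),
     (3, 1), (3, 3), (3, 4), (3, 8), (3, 9)],
)

# Number the images within each product, then keep the first four per product.
TOP_N_SQL = """
SELECT PKeyFld1, PKeyFld2
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY PKeyFld1 ORDER BY PKeyFld2) AS rn
    FROM ProductImage t
) ranked
WHERE rn <= 4
ORDER BY PKeyFld1, PKeyFld2
"""

top_images = conn.execute(TOP_N_SQL).fetchall()
for row in top_images:
    print(row)
```

Product 1 loses its fifth image, while products 2 and 3 keep three and four rows respectively, which matches the result table in the question.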

Related

Select max value from column for every value in other two columns

I'm working on a webapp that tracks TV shows, and I need to get the ids of all episodes that are season finales, meaning the episode with the highest episode number in each season of each TV show.
This is a simplified version of my "episodes" table.
id tvshow_id season epnum
---|-----------|--------|-------
1 | 1 | 1 | 1
2 | 1 | 1 | 2
3 | 1 | 1 | 3
4 | 1 | 2 | 1
5 | 1 | 2 | 2
6 | 2 | 1 | 1
7 | 2 | 1 | 2
8 | 2 | 1 | 3
9 | 2 | 1 | 4
10 | 2 | 2 | 1
11 | 2 | 2 | 2
The expected output:
id
---|
3 |
5 |
9 |
11 |
I've managed to get this working for the latest season but I can't make it work for all seasons.
I've also tried to take some ideas from this but I can't seem to find a way to add the tvshow_id in there.
I'm using Postgres v10
SELECT id
FROM (
    SELECT *, row_number() OVER (PARTITION BY tvshow_id, season ORDER BY epnum DESC) AS ranking
    FROM tbl
) c
WHERE ranking = 1
You can use the SQL below to get your result, using GROUP BY in a subquery:
select id from tab_x
where (tvshow_id,season,epnum) in (
select tvshow_id,season,max(epnum)
from tab_x
group by tvshow_id,season)
Below is a simple query that gets the desired result. It should also perform well, thanks to the DISTINCT ON clause:
select
distinct on (tvshow_id,season)
id
from your_table
order by tvshow_id,season ,epnum desc
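As a quick cross-check, the sketch below runs the row_number() and GROUP BY variants against the sample data using Python's sqlite3 module. SQLite is only a stand-in for Postgres here (an assumption for the sake of a runnable example), and since DISTINCT ON is Postgres-specific it is not exercised. The table name episodes is taken from the question's description.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes ("
    "id INTEGER PRIMARY KEY, tvshow_id INTEGER, season INTEGER, epnum INTEGER)"
)
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 1, 2), (3, 1, 1, 3), (4, 1, 2, 1), (5, 1, 2, 2),
     (6, 2, 1, 1), (7, 2, 1, 2), (8, 2, 1, 3), (9, 2, 1, 4),
     (10, 2, 2, 1), (11, 2, 2, 2)],
)

# Variant 1: rank episodes within each (show, season), highest epnum first.
WINDOW_SQL = """
SELECT id
FROM (
    SELECT id,
           row_number() OVER (PARTITION BY tvshow_id, season
                              ORDER BY epnum DESC) AS ranking
    FROM episodes
) c
WHERE ranking = 1
ORDER BY id
"""

# Variant 2: match each row against the max epnum of its (show, season) group.
GROUP_SQL = """
SELECT id
FROM episodes
WHERE (tvshow_id, season, epnum) IN (
    SELECT tvshow_id, season, max(epnum)
    FROM episodes
    GROUP BY tvshow_id, season
)
ORDER BY id
"""

window_ids = [row[0] for row in conn.execute(WINDOW_SQL)]
group_ids = [row[0] for row in conn.execute(GROUP_SQL)]
print(window_ids, group_ids)
```

Both variants agree on this data. They would differ only if two episodes in the same season shared the highest epnum: the GROUP BY variant would return both, the row_number() variant just one.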

SQL Query for Count of distinct values in a column for every distinct value in other column

Scenario:
I have a data table containing the names of different companies. For each distinct company in that table, I would like to calculate the number of locations in which it exists.
Sample data:
Cmpny.COMPANY ID | Cmpny.OFFICELOCATION
1 | 1
1 | 1
1 | 2
1 | 3
2 | 1
2 | 4
2 | 4
2 | 2
Result Required:
COMPANY ID | OFFICELOCATION | Count(OfficeLocation)
1 | 1 | 2
1 | 2 | 1
1 | 3 | 1
2 | 1 | 1
2 | 2 | 1
2 | 4 | 2
The GROUP BY statement is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or more columns.
Here the aggregate function Count(CompanyOfficeLocation) is evaluated once per group, where the groups are formed by CompanyID first and then by office location.
The following SQL statement counts the rows for each office location within each CompanyID:
select CompanyID,CompanyOfficeLocation,
count(CompanyOfficeLocation) [Count(OfficeLocation)]
FROM [dbo].[Compny]
group by CompanyID,CompanyOfficeLocation
order by CompanyID,CompanyOfficeLocation
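The grouping can be verified against the sample data with a small sketch using Python's sqlite3 module. SQLite stands in for SQL Server (an assumption), so the bracketed T-SQL alias is replaced by a plain alias, location_count, which is a name chosen for this example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Compny (CompanyID INTEGER, CompanyOfficeLocation INTEGER)"
)
conn.executemany(
    "INSERT INTO Compny VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (1, 3), (2, 1), (2, 4), (2, 4), (2, 2)],
)

# One output row per (company, location) pair, with the number of source rows.
COUNT_SQL = """
SELECT CompanyID,
       CompanyOfficeLocation,
       count(CompanyOfficeLocation) AS location_count
FROM Compny
GROUP BY CompanyID, CompanyOfficeLocation
ORDER BY CompanyID, CompanyOfficeLocation
"""

counts = conn.execute(COUNT_SQL).fetchall()
for row in counts:
    print(row)
```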

First two rows per combination of two columns

Given a table like this in PostgreSQL:
Messages
message_id | creating_user_id | receiving_user_id | created_utc
-----------+------------------+-------------------+-------------
1 | 1 | 2 | 1424816011
2 | 3 | 2 | 1424816012
3 | 3 | 2 | 1424816013
4 | 1 | 3 | 1424816014
5 | 1 | 3 | 1424816015
6 | 2 | 1 | 1424816016
7 | 2 | 1 | 1424816017
8 | 1 | 2 | 1424816018
I want to get the first two rows (ordered by created_utc) per creating_user_id/receiving_user_id pair where the other user_id is 1. So the result of the query should look like:
message_id | creating_user_id | receiving_user_id | created_utc
-----------+------------------+-------------------+-------------
1 | 1 | 2 | 1424816011
4 | 1 | 3 | 1424816014
5 | 1 | 3 | 1424816015
6 | 2 | 1 | 1424816016
Using a window function with row_number() I can get the first 2 messages for each creating_user_id or the first 2 messages for each receiving_user_id, but I'm not sure how to get the first two messages for each creating_user_id/receiving_user_id combination.
Since you filter rows where one of the two columns is 1 (and therefore irrelevant for grouping), and 1 happens to be the smallest number of all, you can simply use GREATEST(creating_user_id, receiving_user_id) to distill the relevant number to PARTITION BY. (Otherwise you could employ CASE.)
The rest is standard procedure: calculate a row number in a subquery and select the first two in the outer query:
SELECT message_id, creating_user_id, receiving_user_id, created_utc
FROM (
   SELECT *
        , row_number() OVER (PARTITION BY GREATEST (creating_user_id
                                                  , receiving_user_id)
                             ORDER BY created_utc) AS rn
   FROM   messages
   WHERE  1 IN (creating_user_id, receiving_user_id)
   ) sub
WHERE  rn < 3
ORDER  BY created_utc;
This returns exactly your desired result.
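The CASE alternative mentioned above does not rely on 1 being the smallest user id. Here is a minimal sketch of that variant, run through Python's sqlite3 module with SQLite standing in for PostgreSQL (an assumption for the sake of a runnable example). The helper first_two_per_partner and the :me parameter are hypothetical names introduced for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (message_id INTEGER PRIMARY KEY, "
    "creating_user_id INTEGER, receiving_user_id INTEGER, created_utc INTEGER)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [(1, 1, 2, 1424816011), (2, 3, 2, 1424816012), (3, 3, 2, 1424816013),
     (4, 1, 3, 1424816014), (5, 1, 3, 1424816015), (6, 2, 1, 1424816016),
     (7, 2, 1, 1424816017), (8, 1, 2, 1424816018)],
)

# Partition by "the other user": whichever of the two columns is not :me.
CASE_SQL = """
SELECT message_id, creating_user_id, receiving_user_id, created_utc
FROM (
    SELECT *,
           row_number() OVER (
               PARTITION BY CASE WHEN creating_user_id = :me
                                 THEN receiving_user_id
                                 ELSE creating_user_id END
               ORDER BY created_utc
           ) AS rn
    FROM messages
    WHERE :me IN (creating_user_id, receiving_user_id)
) sub
WHERE rn < 3
ORDER BY created_utc
"""


def first_two_per_partner(connection, me):
    """Return the first two messages exchanged between `me` and each other user."""
    return connection.execute(CASE_SQL, {"me": me}).fetchall()


for row in first_two_per_partner(conn, 1):
    print(row)
```

For user 1 this yields messages 1, 4, 5 and 6, the same rows as the GREATEST version. Because the partition key is computed from the parameter, the same statement can be reused for any user id.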

ActiveRecord select records based on uniqueness of two attributes in combination

My table looks like this:
ID | Multiple | Itemlist_ID | Inventory_ID
----------------------------------------------------------------
1 | 1 | 1 | 1
2 | 1 | 1 | 2
3 | 1 | 1 | 3
4 | 1 | 4 | 2
5 | 1 | 4 | 3
6 | 2 | 4 | 2
7 | 2 | 4 | 3
How do I retrieve records with unique combo of Multiple and Itemlist_ID? For example below:
ID | Multiple | Itemlist_ID | Inventory_ID
----------------------------------------------------------------
1 | 1 | 1 | 1
4 | 1 | 4 | 2
6 | 2 | 4 | 2
Note, this is retrieving for a View where the Inventory_ID won't be shown, so I'm not concerned whether I get back records [1,4,6] or [1,5,7] or [2,4,6]. Using a first command is fine.
You can perform a GROUP BY query in ActiveRecord using this syntax (replace Model with your class name):
Model.group('Multiple', 'Itemlist_ID').select('Multiple', 'Itemlist_ID')
This would retrieve all unique combinations of (Multiple, Itemlist_ID).
You could optionally add aggregate operations on columns other than the two grouped columns, such as SUM, AVG, COUNT, etc. For example, if you wanted to know how many records are in each group:
Model.group('Multiple', 'Itemlist_ID').select('Multiple', 'Itemlist_ID', 'COUNT(1) AS group_count')
This would add an attribute called 'group_count' to each result, holding the number of records contained in that group.
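If an ID per combination is also wanted, as in the example output of the question, an aggregate such as MIN(ID) can be added in the same way as the COUNT above. The sketch below shows the underlying SQL idea using Python's sqlite3 module rather than ActiveRecord. The table name items and the alias first_id are assumptions made for this example, not names from the original post.

```python
import sqlite3

# In-memory SQLite database; the table name "items" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (ID INTEGER PRIMARY KEY, Multiple INTEGER, "
    "Itemlist_ID INTEGER, Inventory_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 1, 2), (3, 1, 1, 3), (4, 1, 4, 2),
     (5, 1, 4, 3), (6, 2, 4, 2), (7, 2, 4, 3)],
)

# One row per (Multiple, Itemlist_ID) combination, keeping the smallest ID.
UNIQUE_SQL = """
SELECT min(ID) AS first_id, Multiple, Itemlist_ID
FROM items
GROUP BY Multiple, Itemlist_ID
ORDER BY first_id
"""

unique_rows = conn.execute(UNIQUE_SQL).fetchall()
for row in unique_rows:
    print(row)
```

On the sample data this picks IDs 1, 4 and 6, one of the acceptable results listed in the question.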

Getting detail with the highest priority using Joins/subqueries

I hope you can help me with this problem: I have three tables, similar to this:
ORDER
Order_ID | Order_Date
=====================
1 | 01/01/2001
2 | 02/01/2001
3 | 03/01/2001
4 | 04/01/2001
5 | 05/01/2001
ORDER_DETAIL
Order_Detail_ID | Order_ID | Status_ID
======================================
1 | 1 | 1
2 | 1 | 1
3 | 1 | 2
4 | 2 | 2
5 | 2 | 3
6 | 3 | 3
7 | 3 | 3
STATUS
Status_ID | Status_Name | Status_Priority
=========================================
1 | PENDING | 3
2 | COMPLETED | 2
3 | CANCELLED | 1
Now, as I suppose it shows, each row in the ORDER_DETAIL table is related to the ORDER table using Order_ID, and it also has a status indicated by the Status_ID. Also, the STATUS table has a Status_Priority column. What I need to do is show each order, along with, among other columns, the status with highest priority among the order details each order has, like this:
Order_ID | Order_Date | Status_Name
===================================
1 | 01/01/2001 | PENDING
2 | 02/01/2001 | COMPLETED
3 | 03/01/2001 | CANCELLED
4 | 04/01/2001 |
5 | 05/01/2001 |
In this case, for example, since Order_ID 1 has at least one Order_Detail_ID with the PENDING status, which has the highest priority among its details, that's the status that appears. I tried using a JOIN with a subquery, based on similar code I have, but I can't seem to adapt it to this case. Any help will be much appreciated. Thanks in advance.
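Select the columns you need, left join ORDER to ORDER_DETAIL on Order_ID and ORDER_DETAIL to STATUS on Status_ID, order each order's details by Status_Priority (highest first), and keep only the first row per order.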
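One way to turn that short answer into a full query is sketched below, using row_number() to keep the highest-priority status per order. This is an interpretation rather than the answerer's exact query. It runs through Python's sqlite3 module, and the tables are named orders, order_detail and status (ORDER is a reserved word, so the rename is an assumption for this example). The LEFT JOINs keep orders 4 and 5, which have no details.

```python
import sqlite3

# In-memory SQLite database; table names are adapted for this demo (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Order_ID INTEGER PRIMARY KEY, Order_Date TEXT);
CREATE TABLE order_detail (Order_Detail_ID INTEGER PRIMARY KEY,
                           Order_ID INTEGER, Status_ID INTEGER);
CREATE TABLE status (Status_ID INTEGER PRIMARY KEY,
                     Status_Name TEXT, Status_Priority INTEGER);
INSERT INTO orders VALUES (1, '01/01/2001'), (2, '02/01/2001'),
                          (3, '03/01/2001'), (4, '04/01/2001'),
                          (5, '05/01/2001');
INSERT INTO order_detail VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 2, 2),
                                (5, 2, 3), (6, 3, 3), (7, 3, 3);
INSERT INTO status VALUES (1, 'PENDING', 3), (2, 'COMPLETED', 2),
                          (3, 'CANCELLED', 1);
""")

# Rank each order's detail rows by status priority (largest value first)
# and keep the top-ranked row; orders without details keep a NULL status.
TOP_STATUS_SQL = """
SELECT Order_ID, Order_Date, Status_Name
FROM (
    SELECT o.Order_ID AS Order_ID,
           o.Order_Date AS Order_Date,
           s.Status_Name AS Status_Name,
           row_number() OVER (PARTITION BY o.Order_ID
                              ORDER BY s.Status_Priority DESC) AS rn
    FROM orders o
    LEFT JOIN order_detail d ON d.Order_ID = o.Order_ID
    LEFT JOIN status s ON s.Status_ID = d.Status_ID
) ranked
WHERE rn = 1
ORDER BY Order_ID
"""

order_status = conn.execute(TOP_STATUS_SQL).fetchall()
for row in order_status:
    print(row)
```

Here "highest priority" is taken to mean the largest Status_Priority value, since the expected output shows PENDING (priority 3) winning over COMPLETED (priority 2) for order 1.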