SQL Server aggregation with several joins

SQL Server aggregation with several joins - sql

I'm trying to join data from 3 tables in SQL Servre and display in result:
Alias of an entity
if the entity is virtual
the last date (if known)
the value (if known)
I tried this :
select
sr.alias, c.virtual, max(d.date) date
from
App_references sr
join
Sensor c on (c.id_capteur = sr.id_capteur)
left join
Sensor_data d on (c.id_capteur = d.id_capteur)
group by
d.id_capteur, sr.alias, c.virtual
order by
sr.alias
Here is the database scheme:
CREATE TABLE [dbo].[App_reference]
(
[id_ref] [int] IDENTITY(1,1) NOT NULL,
[alias] [varchar](60) NOT NULL,
[id_capteur] [int] NOT NULL,
CONSTRAINT [PK_App_reference] PRIMARY KEY CLUSTERED
(
[id_ref] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Sensor]
(
[id_capteur] [int] IDENTITY(1,1) NOT NULL,
[name] [varchar](50) NOT NULL,
[virtual] [tinyint] NULL,
[unite] [varchar](5) NULL,
[id_type] [int] NOT NULL,
CONSTRAINT [PK_Sensor] PRIMARY KEY CLUSTERED
(
[id_capteur] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Sensor_data]
(
[id_entry] [int] IDENTITY(1,1) NOT NULL,
[id_capteur] [int] NOT NULL,
[value] [xml] NOT NULL,
[date] [datetime] NOT NULL,
CONSTRAINT [PK_Sensor_data] PRIMARY KEY CLUSTERED
(
[id_entry] ASC,
[id_capteur] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Supposing each columns like "id_%" are linked by foreign key.
The request on top pass well, I got value :
alias virtual date
Place 1 (Physique) 0 2017-04-27 14:58:42.423
Place 2 1 NULL
Place 3 1 NULL
But I tried to select the value too by doing this :
select
sr.alias, c.virtual, max(d.date) date, d.value
from
Citopia_test.dbo.Smartparking_reference sr
join
Citopia_test.dbo.Sensor c on (c.id_capteur = sr.id_capteur)
left join
Citopia_test.dbo.Sensor_data d on (c.id_capteur = d.id_capteur)
group by
d.id_capteur, sr.alias, c.virtual
order by
sr.alias
And I got this error :
Column 'Sensor_data.value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
So I tried several things like adding column in the group by but nothing changes.

You probably want the value from the record with max date. Use ROW_NUMBER to get those records.
select alias, virtual, date, value
from
(
select
sr.alias, c.virtual, d.date, d.value,
row_number() over (partition by sr.alias order by d.date desc) as rn
from Citopia_test.dbo.Smartparking_reference sr
join Citopia_test.dbo.Sensor c on (c.id_capteur = sr.id_capteur)
left join Citopia_test.dbo.Sensor_data d on (c.id_capteur = d.id_capteur)
) numbered
where rn = 1
order by sr.alias;
This gets you one row per sr.alias. If you want one row per sr.alias + c.virtual then change the partition by clause accordingly.

Related

SQL Server: aggregate error on grouping

I have a table called tasks, which lists different tasks a worker can complete. Then i have a relationship table that links a completed task to a worker.
I'm trying to write query that groups the tasks into a list based on the worker id, but the query gives me the following error (see below).
Column 'mater.dbo.worker_task_completion.FK_task_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Tables
CREATE TABLE [dbo].[tasks]
(
[task_id] [int] IDENTITY(1,1) NOT NULL,
[name] [nvarchar](50) NOT NULL,
[icon] [nvarchar](max) NULL,
[isActive] [int] NOT NULL,
[time] [int] NOT NULL,
CONSTRAINT [PK_tasks] PRIMARY KEY CLUSTERED
(
[task_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
CREATE TABLE [dbo].[worker_task_completion]
(
[FK_worker_id] [int] NOT NULL,
[FK_task_id] [int] NOT NULL,
[update_date] [datetime] NOT NULL,
CONSTRAINT [PK_worker_task_completion] PRIMARY KEY CLUSTERED
(
[FK_worker_id] ASC,
[FK_task_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Query
SELECT top 100 wtc.FK_worker_id,
tasks = Stuff((SELECT ', ' + dbo.tasks.NAME
FROM dbo.tasks
WHERE dbo.tasks.task_id =
wtc.FK_task_id
FOR xml path ('')), 1, 1, '')
FROM dbo.worker_task_completion AS wtc
LEFT JOIN dbo.tasks AS tc
ON tc.task_id = wtc.fk_task_id
-- WHERE wtc.FK_worker_id IN ()
GROUP BY wtc.FK_worker_id

Hmmm. You cannot use a non-aggregated column for the correlation clause. The solution is to move the JOIN into the subquery:
SELECT top 100 wtc.FK_worker_id,
stuff((SELECT ', ' + t.NAME
FROM dbo.worker_task_completion wtc2 JOIN
dbo.tasks t
ON t.task_id = wtc2.FK_task_id
WHERE wtc2.FK_worker_id = wtc.FK_worker_id
FOR xml path ('')
), 1, 2, ''
) as tasks
FROM (SELECT DISTINCT wtc.FK_worker_id
FROM dbo.worker_task_completion wtc
) wtc
-- WHERE wtc.FK_worker_id IN ()
Note that I changed the second argument for STUFF(). Presumably, you want to remove the space as well as the comma.

;WITH CTE_worker_task_completion
AS (SELECT [FK_worker_id]
,[FK_task_id]
FROM [dbo].[worker_task_completion])
------------------------------------------
SELECT [WT].[FK_worker_id]
,[Task]=Stuff(
(
SELECT ', '+[T].[Name]
FROM [dbo].[tasks] AS [T]
INNER JOIN CTE_worker_task_completion AS [WTC]
ON [T].[Task_ID]=[WTC].[Task_ID]
Where [WT].[Worker_ID]=[WTC].[Worker_ID]
ORDER BY [T].[Name] FOR XML PATH('')
),1,1,'')
FROM CTE_worker_task_completion AS [WT]
GROUP BY [WT].[FK_worker_id];

how do I select a record which is not present in other related table using sql?

I have 3 tables like below
CREATE TABLE [dbo].[Languages](
[Title] [nvarchar](50) NOT NULL,
[Description] [nvarchar](200) NULL,
[Id] [uniqueidentifier] NOT NULL,
CONSTRAINT [PK_Language] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
CREATE TABLE [dbo].[LocalizationKeys](
[Key] [nvarchar](500) NOT NULL,
[Id] [uniqueidentifier] NOT NULL,
CONSTRAINT [PK_LocalizationKey] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
CREATE TABLE [dbo].[LocalizationValues](
[Value] [nvarchar](500) NOT NULL,
[LanguageId] [uniqueidentifier] NULL,
[LocalizationKeyId] [uniqueidentifier] NULL,
[Id] [uniqueidentifier] NOT NULL,
CONSTRAINT [PK_LocalizationValue] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
with following data
Languages
Title Description Id
en-US English C3B95465-0B0B-4B97-8C60-19D611F6A185
fi-FI Finnish 7EDF3C04-3846-4B01-8045-315E26D05CD1
LocalizationKeys
Key Id
Save A4B7E6DE-BBCE-40FB-8F0E-FE1034B2CAAF
LocalizationValues
Value LanguageId LocalizationKeyId Id
Save C3B95465-0B0B-4B97-8C60-19D611F6A185 A4B7E6DE-BBCE-40FB-8F0E-FE1034B2CAAF 6EB167F5-550B-435E-B6B3-35A02F21F630
Now here in above LocalizationValues table value for language fi-FI is missing for LocalizationKeyId='A4B7E6DE-BBCE-40FB-8F0E-FE1034B2CAAF'
so I want to find it out by giving LanguageId which LocalizationKeyId
is missing for that LanguageId
I have tested this query for record - it is not working
select locvalue.LocaliZationKeyId, locvalue.LanguageId
from languages lang left join localizationvalues locvalue
on lang.Id = locvalue.LanguageId
left join [LocalizationKeys] locKey
on lockey.Id = locvalue.LocalizationKeyId

You can use NOT EXISTS for this:
SELECT [Key], Id
FROM LocalizationKeys AS lk
WHERE NOT EXISTS (SELECT 1
FROM Languages AS l
JOIN LocalizationValues AS lv ON l.Id = lv.LanguageId
WHERE l.Title = 'fi-FI' AND lv.LocalizationKeyId = lk.Id)
The above query returns the LocalizationKeyId value that is missing for a specified LanguageId.
Demo here

try below query...
select * from Languages l
inner join LocalizationKeys lk on 1=1
left join LocalizationValues lv on lv.LanguageId = l.id and lv.LocalizationKeyId = lk.id
if you find LocalizationValues fields values is null then its missing

Figure Out Which OrderIDs are 0$ Payment Totals

I am in need to some help writing a SQL 2012 query that will help me find and mark orderID's that are a $0.00 payments due to reversal(s)
So far I have:
Select Distinct a.orderID, a.orderPaid,
(Select SUM((c1.linePrice + c1.lineShippingCost + c1.lineTaxCost + c1.lineOptionCost) * c1.lineQuantity)
From vwSelectOrderLineItems c1 Where c1.orderID = a.orderID) As OrderAmount,
(Select SUM(b1.payAmount) FROM vwSelectOrderPayments b1 Where b1.orderID = a.orderID) as Payment,
1 As IsReversal
From vwSelectOrders a
Left Outer Join vwSelectOrderPayments b On b.orderID = a.orderID
Where b.payValid = 1 AND a.orderPaid = 0
Which is returning me some $0 payments on some orders. When I query that payment table with the orderID of these records, I can see that 2 payments were posted... 1 the original payment, 2 the reversal.
How Can I flag the Orders that are $0 payments?
Oders
CREATE TABLE [dbo].[TblOrders](
[orderID] [bigint] IDENTITY(1000,1) NOT NULL,
[orderPaid] [bit] NOT NULL,
[orderPaidOn] [datetime] NULL
CONSTRAINT [PK_TblOrders] PRIMARY KEY CLUSTERED
(
[orderID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[TblOrders] ADD CONSTRAINT [DF__TblOrders__order__1975C517] DEFAULT ((0)) FOR [orderPaid]
Order Line Items
CREATE TABLE [dbo].[TblOrderLineItems](
[lineID] [bigint] IDENTITY(1,1) NOT NULL,
[orderID] [bigint] NOT NULL,
[lineQuantity] [int] NOT NULL,
[linePrice] [money] NOT NULL,
[lineShippingCost] [money] NOT NULL,
[lineTaxCost] [money] NOT NULL,
[lineOptionCost] [money] NOT NULL,
CONSTRAINT [PK_TblOrderLineItems] PRIMARY KEY CLUSTERED
(
[lineID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[TblOrderLineItems] ADD CONSTRAINT [DF_TblOrderLineItems_lineShippingCost] DEFAULT ((0)) FOR [lineShippingCost]
GO
ALTER TABLE [dbo].[TblOrderLineItems] ADD CONSTRAINT [DF_TblOrderLineItems_lineTaxCost] DEFAULT ((0)) FOR [lineTaxCost]
GO
ALTER TABLE [dbo].[TblOrderLineItems] ADD CONSTRAINT [DF_TblOrderLineItems_lineOptionCost] DEFAULT ((0)) FOR [lineOptionCost]
GO
Order Payments
CREATE TABLE [dbo].[TblOrderPayments](
[paymentID] [bigint] IDENTITY(1,1) NOT NULL,
[orderID] [bigint] NOT NULL,
[payAmount] [money] NOT NULL,
[payPosted] [datetime] NOT NULL,
[payValid] [bit] NOT NULL,
CONSTRAINT [PK_TblOrderPayments] PRIMARY KEY CLUSTERED
(
[paymentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[TblOrderPayments] ADD CONSTRAINT [DF_TblOrderPayments_payValid] DEFAULT ((0)) FOR [payValid]
GO
Views
CREATE VIEW [dbo].[vwSelectOrderLineItems] AS
SELECT linePrice, lineShippingCost, lineTaxCost, lineOptionCost, lineQuantity
FROM [dbo].[TblOrderLineItems]
CREATE VIEW [dbo].[vwSelectOrderPayments] AS
SELECT paymentID, orderID, payAmount, payValid
FROM dbo.TblOrderPayments
CREATE VIEW [dbo].[vwSelectOrders] AS
SELECT orderID , orderPaid
FROM dbo.TblOrders
Note
I cannot change the table structure

SELECT distinct a.orderid,
a.orderPaid,
c.OrderAmount
d.Payment
From vwSelectOrders AS a
INNER JOIN ( Select SUM((linePrice + lineShippingCost + lineTaxCost + lineOptionCost) * lineQuantity) As orderAmount,OrderID
From vwSelectOrderLineItems group by orderid) AS C on c.orderID = a.orderID
INNER JOIN (Select SUM(payAmount) as Payment,orderID FROM vwSelectOrderPayments WHERE isnull(SUM(PayAmount),0) > 0 GROUP BY OrderID) AS d ON d.orderID = a.orderID
Left Outer Join vwSelectOrderPayments b On b.orderID = a.orderID
Where b.payValid = 1 AND a.orderPaid = 0 AND
This is a better query as you do not have to us a correlated subquery. Correlated queries are when a subquery references an outerquery row. This isn't optimal because every row the outerquery runs the correlated subquery will execute. Once you give us table definitions we can probably fix the overall data return of your query.

Optimizing CLUSTERED INDEX for use with JOIN

table optin_channel_1 (for each 'channel' there's a dedicated table)
CREATE TABLE [dbo].[optin_channel_1](
[key_id] [bigint] NOT NULL,
[valid_to] [datetime] NOT NULL,
[valid_from] [datetime] NOT NULL,
[key_type_id] [int] NOT NULL,
[optin_flag] [tinyint] NOT NULL,
[source_proc_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL
) ON [PRIMARY]
CREATE CLUSTERED INDEX [ix_id] ON [dbo].[optin_channel_1]
(
[key_type_id] ASC,
[key_id] ASC,
[valid_to] ASC,
[valid_from] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
table profile_conns
CREATE TABLE [dbo].[profile_conns](
[profile_key_id] [bigint] NOT NULL,
[valid_to] [datetime] NOT NULL,
[valid_from] [datetime] NOT NULL,
[conn_key_id] [bigint] NOT NULL,
[conn_key_type_id] [int] NOT NULL,
[conn_type_id] [int] NOT NULL,
[source_proc_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL
) ON [PRIMARY]
CREATE CLUSTERED INDEX [ix_id] ON [dbo].[profile_conns]
(
[profile_key_id] ASC,
[conn_key_type_id] ASC,
[conn_key_id] ASC,
[valid_to] ASC,
[valid_from] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
table lu_channel_conns
CREATE TABLE [dbo].[lu_channel_conns](
[channel_id] [int] NOT NULL,
[conn_type_id] [int] NOT NULL,
CONSTRAINT [PK_lu_channel_conns] PRIMARY KEY CLUSTERED
(
[channel_id] ASC,
[conn_type_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
table lu_conn_type
CREATE TABLE [dbo].[lu_conn_type](
[conn_type_id] [int] NOT NULL,
[default_key_type_id] [int] NOT NULL,
[master_key_type_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL,
CONSTRAINT [PK_lu_conns] PRIMARY KEY CLUSTERED
(
[conn_type_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
view v_source_proc_id_by_group_id
SELECT DISTINCT x.source_proc_id, x.source_proc_group_id
FROM lu_source_proc x INNER JOIN lu_source_proc_group y ON x.source_proc_group_id = y.group_id
There's a dynamic SQL statement going to be executed:
SET #sql_str='SELECT #ret=MAX(o.optin_flag)
FROM optin_channel_'+CAST(#channel_id AS NVARCHAR(100))+' o
INNER HASH JOIN dbo.v_source_proc_id_by_group_id y ON o.source_proc_id=y.source_proc_id AND y.source_proc_group_id=#source_proc_group_id
INNER HASH JOIN profile_conns z ON z.profile_key_id=cast(#profile_key_id AS NVARCHAR(100)) AND z.conn_key_type_id=o.key_type_id AND z.conn_key_id=o.[key_id] AND z.valid_to=''01.01.3000''
INNER HASH JOIN lu_channel_conns x ON x.channel_id=#channel_id AND z.conn_type_id=x.conn_type_id
INNER HASH JOIN lu_conn_type ct ON ct.conn_type_id=x.conn_type_id AND ct.default_key_type_id=o.key_type_id'
SET #param='#channel_id INT, #profile_key_id INT, #source_proc_group_id INT, #ret NVARCHAR(400) OUTPUT'
EXEC sp_executesql #sql_str,#param,#channel_id,#profile_key_id,#source_proc_group_id,#ret OUTPUT
I.e. this gives:
SELECT #ret=MAX(o.optin_flag) AS optin_flag
FROM optin_channel_1 o
INNER HASH JOIN dbo.v_source_proc_id_by_group_id y
ON o.source_proc_id=y.source_proc_id
AND y.source_proc_group_id=5
INNER HASH JOIN profile_conns z
ON z.profile_key_id=1
AND z.conn_key_type_id=o.key_type_id
AND z.conn_key_id=o.[key_id]
AND z.valid_to='01.01.3000'
INNER HASH JOIN lu_channel_conns x
ON x.channel_id=1
AND z.conn_type_id=x.conn_type_id
INNER HASH JOIN lu_conn_type ct
ON ct.conn_type_id=x.conn_type_id
AND ct.default_key_type_id=o.key_type_id
These tables are used for an optin database. optin_flag could be 0 or 1. With the last statement I want to get a 1 as optin_flag from optin_channel_1 for the given channel_id=1 for user with profile_key_id=1, when optin was inserted into database by process belonging to source_proc_group_id=5. I hope this is enough to comprehend what's going on.
Is this the best way to use the CLUSTERED INDEX'es? Or would it be better to remove profile_key_id from index on profile_conns and put z.profile_key_id=1 in a WHERE clause?
May be there's a much better way for optimizing this select (changes in database schema is not possible, only changes on indexes and modifing statement).

Without knowing the size of the tables and the sort of data stored in it them it is difficult to gauge.
Assuming optin_channel_1 has a lot of data and profile_cons has a lot of data I would try the following:
Clustered index on optin_channel_1(key_id) or key_type_id depending on which field has the most distinct values. (since you don't have a covering index)
Clustered index on profile_conns (cons_key_id) or cons_key_type_id depending on what you have chosen in optin_channel_1
etc...
Basically, if your table profile_conns table has not much data, I would put the clustered index on the most fragmented "filter" field (I suspect profile_key_id). If the table has a lot of data I would aim for a hash/merge join and match the clustered index with the clustered index of the optin_channel_1 table.
I would also rewrite the query as such:
SELECT #ret = MAX(o.optin_flag) AS optin_flag
FROM optin_channel_1 o
JOIN dbo.v_source_proc_id_by_group_id y
ON o.source_proc_id = y.source_proc_id
JOIN profile_conns z
ON z.conn_key_type_id = o.key_type_id
AND z.conn_key_id = o.[key_id]
JOIN lu_channel_conns x
ON z.conn_type_id = x.conn_type_id
JOIN lu_conn_type ct
ON ct.conn_type_id = x.conn_type_id
AND ct.default_key_type_id=o.key_type_id
WHERE y.source_proc_group_id = 5
AND z.profile_key_id = 1
AND x.channel_id = 1
AND z.valid_to = '01.01.3000'
The query changed this way because:
Putting the filter conditions in the where clause shows you what are relevant fields to aim for a hash/merge join
Putting join hints is rarely a good idea. It is very hard to beat the query governor to determine the best query plan. A bad plan usually indicates you have an issue with your indexes/statistics.
So as summary:
small table joined to big table ==> go for nested loops & focus your clustered index on the "filter" field in the small table & the join field in the big table.
big table joined to big table => go for hash/merge join and put the clustered index on the matching field on both sides
multi-field indexes usually only a good idea when they are "covering", this means all the fields you query are included in the index. (or are included with the include() clause)

Select highest rated, oldest track

I have several tables:
CREATE TABLE [dbo].[Tracks](
[Id] [uniqueidentifier] NOT NULL,
[Artist_Id] [uniqueidentifier] NOT NULL,
[Album_Id] [uniqueidentifier] NOT NULL,
[Title] [nvarchar](255) NOT NULL,
[Length] [int] NOT NULL,
CONSTRAINT [PK_Tracks_1] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[TrackHistory](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Track_Id] [uniqueidentifier] NOT NULL,
[Datetime] [datetime] NOT NULL,
CONSTRAINT [PK_TrackHistory] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT INTO [cooltunes].[dbo].[TrackHistory]
([Track_Id]
,[Datetime])
VALUES
("335294B0-735E-4E2C-8389-8326B17CE813"
,GETDATE())
CREATE TABLE [dbo].[Ratings](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Track_Id] [uniqueidentifier] NOT NULL,
[User_Id] [uniqueidentifier] NOT NULL,
[Rating] [tinyint] NOT NULL,
CONSTRAINT [PK_Ratings] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT INTO [cooltunes].[dbo].[Ratings]
([Track_Id]
,[User_Id]
,[Rating])
VALUES
("335294B0-735E-4E2C-8389-8326B17CE813"
,"C7D62450-8BE6-40F6-80F1-A539DA301772"
,1)
Users
User_Id|Guid
Other fields
Links between the tables are pretty obvious.
TrackHistory has each track added to it as a row whenever it is played ie. a track will appear in there many times.
Ratings value will either be 1 or -1.
What I'm trying to do is select the Track with the highest rating, that is more than 2 hours old, and if there is a duplicate rating for a track (ie a track receives 6 +1 ratings and 1 - rating, giving that track a total rating of 5, another track also has a total rating of 5), the track that was last played the longest ago should be returned. (If all tracks have been played within the last 2 hours, no rows should be returned)
I'm getting somewhere doing each part individually using the link above, SUM(Value) and GROUP BY Track_Id, but I'm having trouble putting it all together.
Hopefully someone with a bit more (MS)SQL knowledge will be able to help me. Many thanks!

select top 1 t.Id, SUM(r.Rating) as Rating, MAX(Datetime) as LastPlayed
from Tracks t
inner join TrackHistory h on t.Id = h.Track_Id
inner join Ratings r on t.Id = r.Track_Id
where h.Track_Id not in (
select Track_Id
from TrackHistory
where Datetime > DATEADD(HOUR, -2, getdate())
)
group by t.Id
order by Rating desc, LastPlayed

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL Server aggregation with several joins - sql

Related

SQL Server: aggregate error on grouping

how do I select a record which is not present in other related table using sql?

Figure Out Which OrderIDs are 0$ Payment Totals

Optimizing CLUSTERED INDEX for use with JOIN

Select highest rated, oldest track

Categories

Resources