Using MAX when selecting a high number of fields in a query - sql

I understand some varieties of this question have been asked, but I could not find an answer to my specific scenario.
My query has over 50 fields being selected, and only one of them is an aggregate, using MAX(). On the GROUP BY clause, I would only like to pass two specific fields, name and UserID, not all 50 to make the query run. See small subset below.
SELECT
t1.name,
MAX(t2.id) as UserID,
t3.age,
t3.height,
t3.dob,
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
LEFT JOIN table3 t3 ON t1.id = t3.id
GROUP BY t1.name, UserID
Is there any workaround or better approach to accomplish my goal?
The database is SQL Server and any help would be greatly appreciated.

Hmmm . . . What values do you want for the other fields? If you want the max() of one column for each id and code, you can do:
select t.*
from (select t.*, max(col) over (partition by id, code) as maxcol
from t
) t
where col = maxcol;
Given that id might be unique, you might want the maximum id as well as the other columns for each code:
select t.*
from (select t.*, max(id) over (partition by code) as maxid
from t
) t
where id = maxid;

Related

SQL Server query showing most recent distinct data

I am trying to build a SQL query to recover only the most young record of a table (it has a Timestamp column already) where the item by which I want to filter appears several times, as shown in my table example:
.
Basically, I have a table1 with Id, Millis, fkName and Price, and a table2 with Id and Name.
In table1, items can appear several times with the same fkName.
What I need to achieve is building up a single query where I can list the last record for every fkName, so that I can get the most actual price for every item.
What I have tried so far is a query with
SELECT DISTINCT [table1].[Millis], [table2].[Name], [table1].[Price]
FROM [table1]
JOIN [table2] ON [table2].[Id] = [table1].[fkName]
ORDER BY [table2].[Name]
But I don't get the correct listing.
Any advice on this? Thanks in advance,
A simple and portable approach to this greatest-n-per-group problem is to filter with a subquery:
select t1.millis, t2.name, t1.price
from table1 t1
inner join table2 t2 on t2.id = t1.fkName
where t1.millis = (select max(t11.millis) from table1 t11 where t11.fkName = t1.fkName)
order by t1.millis desc
using Common Table Expression:
;with [LastPrice] as (
select [Millis], [Price], ROW_NUMBER() over (Partition by [fkName] order by [Millis] desc) rn
from [table1]
)
SELECT DISTINCT [LastPrice].[Millis],[table2].[Name],[LastPrice].[Price]
FROM [LastPrice]
JOIN [table2] ON [table2].[Id] = [LastPrice].[fkName]
WHERE [LastPrice].rn = 1
ORDER BY [table2].[Name]

Grouped random order in SQL

I have a table name Employees with two columns, case id and owner. I need to write a query to select random 5 caseids for every name in owner column.
Owners name are non sorted and also caseid are unique.
Also if anyone can explain using ranking over partition by in this case?
I tried this code but its not working using self join.
select t.*
from t
where t.id in (select top 5 id
from t as t2
where t2.name = t.name
order by Rnd(-Timer()*[ID])
);
I guess you have specify the table in the subquery:
select t.*
from t
where t.id in (select top 5 t2.id
from t as t2
where t2.name = t.name
order by Rnd(-Timer()*t2.[ID])
);

SQL Tables need to be appended, but not like JOINS

I've a scenario as below.
I've two tables, and a Common column/Key between the two.
I need Table2 data to be just appended to Table1 without repetition like JOINs.
If there are more rows in one table, other table rows can be NULL.
As shown in figure, Result Table has NULLs when there is no corresponding row count from Table2.
I tried using Joins, but I'm getting a result of 45 rows. But I should get 9 rows.
Thanks in advance.
Edit: Added my queries
SELECT DISTINCT
APPT.PRSN_ID
,APPT.SCHEDULEDAPPOINTMENTS
,APPT.OVERDUEAPPOINTMENTS
,Visit.ENCOUNTER_CATEGORY
,Visit.ENCOUNTER_TYPE
,Visit.ENCOUNTER_DATE
,Visit.ENCOUNTER_FOLLOWUP_DATE
FROM
APPOINTMENTS APPT
OUTER APPLY dbo.fn_GetVisitsOfAPerson(PROV.PRSN_ID) AS Visit
/***********************************************************/
--IN THE ABOVE QUERY, THE FUNCTION IS DEFINED AS BELOW
CREATE FUNCTION dbo.fn_GetVisitsOfAPerson(#PrsnID AS bigint)
RETURNS TABLE
AS
RETURN
(
SELECT
VISITS.PVISITS_PRSN_KEY
,DATA.HE_Category_Description AS 'ENCOUNTER_CATEGORY'
,DATA.HE_Type_Description AS 'ENCOUNTER_TYPE'
,VISITS.PVISITS_DATE AS 'ENCOUNTER_DATE'
,VISITS.PVISITS_FOLLOWUP_DATE AS 'ENCOUNTER_FOLLOWUP_DATE'
FROM
[HS_PRSN_HEALTH_VISITS] VISITS INNER JOIN
[HS_HealthEncounter_Table] DATA ON
ENCOUNTER.PVISITS_CATEGORY = DATA.HE_Category_Code AND
ENCOUNTER.PVISITS_TYPE = DATA.HE_Type_Code
WHERE
PVISITS_PRSN_KEY = #PrsnID
AND PVISITS_VOID = 0
)
GO
I cannot read your data tables, but I think you just need to introduce a row number.
It would be something like this:
select . . .
from (select t1.*, row_number() over (partition by key order by ??) as seqnum
from table1 t1
) full join
(select t2.*, row_number() over (partition by key order by ??) as seqnum
from table1 t2
) t2
on t1.key = t2.key and t1.seqnum = t2.seqnum;
Based on you wanting the 9 rows present in the Visit table it looks like you need an OUTER JOIN, something like:
SELECT DISTINCT
APPT.PRSN_ID
,APPT.SCHEDULEDAPPOINTMENTS
,APPT.OVERDUEAPPOINTMENTS
,Visit.ENCOUNTER_CATEGORY
,Visit.ENCOUNTER_TYPE
,Visit.ENCOUNTER_DATE
,Visit.ENCOUNTER_FOLLOWUP_DATE
FROM Visit
LEFT OUTER JOIN
APPOINTMENTS APPT
ON Visit.key = APP.key

Can the result of a subquery be joined with itself?

Let's say I need to find the oldest animal in each zoo. It's a typical maximum-of-a-group sort of query. Only here's a complication: the zebras and giraffes are stored in separate tables. To get a listing of all animals, be they giraffes or zebras, I can do this:
(SELECT id,zoo,age FROM zebras
UNION ALL
SELECT id,zoo,age FROM giraffes) t1
Then given t1, I could build a typical maximum-of-a-group query:
SELECT t1.*
FROM t1
JOIN
(SELECT zoo,max(age) as max_age
FROM t1
GROUP BY zoo) t2
ON (t1.zoo = t2.zoo)
Clearly I could store t1 as a temporary table, but is there a way I could do this all within one query, and without having to repeat the definition of t1? (Please let's not discuss modifications to the table design; I want to focus on the issue of working with the subquery result.)
Here is a link to the with clause.
Understanding the WITH Clause
with t1 as
(select id, zoo, age from zebras
union all
select id, zoo, age from giraffes)
select t1.*
from t1
join
(SELECT zoo,max(age) as max_age
FROM t1
GROUP BY zoo) t2
on (t1.zoo = t2.zoo);
Note: You could move t2 up to your with clause as well.
Note 2: An alternative solution is to simply create t1 as a view and use it in your query instead.
To find the oldest animal, you should use wjndoq functions:
select z.*
from (select z.*,
row_number() over (partition by zoo order by age desc) as seqnum
from ( <subquery to union all animals>) z
)
where seqnum = 1

How to convert a SQL subquery to a join

I have two tables with a 1:n relationship: "content" and "versioned-content-data" (for example, an article entity and all the versions created of that article). I would like to create a view that displays the top version of each "content".
Currently I use this query (with a simple subquery):
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable
t1.version
FROM mytable as t1
WHERE (version = (SELECT MAX(version) AS topversion
FROM mytable
WHERE (fk_idothertable = t1.fk_idothertable)))
The subquery is actually a query to the same table that extracts the highest version of a specific item. Notice that the versioned items will have the same fk_idothertable.
In SQL Server I tried to create an indexed view of this query but it seems I'm not able since subqueries are not allowed in indexed views. So... here's my question... Can you think of a way to convert this query to some sort of query with JOINs?
It seems like indexed views cannot contain:
subqueries
common table expressions
derived tables
HAVING clauses
I'm desperate. Any other ideas are welcome :-)
Thanks a lot!
This probably won't help if table is already in production but the right way to model this is to make version = 0 the permanent version and always increment the version of OLDER material. So when you insert a new version you would say:
UPDATE thetable SET version = version + 1 WHERE id = :id
INSERT INTO thetable (id, version, title, ...) VALUES (:id, 0, :title, ...)
Then this query would just be
SELECT id, title, ... FROM thetable WHERE version = 0
No subqueries, no MAX aggregation. You always know what the current version is. You never have to select max(version) in order to insert the new record.
Maybe something like this?
SELECT
t2.id,
t2.title,
t2.contenttext,
t2.fk_idothertable,
t2.version
FROM mytable t1, mytable t2
WHERE t1.fk_idothertable == t2.fk_idothertable
GROUP BY t2.fk_idothertable, t2.version
HAVING t2.version=MAX(t1.version)
Just a wild guess...
You Might be able to make the MAX a table alias that does group by.
It might look something like this:
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable
t1.version
FROM mytable as t1 JOIN
(SELECT fk_idothertable, MAX(version) AS topversion
FROM mytable
GROUP BY fk_idothertable) as t2
ON t1.version = t2.topversion
I think FerranB was close but didn't quite have the grouping right:
with
latest_versions as (
select
max(version) as latest_version,
fk_idothertable
from
mytable
group by
fk_idothertable
)
select
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
from
mytable as t1
join latest_versions on (t1.version = latest_versions.latest_version
and t1.fk_idothertable = latest_versions.fk_idothertable);
M
If SQL Server accepts LIMIT clause, I think the following should work:
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable
t1.version
FROM mytable as t1 ordery by t1.version DESC LIMIT 1;
(DESC - For descending sort; LIMIT 1 chooses only the first row and
DBMS usually does good optimization on seeing LIMIT).
I don't know how efficient this would be, but:
SELECT t1.*, t2.version
FROM mytable AS t1
JOIN (
SElECT mytable.fk_idothertable, MAX(mytable.version) AS version
FROM mytable
) t2 ON t1.fk_idothertable = t2.fk_idothertable
Like this...I assume that the 'mytable' in the subquery was a different actual table...so I called it mytable2. If it was the same table then this will still work, but then I imagine that fk_idothertable will just be 'id'.
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable
t1.version
FROM mytable as t1
INNER JOIN (SELECT MAX(Version) AS topversion,fk_idothertable FROM mytable2 GROUP BY fk_idothertable) t2
ON t1.id = t2.fk_idothertable AND t1.version = t2.topversion
Hope this helps