Select top 1 column value in the group by clause? - sql

I am joining two tables and my query contains a GROUP BY clause. There is one column in the joined table that I want to show in the result, but I don't want it to be part of the GROUP BY because its values differ from row to row. I just want to show any one (top) value of that column.
I tried to use Distinct and Top 1 and others but nothing works for me.
SELECT t1.Code, t2.Details, t2.FineDate ,Sum(t1.Amount) from Emp_Actions t1
INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code,t2.Details,t2.FineDate
Please note that I am using a stored procedure, and the code can be a specific one or all. My actual query is too big, so I made a sample to illustrate my issue. I need the top 1 FineDate; I don't want it to be part of the GROUP BY, but I still want to show it.

One way to get a single value is to use MAX(col) or MIN(col) in your SELECT statement, where col is the column that you don't want to group on. One advantage is that the value is deterministic (the largest or smallest in the group).
SELECT t1.Code, t2.Details, t2.FineDate, MAX(col) my_col, Sum(t1.Amount) from Emp_Actions t1
Also, there are more advanced analytical window functions (first_value for example), if you want to go that way, to have more control over which value is chosen, also depending on other column values.
https://learn.microsoft.com/en-us/sql/t-sql/functions/analytic-functions-transact-sql?view=sql-server-2017
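
As a rough illustration of the MAX() approach, here is a small sketch that runs against an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for SQL Server here, and the table layout, the sample rows and the helper name totals_with_latest_fine are assumptions made up for the example. FineDate is dropped from the GROUP BY and aggregated instead.

```python
import sqlite3

# SQLite stands in for SQL Server; schema and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp_Actions (Code TEXT, Amount INTEGER);
    CREATE TABLE Emp_Fines   (Code TEXT, Details TEXT, FineDate TEXT);
    INSERT INTO Emp_Actions VALUES ('A', 10), ('B', 7);
    INSERT INTO Emp_Fines VALUES
        ('A', 'late',   '2020-01-01'),
        ('A', 'late',   '2020-03-01'),
        ('B', 'absent', '2020-02-01');
""")

def totals_with_latest_fine(code_filter):
    # FineDate is aggregated with MAX instead of being listed in GROUP BY,
    # so each (Code, Details) pair yields exactly one row.
    # Note: the join repeats each Amount once per matching fine, as in the
    # original query, so SUM reflects that fan-out.
    sql = """
        SELECT t1.Code, t2.Details, MAX(t2.FineDate) AS FineDate,
               SUM(t1.Amount) AS Total
        FROM Emp_Actions t1
        INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
        WHERE (t1.Code = :code OR :code = '')
        GROUP BY t1.Code, t2.Details
        ORDER BY t1.Code
    """
    return conn.execute(sql, {"code": code_filter}).fetchall()

all_rows = totals_with_latest_fine("")   # empty string means "all codes"
one_row = totals_with_latest_fine("B")   # a specific code
print(all_rows)
print(one_row)
```

Passing an empty string mimics the "specific or all" parameter from the question.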

Use MIN() or MAX():
select t1.Code, t2.Details, t2.FineDate, Sum(t1.Amount),
max(<your column here>) as column_name
from Emp_Actions t1 join
Emp_Fines t2
on t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code, t2.Details, t2.FineDate

You can use a subquery to get the required result.
Here is some sample code:
Select
t1.Code, t2.Details, t2.FineDate,Sum(t1.Amount)
,(select top 1 a.ColA from Emp_Fines a where a.Code = t1.Code order by ColOfSort Asc/Desc) as RequiredColumn
from Emp_Actions t1
INNER JOIN Emp_Fines t2 ON t1.Code = t2.Code
where (t1.Code = @MyParameter or @MyParameter = '' )
group by t1.Code,t2.Details,t2.FineDate
Here RequiredColumn is the column that you want in the result but don't want to use in the GROUP BY.

Related

Control on SQL join condition when there are more than one matching row

I have a join with more than one matching row. How can I have more control over the join? In particular, as shown in the example, when there is more than one matching row, I don't want to pick up any of the matches; I just want to keep my input row.
In the picture, I am showing the desired result. The row in green is correctly obtained because of a direct match, and the row in red is correctly left unmatched because there is more than one match.
Select t2.id, t1.Insurance_num, t1.Name, t1.Surname
From Table_1 t1
left join Table_2 t2
on t1.Insurance_num=t2.Insurance_num
I am using Hive, but I guess the answer to this question should be a generic one.
You might try this, although I am not sure it is exactly what you need.
The left join returns NULL for NumEntries when there is no matching Insurance_num, and COALESCE turns that into 0. In that case, or when there is a match but the count is anything other than 1, the CASE expression falls through to its ELSE branch.
Only when the pre-aggregation query finds exactly one record for that Insurance_num does the result show the ID from that query.
select
case when coalesce( PreSum.NumEntries, 0 ) = 1
then cast( PreSum.LowID as varchar )
else 'N.D' end as IDorNot,
t1.Insurance_num,
t1.Name,
t1.Surname
from
table_1 t1
LEFT JOIN
( select
t2.Insurance_Num,
min( t2.id ) as LowID,
count(*) as NumEntries
from
Table_2 t2
group by
t2.Insurance_Num ) PreSum
on t1.insurance_Num = PreSum.Insurance_Num
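
To see how the pre-aggregation behaves, here is a minimal sketch using Python's sqlite3 module with an in-memory database. SQLite is only a stand-in for Hive, the sample rows are invented, and the cast targets TEXT because that is what SQLite offers.

```python
import sqlite3

# SQLite stands in for Hive; the rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_1 (Insurance_num TEXT, Name TEXT, Surname TEXT);
    CREATE TABLE Table_2 (id INTEGER, Insurance_num TEXT);
    INSERT INTO Table_1 VALUES
        ('100', 'Ann', 'Lee'),   -- exactly one match in Table_2
        ('200', 'Bob', 'Ray'),   -- two matches in Table_2
        ('300', 'Cy',  'Poe');   -- no match in Table_2
    INSERT INTO Table_2 VALUES (1, '100'), (2, '200'), (3, '200');
""")

sql = """
    SELECT CASE WHEN COALESCE(PreSum.NumEntries, 0) = 1
                THEN CAST(PreSum.LowID AS TEXT)
                ELSE 'N.D' END AS IDorNot,
           t1.Insurance_num, t1.Name, t1.Surname
    FROM Table_1 t1
    LEFT JOIN (SELECT t2.Insurance_num,
                      MIN(t2.id) AS LowID,
                      COUNT(*)   AS NumEntries
               FROM Table_2 t2
               GROUP BY t2.Insurance_num) PreSum
      ON t1.Insurance_num = PreSum.Insurance_num
    ORDER BY t1.Insurance_num
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Only the row with a single match gets an ID; the ambiguous row and the unmatched row both show 'N.D', and every row of Table_1 appears exactly once.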

SQL Server query showing most recent distinct data

I am trying to build a SQL query that retrieves only the most recent record of a table (it already has a Timestamp column) where the item by which I want to filter appears several times, as shown in my table example:
Basically, I have a table1 with Id, Millis, fkName and Price, and a table2 with Id and Name.
In table1, items can appear several times with the same fkName.
What I need is a single query that lists the last record for every fkName, so that I get the most current price for every item.
What I have tried so far is a query with
SELECT DISTINCT [table1].[Millis], [table2].[Name], [table1].[Price]
FROM [table1]
JOIN [table2] ON [table2].[Id] = [table1].[fkName]
ORDER BY [table2].[Name]
But I don't get the correct listing.
Any advice on this? Thanks in advance,
A simple and portable approach to this greatest-n-per-group problem is to filter with a subquery:
select t1.millis, t2.name, t1.price
from table1 t1
inner join table2 t2 on t2.id = t1.fkName
where t1.millis = (select max(t11.millis) from table1 t11 where t11.fkName = t1.fkName)
order by t1.millis desc
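
The subquery filter can be tried out with a small sketch like the following, which uses Python's sqlite3 module and an in-memory database as a stand-in for SQL Server. The item names and prices are invented sample data.

```python
import sqlite3

# SQLite stands in for SQL Server; the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE table1 (Id INTEGER PRIMARY KEY, Millis INTEGER,
                         fkName INTEGER, Price REAL);
    INSERT INTO table2 VALUES (1, 'apple'), (2, 'pear');
    INSERT INTO table1 VALUES
        (1, 1000, 1, 1.0),
        (2, 2000, 1, 1.5),   -- latest row for apple
        (3, 1500, 2, 0.8),   -- latest row for pear
        (4,  500, 2, 0.7);
""")

# Keep only the row whose Millis equals the maximum Millis for its fkName.
latest = conn.execute("""
    SELECT t1.Millis, t2.Name, t1.Price
    FROM table1 t1
    INNER JOIN table2 t2 ON t2.Id = t1.fkName
    WHERE t1.Millis = (SELECT MAX(t11.Millis)
                       FROM table1 t11
                       WHERE t11.fkName = t1.fkName)
    ORDER BY t1.Millis DESC
""").fetchall()
print(latest)
```

If two rows for the same fkName share the maximum Millis, this filter returns both of them.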
Using a common table expression:
;with [LastPrice] as (
select [fkName], [Millis], [Price], ROW_NUMBER() over (Partition by [fkName] order by [Millis] desc) rn
from [table1]
)
SELECT DISTINCT [LastPrice].[Millis],[table2].[Name],[LastPrice].[Price]
FROM [LastPrice]
JOIN [table2] ON [table2].[Id] = [LastPrice].[fkName]
WHERE [LastPrice].rn = 1
ORDER BY [table2].[Name]

SQL carry forward non-null value within groups

The following thread (SQL QUERY replace NULL value in a row with a value from the previous known value) was very helpful for carrying forward non-null values, but I can't figure out how to add a grouping column.
For example, I would like to do the following:
Example Data
Here is how I would have liked to code it:
UPDATE t1
SET
@n = COALESCE(number, @n),
number = COALESCE(number, @n)
GROUP BY grp
I know this isn't possible, and have seen several solutions that rely on inner joins, but those examples focus on aggregation rather than carrying forward values. For example: SQL Server Update Group by.
My attempt to make this work is to do something like the following:
UPDATE t1
SET
@n = COALESCE(number, @n),
number = COALESCE(number, @n)
FROM t1
INNER JOIN (
-- Lost on what to put in the inner join...
SELECT grp, COUNT(*) FROM t1 GROUP BY grp
) t2
on t1.grp = t2.grp
I think you can do what you want with a correlated subquery:
UPDATE t1
SET number = (SELECT TOP (1) tt1.number
FROM t1 tt1
WHERE tt1.grp = t1.grp AND tt1.? <= t1.? AND tt1.number IS NOT NULL
ORDER BY tt1.? DESC
)
FROM t1
WHERE t1.number IS NULL;
The ? is for the column that specifies "forward" in your expression "carry forward".
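
Here is a hedged sketch of the same idea that can be run with Python's sqlite3 module. SQLite stands in for SQL Server, so LIMIT 1 replaces TOP (1). The column seq is a hypothetical ordering column playing the role of the ? placeholder, and the data is invented.

```python
import sqlite3

# SQLite stands in for SQL Server. "seq" is a hypothetical ordering column
# (the "?" in the answer); the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (grp TEXT, seq INTEGER, number INTEGER);
    INSERT INTO t1 VALUES
        ('a', 1, 10), ('a', 2, NULL), ('a', 3, NULL), ('a', 4, 20), ('a', 5, NULL),
        ('b', 1, NULL), ('b', 2, 5), ('b', 3, NULL);
""")

# For each NULL row, look up the latest earlier non-NULL value in the same group.
conn.execute("""
    UPDATE t1
    SET number = (SELECT tt.number
                  FROM t1 AS tt
                  WHERE tt.grp = t1.grp
                    AND tt.seq <= t1.seq
                    AND tt.number IS NOT NULL
                  ORDER BY tt.seq DESC
                  LIMIT 1)
    WHERE number IS NULL
""")

filled = conn.execute(
    "SELECT grp, seq, number FROM t1 ORDER BY grp, seq"
).fetchall()
for row in filled:
    print(row)
```

A leading NULL in a group (such as the first row of group 'b') stays NULL, because there is no earlier value to carry forward.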

sql - ignore duplicates while joining

I have two tables.
Table1 is 1591 rows. Table2 is 270 rows.
I want to fetch specific column data from Table2 based on a condition between the two tables, and also exclude the duplicates that exist in Table2. In other words, I want to join the tables but get only one value from Table2 even if the condition matches more than once. The result should be exactly 1591 rows.
I tried left, right and inner joins, but the result always has more or fewer than 1591 rows.
Example
Table1
type,address,name
40,blabla,Adam
20,blablabla,Joe
Table2
type,currency
40,usd
40,gbp
40,omr
Joining on 'type'
Result
type,address,name,currency
40,blabla,Adam,usd
20,blablabla,Joe,null
Try this; it should work. A left join is used so that rows of Table1 without a match in Table2 are kept:
select *
from
Table1 h
left join
(select type,currency,ROW_NUMBER()over (partition by type order by
currency) as rn
from
Table2
) sr on
sr.type=h.type
and rn=1
Try this. It's standard SQL, so it should work on your RDBMS.
select * from Table1 AS t
LEFT OUTER JOIN Table2 AS y ON t.[type] = y.[type] and y.currency = (SELECT MAX(currency) FROM Table2 WHERE [type] = t.[type])
If you want to control which currency is joined, consider altering Table2 by adding a new column active/non active and modifying accordingly the JOIN clause.
You can use outer apply if it's supported.
select a.type, a.address, a.name, b.currency
from Table1 a
outer apply (
select top 1 currency
from Table2
where Table2.type = a.type
) b
A typical way to do this uses a correlated subquery. This guarantees that all rows in the first table are kept. A scalar subquery generates an error if it returns more than one row, which is why the subquery below is limited to one row.
So:
select t1.*,
(select t2.currency
from table2 t2
where t2.type = t1.type
fetch first 1 row only
) as currency
from table1 t1;
You don't specify what database you are using, so this uses standard syntax for returning one row. Some databases use limit or top instead.
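
The following sketch runs the correlated-subquery variant with Python's sqlite3 module on the sample data from the question. SQLite is used only as a convenient stand-in, so LIMIT 1 takes the place of fetch first 1 row only. The ORDER BY inside the subquery is an addition that makes the chosen currency predictable.

```python
import sqlite3

# SQLite is a stand-in database; the rows come from the question's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (type INTEGER, address TEXT, name TEXT);
    CREATE TABLE Table2 (type INTEGER, currency TEXT);
    INSERT INTO Table1 VALUES (40, 'blabla', 'Adam'), (20, 'blablabla', 'Joe');
    INSERT INTO Table2 VALUES (40, 'usd'), (40, 'gbp'), (40, 'omr');
""")

# The scalar subquery returns at most one currency per Table1 row,
# so the result can never have more rows than Table1.
rows = conn.execute("""
    SELECT t1.type, t1.address, t1.name,
           (SELECT t2.currency
            FROM Table2 t2
            WHERE t2.type = t1.type
            ORDER BY t2.currency   -- added so the pick is deterministic
            LIMIT 1) AS currency
    FROM Table1 t1
    ORDER BY t1.name
""").fetchall()

table1_count = conn.execute("SELECT COUNT(*) FROM Table1").fetchone()[0]
print(rows, table1_count)
```

With this ordering the first currency alphabetically is picked for type 40; a different ORDER BY would select a different one, while type 20 gets NULL.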

SQL - get max result

Assume there is a table name "test" below:
name value
n1 1
n2 2
n3 3
Now, I want to get the name which has the max value, I have some solution below:
Solution 1:
SELECT TOP 1 name
FROM test
ORDER BY value DESC
solution 2:
SELECT name
FROM test
WHERE value = (SELECT MAX(value) FROM test);
Now, I would like to use a join operation to find the result, like
SELECT name
FROM test
INNER JOIN test ON...
Could someone please help and explain how it works?
If you are looking for JOIN then
SELECT T.name, T.value
FROM test T
INNER JOIN
( SELECT T1.name, T1.value ,
RANK() OVER (PARTITION BY T1.name ORDER BY T1.value) N
FROM test T1
WHERE T1.value IN (SELECT MAX(t2.value) FROM test T2)
)T3 ON T3.N = 1 AND T.name = T3.name
or
select name, value
from
(
select name, value,
row_number() over(order by value desc) rn
from test
) src
where rn = 1
First, note that solutions 1 and 2 could give different results when value is not unique. If your test data contained an additional record ('n4', 3), then solution 1 would return either 'n3' or 'n4', but solution 2 would return both.
A solution with JOIN needs aliases for the table, because with the query as you started it the engine would report Ambiguous column name 'name': it would not know whether to take name from the first or second occurrence of the test table.
Here is a way to complete the JOIN version:
SELECT t1.name
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
WHERE t2.value IS NULL;
This query takes each of the records, and checks if any records exist that have a higher value. If not, the first record will be in the result. Note the use of LEFT: this denotes an outer join, so that records from t1 that have no match with t2 -- based on the ON condition -- are not immediately rejected (as would be the case with INNER): in fact, we want to reject all the other records, which is done with the WHERE clause.
A way to understand this mechanism, is to look at a variant of the query above, which lacks the WHERE clause and returns the values of both tables:
SELECT t1.value, t2.value
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
On your test data this will return:
t1.value t2.value
1 2
1 3
2 3
3 (null)
Note that the last entry would not be there if the join were an INNER JOIN. But with the outer join, one can now look for the NULL values and actually get those records in the result that would be excluded from an INNER JOIN.
Note that this query will give the same result as solution 2 when there are duplicate values. If you want to have also only one result like with solution 1, it suffices to add TOP 1 after SELECT.
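
The behaviour with duplicate values can be checked with a short sketch like this one, which uses Python's sqlite3 module and an in-memory database. SQLite is a stand-in for SQL Server, so LIMIT 1 replaces TOP 1, and the extra record ('n4', 3) is the hypothetical duplicate mentioned above.

```python
import sqlite3

# SQLite stands in for SQL Server; ('n4', 3) is the hypothetical tie.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (name TEXT, value INTEGER);
    INSERT INTO test VALUES ('n1', 1), ('n2', 2), ('n3', 3), ('n4', 3);
""")

# Anti-join: keep rows of t1 for which no row with a higher value exists.
anti_join = """
    SELECT t1.name
    FROM test t1
    LEFT JOIN test t2
      ON t2.value > t1.value
    WHERE t2.value IS NULL
    ORDER BY t1.name
"""
all_max = [name for (name,) in conn.execute(anti_join)]

# Limiting to one row mimics TOP 1; ORDER BY makes the pick deterministic.
top_one = conn.execute(anti_join + " LIMIT 1").fetchone()[0]
print(all_max, top_one)
```

Both tied names come back from the plain anti-join, while the limited version returns a single name.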
Alternative with pure INNER JOIN
If you really want an INNER join, then this will do it. Again the TOP 1 is only needed if you have non-unique values:
SELECT TOP 1 t1.name
FROM test t1
INNER JOIN (SELECT Max(value) AS value FROM test) t2
ON t2.value = t1.value;
But this one really is very similar to what you did in solution 2.