return column name of the maximum value in sql server 2012 - sql

My table looks like this (Totally different names)
ID Column1--Column2---Column3--------------Column30
X 0 2 6 0101 31
I want to find the second maximum value of Column1 to Column30 and Put the column_Name in a seperate column.
First row would look like :
ID Column1--Column2---Column3--------------Column30------SecondMax
X 0 2 6 0101 31 Column3
Query :
Update Table
Set SecondMax= (select Column_Name from table where ...)

with unpvt as (
select id, c, m
from T
unpivot (c for m in (c1, c2, c3, ..., c30)) as u /* <-- your list of columns */
)
update T
set SecondMax = (
select top 1 m
from unpvt as u1
where
u1.id = T.id
and u1.c < (
select max(c) from unpvt as u2 where u2.id = u1.id
)
order by c desc, m
)
I really don't like relying on top but this isn't a standard sql question anyway. And it doesn't do anything about ties other than returning the first column name by order of alphabetical sort.
You could use a modification via the condition below to get the "third maximum". (Obviously the constant 2 comes from 3 - 1.) Your version of SQL Server lets you use a variable there as well. I think SQL 2012 also supports the limit syntax if that's preferable to top. And since it should work for top 0 and top 1 as well, you might just be able to run this query in a loop to populate all of your "maximums" from first to thirty.
Once you start having ties you'll eventually get a "thirtieth maximum" that's null. Make sure you cover those cases though.
and u1.c < all (
select top 2 distinct c from unpvt as u2 where u2.id = u1.id
)
And after I think about it. If you're going to rank and update so many columns it would probably make even more sense to use a proper ranking function and do the update all at once. You'll also handle the ties a lot better even if the alphabetic sorting is still arbitrary.
with unpvt as (
select id, c, m, row_number() over (partition by id order by c desc, m) as nthmax
from T
unpivot (c for m in (c1, c2, c3, ..., c30)) as u /* <-- your list of columns */
)
update T set
FirstMax = (select c from unpvt as u where u.id = T.id and nth_max = 1),
SecondMax = (select c from unpvt as u where u.id = T.id and nth_max = 2),
...
NthMax = (select c from unpvt as u where u.id = T.id and nth_max = N)

Related

Optimizing SQL Cross Join that checks if any array value in other column

Let's say I have a table events with structure:
id
value_array
XXXX
[a,b,c,d]
...
...
I have a second table values_of_interest with structure:
value
x
y
z
a
I want to find id's that have any of the values found in values_of_interest. All else equal, what would be the most performant SQL to make this happen? (I am using BigQuery, but feel free to answer more generally)
My current thought is:
SELECT
DISTINCT e.id
FROM
events e, values_of_interest vi
WHERE
EXISTS(
SELECT
value
FROM
UNNEST(e.value_array) value
JOIN
vi ON vi.value = e.value
)
Few quick options for BigQuery Standard SQL
Option 1
select id
from `project.dataset.events`
where exists (
select 1
from `project.dataset.values_of_interest`
where value in unnest(value_array)
)
Option 2
select id
from `project.dataset.events` t
where (
select count(1)
from t.value_array as value
join `project.dataset.values_of_interest`
using(value)
) > 0
I would write this using exists and a join:
select e.id
from `project.dataset.events` e
where exists (select 1
from unnest(e.value_array) val join
`project.dataset.values_of_interest` voi
on val = voi.value
);

SQL - Query to compare the exact number and content of strings

I want to write a query that returns pairs of books (f_DOI, s_DOI), which respect the following criterion: the keywords associated with s_DOI (second book) are ALL also associated with f_DOI (first book).
Keywords
Doi Keyword
1 'Adventure'
2 'Adventure'
1 'Fantasy'
2 'Thriller'
3 'Football'
4 'Football'
5 'History'
This is my code:
select k1.doi f_DOI , k2.doi s_DOI, k1.keyword
from keywords k1
join keywords k2
on k2.doi > k1.doi
where k1.keyword= k2.keyword;
This is my output:
f_DOI s_DOI KEYWORD
1 2 Adventure
3 4 Football
The first row is not correct, how you can see the f_DOI = 1 and s_DOI = 2 have in common only the 'Adventure' keyword the other two are different (How you can see in the table Keyword DOI = 1 have also keyword 'fantasy' and DOI = 2 have keyword 'thriller').
If I understand correctly, you seem to want the second to be a super set of the first. That would be:
with k as (
select k.*, count(*) over (partition by doi) as cnt
from keywords k
)
select k.doi, k2.doi
from k join
k k2
on k2.keyword = k.keyword
group by k.doi, k2.doi, k.cnt
having count(*) = k.cnt;
If you want exact matches, then include k2.cnt = k.cnt in the on clause.
(And this assumes no duplicates.)
EDIT:
You can get the exact same keywords using:
with k as (
select k.*, count(*) over (partition by doi) as cnt
from (select distinct keyword, doi from keywords k) k
)
select k.doi, k2.doi
from k join
k k2
on k2.keyword = k.keyword and k2.cnt = k.cnt
group by k.doi, k2.doi, k.cnt
having count(*) = k.cnt;
Or the listagg() approach:
select keywords, listagg(doi, ',') within group (order by keyword)
from (select doi,
listagg(keyword, ',') within group (order by keyword) as keywords
from (select distinct doi, keyword from keywords) k
group by doi
) d
group by keywords;
This can be limited by Oracle's limitations on string length.
You can use the listagg as follows:
With cte as
(Select t.doi, listagg(keyword,',') within group (order by keyword) as keyword
from your_table t
Group by t.id)
Select distinct t1.doi f_doi, t2.id s_doi, t1.keyword
From cte t1 join cte t2
On t1.adventure = t2.adventure
And t1.doi < t2.doi;

How to find rows that have one equal value and one different value from the table

I have the following table:
ID Number Revision
x y 0
x y 1
z w 0
a w 0
a w 1
b m 0
b m 0
I need to return rows that for the same Number thare are more then one ID with the same Revision.Number can be "Null" and I don't need those values.
The output should be:
z w 0
a w 0
I have tried the following query:
SELECT a.id,a.number,a.revision,
FROM table a INNER JOIN
(SELECT id, number, revision FROM table where number > '0'
GROUP BY number HAVING COUNT(*) > 1
) b ON a.revision = b.revision AND a.id != b.id
A little addition- I have rows in my table with the same Number, ID and Revision- I don't need those rows in my query to be displayed!
It is not working! Please help me to figure out how to fix it.
Thanks.
Select t.Id,s.number,t.revision
from (Select number,count(*) 'c'
from table t1
where revision=0
group by number
having count(*) > 1
) s join table t on t.number= s.number
where revision = 0
Another simple approach:
SELECT DISTINCT b.id, b.Number, b.Revision
FROM tbl a
INNER JOIN tbl b
ON a.ID != b.ID AND a.Number = b.Number AND a.Revision = b.Revision;
This is tested in MySql 5, syntax might differ slightly.
You are not that far away with your query:
SELECT a.id,a.number,a.revision
FROM table a
JOIN (
-- multiple id for the same number and revision
SELECT number, revision
FROM table
GROUP BY number, revision
HAVING COUNT(*) > 1
) b
ON a.revision = b.revision
AND a.number = b.number
Untested, but you should get the idea. If your sql-server is a resent version you can solve this with OLAP functions as well.
To filter out rows where the whole row is duplicated we can select only unique rows via group by and having:
SELECT a.id,a.number,a.revision
FROM table a
JOIN (
-- multiple id for the same number and revision
SELECT number, revision
FROM table
GROUP BY number, revision
HAVING COUNT(*) > 1
) b
ON a.revision = b.revision
AND a.number = b.number
GROUP BY a.id,a.number,a.revision
HAVING COUNT(1) = 1

Select random record within groups sqlite

I can't seem to get my head around this. I have a single table in SQlite, from which I need to select a random() record for EACH group. So, considering a table such as:
id link chunk
2 a me1
3 b me1
4 c me1
5 d you2
6 e you2
7 f you2
I need sql that will return a random link value for each chunk. So one time I run it would give:
me1 | a
you2 | f
the next time maybe
me1 | c
you2 | d
I know similar questions have been answered but I'm not finding a derivation of one that applies here.
UPDATE:
Nuts, follow up question: so now I need to EXCLUDE rows where a new field "qcinfo" is set to 'Y'.
This, of course, hides rows whenever the random ID hits one where qcinfo = 'Y', which is wrong. I need to exclude the row from being considered in the chunk, but still generate a random record for the chunk if any records have qcinfo <> 'Y'.
select t.chunk ,t.id, t.qcinfo, t.link from table1
inner join
(
select chunk ,cast(min(id)+abs(random() % (max(id)-min(id)))as int) AS random_id
from table1
group by chunk
) sq
on t.chunk = sq.chunk
and t.id = sq.random_id
where qcinfo <> 'Y'
A bit hackish, but it works... See sql fiddle http://sqlfiddle.com/#!2/81e75/7
select t.chunk
,t.link
from table1 t
inner join
(
select chunk
,FLOOR(min(id) + RAND() * (max(id)-min(id))) AS random_id
from table1
group by chunk
) sq
on t.chunk = sq.chunk
and t.id = sq.random_id
Sorry, I thought that you said MySQL.
Here is the fiddle and the code for SQLite
http://sqlfiddle.com/#!5/81e75/12
select t.chunk
,t.link
from table1 t
inner join
(
select chunk
,cast(min(id)+abs(random() % (max(id)-min(id)))as int) AS random_id
from table1
group by chunk
) sq
on t.chunk = sq.chunk
and t.id = sq.random_id
Note that SQLite returns the first value corresponding to a group when we do a group by without an aggregation
select link, chunk from table group by chunk;
Running that you'll get this
me1 | a
you2| d
Now you can make the first value random by randomly sorting the table and then grouping. Here's the final solution.
select link, chunk from (select * from table order by random()) group by chunk;

Replace no result

I have a query like this:
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
order by V.IDTipoVersamento,month(DataAllibramento)
This query must always return something. If no result is produced a
0 0 0 0
row must be returned. How can I do this. Use a isnull for every selected field isn't usefull.
Use a derived table with one row and do a outer apply to your other table / query.
Here is a sample with a table variable #T in place of your real table.
declare #T table
(
ID int,
Grp int
)
select isnull(Q.MaxID, 0) as MaxID,
isnull(Q.C, 0) as C
from (select 1) as T(X)
outer apply (
-- Your query goes here
select max(ID) as MaxID,
count(*) as C
from #T
group by Grp
) as Q
order by Q.C -- order by goes to the outer query
That will make sure you have always at least one row in the output.
Something like this using your query.
select isnull(Q.TipoVers, '0') as TipoVers,
isnull(Q.ImpTot, 0) as ImpTot,
isnull(Q.N, 0) as N,
isnull(Q.Mese, 0) as Mese
from (select 1) as T(X)
outer apply (
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese,
V.IDTipoVersamento
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
) as Q
order by Q.IDTipoVersamento, Q.Mese
Use COALESCE. It returns the first non-null value. E.g.
SELECT COALESCE(TV.Desc, 0)...
Will return 0 if TV.DESC is NULL.
You can try:
with dat as (select TV.[Desc] as TipyDesc, sum(Import) as ToImp, count(*) as N, month(Date) as Mounth
from /*DATA SOURCE HERE*/ as TV
group by [Desc], month(Date))
select [TipyDesc], ToImp, N, Mounth from dat
union all
select '0', 0, 0, 0 where (select count (*) from dat)=0
That should do what you want...
If it's ok to include the "0 0 0 0" row in a result set that has data, you can use a union:
SELECT TV.Desc as TipyDesc,
sum(Import) as TotImp,
count(*) as N,
month(Date) as Mounth
...
UNION
SELECT
0,0,0,0
Depending on the database, you may need a FROM for the second SELECT. In Oracle, this would be "FROM DUAL". For MySQL, no FROM is necessary