Last value per column in group, one row per group - sql

This should be simple but for some reason I'm stuck. Consider the following data:
KEY1 KEY2 COL1 COL2 COL3
--------------------------------------
1 1 A 7 (null)
1 2 A 8 (null)
1 3 (null) 7 (null)
2 2 (null) (null) 4
2 4 B 6 (null)
3 1 A B (null)
(KEY1 is the Id, KEY2 is the generation, and there are actually about 30 data columns but I'm only listing 3 here for simplicity.)
I want to get one row per Id, and for each column get the last non-null value. In other words...
KEY1 COL1 COL2 COL3
----------------------------
1 A 7 (null)
2 B 6 4
3 A B (null)
I tried the following but it seems to do nothing other than echo out all my rows.
SELECT key1,
LAST_VALUE(col1) OVER (PARTITION BY key1 ORDER BY key2 ASC) AS col1,
LAST_VALUE(col2) OVER (PARTITION BY key1 ORDER BY key2 ASC) AS col2,
LAST_VALUE(col3) OVER (PARTITION BY key1 ORDER BY key2 ASC) AS col3
FROM test1
(And this is for SQL Server 2012 and SQL Server Express.)

SQL Server does not (yet) support the IGNORE NULLS option on window functions. One method is to use conditional aggregation. This requires generating a sequence number per column so that, within each key1, the value 1 goes to the most recent row where that column is non-NULL.
Here is a query that should do this:
select t1.key1,
max(case when seqnum1 = 1 then col1 end) as col1,
max(case when seqnum2 = 1 then col2 end) as col2,
max(case when seqnum3 = 1 then col3 end) as col3
from (select t1.*,
row_number() over (partition by key1
order by (case when col1 is not null then 1 else 2 end),
key2 desc
) as seqnum1,
row_number() over (partition by key1
order by (case when col2 is not null then 1 else 2 end),
key2 desc
) as seqnum2,
row_number() over (partition by key1
order by (case when col3 is not null then 1 else 2 end),
key2 desc
) as seqnum3
from test1 t1
) t1
group by t1.key1
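To check the conditional aggregation idea against the sample data from the question, here is a small sketch that runs the same query shape in SQLite through Python's sqlite3 module. SQLite is only a stand-in here (the question targets SQL Server), the data columns are stored as text as an assumption, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test1 (key1 INTEGER, key2 INTEGER, col1 TEXT, col2 TEXT, col3 TEXT);
INSERT INTO test1 VALUES
  (1, 1, 'A',  '7',  NULL),
  (1, 2, 'A',  '8',  NULL),
  (1, 3, NULL, '7',  NULL),
  (2, 2, NULL, NULL, '4'),
  (2, 4, 'B',  '6',  NULL),
  (3, 1, 'A',  'B',  NULL);
""")

# For each column, rank non-NULL rows first and, among those, the highest key2 first.
# Row number 1 is then the last non-NULL value (or a NULL if the column is all NULL).
last_values_sql = """
SELECT key1,
       MAX(CASE WHEN seqnum1 = 1 THEN col1 END) AS col1,
       MAX(CASE WHEN seqnum2 = 1 THEN col2 END) AS col2,
       MAX(CASE WHEN seqnum3 = 1 THEN col3 END) AS col3
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY key1
                                ORDER BY (CASE WHEN col1 IS NOT NULL THEN 1 ELSE 2 END),
                                         key2 DESC) AS seqnum1,
             ROW_NUMBER() OVER (PARTITION BY key1
                                ORDER BY (CASE WHEN col2 IS NOT NULL THEN 1 ELSE 2 END),
                                         key2 DESC) AS seqnum2,
             ROW_NUMBER() OVER (PARTITION BY key1
                                ORDER BY (CASE WHEN col3 IS NOT NULL THEN 1 ELSE 2 END),
                                         key2 DESC) AS seqnum3
      FROM test1 AS t) AS s
GROUP BY key1
ORDER BY key1
"""
last_values = conn.execute(last_values_sql).fetchall()
for row in last_values:
    print(row)
```

With the sample rows this yields one row per key1, matching the desired output in the question (NULL shows up as None in Python).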

If I understood the requirements correctly, shouldn't this work? Might be quite expensive depending on the amount of data / columns.
select
key1,
(select top 1 col1 from test1 t2 where t.key1 = t2.key1 and col1 is not null order by key2 desc) as col1,
(select top 1 col2 from test1 t2 where t.key1 = t2.key1 and col2 is not null order by key2 desc) as col2,
(select top 1 col3 from test1 t2 where t.key1 = t2.key1 and col3 is not null order by key2 desc) as col3
from
(select distinct key1 from test1) t

Related

Specify row number start point

I have some sample data below and I want to use row_number but make it start when the value is 0 for col3
I have tried the below but it doesn't work
row_number() over (partition by col1,col2, case when col3 = 0 then 1 end order by col4 desc) as row2
Col1  col2  col3  col4  row_number (output wanted)
----------------------------------------------------
abc   def   7     500
abc   def   0     300   1
abc   def   1     200
abc   def   0     2     2
abc   def   4     30
Have NULL for others
case when col3 = 0
then row_number() over (partition by col1,col2, col3 order by col4 desc) end as row2
A row number is just a running count of rows. So you should be able to do this:
select
col1, col2, col3, col4,
count(case when col3 = 0 then 1 end)
over (partition by col1, col2 order by col4 desc, ctid) as row2
from ...
order by col1, col2, col4 desc, ctid;
I have added PostgreSQL's internal ctid column in order to get a deterministic order for the case of duplicate col4 values. If such duplicates are not possible, you can remove ctid from the ORDER BY clause.
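As a rough sketch of the running-count idea, the snippet below runs it in SQLite via Python's sqlite3 module on the sample rows from the question. Two things are assumptions on my side: an explicit id column stands in for PostgreSQL's ctid as the tie-breaker, and the count is wrapped in a CASE so that rows with col3 <> 0 show NULL, as the wanted output suggests. The table name sample is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT,
                     col3 INTEGER, col4 INTEGER);
INSERT INTO sample (col1, col2, col3, col4) VALUES
  ('abc', 'def', 7, 500),
  ('abc', 'def', 0, 300),
  ('abc', 'def', 1, 200),
  ('abc', 'def', 0, 2),
  ('abc', 'def', 4, 30);
""")

# Running count of the col3 = 0 rows, ordered by col4 descending.
# The outer CASE blanks out the count on rows where col3 is not 0.
running_count_sql = """
SELECT col3, col4,
       CASE WHEN col3 = 0
            THEN COUNT(CASE WHEN col3 = 0 THEN 1 END)
                 OVER (PARTITION BY col1, col2 ORDER BY col4 DESC, id)
       END AS row2
FROM sample
ORDER BY col1, col2, col4 DESC, id
"""
numbered = conn.execute(running_count_sql).fetchall()
for row in numbered:
    print(row)
```

Without the outer CASE, the non-zero rows would carry the count of zero rows seen so far instead of NULL.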
Thanks guys but I just figured it out.
I did
case when col3 =0 then row_number() over (partition by col1,col2,col3=0 order by col4 desc) else null end as Row
This keeps a separate running count for the 0 rows and the non-0 rows, and the CASE simply hides the count for the non-0 ones.

Can I change column order in SQL table based on a value that appears in different columns?

I have a table that looks like this:
Column1 | Column2 | Column3| Column4
4 | 3 | 2 | 1
2 | 1
3 | 2 | 1
I want to flip the columns so that 1 always start in column 1 and then the rest of the values follow to the right. Like this:
Column1 | Column2 | Column3 | Column4
1 | 2 | 3 | 4
1 | 2
1 | 2 | 3
This is an example table. The real table is a hierarchy of a company so 1 = CEO and 2 = SVP for example. 1 is always the same name but as the number gets higher (lower in chain of command) the more names that are in that level. I'm hoping for an automated solution that looks for 1, makes that the first column and then populates the columns. I am struggling because the value that 1 represents is in different columns so I can't just change the order of the columns.
I was able to accomplish this using VBA but I would prefer to keep it in SQL.
I don't have any useful code that I have tried so far.
You can use Case expression:
WITH CTE1 AS
(SELECT 4 AS COL1, 3 AS COL2 , 2 AS COL3, 1 AS COL4 FROM DUAL
UNION ALL
SELECT 2, 1, NULL, NULL FROM DUAL
UNION ALL
SELECT 3, 2, 1, NULL FROM DUAL
)
SELECT CASE WHEN COL1 <> 1 THEN 1 ELSE COL1 END AS COL1,
CASE WHEN COL2 <> 2 THEN 2 ELSE COL2 END AS COL2,
CASE WHEN COL3 <> 3 THEN 3 ELSE COL3 END AS COL3,
CASE WHEN COL4 <> 4 THEN 4 ELSE COL4 END AS COL4
FROM CTE1;
You can apply a few CASE expressions that check all possibilities; this assumes NULLs for missing data:
COALESCE(col4,col3,col2,col1) AS c1,
CASE
WHEN col4 IS NOT NULL THEN col3
WHEN col3 IS NOT NULL THEN col2
WHEN col2 IS NOT NULL THEN col1
END AS c2,
CASE
WHEN col4 IS NOT NULL THEN col2
WHEN col3 IS NOT NULL THEN col1
END AS c3,
CASE
WHEN col4 IS NOT NULL THEN col1
END AS c4
You want to sort the values. A generic SQL solution would use:
select max(case when seqnum = 1 then col end) as col1,
max(case when seqnum = 2 then col end) as col2,
max(case when seqnum = 3 then col end) as col3,
max(case when seqnum = 4 then col end) as col4
from (select col1, col2, col3, col4, col,
row_number() over (partition by col1, col2, col3, col4 order by col) as seqnum
from ((select col1 as col, 1 as which, col1, col2, col3, col4 from t) union all
(select col2 as col, 2 as which, col1, col2, col3, col4 from t) union all
(select col3 as col, 3 as which, col1, col2, col3, col4 from t) union all
(select col4 as col, 4 as which, col1, col2, col3, col4 from t)
) t
where col is not null
) t
group by col1, col2, col3, col4;
This would be simpler in a database that supports lateral joins. And a unique id on each row would also help.
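Here is a minimal sketch of the unpivot, sort, and re-pivot approach with a unique id per row, run in SQLite through Python's sqlite3 module. The id column and the table layout are assumptions added for illustration; the question's table has no such key, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, col1, col2, col3, col4);
INSERT INTO t (col1, col2, col3, col4) VALUES
  (4, 3, 2, 1),
  (2, 1, NULL, NULL),
  (3, 2, 1, NULL);
""")

# 1. Unpivot the four columns into (id, col) pairs with UNION ALL.
# 2. Number the non-NULL values of each original row in ascending order.
# 3. Pivot back: value number N goes into column N.
sort_columns_sql = """
SELECT id,
       MAX(CASE WHEN seqnum = 1 THEN col END) AS col1,
       MAX(CASE WHEN seqnum = 2 THEN col END) AS col2,
       MAX(CASE WHEN seqnum = 3 THEN col END) AS col3,
       MAX(CASE WHEN seqnum = 4 THEN col END) AS col4
FROM (SELECT id, col,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY col) AS seqnum
      FROM (SELECT id, col1 AS col FROM t UNION ALL
            SELECT id, col2 FROM t UNION ALL
            SELECT id, col3 FROM t UNION ALL
            SELECT id, col4 FROM t) AS u
      WHERE col IS NOT NULL) AS s
GROUP BY id
ORDER BY id
"""
sorted_rows = conn.execute(sort_columns_sql).fetchall()
for row in sorted_rows:
    print(row)
```

Partitioning by id keeps each original row's values together, which also copes with two rows that happen to hold identical values.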

Sqlite insert both even and odd rows in one expression

I am using sqlite3 and I have a sqlite table which has somewhat duplicated/overlapping columns. To illustrate:
No Col1 Col2 Col3 Col4
row1 1 1 1 2 2
row2 2 1 1 3 3
row3 3 2 2 4 4
row4 4 2 2 5 5
Col1 and Col2 store the same information; however, Col3 and Col4 have different information.
I want to condense the rows into one row like this:
No Col1 Col2 Col3 Col4 Col3.2 Col4.2
row1 1 1 1 2 2 3 3
row3 3 2 2 4 4 5 5
I have created a new table with the columns, and was able to select the odd rows.
INSERT INTO [Table] ( No, Col1, Col2, Col3, Col4
)
SELECT No, Col1, Col2, Col3, Col4
FROM [Table]
WHERE ([No] % 2) = 1
ORDER BY [No];
The result table would be something like:
No Col1 Col2 Col3 Col4 Col3.2 Col4.2
row1 1 1 1 2 2 null null
row3 3 2 2 4 4 null null
Now I am not sure how to insert the even values into the new table. Using similar expressions only inserts more rows. Is it possible to do this INSERT INTO in one statement? Or how do I update the new table?
Just join the table with itself based on the following condition. It'll even work if the No column has gaps:
SELECT o.No, o.Col1, o.Col2, o.Col3, o.Col4, e.Col3, e.Col4
FROM t AS o
INNER JOIN t AS e ON o.Col1 = e.Col1
AND o.Col2 = e.Col2
AND o.No < e.No
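To tie the self-join back to the original question, the sketch below feeds that SELECT into a single INSERT, run with Python's sqlite3 module. The target table name condensed and the column names Col3_2 and Col4_2 are made up for the example, and it assumes each (Col1, Col2) pair occurs in exactly two rows; with three or more rows per pair the join would produce extra combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE t ("No" INTEGER PRIMARY KEY, Col1, Col2, Col3, Col4);
INSERT INTO t VALUES
  (1, 1, 1, 2, 2),
  (2, 1, 1, 3, 3),
  (3, 2, 2, 4, 4),
  (4, 2, 2, 5, 5);

CREATE TABLE condensed ("No" INTEGER PRIMARY KEY,
                        Col1, Col2, Col3, Col4, Col3_2, Col4_2);

-- One statement: pair each row with the later row that shares Col1 and Col2.
INSERT INTO condensed
SELECT o."No", o.Col1, o.Col2, o.Col3, o.Col4, e.Col3, e.Col4
FROM t AS o
INNER JOIN t AS e ON o.Col1 = e.Col1
                 AND o.Col2 = e.Col2
                 AND o."No" < e."No";
''')

condensed_rows = conn.execute('SELECT * FROM condensed ORDER BY "No"').fetchall()
for row in condensed_rows:
    print(row)
```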
Use pivoting logic with aggregation:
SELECT
MIN(No) AS No,
MAX(CASE WHEN No % 2 = 1 THEN Col1 END) AS Col1,
MAX(CASE WHEN No % 2 = 1 THEN Col2 END) AS Col2,
MAX(CASE WHEN No % 2 = 1 THEN Col3 END) AS Col3,
MAX(CASE WHEN No % 2 = 1 THEN Col4 END) AS Col4,
MAX(CASE WHEN No % 2 = 0 THEN Col1 END) AS Col1_2,
MAX(CASE WHEN No % 2 = 0 THEN Col2 END) AS Col2_2,
MAX(CASE WHEN No % 2 = 0 THEN Col3 END) AS Col3_2,
MAX(CASE WHEN No % 2 = 0 THEN Col4 END) AS Col4_2
FROM yourTable
GROUP BY
(No-1) / 2;
Another approach, using window functions added in sqlite 3.25:
CREATE TABLE table2(no INTEGER PRIMARY KEY, col1, col2, col3, col4, "col3.2", "col4.2");
INSERT INTO table2
SELECT *
FROM (SELECT no, col1, col2, col3, col4, lead(col3) OVER win, lead(col4) OVER win
FROM table1
WINDOW win AS (ORDER BY no))
WHERE no % 2 = 1;
which gives
SELECT * FROM table2;
no col1 col2 col3 col4 col3.2 col4.2
---------- ---------- ---------- ---------- ---------- ---------- ----------
1 1 1 2 2 3 3
3 2 2 4 4 5 5

SQL Server : get max of the column2 and column3 value must be 1

I have an output of some part of my stored procedure like this:
col1 col2 col3 col4
--------------------------
2016-05-05 1 2 2
2016-05-05 1 3 32
2016-05-12 2 1 11
2016-05-12 3 1 31
Now I need to get result based on this condition
col2 = 1 and col3 = max
or col3 = 1 and col2 = max
The final result should be
col1 col2 col3 col4
-------------------------
2016-05-05 1 3 32
2016-05-12 3 1 31
Not sure if that's the most efficient way, but you can use ROW_NUMBER() (YourTable is a placeholder for your stored procedure's output):
SELECT * FROM (
SELECT t.*,
ROW_NUMBER() OVER(PARTITION BY t.col1 ORDER BY t.col3 DESC) as rnk
FROM YourTable t
WHERE t.col2 = 1
UNION ALL
SELECT t.*,
ROW_NUMBER() OVER(PARTITION BY t.col1 ORDER BY t.col2 DESC) as rnk
FROM YourTable t
WHERE t.col3 = 1) tt
WHERE rnk = 1
This will give you all the records with
(col2=1 and col3=max) or (col3=1 and col2=max)
This is a bit tricky. Your data has no ambiguities, such as duplicate maxima in col4 or "1" values in both col2 and col3.
The following is a direct translation of the logic in your question:
select t.*
from t
where t.col4 = (select max(t2.col4)
from t t2
where t2.col1 = t.col1 and (t2.col2 = 1 or t2.col3 = 1)
);
Try this. Note that if several rows tie for the maximum value, all of them are returned. It should also work when col1 is not in sync with col2 and col3.
I first rank the rows by col2 and by col3 in descending order, so the highest value of each gets rank 1. Then, in the outer query, I apply your condition. The demo was created for Postgres because SQL Server wasn't available.
SQLFiddle Demo
select col1,col2,col3,col4
from
(
select t.*,
RANK() OVER(ORDER BY col3 DESC) as col3_max,
RANK() OVER(ORDER BY col2 DESC) as col2_max
from your_table t
) t1
where
(col2=1 and col3_max=1)
OR
(col3=1 and col2_max=1)
Alternative way:
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY iif(col2 = 1, col3, col2) DESC) as r
FROM tbl) t
WHERE r = 1
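The single ROW_NUMBER() variant can be tried on the sample rows with the sketch below, run in SQLite through Python's sqlite3 module. A CASE expression replaces iif here only because older SQLite versions lack iif; storing col1 as text is an assumption for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (col1 TEXT, col2 INTEGER, col3 INTEGER, col4 INTEGER);
INSERT INTO tbl VALUES
  ('2016-05-05', 1, 2, 2),
  ('2016-05-05', 1, 3, 32),
  ('2016-05-12', 2, 1, 11),
  ('2016-05-12', 3, 1, 31);
""")

# Per date, sort by col3 when col2 = 1 and by col2 otherwise, largest first,
# then keep the first row of each date.
top_rows_sql = """
SELECT col1, col2, col3, col4
FROM (SELECT tbl.*,
             ROW_NUMBER() OVER (
                 PARTITION BY col1
                 ORDER BY CASE WHEN col2 = 1 THEN col3 ELSE col2 END DESC) AS r
      FROM tbl) AS ranked
WHERE r = 1
ORDER BY col1
"""
top_rows = conn.execute(top_rows_sql).fetchall()
for row in top_rows:
    print(row)
```

Note that ROW_NUMBER() picks an arbitrary row when two rows tie for the maximum; RANK() would return all of them.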

select query to fetch rows corresponding to all values in a column

Consider this example table "Table1".
Col1 Col2
A 1
B 1
A 4
A 5
A 3
A 2
D 1
B 2
C 3
B 4
I am trying to fetch those values from Col1 which corresponds to all values (in this case, 1,2,3,4,5). Here the result of the query should return 'A' as none of the others have all values 1,2,3,4,5 in Col2.
Note that the values in Col2 are decided by other parameters in the query and they will always return some numeric values. Out of those values the query needs to fetch values from Col1 corresponding to all in Col2. The values in Col2 could be 11,12,1,2,3,4 for instance (meaning not necessarily in sequence).
I have tried the following select query:
select distinct Col1 from Table1 where Col1 in (1,2,3,4,5);
select distinct Col1 from Table1 where Col1 exists (select distinct Col2 from Table1);
and its different variations. But the problem is that I need to apply an 'and' for Col2 not an 'or'.
like Return a value from Col1 where Col2 'contains' all values between 1 and 5.
Appreciate any suggestion.
You could use analytic ROW_NUMBER() function.
SQL Fiddle for a setup and working demonstration.
SELECT col1
FROM
(SELECT col1,
col2,
row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
FROM your_table
WHERE col2 IN (1,2,3,4,5)
)
WHERE rn =5;
UPDATE As requested by OP, some explanation about how the query works.
The inner sub-query gives you the following resultset:
SQL> SELECT col1,
2 col2,
3 row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
4 FROM t
5 WHERE col2 IN (1,2,3,4,5);
C COL2 RN
- ---------- ----------
A 1 1
A 2 2
A 3 3
A 4 4
A 5 5
B 1 1
B 2 2
B 4 3
C 3 1
D 1 1
10 rows selected.
The PARTITION BY clause groups the rows by col1, and ORDER BY sorts col2 within each group, so the sub-query numbers the rows of each col1 group in order. Because the WHERE clause keeps only the five wanted values, a group reaches row number 5 only if it contains all of them (assuming each col1, col2 pair occurs at most once). So, in the outer query, all you need to do is filter with WHERE rn = 5.
You can use listagg function, like
SELECT Col1
FROM
(select Col1,listagg(Col2,',') within group (order by Col2) Col2List from Table1
group by Col1)
WHERE Col2List = '1,2,3,4,5'
You can also use below
SELECT COL1
FROM TABLE_NAME
GROUP BY COL1
HAVING COUNT(COL1) = 5
AND SUM((CASE WHEN COL2 = 1 THEN 1 ELSE 0 END)
+ (CASE WHEN COL2 = 2 THEN 1 ELSE 0 END)
+ (CASE WHEN COL2 = 3 THEN 1 ELSE 0 END)
+ (CASE WHEN COL2 = 4 THEN 1 ELSE 0 END)
+ (CASE WHEN COL2 = 5 THEN 1 ELSE 0 END)) = 5
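Since the question says the values in Col2 are not fixed in advance, a variant that avoids hard-coding 1 to 5 may be useful. The sketch below, run in SQLite through Python's sqlite3 module, compares each group's count of distinct Col2 values with the overall count. It assumes the set of required values is simply every distinct Col2 value in Table1; in the real query that sub-select would be whatever produces the required values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Col1 TEXT, Col2 INTEGER);
INSERT INTO Table1 VALUES
  ('A', 1), ('B', 1), ('A', 4), ('A', 5), ('A', 3),
  ('A', 2), ('D', 1), ('B', 2), ('C', 3), ('B', 4);
""")

# A Col1 value qualifies when it is paired with as many distinct Col2 values
# as exist in the whole table, i.e. with all of them.
division_sql = """
SELECT Col1
FROM Table1
GROUP BY Col1
HAVING COUNT(DISTINCT Col2) = (SELECT COUNT(DISTINCT Col2) FROM Table1)
ORDER BY Col1
"""
complete_groups = conn.execute(division_sql).fetchall()
print(complete_groups)
```

COUNT(DISTINCT ...) also keeps duplicate (Col1, Col2) rows from inflating the count.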