I have a result set in the form below, returned by a SQL query:
ID Key
1 A
2 A
3 A
Now my requirement is to show the data in the below form:
Key ID1 ID2 ID3
A 1 2 3
How do I build a SQL query for this?
A Windowed Aggregate based solution with a single STATS-step in Explain:
SELECT
key,
-- value from 1st row = current row
ID AS ID1,
-- value from next row, similar to LEAD(ID, 1) Over (PARTITION BY Key ORDER BY ID)
Min(ID)
Over (PARTITION BY Key
ORDER BY ID
ROWS BETWEEN 1 Following AND 1 Following) AS ID2 ,
-- value from 3rd row
Min(ID)
Over (PARTITION BY Key
ORDER BY ID
ROWS BETWEEN 2 Following AND 2 Following) AS ID3
FROM mytable
QUALIFY -- only return the 1st row
Row_Number()
Over (PARTITION BY key
ORDER BY ID) = 1
Teradata 14.10 doesn't have a PIVOT function. Assuming that every unique key has no more than 3 IDs (as mentioned in the comments), you can use ROW_NUMBER() and an aggregate function as below to get your desired result.
SELECT
key1,
MAX(CASE WHEN rn = 1 THEN ID END) AS ID1,
MAX(CASE WHEN rn = 2 THEN ID END) AS ID2,
MAX(CASE WHEN rn = 3 THEN ID END) AS ID3
FROM
(SELECT
t.*,
ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY ID) AS rn
FROM table1 t) t
GROUP BY key1;
Result:
+------------+-----+-----+-----+
| key1 | id1 | id2 | id3 |
+------------+-----+-----+-----+
| A | 1 | 2 | 3 |
+------------+-----+-----+-----+
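For anyone without a Teradata system at hand, here is a minimal sketch of the same ROW_NUMBER() plus MAX(CASE ...) pivot, run against an in-memory SQLite database from Python. SQLite (3.25 or newer, for window functions) is only a stand-in, and the extra key B with two IDs is hypothetical data added to show what happens when a key has fewer than three IDs.

```python
import sqlite3

# Assumption: SQLite stands in for Teradata; the pivot logic is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, key1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (id, key1) VALUES (?, ?)",
    # Key A is the sample from the question; key B is hypothetical.
    [(1, "A"), (2, "A"), (3, "A"), (4, "B"), (5, "B")],
)

pivot_sql = """
SELECT key1,
       MAX(CASE WHEN rn = 1 THEN id END) AS id1,
       MAX(CASE WHEN rn = 2 THEN id END) AS id2,
       MAX(CASE WHEN rn = 3 THEN id END) AS id3
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY key1 ORDER BY id) AS rn
      FROM table1 t) AS numbered
GROUP BY key1
ORDER BY key1
"""

# One row per key; slots without a matching rn stay NULL (None in Python).
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

A key with fewer than three IDs simply leaves the remaining columns NULL, and a fourth ID for a key would be silently dropped, which is why the "no more than 3 IDs" assumption matters.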
I am trying to write a query where I have some criteria where I pivot the results. However, due to output file constraints I am looking for the output to create a new line after the pivot exceeds X, even if the ID and such is otherwise the same.
What I am trying to do:
|--ID--|-Value-|
| 1 | val1 |
| 1 | val2 |
| 1 | val3 |
| 2 | val1 |
|--ID--|-Col1-|-Col2-|
| 1 | Val1| Val2|
| 1 | Val3| |
| 2 | Val1| |
SELECT *
FROM table
PIVOT(max(value) for field1 in (t1,t2))
as pvt
ORDER BY UNIQUE_ID
This is just an example that pivots this particular column. However, the output has a very strict limit on the number of columns, so I'd be looking for any pivoted value beyond the 5th to "overflow" to the next row while retaining the unique ID. I am looking at PIVOT, but I don't think it will work here.
Is this even possible within the Snowflake platform or do I need to explore other options?
This requirement is purely a presentation matter and, in my opinion, should not be handled at the database level. That said, it is possible to achieve it by numbering the rows in each group and using integer division and modulo:
Sample data:
CREATE OR REPLACE TABLE tab
AS
SELECT 1 AS id, 'val1' AS value UNION
SELECT 1 AS id, 'val2' AS value UNION
SELECT 1 AS id, 'val3' AS value UNION
SELECT 2 AS id, 'val1' AS value;
Query:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT
id
,MAX(CASE WHEN rn % 2 = 0 THEN value END) AS col1
,MAX(CASE WHEN rn % 2 = 1 THEN value END) AS col2
FROM cte
GROUP BY id, FLOOR(rn / 2)
ORDER BY id, FLOOR(rn / 2);
Intermediate result:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT id,value, rn, FLOOR(rn / 2) AS row_index, rn % 2 AS column_index
FROM cte
ORDER BY ID, rn;
Generalized:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY value) - 1 AS rn
FROM tab
)
SELECT
id
,MAX(CASE WHEN rn % N = 0 THEN value END) AS col1
,MAX(CASE WHEN rn % N = 1 THEN value END) AS col2
-- ....
,MAX(CASE WHEN rn % N = N-1 THEN value END) AS colN
FROM cte
GROUP BY id, FLOOR(rn / N)
ORDER BY id, FLOOR(rn / N);
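The generalized pattern lends itself to being generated rather than typed out. Below is a rough sketch that builds the query text for a given N and runs it on the sample data. It uses in-memory SQLite instead of Snowflake (an assumption made only so the example is runnable), and `overflow_pivot_sql` is a hypothetical helper name. In SQLite, `rn / N` on integers is already integer division, so it takes the place of FLOOR(rn / N).

```python
import sqlite3


def overflow_pivot_sql(n_cols):
    """Return the modulo/integer-division pivot query for n_cols columns."""
    value_cols = ",\n       ".join(
        f"MAX(CASE WHEN rn % {n_cols} = {i} THEN value END) AS col{i + 1}"
        for i in range(n_cols)
    )
    # rn / n_cols is integer division in SQLite, which plays the role of
    # FLOOR(rn / N) in the Snowflake query.
    return f"""
WITH cte AS (
    SELECT id, value,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) - 1 AS rn
    FROM tab
)
SELECT id,
       {value_cols}
FROM cte
GROUP BY id, rn / {n_cols}
ORDER BY id, rn / {n_cols}
"""


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(1, "val1"), (1, "val2"), (1, "val3"), (2, "val1")],
)

two_wide = conn.execute(overflow_pivot_sql(2)).fetchall()
three_wide = conn.execute(overflow_pivot_sql(3)).fetchall()
print(two_wide)
print(three_wide)
```

With N = 2 the third value of id 1 overflows into a second row; with N = 3 everything for id 1 fits on one row.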
I am trying to use ROW_NUMBER() in SQL, but it is not giving the desired output.
Data :
ID Name Output should be
111 A 1
111 B 2
111 C 3
111 C 3
111 A 4
222 A 1
222 A 1
222 B 2
222 C 3
222 B 4
222 B 4
This is a gaps-and-islands problem. As a starter: for the question to make sense at all, you need a column that defines the ordering of the rows; I assumed ordering_id. Then I would recommend lag() to get the "previous" name, and a cumulative sum() that increases every time the name changes between adjacent rows:
select id, name,
sum(case when name = lag_name then 0 else 1 end) over(partition by id order by ordering_id) as rn
from (
select t.*, lag(name) over(partition by id order by ordering_id) lag_name
from mytable t
) t
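To see the lag() plus cumulative sum() idea at work, here is a small sketch on the sample data from the question, using in-memory SQLite (3.25+) from Python. The ordering_id values 1 to 11 are an assumption, numbered in the order the rows are listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ordering_id INTEGER, id INTEGER, name TEXT)")
sample = [
    (111, "A"), (111, "B"), (111, "C"), (111, "C"), (111, "A"),
    (222, "A"), (222, "A"), (222, "B"), (222, "C"), (222, "B"), (222, "B"),
]
# Assumed ordering column: position of the row in the question's listing.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(pos, id_, name) for pos, (id_, name) in enumerate(sample, start=1)],
)

lag_sql = """
SELECT id, name,
       SUM(CASE WHEN name = lag_name THEN 0 ELSE 1 END)
           OVER (PARTITION BY id ORDER BY ordering_id) AS rn
FROM (SELECT t.*,
             LAG(name) OVER (PARTITION BY id ORDER BY ordering_id) AS lag_name
      FROM mytable t) AS t
ORDER BY ordering_id
"""

lag_rows = conn.execute(lag_sql).fetchall()
lag_output = [rn for _id, _name, rn in lag_rows]
print(lag_output)
```

The first row of each id has no previous name, so the comparison with NULL falls through to the ELSE branch and the numbering starts at 1.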
SQL Server 2008 has no lag(), which makes this much trickier. You can identify the adjacent rows using a difference of row numbers. Then you can assign the minimum ordering value to each island and use dense_rank():
select t.*,
dense_rank() over (partition by id order by min_ordcol) as output
from (select t.*,
min(<ordcol>) over (partition by id, name, seqnum - seqnum_2) as min_ordcol
from (select t.*,
row_number() over (partition by id order by <ordcol>) as seqnum,
row_number() over (partition by id, name order by <ordcol>) as seqnum_2
from t
) t
) t;
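As a sanity check of the difference-of-row-numbers technique, the sketch below runs one variant of it on the question's data in in-memory SQLite. Two things are assumptions here: the ordering column is called ordering_id (numbered 1 to 11 in listing order), and the row numbers are partitioned by id and by (id, name), so that an island is a run of equal names within one id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ordering_id INTEGER, id INTEGER, name TEXT)")
sample = [
    (111, "A"), (111, "B"), (111, "C"), (111, "C"), (111, "A"),
    (222, "A"), (222, "A"), (222, "B"), (222, "C"), (222, "B"), (222, "B"),
]
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(pos, id_, name) for pos, (id_, name) in enumerate(sample, start=1)],
)

islands_sql = """
SELECT id, name,
       DENSE_RANK() OVER (PARTITION BY id ORDER BY min_ordcol) AS output
FROM (SELECT t.*,
             -- seqnum - seqnum_2 is constant within a run of equal names
             MIN(ordering_id) OVER (PARTITION BY id, name, seqnum - seqnum_2)
                 AS min_ordcol
      FROM (SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY id
                                      ORDER BY ordering_id) AS seqnum,
                   ROW_NUMBER() OVER (PARTITION BY id, name
                                      ORDER BY ordering_id) AS seqnum_2
            FROM mytable t) AS t) AS t
ORDER BY ordering_id
"""

island_rows = conn.execute(islands_sql).fetchall()
island_output = [output for _id, _name, output in island_rows]
print(island_output)
```

Only ROW_NUMBER(), DENSE_RANK() and an unordered MIN() OVER (PARTITION BY ...) are used, which is what makes this form usable where lag() is missing.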
I would like to generate an index column in a table that may contain duplicates. The index needs to be based on values from the table and the last update date.
The data looks like this:
ID Val1 LastUpdateDate
-- ------ -------------
1 0 07.09.2019
1 2.5 12.09.2019
1 2.5 27.09.2019
1 3.5 01.10.2019
1 2.5 24.10.2019
1 0 01.11.2019
I would like to have:
ID Val1 LastUpdateDate index
-- ------ ------------- ----
1 0 07.09.2019 1
1 2.5 12.09.2019 2
1 2.5 27.09.2019 2
1 3.5 01.10.2019 3
1 2.5 24.10.2019 4
1 0 01.11.2019 5
I've tried with the following code but it's not working:
SELECT ID
,Value1
,Value2
,Value3
,LastUpdateDate
,(ROW_NUMBER() OVER (PARTITION BY ID ORDER BY last_update_date) - ROW_NUMBER()OVER(PARTITION BY ID,Value1,Value2,Value3 ORDER BY ID,Value1,Value2,Value3)) AS index
FROM Table1
ORDER BY LastUpdateDate
You can interpret this as a gaps-and-islands problem. However, I think the simplest way is to use LAG() and count the changes:
SELECT t1.*,
SUM(CASE WHEN prev_val1 = val1 THEN 0 ELSE 1 END) OVER (PARTITION BY id ORDER BY last_update_date) as seqnum
FROM (SELECT t1.*,
LAG(val1) OVER (PARTITION BY ID ORDER BY last_update_date) as prev_val1
FROM Table1 t1
) t1
ORDER BY LastUpdateDate;
Note that index is a really bad name for a column, because it is a SQL keyword.
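A quick way to convince yourself that counting changes is what is needed here, and that ranking by value alone is not, is to compute both on the sample rows. The sketch below does that in in-memory SQLite (3.25+). The snake_case column names and the ISO-formatted dates are assumptions, chosen so that ordering by the date column sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, val1 REAL, last_update_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 0.0, "2019-09-07"),
        (1, 2.5, "2019-09-12"),
        (1, 2.5, "2019-09-27"),
        (1, 3.5, "2019-10-01"),
        (1, 2.5, "2019-10-24"),
        (1, 0.0, "2019-11-01"),
    ],
)

compare_sql = """
SELECT last_update_date,
       -- ranks distinct values, so a value that comes back reuses its rank
       DENSE_RANK() OVER (PARTITION BY id ORDER BY val1) AS by_value,
       -- counts value changes in date order
       SUM(CASE WHEN prev_val1 = val1 THEN 0 ELSE 1 END)
           OVER (PARTITION BY id ORDER BY last_update_date) AS seqnum
FROM (SELECT t1.*,
             LAG(val1) OVER (PARTITION BY id
                             ORDER BY last_update_date) AS prev_val1
      FROM table1 t1) AS t1
ORDER BY last_update_date
"""

rows = conn.execute(compare_sql).fetchall()
by_value = [r[1] for r in rows]
seqnum = [r[2] for r in rows]
print(by_value)
print(seqnum)
```

The by_value column gives the second 2.5 block and the final 0 the same numbers as their earlier occurrences, while seqnum keeps increasing, which matches the desired index column.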
Tbl1
---------------------------------------------------------
Id Date Qty ReOrder
---------------------------------------------------------
1 1-1-18 1 3
2 2-1-18 0 3
3 3-1-18 2 3
4 4-1-18 3 3
5 5-1-18 2 3
6 6-1-18 0 3
7 7-1-18 1 3
8 8-1-18 0 3
---------------------------------------------------------
I want the result like below
---------------------------------------------------------
Id Date Qty ReOrder
---------------------------------------------------------
1 1-1-18 1 3
5 5-1-18 2 3
---------------------------------------------------------
If ReOrder is not the same as Qty, then the date stays the same until after ReOrder = Qty.
You can use a cumulative approach with the row_number() function:
select top (1) with ties *
from (select *, max(case when qty = reorder then 'v' end) over (order by id desc) grp
from Tbl1
) t
order by row_number() over(partition by grp order by id);
Unfortunately, top (1) with ties requires SQL Server, but you can also do:
select *
from (select *, row_number() over(partition by grp order by id) seq
from (select *, max(case when qty = reorder then 'v' end) over (order by id desc) grp
from Tbl1
) t
) t
where seq = 1;
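Here is a rough, runnable sketch of the second (row_number() in a subquery) form on the sample rows from the question, using in-memory SQLite (3.25+) as a stand-in database. The lowercase table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (id INTEGER, date TEXT, qty INTEGER, reorder INTEGER)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?, ?, ?)",
    [
        (1, "1-1-18", 1, 3), (2, "2-1-18", 0, 3), (3, "3-1-18", 2, 3),
        (4, "4-1-18", 3, 3), (5, "5-1-18", 2, 3), (6, "6-1-18", 0, 3),
        (7, "7-1-18", 1, 3), (8, "8-1-18", 0, 3),
    ],
)

first_rows_sql = """
SELECT id, date, qty, reorder
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id) AS seq
      FROM (SELECT t.*,
                   -- walking from the last id backwards, grp turns into 'v'
                   -- at the row where qty = reorder and stays 'v' before it
                   MAX(CASE WHEN qty = reorder THEN 'v' END)
                       OVER (ORDER BY id DESC) AS grp
            FROM tbl1 t) AS t) AS t
WHERE seq = 1
ORDER BY id
"""

first_rows = conn.execute(first_rows_sql).fetchall()
for row in first_rows:
    print(row)
```

Note that the 'v' marker can only tell two groups apart (rows up to the reorder row, and rows after it), which is enough for this sample. With several rows where qty = reorder, a running count of those rows instead of max() would likely be needed to get one group per reorder point.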
I need to select data based upon three conditions:
Find the latest date (StorageDate column) in the table for each record
See if there is more than one entry for the date (StorageDate column) found in the first step for the same ID (ID column)
and then see if DuplicateTypeID = 2
So if table has following data:
ID |StorageDate | DuplicateTypeID
1 |2014-10-22 | 1
1 |2014-10-22 | 2
1 |2014-10-18 | 1
2 |2014-10-12 | 1
3 |2014-10-11 | 1
4 |2014-09-02 | 1
4 |2014-09-02 | 2
Then I should get following results
ID
1
4
I have written the following query, but it is really slow. I was wondering if anyone has a better way to write it.
SELECT DISTINCT(TD.RecordID)
FROM dbo.MyTable TD
JOIN (
SELECT T1.RecordID, T2.MaxDate,COUNT(*) AS RecordCount
FROM MyTable T1 WITH (nolock)
JOIN (
SELECT RecordID, MAX(StorageDate) AS MaxDate
FROM MyTable WITH (nolock)
GROUP BY RecordID)T2
ON T1.RecordID = T2.RecordID AND T1.StorageDate = T2.MaxDate
GROUP BY T1.RecordID, T2.MaxDate
HAVING COUNT(*) > 1
)PT ON TD.RecordID = PT.RecordID AND TD.StorageDate = PT.MaxDate
WHERE TD.DuplicateTypeID = 2
Try this and see how the performance goes:
;WITH
tmp AS
(
SELECT *,
RANK() OVER (PARTITION BY ID ORDER BY StorageDate DESC) AS StorageDateRank,
COUNT(ID) OVER (PARTITION BY ID, StorageDate) AS StorageDateCount
FROM MyTable
)
SELECT DISTINCT ID
FROM tmp
WHERE StorageDateRank = 1 -- latest date for each ID
AND StorageDateCount > 1 -- more than 1 entry for date
AND DuplicateTypeID = 2 -- DuplicateTypeID = 2
You can use the analytic function rank(). Can you try this query?
Select recordId from
(
select *, rank() over ( partition by recordId order by [StorageDate] desc) as rn
from mytable
) T
where rn =1
group by recordId
having count(*) >1
and sum( case when duplicatetypeid =2 then 1 else 0 end) >=1
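Both suggestions can be checked against the sample data before trying them on the real table. The sketch below runs each in in-memory SQLite (3.25+) from Python. It uses the column names from the sample table (ID, StorageDate, DuplicateTypeID) throughout, which is an assumption since the posted queries also mention RecordID, and it says nothing about performance on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, StorageDate TEXT, DuplicateTypeID INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "2014-10-22", 1), (1, "2014-10-22", 2), (1, "2014-10-18", 1),
        (2, "2014-10-12", 1), (3, "2014-10-11", 1),
        (4, "2014-09-02", 1), (4, "2014-09-02", 2),
    ],
)

# Variant 1: RANK() for the latest date plus a windowed COUNT() per date.
rank_count_sql = """
WITH tmp AS (
    SELECT *,
           RANK() OVER (PARTITION BY ID ORDER BY StorageDate DESC) AS StorageDateRank,
           COUNT(ID) OVER (PARTITION BY ID, StorageDate) AS StorageDateCount
    FROM MyTable
)
SELECT DISTINCT ID
FROM tmp
WHERE StorageDateRank = 1
  AND StorageDateCount > 1
  AND DuplicateTypeID = 2
ORDER BY ID
"""

# Variant 2: keep only latest-date rows, then filter groups with HAVING.
rank_having_sql = """
SELECT ID
FROM (SELECT *,
             RANK() OVER (PARTITION BY ID ORDER BY StorageDate DESC) AS rn
      FROM MyTable) AS t
WHERE rn = 1
GROUP BY ID
HAVING COUNT(*) > 1
   AND SUM(CASE WHEN DuplicateTypeID = 2 THEN 1 ELSE 0 END) >= 1
ORDER BY ID
"""

ids_rank_count = [r[0] for r in conn.execute(rank_count_sql)]
ids_rank_having = [r[0] for r in conn.execute(rank_having_sql)]
print(ids_rank_count, ids_rank_having)
```

Both variants read the table once instead of joining it to itself twice, which is the main reason to expect them to behave better than the original query; actual timings would need to be measured on the real data.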