SQL Server: Two Level Sort (Order/Group By??)

Everyone:
I have the following code, which produces the table below:
SELECT DISTINCT a.[Date]
,a.[ID]
,a.[Account]
,a.[First_Last]
FROM [Table] AS a
WHERE [First_Last] = 1 OR
[First_Last] = (
SELECT MAX([First_Last])
FROM [Table] AS b
WHERE a.[ID] = b.[ID] AND a.[Account] = b.[Account]
)
ORDER BY [ID], [Account], [Date]
Date ID Account First_Last
10/31/2018 1111 45 1
1/29/2021 1111 45 4
9/29/2017 1111 753 1
9/28/2018 1111 753 2
9/29/2017 2222 481 1
1/31/2018 2222 481 2
10/31/2017 2222 488 1
1/31/2018 2222 488 2
11/30/2017 2222 582 1
1/31/2019 2222 582 3
2/28/2017 2222 621 1
2/28/2018 2222 621 2
6/30/2017 2222 1007 1
6/29/2018 2222 1007 2
But I need it to be ordered this way:
Date ID Account First_Last
9/29/2017 1111 753 1
9/28/2018 1111 753 2
10/31/2018 1111 45 1
1/29/2021 1111 45 4
2/28/2017 2222 621 1
2/28/2018 2222 621 2
6/30/2017 2222 1007 1
6/29/2018 2222 1007 2
9/29/2017 2222 481 1
1/31/2018 2222 481 2
10/31/2017 2222 488 1
1/31/2018 2222 488 2
11/30/2017 2222 582 1
1/31/2019 2222 582 3
Notice that the table I need is not sorted by Account. It is sorted by the first Date of each ID-Account combination, and then by Date within that combination. For example, for ID = 1111, Account 753 comes before Account 45 because 753's first date is 9/29/2017 and 45's first date is 10/31/2018. Since I do not want the result sorted by Account, I tried removing Account from the ORDER BY, but that scattered each account's rows across the result in Date order instead of keeping them grouped together.
What am I missing?
Thank you.

You can use a window function to find the "first date" for each ID and Account, and sort by it before sorting by Date:
order by ID,
min([Date]) over(partition by ID, Account),
[Date]
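
To see the effect of that ORDER BY, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) stands in for SQL Server, the table name t and the ISO-formatted dates are assumptions made for the example, and only part of the question's data is loaded.

```python
import sqlite3

# Hypothetical stand-in for the question's [Table]; dates are ISO strings
# so that they sort correctly as text in SQLite.
rows = [
    ("2018-10-31", 1111, 45, 1),
    ("2021-01-29", 1111, 45, 4),
    ("2017-09-29", 1111, 753, 1),
    ("2018-09-28", 1111, 753, 2),
    ("2017-09-29", 2222, 481, 1),
    ("2018-01-31", 2222, 481, 2),
    ("2017-02-28", 2222, 621, 1),
    ("2018-02-28", 2222, 621, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (date TEXT, id INTEGER, account INTEGER, first_last INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Sort by ID, then by each ID/Account group's earliest date, then by date.
query = """
SELECT date, id, account, first_last
FROM t
ORDER BY id,
         MIN(date) OVER (PARTITION BY id, account),
         date
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

If two accounts under the same ID can share the same first date, adding Account as a further sort key right after the window expression keeps each account's rows together.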

Related

Getting an element and the next from a table

I have a table with ids, cities and some sequence number, say:
ID CITY SEQ_NO
1 Milan 123
2 Paris 124
1 Rome 125
1 Naples 126
1 Strasbourg 130
3 London 129
3 Manchester 132
2 Strasbourg 128
3 Rome 131
2 Rome 127
4 Moscow 135
5 New York 136
4 Helsinki 137
I want to get the city that comes after Rome for the same id, in this case, I can order them by doing something like:
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SEQ_NO) as rownum,
id,
city,
seq_no
FROM mytable
I get:
rownum ID CITY SEQ_NO
1 1 Milan 123
2 1 Rome 125
3 1 Naples 126
4 1 Strasbourg 130
1 2 Paris 124
2 2 Rome 127
3 2 Strasbourg 128
1 3 London 129
2 3 Rome 131
3 3 Manchester 132
1 4 Moscow 135
2 4 Helsinki 137
1 5 New York 136
and, I want to get
ID CITY SEQ_NO
1 Rome 125
1 Naples 126
2 Rome 127
2 Strasbourg 128
3 Rome 131
3 Manchester 132
How do I proceed?

Hmmm . . . I might suggest window functions:
select t.*
from (select t.*,
lag(city) over (partition by id order by seq_no) as prev_city
from mytable t
) t
where 'Rome' in (city, prev_city)
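
As a quick check of the LAG approach, the sketch below loads the question's rows into an in-memory SQLite database from Python and runs essentially the same query. SQLite (3.25 or later) is an assumption standing in for the asker's database, and the final ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, city TEXT, seq_no INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Milan", 123), (2, "Paris", 124), (1, "Rome", 125),
        (1, "Naples", 126), (1, "Strasbourg", 130), (3, "London", 129),
        (3, "Manchester", 132), (2, "Strasbourg", 128), (3, "Rome", 131),
        (2, "Rome", 127), (4, "Moscow", 135), (5, "New York", 136),
        (4, "Helsinki", 137),
    ],
)

# Keep a row if it is Rome itself or if the previous city
# (by seq_no, within the same id) was Rome.
query = """
SELECT id, city, seq_no
FROM (
    SELECT t.*,
           LAG(city) OVER (PARTITION BY id ORDER BY seq_no) AS prev_city
    FROM mytable AS t
) AS t
WHERE 'Rome' IN (city, prev_city)
ORDER BY id, seq_no
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```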

LAG / OVER / PARTITION / ORDER BY using conditions - SQL Server 2017

I have a table that looks like this:
Date AccountID Amount
2018-01-01 123 12
2018-01-06 123 150
2018-02-14 123 11
2018-05-06 123 16
2018-05-16 123 200
2018-06-01 123 18
2018-06-15 123 17
2018-06-18 123 110
2018-06-30 123 23
2018-07-01 123 45
2018-07-12 123 116
2018-07-18 123 60
This table has multiple dates and IDs, along with multiple Amounts. For each individual row, I want to grab the last Date where Amount was over a specific value for that specific AccountID. I have been trying to use LAG( Date, 1 ) in combination with several variations of CASE and OVER ( PARTITION BY AccountID ORDER BY Date ) statements, but I've had no luck. Ultimately, this is what I would like my SELECT statement to return.
Date AccountID Amount LastOverHundred
2018-01-01 123 12 NULL
2018-01-06 123 150 2018-01-06
2018-02-14 123 11 2018-01-06
2018-05-06 123 16 2018-01-06
2018-05-16 123 200 2018-05-16
2018-06-01 123 18 2018-05-16
2018-06-15 123 17 2018-05-16
2018-06-18 123 110 2018-06-18
2018-06-30 123 23 2018-06-18
2018-07-01 123 45 2018-06-18
2018-07-12 123 116 2018-07-12
2018-07-18 123 60 2018-07-12
Any help with this would be greatly appreciated.

Use a cumulative conditional max():
select t.*,
max(case when amount > 100 then date end) over (partition by accountid order by date) as lastoverhundred
from t;
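
The reason this works is that MAX ignores the NULLs produced by the CASE expression, and with an ORDER BY in the OVER clause the window runs from the start of the partition up to the current row. A minimal sketch, assuming SQLite (3.25 or later) in place of SQL Server 2017 and the table name t from the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, accountid INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2018-01-01", 123, 12), ("2018-01-06", 123, 150),
        ("2018-02-14", 123, 11), ("2018-05-06", 123, 16),
        ("2018-05-16", 123, 200), ("2018-06-01", 123, 18),
        ("2018-06-15", 123, 17), ("2018-06-18", 123, 110),
        ("2018-06-30", 123, 23), ("2018-07-01", 123, 45),
        ("2018-07-12", 123, 116), ("2018-07-18", 123, 60),
    ],
)

# The CASE yields the date only for rows over 100 (NULL otherwise);
# the running MAX then carries the latest such date forward.
query = """
SELECT date, accountid, amount,
       MAX(CASE WHEN amount > 100 THEN date END)
           OVER (PARTITION BY accountid ORDER BY date) AS lastoverhundred
FROM t
ORDER BY accountid, date
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```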

How to restrict the upper limit of rows while doing join in SQL?

I have two tables: balance and calendar.
Balance :
Account Date Balance
1111 01/01/2014 100
1111 02/01/2014 156
1111 03/01/2014 300
1111 04/01/2014 300
1111 07/01/2014 468
1112 02/01/2014 300
1112 03/01/2014 300
1112 06/01/2014 300
1112 07/01/2014 350
1112 08/01/2014 400
1112 09/01/2014 450
1113 01/01/2014 30
1113 02/01/2014 40
1113 03/01/2014 45
1113 06/01/2014 45
1113 07/01/2014 60
1113 08/01/2014 50
1113 09/01/2014 20
1113 10/01/2014 10
Calendar
date business_day_ind
01/01/2014 N
02/01/2014 Y
03/01/2014 Y
04/01/2014 N
05/01/2014 N
06/01/2014 Y
07/01/2014 Y
08/01/2014 Y
09/01/2014 Y
10/01/2014 Y
I need to do the following:
I need to fill in the missing days for all the accounts up to the maximum day for which each account has a value. For example, account 1111 has values only up to 07/01/2014, so the dates need to be filled only up to that day. But when I join with the calendar table (plain left join), I am not able to restrict the maximum day to the last day available for an account.
1111 01/01/2014 100 N
1111 02/01/2014 156 Y
1111 03/01/2014 300 Y
1111 04/01/2014 300 Y
1111 05/01/2014 N
1111 06/01/2014 N
1111 07/01/2014 468 Y
1111 08/01/2014 Y
1111 09/01/2014 Y
1111 10/01/2014 Y
1112 01/01/2014 N
1112 02/01/2014 300 Y
1112 03/01/2014 300 Y
1112 04/01/2014 N
1112 05/01/2014 N
1112 06/01/2014 300 Y
1112 07/01/2014 350 Y
1112 08/01/2014 400 Y
1112 09/01/2014 450 Y
1112 10/01/2014 Y
I need an efficient way (preferably not involving multiple steps) to restrict the dates up to an account's maximum balance available date (07/01/2014 in case of 1111,09/01/2014 in case 1112)
Desired output:
1111 01/01/2014 100 N
1111 02/01/2014 156 Y
1111 03/01/2014 300 Y
1111 04/01/2014 300 Y
1111 05/01/2014 N
1111 06/01/2014 N
1111 07/01/2014 468 Y
1112 01/01/2014 N
1112 02/01/2014 300 Y
1112 03/01/2014 300 Y
1112 04/01/2014 N
1112 05/01/2014 N
1112 06/01/2014 300 Y
1112 07/01/2014 350 Y
1112 08/01/2014 400 Y
1112 09/01/2014 450 Y
After filling the missing days, I am planning to impute the balance of previous business day to the missing days. I am planning to get previous business day for every date and do an update to missing rows by joining the original balance table with acct and previous business day as key.
Thanks.
I am using a Greenplum database.

A possible way would be to put a second select in a subquery. For instance:
select ... from calendar a left outer join balance b on a.date = b.date
where a.date <= (select max(date) from balance c where b.Account = c.Account )

I suppose that you have a third table, accounts:
select
accounts.account,
calendar.date,
balance.balance,
calendar.business_day_ind
from
accounts cross join lateral (
select *
from calendar
where calendar.date <= (
select max(date)
from balance
where balance.account = accounts.account)) as calendar left join
balance on (balance.account = accounts.account and balance.date = calendar.date)
order by
accounts.account, calendar.date;
About lateral joins
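
For comparison, here is a rough sketch of the same "calendar rows up to each account's last balance date" idea without a lateral join. It is not the query above: it runs on an in-memory SQLite database from Python rather than Greenplum, derives the account list from balance instead of assuming an accounts table, reads the question's dates as day/month/year and stores them as ISO strings, and loads only accounts 1111 and 1112.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balance (account INTEGER, date TEXT, balance INTEGER)")
conn.execute("CREATE TABLE calendar (date TEXT, business_day_ind TEXT)")

conn.executemany(
    "INSERT INTO balance VALUES (?, ?, ?)",
    [
        (1111, "2014-01-01", 100), (1111, "2014-01-02", 156),
        (1111, "2014-01-03", 300), (1111, "2014-01-04", 300),
        (1111, "2014-01-07", 468),
        (1112, "2014-01-02", 300), (1112, "2014-01-03", 300),
        (1112, "2014-01-06", 300), (1112, "2014-01-07", 350),
        (1112, "2014-01-08", 400), (1112, "2014-01-09", 450),
    ],
)
conn.executemany(
    "INSERT INTO calendar VALUES (?, ?)",
    [
        ("2014-01-01", "N"), ("2014-01-02", "Y"), ("2014-01-03", "Y"),
        ("2014-01-04", "N"), ("2014-01-05", "N"), ("2014-01-06", "Y"),
        ("2014-01-07", "Y"), ("2014-01-08", "Y"), ("2014-01-09", "Y"),
        ("2014-01-10", "Y"),
    ],
)

# 1. One row per account with its last balance date.
# 2. Join every calendar day up to that date.
# 3. Left join the balances so missing days show up as NULL.
query = """
SELECT a.account, c.date, b.balance, c.business_day_ind
FROM (SELECT account, MAX(date) AS max_date
      FROM balance
      GROUP BY account) AS a
JOIN calendar AS c
  ON c.date <= a.max_date
LEFT JOIN balance AS b
  ON b.account = a.account AND b.date = c.date
ORDER BY a.account, c.date
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

The key point is that the account has to come from a table that has a row for every account, not from the left-joined balance row, which is NULL on exactly the days that need filling.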

That was a fun challenge!
CREATE TABLE balance
(account int, balance_date timestamp, balance int)
DISTRIBUTED BY (account, balance_date);
INSERT INTO balance
values (1111,'01/01/2014', 100),
(1111, '02/01/2014', 156),
(1111, '03/01/2014', 300),
(1111, '04/01/2014', 300),
(1111, '07/01/2014', 468),
(1112, '02/01/2014', 300),
(1112, '03/01/2014', 300),
(1112, '06/01/2014', 300),
(1112, '07/01/2014', 350),
(1112, '08/01/2014', 400),
(1112, '09/01/2014', 450),
(1113, '01/01/2014', 30),
(1113, '02/01/2014', 40),
(1113, '03/01/2014', 45),
(1113, '06/01/2014', 45),
(1113, '07/01/2014', 60),
(1113, '08/01/2014', 50),
(1113, '09/01/2014', 20),
(1113, '10/01/2014', 10);
CREATE TABLE calendar
(calendar_date timestamp, business_day_ind boolean)
DISTRIBUTED BY (calendar_date);
INSERT INTO calendar
values ('01/01/2014', false),
('02/01/2014', true),
('03/01/2014', true),
('04/01/2014', false),
('05/01/2014', false),
('06/01/2014', true),
('07/01/2014', true),
('08/01/2014', true),
('09/01/2014', true),
('10/01/2014', true);
analyze balance;
analyze calendar;
And now the query.
select d.account, d.my_date, b.balance, c.business_day_ind
from (
select account, start_date + interval '1 month' * (generate_series(0, duration)) AS my_date
from (
select account, start_date, (date_part('year', duration) * 12 + date_part('month', duration))::int as duration
from (
select start_date, age(end_date, start_date) as duration, account
from (
select account, min(balance_date) as start_date, max(balance_date) as end_date
from balance
group by account
) as sub1
) as sub2
) sub3
) as d
left outer join balance b on d.account = b.account and d.my_date = b.balance_date
join calendar c on c.calendar_date = d.my_date
order by d.account, d.my_date;
Results:
account | my_date | balance | business_day_ind
---------+---------------------+---------+------------------
1111 | 2014-01-01 00:00:00 | 100 | f
1111 | 2014-02-01 00:00:00 | 156 | t
1111 | 2014-03-01 00:00:00 | 300 | t
1111 | 2014-04-01 00:00:00 | 300 | f
1111 | 2014-05-01 00:00:00 | | f
1111 | 2014-06-01 00:00:00 | | t
1111 | 2014-07-01 00:00:00 | 468 | t
1112 | 2014-02-01 00:00:00 | 300 | t
1112 | 2014-03-01 00:00:00 | 300 | t
1112 | 2014-04-01 00:00:00 | | f
1112 | 2014-05-01 00:00:00 | | f
1112 | 2014-06-01 00:00:00 | 300 | t
1112 | 2014-07-01 00:00:00 | 350 | t
1112 | 2014-08-01 00:00:00 | 400 | t
1112 | 2014-09-01 00:00:00 | 450 | t
1113 | 2014-01-01 00:00:00 | 30 | f
1113 | 2014-02-01 00:00:00 | 40 | t
1113 | 2014-03-01 00:00:00 | 45 | t
1113 | 2014-04-01 00:00:00 | | f
1113 | 2014-05-01 00:00:00 | | f
1113 | 2014-06-01 00:00:00 | 45 | t
1113 | 2014-07-01 00:00:00 | 60 | t
1113 | 2014-08-01 00:00:00 | 50 | t
1113 | 2014-09-01 00:00:00 | 20 | t
1113 | 2014-10-01 00:00:00 | 10 | t
(25 rows)
I had to get the min and max dates for each account and then use generate_series to generate the months between the two dates. The query would have been a bit cleaner if you wanted a record for each day, but I had to use another subquery to get the results at a monthly level.

SQL 2008 Running average with group by

So I have a table containing the columns below.
I want to compute a running average from positiondate going, for example, 3 days back, grouped by dealno.
I know how to do it with "case by", but the problem is that I have around 200 different DealNo values, so I do not want to write a separate case clause for every deal.
For DealNo 1 the desired output should be Average(149 243 440 + 149 224 446 + 149 243 451)
DealNo PositionDate MarketValue
1 | 2016-11-27 | 149 243 440
2 | 2016-11-27 | 21 496 418
3 | 2016-11-27 | 32 249 600
1 | 2016-11-26 | 149 243 446
2 | 2016-11-26 | 21 496 418
3 | 2016-11-26 | 32 249 600
1 | 2016-11-25 | 149 243 451
3 | 2016-11-25 | 32 249 600
2 | 2016-11-25 | 21 496 418
3 | 2016-11-24 | 32 249 600
1 | 2016-11-24 | 149 225 582
2 | 2016-11-24 | 21 498 120
1 | 2016-11-23 | 149 256 867
2 | 2016-11-23 | 21 504 181
3 | 2016-11-23 | 32 253 440
1 | 2016-11-22 | 149 256 873
2 | 2016-11-22 | 21 506 840
3 | 2016-11-22 | 32 253 440
1 | 2016-11-21 | 149 234 535
2 | 2016-11-21 | 21 509 179
3 | 2016-11-21 | 32 253 600
I tried the script below, but it was not very efficient, since my table contains around 300k rows and approximately 200 different dealno values.
Is there a more efficient way to do this in SQL 2008?
with cte as (
SELECT ROW_NUMBER() over(order by dealno, positiondate desc) as Rownr,
dealno,
positiondate,
Currency,
MvCleanCcy
FROM T1
)
select
rownr, positiondate, DealNo, Currency,
mvcleanavg30d = (select avg(MvCleanCcy) from cte where Rownr between c.Rownr and c.Rownr+3)
from cte as c

You don't need window functions. You can do this using outer apply:
select t1.*, tt1.marketvalue_3day
from t1 outer apply
(select avg(tt1.marketvalue) as marketvalue_3day
from (select top 3 tt1.*
from t1 tt1
where tt1.dealno = t1.dealno and
tt1.positiondate <= t1.positiondate
order by tt1.positiondate desc
) tt1
) tt1;
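
The same "average of the latest three rows per deal" logic can be tried out with a correlated subquery. The sketch below is only an approximation of the OUTER APPLY query: it uses SQLite from Python, where LIMIT 3 plays the role of TOP 3, and the small market values are made-up test data, not the question's figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (dealno INTEGER, positiondate TEXT, marketvalue INTEGER)"
)
# Made-up values, chosen so the averages are easy to check by hand.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        (1, "2016-11-21", 10), (1, "2016-11-22", 20),
        (1, "2016-11-23", 30), (1, "2016-11-24", 40),
        (2, "2016-11-21", 100), (2, "2016-11-22", 200),
    ],
)

# For each row, average the market value over that deal's three most
# recent rows up to and including the row's own position date.
query = """
SELECT t.dealno, t.positiondate, t.marketvalue,
       (SELECT AVG(mv)
        FROM (SELECT tt.marketvalue AS mv
              FROM t1 AS tt
              WHERE tt.dealno = t.dealno
                AND tt.positiondate <= t.positiondate
              ORDER BY tt.positiondate DESC
              LIMIT 3)) AS marketvalue_3day
FROM t1 AS t
ORDER BY t.dealno, t.positiondate
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

Because the subquery is correlated on dealno, nothing has to be written per deal, which is what the question was trying to avoid. On a table of about 300k rows, an index on (dealno, positiondate) would likely matter for this pattern, though that is untested here.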

Counting the duplicate and setting the record in calculated column in SQL

I have a table Student_Information, with columns and data like:
ID StudentName FatherName NIC No_of_Childrens Date_of_Birth Date_of_Admission
1 Mark John 85 2010-04-01 2015-04-19
2 Akbar Aslam 89 2009-05-01 2015-04-19
3 Percul John 85 2010-04-01 2015-04-19
4 Ali Aslam 89 2009-05-01 2015-04-19
5 Diglor John 85 2010-04-01 2015-04-19
6 Sabi Aslam 89 2009-05-01 2015-04-19
I want to count the duplicates in the NIC column and number them in the No_of_Childrens column, like this:
ID StudentName FatherName NIC No_of_Childrens Date_of_Birth Date_of_Admission
1 Mark John 85 1 2010-04-01 2015-04-19
2 Akbar Aslam 89 1 2009-05-01 2015-04-19
3 Percul John 85 2 2010-04-01 2015-04-19
4 Ali Aslam 89 2 2009-05-01 2015-04-19
5 Diglor John 85 3 2010-04-01 2015-04-19
6 Sabi Aslam 89 3 2009-05-01 2015-04-19

Try the snippet below:
SELECT ID, StudentName, FatherName, NIC,No_of_Childrens,Date_of_Birth, Date_of_Admission
FROM
(
SELECT ID, StudentName, FatherName, NIC,Date_of_Birth, Date_of_Admission,No_of_Childrens = RANK() OVER(PARTITION BY NIC ORDER BY ID)
FROM Student_Information) A ORDER BY ID
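
A small sketch of what the ranking produces, using SQLite (3.25 or later) from Python as a stand-in for SQL Server, with only the columns needed for the numbering. Since ID is unique, RANK() and ROW_NUMBER() give the same numbers here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_Information ("
    "ID INTEGER, StudentName TEXT, FatherName TEXT, NIC INTEGER)"
)
conn.executemany(
    "INSERT INTO Student_Information VALUES (?, ?, ?, ?)",
    [
        (1, "Mark", "John", 85), (2, "Akbar", "Aslam", 89),
        (3, "Percul", "John", 85), (4, "Ali", "Aslam", 89),
        (5, "Diglor", "John", 85), (6, "Sabi", "Aslam", 89),
    ],
)

# Number the rows within each NIC value in ID order.
query = """
SELECT ID, StudentName, FatherName, NIC,
       RANK() OVER (PARTITION BY NIC ORDER BY ID) AS No_of_Childrens
FROM Student_Information
ORDER BY ID
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```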
