Create a table with 2 columns using different conditions in SQL

I have a table with this format:
Id_command, date_creat
01 01-01-2020
02 01-01-2021
03 01-11-2020
..
I would like to extract from this table a new table where the first column contains all the id_command values where date_creat > 01-01-2020 and the second column contains those where date_creat > 01-01-2021.
The expected result :
Id_command (date_creat > 01-01-2020) , id command(date_creat < 31-12-2020)
01 02
03
My idea was to create two different tables and then outer join them, but I am not sure whether this can be done in a simpler manner.
Thanks

First select the relevant rows from the table and add a row number
select Id_command,
row_number() over (order by Id_command) as rn
from tab
where date_creat > DATE'2020-01-01'
ID_COMMAND RN
---------- ----------
2 1
3 2
Do the same for the second condition.
Finally, use those two subqueries and full outer join them on the row number.
with a as (
  select Id_command,
         row_number() over (order by Id_command) as rn
  from tab
  where date_creat > DATE'2020-01-01'
), b as (
  select Id_command,
         row_number() over (order by Id_command) as rn
  from tab
  where date_creat <= DATE'2020-01-01'
)
select a.Id_command, b.Id_command
from a
full outer join b on a.rn = b.rn
order by 1, 2
ID_COMMAND ID_COMMAND
---------- ----------
2 1
3
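To try the row-number pairing without an Oracle instance, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. It is an illustration under assumptions, not the answer's exact query: the sample dates are read as dd-mm-yyyy and stored as ISO text, the FULL OUTER JOIN is emulated with two LEFT JOINs because older SQLite builds lack it, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dates are stored as ISO-8601 text, which compares correctly as plain strings.
conn.executescript("""
    CREATE TABLE tab (id_command INTEGER, date_creat TEXT);
    INSERT INTO tab VALUES (1, '2020-01-01'), (2, '2021-01-01'), (3, '2020-11-01');
""")

query = """
WITH a AS (
    SELECT id_command, ROW_NUMBER() OVER (ORDER BY id_command) AS rn
    FROM tab WHERE date_creat > '2020-01-01'
), b AS (
    SELECT id_command, ROW_NUMBER() OVER (ORDER BY id_command) AS rn
    FROM tab WHERE date_creat <= '2020-01-01'
)
-- Emulated FULL OUTER JOIN: all rows of a, then rows of b with no partner in a.
SELECT a.id_command AS id_a, b.id_command AS id_b, a.rn AS rn
FROM a LEFT JOIN b ON a.rn = b.rn
UNION ALL
SELECT NULL, b.id_command, b.rn
FROM b LEFT JOIN a ON a.rn = b.rn
WHERE a.rn IS NULL
ORDER BY rn
"""

# Keep only the two id columns; rn was selected just to order the pairs.
paired = [(id_a, id_b) for id_a, id_b, _ in conn.execute(query)]
print(paired)
```

The row number carries no meaning of its own here; it only lines up the n-th match of one condition with the n-th match of the other, so the shorter column is padded with NULL.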

Related

Get Previous Data Row in Oracle

I have an Oracle table whose data can be ordered by date. I now have a request to get the rows matching a specific condition plus the row preceding each of them. For example:
if I have
Date Dept Employee
18-Aug 2 John
19-Aug 1 Meredith
20-Aug 9 Steve
21-Aug 0 Bella
so if I give the condition Dept = '0', it should return the 2 rows below:
Date Dept Employee
08/20 9 Steve
08/21 0 Bella
This returns every row with Dept = 0 together with its predecessor, without duplicates:
SELECT "Date", "Dept", "Employee"
FROM tab1
WHERE "Dept" = 0
UNION
SELECT "Date", "Dept", "Employee"
FROM tab1
WHERE "Date" IN (SELECT "DAT"
                 FROM (SELECT "Date", "Dept",
                              LAG("Date") OVER (ORDER BY "Date") dat
                       FROM tab1) t1
                 WHERE "Dept" = 0)
Date | Dept | Employee
:---- | ---: | :-------
08/20 | 9 | Steve
08/21 | 0 | Bella
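A related way to get the same two rows in a single pass is to look forward instead of backward. The sketch below swaps the answer's LAG plus UNION for LEAD and runs it through Python's sqlite3 module as a stand-in for Oracle. The column name d, the year 2022 and the ISO date format are assumptions made only so the sample is runnable; window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The year is not given in the question; 2022 is a placeholder.
conn.executescript("""
    CREATE TABLE tab1 (d TEXT, dept INTEGER, employee TEXT);
    INSERT INTO tab1 VALUES
        ('2022-08-18', 2, 'John'),
        ('2022-08-19', 1, 'Meredith'),
        ('2022-08-20', 9, 'Steve'),
        ('2022-08-21', 0, 'Bella');
""")

query = """
SELECT d, dept, employee
FROM (
    SELECT d, dept, employee,
           LEAD(dept) OVER (ORDER BY d) AS next_dept  -- dept of the following row
    FROM tab1
) AS t
WHERE dept = 0 OR next_dept = 0   -- the matching row, or the row just before it
ORDER BY d
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Because each source row appears at most once in the inner query, no UNION is needed to remove duplicates.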
You may use a subquery to get the date of the required department and select the first two rows whose date is less than or equal to that date. Try the following (assuming the date is unique across departments):
Select D.Date_, D.Dept, D.Employee
From tbl_name D
Where D.Date_ <= (Select Date_ From tbl_name Where Dept = 0)
Order By Date_ DESC
FETCH NEXT 2 ROWS ONLY;
If the date is not unique, you may add an extra column to the ordering, e.g. the employee column. In this case you may try the following:
With CTE As
(
  Select Date_, Dept, Employee,
         ROW_NUMBER() Over (Order By Date_ DESC, Employee) rn
  From tbl_name
)
Select Date_, Dept, Employee
From CTE
Where rn >= (Select rn From CTE Where Dept = 0)
Order By rn
FETCH NEXT 2 ROWS ONLY;
The second query is valid for both cases.
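The row-number variant can be checked with a small sketch in Python's sqlite3 module, used here as a stand-in for Oracle. LIMIT 2 plays the role of FETCH NEXT 2 ROWS ONLY, the column name date_ and the year 2022 are assumptions, and the sketch uses an explicit ORDER BY rn so that the two rows kept are deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The year is not given in the question; 2022 is a placeholder.
conn.executescript("""
    CREATE TABLE tbl_name (date_ TEXT, dept INTEGER, employee TEXT);
    INSERT INTO tbl_name VALUES
        ('2022-08-18', 2, 'John'),
        ('2022-08-19', 1, 'Meredith'),
        ('2022-08-20', 9, 'Steve'),
        ('2022-08-21', 0, 'Bella');
""")

query = """
WITH cte AS (
    SELECT date_, dept, employee,
           ROW_NUMBER() OVER (ORDER BY date_ DESC, employee) AS rn
    FROM tbl_name
)
SELECT date_, dept, employee
FROM cte
WHERE rn >= (SELECT rn FROM cte WHERE dept = 0)  -- the target row and older ones
ORDER BY rn                                      -- newest first
LIMIT 2                                          -- target row plus its predecessor
"""

latest_two = conn.execute(query).fetchall()
print(latest_two)
```

Rows come back newest first, so the Dept = 0 row is listed before its predecessor.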

Multiple top-n aggregates query defined as a view (or function)?

I couldn't find a past question exactly like this problem. I have an orders table containing a customer id, an order date, and several numeric columns (how many of a particular item were ordered on that date). With some of the numeric columns removed, it looks like this:
customer_id date a b c d
0001 07/01/22 0 3 3 5
0001 07/12/22 12 0 50 0
0002 06/30/22 5 65 0 30
0002 07/20/22 1 0 19 2
0003 08/01/22 0 0 99 0
I need to sum each numeric column by customer_id, then return the top n customers for each column. Obviously that means a single customer may appear multiple times, once for each column. Assuming top 2, the desired output would look something like this:
column_ranked customer_id sum rank
'a' 0001 12 1
'a' 0002 6 2
'b' 0002 65 1
'b' 0001 3 2
'c' 0003 99 1
'c' 0001 53 2
'd' 0002 32 1
'd' 0001 5 2
(this assumes no date range filter)
My first thought was a CTE to collapse the table into its per-customer sums, then a union from the CTE, with a limit n clause, once for each summed column. That works if the date range is hard-coded into the CTE .... but I want to define this as a view, so it can be called by users something like this:
SELECT * from top_customers_view WHERE date_range BETWEEN ( date1 and date2 )
How can I pass the date restriction down to the CTE? Or am I taking the wrong approach entirely? If a view isn't possible, can it be done as a function? (without using a costly cursor, that is.)
Since the possible date ranges produce a massive number of combinations, you cannot build a view that covers them. You can, however, write a query like the one below:
with
p as (select cast('2022-01-01' as date) as ds, cast('2022-12-31' as date) as de),
a as (
  select top 10 customer_id, 'a' as col, sum(a) as s
  from t cross join p where date between ds and de
  group by customer_id order by s desc
),
b as (
  select top 10 customer_id, 'b' as col, sum(b) as s
  from t cross join p where date between ds and de
  group by customer_id order by s desc
),
c as (
  select top 10 customer_id, 'c' as col, sum(c) as s
  from t cross join p where date between ds and de
  group by customer_id order by s desc
),
d as (
  select top 10 customer_id, 'd' as col, sum(d) as s
  from t cross join p where date between ds and de
  group by customer_id order by s desc
)
select * from a
union all select * from b
union all select * from c
union all select * from d
order by customer_id, col, s desc
The date range is set in the p CTE on the second line.
Alternatively, you could create a data warehousing solution, but it would require much more effort to make it work.
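On the "can it be done as a function" part of the question, one possible shape is a parameterised query wrapped in a function. The sketch below is an illustration in Python's sqlite3 module, not the answer's SQL Server code: the names orders, order_date, top_customers and TOP_N_SQL are hypothetical, the sample dates are read as mm/dd/yy, and ROW_NUMBER over an unpivoted set of sums replaces the per-column TOP n CTEs (SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id TEXT, order_date TEXT, a INT, b INT, c INT, d INT);
    INSERT INTO orders VALUES
        ('0001', '2022-07-01',  0,  3,  3,  5),
        ('0001', '2022-07-12', 12,  0, 50,  0),
        ('0002', '2022-06-30',  5, 65,  0, 30),
        ('0002', '2022-07-20',  1,  0, 19,  2),
        ('0003', '2022-08-01',  0,  0, 99,  0);
""")

TOP_N_SQL = """
WITH sums AS (              -- per-customer totals inside the date range
    SELECT customer_id,
           SUM(a) AS sum_a, SUM(b) AS sum_b, SUM(c) AS sum_c, SUM(d) AS sum_d
    FROM orders
    WHERE order_date BETWEEN ? AND ?
    GROUP BY customer_id
), unpivoted AS (           -- one row per (column, customer)
    SELECT 'a' AS column_ranked, customer_id, sum_a AS total FROM sums
    UNION ALL SELECT 'b', customer_id, sum_b FROM sums
    UNION ALL SELECT 'c', customer_id, sum_c FROM sums
    UNION ALL SELECT 'd', customer_id, sum_d FROM sums
), ranked AS (              -- rank customers within each column
    SELECT column_ranked, customer_id, total,
           ROW_NUMBER() OVER (PARTITION BY column_ranked
                              ORDER BY total DESC, customer_id) AS rnk
    FROM unpivoted
)
SELECT column_ranked, customer_id, total, rnk
FROM ranked
WHERE rnk <= ?
ORDER BY column_ranked, rnk
"""

def top_customers(connection, date_from, date_to, n):
    """Return the top n customers per summed column for an inclusive date range."""
    return connection.execute(TOP_N_SQL, (date_from, date_to, n)).fetchall()

top2 = top_customers(conn, "2022-01-01", "2022-12-31", 2)
july_top1 = top_customers(conn, "2022-07-01", "2022-07-31", 1)
for row in top2:
    print(row)
```

The date filter is applied before the sums are computed, which is the step a plain view cannot receive from an outer WHERE clause. ROW_NUMBER breaks ties by customer_id; RANK or DENSE_RANK could be used instead if tied customers should share a position.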

Update a column based on other rows column value

I have a table t which looks like this
key fill store end_date status
1 123 1 2019-04-30 0
2 1234 1 2019-04-30 0
3 123 1 2019-05-01 0
Now I need to update the first record and set status = 1, because the third record has the same fill and store values and is more recent.
Output:
key fill store end_date status
1 123 1 2019-04-30 1
2 1234 1 2019-04-30 0
3 123 1 2019-05-01 0
I tried calculating row_number and updating the column based on it, but I was unable to figure out how to use the result in the update clause.
update t set
status = 1
from (
select *
from (
select *
, row_number() over (partition by fill, store order by end_date desc) as row_num from t
) a
where row_num = 2
) b
This query updates all the records. What should I change in my query to get the expected result?
I think that you want:
with cte as (
select status, row_number() over(partition by fill, store order by end_date desc) rn
from t
)
update cte set status = 1 where rn > 1
In the common table expression, row_number() ranks records having the same fill and store by descending end_date. Then, the outer query sets status to 1 on rows that were not ranked first.
Alternatively, you can use a correlated subquery:
update my_table a
set status = 1
where exists (
select 1
from my_table b
where b.fill = a.fill
and b.store = a.store
and b.end_date > a.end_date
)
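The correlated-subquery update can be tried on the question's sample rows with the sketch below, which uses Python's sqlite3 module as a stand-in for the original database. The alias newer is introduced only for illustration, and "key" is quoted to keep it from being read as a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t ("key" INTEGER, fill INTEGER, store INTEGER, end_date TEXT, status INTEGER);
    INSERT INTO t VALUES
        (1, 123,  1, '2019-04-30', 0),
        (2, 1234, 1, '2019-04-30', 0),
        (3, 123,  1, '2019-05-01', 0);
""")

# Flag every row for which a later row with the same fill and store exists.
conn.execute("""
    UPDATE t
    SET status = 1
    WHERE EXISTS (
        SELECT 1
        FROM t AS newer
        WHERE newer.fill = t.fill
          AND newer.store = t.store
          AND newer.end_date > t.end_date
    )
""")

statuses = conn.execute('SELECT "key", status FROM t ORDER BY "key"').fetchall()
print(statuses)
```

Only the first record is flagged: it is the only one that has a later row with the same fill and store.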

Picking up latest 2 records from table in hive

Team, I have a scenario here.
I need to pick the 2 latest records through HQL.
I have tried row_number but do not seem to be getting the expected output.
Select
A.emp_ref_i,
A.last_updt_d,
A.start_date,
case when A.Last_updt_d=max(A.Last_updt_d) over (partition by A.emp_ref_i)
and A.start_date=max(a.start_date) over (partition by A.emp_ref_i)
then 'Y' else 'N' end as Valid_f,
a.CHANGE
from
(
select
distinct(emp_ref_i),
last_updt_d,
start_date,
CHANGE
from
PR) A
Currently getting output as
EMP_REF_I LAST_UPDT_D start_date Valid_f CHANGE
1 123 3/29/2020 2/3/2019 Y CHG3
2 123 3/30/2019 2/4/2018 N CHG2
3 123 3/29/2019 2/4/2018 N CHG1
but required:
EMP_REF_I LAST_UPDT_D start_date Valid_f CHANGE
1 123 3/29/2020 2/3/2019 Y CHG3
2 123 3/30/2019 2/4/2018 N CHG2
Use row_number and filter:
select s.emp_ref_i,
s.last_updt_d,
s.start_date,
case when rn=1 then 'Y' else 'N' end Valid_f,
s.change
from
(
Select
A.*,
row_number() over(partition by A.emp_ref_i order by a.Last_updt_d desc, a.start_date desc) rn
from (...) A
)s
where rn<=2;
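The row_number-and-filter pattern can be checked on the question's three sample rows with the sketch below. It runs in Python's sqlite3 module rather than Hive, so it only illustrates the logic; the dates are rewritten as ISO text (an assumption) so that they sort correctly, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pr (emp_ref_i INTEGER, last_updt_d TEXT, start_date TEXT, change TEXT);
    INSERT INTO pr VALUES
        (123, '2020-03-29', '2019-02-03', 'CHG3'),
        (123, '2019-03-30', '2018-02-04', 'CHG2'),
        (123, '2019-03-29', '2018-02-04', 'CHG1');
""")

query = """
SELECT s.emp_ref_i, s.last_updt_d, s.start_date,
       CASE WHEN s.rn = 1 THEN 'Y' ELSE 'N' END AS valid_f,  -- only the newest row is valid
       s.change
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY a.emp_ref_i
                              ORDER BY a.last_updt_d DESC, a.start_date DESC) AS rn
    FROM pr AS a
) AS s
WHERE s.rn <= 2            -- keep the two latest rows per employee
ORDER BY s.emp_ref_i, s.rn
"""

latest = conn.execute(query).fetchall()
print(latest)
```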

First value in DATE minus 30 days SQL

I have a bunch of data from which I'm showing the ID, the max date, and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date and its corresponding values within this date period.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but as I have quite a large WHERE clause, the refresh takes ages.
The subselect looks like this:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 numbers each ID's rows by date, most recent first, so the row with rn = 1 holds max_date and max_name; it also computes the date 30 days earlier (dMinus30). cte2 joins back to cte1 to collect all rows that fall within the 30 days before each ID's max date, then numbers them in ascending date order so that rn = 1 is the earliest row in that window. The outer query joins the two results, taking the most recent row from cte1 and the earliest in-window row from cte2.
I'm not sure how well it will scale with your data, but using the CTEs should optimize pretty well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
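The same max-date / 30-day-window logic can also be written with window functions only. The sketch below is an alternative formulation run through Python's sqlite3 module, not the T-SQL above: the snake_case column names are hypothetical, SQLite's date(x, '-30 days') stands in for DATEADD(day, -30, x), and window functions need SQLite 3.25 or later. It uses the answer's sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, the_date TEXT, the_name TEXT);
    INSERT INTO t1 VALUES
        (1, '2018-05-01', 'AAA'), (1, '2018-04-21', 'CCC'),
        (1, '2018-04-05', 'BBB'), (1, '2018-03-27', 'AAA'),
        (2, '2018-05-02', 'AAA'), (2, '2018-05-21', 'CCC'),
        (2, '2018-03-03', 'BBB'), (2, '2018-01-20', 'AAA');
""")

query = """
WITH with_max AS (          -- attach each ID's latest date to every row
    SELECT id, the_date, the_name,
           MAX(the_date) OVER (PARTITION BY id) AS max_date
    FROM t1
), in_window AS (           -- keep rows from the 30 days up to max_date
    SELECT id, the_date, the_name,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY the_date DESC) AS rn_desc,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY the_date ASC)  AS rn_asc
    FROM with_max
    WHERE the_date >= date(max_date, '-30 days')
)
SELECT id,
       MAX(CASE WHEN rn_desc = 1 THEN the_date END) AS max_date,
       MAX(CASE WHEN rn_desc = 1 THEN the_name END) AS max_name,
       MAX(CASE WHEN rn_asc  = 1 THEN the_date END) AS previous_date,
       MAX(CASE WHEN rn_asc  = 1 THEN the_name END) AS previous_name
FROM in_window
GROUP BY id
ORDER BY id
"""

result = conn.execute(query).fetchall()
print(result)
```

As in the CTE version, an ID whose only in-window row is the max row itself will report that row as both the max and the previous entry.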
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;