sql/oracle select values separated by comma with grouping - sql

I have first table like this: table_1
date        group_number  c_id  rate
------------------------------------
01.01.2020  A             001   12.0
02.01.2020  A             001   12.0
01.01.2020  A             002   10.0
01.01.2020  B             103   8.0
01.01.2020  B             101   8.0
01.01.2020  C             203   11.0
And have second table_2 with name of group with date of records:
date        group_number
------------------------
01.01.2020  A
02.02.2020  A
03.03.2020  A
01.01.2020  B
01.02.2020  B
01.01.2020  C
The task is to write into a new column in table_2 the rates of each c_id, separated by commas and grouped by group_number. I need to add the new column to table_2 as follows:
date        group_number  rate_for_groups
-----------------------------------------
01.01.2020  A             12.0, 10.0
02.02.2020  A             12.0, 10.0
03.03.2020  A             12.0, 10.0
01.01.2020  B             8.0, 8.0
01.02.2020  B             8.0, 8.0
01.01.2020  C             11.0
I have tried to do something like this:
select *,
listagg(rate, ',') within group (order by C_ID) as rates
from table_1
group by group_number
but it raised the error "not a group by expression"

Your query shows only half the task: You are only looking at table_1. With GROUP BY group_number you tell the DBMS to select one row per group_number only. That is fine for that table. But you cannot SELECT * then, because there are several rows per group_number. How is the DBMS supposed to know which row's values to display then for a group_number?
Remove the * from that query to make it valid and select group_number instead. Then join this result to table_2.
select *
from table_2 t2
left outer join
(
select
group_number,
listagg(rate, ',') within group (order by c_id) as rates
from table_1
group by group_number
) t1 on t1.group_number = t2.group_number
order by t2.group_number, t2.date;
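To try the aggregate-then-join idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Oracle query itself: SQLite has no LISTAGG, so group_concat stands in for it (an assumption of this sketch), and group_concat does not promise an ordering inside each list. The sample is reduced to one row per c_id and four rows of table_2.

```python
import sqlite3

# In-memory database with a reduced version of the sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (date TEXT, group_number TEXT, c_id TEXT, rate REAL);
    CREATE TABLE table_2 (date TEXT, group_number TEXT);
    INSERT INTO table_1 VALUES
        ('01.01.2020', 'A', '001', 12.0),
        ('01.01.2020', 'A', '002', 10.0),
        ('01.01.2020', 'B', '103', 8.0),
        ('01.01.2020', 'B', '101', 8.0),
        ('01.01.2020', 'C', '203', 11.0);
    INSERT INTO table_2 VALUES
        ('01.01.2020', 'A'), ('02.02.2020', 'A'),
        ('01.01.2020', 'B'), ('01.01.2020', 'C');
""")

# Aggregate table_1 to one row per group first, then join that to table_2.
# group_concat is SQLite's stand-in for Oracle's LISTAGG.
query = """
SELECT t2.date, t2.group_number, t1.rates
FROM table_2 t2
LEFT JOIN (
    SELECT group_number, group_concat(rate, ',') AS rates
    FROM table_1
    GROUP BY group_number
) t1 ON t1.group_number = t2.group_number
ORDER BY t2.group_number, t2.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every table_2 row of a group receives the same list, because the subquery returns exactly one row per group_number before the join happens.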

Related

SQL (bigquery) Repeat values from previous day if they don't exist already

I have a table that includes users that created an event in the app and produced some revenue based on this event (it's added cumulatively every day):
date        user_id  revenue
----------------------------
2022-04-01  A        0.5
2022-04-01  B        0.3
2022-04-01  C        0.7
2022-04-02  B        0.6
2022-04-02  C        0.9
2022-04-03  C        1.2
What I want to do is use the data about all the users from the first day, but if they don't bring any revenue, I would like to use the revenue value for this user from the previous day, like so:
date        user_id  revenue
----------------------------
2022-04-01  A        0.5
2022-04-01  B        0.3
2022-04-01  C        0.7
2022-04-02  A        0.5
2022-04-02  B        0.6
2022-04-02  C        0.9
2022-04-03  A        0.5
2022-04-03  B        0.6
2022-04-03  C        1.2
My initial idea was to somehow copy first day user_id's and leave the revenue value for this day as null:
date        user_id  revenue
----------------------------
2022-04-01  A        0.5
2022-04-01  B        0.3
2022-04-01  C        0.7
2022-04-02  A        null
2022-04-02  B        0.6
2022-04-02  C        0.9
2022-04-03  A        null
2022-04-03  B        null
2022-04-03  C        1.2
Then I would use this to find the right values to fill NULLs with
SELECT date,
user_id,
revenue,
LAST_VALUE(revenue, IGNORE NULLS) as last_values
FROM table
So the question is, how do I go about "copying" my first day users to every following day in the table?
Maybe, there is a better solution than the one I've thought about?
The problem of this task is generating an output that contains the cartesian product of "user_id" values with "date" values. One option is to generate the empty cartesian product first, then LEFT JOIN with your original table and fill the NULL values using a window function.
Instead of the LAST_VALUE function, you could use the MAX window function and limit its frame till the current row. Given that your revenue values are cumulative, you should get exactly the last non-null value as a correct output.
WITH cte AS (
SELECT *
FROM (SELECT DISTINCT date FROM tab) dates
INNER JOIN (SELECT DISTINCT user_id FROM tab) users
ON 1=1
)
SELECT cte.user_id,
cte.date,
COALESCE(tab.revenue,
MAX(tab.revenue) OVER(PARTITION BY cte.user_id
ORDER BY cte.date
ROWS UNBOUNDED PRECEDING)) AS revenue
FROM cte
LEFT JOIN tab
ON cte.user_id = tab.user_id
AND cte.date = tab.date
ORDER BY cte.user_id,
cte.date
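The cross-product-then-fill approach can be checked locally with a small sketch in Python's sqlite3 module. This assumes an SQLite build with window function support (3.25 or newer) rather than BigQuery, and the CTE name grid is made up for the illustration. Columns are qualified with their table names because both grid and tab expose date and user_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (date TEXT, user_id TEXT, revenue REAL);
    INSERT INTO tab VALUES
        ('2022-04-01', 'A', 0.5), ('2022-04-01', 'B', 0.3), ('2022-04-01', 'C', 0.7),
        ('2022-04-02', 'B', 0.6), ('2022-04-02', 'C', 0.9),
        ('2022-04-03', 'C', 1.2);
""")

query = """
WITH grid AS (
    -- every (date, user_id) combination, whether or not a row exists
    SELECT dates.date, users.user_id
    FROM (SELECT DISTINCT date FROM tab) AS dates
    CROSS JOIN (SELECT DISTINCT user_id FROM tab) AS users
)
SELECT grid.date,
       grid.user_id,
       -- revenue is cumulative, so the running maximum equals
       -- the most recent non-null value
       MAX(tab.revenue) OVER (
           PARTITION BY grid.user_id
           ORDER BY grid.date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS revenue
FROM grid
LEFT JOIN tab
       ON tab.user_id = grid.user_id
      AND tab.date = grid.date
ORDER BY grid.date, grid.user_id
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

The running MAX only works as a "last known value" because revenue never decreases for a user. If it could go down, a function that picks the last non-null value would be needed instead.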
Consider below approach
select date, user_id,
first_value(revenue ignore nulls) over prev_values as revenue
from (select distinct date from your_table),
(select distinct user_id from your_table)
left join your_table
using(date, user_id)
window prev_values as (
partition by user_id order by date desc
rows between current row and unbounded following
)
order by date, user_id
if applied to the sample data in your question - output is

INNER JOIN SQL with DateTime return multiple record

I have the following table:
Group RecDate oData
---------------------------------------
123 2022-03-20 02:00:00 F1xR
123 2022-03-21 02:30:00 F1xF
123 2022-03-22 05:00:00 F1xN
123 2022-03-15 04:00:00 F2xR
From the table above, I want to get the MAX date group by 2 char from oData field. Then I wrote a query like this:
SELECT a.Group, MAX(a.RecDate) RecDate, LEFT(a.oData, 2) oDataNo
INTO #t1
FROM TableData a
GROUP BY a.Group, LEFT(a.oData, 2)
SELECT * FROM #t1
Then, the result should be:
Group RecDate oDataNo
--------------------------------------------
123 2022-03-22 05:00:00 F1
123 2022-03-15 04:00:00 F2
From the result above (#t1), I want to join with the TableData to get the RIGHT character (1 digit) from oData field. So I INNER JOIN the #t1 with TableData. The JOIN field is RecDate. But it is strange that the result isn't what I want.
The query like:
SELECT RIGHT(a.oData,1) oDataStat, b.*
FROM TableData a
INNER JOIN #t1 b ON a.RecDate = b.RecDate
The wrong result like:
The result should be:
Group RecDate oDataNo oDataStat
-----------------------------------------------------------
123 2022-03-22 05:00:00 F1 N
123 2022-03-15 04:00:00 F2 R
Am I doing wrong approach?
Please advise. Really appreciated.
Thank you.
The query you provided returns the data you desire. However, it's cleaner to do it in a single query, e.g.
WITH cte AS (
SELECT *
, RIGHT(a.oData,1) oDataStat
, ROW_NUMBER() OVER (PARTITION BY LEFT(a.oData, 2) ORDER BY RecDate DESC) rn
FROM TableData a
)
SELECT [Group], RecDate, oData, oDataStat
FROM cte
WHERE rn = 1
ORDER BY RecDate;
returns:
Group  RecDate              oData  oDataStat
--------------------------------------------
123    2022-03-15 04:00:00  F2xR   R
123    2022-03-22 05:00:00  F1xN   N
Note: Your query as posted doesn't actually run due to not escaping [Group] - you should ensure everything you post has any errors removed first.
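For anyone who wants to run the ROW_NUMBER idea without SQL Server, here is a rough equivalent using Python's sqlite3 module. The differences are assumptions of this sketch: SQLite has no LEFT or RIGHT functions, so substr is used instead, the reserved word Group is quoted with double quotes rather than brackets, and the partition also includes "Group" in case the table holds more than one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableData ("Group" INTEGER, RecDate TEXT, oData TEXT);
    INSERT INTO TableData VALUES
        (123, '2022-03-20 02:00:00', 'F1xR'),
        (123, '2022-03-21 02:30:00', 'F1xF'),
        (123, '2022-03-22 05:00:00', 'F1xN'),
        (123, '2022-03-15 04:00:00', 'F2xR');
""")

query = """
WITH ranked AS (
    SELECT "Group", RecDate, oData,
           substr(oData, 1, 2) AS oDataNo,    -- first two characters
           substr(oData, -1)   AS oDataStat,  -- last character
           ROW_NUMBER() OVER (
               PARTITION BY "Group", substr(oData, 1, 2)
               ORDER BY RecDate DESC
           ) AS rn
    FROM TableData
)
SELECT "Group", RecDate, oDataNo, oDataStat
FROM ranked
WHERE rn = 1          -- keep only the newest row of each prefix
ORDER BY RecDate
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Because the newest row is picked inside a single pass, there is no second join on RecDate and therefore no risk of matching rows from another prefix that happen to share a timestamp.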

Window Function / Aggregate Function / Interrupting Window

I have a Table looking like this (Cols A-D):
A B C D E
----------------------------------------------------------
1 2011 2011-06-30 A 2013-06-30
1 2012 2012-06-30 A 2013-06-30
1 2013 2013-06-30 A 2013-06-30
1 2014 2015-06-30 B 2015-06-30
1 2015 9999-12-31 A 9999-12-31
2 2014 9999-12-31 C 9999-12-31
2 2015 9999-12-31 C 9999-12-31
2 2016 9999-12-31 C 9999-12-31
I try to create col E based on A-D via window functions. I need to calculate the max(C) without interruption of D (if it changes the next window should begin) ordered by A, B and C.
You need to identify adjacent groups. One method uses a difference of window functions to identify the groups:
select t.*,
max(c) over (partition by a, d, seqnum_a - seqnum_ad) as e
from (select t.*,
row_number() over (partition by a order by b) as seqnum_a,
row_number() over (partition by a, d order by b) as seqnum_ad
from t
) t;
It is a bit hard to explain how the difference of row numbers works. However, if you run the subquery and stare at the results, you'll probably see how it works.
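To make the difference of row numbers visible, the sketch below runs the subquery on the sample rows with Python's sqlite3 module (assuming SQLite 3.25 or newer for window functions) and prints both sequence numbers next to their difference, here called grp. The sketch also puts d into the final partition, since two runs with different d values can end up with the same difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c TEXT, d TEXT);
    INSERT INTO t VALUES
        (1, 2011, '2011-06-30', 'A'),
        (1, 2012, '2012-06-30', 'A'),
        (1, 2013, '2013-06-30', 'A'),
        (1, 2014, '2015-06-30', 'B'),
        (1, 2015, '9999-12-31', 'A'),
        (2, 2014, '9999-12-31', 'C'),
        (2, 2015, '9999-12-31', 'C'),
        (2, 2016, '9999-12-31', 'C');
""")

query = """
SELECT a, b, d, seqnum_a, seqnum_ad,
       seqnum_a - seqnum_ad AS grp,
       MAX(c) OVER (PARTITION BY a, d, seqnum_a - seqnum_ad) AS e
FROM (
    SELECT t.*,
           -- position of the row within its a
           ROW_NUMBER() OVER (PARTITION BY a ORDER BY b)    AS seqnum_a,
           -- position of the row within its (a, d)
           ROW_NUMBER() OVER (PARTITION BY a, d ORDER BY b) AS seqnum_ad
    FROM t
) AS numbered
ORDER BY a, b
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

While d stays the same, both counters advance together and the difference stays constant. When another d value interrupts the run, seqnum_a keeps counting but seqnum_ad for the old value pauses, so a later run of the same d gets a different difference (0 for 2011 to 2013, then 1 for 2015 in the sample).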
Try below query to get the requested result
select t1.*, t2.C as E from table1 as t1
inner join (select D, max(C) as C from table1 group by D) as t2 on t1.D = t2.D

How to get latest date group by employee of a column but without another column

I'm working a query in SQL 2005.
I'm trying to get the latest date for a number column. The trick is that another column (Rate) also uses the Date column, so I fetch the wrong date in the end.
An example will better explain my question.
This is my SQL table EmployeeRates:
----------------------------------
FkEmployee | Date | Rate | Number |
----------------------------------
1 2000 15 1.5
1 2001 16 1.5
1 2002 16 1.6
2 2000 12 1.5
2 2001 14 1.6
2 2002 15 1.6
So if I fetch the latest date, currently I have :
FkEmployee #1 = 2002 (which is correct because it's the latest date for the number column.)
FkEmployee #2 = 2002 (which is not what I want, because that year it was the rate that changed and there is a duplicate number) What I want is 2001.
The code I have right now (2015-08-10 14:15)
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(
SELECT FkEmployee, MAX(Date) AS MaxDate
FROM EmployeeRates
GROUP BY FkEmployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
Thanks for anybody that can help =)
This should work. First find MIN date by Employee, Number, then get the MAX of that. This will ensure you are getting the earliest date per number, but latest date per employee:
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(SELECT FkEmployee,MAX(MinDate) AS MaxDate from
(SELECT FkEmployee, MIN(Date) AS MinDate
FROM EmployeeRates
GROUP BY FkEmployee,Number) a
GROUP BY Fkemployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
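The MIN-then-MAX idea can be verified against the sample table with a short sketch in Python's sqlite3 module. Only the inner part of the query is shown, since the final join back to EmployeeRates adds no new columns here, and the alias first_seen is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeRates (FkEmployee INTEGER, Date INTEGER, Rate INTEGER, Number REAL);
    INSERT INTO EmployeeRates VALUES
        (1, 2000, 15, 1.5), (1, 2001, 16, 1.5), (1, 2002, 16, 1.6),
        (2, 2000, 12, 1.5), (2, 2001, 14, 1.6), (2, 2002, 15, 1.6);
""")

query = """
SELECT FkEmployee, MAX(MinDate) AS MaxDate
FROM (
    -- first date on which each Number value appears for an employee
    SELECT FkEmployee, Number, MIN(Date) AS MinDate
    FROM EmployeeRates
    GROUP BY FkEmployee, Number
) AS first_seen
GROUP BY FkEmployee   -- latest of those first appearances
ORDER BY FkEmployee
"""
last_change = conn.execute(query).fetchall()
print(last_change)  # [(1, 2002), (2, 2001)]
```

One caveat of this approach: it assumes a Number value never comes back after it has changed. If an employee went from 1.5 to 1.6 and back to 1.5, the MIN for 1.5 would still be the first year, so the return to 1.5 would not be detected.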

Select Most Recent Entry in SQL

I'm trying to select the most recent non zero entry from my data set in SQL. Most examples of this are satisfied with returning only the date and the group by variables, but I would also like to return the relevant Value. For example:
ID Date Value
----------------------------
001 2014-10-01 32
001 2014-10-05 10
001 2014-10-17 0
002 2014-10-03 17
002 2014-10-20 60
003 2014-09-30 90
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-17 0
005 2014-10-18 9
Using
SELECT ID, MAX(Date) AS MDate FROM Table WHERE Value > 0 GROUP BY ID
Returns:
ID Date
-------------------
001 2014-10-05
002 2014-10-20
003 2014-10-10
004 2014-10-06
005 2014-10-18
But whenever I try to include Value as one of the selected variables, SQLServer results in an error:
"Column 'Value' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause."
My desired result would be:
ID Date Value
----------------------------
001 2014-10-05 10
002 2014-10-20 60
003 2014-10-10 7
004 2014-10-06 150
005 2014-10-18 9
One solution I have thought of would be to look up the results back in the original Table and return the Value that corresponds to the relevant ID & Date (I have already trimmed down and so I know these are unique), but this seems to me like a messy solution. Any help on this would be appreciated.
NOTE: I do not want to group by Value as this is the result I am trying to pull out in the end (i.e. for each ID, I want the most recent Value). Further Example:
ID Date Value
----------------------------
001 2014-10-05 10
001 2014-10-06 10
001 2014-10-10 10
001 2014-10-12 8
001 2014-10-18 0
Here, I only want the last non zero entry. (001, 2014-10-12, 8)
SELECT ID, MAX(Date) AS MDate, Value FROM Table WHERE Value > 0 GROUP BY ID, Value
Would return:
ID Date Value
----------------------------
001 2014-10-10 10
001 2014-10-12 8
This can also be done using a window function, which is very often faster than a join on a grouped query:
select id, date, value
from (
select id,
date,
value,
row_number() over (partition by id order by date desc) as rn
from the_table
where value > 0
) t
where rn = 1
order by id;
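Here is a small runnable check of the window-function approach against the sample data, using Python's sqlite3 module (assuming SQLite 3.25 or newer). The zero values are filtered out before ranking, otherwise the zero rows for IDs 001 would be ranked first and returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table (id TEXT, date TEXT, value INTEGER);
    INSERT INTO the_table VALUES
        ('001', '2014-10-01', 32), ('001', '2014-10-05', 10), ('001', '2014-10-17', 0),
        ('002', '2014-10-03', 17), ('002', '2014-10-20', 60),
        ('003', '2014-09-30', 90), ('003', '2014-10-10', 7),
        ('004', '2014-10-06', 150),
        ('005', '2014-10-17', 0),  ('005', '2014-10-18', 9);
""")

query = """
SELECT id, date, value
FROM (
    SELECT id, date, value,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
    FROM the_table
    WHERE value > 0      -- drop zero entries before ranking
) AS ranked
WHERE rn = 1             -- newest remaining row per id
ORDER BY id
"""
most_recent = conn.execute(query).fetchall()
for row in most_recent:
    print(row)
```

Since the whole row is carried through the subquery, Value comes along without being part of any GROUP BY, which avoids the error from the question.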
Assuming you don't have repeated dates for the same ID in the table, this should work:
SELECT A.ID, A.Date, A.Value
FROM
T1 AS A
INNER JOIN (SELECT ID,MAX(Date) AS Date FROM T1 WHERE Value > 0 GROUP BY ID) AS B
ON A.ID = B.ID AND A.Date = B.Date
select a.id, a.date, a.value from Table1 a inner join (
select id, max(date) mydate from table1
where Value>0 group by ID) b on a.ID=b.ID and a.Date=b.mydate
Using a subquery:
SELECT ID, Date AS MDate, VALUE
FROM table t1
where date = (Select max(date)
from table t2
where Value >0
and t1.id = t2.id
)
The answers provided are perfectly adequate, but here is a version using a CTE:
;WITH cteTable
AS
(
SELECT
Table.ID [ID], MAX(Date) [MaxDate]
FROM
Table
WHERE
Table.Value > 0
GROUP BY
Table.ID
)
SELECT
cteTable.ID, cteTable.MaxDate, Table.Value
FROM
Table INNER JOIN cteTable ON (Table.ID = cteTable.ID AND Table.Date = cteTable.MaxDate)