SQL Fill Missing Values

I have an existing table that looks like this:
date | days_num | value
2023-01-01 | 0 | 2
2023-01-01 | 1 | 3
2023-01-01 | 2 | 4
2023-01-01 | 3 | 4
2023-01-01 | 4 | 2
2023-01-01 | 5 | 1
2023-01-02 | 0 | 2
2023-01-02 | 1 | 3
2023-01-02 | 2 | 4
2023-01-02 | 3 | 2
2023-01-02 | 4 | 2
2023-01-03 | 0 | 3
2023-01-03 | 1 | 4
2023-01-03 | 2 | 5
The target is to fill in the missing days_num values up to a certain number (e.g. 5) for all of my dates, and set value to NULL for the added rows. Is it possible to do this with a query?
date | days_num | value
2023-01-01 | 0 | 2
2023-01-01 | 1 | 3
2023-01-01 | 2 | 4
2023-01-01 | 3 | 4
2023-01-01 | 4 | 2
2023-01-01 | 5 | 1
2023-01-02 | 0 | 2
2023-01-02 | 1 | 3
2023-01-02 | 2 | 4
2023-01-02 | 3 | 2
2023-01-02 | 4 | 2
2023-01-02 | 5 | NULL *
2023-01-03 | 0 | 3
2023-01-03 | 1 | 4
2023-01-03 | 2 | 5
2023-01-03 | 3 | NULL *
2023-01-03 | 4 | NULL *
2023-01-03 | 5 | NULL *

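First, generate the sequence of dates between the minimum and maximum date.
Then generate the sequence of days_num values.
Take the Cartesian product of the dates and the days_num values.
Finally, outer join that product to the existing data. The WHERE clause below keeps only the combinations that are missing from the table (sequence and UNNEST as used here are Presto/Trino-style syntax):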
SELECT seq.date, seq.days_num, NULL as value
FROM (
SELECT * FROM (
VALUES
(DATE '2023-01-01', 0, 2),
(DATE '2023-01-01', 1, 3),
(DATE '2023-01-01', 2, 4),
(DATE '2023-01-01', 3, 4),
(DATE '2023-01-01', 4, 2),
(DATE '2023-01-01', 5, 1),
(DATE '2023-01-02', 0, 2),
(DATE '2023-01-02', 1, 3),
(DATE '2023-01-02', 2, 4),
(DATE '2023-01-02', 3, 2),
(DATE '2023-01-02', 4, 2),
(DATE '2023-01-03', 0, 3),
(DATE '2023-01-03', 1, 4),
(DATE '2023-01-03', 2, 5)
) AS t (date, days_num, value)
) AS ed
RIGHT OUTER JOIN (
SELECT * FROM
UNNEST(sequence(DATE '2023-01-01', DATE '2023-01-03', interval '1' day)) t(date),
UNNEST(sequence(0, 5)) t(days_num)
) AS seq ON ed.date = seq.date AND seq.days_num = ed.days_num
WHERE ed.date IS NULL AND ed.days_num IS NULL
ORDER BY seq.date, seq.days_num
OUTPUT
2023-01-02 00:00:00.000 5 NULL
2023-01-03 00:00:00.000 3 NULL
2023-01-03 00:00:00.000 4 NULL
2023-01-03 00:00:00.000 5 NULL
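The query above lists only the missing rows. To get the complete table the question asks for, one option is to drop the WHERE filter and select the joined value, so that unmatched combinations come back as NULL. Below is a minimal sketch of that idea using Python's built-in sqlite3 module, so it can be run anywhere. The table name `existing` and the recursive CTE `nums` are assumptions made for this illustration, and the dates are taken from the table itself rather than generated from a min/max range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing (date TEXT, days_num INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO existing VALUES (?, ?, ?)",
    [
        ("2023-01-01", 0, 2), ("2023-01-01", 1, 3), ("2023-01-01", 2, 4),
        ("2023-01-01", 3, 4), ("2023-01-01", 4, 2), ("2023-01-01", 5, 1),
        ("2023-01-02", 0, 2), ("2023-01-02", 1, 3), ("2023-01-02", 2, 4),
        ("2023-01-02", 3, 2), ("2023-01-02", 4, 2),
        ("2023-01-03", 0, 3), ("2023-01-03", 1, 4), ("2023-01-03", 2, 5),
    ],
)

query = """
WITH RECURSIVE nums(days_num) AS (
    -- hypothetical helper: generates 0..5
    SELECT 0
    UNION ALL
    SELECT days_num + 1 FROM nums WHERE days_num < 5
),
dates AS (
    -- assumption: every date of interest already appears at least once
    SELECT DISTINCT date FROM existing
)
SELECT d.date, n.days_num, e.value
FROM dates AS d
CROSS JOIN nums AS n
LEFT JOIN existing AS e
    ON e.date = d.date AND e.days_num = n.days_num
ORDER BY d.date, n.days_num
"""

# Every (date, days_num) pair appears; value is None where no row existed.
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

If some dates may be absent from the table entirely, the `dates` CTE would have to be generated instead, as the Presto/Trino query does with sequence.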

Related

put a 1 if the customer has bought in 3 following months sql assistant

I'm a beginner with teradata SQL assistant and I don't know if it can do what I need.
I have a base with the variables ID, month (or period) and the incomes of that month. What I need is to put a 1 if the client buys in the next 3 months or a 0 if not, and do it for all ID. For example, if I am in month 1 and there's a purchase in the next 3 months, then put a 1 in that row for that client. In the last periods as there will not be 3 months, an NA appears.
Here is code for the sample data:
IF OBJECT_ID('tempdb..#StackTest') IS NOT NULL
DROP TABLE #StackTest;
CREATE TABLE #StackTest
(Id int
,Month int
,Income int
);
INSERT INTO #StackTest
(Id
,Month
,Income
)
VALUES
(1, 1, 5000),
(1, 2, 0),
(1, 3, 0),
(1, 4, 0),
(1,5, 0),
(1,6, 0),
(1, 7, 400),
(1, 8, 0),
(1, 9, 0),
(1, 10, 0),
(1, 11, 0),
(1, 12, 0),
(1, 13, 400),
(2, 1, 5000),
(2, 2, 0),
(2, 3, 100),
(2,4, 0),
(2,5, 0),
(2,6, 0),
(2, 7, 0),
(2, 8, 1500),
(2, 9, 0),
(2, 10, 0),
(2, 11, 0),
(2, 12, 100),
(2, 13, 750),
(3, 1, 0),
(3, 2, 0),
(3, 3, 0),
(3, 4, 0),
(3,5, 700),
(3,6, 240),
(3, 7, 100),
(3, 8, 0),
(3, 9, 0),
(3, 10, 0),
(3, 11, 0),
(3, 12, 500),
(3, 13, 760);
ID | Month | Incomes
1 | 1 | 5000
1 | 2 | 0
1 | 3 | 0
1 | 4 | 0
1 | 5 | 0
1 | 6 | 0
1 | 7 | 400
1 | 8 | 300
1 | 9 | 0
1 | 10 | 0
1 | 11 | 0
1 | 12 | 0
1 | 13 | 400
2 | 1 | 0
2 | 2 | 100
2 | 3 | 0
2 | 4 | 0
2 | 5 | 0
2 | 6 | 0
2 | 7 | 0
2 | 8 | 1500
2 | 9 | 0
2 | 10 | 0
2 | 11 | 0
2 | 12 | 100
2 | 13 | 750
3 | 1 | 0
3 | 2 | 0
3 | 3 | 0
3 | 4 | 0
3 | 5 | 700
3 | 6 | 240
3 | 7 | 100
3 | 8 | 0
3 | 9 | 0
3 | 10 | 0
3 | 11 | 0
3 | 12 | 500
3 | 13 | 760
I had to do it with R and here they could help me, but now I've to do it with teradata sql assistant.
This is what I want:
ID | Month | Incomes | Quarterly
1 | 1 | 5000 | 0
1 | 2 | 0 | 0
1 | 3 | 0 | 0
1 | 4 | 0 | 1
1 | 5 | 0 | 1
1 | 6 | 0 | 1
1 | 7 | 400 | 1
1 | 8 | 300 | 0
1 | 9 | 0 | 0
1 | 10 | 0 | 0
1 | 11 | 0 | NA
1 | 12 | 0 | NA
1 | 13 | 400 | NA
2 | 1 | 0 | 1
2 | 2 | 100 | 0
2 | 3 | 0 | 0
2 | 4 | 0 | 0
2 | 5 | 0 | 1
2 | 6 | 0 | 1
2 | 7 | 0 | 1
2 | 8 | 1500 | 0
2 | 9 | 0 | 1
2 | 10 | 0 | 1
2 | 11 | 0 | NA
2 | 12 | 100 | NA
2 | 13 | 750 | NA
3 | 1 | 0 | 0
3 | 2 | 0 | 1
3 | 3 | 0 | 1
3 | 4 | 0 | 1
3 | 5 | 700 | 1
3 | 6 | 240 | 1
3 | 7 | 100 | 0
3 | 8 | 0 | 0
3 | 9 | 0 | 1
3 | 10 | 0 | 1
3 | 11 | 0 | NA
3 | 12 | 500 | NA
3 | 13 | 760 | NA
This was my attempt, but obviously it failed and I didn't get what I expected.
select Id,Month,Incomes, SUM(Incomes) OVER (PARTITION BY Id ORDER BY Month ROWS 3 PRECEDING) AS Quarterly from rentability order by Id, Month
*rentability is a table created.
Does anyone know how to mark it with a 1, or with the max of that period? Thanks!

Consider:
select
t.*,
case when sum(income) over(
partition by id
order by month
range between 1 following and 3 following
) > 0
then 1
else 0
end quarterly
from StackTest t
This works by performing a window sum over the 3 following months (we use a range definition instead of a row definition, so this should work even if you have missing records).
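As a small runnable check of the range frame described above, here is a sketch using Python's built-in sqlite3 module (SQLite accepts RANGE frames with numeric offsets from version 3.28). The table name `purchases` and the six sample rows are assumptions chosen for illustration, not the full data set from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (id INTEGER, month INTEGER, income INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, 1, 5000), (1, 2, 0), (1, 3, 0), (1, 4, 0), (1, 5, 0), (1, 6, 400)],
)

query = """
SELECT id, month, income,
       CASE WHEN SUM(income) OVER (
                PARTITION BY id
                ORDER BY month
                -- months strictly after the current one, up to 3 ahead
                RANGE BETWEEN 1 FOLLOWING AND 3 FOLLOWING
            ) > 0
            THEN 1 ELSE 0
       END AS quarterly
FROM purchases
ORDER BY id, month
"""

flags = [row[3] for row in conn.execute(query)]
print(flags)
```

Note that for the last month the frame is empty, so the sum is NULL and the CASE falls through to 0. Producing NA for the final three months, as in the question's expected output, would need an extra condition on the month.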

How to query table based on specific rows in another table using SQL SELECT

There's a table with data for several teams that looks like this:
original_dates:
date | team_id | value
---------------------------------
2019-01-01 | 1 | 13
2019-01-01 | 2 | 88
2019-01-02 | 1 | 17
2019-01-02 | 2 | 99
2019-01-03 | 1 | 26
2019-01-03 | 2 | 105
2019-01-04 | 1 | 49
2019-01-04 | 2 | 134
2019-01-04 | 1 | 56
2019-01-04 | 2 | 167
However, on a certain date, we want to reset that day's value to 0, set all previous dates with that ID to 0, and subtract that value from all following dates, with a minimum of 0. Here's a table of dates that need to be reset:
inflection_dates:
date | team_id | value
-----------------------------------
2019-01-02 | 2 | 99
2019-01-03 | 1 | 26
And here's the resulting table, which I'm hoping to achieve:
result:
date | team_id | value
---------------------------------
2019-01-01 | 1 | 0
2019-01-01 | 2 | 0
2019-01-02 | 1 | 0
2019-01-02 | 2 | 0 <- row in inflection_dates (value was 99)
2019-01-03 | 1 | 0 <- row in inflection_dates (value was 26)
2019-01-03 | 2 | 6 (-99)
2019-01-04 | 1 | 23 (-26)
2019-01-04 | 2 | 35 (-99)
2019-01-04 | 1 | 30 (-26)
2019-01-04 | 2 | 68 (-99)
The only constraint is that all tables are read only, so I can only query them and not modify them.
Does anyone know if this might be possible?

With a join of the tables and a CASE expression to calculate the new value:
select o.date, o.team_id,
case
when o.date <= i.date then 0
else o.value - i.value
end value
from original_dates o inner join inflection_dates i
on i.team_id = o.team_id
See the demo (for MySql but it's standard SQL).
Results:
| date | team_id | value|
| ------------------- | ------- | ---- |
| 2019-01-01 00:00:00 | 1 | 0 |
| 2019-01-01 00:00:00 | 2 | 0 |
| 2019-01-02 00:00:00 | 1 | 0 |
| 2019-01-02 00:00:00 | 2 | 0 |
| 2019-01-03 00:00:00 | 1 | 0 |
| 2019-01-03 00:00:00 | 2 | 6 |
| 2019-01-04 00:00:00 | 1 | 23 |
| 2019-01-04 00:00:00 | 2 | 35 |
| 2019-01-04 00:00:00 | 1 | 30 |
| 2019-01-04 00:00:00 | 2 | 68 |
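The question also asks for a minimum of 0 after subtraction, which the CASE expression above does not enforce (the sample data never goes negative). A minimal sketch with an extra clamping branch, run through Python's built-in sqlite3 module, could look like this. The alias `adjusted_value` is an assumption introduced here to avoid clashing with the `value` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original_dates (date TEXT, team_id INTEGER, value INTEGER)")
conn.execute("CREATE TABLE inflection_dates (date TEXT, team_id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO original_dates VALUES (?, ?, ?)",
    [
        ("2019-01-01", 1, 13), ("2019-01-01", 2, 88),
        ("2019-01-02", 1, 17), ("2019-01-02", 2, 99),
        ("2019-01-03", 1, 26), ("2019-01-03", 2, 105),
        ("2019-01-04", 1, 49), ("2019-01-04", 2, 134),
        ("2019-01-04", 1, 56), ("2019-01-04", 2, 167),
    ],
)
conn.executemany(
    "INSERT INTO inflection_dates VALUES (?, ?, ?)",
    [("2019-01-02", 2, 99), ("2019-01-03", 1, 26)],
)

query = """
SELECT o.date, o.team_id,
       CASE
           WHEN o.date <= i.date THEN 0           -- on or before the reset date
           WHEN o.value - i.value < 0 THEN 0      -- clamp to a minimum of 0
           ELSE o.value - i.value
       END AS adjusted_value
FROM original_dates AS o
INNER JOIN inflection_dates AS i
    ON i.team_id = o.team_id
ORDER BY o.date, o.team_id, adjusted_value
"""

adjusted = conn.execute(query).fetchall()
for row in adjusted:
    print(row)
```

Because this is an inner join, a team with no row in inflection_dates would disappear from the result. A left join with an additional branch for a missing inflection row would keep such teams.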

Try this:
drop table #tmp
---------------------------------
select '2019-01-01' as date, 1 as team_id, 13 as value into #tmp
union select '2019-01-01', 2, 88
union select '2019-01-02', 1, 17
union select '2019-01-02', 2, 99
union select '2019-01-03', 1, 26
union select '2019-01-03', 2, 105
union select '2019-01-04', 1, 49
union select '2019-01-04', 2, 134
union select '2019-01-04', 1, 56
union select '2019-01-04', 2, 167
drop table #tmpinflection
---------------------------------
select '2019-01-02' as date, 2 as team_id, 99 as value into #tmpinflection
union select '2019-01-03', 1, 26
select a.date, a.team_id,
case when a.date <= b.date then 0
else a.value - b.value end as value
from #tmp a left join #tmpinflection b on a.team_id = b.team_id where b.date is not null

How to group days by time slot to know if a slot covers the whole week in SQL?

I have a SQL table that contains weekly slots with these columns:
| [id] | [dayOfWeek] | [startTime] | [endTime]
This table holds the time slots when the shop is open:
e.g. (1, 2, 14:00:00, 16:00:00) ==> the shop is open on Tuesday (2nd day of the week) between 14h and 16h.
How can I find out with a SQL query whether I have the same time slot (e.g. 14h => 16h) for every day of the week?
EDIT
This is an example of my data :
| id | dayOfWeek | startTime | endTime |
|====|===========|===========|==========|
| 1 | 1 | 07:00:00 | 08:00:00 |
| 2 | 1 | 09:00:00 | 10:00:00 |
| 3 | 0 | 14:00:00 | 18:00:00 |
| 4 | 1 | 14:00:00 | 18:00:00 |
| 5 | 2 | 14:00:00 | 18:00:00 |
| 6 | 3 | 14:00:00 | 18:00:00 |
| 7 | 4 | 14:00:00 | 18:00:00 |
| 8 | 5 | 14:00:00 | 18:00:00 |
| 9 | 6 | 14:00:00 | 18:00:00 |
| 10 | 3 | 16:00:00 | 19:00:00 |
| 11 | 5 | 13:00:00 | 23:00:00 |
I want my query to return:
| dayOfWeek | startTime | endTime |
|===========|===========|==========|
| 1 | 07:00:00 | 08:00:00 |
| 1 | 09:00:00 | 10:00:00 |
| | 14:00:00 | 18:00:00 | --> my all week (id 3 --> 9)
| 3 | 16:00:00 | 19:00:00 |
| 5 | 13:00:00 | 23:00:00 |

Is this what you're looking for?
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
BEGIN DROP TABLE #TestData; END;
CREATE TABLE #TestData (
id INT NOT NULL,
[dayOfWeek] TINYINT NOT NULL,
startTime TIME(0) NOT NULL,
endTime TIME(0) NOT NULL
);
INSERT #TestData(id, dayOfWeek, startTime, endTime) VALUES
(1 , 1, '07:00:00', '08:00:00'),
(2 , 1, '09:00:00', '10:00:00'),
(3 , 0, '14:00:00', '18:00:00'),
(4 , 1, '14:00:00', '18:00:00'),
(5 , 2, '14:00:00', '18:00:00'),
(6 , 3, '14:00:00', '18:00:00'),
(7 , 4, '14:00:00', '18:00:00'),
(8 , 5, '14:00:00', '18:00:00'),
(9 , 6, '14:00:00', '18:00:00'),
(10, 3, '16:00:00', '19:00:00'),
(11, 5, '13:00:00', '23:00:00');
--=====================================
SELECT
td1.startTime,
td1.endTime,
dw.daysOfWeek
FROM (
SELECT DISTINCT
td.startTime,
td.endTime
FROM
#TestData td
) td1
CROSS APPLY (
SELECT
STUFF((
SELECT
CONCAT(', ', td2.dayOfWeek)
FROM
#TestData td2
WHERE
td1.startTime = td2.startTime
AND td1.endTime = td2.endTime
FOR XML PATH('')
), 1, 2, '')
) dw (daysOfWeek);
Results:
startTime endTime daysOfWeek
---------------- ---------------- -------------------------
07:00:00 08:00:00 1
09:00:00 10:00:00 1
13:00:00 23:00:00 5
14:00:00 18:00:00 0, 1, 2, 3, 4, 5, 6
16:00:00 19:00:00 3

The following returns ids that are open on 14:00 - 16:00 on every day of the week:
select id
from t
where startTime <= '14:00:00' and endTime >= '16:00:00'
group by id
having count(*) = 7
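With the sample data from the question, each id is a single row, so another way to find "all week" slots is to group by the slot itself and count distinct days. The sketch below does that with Python's built-in sqlite3 module. The table name `slots` is an assumption made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE slots (id INTEGER, dayOfWeek INTEGER, startTime TEXT, endTime TEXT)"
)
conn.executemany(
    "INSERT INTO slots VALUES (?, ?, ?, ?)",
    [
        (1, 1, "07:00:00", "08:00:00"),
        (2, 1, "09:00:00", "10:00:00"),
        (3, 0, "14:00:00", "18:00:00"),
        (4, 1, "14:00:00", "18:00:00"),
        (5, 2, "14:00:00", "18:00:00"),
        (6, 3, "14:00:00", "18:00:00"),
        (7, 4, "14:00:00", "18:00:00"),
        (8, 5, "14:00:00", "18:00:00"),
        (9, 6, "14:00:00", "18:00:00"),
        (10, 3, "16:00:00", "19:00:00"),
        (11, 5, "13:00:00", "23:00:00"),
    ],
)

query = """
SELECT startTime, endTime
FROM slots
GROUP BY startTime, endTime
-- a slot covers the whole week when it appears on 7 different days
HAVING COUNT(DISTINCT dayOfWeek) = 7
"""

full_week = conn.execute(query).fetchall()
print(full_week)
```

The remaining slots (those not covering all seven days) can then be listed separately and combined with this result, for example with a UNION ALL, to get the layout shown in the question.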

Calculating an Avg in SQL excluding the current row

I have a DB where certain records are tagged with an ID and I want create a view that contains the Average of all those records with the same ID, EXCLUDING the current record. For example, if my data looks like this:
ROW - ID - Value
1 1 20
2 1 30
3 1 40
4 2 60
5 2 80
6 2 40
7 3 50
8 3 20
9 3 40
My view needs to calculate the average of every row with the same ID, EXCLUDING the row it's on, so my output would look something like this:
ROW - ID - Value AVG
1 1 20 35
2 1 30 30
3 1 40 25
4 2 60 60
5 2 80 50
6 2 40 70
7 3 50 30
8 3 20 45
9 3 40 35
So, in the case of row 3, it's extracted rows 1 and 2, as they have the same ID and given me the avg of their values - 25.
I've gone round the houses on this for a while now, but can't seem to nail it. Any help would be appreciated.

One option, if you have window functions:
Declare @YourTable table (ROW int,ID int,Value int)
Insert Into @YourTable values
(1, 1, 20),
(2, 1, 30),
(3, 1, 40),
(4, 2, 60),
(5, 2, 80),
(6, 2, 40),
(7, 3, 50),
(8, 3, 20),
(9, 3, 40)
Select *
,Avg = (sum(value) over (Partition By ID)-Value)/NullIf((sum(1) over (Partition By ID)-1),0)
From @YourTable
Another option is an OUTER APPLY:
Select A.*
,B.*
From @YourTable A
Outer Apply (Select Avg=avg(value)
From @YourTable
where ID=A.ID and Row<>A.Row
) B
Both return the same result for this sample data.
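The window-function idea (group total minus the current value, divided by group count minus one) can be checked quickly with Python's built-in sqlite3 module. This is only a sketch: the table name `records` and the column names `row_id` and `avg_others` are assumptions made for the illustration, and the `* 1.0` is added to force floating-point division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (row_id INTEGER, id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (1, 1, 20), (2, 1, 30), (3, 1, 40),
        (4, 2, 60), (5, 2, 80), (6, 2, 40),
        (7, 3, 50), (8, 3, 20), (9, 3, 40),
    ],
)

query = """
SELECT row_id, id, value,
       -- (sum of the group minus this row) / (rows in the group minus this row)
       (SUM(value) OVER (PARTITION BY id) - value) * 1.0
           / NULLIF(COUNT(*) OVER (PARTITION BY id) - 1, 0) AS avg_others
FROM records
ORDER BY row_id
"""

averages = [row[3] for row in conn.execute(query)]
print(averages)
```

The NULLIF guard makes the result NULL, instead of raising a division-by-zero error, for an id that has only one row.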

SELECT t1.gid, AVG(t2.value)
FROM table1 as t1 INNER JOIN
table1 as t2 ON (t1.gid != t2.gid)
GROUP BY t1.gid;
Basically, join the table to itself on your condition and then group the results based on the first table's key.
This solution should work regardless of which database system you are using; there may be minor syntax details to change.
A table like this:
ID | Value
1 | 4
2 | 6
3 | 5
Becomes (when joined):
t1.ID | t2.ID | t1.Value | t2.Value
1 | 2 | 4 | 6
1 | 3 | 4 | 5
2 | 1 | 6 | 4
2 | 3 | 6 | 5
3 | 1 | 5 | 4
3 | 2 | 5 | 6
And, then the aggregate of the grouped rows yields the wanted values.

This query works for me:
select t1.row, t1.id, t1.value, (select avg(value) from test_table as t2 where t1.id = t2.id and t1.row != t2.row) as avg from test_table as t1;
Data in the table I created (I assume it is similar to yours):
mysql> select * from test_table;
+-----+------+-------+
| row | id | value |
+-----+------+-------+
| 1 | 1 | 20 |
| 2 | 1 | 30 |
| 3 | 1 | 40 |
| 4 | 2 | 60 |
| 5 | 2 | 80 |
| 6 | 2 | 40 |
| 7 | 3 | 50 |
| 8 | 3 | 20 |
| 9 | 3 | 40 |
+-----+------+-------+
Result of query:
+-----+------+-------+---------+
| row | id | value | avg |
+-----+------+-------+---------+
| 1 | 1 | 20 | 35.0000 |
| 2 | 1 | 30 | 30.0000 |
| 3 | 1 | 40 | 25.0000 |
| 4 | 2 | 60 | 60.0000 |
| 5 | 2 | 80 | 50.0000 |
| 6 | 2 | 40 | 70.0000 |
| 7 | 3 | 50 | 30.0000 |
| 8 | 3 | 20 | 45.0000 |
| 9 | 3 | 40 | 35.0000 |
+-----+------+-------+---------+

T-SQL: Using OVER and PARTITION BY

I have the following data
| Item | Value | Date |
------------------------------
| 1 | 10 | 01.01.2010
| 1 | 20 | 02.01.2010
| 1 | 30 | 03.01.2010
| 1 | 40 | 04.01.2010
| 1 | 50 | 05.01.2010
| 1 | 80 | 10.01.2010
| 2 | 30 | 04.01.2010
| 2 | 60 | 06.01.2010
| 2 | 70 | 07.01.2010
| 2 | 80 | 08.01.2010
| 2 | 100 | 09.01.2010
And the following statement
SELECT Item, Value, MIN(Date) OVER (PARTITION BY Item)
FROM Data
WHERE Value >= 50
And I get the following result
| Item | Value | Date |
------------------------------
| 1 | 50 | 05.01.2010
| 1 | 80 | 05.01.2010
| 2 | 60 | 06.01.2010
| 2 | 70 | 06.01.2010
| 2 | 80 | 06.01.2010
| 2 | 100 | 06.01.2010
But what I need is this
| Item | Value | Date |
------------------------------
| 1 | 10 | 05.01.2010
| 1 | 20 | 05.01.2010
| 1 | 30 | 05.01.2010
| 1 | 40 | 05.01.2010
| 1 | 50 | 05.01.2010
| 1 | 80 | 05.01.2010
| 2 | 30 | 06.01.2010
| 2 | 60 | 06.01.2010
| 2 | 70 | 06.01.2010
| 2 | 80 | 06.01.2010
| 2 | 100 | 06.01.2010
Is there any quick solution to get this with one statement, without a self-join?
Thank you :)

Without a self join, try this:
DECLARE @YourTable table (item int,value int, Date datetime)
INSERT @YourTable VALUES (1 , 10 , '01/01/2010')
INSERT @YourTable VALUES (1 , 20 , '02/01/2010')
INSERT @YourTable VALUES (1 , 30 , '03/01/2010')
INSERT @YourTable VALUES (1 , 40 , '04/01/2010')
INSERT @YourTable VALUES (1 , 50 , '05/01/2010')
INSERT @YourTable VALUES (1 , 80 , '10/01/2010')
INSERT @YourTable VALUES (2 , 30 , '04/01/2010')
INSERT @YourTable VALUES (2 , 60 , '06/01/2010')
INSERT @YourTable VALUES (2 , 70 , '07/01/2010')
INSERT @YourTable VALUES (2 , 80 , '08/01/2010')
INSERT @YourTable VALUES (2 , 100 , '09/01/2010')
SELECT Item, Value, MIN(CASE WHEN Value >= 50 THEN Date ELSE NULL END) OVER (PARTITION BY Item)
FROM @YourTable
OUTPUT:
Item Value
----------- ----------- -----------------------
1 10 2010-05-01 00:00:00.000
1 20 2010-05-01 00:00:00.000
1 30 2010-05-01 00:00:00.000
1 40 2010-05-01 00:00:00.000
1 50 2010-05-01 00:00:00.000
1 80 2010-05-01 00:00:00.000
2 30 2010-06-01 00:00:00.000
2 60 2010-06-01 00:00:00.000
2 70 2010-06-01 00:00:00.000
2 80 2010-06-01 00:00:00.000
2 100 2010-06-01 00:00:00.000
Warning: Null value is eliminated by an aggregate or other SET operation.
(11 row(s) affected)
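The key move in this answer is to put the condition inside the aggregate rather than in WHERE, so no rows are filtered out before the window is computed. A minimal sketch of the same technique using Python's built-in sqlite3 module follows. The table name `data_rows`, the alias `first_date`, and the ISO date strings (reading the question's dates as day.month.year) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_rows (item INTEGER, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO data_rows VALUES (?, ?, ?)",
    [
        (1, 10, "2010-01-01"), (1, 20, "2010-01-02"), (1, 30, "2010-01-03"),
        (1, 40, "2010-01-04"), (1, 50, "2010-01-05"), (1, 80, "2010-01-10"),
        (2, 30, "2010-01-04"), (2, 60, "2010-01-06"), (2, 70, "2010-01-07"),
        (2, 80, "2010-01-08"), (2, 100, "2010-01-09"),
    ],
)

query = """
SELECT item, value,
       -- rows with value < 50 contribute NULL, which MIN ignores
       MIN(CASE WHEN value >= 50 THEN date END) OVER (PARTITION BY item) AS first_date
FROM data_rows
ORDER BY item, value
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

All eleven rows are kept, and each carries the earliest date on which its item reached a value of at least 50.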
