I have two tables:
Main table
ID Date Device
252708 2022-01-01 Phone
252708 2022-01-01 Email
252252 2022-01-02 Phone
252252 2022-01-02 Phone
252252 2022-01-02 Phone
253022 2022-01-06 Phone
253022 2022-01-06 Phone
253228 2022-01-06 Email
253228 2022-01-06 Email
252708 2022-01-06 Phone
256703 2022-01-09 Phone
Date table
Date Week
2022-01-01 WK 17
2022-01-02 WK 18
2022-01-03 WK 18
2022-01-04 WK 18
2022-01-05 WK 18
2022-01-06 WK 18
2022-01-07 WK 18
2022-01-08 WK 18
2022-01-09 WK 19
2022-01-10 WK 19
2022-01-11 WK 19
I want to merge the rows for each ID into a single row, grouped by week (using my date table):
ID      Date        Device_1             Wk
252708  2022-01-01  Phone, Email         WK17
252252  2022-01-02  Phone, Phone, Phone  WK18
253022  2022-01-06  Phone, Phone         WK18
253228  2022-01-06  Email, Email         WK18
252708  2022-01-06  Phone                WK18
256703  2022-01-09  Phone                WK19
I know I need the string_agg function to merge the devices into one row, but I'm not sure how to separate them by week. Thanks in advance.
You can always use "FOR XML" instead of STRING_AGG. It would be something like this:
SELECT DISTINCT
    dev.ID
   ,dev.Date
   ,STUFF((
        -- concatenate every device this ID has in the same week
        SELECT ', ' + a.Device
        FROM MainTable a
        JOIN DateTable b ON a.Date = b.Date
        WHERE a.ID = dev.ID AND b.Week = cal.Week
        FOR XML PATH('')
    ), 1, 2, '') AS Device_1   -- STUFF strips the leading ', '
   ,cal.Week
FROM MainTable dev
JOIN DateTable cal ON dev.Date = cal.Date
ORDER BY cal.Week
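To make the grouping logic easier to follow outside of SQL, here is a minimal Python sketch of the same idea, using the sample rows from the question. The function name `merge_devices` and the dictionary `week_of` are hypothetical names chosen for illustration; this is not a replacement for the query, only a way to check what it should return.

```python
from collections import defaultdict

# Sample rows copied from the question: (ID, Date, Device).
main_rows = [
    (252708, "2022-01-01", "Phone"), (252708, "2022-01-01", "Email"),
    (252252, "2022-01-02", "Phone"), (252252, "2022-01-02", "Phone"),
    (252252, "2022-01-02", "Phone"), (253022, "2022-01-06", "Phone"),
    (253022, "2022-01-06", "Phone"), (253228, "2022-01-06", "Email"),
    (253228, "2022-01-06", "Email"), (252708, "2022-01-06", "Phone"),
    (256703, "2022-01-09", "Phone"),
]

# Date table from the question, as a Date -> Week lookup.
week_of = {"2022-01-01": "WK 17"}
week_of.update({f"2022-01-{d:02d}": "WK 18" for d in range(2, 9)})
week_of.update({f"2022-01-{d:02d}": "WK 19" for d in range(9, 12)})


def merge_devices(rows, weeks):
    """Return distinct (ID, Date, devices, Week) rows, ordered by week."""
    # Collect the devices per (ID, week), like the correlated subquery does.
    devices = defaultdict(list)
    for id_, day, device in rows:
        devices[(id_, weeks[day])].append(device)
    # Distinct (ID, Date, Week) combinations, in first-seen order.
    keys = dict.fromkeys((id_, day, weeks[day]) for id_, day, _ in rows)
    merged = [(id_, day, ", ".join(devices[(id_, week)]), week)
              for id_, day, week in keys]
    return sorted(merged, key=lambda row: row[3])


for row in merge_devices(main_rows, week_of):
    print(row)
```

Note that ID 252708 appears twice because its rows fall into two different weeks, which matches the desired output in the question.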
I have a pandas data frame called final_data that looks like this
cust_id start_date end_date
10001 2022-01-01 2022-01-30
10002 2022-02-01 2022-02-30
10003 2022-01-01 2022-01-30
10004 2022-03-01 2022-03-30
10005 2022-02-01 2022-02-30
I have another table in my sql database called penalties that looks like this
cust_id level1_pen level_2_pen date
10001 1 4 2022-01-01
10001 1 1 2022-01-02
10001 0 1 2022-01-03
10002 1 1 2022-01-01
10002 5 0 2022-02-01
10002 4 0 2022-02-04
10003 1 6 2022-01-02
I want the final_data frame to look like this where it aggregates the data from the penalties table in SQL database based on the cust_id, start_date and end_date
cust_id start_date end_date total_penalties
10001 2022-01-01 2022-01-30 8
10002 2022-02-01 2022-02-30 9
10003 2022-01-01 2022-01-30 7
How do I apply a lambda function to each row of the pandas dataframe so that it aggregates the data from the SQL query based on that row's cust_id, start_date, and end_date?
Suppose df is the final_data table and df2 is the penalties table.
You can get the final_data frame that you want using this query:
SELECT
df.cust_id,
df.start_date,
df.end_date,
SUM(df2.level1_pen + df2.level_2_pen) as total_penalties
FROM
df
LEFT JOIN df2 ON df.cust_id = df2.cust_id
AND df2.date BETWEEN df.start_date AND df.end_date
GROUP BY
df.cust_id,
df.start_date,
df.end_date;
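As a quick way to try the query, here is a hedged sketch that loads the sample data from the question into an in-memory SQLite database using Python's standard `sqlite3` module. The table names `final_data` and `penalties` follow the question; SQLite itself and the helper name `total_penalties` are assumptions made for illustration, and the real database may behave differently. Dates are stored as ISO text, which compares correctly with BETWEEN because the strings sort chronologically.

```python
import sqlite3

QUERY = """
SELECT f.cust_id, f.start_date, f.end_date,
       SUM(p.level1_pen + p.level_2_pen) AS total_penalties
FROM final_data AS f
LEFT JOIN penalties AS p
       ON f.cust_id = p.cust_id
      AND p.date BETWEEN f.start_date AND f.end_date
GROUP BY f.cust_id, f.start_date, f.end_date
ORDER BY f.cust_id
"""


def total_penalties():
    """Build the two sample tables in memory and run the join."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE final_data (cust_id INTEGER, start_date TEXT, end_date TEXT)")
    conn.execute(
        "CREATE TABLE penalties "
        "(cust_id INTEGER, level1_pen INTEGER, level_2_pen INTEGER, date TEXT)")
    conn.executemany("INSERT INTO final_data VALUES (?, ?, ?)", [
        (10001, "2022-01-01", "2022-01-30"),
        (10002, "2022-02-01", "2022-02-30"),
        (10003, "2022-01-01", "2022-01-30"),
        (10004, "2022-03-01", "2022-03-30"),
        (10005, "2022-02-01", "2022-02-30"),
    ])
    conn.executemany("INSERT INTO penalties VALUES (?, ?, ?, ?)", [
        (10001, 1, 4, "2022-01-01"), (10001, 1, 1, "2022-01-02"),
        (10001, 0, 1, "2022-01-03"), (10002, 1, 1, "2022-01-01"),
        (10002, 5, 0, "2022-02-01"), (10002, 4, 0, "2022-02-04"),
        (10003, 1, 6, "2022-01-02"),
    ])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in total_penalties():
    print(row)
```

Because of the LEFT JOIN, customers without any penalty in their date range (10004 and 10005 here) still appear, with a NULL total. Switching to an inner join would drop them, which is closer to the three-row result shown in the question.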
I have data like this..
month_date location sales
2022-01-01 Asia 150
2022-01-01 Europe 250
2022-02-01 Asia 100
2022-02-01 Europe 100
and breakdown by day_date
day_date location sales
2022-01-01 Asia 12
2022-01-02 Asia 10
2022-01-03 Asia 15
2022-01-04 Asia 19
2022-01-05 Asia 15
2022-01-06 Asia 11
....
2022-01-31 Asia 2
total: Asia 132
But when I compare the monthly sales (month_date = 150) with the sum of the daily sales (day_date = 132), I am still short by 18.
Is it possible to add the missing 18 as extra amounts spread across the day_date breakdown?
Like below:
day_date location sales
2022-01-01 Asia 13
2022-01-02 Asia 11
2022-01-03 Asia 16
2022-01-04 Asia 20
2022-01-05 Asia 16
2022-01-06 Asia 12
....
2022-01-31 Asia 2
total: Asia 150
You might consider the query below. The difference (18 in this case) is added one unit per day, starting from the 1st day of the month.
SELECT day_date, location,
       sales + DIV(diff, days) + IF(EXTRACT(DAY FROM day_date) <= MOD(diff, days), 1, 0) AS sales
FROM (
  SELECT d.*,
         m.sales - SUM(d.sales) OVER w AS diff,
         -- number of days in the month; assumes day_table has one row per day and location
         EXTRACT(DAY FROM LAST_DAY(d.day_date)) AS days
  FROM day_table d
  LEFT JOIN month_table m
    ON DATE_TRUNC(d.day_date, MONTH) = m.month_date AND d.location = m.location
  WINDOW w AS (PARTITION BY DATE_TRUNC(day_date, MONTH), d.location)
);
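The arithmetic behind the query can be sketched in a few lines of Python: split the difference into an equal share for every day plus a remainder, and give one extra unit to the first days of the month until the remainder is used up. The function name `spread_difference` and the six-day sample are assumptions for illustration only; the question's full 31-day data is not shown, so it is not reproduced here.

```python
def spread_difference(daily_sales, month_total):
    """Adjust daily values so that they add up to month_total.

    Every day gets the same base share of the difference, and the
    first `extra` days get one more unit each.
    """
    days = len(daily_sales)
    diff = month_total - sum(daily_sales)
    base, extra = divmod(diff, days)
    return [sales + base + (1 if index < extra else 0)
            for index, sales in enumerate(daily_sales)]


# Toy example: the first six daily values from the question, with a
# made-up target total of 90 (the daily values sum to 82).
adjusted = spread_difference([12, 10, 15, 19, 15, 11], 90)
print(adjusted, sum(adjusted))
```

Dividing by the true number of days matters once the difference is at least as large as the day count; with a divisor that is one too small, the shares no longer add up to the difference.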
I am trying to create a dates table in SQL based on a set of inputs, but I haven't been able to figure it out.
I receive the following inputs in SQL:
This table:
Date Value
2022-01-01 5
2022-07-12 10
2022-11-15 3
A Start Date = 2022-01-01
A stop Date = 2022-12-01
I need to get a table like the one below, running from Start Date to Stop Date, assigning to each date in that period the corresponding value from the initial table:
Date Value
2022-01-01 5
2022-01-02 5
2022-01-03 5
2022-01-04 5
...
2022-07-09 5
2022-07-10 5
2022-07-11 5
2022-07-12 10
2022-07-13 10
2022-07-14 10
...
2022-11-13 10
2022-11-14 10
2022-11-15 3
2022-11-16 3
2022-11-17 3
2022-11-18 3
How can I do that?
Thanks.
Using the window function lead() over() in concert with an ad-hoc tally table
Example
Select Date = dateadd(DAY,N,A.Date)
,A.Value
From (
Select *
,nDays = datediff(DAY,Date,lead(Date,1,dateadd(day,1,'2022-12-01')) over (order by date))
From YourTable
) A
Join ( Select Top 1000 N=-1+Row_Number() Over (Order By (Select NULL)) From master..spt_values n1, master..spt_values n2 ) B
on N<NDays
Order by Date
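For readers who want to see the expansion step by step, here is a small Python sketch of the same approach: each value stays in effect from its own date until the day before the next change, and the last value runs through the stop date. The function name `expand_values` is hypothetical, and the sketch assumes the first row's date is the start date, as the query above does.

```python
from datetime import date, timedelta


def expand_values(changes, stop):
    """Expand (date, value) change points into one row per day.

    `changes` must be sorted by date; `stop` is inclusive.
    """
    rows = []
    one_day = timedelta(days=1)
    for index, (start, value) in enumerate(changes):
        # The next change date plays the role of lead(); the last
        # segment ends the day after the stop date.
        if index + 1 < len(changes):
            end = changes[index + 1][0]
        else:
            end = stop + one_day
        day = start
        while day < end:
            rows.append((day, value))
            day += one_day
    return rows


changes = [(date(2022, 1, 1), 5), (date(2022, 7, 12), 10), (date(2022, 11, 15), 3)]
rows = expand_values(changes, date(2022, 12, 1))
print(rows[0], rows[-1], len(rows))
```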
Results
Date Value
2022-01-01 5
2022-01-02 5
2022-01-03 5
2022-01-04 5
2022-01-05 5
...
2022-07-10 5
2022-07-11 5
2022-07-12 10
2022-07-13 10
2022-07-14 10
...
2022-11-12 10
2022-11-13 10
2022-11-14 10
2022-11-15 3
2022-11-16 3
2022-11-17 3
...
2022-11-30 3
2022-12-01 3
I'm trying to write a query that returns each day's sales plus the sales of the previous day. I am doing a test task for an internship and have not done this before.
Source table:
saledate salesum
2022-01-01 100
2022-01-02 150
2022-01-03 200
2022-01-05 100
Estimated result:
saledate salesum
2022-01-01 100
2022-01-02 250
2022-01-03 350
2022-01-05 300
My query:
SELECT t1.saledate, t1.salesum=t1.salesum+t2.salesum
FROM sales t1
INNER JOIN (
SELECT saledate, salesum FROM sales
) t2
ON t1.saledate=t2.saledate;
My result:
saledate salesum
2022-01-01 f
2022-01-02 f
2022-01-03 f
2022-01-05 f
select saledate
,salesum + coalesce(lag(salesum) over(order by saledate),0) as salesum
from t
saledate salesum
2022-01-01 100
2022-01-02 250
2022-01-03 350
2022-01-05 300
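What lag() does here can be sketched in plain Python: walk the rows in date order and add the previous row's original value, using 0 for the first row (the job of coalesce). The function name `add_previous_row` is a hypothetical name for illustration. Note that "previous" means the previous row, not the previous calendar day, which is why 2022-01-05 picks up the value from 2022-01-03.

```python
def add_previous_row(rows):
    """Return (saledate, salesum + previous row's salesum) in date order."""
    result = []
    previous = 0  # no earlier row for the first date, so add 0
    for saledate, salesum in sorted(rows):
        result.append((saledate, salesum + previous))
        previous = salesum
    return result


sales = [
    ("2022-01-01", 100),
    ("2022-01-02", 150),
    ("2022-01-03", 200),
    ("2022-01-05", 100),
]
for row in add_previous_row(sales):
    print(row)
```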
I am trying to do a case statement within the where clause in snowflake but I’m not quite sure how should I go about doing it.
What I’m trying to do is, if my current month is Jan, then the where clause for date is between start of previous year and today. If not, the where clause for date would be between start of current year and today.
WHERE
CASE MONTH(CURRENT_DATE()) = 1 THEN DATE BETWEEN DATE_TRUNC('YEAR', DATEADD(YEAR, -1, CURRENT_DATE())) AND CURRENT_DATE()
CASE MONTH(CURRENT_DATE()) != 1 THEN DATE BETWEEN DATE_TRUNC('YEAR', CURRENT_DATE()) AND CURRENT_DATE()
END
Appreciate any help on this!
Use a CASE expression that returns -1 if the current month is January or 0 for any other month, so that you can get with DATEADD() a date of the previous or the current year to use in DATE_TRUNC():
WHERE DATE BETWEEN
DATE_TRUNC('YEAR', DATEADD(YEAR, CASE WHEN MONTH(CURRENT_DATE()) = 1 THEN -1 ELSE 0 END, CURRENT_DATE()))
AND
CURRENT_DATE()
I suspect that you don't even need to use CASE here:
WHERE
(MONTH(CURRENT_DATE()) = 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', DATEADD(YEAR, -1, CURRENT_DATE())) AND
CURRENT_DATE()) OR
(MONTH(CURRENT_DATE()) != 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', CURRENT_DATE()) AND CURRENT_DATE())
The other answers are quite good, but the answer can be even simpler.
Here is a little table to break down what is happening:
select
row_number() over (order by null) - 1 as rn,
dateadd('day', rn * 5, date_trunc('year',current_date())) as pretend_current_date,
DATEADD(YEAR, -1, pretend_current_date) as pcd_sub1,
month(pretend_current_date) as pcd_month,
DATE_TRUNC(year, iff(pcd_month = 1, pcd_sub1, pretend_current_date)) as _from,
pretend_current_date as _to
from table(generator(ROWCOUNT => 30))
order by rn;
this shows:
RN PRETEND_CURRENT_DATE PCD_SUB1 PCD_MONTH _FROM _TO
0 2022-01-01 2021-01-01 1 2021-01-01 2022-01-01
1 2022-01-06 2021-01-06 1 2021-01-01 2022-01-06
2 2022-01-11 2021-01-11 1 2021-01-01 2022-01-11
3 2022-01-16 2021-01-16 1 2021-01-01 2022-01-16
4 2022-01-21 2021-01-21 1 2021-01-01 2022-01-21
5 2022-01-26 2021-01-26 1 2021-01-01 2022-01-26
6 2022-01-31 2021-01-31 1 2021-01-01 2022-01-31
7 2022-02-05 2021-02-05 2 2022-01-01 2022-02-05
8 2022-02-10 2021-02-10 2 2022-01-01 2022-02-10
9 2022-02-15 2021-02-15 2 2022-01-01 2022-02-15
10 2022-02-20 2021-02-20 2 2022-01-01 2022-02-20
11 2022-02-25 2021-02-25 2 2022-01-01 2022-02-25
12 2022-03-02 2021-03-02 3 2022-01-01 2022-03-02
13 2022-03-07 2021-03-07 3 2022-01-01 2022-03-07
14 2022-03-12 2021-03-12 3 2022-01-01 2022-03-12
15 2022-03-17 2021-03-17 3 2022-01-01 2022-03-17
16 2022-03-22 2021-03-22 3 2022-01-01 2022-03-22
17 2022-03-27 2021-03-27 3 2022-01-01 2022-03-27
18 2022-04-01 2021-04-01 4 2022-01-01 2022-04-01
19 2022-04-06 2021-04-06 4 2022-01-01 2022-04-06
20 2022-04-11 2021-04-11 4 2022-01-01 2022-04-11
21 2022-04-16 2021-04-16 4 2022-01-01 2022-04-16
22 2022-04-21 2021-04-21 4 2022-01-01 2022-04-21
23 2022-04-26 2021-04-26 4 2022-01-01 2022-04-26
24 2022-05-01 2021-05-01 5 2022-01-01 2022-05-01
25 2022-05-06 2021-05-06 5 2022-01-01 2022-05-06
26 2022-05-11 2021-05-11 5 2022-01-01 2022-05-11
27 2022-05-16 2021-05-16 5 2022-01-01 2022-05-16
28 2022-05-21 2021-05-21 5 2022-01-01 2022-05-21
29 2022-05-26 2021-05-26 5 2022-01-01 2022-05-26
Your logic asks "is the current date in January?" If so, take the prior year and truncate it to the start of the year; otherwise, truncate the current date to the start of the year. The result is the lower bound of the BETWEEN test.
This is the same as subtracting one month from the current date and truncating the result to the year.
Thus there is no need for any IFF or CASE:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE())) AND CURRENT_DATE()
And if you like to drop some parentheses, CURRENT_DATE can be used without them if you leave it in upper case, so the clause can be even shorter:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE)) AND CURRENT_DATE
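The claim that "subtract one month, then truncate to the year" matches the January rule can be checked by brute force. The Python sketch below compares both rules for every day in a range of years. All function names are hypothetical, and `add_months` clamps the day to the end of the target month, which is an assumption about how DATEADD behaves; the clamping does not affect the year, so it does not change the outcome of the check.

```python
import calendar
from datetime import date, timedelta


def add_months(day, months):
    """Shift a date by whole months, clamping the day to the month's end."""
    index = day.year * 12 + (day.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def start_with_case(today):
    """January -> start of the previous year, otherwise start of this year."""
    year = today.year - 1 if today.month == 1 else today.year
    return date(year, 1, 1)


def start_with_month_shift(today):
    """Start of the year that contains (today minus one month)."""
    return date(add_months(today, -1).year, 1, 1)


def first_mismatch(first, last):
    """Return the first day on which the two rules disagree, or None."""
    day = first
    while day <= last:
        if start_with_case(day) != start_with_month_shift(day):
            return day
        day += timedelta(days=1)
    return None


print(first_mismatch(date(2020, 1, 1), date(2024, 12, 31)))
```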