I'm trying to write a query that returns, for each day, the sales amount of that day plus the amount of the previous day. I am doing a test task for an internship and have not done this before.
Source table:
saledate    salesum
2022-01-01  100
2022-01-02  150
2022-01-03  200
2022-01-05  100
Expected result:
saledate    salesum
2022-01-01  100
2022-01-02  250
2022-01-03  350
2022-01-05  300
My query:
SELECT t1.saledate, t1.salesum=t1.salesum+t2.salesum
FROM sales t1
INNER JOIN (
SELECT saledate, salesum FROM sales
) t2
ON t1.saledate=t2.saledate;
My result:
saledate    salesum
2022-01-01  f
2022-01-02  f
2022-01-03  f
2022-01-05  f
-- add the previous row's salesum (0 for the first row) to the current one
select saledate
      ,salesum + coalesce(lag(salesum) over(order by saledate),0) as salesum
from sales
saledate    salesum
2022-01-01  100
2022-01-02  250
2022-01-03  350
2022-01-05  300
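For context on the `f` values in the question: in PostgreSQL, `t1.salesum=t1.salesum+t2.salesum` inside a select list is an equality comparison, not an assignment, so every row evaluates to false. The LAG approach from the answer can be tried locally with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the original database, which is an assumption, and it needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (saledate TEXT, salesum INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2022-01-01", 100), ("2022-01-02", 150),
     ("2022-01-03", 200), ("2022-01-05", 100)],
)

# LAG() reads salesum from the previous row in saledate order;
# COALESCE turns the missing "previous" value of the first row into 0.
query = """
    SELECT saledate,
           salesum + COALESCE(LAG(salesum) OVER (ORDER BY saledate), 0) AS salesum
    FROM sales
    ORDER BY saledate
"""
rows = conn.execute(query).fetchall()
for saledate, salesum in rows:
    print(saledate, salesum)
```

Note that LAG looks at the previous row, not the previous calendar day, which is why 2022-01-05 picks up the value from 2022-01-03.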
I have a pandas data frame called final_data that looks like this
cust_id  start_date  end_date
10001    2022-01-01  2022-01-30
10002    2022-02-01  2022-02-30
10003    2022-01-01  2022-01-30
10004    2022-03-01  2022-03-30
10005    2022-02-01  2022-02-30
I have another table in my SQL database called penalties that looks like this
cust_id  level1_pen  level_2_pen  date
10001    1           4            2022-01-01
10001    1           1            2022-01-02
10001    0           1            2022-01-03
10002    1           1            2022-01-01
10002    5           0            2022-02-01
10002    4           0            2022-02-04
10003    1           6            2022-01-02
I want the final_data frame to look like this where it aggregates the data from the penalties table in SQL database based on the cust_id, start_date and end_date
cust_id  start_date  end_date    total_penalties
10001    2022-01-01  2022-01-30  8
10002    2022-02-01  2022-02-30  9
10003    2022-01-01  2022-01-30  7
How can I apply something like a lambda function to each row of the pandas DataFrame so that it aggregates the data from the SQL query based on that row's cust_id, start_date, and end_date?
Suppose
df = the final_data table
df2 = the penalties table
Then you can get the final_data frame that you want using this query:
SELECT
df.cust_id,
df.start_date,
df.end_date,
SUM(df2.level1_pen + df2.level_2_pen) as total_penalties
FROM
df
LEFT JOIN df2 ON df.cust_id = df2.cust_id
AND df2.date BETWEEN df.start_date AND df.end_date
GROUP BY
df.cust_id,
df.start_date,
df.end_date;
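The join above can be tried end to end with the sketch below. It assumes both tables live in the same database; here an in-memory SQLite database built with Python's sqlite3 module stands in for the real one, and the dates are stored as ISO text so that BETWEEN compares them correctly. In a pandas workflow, final_data would first have to be written to the database (for example with DataFrame.to_sql), which is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE final_data (cust_id INTEGER, start_date TEXT, end_date TEXT)")
conn.execute(
    "CREATE TABLE penalties "
    "(cust_id INTEGER, level1_pen INTEGER, level_2_pen INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO final_data VALUES (?, ?, ?)", [
    (10001, "2022-01-01", "2022-01-30"),
    (10002, "2022-02-01", "2022-02-30"),
    (10003, "2022-01-01", "2022-01-30"),
    (10004, "2022-03-01", "2022-03-30"),
    (10005, "2022-02-01", "2022-02-30"),
])
conn.executemany("INSERT INTO penalties VALUES (?, ?, ?, ?)", [
    (10001, 1, 4, "2022-01-01"), (10001, 1, 1, "2022-01-02"),
    (10001, 0, 1, "2022-01-03"), (10002, 1, 1, "2022-01-01"),
    (10002, 5, 0, "2022-02-01"), (10002, 4, 0, "2022-02-04"),
    (10003, 1, 6, "2022-01-02"),
])

# Same shape as the query above: join on customer and date range, then sum.
query = """
    SELECT f.cust_id, f.start_date, f.end_date,
           SUM(p.level1_pen + p.level_2_pen) AS total_penalties
    FROM final_data f
    LEFT JOIN penalties p
           ON p.cust_id = f.cust_id
          AND p.date BETWEEN f.start_date AND f.end_date
    GROUP BY f.cust_id, f.start_date, f.end_date
    ORDER BY f.cust_id
"""
totals = {row[0]: row[3] for row in conn.execute(query)}
print(totals)
```

Because of the LEFT JOIN, customers without any penalty in their date range (10004 and 10005 here) are kept with a NULL total. An INNER JOIN would drop them, which matches the three-row result shown in the question.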
I have data like this..
month_date location sales
2022-01-01 Asia 150
2022-01-01 Europe 250
2022-02-01 Asia 100
2022-02-01 Europe 100
and breakdown by day_date
day_date location sales
2022-01-01 Asia 12
2022-01-02 Asia 10
2022-01-03 Asia 15
2022-01-04 Asia 19
2022-01-05 Asia 15
2022-01-06 Asia 11
....
2022-01-31 Asia 2
total: Asia 132
But when I compare the monthly sales (150) with the sum of the daily sales (132), there is still a gap of 18.
Is it possible to add extra (for example random) amounts totalling the missing 18 to the daily breakdown, so that it adds up to the monthly figure?
Like below:
day_date location sales
2022-01-01 Asia 13
2022-01-02 Asia 11
2022-01-03 Asia 16
2022-01-04 Asia 20
2022-01-05 Asia 16
2022-01-06 Asia 12
....
2022-01-31 Asia 2
total: Asia 150
You might consider the query below. The difference (18 in this case) is handed out one unit per day, starting from the 1st day of the month.
SELECT day_date, location,
       -- spread diff evenly, then give the first MOD(diff, days) days one extra unit
       sales + DIV(diff, days) + IF(EXTRACT(DAY FROM day_date) <= MOD(diff, days), 1, 0) AS sales
  FROM (
    SELECT d.*,
           m.sales - SUM(d.sales) OVER w AS diff,
           -- days in the month; + 1 because DATE_DIFF returns 30 for a 31-day month
           DATE_DIFF(LAST_DAY(m.month_date), m.month_date, DAY) + 1 AS days
      FROM day_table d
      LEFT JOIN month_table m
        ON DATE_TRUNC(d.day_date, MONTH) = m.month_date AND d.location = m.location
    WINDOW w AS (PARTITION BY DATE_TRUNC(d.day_date, MONTH), d.location)
  );
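The arithmetic behind the query can be illustrated outside SQL. The sketch below is a plain Python rendering of the same idea, assuming one row per day: every day receives the integer quotient of the difference, and the first few days receive one extra unit to absorb the remainder. The function name spread_difference and the five-day sample data are made up for illustration.

```python
def spread_difference(daily_sales, monthly_total):
    """Return adjusted daily values whose sum equals monthly_total."""
    days = len(daily_sales)
    diff = monthly_total - sum(daily_sales)
    # For a non-negative diff this matches DIV(diff, days) and MOD(diff, days).
    base, extra = divmod(diff, days)
    # Every day gets `base`; the first `extra` days get one more unit.
    return [s + base + (1 if i < extra else 0) for i, s in enumerate(daily_sales)]

# Hypothetical five-day "month": the daily values sum to 71, the target is 78.
daily = [12, 10, 15, 19, 15]
adjusted = spread_difference(daily, 78)
print(adjusted, sum(adjusted))
```

The adjusted values always sum to the monthly total when the divisor equals the number of daily rows. If the divisor is one less than the row count, the totals drift as soon as the difference reaches that divisor.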
I have a table like the one below:
ID  Amount  Date
1   500     2022-01-03
1   200     2022-01-04
1   500     2022-01-05
1   340     2022-01-06
1   500     2022-01-25
1   500     2022-01-26
1   567     2022-01-27
1   500     2022-01-28
1   598     2022-01-31
1   500     2022-02-01
1   787     2022-02-02
1   500     2022-02-03
1   5340    2022-02-04
PROBLEM:
I have to calculate the average of the Amount column starting from StartDate = 03/01/2022 (3rd Jan 2022). Each month has its own end date: for January it is the average of Amount from StartDate to 25th Jan, for February from StartDate to 22nd Feb, and so on. The end date is computed with this logic:
SET @Last = (SELECT DATEADD(DAY, CASE DATENAME(WEEKDAY, @Date)
                                   WHEN 'Sunday' THEN -6
                                   WHEN 'Saturday' THEN -5
                                   ELSE -7 END, DATEDIFF(DAY, 0, @Date)))
RETURN @Last
ID  Amount  Date        Last
1   500     2022-01-03  2022-01-25
1   500     2022-01-04  2022-01-25
1   340     2022-01-05  2022-01-25
1   500     2022-01-06  2022-01-25
1   567     2022-01-25  2022-01-25
1   500     2022-01-26  2022-01-25
1   500     2022-01-27  2022-01-25
1   40      2022-01-28  2022-01-25
1   500     2022-01-31  2022-01-25
1   589     2022-02-01  2022-02-22
1   540     2022-02-02  2022-02-22
1   500     2022-02-03  2022-02-22
1   5340    2022-02-04  2022-02-22
The result looks like the table above.
Now, if I calculate Avg(Amount) from 3rd Jan to 25th Jan for January, from 3rd Jan to 22nd Feb for February, and so on, it does not give the correct average: it also includes the amounts of the remaining days. Also, GROUP BY groups by month rather than by the range in the WHERE clause.
SELECT AVG(Amount) FROM Table
WHERE Date BETWEEN @StartDate AND Last
StartDate is fixed at 3rd Jan.
This is not giving the correct average. Is there any other way I could get the required data?
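One possible reading of the requirement is: for every distinct Last value, average all amounts dated between the fixed start date and that cutoff. The sketch below tries that reading on the second table from the question, using Python's sqlite3 module as a stand-in for SQL Server (an assumption). The table name amounts and the column names txn_date and last_date are hypothetical, chosen to avoid reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amounts (id INTEGER, amount INTEGER, txn_date TEXT, last_date TEXT)"
)
jan, feb = "2022-01-25", "2022-02-22"  # per-month cutoffs from the question
conn.executemany("INSERT INTO amounts VALUES (1, ?, ?, ?)", [
    (500, "2022-01-03", jan), (500, "2022-01-04", jan), (340, "2022-01-05", jan),
    (500, "2022-01-06", jan), (567, "2022-01-25", jan), (500, "2022-01-26", jan),
    (500, "2022-01-27", jan), (40, "2022-01-28", jan), (500, "2022-01-31", jan),
    (589, "2022-02-01", feb), (540, "2022-02-02", feb), (500, "2022-02-03", feb),
    (5340, "2022-02-04", feb),
])

# Join every distinct cutoff to all rows dated between the fixed start date
# and that cutoff, then average per cutoff instead of per calendar month.
query = """
    SELECT c.last_date, AVG(a.amount) AS avg_amount
    FROM (SELECT DISTINCT last_date FROM amounts) AS c
    JOIN amounts AS a
      ON a.txn_date BETWEEN :start_date AND c.last_date
    GROUP BY c.last_date
    ORDER BY c.last_date
"""
averages = conn.execute(query, {"start_date": "2022-01-03"}).fetchall()
for last_date, avg_amount in averages:
    print(last_date, round(avg_amount, 2))
```

The key difference from the query in the question is that the cutoff comes from a separate list of distinct Last values, so rows after 25th Jan no longer leak into the January average while still counting towards the February one.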
I have two tables:
Main table
ID      Date        Device
252708  2022-01-01  Phone
252708  2022-01-01  Email
252252  2022-01-02  Phone
252252  2022-01-02  Phone
252252  2022-01-02  Phone
253022  2022-01-06  Phone
253022  2022-01-06  Phone
253228  2022-01-06  Email
253228  2022-01-06  Email
252708  2022-01-06  Phone
256703  2022-01-09  Phone
Date table
Date        Week
2022-01-01  WK 17
2022-01-02  WK 18
2022-01-03  WK 18
2022-01-04  WK 18
2022-01-05  WK 18
2022-01-06  WK 18
2022-01-07  WK 18
2022-01-08  WK 18
2022-01-09  WK 19
2022-01-10  WK 19
2022-01-11  WK 19
I want to merge the IDs into rows, grouping by Wk (using my date table)
ID      Date        Device_1             Wk
252708  2022-01-01  Phone, Email         WK17
252252  2022-01-02  Phone, Phone, Phone  WK18
253022  2022-01-06  Phone, Phone         WK18
253228  2022-01-06  Email, Email         WK18
252708  2022-01-06  Phone                WK18
256703  2022-01-09  Phone                WK19
I know I need the STRING_AGG function to merge the devices into one row, but I'm not sure how to group by week. Thanks in advance.
You can always use FOR XML PATH instead of STRING_AGG. It would be something like this:
SELECT DISTINCT
       dev.ID,
       dev.Date,
       -- STUFF strips the leading ', ' so the list has no dangling separator
       STUFF((
           SELECT ', ' + a.Device
             FROM MainTable a
             JOIN DateTable b ON a.Date = b.Date
            WHERE a.ID = dev.ID AND b.Week = cal.Week
              FOR XML PATH ('')
       ), 1, 2, '') AS Device_1,
       cal.Week
  FROM MainTable dev
  JOIN DateTable cal ON dev.Date = cal.Date
 ORDER BY cal.Week
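Since the question mentions STRING_AGG, the grouping itself can also be sketched with a plain GROUP BY on ID, Date and Week after joining the date table. The sketch below runs on SQLite through Python's sqlite3 module, so GROUP_CONCAT stands in for STRING_AGG here. That is a swap made for illustration, not the SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainTable (ID INTEGER, Date TEXT, Device TEXT)")
conn.execute("CREATE TABLE DateTable (Date TEXT, Week TEXT)")
conn.executemany("INSERT INTO MainTable VALUES (?, ?, ?)", [
    (252708, "2022-01-01", "Phone"), (252708, "2022-01-01", "Email"),
    (252252, "2022-01-02", "Phone"), (252252, "2022-01-02", "Phone"),
    (252252, "2022-01-02", "Phone"), (253022, "2022-01-06", "Phone"),
    (253022, "2022-01-06", "Phone"), (253228, "2022-01-06", "Email"),
    (253228, "2022-01-06", "Email"), (252708, "2022-01-06", "Phone"),
    (256703, "2022-01-09", "Phone"),
])
conn.executemany("INSERT INTO DateTable VALUES (?, ?)", [
    ("2022-01-01", "WK 17"), ("2022-01-02", "WK 18"), ("2022-01-03", "WK 18"),
    ("2022-01-04", "WK 18"), ("2022-01-05", "WK 18"), ("2022-01-06", "WK 18"),
    ("2022-01-07", "WK 18"), ("2022-01-08", "WK 18"), ("2022-01-09", "WK 19"),
    ("2022-01-10", "WK 19"), ("2022-01-11", "WK 19"),
])

# Join each row to its week, then collapse the devices per ID, Date and Week.
# GROUP_CONCAT does not guarantee the order of the concatenated values.
query = """
    SELECT m.ID, m.Date, GROUP_CONCAT(m.Device, ', ') AS Device_1, d.Week
    FROM MainTable m
    JOIN DateTable d ON d.Date = m.Date
    GROUP BY m.ID, m.Date, d.Week
    ORDER BY m.Date, m.ID
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

The week column comes along simply by joining the date table on Date and adding Week to the GROUP BY list, so ID 252708 yields one row for WK 17 and another for WK 18.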
I have the following table in the database:
date account_id currency balanceUSD
01-01-2022 17:17:25 1 USD 1000
01-01-2022 17:17:25 1 EUR 1200
01-01-2022 23:14:34 1 USD 1050
01-01-2022 23:14:34 1 EUR 1350
01-02-2022 15:14:42 1 USD 1040
01-02-2022 15:14:42 1 EUR 1460
01-02-2022 20:17:45 1 USD 1030
01-02-2022 20:17:45 1 EUR 1550
01-01-2022 17:17:25 2 USD 3000
01-01-2022 17:17:25 2 EUR 2300
01-01-2022 23:14:34 2 USD 3200
01-01-2022 23:14:34 2 EUR 1450
01-02-2022 15:14:42 2 USD 3350
01-02-2022 15:14:42 2 EUR 1850
01-02-2022 20:17:45 2 USD 3400
01-02-2022 20:17:45 2 EUR 1900
What I want to do is group by (year, month, day) and account_id and sum the balanceUSD. i.e.
date account_id balanceUSD
01-01-2022 1 4600
01-02-2022 1 5080
01-01-2022 2 9950
01-02-2022 2 10500
How can this be done?
We can use the function date_trunc('day', date) to truncate the timestamp to the start of its day.
SELECT
    date_trunc('day', date) AS "date",
    account_id,
    sum(balanceUSD) AS "balanceUSD"
FROM
    table_name
GROUP BY
    account_id,
    date_trunc('day', date)
ORDER BY
    account_id,
    date_trunc('day', date);
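The grouping can be checked against the sample data with the sketch below. It runs on SQLite through Python's sqlite3 module, which has no date_trunc, so the day is taken as the first ten characters of the stored text with substr. That swap is an assumption made for illustration and works only because the sample timestamps are fixed-width strings. The table name balances is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances "
    "(date TEXT, account_id INTEGER, currency TEXT, balanceUSD INTEGER)"
)
conn.executemany("INSERT INTO balances VALUES (?, ?, ?, ?)", [
    ("01-01-2022 17:17:25", 1, "USD", 1000), ("01-01-2022 17:17:25", 1, "EUR", 1200),
    ("01-01-2022 23:14:34", 1, "USD", 1050), ("01-01-2022 23:14:34", 1, "EUR", 1350),
    ("01-02-2022 15:14:42", 1, "USD", 1040), ("01-02-2022 15:14:42", 1, "EUR", 1460),
    ("01-02-2022 20:17:45", 1, "USD", 1030), ("01-02-2022 20:17:45", 1, "EUR", 1550),
    ("01-01-2022 17:17:25", 2, "USD", 3000), ("01-01-2022 17:17:25", 2, "EUR", 2300),
    ("01-01-2022 23:14:34", 2, "USD", 3200), ("01-01-2022 23:14:34", 2, "EUR", 1450),
    ("01-02-2022 15:14:42", 2, "USD", 3350), ("01-02-2022 15:14:42", 2, "EUR", 1850),
    ("01-02-2022 20:17:45", 2, "USD", 3400), ("01-02-2022 20:17:45", 2, "EUR", 1900),
])

# substr(date, 1, 10) keeps only the day part of the text timestamp,
# playing the role that date_trunc('day', ...) has in PostgreSQL.
query = """
    SELECT substr(date, 1, 10) AS day, account_id, SUM(balanceUSD) AS balanceUSD
    FROM balances
    GROUP BY account_id, substr(date, 1, 10)
    ORDER BY account_id, day
"""
daily_totals = conn.execute(query).fetchall()
for row in daily_totals:
    print(row)
```

Both grouping keys appear in GROUP BY separated by a comma, and the FROM clause names only the table, which are the two points the PostgreSQL version depends on as well.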