I have a table to which rows are added once a day or once an hour. Each row contains the product ID, price, name, and the time at which the row was added.
CREATE TABLE products
(
    id         integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    product_id integer NOT NULL,
    title      text NOT NULL,
    price      double precision NOT NULL,
    checked_at timestamp with time zone DEFAULT now()
);
The data in the products table looks like this:
id  product_id  title       price  checked_at
-----------------------------------------------------
1   1000        Watermelon  50     2022-07-19 10:00:00
2   2000        Apple       30     2022-07-19 10:00:00
3   3000        Pear        20     2022-07-19 10:00:00
4   1000        Watermelon  100    2022-07-20 10:00:00
5   2000        Apple       50     2022-07-20 10:00:00
6   3000        Pear        35     2022-07-20 10:00:00
7   1000        Watermelon  150    2022-07-21 10:00:00
8   2000        Apple       50     2022-07-21 10:00:00
9   3000        Pear        60     2022-07-21 10:00:00
I need to pass a date range (for example, from 2022-07-19 to 2022-07-21) and get the difference in prices of all unique products, that is, the answer should be like this:
product_id  title       price_difference
----------------------------------------
1000        Watermelon  100
2000        Apple       20
3000        Pear        40
I only figured out the very beginning, where I need to get the ID of all unique products in the table using DISTINCT. Next, I need to find the rows that are closest to the date range. And finally find the difference in the price of each product.
You could use an aggregation approach here, assuming each product has a row on both the first and the last date of the range:
SELECT product_id, title,
       MAX(price) FILTER (WHERE checked_at::date = '2022-07-21') -
       MAX(price) FILTER (WHERE checked_at::date = '2022-07-19') AS price_difference
FROM products
GROUP BY product_id, title
ORDER BY product_id;
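To check the conditional-aggregation idea against the sample rows, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Postgres. That swap is an assumption made only so the example runs anywhere: SQLite has no `::date` cast, so `date(...)` and `CASE` stand in for the cast and the `FILTER` clause, and the helper name `price_differences` is hypothetical.

```python
import sqlite3

# (product_id, title, price, checked_at) rows copied from the question.
ROWS = [
    (1000, "Watermelon", 50, "2022-07-19 10:00:00"),
    (2000, "Apple", 30, "2022-07-19 10:00:00"),
    (3000, "Pear", 20, "2022-07-19 10:00:00"),
    (1000, "Watermelon", 100, "2022-07-20 10:00:00"),
    (2000, "Apple", 50, "2022-07-20 10:00:00"),
    (3000, "Pear", 35, "2022-07-20 10:00:00"),
    (1000, "Watermelon", 150, "2022-07-21 10:00:00"),
    (2000, "Apple", 50, "2022-07-21 10:00:00"),
    (3000, "Pear", 60, "2022-07-21 10:00:00"),
]


def price_differences(rows, first_day, last_day):
    """Return (product_id, title, price on last_day minus price on first_day)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products "
        "(product_id INTEGER, title TEXT, price REAL, checked_at TEXT)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)
    # One aggregate per boundary date; rows on other dates contribute NULL.
    query = """
        SELECT product_id, title,
               MAX(CASE WHEN date(checked_at) = :last_day THEN price END)
             - MAX(CASE WHEN date(checked_at) = :first_day THEN price END)
               AS price_difference
        FROM products
        GROUP BY product_id, title
        ORDER BY product_id
    """
    params = {"first_day": first_day, "last_day": last_day}
    result = conn.execute(query, params).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in price_differences(ROWS, "2022-07-19", "2022-07-21"):
        print(row)
```

If a product has no row on one of the two boundary dates, the corresponding aggregate is NULL and so is the difference.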
I have a table named orders in a Postgres database that looks like this:
customer_id  order_id  order_date  price  product
-------------------------------------------------
1            2         2021-03-05  15     books
1            13        2022-03-07  3      music
1            14        2022-06-15  900    travel
1            11        2021-11-17  25     books
1            16        2022-08-03  32     books
2            4         2021-04-12  4      music
2            7         2021-06-29  9      music
2            20        2022-11-03  8      music
2            22        2022-11-07  575    travel
2            24        2022-11-20  95     food
3            3         2021-03-17  25     books
3            5         2021-06-01  650    travel
3            17        2022-08-17  1200   travel
3            19        2022-10-02  6      music
3            23        2022-11-08  70     food
4            9         2021-08-20  3200   travel
4            10        2021-10-29  2750   travel
4            15        2022-07-15  1820   travel
4            21        2022-11-05  8000   travel
4            25        2022-11-29  27     books
5            1         2021-01-04  3      music
5            6         2021-06-09  820    travel
5            8         2021-07-30  19     books
5            12        2021-12-10  22     music
5            18        2022-09-19  20     books
Here's a SQL Fiddle: http://sqlfiddle.com/#!17/262fc/1
I'd like to return the average money spent by customers per product, but only consider orders within the first 12 months of a given customer's first purchase within the given product group. (yes, this is challenging!)
For example, for customer 1, order ID 2 and order ID 11 would be factored into the average for books (because order ID 11 took place less than 12 months after customer 1's first order for books, which was order ID 2), but order ID 16 would not be factored into the average (because 2022-08-03 is more than 12 months after customer 1's first purchase of books, which took place on 2021-03-05).
Here is a matrix showing which orders would be included within a given product (denoted by "yes"):
The desired output would look as follows:
product  average_spent
----------------------
books    22.20
music    7.83
travel   1530.71
food     82.50
How would I do this?
Thanks in advance for any assistance you can give!
You can use a subquery to check whether or not to include a product's price in the summation:
select o.product, sum(o.price)/count(*) val
from orders o
where o.order_date < (
    -- first order date for this customer within this product group
    select min(o1.order_date)
    from orders o1
    where o1.product = o.product and o1.customer_id = o.customer_id
) + interval '12 months'
group by o.product
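As a way to check the correlated-subquery approach against the desired output, here is a minimal sketch using Python's built-in sqlite3 module rather than Postgres. The swap is an assumption for portability: SQLite has no `interval` type, so `date(..., '+12 months')` stands in for `+ interval '12 months'`, the join uses the `customer_id` column from the question's table, and the helper name `average_first_year_spend` is hypothetical.

```python
import sqlite3

# (customer_id, order_id, order_date, price, product) rows from the question.
ORDERS = [
    (1, 2, "2021-03-05", 15, "books"),
    (1, 13, "2022-03-07", 3, "music"),
    (1, 14, "2022-06-15", 900, "travel"),
    (1, 11, "2021-11-17", 25, "books"),
    (1, 16, "2022-08-03", 32, "books"),
    (2, 4, "2021-04-12", 4, "music"),
    (2, 7, "2021-06-29", 9, "music"),
    (2, 20, "2022-11-03", 8, "music"),
    (2, 22, "2022-11-07", 575, "travel"),
    (2, 24, "2022-11-20", 95, "food"),
    (3, 3, "2021-03-17", 25, "books"),
    (3, 5, "2021-06-01", 650, "travel"),
    (3, 17, "2022-08-17", 1200, "travel"),
    (3, 19, "2022-10-02", 6, "music"),
    (3, 23, "2022-11-08", 70, "food"),
    (4, 9, "2021-08-20", 3200, "travel"),
    (4, 10, "2021-10-29", 2750, "travel"),
    (4, 15, "2022-07-15", 1820, "travel"),
    (4, 21, "2022-11-05", 8000, "travel"),
    (4, 25, "2022-11-29", 27, "books"),
    (5, 1, "2021-01-04", 3, "music"),
    (5, 6, "2021-06-09", 820, "travel"),
    (5, 8, "2021-07-30", 19, "books"),
    (5, 12, "2021-12-10", 22, "music"),
    (5, 18, "2022-09-19", 20, "books"),
]


def average_first_year_spend(orders):
    """Average price per product, counting only orders placed within 12 months
    of each customer's first order in that product group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, order_id INTEGER, "
        "order_date TEXT, price REAL, product TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", orders)
    query = """
        SELECT o.product, AVG(o.price)
        FROM orders o
        WHERE o.order_date < (
            SELECT date(MIN(o1.order_date), '+12 months')
            FROM orders o1
            WHERE o1.product = o.product
              AND o1.customer_id = o.customer_id
        )
        GROUP BY o.product
    """
    averages = {product: round(avg, 2) for product, avg in conn.execute(query)}
    conn.close()
    return averages


if __name__ == "__main__":
    print(average_first_year_spend(ORDERS))
```

The ISO `YYYY-MM-DD` strings compare correctly as text, which is why the date comparison works without a dedicated date type here.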
I have a table in SQL Server with sales price data of items on different dates like this:
Item  Date        Price
-----------------------
1     2021-05-01  200
1     2021-06-11  210
1     2021-06-27  225
1     2021-08-01  250
2     2021-02-10  600
2     2021-04-21  650
2     2021-06-17  675
2     2021-07-23  700
I'm creating a table that specifies the start and end date of prices as below:
Item  DateStart   Price  DateEnd
--------------------------------
1     2021-05-01  200    2021-06-10
1     2021-06-11  210    2021-06-26
1     2021-06-27  225    2021-07-31
1     2021-08-01  250    Today date
2     2021-02-10  600    2021-04-20
2     2021-04-21  650    2021-06-16
2     2021-06-17  675    2021-07-22
2     2021-07-23  700    Today date
As you can see, the end date is one day less than the next price change date. I also have a calendar table called "DimDates" with one row per day. I had hoped to use joins but it doesn't do what I thought it would do. Any suggestions on how to write the query? I'm using SQL Server 2016.
We can use LEAD() here along with DATEADD():
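WITH cte AS (
    SELECT *,
           -- End date is the day before the next price change. The default of
           -- tomorrow makes the latest price end today, as in the expected output.
           DATEADD(day, -1,
                   LEAD(Date, 1, DATEADD(day, 1, CAST(GETDATE() AS date)))
                       OVER (PARTITION BY Item ORDER BY Date)) AS LastDate
    FROM yourTable
)
SELECT Item, Date AS DateStart, Price, LastDate AS DateEnd
FROM cte
ORDER BY Item, Date;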
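The LEAD() idea can be checked against the sample rows with a minimal sketch that uses Python's built-in sqlite3 module instead of SQL Server. That swap is an assumption for portability: SQLite's `date(x, '-1 day')` stands in for `DATEADD(day, -1, x)`, and "today" is passed in as a parameter rather than read from `GETDATE()` so the result is reproducible. The sketch uses the day after `today` as the LEAD() default so that the latest price ends on today's date, as in the expected output. The table name `item_prices`, the column name `price_date`, and the helper `price_periods` are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

# (item, price_date, price) rows copied from the question.
PRICES = [
    (1, "2021-05-01", 200),
    (1, "2021-06-11", 210),
    (1, "2021-06-27", 225),
    (1, "2021-08-01", 250),
    (2, "2021-02-10", 600),
    (2, "2021-04-21", 650),
    (2, "2021-06-17", 675),
    (2, "2021-07-23", 700),
]


def price_periods(prices, today):
    """Return (item, date_start, price, date_end) rows, where date_end is the
    day before the item's next price change, or `today` for the latest price."""
    tomorrow = (today + timedelta(days=1)).isoformat()
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item_prices (item INTEGER, price_date TEXT, price INTEGER)"
    )
    conn.executemany("INSERT INTO item_prices VALUES (?, ?, ?)", prices)
    query = """
        SELECT item,
               price_date AS date_start,
               price,
               date(LEAD(price_date, 1, ?)
                        OVER (PARTITION BY item ORDER BY price_date),
                    '-1 day') AS date_end
        FROM item_prices
        ORDER BY item, price_date
    """
    rows = conn.execute(query, (tomorrow,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in price_periods(PRICES, date.today()):
        print(row)
```

Window functions require SQLite 3.25 or later, which the sqlite3 module in current Python releases normally provides.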
I have the following table
timestamp ID eur
-----------------------
2022-01-01 A 10
2022-01-02 A 20
2022-01-01 B 30
2022-01-02 B 40
2022-01-03 B 50
2022-01-04 B 60
Now I am interested in all previous information for a specific ID. Then I want to do something with this information, let's say calculate the mean. Here is what I am aiming for:
timestamp ID eur sum_all mean_all
------------------------------------------------
2022-01-01 A 10 10 10
2022-01-02 A 20 30 15
2022-01-01 B 30 30 30
2022-01-02 B 40 70 35
2022-01-03 B 50 120 40
2022-01-04 B 60 180 45
This seems so easy but I just can't get my head around how to do this in SQL.
I appreciate any help. Thanks!
You can use the sum and avg window functions:
select *,
       sum(eur) over (partition by ID order by timestamp) as sum_all,
       avg(eur) over (partition by ID order by timestamp) as mean_all
from table_name
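A minimal sketch to check the running sum and running mean against the sample data, using Python's built-in sqlite3 module. The table name `payments`, the column name `ts` (used in place of `timestamp`), and the helper `running_totals` are hypothetical names chosen for illustration.

```python
import sqlite3

# (ts, id, eur) rows copied from the question.
ROWS = [
    ("2022-01-01", "A", 10),
    ("2022-01-02", "A", 20),
    ("2022-01-01", "B", 30),
    ("2022-01-02", "B", 40),
    ("2022-01-03", "B", 50),
    ("2022-01-04", "B", 60),
]


def running_totals(rows):
    """Return (ts, id, eur, sum_all, mean_all) with per-ID running aggregates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (ts TEXT, id TEXT, eur INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    query = """
        SELECT ts, id, eur,
               SUM(eur) OVER (PARTITION BY id ORDER BY ts) AS sum_all,
               AVG(eur) OVER (PARTITION BY id ORDER BY ts) AS mean_all
        FROM payments
        ORDER BY id, ts
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals(ROWS):
        print(row)
```

With an ORDER BY in the window and no explicit frame, the window runs from the start of the partition up to the current row (including any rows that tie on the ordering column), which is what makes the aggregates cumulative.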
In the project I am currently working on at my company, I would like to show sales-related KPIs together with a Customer Score metric, using SQL / Tableau / BigQuery.
The primary key is the order ID in both tables. However, the order date and the date on which we measure the Customer Score may differ. For example, the sales information for an order released in Feb 2020 is aggregated in Feb 2020, but if the customer survey is conducted in March 2020, the Customer Score metric must be aggregated in March 2020. What I would like to achieve in the relational database is as follows:
Sales:
Order ID  Order Date(m/d/yyyy)  Sales ($)
-----------------------------------------
1000      1/1/2021              1000
1001      2/1/2021              2000
1002      3/1/2021              1500
1003      4/1/2021              1700
1004      5/1/2021              1800
1005      6/1/2021              900
1006      7/1/2021              1600
1007      8/1/2021              1900
Customer Score Table:
Order ID  Customer Survey Date(m/d/yyyy)  Customer Score
--------------------------------------------------------
1000      3/1/2021                        8
1001      3/1/2021                        7
1002      4/1/2021                        3
1003      6/1/2021                        6
1004      6/1/2021                        5
1005      7/1/2021                        3
1006      9/1/2021                        1
1007      8/1/2021                        7
Expected Output:
KPI                 Jan-21  Feb-21  Mar-21  Apr-21  May-21  June-21  July-21  Aug-21  Sep-21
--------------------------------------------------------------------------------------------
Sales($)            1000    2000    1500    1700    1800    900      1600     1900
AVG Customer Score                  7.5     3               5.5      3        7       1
I couldn't find a way to do this, because order date and survey date may/may not be the same.
I think what you want to do is aggregate each table to the month (KPI) first and then join on the month, as opposed to joining on the ORDER_ID.
For example:
with order_month as (
    select date_trunc(order_date, MONTH) as KPI, sum(sales) as sales
    from `testing.sales`
    group by 1
),
customer_score_month as (
    select date_trunc(customer_survey_date, MONTH) as KPI, avg(customer_score) as avg_customer_score
    from `testing.customer_score`
    group by 1
)
select coalesce(order_month.KPI, customer_score_month.KPI) as KPI, sales, avg_customer_score
from order_month
full outer join customer_score_month
    on order_month.KPI = customer_score_month.KPI
order by 1 asc
Here, we aggregate the total sales for each month based on the order date, then we aggregate the average customer score for each month based on the date the score was submitted. Now we can join these two on the month value.
This results in a table like this:
KPI         sales  avg_customer_score
-------------------------------------
2021-01-01  1000   null
2021-02-01  2000   null
2021-03-01  1500   7.5
2021-04-01  1700   3.0
2021-05-01  1800   null
2021-06-01  900    5.5
2021-07-01  1600   3.0
2021-08-01  1900   7.0
2021-09-01  null   1.0
You can pivot the results of this table in Tableau, or use a CASE expression to pull each month out into its own column. I can elaborate more if that would be helpful.
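The aggregate-then-join idea can also be sketched outside BigQuery. The following is a minimal illustration in plain Python, using the sample rows from the question: each table is first reduced to one value per month, and the union of the month keys plays the role of the FULL OUTER JOIN. The helper names `month_start` and `monthly_kpis` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# (order_id, order_date, sales) rows from the question.
SALES = [
    (1000, date(2021, 1, 1), 1000), (1001, date(2021, 2, 1), 2000),
    (1002, date(2021, 3, 1), 1500), (1003, date(2021, 4, 1), 1700),
    (1004, date(2021, 5, 1), 1800), (1005, date(2021, 6, 1), 900),
    (1006, date(2021, 7, 1), 1600), (1007, date(2021, 8, 1), 1900),
]

# (order_id, customer_survey_date, customer_score) rows from the question.
SCORES = [
    (1000, date(2021, 3, 1), 8), (1001, date(2021, 3, 1), 7),
    (1002, date(2021, 4, 1), 3), (1003, date(2021, 6, 1), 6),
    (1004, date(2021, 6, 1), 5), (1005, date(2021, 7, 1), 3),
    (1006, date(2021, 9, 1), 1), (1007, date(2021, 8, 1), 7),
]


def month_start(day):
    """Truncate a date to the first day of its month."""
    return day.replace(day=1)


def monthly_kpis(sales, scores):
    """Return (month, total sales, average score) rows, one per month that
    appears in either input; a missing side is reported as None."""
    sales_by_month = defaultdict(int)
    for _order_id, order_date, amount in sales:
        sales_by_month[month_start(order_date)] += amount

    scores_by_month = defaultdict(list)
    for _order_id, survey_date, score in scores:
        scores_by_month[month_start(survey_date)].append(score)

    rows = []
    # The union of both key sets keeps months present on only one side.
    for month in sorted(set(sales_by_month) | set(scores_by_month)):
        month_scores = scores_by_month.get(month)
        avg_score = sum(month_scores) / len(month_scores) if month_scores else None
        rows.append((month.isoformat(), sales_by_month.get(month), avg_score))
    return rows


if __name__ == "__main__":
    for row in monthly_kpis(SALES, SCORES):
        print(row)
```

Note that the order ID is never used as a join key: the two sides only meet on the month.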
Here's my SQL data:
code name total
---------------
3 Sprite 2400
17 Coke 1500
6 Dew 1000
17 Coke 3000
6 Dew 2000
But code and name have duplicated values, and I want to sum total for each duplicated code and name pair.
Something like this:
code name total
---------------
3 Sprite 2400
17 Coke 4500
6 Dew 3000
How could I do that in sql?
-- table_name is a placeholder for your table (TABLE itself is a reserved word)
SELECT code, name, SUM(total) AS total
FROM table_name
GROUP BY code, name;
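A minimal sketch to check the GROUP BY against the sample data, using Python's built-in sqlite3 module. The table name `drinks` and the helper `totals_by_product` are hypothetical, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# (code, name, total) rows copied from the question.
ROWS = [
    (3, "Sprite", 2400),
    (17, "Coke", 1500),
    (6, "Dew", 1000),
    (17, "Coke", 3000),
    (6, "Dew", 2000),
]


def totals_by_product(rows):
    """Sum `total` over rows that share the same code and name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drinks (code INTEGER, name TEXT, total INTEGER)")
    conn.executemany("INSERT INTO drinks VALUES (?, ?, ?)", rows)
    query = """
        SELECT code, name, SUM(total) AS total
        FROM drinks
        GROUP BY code, name
        ORDER BY code
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in totals_by_product(ROWS):
        print(row)
```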