I have two tables
customers
+---------+-------+
| cust_id | name |
+---------+-------+
| 1 | Tom |
+---------+-------+
| 2 | John |
+---------+-------+
| 3 | Lisa |
+---------+-------+
| 4 | Wendy |
+---------+-------+
purchases
+---------------+-------------+---------+
| purchase_date | purchase_id | cust_id |
+---------------+-------------+---------+
| 2021-01-01 | 1 | 1 |
+---------------+-------------+---------+
| 2021-01-01 | 2 | 1 |
+---------------+-------------+---------+
| 2021-01-01 | 3 | 2 |
+---------------+-------------+---------+
| 2021-01-01 | 4 | 1 |
+---------------+-------------+---------+
| 2021-01-01 | 5 | 4 |
+---------------+-------------+---------+
| 2021-01-02 | 6 | 3 |
+---------------+-------------+---------+
| 2021-01-02 | 7 | 3 |
+---------------+-------------+---------+
| 2021-01-02 | 8 | 2 |
+---------------+-------------+---------+
| 2021-01-02 | 9 | 1 |
+---------------+-------------+---------+
| 2021-01-02 | 10 | 4 |
+---------------+-------------+---------+
| 2021-01-03 | 11 | 2 |
+---------------+-------------+---------+
| 2021-01-03 | 12 | 2 |
+---------------+-------------+---------+
| 2021-01-03 | 13 | 3 |
+---------------+-------------+---------+
| 2021-01-03 | 14 | 3 |
+---------------+-------------+---------+
I want to query the count of unique purchasing customers by date (easy) and the cust_id of the customer who made the most purchases by date. If more than one customer made the same number of purchases on the same date, I want to show the lesser cust_id. The results should look like this:
+---------------+------------------+-----------------+
| purchase_date | unique_customers | biggest_spender |
+---------------+------------------+-----------------+
| 2021-01-01 | 3 | 1 |
+---------------+------------------+-----------------+
| 2021-01-02 | 4 | 3 |
+---------------+------------------+-----------------+
| 2021-01-03 | 2 | 2 |
+---------------+------------------+-----------------+
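Here is a query for PostgreSQL. It uses mode() to determine the biggest spender, i.e. the most frequent cust_id for each date in your purchases table. According to the PostgreSQL documentation, mode() takes the first value when several are equally frequent, so with ORDER BY p.cust_id a tie should resolve to the lowest cust_id. The result is sorted by date to match the expected output:
SELECT p.purchase_date,
       count(DISTINCT p.cust_id) AS unique_customers,
       mode() WITHIN GROUP (ORDER BY p.cust_id) AS biggest_spender
FROM purchases p
GROUP BY p.purchase_date
ORDER BY p.purchase_date;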
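For engines that have no mode(), the same result can probably be reached by counting purchases per customer per date and then ranking with ROW_NUMBER(). The following is only a sketch, run here through Python's sqlite3 module (an assumption made so the example is self-contained; it needs a SQLite build with window functions, 3.25 or later). The CTE names per_customer and ranked are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (purchase_date TEXT, purchase_id INTEGER, cust_id INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [("2021-01-01", 1, 1), ("2021-01-01", 2, 1), ("2021-01-01", 3, 2),
     ("2021-01-01", 4, 1), ("2021-01-01", 5, 4), ("2021-01-02", 6, 3),
     ("2021-01-02", 7, 3), ("2021-01-02", 8, 2), ("2021-01-02", 9, 1),
     ("2021-01-02", 10, 4), ("2021-01-03", 11, 2), ("2021-01-03", 12, 2),
     ("2021-01-03", 13, 3), ("2021-01-03", 14, 3)],
)

query = """
WITH per_customer AS (          -- one row per (date, customer)
    SELECT purchase_date, cust_id, COUNT(*) AS n_purchases
    FROM purchases
    GROUP BY purchase_date, cust_id
),
ranked AS (
    SELECT purchase_date, cust_id,
           COUNT(*) OVER (PARTITION BY purchase_date) AS unique_customers,
           ROW_NUMBER() OVER (PARTITION BY purchase_date
                              ORDER BY n_purchases DESC, cust_id ASC) AS rn
    FROM per_customer
)
SELECT purchase_date, unique_customers, cust_id AS biggest_spender
FROM ranked
WHERE rn = 1                    -- most purchases, lowest cust_id on ties
ORDER BY purchase_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The tie-break is explicit in the ORDER BY of ROW_NUMBER(), so it does not depend on how any particular aggregate resolves ties.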
Interesting question for you all. Here's a sample of my dataset (see below). I have warehouses, dates, and the change in inventory level at that specific date for a given warehouse.
Ex: Assuming 1/1/2018 is first date, warehouse 1 starts out with 100 in inventory, then 600, then 300, then 500...etc.
My question I'd like to answer in SQL: By warehouse ID, did each warehouse ever have inventory of more than 750 (yes/no)?
I can't sum the entire column, because the ending inventory (sum of column by warehouse) is likely lower than a past inventory level. Any help is appreciated!!
+--------------+------------+---------------+
| Warehouse_id | Date | Inventory_Amt |
+--------------+------------+---------------+
| 1 | 1/1/2018 | +100 |
| 1 | 6/1/2018 | +500 |
| 1 | 6/15/2018 | -300 |
| 1 | 7/1/2018 | +200 |
| 1 | 8/1/2018 | -400 |
| 1 | 12/15/2018 | +100 |
| 2 | 1/1/2018 | +10 |
| 2 | 6/1/2018 | +50 |
| 2 | 6/15/2018 | -30 |
| 2 | 7/1/2018 | +20 |
| 2 | 8/1/2018 | -40 |
| 2 | 12/15/2018 | +10 |
| 3 | 1/1/2018 | +100 |
| 3 | 6/1/2018 | +500 |
| 4 | 6/15/2018 | +300 |
| 4 | 7/1/2018 | +200 |
| 4 | 8/1/2018 | -400 |
| 4 | 12/15/2018 | +100 |
+--------------+------------+---------------+
You want a cumulative sum, and then a filter on that running total:
select i.*
from (select i.*,
             sum(inventory_amt) over (partition by warehouse_id order by date) as inventory
      from inventory i
     ) i
where inventory > 750;
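Since the question asks for a yes/no per warehouse, one possible follow-up is to take the maximum of the running total per warehouse and compare it with the threshold. This is a sketch only, run through Python's sqlite3 module for convenience. It assumes the dates are stored in ISO format (YYYY-MM-DD), because text like 12/15/2018 does not sort chronologically; the helper name ever_above and the column name running_level are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (warehouse_id INTEGER, date TEXT, inventory_amt INTEGER)"
)
# Sample rows from the question, with dates rewritten as ISO strings (assumption).
dates = ["2018-01-01", "2018-06-01", "2018-06-15", "2018-07-01", "2018-08-01", "2018-12-15"]
changes = {
    1: list(zip(dates, [100, 500, -300, 200, -400, 100])),
    2: list(zip(dates, [10, 50, -30, 20, -40, 10])),
    3: list(zip(dates[:2], [100, 500])),
    4: list(zip(dates[2:], [300, 200, -400, 100])),
}
for warehouse_id, rows_for_warehouse in changes.items():
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [(warehouse_id, d, amt) for d, amt in rows_for_warehouse],
    )


def ever_above(threshold):
    """Return (warehouse_id, 'yes'/'no') for whether the running level ever exceeded threshold."""
    query = """
    SELECT warehouse_id,
           CASE WHEN MAX(running_level) > ? THEN 'yes' ELSE 'no' END AS ever_above
    FROM (SELECT warehouse_id,
                 SUM(inventory_amt) OVER (PARTITION BY warehouse_id
                                          ORDER BY date) AS running_level
          FROM inventory)
    GROUP BY warehouse_id
    ORDER BY warehouse_id
    """
    return conn.execute(query, (threshold,)).fetchall()


print(ever_above(750))
print(ever_above(550))
```

With the sample rows, no warehouse's running level goes above 750 (the highest is 600), so a lower threshold such as 550 is more useful for checking the logic.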
I'm trying to provide rolled up summaries of the following data including only the group in question as well as excluding the group. I think this can be done with a window function, but I'm having problems with getting the syntax down (in my case Hive SQL).
I want the following data to be aggregated
+------------+---------+--------+
| date | product | rating |
+------------+---------+--------+
| 2018-01-01 | A | 1 |
| 2018-01-02 | A | 3 |
| 2018-01-20 | A | 4 |
| 2018-01-27 | A | 5 |
| 2018-01-29 | A | 4 |
| 2018-02-01 | A | 5 |
| 2017-01-09 | B | NULL |
| 2017-01-12 | B | 3 |
| 2017-01-15 | B | 4 |
| 2017-01-28 | B | 4 |
| 2017-07-21 | B | 2 |
| 2017-09-21 | B | 5 |
| 2017-09-13 | C | 3 |
| 2017-09-14 | C | 4 |
| 2017-09-15 | C | 5 |
| 2017-09-16 | C | 5 |
| 2018-04-01 | C | 2 |
| 2018-01-13 | D | 1 |
| 2018-01-14 | D | 2 |
| 2018-01-24 | D | 3 |
| 2018-01-31 | D | 4 |
+------------+---------+--------+
Aggregated results:
+------+-------+---------+----+------------+------------------+----------+
| year | month | product | ct | avg_rating | avg_rating_other | other_ct |
+------+-------+---------+----+------------+------------------+----------+
| 2018 | 1 | A | 5 | 3.4 | 2.5 | 4 |
| 2018 | 2 | A | 1 | 5 | NULL | 0 |
| 2017 | 1 | B | 4 | 3.6666667 | NULL | 0 |
| 2017 | 7 | B | 1 | 2 | NULL | 0 |
| 2017 | 9 | B | 1 | 5 | 4.25 | 4 |
| 2017 | 9 | C | 4 | 4.25 | 5 | 1 |
| 2018 | 4 | C | 1 | 2 | NULL | 0 |
| 2018 | 1 | D | 4 | 2.5 | 3.4 | 5 |
+------+-------+---------+----+------------+------------------+----------+
I've also considered producing two aggregates, one with the product in question and one without, but I'm having trouble creating the appropriate join key.
You can do:
select year(date), month(date), product,
       count(*) as ct, avg(rating) as avg_rating,
       -- (month total - this product's total) / (month rated rows - this product's rated rows)
       ((sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
        (sum(count(rating)) over (partition by year(date), month(date)) - count(rating))
       ) as avg_rating_other,
       sum(count(*)) over (partition by year(date), month(date)) - count(*) as other_ct
from t
group by year(date), month(date), product;
The rating for the "other" products is a bit tricky. You need to add everything up for the month and subtract out the current product's totals, then calculate the average as that sum divided by the matching count. The denominator uses count(rating) rather than count(*) so that NULL ratings are ignored, the same way avg() ignores them.
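To check the "month total minus own total" idea against the sample data, here is a small sketch run through Python's sqlite3 module rather than Hive (an assumption made for the sake of a self-contained example). It aggregates per product first in a CTE and applies the window sums afterwards; the names per_group, rated and rating_sum are made up for illustration, and strftime() stands in for Hive's year() and month().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, product TEXT, rating INTEGER)")
# Sample rows copied from the question (None is the NULL rating).
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("2018-01-01", "A", 1), ("2018-01-02", "A", 3), ("2018-01-20", "A", 4),
    ("2018-01-27", "A", 5), ("2018-01-29", "A", 4), ("2018-02-01", "A", 5),
    ("2017-01-09", "B", None), ("2017-01-12", "B", 3), ("2017-01-15", "B", 4),
    ("2017-01-28", "B", 4), ("2017-07-21", "B", 2), ("2017-09-21", "B", 5),
    ("2017-09-13", "C", 3), ("2017-09-14", "C", 4), ("2017-09-15", "C", 5),
    ("2017-09-16", "C", 5), ("2018-04-01", "C", 2), ("2018-01-13", "D", 1),
    ("2018-01-14", "D", 2), ("2018-01-24", "D", 3), ("2018-01-31", "D", 4),
])

query = """
WITH per_group AS (
    SELECT CAST(strftime('%Y', date) AS INTEGER) AS yr,
           CAST(strftime('%m', date) AS INTEGER) AS mon,
           product,
           COUNT(*)      AS ct,          -- all rows, NULL ratings included
           COUNT(rating) AS rated,       -- rows with a rating
           SUM(rating)   AS rating_sum
    FROM t
    GROUP BY yr, mon, product
)
SELECT yr, mon, product, ct,
       1.0 * rating_sum / rated AS avg_rating,
       -- month total minus own total; SQLite yields NULL on division by zero
       1.0 * (SUM(rating_sum) OVER (PARTITION BY yr, mon) - rating_sum)
           / (SUM(rated) OVER (PARTITION BY yr, mon) - rated) AS avg_rating_other,
       SUM(ct) OVER (PARTITION BY yr, mon) - ct AS other_ct
FROM per_group
ORDER BY product, yr, mon
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Splitting the aggregation and the window step into two levels avoids nesting an aggregate inside a window function, which not every engine accepts.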
First of all, please excuse my poor English.
I have two tables:
master table:
+--------------+------------------+-------------------+
| product id | pr_name | remain_Qty |
+--------------+------------------+-------------------+
| 1 | x | 13 |
| 2 | y | 18 |
| 3 | z | 21 |
+--------------+------------------+-------------------+
Detail Table (This table contain detail data of bought product):
+--------------+------------------+----------+--------+
| date | pr_id | Qty |price |
+--------------+------------------+----------+--------+
| 2010-01-01 | 1 | 3 | 1000 |
| 2010-01-02 | 1 | 5 | 1200 |
| 2010-01-01 | 2 | 11 | 1100 |
| 2010-01-03 | 1 | 4 | 1400 |
| 2010-01-04 | 3 | 3 | 1300 |
| 2010-01-01 | 2 | 6 | 1600 |
| 2010-01-03 | 1 | 7 | 1700 |
| 2010-01-02 | 3 | 3 | 1300 |
| 2010-01-01 | 3 | 5 | 1500 |
| 2010-01-04 | 3 | 7 | 1700 |
| 2010-01-06 | 2 | 8 | 1800 |
| 2010-01-07 | 2 | 4 | 1400 |
| 2010-01-03 | 1 | 3 | 1300 |
| 2010-01-04 | 3 | 6 | 1600 |
| 2010-01-08 | 1 | 1 | 1100 |
+--------------+------------------+----------+--------+
sum Qty of product 1 = 23
sum Qty of product 2 = 29
sum Qty of product 3 = 24
As a result, I want a list of rows from the Detail table, sorted by pr_id, date and price, in which the sum(Qty) per pr_id does not exceed the remain_Qty of that product in the Master table.
For example:
+--------------+------------------+----------+--------+
| date | pr_id | Qty |price |
+--------------+------------------+----------+--------+
| 2010-01-01 | 1 | 3 | 1000 |
| 2010-01-02 | 1 | 5 | 1200 |
| 2010-01-03 | 1 | 4 | 1400 |
| 2010-01-03 | 1 | 1 | 1700 |
| 2010-01-01 | 2 | 11 | 1100 |
| 2010-01-01 | 2 | 6 | 1600 |
| 2010-01-01 | 3 | 5 | 1500 |
| 2010-01-02 | 3 | 3 | 1300 |
| 2010-01-04 | 3 | 3 | 1300 |
| 2010-01-04 | 3 | 7 | 1700 |
+--------------+------------------+----------+--------+
More of a clarification than a direct SQL answer. What it LOOKS like they want is an inventory being depleted to fill orders from the known available quantity, but even the expected output falls short of that, as it may be missing a second Qty of 3 on 2010-01-03 for product 1. Looking at just pr_id = 1 from the sample data, that would show:
| date | pr_id | Qty |price | Qty Available to fill order
+--------------+--------+-----+-------+
| 2010-01-01 | 1 | 3 | 1000 | 13 - 3 = 10 avail next order
| 2010-01-02 | 1 | 5 | 1200 | 10 - 5 = 5 avail next order
| 2010-01-03 | 1 | 3 | 1300 | 5 - 3 = 2 avail next order
| 2010-01-03 | 1 | 4 | 1400 | only 2 to PARTIALLY fill this order
| 2010-01-03 | 1 | 7 | 1700 | none available
| 2010-01-08 | 1 | 1 | 1100 | none available
With the extra sample record removed, the result would be:
| date | pr_id | Qty |price | Qty Available to fill order
+--------------+--------+-----+-------+
| 2010-01-01 | 1 | 3 | 1000 | 13 - 3 = 10 avail next order
| 2010-01-02 | 1 | 5 | 1200 | 10 - 5 = 5 avail next order
| 2010-01-03 | 1 | 4 | 1400 | 5 - 4 = 1 avail for next order
| 2010-01-03 | 1 | 7 | 1700 | only 1 of the 7 available
| 2010-01-08 | 1 | 1 | 1100 | no more available...
So Aliasghar, does this better represent what you are trying to do? Fill the orders in the sequence they were entered into the system, fill as many as possible from inventory, and stop there?
Please confirm by adding a comment to this answer and maybe we can help resolve it. Also, confirm WHICH database you are using: SQL Server, Oracle, MySQL, etc.
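If the depletion reading above is the right one, the "available for the next order" arithmetic can be sketched with a running total instead of row-by-row logic. This is only an illustration, run through Python's sqlite3 module (an assumption; any engine with window functions should allow something similar). It uses the product 1 rows only, orders them by date and then price, and the names running, used_before and filled are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table (product_id INTEGER, pr_name TEXT, remain_qty INTEGER);
CREATE TABLE detail_table (pr_date TEXT, pr_id INTEGER, qty INTEGER, price INTEGER);
INSERT INTO master_table VALUES (1, 'x', 13);
-- Product 1 rows from the question's Detail table.
INSERT INTO detail_table VALUES
  ('2010-01-01', 1, 3, 1000), ('2010-01-02', 1, 5, 1200),
  ('2010-01-03', 1, 4, 1400), ('2010-01-03', 1, 7, 1700),
  ('2010-01-03', 1, 3, 1300), ('2010-01-08', 1, 1, 1100);
""")

query = """
WITH running AS (
    SELECT d.*,
           -- quantity already consumed by earlier orders of the same product
           SUM(qty) OVER (PARTITION BY pr_id
                          ORDER BY pr_date, price
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) - qty
               AS used_before
    FROM detail_table d
)
SELECT r.pr_date, r.pr_id, r.qty, r.price,
       -- fill what is left, never more than ordered, never below zero
       MAX(0, MIN(r.qty, m.remain_qty - r.used_before)) AS filled
FROM running r
JOIN master_table m ON m.product_id = r.pr_id
ORDER BY r.pr_id, r.pr_date, r.price
"""
allocation = conn.execute(query).fetchall()
for row in allocation:
    print(row)
```

The filled column reproduces the first table above: 3, 5 and 3 are filled completely, the order for 4 gets the remaining 2, and the last two orders get nothing.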
Here is a working query for pr_id = 1. I used MySQL:
select final.pr_date, final.pr_id, count(final.t_qty) as qty, final.price
from (select *
      from (select q.pr_date, q.pr_id, q.t_qty, q.price,
                   @t := @t + q.t_qty as total          -- running count of units
            from (select d.pr_date, d.pr_id, 1 as t_qty, d.price
                  from detail_table d
                  join generator_4k i on i.n between 1 and d.qty   -- one row per unit bought
                  where d.pr_id = 1
                  order by d.id, d.pr_date) q
            cross join (select @t := 0) init) c
      where c.total <= (select remain_qty from master_table where product_id = 1)) final
group by final.pr_date, final.pr_id, final.price;
Here SQL FIDDLE
You have to adapt your detail_table by adding a technical id as the primary key, and create some views. I renamed the date column to pr_date. You'll find the schema on the SQL Fiddle.
Here is another query, using SQL Server:
select final.pr_date, final.pr_id, count(final.t_qty) as qty, final.price
from (select top ((select remain_qty from master_table where product_id = 1))
             d.pr_date, d.pr_id, 1 as t_qty, d.price
      from detail_table d
      join generator_4k i on i.n between 1 and d.qty
      where d.pr_id = 1
      order by d.id, d.pr_date) final
group by final.pr_date, final.pr_id, final.price;
Here SQL FIDDLE
Day-wise purchase info per product.
Below is a suggested statement:
select t2.date, t2.pr_id, t1.pr_name, sum(t2.qty) as qty_buy, sum(t2.price) as amount
from master_table as t1
inner join detail_table as t2 on t1.product_id = t2.pr_id
group by t2.date, t2.pr_id, t1.pr_name
order by t2.date, t2.pr_id;
I had a hard time understanding what you really wanted.
If I understood correctly, you want the detail rows for a product, but without going over its remaining quantity.
I haven't yet managed to handle the first row that goes over the limit by taking only the remainder from it.
So for now my query simply stops once it reaches the allowed remaining quantity.
SQL FIDDLE
To be able to do what you want, you first need to create a view that generates one row per unit of quantity.
So something like
+--------------+------------------+----------+--------+
| date | pr_id | Qty |price |
+--------------+------------------+----------+--------+
| 2010-01-01 | 1 | 3 | 1000 |
turns into something like
+--------------+------------------+----------+--------+
| date | pr_id | Qty |price |
+--------------+------------------+----------+--------+
| 2010-01-01 | 1 | 1 | 1000 |
| 2010-01-01 | 1 | 1 | 1000 |
| 2010-01-01 | 1 | 1 | 1000 |
Then you count rows until you reach the remaining quantity.
After that, you regroup the rows by price, pr_id and date.
VOILA
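The expand, count and regroup steps can be sketched end to end as follows. This is a rough illustration run through Python's sqlite3 module, not the fiddle's actual schema: a recursive CTE named gen (made up here, capped at 100 on the assumption that no single Qty is larger) plays the role of the generator view, and the units are numbered by date and then price, which is an assumed ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master_table (product_id INTEGER, pr_name TEXT, remain_qty INTEGER);
CREATE TABLE detail_table (pr_date TEXT, pr_id INTEGER, qty INTEGER, price INTEGER);
INSERT INTO master_table VALUES (1, 'x', 13);
-- Product 1 rows from the question's Detail table.
INSERT INTO detail_table VALUES
  ('2010-01-01', 1, 3, 1000), ('2010-01-02', 1, 5, 1200),
  ('2010-01-03', 1, 4, 1400), ('2010-01-03', 1, 7, 1700),
  ('2010-01-03', 1, 3, 1300), ('2010-01-08', 1, 1, 1100);
""")

query = """
WITH RECURSIVE gen(n) AS (              -- numbers 1..100 (assumed upper bound on Qty)
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM gen WHERE n < 100
),
units AS (                              -- step 1: one row per unit bought
    SELECT d.pr_date, d.pr_id, d.price,
           ROW_NUMBER() OVER (PARTITION BY d.pr_id
                              ORDER BY d.pr_date, d.price, g.n) AS unit_no
    FROM detail_table d
    JOIN gen g ON g.n <= d.qty
)
SELECT u.pr_date, u.pr_id, COUNT(*) AS qty, u.price
FROM units u
JOIN master_table m ON m.product_id = u.pr_id
WHERE u.unit_no <= m.remain_qty         -- step 2: keep units up to the remaining quantity
GROUP BY u.pr_id, u.pr_date, u.price    -- step 3: regroup into detail rows
ORDER BY u.pr_id, u.pr_date, u.price
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

Because units are counted one at a time, the row that crosses the limit is kept with a reduced Qty instead of being dropped entirely.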
+------------+--------+-----------+---------------+
| paydate | salary | ninumber | payrollnumber |
+------------+--------+-----------+---------------+
| 2015-05-15 | 1000 | jh330954b | 6 |
| 2015-04-15 | 1250 | jh330954b | 5 |
| 2015-03-15 | 800 | jh330954b | 4 |
| 2015-02-15 | 894 | jh330954b | 3 |
| 2015-05-15 | 500 | ew56780e | 6 |
| 2015-04-15 | 1500 | ew56780e | 5 |
| 2015-03-15 | 2500 | ew56780e | 4 |
| 2015-02-15 | 3000 | ew56780e | 3 |
| 2015-05-15 | 400 | rt321298z | 6 |
| 2015-04-15 | 582 | rt321298z | 5 |
| 2015-03-15 | 123 | rt321298z | 4 |
| 2015-02-15 | 659 | rt321298z | 3 |
+------------+--------+-----------+---------------+
The above list is the data in my database. I need to get the average of the previous 3 salaries for each individual and output this.
I don't know where to begin with this so I cannot provide any of my working so far.
In SQL Server, you can use row_number() to get the last three salaries in a subquery. Then use avg():
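select ninumber, avg(salary) as avg_salary   -- if salary is an integer column, use avg(salary * 1.0) to avoid truncation
from (select t.*,
             row_number() over (partition by ninumber order by payrollnumber desc) as seqnum
      from payroll t                          -- "payroll" stands in for your table name
     ) t
where seqnum <= 3
group by ninumber;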
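To see what the row_number() approach returns on the sample data, here is a small sketch run through Python's sqlite3 module instead of SQL Server (an assumption, made so the example is self-contained). The table name payroll is made up, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payroll (paydate TEXT, salary INTEGER, ninumber TEXT, payrollnumber INTEGER)"
)
# Sample rows copied from the question.
conn.executemany("INSERT INTO payroll VALUES (?, ?, ?, ?)", [
    ("2015-05-15", 1000, "jh330954b", 6), ("2015-04-15", 1250, "jh330954b", 5),
    ("2015-03-15", 800, "jh330954b", 4), ("2015-02-15", 894, "jh330954b", 3),
    ("2015-05-15", 500, "ew56780e", 6), ("2015-04-15", 1500, "ew56780e", 5),
    ("2015-03-15", 2500, "ew56780e", 4), ("2015-02-15", 3000, "ew56780e", 3),
    ("2015-05-15", 400, "rt321298z", 6), ("2015-04-15", 582, "rt321298z", 5),
    ("2015-03-15", 123, "rt321298z", 4), ("2015-02-15", 659, "rt321298z", 3),
])

query = """
SELECT ninumber, AVG(salary) AS avg_last_three
FROM (SELECT ninumber, salary,
             -- 1 = most recent payroll run for this person
             ROW_NUMBER() OVER (PARTITION BY ninumber
                                ORDER BY payrollnumber DESC) AS seqnum
      FROM payroll)
WHERE seqnum <= 3
GROUP BY ninumber
ORDER BY ninumber
"""
averages = conn.execute(query).fetchall()
for ninumber, avg_salary in averages:
    print(ninumber, round(avg_salary, 2))
```

Each person has four rows in the sample, so the oldest one (payrollnumber 3) is the one left out of each average.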