In SQL, how can I calculate how many x-day intervals have passed since an outset date?
Consider 2023-01-11 as an example for such an "outset date". For any subsequent date, I want to know how many 4-day intervals have passed since the outset date.
For example:
💚 2023-01-13 should return 1, because it falls in the first 4-day interval.
💙 2023-01-18 should return 2, because it falls in the second 4-day interval.
💜 2023-02-02 should return 6, because it falls in the sixth 4-day interval.
## # January 2023
##  Su        Mo        Tu        We        Th        Fr        Sa
## -----------------------------------------------------------------------
## |1        |2        |3        |4        |5        |6        |7        |
## |         |         |         |         |         |         |         |
## |         |         |         |         |         |         |         |
## |         |         |         |         |         |         |         |
## |         |         |         |         |         |         |         |
## -----------------------------------------------------------------------
## |8        |9        |10       |11       |12       |13       |14       |
## |         |         |         |         |         |         |         |
## |         |         |         |*outset* |         |💚       |         |
## |         |         |         |<<=======|=========|=========|=======>>|
## |         |         |         |         |         |         |         |
## -----------------------------------------------------------------------
## |15       |16       |17       |18       |19       |20       |21       |
## |         |         |         |         |         |         |         |
## |         |         |         |💙       |         |         |         |
## |<<+++++++|+++++++++|+++++++++|+++++++>>|<<#######|#########|#########|
## |         |         |         |         |         |         |         |
## -----------------------------------------------------------------------
## |22       |23       |24       |25       |26       |27       |28       |
## |         |         |         |         |         |         |         |
## |         |         |         |         |         |         |         |
## |#######>>|<<#######|#########|#########|#######>>|<<*******|*********|
## |         |         |         |         |         |         |         |
## -----------------------------------------------------------------------
## |29       |30       |31       |1        |2        |3        |4        |
## |         |         |         |         |         |         |         |
## |         |         |         |         |💜       |         |         |
## |*********|*******>>|<<~~~~~~~|~~~~~~~~~|~~~~~~~~~|~~~~~~~>>|         |
## |         |         |         |         |         |         |         |
## -----------------------------------------------------------------------
So if I have the corresponding SQL table:
CREATE TABLE my_tbl (outset_date DATE, date_of_interest DATE);
INSERT INTO my_tbl (outset_date, date_of_interest)
VALUES ('2023-01-11', '2023-01-13'),
('2023-01-11', '2023-01-18'),
('2023-01-11', '2023-02-02');
How can I write a select statement to get the desired output:
-- desired output
-- +──────────────+───────────────────+─────────────────────────────────+
-- | outset_date | date_of_interest | how_many_intervals_have_passed |
-- +──────────────+───────────────────+─────────────────────────────────+
-- | 2023-01-11 | 2023-01-13 | 1 |
-- | 2023-01-11 | 2023-01-18 | 2 |
-- | 2023-01-11 | 2023-02-02 | 6 |
-- +──────────────+───────────────────+─────────────────────────────────+
If there isn't an "idiomatic" SQL syntax for this, I'd opt for either MySQL or PostgreSQL. Thanks!
To get the difference between two dates in days, subtract the earlier date from the later one, i.e. "2023-01-13" - "2023-01-11" = 2.
In your case, you need the number of days between the two dates including both the first and the last date, which means adding 1 day to the difference, i.e. "2023-01-13" - "2023-01-11" + 1 = 3.
To get the 4-day interval in which a date lies, add 3 to this inclusive day count and then perform integer division by 4, i.e. for inclusive counts (1, 2, 3, 4) this gives (4/4, 5/4, 6/4, 7/4), which equals 1 for all of them.
For Postgres try the following:
select *,
(date_of_interest - outset_date + 4) / 4 as expected
from my_tbl
The + 4 here is +1 to make the difference between the two dates inclusive, as mentioned above, plus +3 so that the integer division by 4 rounds up.
For MySQL, it will be (datediff(date_of_interest, outset_date) + 4) div 4, where the div operator performs integer division.
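The arithmetic above is easy to sanity-check outside the database. Below is a minimal Python sketch of the same formula; the function name `intervals_passed` and the `length` parameter are made up for this illustration and are not part of either database.

```python
from datetime import date


def intervals_passed(outset: date, day: date, length: int = 4) -> int:
    """Return the 1-based index of the length-day interval that `day` falls in."""
    diff = (day - outset).days        # plain difference, like date - date in Postgres
    # (diff + 1) is the inclusive day count; adding (length - 1) rounds the
    # integer division up, so overall this is (diff + length) // length.
    return (diff + length) // length


outset = date(2023, 1, 11)
for day in (date(2023, 1, 13), date(2023, 1, 18), date(2023, 2, 2)):
    print(day, intervals_passed(outset, day))
```

For the three dates in the question this prints 1, 2 and 6, matching the desired output.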
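The basic solution for MySQL:
SELECT
    outset_date,
    date_of_interest,
    -- FLOOR(...) + 1 rather than CEIL(...): the outset date itself (difference 0)
    -- and the first day of each new interval (difference 4, 8, ...) then land in the right interval
    FLOOR(DATEDIFF(date_of_interest, outset_date) / 4) + 1 AS how_many_intervals_have_passed
FROM my_tbl;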
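PostgreSQL solution below:
SELECT
    outset_date,
    date_of_interest,
    -- FLOOR(...) + 1 rather than CEIL(...), so that differences of 0, 4, 8, ... days
    -- are assigned to the interval that starts on that day
    FLOOR((date_of_interest - outset_date)::numeric / 4) + 1 AS how_many_intervals_have_passed
FROM my_tbl;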
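The three sample dates in the question all fall in the middle of an interval, so they cannot distinguish CEIL(d / 4) from FLOOR(d / 4) + 1. The following Python sketch (the helper names are hypothetical) compares the two on day differences 0 to 8 and lists where they disagree.

```python
import math

LENGTH = 4  # interval length in days, as in the question


def ceil_intervals(diff: int) -> int:
    """Interval index computed as CEIL(diff / LENGTH)."""
    return math.ceil(diff / LENGTH)


def floor_intervals(diff: int) -> int:
    """Interval index computed as FLOOR(diff / LENGTH) + 1."""
    return diff // LENGTH + 1


# Day differences on which the two formulas give different answers.
differing = [d for d in range(9) if ceil_intervals(d) != floor_intervals(d)]
print(differing)
```

The two formulas differ only on differences that are multiples of 4, that is, on the outset date itself and on the first day of each new interval. According to the calendar in the question, 2023-01-11 belongs to interval 1 and 2023-01-15 to interval 2, which is what FLOOR(d / 4) + 1 returns.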
Is there any way to create a unique identifier column (User) so that rows are treated as the same user whenever Email, User_ID or Subscription_ID matches between any of them? I want to group by and aggregate on this identifier later. The identifier doesn't have to be incrementing, but I thought that could be one way of implementing it.
Sample data, which we could call the subscriptions table:
|Email |User_ID | Subscription_ID |
|--------------|-----------|--------------------|
|12#gmail.com |20 | 56 |
|12#gmail.com |30 | 86 |
|13#gmail.com |20 | 96 |
|14#gmail.com |22 | 96 |
|15#gmail.com |80 | 12 |
Desired Result:
|Email |User_ID | Subscription_ID | User |
|--------------|-----------|--------------------|------|
|12#gmail.com |20 | 56 | 1 |
|12#gmail.com |30 | 86 | 1 |
|13#gmail.com |20 | 96 | 1 |
|14#gmail.com |22 | 96 | 1 |
|15#gmail.com |80 | 12 | 2 |
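One way to read this requirement is as a connected-components problem: two rows belong to the same user if they share a value in any column, directly or through a chain of other rows. The sketch below illustrates that reading in Python with a small union-find; it is only an assumption about the intended grouping, and `assign_users` is a hypothetical helper, not a database feature.

```python
def assign_users(rows):
    """Label rows so that rows sharing any column value (transitively) get the same label."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the representative row, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (column index, value) -> first row index holding that value
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            key = (col, value)
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])  # merge the two groups
            else:
                first_seen[key] = i

    labels, result = {}, []
    for i in range(len(rows)):
        root = find(i)
        if root not in labels:
            labels[root] = len(labels) + 1  # number groups in order of first appearance
        result.append(labels[root])
    return result


rows = [
    ("12#gmail.com", 20, 56),
    ("12#gmail.com", 30, 86),
    ("13#gmail.com", 20, 96),
    ("14#gmail.com", 22, 96),
    ("15#gmail.com", 80, 12),
]
user_ids = assign_users(rows)
print(user_ids)
```

On the sample data this yields the User column from the desired result: the first four rows are linked through a shared Email, User_ID or Subscription_ID, and the last row stands alone.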
I have the following data:
Input:
----------------------------
| Id | Value|
----------------------------
| 1 |A |
| 1 |B |
| 2 |C |
| 2 |D |
| 2 |E |
| 3 |F |
----------------------------
I need to convert the results to the following:
Output (Count is based on Id)
----------------------------
| Id | Value| Count|
----------------------------
| 1 |A | 2 |
| 1 |B | 2 |
| 2 |C | 3 |
| 2 |D | 3 |
| 2 |E | 3 |
| 3 |F | 1 |
----------------------------
I am using SQL Server 2008. Is it possible to write a query to do this?
If so, could anyone provide the SQL to obtain the above output from the input data I gave?
You are looking for COUNT OVER:
select id, value, count(*) over (partition by id)
from mytable
order by id, value;
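To try this query without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the table name `mytable` follows the answer, while the column alias `cnt` is added here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "C"), (2, "D"), (2, "E"), (3, "F")],
)

# COUNT(*) OVER (PARTITION BY id) attaches the per-id row count to every row
# without collapsing the rows the way GROUP BY would.
result = conn.execute(
    """
    SELECT id, value, COUNT(*) OVER (PARTITION BY id) AS cnt
    FROM mytable
    ORDER BY id, value
    """
).fetchall()

for row in result:
    print(row)
```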
I have a Request History table whose rows I would like to tag with a version number (a version ends at the end of a cycle). I was able to mark the end of each cycle, but I couldn't work out how to assign the version to the rows belonging to each cycle. Here is an example:
|history_id | Req_id | StatID | Time |EndCycleDate |
|-------------|---------|-------|---------- |-------------|
|1 | 1 |18 | 3/26/2017 | NULL |
|2 | 1 | 19 | 3/26/2017 | NULL |
|3 | 1 |20 | 3/30/2017 | NULL |
|4 |1 | 23 |3/30/2017 | NULL |
|5 | 1 |35 |3/30/2017 | 3/30/2017 |
|6 | 1 |33 |4/4/2017 | NULL |
|7 | 1 |34 |4/4/2017 | NULL |
|8 | 1 |39 |4/4/2017 | NULL |
|9 | 1 |35 |4/4/2017 | 4/4/2017 |
|10 | 1 |33 |4/5/2017 | NULL |
|11 | 1 |34 |4/6/2017 | NULL |
|12 | 1 |39 |4/6/2017 | NULL |
|13 | 1 |35 |4/7/2017 | 4/7/2017 |
|14 | 1 |33 |4/8/2017 | NULL |
|15 | 1 | 34 |4/8/2017 | NULL |
|16 | 2 |18 |3/28/2017 | NULL |
|17 | 2 |26 |3/28/2017 | NULL |
|18 | 2 |20 |3/30/2017 | NULL |
|19 | 2 |23 |3/30/2017 | NULL |
|20 | 2 |35 |3/30/2017 | 3/30/2017 |
|21 | 2 |33 |4/12/2017 | NULL |
|22 | 2 |34 |4/12/2017 | NULL |
|23 | 2 |38 |4/13/2017 | NULL |
Now what I would like to achieve is to derive a new column, namely VER, and update its value like the following:
|history_id | Req_id | StatID | Time |EndCycleDate | VER |
|-------------|---------|-------|---------- |-------------|------|
|1 | 1 |18 | 3/26/2017 | NULL | 1 |
|2 | 1 | 19 | 3/26/2017 | NULL | 1 |
|3 | 1 |20 | 3/30/2017 | NULL | 1 |
|4 |1 | 23 |3/30/2017 | NULL | 1 |
|5 | 1 |35 |3/30/2017 | 3/30/2017 | 1 |
|6 | 1 |33 |4/4/2017 | NULL | 2 |
|7 | 1 |34 |4/4/2017 | NULL | 2 |
|8 | 1 |39 |4/4/2017 | NULL | 2 |
|9 | 1 |35 |4/4/2017 | 4/4/2017 | 2 |
|10 | 1 |33 |4/5/2017 | NULL | 3 |
|11 | 1 |34 |4/6/2017 | NULL | 3 |
|12 | 1 |39 |4/6/2017 | NULL | 3 |
|13 | 1 |35 |4/7/2017 | 4/7/2017 | 3 |
|14 | 1 |33 |4/8/2017 | NULL | 4 |
|15 | 1 | 34 |4/8/2017 | NULL | 4 |
|16 | 2 |18 |3/28/2017 | NULL | 1 |
|17 | 2 |26 |3/28/2017 | NULL | 1 |
|18 | 2 |20 |3/30/2017 | NULL | 1 |
|19 | 2 |23 |3/30/2017 | NULL | 1 |
|20 | 2 |35 |3/30/2017 | 3/30/2017 | 1 |
|21 | 2 |33 |4/12/2017 | NULL | 2 |
|22 | 2 |34 |4/12/2017 | NULL | 2 |
|23 | 2 |38 |4/13/2017 | NULL | 2 |
One method that comes really close is a cumulative count:
select t.*,
count(endCycleDate) over (partition by req_id order by history_id) as ver
from t;
However, this isn't quite right: the row where endCycleDate is set is already counted with the following cycle rather than the one it closes, and the values start at 0. Most of these problems are fixed with a windowing clause:
select t.*,
(count(endCycleDate) over (partition by req_id
order by history_id
rows between unbounded preceding and 1 preceding) + 1
) as ver
from t;
On the first row of each request the frame is empty; COUNT over an empty frame returns 0 in standard SQL, so that row still gets version 1. If your database does not support the windowing clause, here is an alternative. It enumerates the values backward and then subtracts from the total number of end-of-cycle rows to get the versions in ascending order:
select t.*,
(1 + count(endCycleDate) over (partition by req_id) -
count(endCycleDate) over (partition by req_id
order by history_id desc)
) as ver
from t;
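To check these queries against the sample data, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The table is reduced to the three columns the queries use, and the end dates are rewritten in ISO format. The sketch runs the windowing-clause variant and a backward-counting variant; the latter takes count(endCycleDate) as the total, which is an assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (history_id INTEGER, req_id INTEGER, endCycleDate TEXT)")

# Rows 1-15 belong to request 1, rows 16-23 to request 2; only end-of-cycle rows have a date.
end_dates = {5: "2017-03-30", 9: "2017-04-04", 13: "2017-04-07", 20: "2017-03-30"}
rows = [(h, 1 if h <= 15 else 2, end_dates.get(h)) for h in range(1, 24)]
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Variant 1: count end-of-cycle rows strictly before the current row, then add 1.
frame_sql = """
SELECT history_id,
       COUNT(endCycleDate) OVER (PARTITION BY req_id ORDER BY history_id
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) + 1 AS ver
FROM t ORDER BY history_id
"""

# Variant 2: count end-of-cycle rows from the current row to the end, subtract from the total.
backward_sql = """
SELECT history_id,
       1 + COUNT(endCycleDate) OVER (PARTITION BY req_id)
         - COUNT(endCycleDate) OVER (PARTITION BY req_id ORDER BY history_id DESC) AS ver
FROM t ORDER BY history_id
"""

frame_ver = [ver for _, ver in conn.execute(frame_sql)]
backward_ver = [ver for _, ver in conn.execute(backward_sql)]

# VER column from the question, in history_id order.
expected = [1] * 5 + [2] * 4 + [3] * 4 + [4] * 2 + [1] * 5 + [2] * 3
print(frame_ver == expected, backward_ver == expected)
```

In SQLite both variants reproduce the VER column from the question, including the first row of each request.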
I am stuck on one issue. I have these tables:
tbl_color
+------------+
|id | name |
|---|--------|
|1 | Red |
|---|--------|
|2 | Blue |
|---|--------|
|3 | Black |
+------------+
tbl_clothes
+----------------+
|id | name |
| 1 | Pant |
| 2 | Shirt |
| 3 | T-shirt |
+----------------+
tb_sales
+---------------------------------------+
|id | id_cloth | id_color | sales_date |
|---|----------|-----------|------------|
|1 | 1 | 1 | 2016/1/1 |
|---|----------|-----------|------------|
|2 | 1 | 3 | 2016/1/1 |
|---|----------|-----------|------------|
|3 | 1 | 1 | 2016/2/2 |
+---------------------------------------+
Then I change one row of tbl_color (Red becomes Orange), so the table looks like this:
tbl_color
+---------------------------+
|id | name | modified_on |
|----|--------|-------------|
|1 | Orange | 2016/3/2 |
|----|--------|-------------|
|2 | Blue | 2016/1/2 |
|----|--------|-------------|
|3 | Black | 2016/1/2 |
+---------------------------+
So when I want to get the sales report for 2016/1/1:
SELECT *
FROM tb_sales
JOIN tbl_clothes ON tbl_clothes.id = tb_sales.id_cloth
JOIN tbl_color ON tbl_color.id = tb_sales.id_color
WHERE sales_date = '2016/1/1'
the report shows the modified color name (Orange), not the name the color had when the sale was made (Red).
How can I handle this issue?