Repeat Rows Between Date Values in Redshift - sql

I have a table:
id | start_date | end_date
----------------------------
01 | 2016-02-19 | 2017-03-02
02 | 2017-06-19 | 2018-09-11
03 | 2015-03-19 | 2018-05-02
04 | 2018-02-19 | 2018-01-05
05 | 2014-06-19 | 2018-07-25
and I would like to repeat rows based on the time between start_date and end_date, in this case by years extracted from those two date columns. My desired result would resemble:
id | year
=========
01 | 2016
01 | 2017
02 | 2017
02 | 2018
03 | 2015
03 | 2016
03 | 2017
03 | 2018
04 | 2018
05 | 2014
05 | 2015
05 | 2016
05 | 2017
05 | 2018
How can I achieve this in Redshift?

We can try joining with a calendar table containing all years which would appear in your table:
WITH years AS (
SELECT 2014 AS year UNION ALL
SELECT 2015 UNION ALL
SELECT 2016 UNION ALL
SELECT 2017 UNION ALL
SELECT 2018
)
SELECT
t2.id,
t1.year
FROM years t1
INNER JOIN yourTable t2
ON t1.year BETWEEN DATE_PART('year', t2.start_date) AND DATE_PART('year', t2.end_date)
ORDER BY
t2.id,
t1.year;
Note: Use DATE_PART(year, t2.start_date) for Redshift, where the datetime component does not take single quotes.
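If you would rather not hard-code the list of years, the same calendar join can be tried out locally. This is only a sketch: it runs on SQLite through Python's built-in sqlite3 module, not on Redshift, so strftime('%Y', ...) stands in for DATE_PART and the recursive years CTE is my own assumption about how to build the calendar from the data.

```python
import sqlite3

# In-memory SQLite copy of the question's table (a stand-in, not Redshift).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("01", "2016-02-19", "2017-03-02"),
        ("02", "2017-06-19", "2018-09-11"),
        ("03", "2015-03-19", "2018-05-02"),
        ("04", "2018-02-19", "2018-01-05"),
        ("05", "2014-06-19", "2018-07-25"),
    ],
)

query = """
WITH RECURSIVE bounds AS (
    -- Smallest start year and largest end year found in the data.
    SELECT MIN(CAST(strftime('%Y', start_date) AS INTEGER)) AS lo,
           MAX(CAST(strftime('%Y', end_date) AS INTEGER)) AS hi
    FROM yourTable
),
years(year) AS (
    -- Calendar of years from lo to hi, one row per year.
    SELECT lo FROM bounds
    UNION ALL
    SELECT year + 1 FROM years, bounds WHERE year < hi
)
SELECT t2.id, t1.year
FROM years t1
INNER JOIN yourTable t2
    ON t1.year BETWEEN CAST(strftime('%Y', t2.start_date) AS INTEGER)
                   AND CAST(strftime('%Y', t2.end_date) AS INTEGER)
ORDER BY t2.id, t1.year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the years are compared, so id 04 (whose start_date is later than its end_date but falls in the same year) still yields one row for 2018.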

Related

Count number of records with matching values in separate fields

I have a table (myTable) as such:
id | name | orig_id
----+-------+--------
01 | Bill | -
02 | Tom | 01
03 | Sam | 01
04 | Alex | 02
05 | Phil | -
06 | Bob | 01
I'd like a query that returns each record but with an added column containing the count of other rows that have an orig_id equal to the current row's id.
The resulting table would look like this:
id | name | orig_id | mycount
----+-------+---------+--------
01 | Bill | - | 3
02 | Tom | 01 | 1
03 | Sam | 01 | 0
04 | Alex | 02 | 0
05 | Phil | - | 0
06 | Bob | 01 | 0
I've tried the following query, but get no results:
SELECT *, COUNT(t.name) AS mycount
FROM "myTable" AS t
WHERE t.id=t.orig_id
GROUP BY t.id;
How can I achieve the desired results?
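You can do this with a join and aggregation. Use a LEFT JOIN with COALESCE so that rows nobody references still appear, with a count of 0:
SELECT t.*, COALESCE(tsum.mycount, 0) AS mycount
FROM myTable t LEFT JOIN
(SELECT orig_id, COUNT(name) AS mycount
FROM myTable
GROUP BY orig_id
) tsum
ON t.id = tsum.orig_id;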
A simple left join with count will do it:
select t.id, t.name, t.orig_id, count(o.id) mycount
from myTable t
left join myTable o on t.id = o.orig_id
group by t.id, t.name, t.orig_id
See a live demo on SQLFiddle that produces this result:
ID NAME ORIG_ID MYCOUNT
01 Bill (null) 3
02 Tom 01 1
03 Sam 01 0
04 Alex 02 0
05 Phil (null) 0
06 Bob 01 0
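To see why the outer join matters here, the self-join can be run both ways on a small in-memory table. This is a sketch using Python's built-in sqlite3 module; I am assuming the "-" in the question's orig_id column means NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id TEXT, name TEXT, orig_id TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        ("01", "Bill", None),  # assumption: "-" in the question means NULL
        ("02", "Tom", "01"),
        ("03", "Sam", "01"),
        ("04", "Alex", "02"),
        ("05", "Phil", None),
        ("06", "Bob", "01"),
    ],
)

# LEFT JOIN keeps every row of t; COUNT(o.id) ignores the NULLs of unmatched rows.
left_rows = conn.execute("""
    SELECT t.id, t.name, t.orig_id, COUNT(o.id) AS mycount
    FROM myTable t
    LEFT JOIN myTable o ON t.id = o.orig_id
    GROUP BY t.id, t.name, t.orig_id
    ORDER BY t.id
""").fetchall()

# INNER JOIN returns only the rows that are referenced at least once.
inner_rows = conn.execute("""
    SELECT t.id, COUNT(o.id) AS mycount
    FROM myTable t
    INNER JOIN myTable o ON t.id = o.orig_id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(left_rows)
print(inner_rows)
```

With an INNER JOIN only ids 01 and 02 come back, so the rows with a count of 0 are lost.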

How to generate a compounded view of data over time in Oracle SQL

Say I have a base number of 10 and a table that has a value of 20 associated with November 2013 and a value of 10 associated with March 2014. I want to produce a list of all months with their compounded value. So from May through October 2013 the value should be 10, from November 2013 through February 2014 it should be 10+20, and from March 2014 onward it should be 10+20+10.
So in a table I have the following
MONTH VALUE
Nov-2013 20
Mar-2014 10
I'd like a select statement that returns the following, with an initial value of 10 hard-coded as the base:
MONTH VALUE
May-2013 10
Jun-2013 10
Jul-2013 10
Aug-2013 10
Sep-2013 10
Oct-2013 10
Nov-2013 30
Dec-2013 30
Jan-2014 30
Feb-2014 30
Mar-2014 40
Is this doable?
If I understand your requirements correctly, something like the following should work.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE months
("MON" date, "VALUE" int)
;
INSERT ALL
INTO months ("MON", "VALUE")
VALUES (date '2013-11-01', 20)
INTO months ("MON", "VALUE")
VALUES (date '2014-03-01', 10)
SELECT * FROM dual
;
Query 1:
with months_interval as (
select date '2013-05-01' interval_start,
max(mon) interval_end
from months
)
, all_months as (
select add_months(m.interval_start,level-1) mon
from months_interval m
connect by level <= months_between(interval_end, interval_start) + 1
), data_to_sum as (
select am.mon,
decode(am.mon, first_value(am.mon) over(order by am.mon), 10, m.value) value
from months m, all_months am
where am.mon = m.mon(+)
)
select mon, value, sum(value) over(order by mon) cumulative
from data_to_sum
order by 1
Results:
| MON | VALUE | CUMULATIVE |
----------------------------------------------------------
| May, 01 2013 00:00:00+0000 | 10 | 10 |
| June, 01 2013 00:00:00+0000 | (null) | 10 |
| July, 01 2013 00:00:00+0000 | (null) | 10 |
| August, 01 2013 00:00:00+0000 | (null) | 10 |
| September, 01 2013 00:00:00+0000 | (null) | 10 |
| October, 01 2013 00:00:00+0000 | (null) | 10 |
| November, 01 2013 00:00:00+0000 | 20 | 30 |
| December, 01 2013 00:00:00+0000 | (null) | 30 |
| January, 01 2014 00:00:00+0000 | (null) | 30 |
| February, 01 2014 00:00:00+0000 | (null) | 30 |
| March, 01 2014 00:00:00+0000 | 10 | 40 |
This is probably slightly suboptimal performance-wise (it queries the months table twice, for example) and could be optimized, but the idea is this: pregenerate a list of months (I assumed your interval start is fixed), left join it to your data, and use the analytic SUM function to get the running total.
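The same three steps (generate the months, attach the known values, take a running sum) can be checked outside the database. This is a rough Python sketch, not Oracle code; add_months here is my own small helper, not Oracle's ADD_MONTHS, and the data is copied from the question.

```python
from datetime import date
from itertools import accumulate

def add_months(d, n):
    """Return the first day of the month that is n months after d."""
    idx = d.year * 12 + (d.month - 1) + n
    return date(idx // 12, idx % 12 + 1, 1)

base_start, base_value = date(2013, 5, 1), 10            # hard-coded base
changes = {date(2013, 11, 1): 20, date(2014, 3, 1): 10}  # the months table

# Step 1: every month from the fixed start up to the last month with data.
end = max(changes)
n_months = (end.year - base_start.year) * 12 + (end.month - base_start.month) + 1
months = [add_months(base_start, i) for i in range(n_months)]

# Step 2: "left join" the data; the first month carries the base value,
# months without a row contribute 0.
values = [base_value if i == 0 else changes.get(m, 0) for i, m in enumerate(months)]

# Step 3: running total, the equivalent of SUM(value) OVER (ORDER BY mon).
result = list(zip(months, accumulate(values)))
for mon, total in result:
    print(mon.strftime("%b-%Y"), total)
```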

add categories to a trend table

Hi folks, I've been noodling on how to approach this one for a while now and I'm just stuck. Hoping this question is useful to the community.
I have a trend table with data like the first table below. I have another table with categories like the second table below. The goal is to display the data in a stacked column chart. Each column in the chart would be the last sample for that day, and the series groups within each column would be the circuit categories.
The data is sampled every 10 minutes, but for the sake of example I entered just 2 samples for each day:
time_stamp | circuit1 | circuit2 | circuit3
1/5/13 08:00 | 50 | 60 | 30
1/5/13 04:00 | 48 | 55 | 26
1/4/13 08:00 | 42 | 52 | 22
1/4/13 04:00 | 40 | 51 | 20
etc.
I have a category table similar to this:
Circuit_name | circuit_category
circuit1 | category4
circuit2 | category2
circuit3 | category12
etc.
Maybe I'm not thinking of a simpler way to do this from a reporting standpoint, but in order to get a stacked bar chart day by day like the requirements, I think I need a query which results in the following:
time_stamp | Circuit_name | Circuit_category | Value
1/5/13 08:00 | Circuit1 | category4 | 50
1/5/13 08:00 | Circuit2 | category2 | 60
1/5/13 08:00 | Circuit3 | category12 | 30
1/4/13 08:00 | Circuit1 | category4 | 42
1/4/13 08:00 | Circuit2 | category2 | 52
1/4/13 08:00 | Circuit3 | category12 | 22
I'm thinking I need to write a query to grab the max(time_stamp) grouped by day, but pivot the results so I can join the data to the category table. I've played around with using pivot on the first table since I have to join the circuit_name in table2 to the actual column names in table1, but I keep running into dead ends because I don't understand pivot well enough.
Anyway I'm willing to abandon table 2 if hard coding the circuit categories into the query is necessary, but again this is where I'm stuck. Any guidance would be appreciated.
The data is on a sql2008r2 server.
Thanks!
This looks like a case for unpivoting columns to rows, and SQL Server has an UNPIVOT operator for that :) I believe the following query can be improved and optimized. Please comment after you have tried it.
Query:
select m.*, t.cat
from
(SELECT ts, name, value
FROM
(
SELECT ts,
CONVERT(varchar(20), C1) AS c1,
CONVERT(varchar(20), C2) AS c2,
CONVERT(varchar(20), C3) AS c3
FROM t2
) MyTable
UNPIVOT
(Value FOR name IN
(c1,c2,c3))AS MyUnPivot) m
left join t1 t
on t.name = m.name
;
Results:
TS NAME VALUE CAT
January, 05 2013 08:00:00+0000 c1 50 category4
January, 05 2013 08:00:00+0000 c2 60 category2
January, 05 2013 08:00:00+0000 c3 30 category12
January, 05 2013 04:00:00+0000 c1 48 category4
January, 05 2013 04:00:00+0000 c2 55 category2
January, 05 2013 04:00:00+0000 c3 26 category12
January, 04 2013 08:00:00+0000 c1 42 category4
January, 04 2013 08:00:00+0000 c2 52 category2
January, 04 2013 08:00:00+0000 c3 22 category12
January, 04 2013 04:00:00+0000 c1 40 category4
January, 04 2013 04:00:00+0000 c2 51 category2
January, 04 2013 04:00:00+0000 c3 20 category12
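The result above contains every sample, while the chart in the question only needs the last sample of each day. As a rough illustration of the full shape of the transformation (keep the latest row per day, unpivot, then attach the category), here is a Python sketch using the question's sample data. It is not SQL Server code, and the variable names are my own.

```python
from datetime import datetime

# Trend table from the question: (time_stamp, circuit1, circuit2, circuit3).
samples = [
    (datetime(2013, 1, 5, 8, 0), 50, 60, 30),
    (datetime(2013, 1, 5, 4, 0), 48, 55, 26),
    (datetime(2013, 1, 4, 8, 0), 42, 52, 22),
    (datetime(2013, 1, 4, 4, 0), 40, 51, 20),
]
circuits = ["circuit1", "circuit2", "circuit3"]
categories = {
    "circuit1": "category4",
    "circuit2": "category2",
    "circuit3": "category12",
}

# Keep only the latest sample of each calendar day.
latest = {}
for row in samples:
    day = row[0].date()
    if day not in latest or row[0] > latest[day][0]:
        latest[day] = row

# Unpivot: one output row per (time_stamp, circuit), joined to its category.
long_rows = [
    (ts, name, categories.get(name), value)
    for ts, *values in sorted(latest.values(), reverse=True)
    for name, value in zip(circuits, values)
]
for r in long_rows:
    print(r)
```

In SQL Server the "latest per day" step would typically be a GROUP BY on the date part of time_stamp, or a ROW_NUMBER() filter, applied before the UNPIVOT.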

Doubts in query conditions

I have two tables:
1. Employee
EMP_NAME,
EMP_CODE
2. Vacations
EMP_NAME,
EMP_CODE,
VACATION_START_DATE (date type),
VACATION_END_DATE (date type)
My question is: how do I query the EMP_NAME from table 1 (Employee) where today is not between VACATION_START_DATE and VACATION_END_DATE in table 2 (Vacations)?
Try this, please. I am assuming that emp_name is not needed in the vacations table and that emp_code is the relationship between the two tables, so you can use a join:
select e.emp_code, e.emp_name,
v.start_date, v.end_Date
from emp e
inner join
vacation v
on e.emp_code = v.emp_code
where not (Now() between v.start_date
and v.end_date)
;
| EMP_CODE | EMP_NAME | START_DATE | END_DATE |
-------------------------------------------------------------------------------------------
| 1 | john | December, 10 2012 00:00:00+0000 | December, 20 2012 00:00:00+0000 |
| 2 | kate | December, 20 2012 00:00:00+0000 | December, 30 2012 00:00:00+0000 |
| 3 | tim | December, 24 2012 00:00:00+0000 | January, 01 2013 00:00:00+0000 |
| 1 | john | January, 01 2013 00:00:00+0000 | January, 08 2013 00:00:00+0000 |
Not sure whether you meant "today" as a variable or the system date, so the following code may need to be modified, but I think it should work. Note that a vacation covers today when it starts on or before today and ends on or after today:
SELECT
EMP_NAME
FROM
EMPLOYEE
WHERE
NOT EXISTS (SELECT * FROM VACATIONS WHERE EMP_NAME = EMPLOYEE.EMP_NAME AND
VACATIONS.VACATION_START_DATE <= Today AND
VACATIONS.VACATION_END_DATE >= Today)
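The NOT EXISTS approach also returns employees who have no vacation rows at all, and returns each employee only once, which a plain inner join does not. Here is a sketch that can be run locally with Python's built-in sqlite3 module. The dates are stored as ISO strings, the rows are loosely based on the sample data above, and the employee 'ann' (who has no vacations) is my own addition for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EMP_NAME TEXT, EMP_CODE INTEGER);
CREATE TABLE Vacations (
    EMP_CODE INTEGER,
    VACATION_START_DATE TEXT,  -- ISO dates compare correctly as strings
    VACATION_END_DATE TEXT
);
INSERT INTO Employee VALUES ('john', 1), ('kate', 2), ('tim', 3), ('ann', 4);
INSERT INTO Vacations VALUES
    (1, '2012-12-10', '2012-12-20'),
    (2, '2012-12-20', '2012-12-30'),
    (3, '2012-12-24', '2013-01-01'),
    (1, '2013-01-01', '2013-01-08');
""")

def available_on(today):
    """Names of employees with no vacation that covers the given ISO date."""
    rows = conn.execute(
        """
        SELECT e.EMP_NAME
        FROM Employee e
        WHERE NOT EXISTS (
            SELECT 1
            FROM Vacations v
            WHERE v.EMP_CODE = e.EMP_CODE
              AND ? BETWEEN v.VACATION_START_DATE AND v.VACATION_END_DATE
        )
        ORDER BY e.EMP_NAME
        """,
        (today,),
    ).fetchall()
    return [name for (name,) in rows]

print(available_on("2012-12-25"))
print(available_on("2013-01-01"))
```

The tables are matched on EMP_CODE here rather than EMP_NAME, on the assumption that the code is the unique key.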

how to get week number for a date based on month from the date in sql server

I have a date column in a table, and I want to get the week number for each date relative to its month, irrespective of the day of the week.
For example:
01-dec-2012 to 07-dec-2012 should give week number as 1
08-dec-2012 to 14-dec-2012 should give week number as 2
15-dec-2012 to 21-dec-2012 should give week number as 3
22-dec-2012 to 28-dec-2012 should give week number as 4
29-dec-2012 to 31-dec-2012 should give week number as 5
This week number is not dependent on the starting day of the week i.e, it can be any day
How can I write a select statement to get this output in SQL Server 2008?
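You can use DAY (Transact-SQL). Both operands are integers, so the division truncates: days 1-7 map to week 1, days 8-14 to week 2, and so on.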
select ((day(DateColumn)-1) / 7) + 1
from YourTable
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table YourTable
(
D datetime
)
insert into YourTable
select getdate()+Number
from master..spt_values
where type = 'P' and
Number between 1 and 15
Query 1:
select D,
((day(D)-1) / 7) + 1 as W
from YourTable
Results:
| D | W |
--------------------------------------
| January, 03 2013 07:48:54+0000 | 1 |
| January, 04 2013 07:48:54+0000 | 1 |
| January, 05 2013 07:48:54+0000 | 1 |
| January, 06 2013 07:48:54+0000 | 1 |
| January, 07 2013 07:48:54+0000 | 1 |
| January, 08 2013 07:48:54+0000 | 2 |
| January, 09 2013 07:48:54+0000 | 2 |
| January, 10 2013 07:48:54+0000 | 2 |
| January, 11 2013 07:48:54+0000 | 2 |
| January, 12 2013 07:48:54+0000 | 2 |
| January, 13 2013 07:48:54+0000 | 2 |
| January, 14 2013 07:48:54+0000 | 2 |
| January, 15 2013 07:48:54+0000 | 3 |
| January, 16 2013 07:48:54+0000 | 3 |
| January, 17 2013 07:48:54+0000 | 3 |
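The arithmetic is easy to verify outside the database. A small Python sketch of the same formula (the function name week_of_month is my own), checked against the ranges from the question:

```python
from datetime import date
from math import ceil

def week_of_month(d):
    """Week number within the month: days 1-7 -> 1, 8-14 -> 2, and so on."""
    return (d.day - 1) // 7 + 1

for day in (1, 7, 8, 14, 15, 21, 22, 28, 29, 31):
    print(date(2012, 12, day), week_of_month(date(2012, 12, day)))

# Rounding day / 7 up gives the same week number for every day of the month.
same = all(
    week_of_month(date(2012, 12, day)) == ceil(day / 7) for day in range(1, 32)
)
print(same)
```

Neither form depends on which weekday the month starts on, which is what the question asks for.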
Try this:
declare @dates datetime
select @dates='2012-12-22'
SELECT datepart(dd,@dates), ceiling(cast(datepart(dd,@dates) as numeric(38,8))/7)
Several options here to do what you wish. Most promising of which seems to be use of the DATEPART function. But do beware that results may differ depending on your local settings.
Hope one of them works out for you.