SQL Server: Create sequence column based on a non-distinct column - sql

I'm not sure if I'm asking this question right, but hopefully I can explain it well enough. I have a table that has a Date, Value, and WeekEndDate column. I want to create a sequence column that counts the distinct weeks from 1-13 and cycles every 13 weeks.
I attached a small sample of the output I'm trying to create. Is this even possible?

Use dense_rank() and some arithmetic:
select t.*,
       ((dense_rank() over (order by WeekEndDate) - 1) % 13) + 1
from t;
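To see why this arithmetic cycles from 1 to 13, the same computation can be traced outside the database. This is only a sketch in Python, not T-SQL; the dense_rank helper and the sample dates are made up for illustration.

```python
from datetime import date, timedelta

def dense_rank(values):
    """Return the 1-based dense rank of each value, like DENSE_RANK() OVER (ORDER BY value)."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]

# Hypothetical data: 15 consecutive week-ending dates, two rows per week.
first = date(2021, 1, 3)
week_ends = [first + timedelta(weeks=w) for w in range(15) for _ in range(2)]

# Same arithmetic as the query: shift to 0-based, wrap at 13, shift back to 1-based.
seq = [(r - 1) % 13 + 1 for r in dense_rank(week_ends)]
print(seq)
```

Rows that share a week-ending date get the same number, and the fourteenth distinct week starts again at 1.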

Related

Impala get the difference between 2 dates excluding weekends

I'm trying to get the day difference between 2 dates in Impala but I need to exclude weekends.
I know it should be something like this but I'm not sure how the weekend piece would go...
DATEDIFF(resolution_date,created_date)
Thanks!
One approach to this task is to enumerate every day in the range, and then filter out the weekends before counting.
Some databases have specific features to generate date series, while others offer recursive common table expressions. Impala does not support recursive queries, so we need to look at alternative solutions.
If you have a table with at least as many rows as the maximum number of days in a range, you can use row_number() to offset the starting date, and then conditional aggregation to count working days.
Assuming that your table is called mytable, with column id as primary key, and that the big table is called bigtable, you would do:
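select
    t.id,
    sum(
        -- dayofweek() returns 1 for Sunday and 7 for Saturday, so 2 through 6 is Monday to Friday
        case when dayofweek(date_add(t.created_date, n.rn)) between 2 and 6
             then 1 else 0 end
    ) no_days
from mytable t
-- rn = 0, 1, 2, ... is the day offset added to created_date
inner join (select row_number() over(order by 1) - 1 rn from bigtable) n
    on t.resolution_date > date_add(t.created_date, n.rn)
group by t.id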
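The enumerate-then-filter logic can be checked on a small range outside Impala. The following is a minimal Python sketch of the same counting rule (every day from created_date up to, but not including, resolution_date); the function name working_days and the sample dates are assumptions for illustration.

```python
from datetime import date, timedelta

def working_days(created, resolved):
    """Count Monday-Friday dates d with created <= d < resolved."""
    total_days = (resolved - created).days
    count = 0
    for n in range(max(total_days, 0)):
        day = created + timedelta(days=n)
        # date.weekday(): Monday is 0 and Sunday is 6 (unlike dayofweek(), where 1 is Sunday).
        if day.weekday() < 5:
            count += 1
    return count

# 2023-05-01 is a Monday, so this range covers one full week.
print(working_days(date(2023, 5, 1), date(2023, 5, 8)))
```

Note that the numbering of weekdays differs between Python and Impala, so the filter condition is not a literal translation.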

teradata sql problem: how to calculate the time difference in different columns with previous row order by another column?

This may not sound like a new question, but it is a little tricky.
I want to run something like the following SQL in Teradata:
sel (col2- LAG(col1, 1)) minute OVER (ORDER BY session_id)
from data
I want to calculate the time difference by minutes between col1 and col2 ordered by session_id. So there are three columns here...
Thank you in advance.
I think the syntax you want is:
select (col2- LAG(col1) OVER (ORDER BY session_id)) day(4) to minute
from data
Note that 1 is not necessary; it is the default for LAG().
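What the expression computes per row can be traced in a few lines of Python. This is only an illustration of the LAG() logic, not Teradata code, and the three sample rows are invented.

```python
from datetime import datetime

# Hypothetical rows: (session_id, col1, col2)
rows = [
    (1, datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 9, 20)),
    (2, datetime(2020, 1, 1, 9, 30), datetime(2020, 1, 1, 9, 45)),
    (3, datetime(2020, 1, 1, 10, 0), datetime(2020, 1, 1, 10, 5)),
]
rows.sort(key=lambda r: r[0])  # ORDER BY session_id

diffs = []
prev_col1 = None  # LAG() yields NULL on the first row
for _, col1, col2 in rows:
    if prev_col1 is None:
        diffs.append(None)
    else:
        # col2 of this row minus col1 of the previous row, in minutes
        diffs.append((col2 - prev_col1).total_seconds() / 60)
    prev_col1 = col1
print(diffs)
```

The first row has no predecessor, so its difference is empty, just as the SQL expression returns NULL there.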

Calculating a running count of Weeks

I am looking to calculate a running count of the weeks that have occurred since a starting point. The biggest problem here is that the calendar I am working on is not a traditional Gregorian calendar.
The easiest dimension to reference would be something like 'TWEEK' which actually tells you the week of the year that the record falls into.
Example data:
CREATE TABLE #foobar
( DateKey INT
,TWEEK INT
,CumWEEK INT
);
INSERT INTO #foobar (DateKey, TWEEK, CumWEEK)
VALUES(20150630, 1,1),
(20150701,1,1),
(20150702,1,1),
(20150703,1,1),
(20150704,1,1),
(20150705,1,1),
(20150706,1,1),
(20150707,2,2),
(20150708,2,2),
(20150709,2,2),
(20150710,2,2),
(20150711,2,2),
(20150712,2,2),
(20150713,2,2),
(20150714,1,3),
(20150715,1,3),
(20150716,1,3),
(20150717,1,3),
(20150718,1,3),
(20150719,1,3),
(20150720,1,3),
(20150721,2,4),
(20150722,2,4),
(20150723,2,4),
(20150724,2,4),
(20150725,2,4),
(20150726,2,4),
(20150727,2,4)
For the sake of ease, I did not go all the way to 52, but you get the point. I am trying to recreate the 'CumWEEK' column. I already have a column that tells me the correct week of the year according to the weird calendar convention ('TWEEK').
I know this will involve some kind of OVER() windowing, but I cannot seem to figure it out.
Use the windowing function LAG() to flag each row where TWEEK changes, then take a running SUM of those flags with ORDER BY ... ROWS BETWEEN. That should get you close enough to work with. The caveat is that the ROWS BETWEEN offset can only be an integer literal (or UNBOUNDED), not an expression.
Year rollover: I guess you could create another ranking level based on mod 52 to start the count fresh. So 53 would become year 2, week 1, not 53.
SELECT
    *,
    -- the first row has no previous row, so its flag is 0; add 1 to start the count at 1
    1 + SUM(ChangedRow) OVER (ORDER BY DateKey ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM
(
    SELECT
        DateKey,
        TWEEK,
        ChangedRow = CASE WHEN LAG(TWEEK) OVER (ORDER BY DateKey) <> TWEEK THEN 1 ELSE 0 END
    FROM
        #foobar F2
) AS DETAIL
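The flag-and-sum idea can be traced on the sample data without a database. Below is a minimal Python sketch; the function name cumulative_weeks is made up, and it counts the first row as week 1, which is what the CumWEEK column expects.

```python
def cumulative_weeks(tweeks):
    """Running count of week changes for TWEEK values already sorted by DateKey."""
    result = []
    count = 0
    previous = None
    for tweek in tweeks:
        # Count a new week on the first row and whenever TWEEK differs from the prior row.
        if previous is None or tweek != previous:
            count += 1
        result.append(count)
        previous = tweek
    return result

# Same shape as the sample data: seven days each of TWEEK 1, 2, 1, 2.
tweeks = [1] * 7 + [2] * 7 + [1] * 7 + [2] * 7
print(cumulative_weeks(tweeks))
```

Because only changes are counted, the result keeps increasing when TWEEK wraps back to 1.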
A few minutes ago I answered a different question that is, in a way, similar to this one:
https://stackoverflow.com/a/31303395/5089204
The idea is roughly to create a table of running numbers and find the weeks by dividing by 7. You could use this as the grouping in an OVER clause...
EDIT: Example
CREATE FUNCTION dbo.RunningNumber(@Counter AS INT)
RETURNS TABLE
AS
RETURN
SELECT TOP (@Counter) ROW_NUMBER() OVER(ORDER BY o.object_id) AS RunningNumber
FROM sys.objects AS o; --take any large table here...
GO
SELECT 'test',CAST((numbers.RunningNumber - 1)/7 AS INT) --subtract 1 so that numbers 1 to 7 share group 0
FROM dbo.RunningNumber(100) AS numbers
Dividing by 7 "as INT" offers a quite nice grouping criterion.
Hope this helps...
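How integer division turns a running number into a week group can be sketched in Python. This is an illustration only; it assumes the running number starts at 1, as ROW_NUMBER() does, and subtracts 1 first so that every group holds exactly seven numbers (dividing the 1-based number directly would leave only six in the first group).

```python
# Running numbers 1..21, as a numbers function would return them.
running_numbers = range(1, 22)

# Integer division by 7 after shifting to 0-based gives one group per block of seven.
groups = {}
for n in running_numbers:
    groups.setdefault((n - 1) // 7, []).append(n)
print(groups)
```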

SQL Average Inter-arrival Time, Time Between Dates

I have a table with sequential timestamps:
2011-03-17 10:31:19
2011-03-17 10:45:49
2011-03-17 10:47:49
...
I need to find the average time difference between each of these (there could be dozens) in seconds or whatever is easiest; I can work with it from there. So for example the above inter-arrival time for only the first two times would be 870 (14m 30s). For all three times it would be: (870 + 120)/2 = 495 (8m 15s).
A note: I am using PostgreSQL 8.1.22.
EDIT: The table I mention above is from a different query that is literally just a one-column list of timestamps
Not sure I understood your question completely, but this might be what you are looking for:
SELECT avg(difference)
FROM (
SELECT timestamp_col - lag(timestamp_col) over (order by timestamp_col) as difference
FROM your_table
) t
The inner query calculates the distance between each row and the preceding row. The result is an interval for each row in the table.
The outer query simply does an average over all differences.
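The two steps can be checked against the three timestamps from the question. This is a small Python sketch of the same logic, not PostgreSQL; the variable names are arbitrary.

```python
from datetime import datetime, timedelta

stamps = [
    datetime(2011, 3, 17, 10, 31, 19),
    datetime(2011, 3, 17, 10, 45, 49),
    datetime(2011, 3, 17, 10, 47, 49),
]
stamps.sort()

# Difference between each timestamp and the one before it (what lag() produces).
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Average over the gaps; the first row has no predecessor and contributes nothing.
average = sum(gaps, timedelta()) / len(gaps)
print([g.total_seconds() for g in gaps], average.total_seconds())
```

For these three rows the gaps are 870 and 120 seconds, so the mean is 495 seconds. The same value is (max - min) / (n - 1), because the individual gaps sum to the total span.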
I think you want to find avg(timestamptz).
My solution is avg(current - min value). Since the result is an interval, add it to the min value again.
SELECT avg(target_col - (select min(target_col) from your_table))
+ (select min(target_col) from your_table)
FROM your_table
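It may help to see what this query returns for the sample timestamps. The Python sketch below mirrors the expression (average offset from the minimum, added back to the minimum); it yields the mean timestamp itself, which is a different quantity from the mean gap between consecutive rows.

```python
from datetime import datetime, timedelta

stamps = [
    datetime(2011, 3, 17, 10, 31, 19),
    datetime(2011, 3, 17, 10, 45, 49),
    datetime(2011, 3, 17, 10, 47, 49),
]

base = min(stamps)
# avg(target_col - min) is an interval; adding min back gives a timestamp.
offsets = [s - base for s in stamps]
mean_timestamp = base + sum(offsets, timedelta()) / len(offsets)
print(mean_timestamp)
```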
If you cannot upgrade to a version of PG that supports window functions, you may compute your table's sequential steps "the slow way."
Assuming your table is "tbl" and your timestamp column is "ts":
SELECT AVG(t1 - t0)
FROM (
    -- All this silliness would be moot if we could use
    -- `` lead(ts) over (order by ts) ''
    SELECT tbl.ts  AS t0,
           next.ts AS t1
    FROM tbl
    CROSS JOIN tbl next
    WHERE next.ts = (
        SELECT MIN(ts)
        FROM tbl subquery
        WHERE subquery.ts > tbl.ts
    )
) derived;
But don't do that. Its performance will be terrible. Please do what a_horse_with_no_name suggests, and use window functions.

MySQL query - ORDER BY

Can I do this?:
SELECT * FROM calendar ORDER BY ((month * 31) + day)
Yes (assuming that you have numeric columns in your table called month and day)
Wouldn't you want
SELECT * FROM calendar ORDER BY ((`month` * 31) + `day`)
though?
Edit
Actually just use
SELECT * FROM calendar ORDER BY `month`, `day`
Unless I'm missing something?
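A quick way to convince yourself that the two orderings agree is to compare them over every month and day combination. This is a Python sketch rather than MySQL, and it assumes month is 1 to 12 and day is 1 to 31.

```python
from itertools import product

# Every (month, day) pair with day in 1..31; impossible dates do not matter here.
pairs = list(product(range(1, 13), range(1, 32)))

by_key = sorted(pairs, key=lambda p: p[0] * 31 + p[1])
by_columns = sorted(pairs)  # tuple comparison: month first, then day
print(by_key == by_columns)
```

Since day never exceeds 31, the computed key is unique per pair and increases with month first, so sorting on the two columns directly gives the same order.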
You can order by any column or expression in your query result. If you can SELECT a value computed on the fly, then you can ORDER BY it.
If you have columns in your table called month and day, then you should make a few changes to your query and it will work:
SELECT * FROM calendar ORDER BY ((month * 100) + day)
I removed the quotes from your column names and increased your month multiplier by one order of magnitude in order to ensure correct sorting on month and day numbers greater than or equal to 10. Also, I corrected a typo in the spelling of ORDER.
BUT
Your query will not sort correctly by year (not sure if this matters in your use case).
Your query will execute slowly on large data sets because the ORDER BY expression has to be computed for every row and cannot use an index. Consider using a single date column and creating an index on it.