I would like to create a dynamic column in Redshift whose value increases by 1 each month. Basically, it should hold the number of months elapsed since a specific date, let's say 1 Jan 2020. So for the current month it should be 23, next month 24, and so on. Is it possible to replace what I currently have hard-coded in a WITH statement? The counter stops at 12, and I would have to extend it manually every month.
with months as (
select 1 as mon union all select 2 union all select 3 union all select 4 union all
select 5 as mon union all select 6 union all select 7 union all select 8 union all
select 9 as mon union all select 10 union all select 11 union all select 12
),
I think you should use the DATEDIFF function, as it gives you the difference in months between two dates. Simply put in the dates you want: https://docs.aws.amazon.com/redshift/latest/dg/r_DATEDIFF_function.html
Example:
select datediff(mon,'2020-01-01',current_date) as mon_diff
Depending on the size of your table, you may want to save the query as a view, so that every time you run it you get the correct difference.
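To make the arithmetic behind the month difference concrete, here is a minimal Python sketch. It assumes the count works on month boundaries crossed, so the day of the month is ignored; the function name month_distance is hypothetical and is not part of Redshift.

```python
from datetime import date

def month_distance(start: date, end: date) -> int:
    """Count month boundaries crossed between start and end; days are ignored."""
    return (end.year - start.year) * 12 + (end.month - start.month)

anchor = date(2020, 1, 1)

print(month_distance(anchor, date(2021, 12, 15)))  # 23
print(month_distance(anchor, date(2022, 1, 1)))    # 24
print(month_distance(anchor, date.today()))        # grows by 1 each month
```

Because the value is derived from the current date every time it is evaluated, nothing has to be incremented by hand.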
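Try this (note that Redshift requires column defaults to be variable-free expressions, so a default that references date_col may be rejected; in that case compute the value in a view or query instead):
Alter table tablename
Add New_column integer Default
datediff(mon,date_col, current_date);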
Or
With data as
(Select row_number() over (order by 1) rn from
table)
Select rn as mon
from data
where rn <= datediff(month, '2020-01-01', current_date);
Note: replace table with any table that has at least as many rows as the number of months you need; the WHERE clause then limits the result to the required range.
It is an incredibly common scenario where I want to select the row (or rows) that have either the maximum or minimum value for some column - often a datetime stamp of some kind. It would seem logical that a simple way to do this would be something like this:
SELECT *
FROM MyTable
WHERE DateColumn = MAX(DateColumn)
This, of course, is not allowed. Aggregate functions are not allowed in a WHERE clause (though I don't know why, exactly). One could use a HAVING clause, but this doesn't actually work either:
SELECT *
FROM MyTable
HAVING DateColumn = MAX(DateColumn)
Instead, the only solutions seem to be some variation of a subquery, something like this:
SELECT *
FROM MyTable
WHERE DateColumn = (
SELECT MAX(DateColumn)
FROM MyTable
)
Why is such a common need made so complicated? The intent of both of my examples above seems quite obvious, so why can't the SQL compiler be made to understand them? Or, if there is some technical reason why the existing implementation of WHERE cannot handle this syntax, why has no simple syntax been added to the language? I run into this particular need very frequently, and I see from searching online that I am hardly the only one. It would seem like the language should have accounted for this LONG ago, but it never has. Is there some serious technical or logical limitation I am missing here that makes this unrealistic?
It's SQL. It's set oriented. And it's only ordered if you deliberately state that - and how - you want your data ordered.
And there is, in every DBMS I know, a means to limit the number of rows returned.
In SQL Server, your query would need to be:
SELECT TOP(1)
*
FROM mytable
ORDER BY datecolumn DESC;
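The subquery form and the "order and limit" form are not quite interchangeable when several rows share the maximum value. The sketch below illustrates the difference using Python's built-in sqlite3 module, so it uses LIMIT 1 where SQL Server would use TOP(1); the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER, DateColumn TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "2022-01-01"), (2, "2022-01-02"), (3, "2022-01-02")],  # ids 2 and 3 tie
)

# Subquery form: returns every row that carries the maximum date.
all_latest = conn.execute(
    "SELECT id FROM MyTable "
    "WHERE DateColumn = (SELECT MAX(DateColumn) FROM MyTable) "
    "ORDER BY id"
).fetchall()

# Order-and-limit form: returns exactly one row, chosen arbitrarily among ties.
one_latest = conn.execute(
    "SELECT id FROM MyTable ORDER BY DateColumn DESC LIMIT 1"
).fetchall()

print(all_latest)
print(one_latest)
```

If ties matter, the subquery form (or, in SQL Server, TOP (1) WITH TIES) is the one that keeps them.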
And just because we can: let's assume you want the newest row for each ID. For that, decent DBMSs have OLAP functions like ROW_NUMBER() OVER() to help you with the task.
Input would be a table with a bunch of identifiers and several different dates per identifier. And you want the newest for each identifier. See here: ...
WITH
indata(id,dt) AS (
SELECT 1, DATE '2022-01-01'
UNION ALL SELECT 1, DATE '2022-01-02'
UNION ALL SELECT 2, DATE '2022-01-01'
UNION ALL SELECT 2, DATE '2022-01-02'
UNION ALL SELECT 3, DATE '2022-01-01'
UNION ALL SELECT 3, DATE '2022-01-02'
UNION ALL SELECT 4, DATE '2022-01-01'
UNION ALL SELECT 4, DATE '2022-01-02'
)
,
w_rownum AS (
SELECT
ROW_NUMBER() OVER(PARTITION BY id ORDER BY dt DESC) AS rn
, *
FROM indata
)
SELECT
id
, dt
FROM w_rownum
WHERE rn=1
ORDER BY id;
-- out id | dt
-- out ----+------------
-- out 1 | 2022-01-02
-- out 2 | 2022-01-02
-- out 3 | 2022-01-02
-- out 4 | 2022-01-02
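For readers who think more easily in procedural terms, the "newest row per id" logic can be sketched in plain Python. This is only an illustration of what the ROW_NUMBER() ... WHERE rn=1 filter selects, using the same sample data; it is not how a DBMS executes the query.

```python
from datetime import date

# Same sample data as the indata CTE above.
rows = [
    (1, date(2022, 1, 1)), (1, date(2022, 1, 2)),
    (2, date(2022, 1, 1)), (2, date(2022, 1, 2)),
    (3, date(2022, 1, 1)), (3, date(2022, 1, 2)),
    (4, date(2022, 1, 1)), (4, date(2022, 1, 2)),
]

# Keep, for each id, the latest date seen so far (the row that would get rn = 1).
newest = {}
for id_, dt in rows:
    if id_ not in newest or dt > newest[id_]:
        newest[id_] = dt

result = sorted(newest.items())  # ORDER BY id
for id_, dt in result:
    print(id_, dt)
```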
I need to get the cumulative sum of the Sales and Growth columns, starting from the second row.
Sample data:
select 1 AS SN,'16000' AS Sales,'0' AS Growth,'16000' AS RequiredTotal
INTO #tempa
union select 2,'','500','16500'
union select 3,'','500','17000'
union select 4,'','500','17500'
union select 5,'','500','18000'
union select 6,'','500','18500'
union select 7,'','500','19000'
SELECT *
FROM #tempa
Here I need to get the requiredtotal column.
The first value is the sales figure itself. Starting from the second row, each value is the previous row's RequiredTotal plus that row's Growth.
Use window functions:
select a.*,
(max(sales) over () +
sum(growth) over (order by sn)
) as required
from #tempa a;
Here is a db<>fiddle.
Note that I changed the data types in the fiddle so the numbers are actually numbers. Don't store numbers as strings.
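The same calculation can be checked outside the database. The sketch below assumes the values are stored as numbers and that only the first row has a non-zero Sales value, as in the sample data: it adds the running sum of Growth to the single Sales figure.

```python
from itertools import accumulate

# Sample data from the question, as numbers; rows are already ordered by SN.
sales = [16000, 0, 0, 0, 0, 0, 0]
growth = [0, 500, 500, 500, 500, 500, 500]

base = max(sales)                      # max(sales) over ()
running_growth = accumulate(growth)    # sum(growth) over (order by sn)
required = [base + g for g in running_growth]

print(required)
```

This yields 16000 for the first row and adds 500 on each following row, matching the RequiredTotal column in the question.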
I have a YearMonths table where every year requires 12 entries, one for each month, i.e.,
Year Month
2013 1
2013 2
...
2013 12
For each new year, I have to generate 12 new records. I know I can do this with a loop, but I'm trying to figure out a way to do it without one. I want to fill the table by selecting all of the years from a Years table and going from there, I'm just not sure how without using a loop.
Assuming that your YEARS table is something like this:
CREATE TABLE YEARS(Year INT)
And your YearMonths table is something like this:
CREATE TABLE YearMonths(Year INT, Month Int)
You can do something like this:
WITH CTE AS (
SELECT 1 AS Mnth
UNION ALL
SELECT Mnth + 1 FROM CTE
WHERE Mnth < 12)
INSERT INTO YearMonths (Year, Month)
SELECT Year, Mnth FROM YEARS CROSS APPLY CTE
ORDER BY Year, Mnth
This approach uses a recursive Common Table Expression (available since SQL Server 2005) to build the list of integers 1 to 12 and then cross applies it to the Years table to build the final list.
Demo: http://sqlfiddle.com/#!3/bbb0f/1
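In case the fiddle is unavailable, the same idea can be tried locally. This is a sketch using Python's built-in sqlite3 module rather than SQL Server, so it uses WITH RECURSIVE and CROSS JOIN in place of CROSS APPLY; the two sample years are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Years (Year INTEGER)")
conn.execute("CREATE TABLE YearMonths (Year INTEGER, Month INTEGER)")
conn.executemany("INSERT INTO Years VALUES (?)", [(2013,), (2014,)])

# Build the numbers 1..12 recursively, then pair every year with every month.
conn.execute("""
    WITH RECURSIVE months(Mnth) AS (
        SELECT 1
        UNION ALL
        SELECT Mnth + 1 FROM months WHERE Mnth < 12
    )
    INSERT INTO YearMonths (Year, Month)
    SELECT Year, Mnth FROM Years CROSS JOIN months
""")

rows = conn.execute(
    "SELECT Year, Month FROM YearMonths ORDER BY Year, Month"
).fetchall()
print(len(rows), rows[0], rows[-1])
```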
I have a query that has three prompts: Department, From Date, and To Date. The user must select the department ID but has the option to select a date range. How can I make the date range optional? I was thinking of using the DECODE function, but I'm not sure how to write it so that the two date prompts can be left blank.
If you are using a stored procedure you can do something like this in your select statement:
select *
from table
where (field > inDateStart and field < inDateEnd) or
(inDateStart is null and inDateEnd is null)
or using coalesce
select *
from table
where field >= coalesce(inDateStart,field) and
field <= coalesce(inDateEnd,field)
It really depends on your particular situation. Some queries lend themselves to the first, some to the second.
Assuming an unspecified date input comes across as NULL, you can do this little trick:
with
TheTable as
(select 1 dept, sysdate dt from dual
union
select 2 dept, sysdate-63 dt from dual
union
select 3 dept, sysdate-95 dt from dual
)
select *
from thetable
where coalesce(:DateFrom,dt) <= dt
and coalesce(:DateTo,dt) >= dt
;
Need a bit more info on the nature of your data to consider dept as an input... Does the table store multiple dates per dept?
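A quick way to see how the COALESCE filter behaves with blank prompts is to run it against a small in-memory table. The sketch below uses Python's sqlite3 module instead of Oracle, stores dates as ISO strings, and uses made-up rows; the helper name depts is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thetable (dept INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO thetable VALUES (?, ?)",
    [(1, "2013-03-01"), (2, "2013-01-15"), (3, None)],  # dept 3 has no date
)

QUERY = """
    SELECT dept FROM thetable
    WHERE COALESCE(:date_from, dt) <= dt
      AND COALESCE(:date_to, dt) >= dt
    ORDER BY dept
"""

def depts(date_from=None, date_to=None):
    """Run the filter; a None parameter means the prompt was left blank."""
    params = {"date_from": date_from, "date_to": date_to}
    return [row[0] for row in conn.execute(QUERY, params)]

print(depts())                            # both prompts blank
print(depts(date_from="2013-02-01"))      # only a lower bound
print(depts(date_to="2013-02-01"))        # only an upper bound
```

One design point worth noting: with the COALESCE form, a row whose date is NULL never matches, even when both prompts are blank, because the comparison with NULL is not true. The explicit "is null" form from the other answer would return such rows.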
I am working on a historical conversion of data and was wondering if there's a more efficient way to accomplish a date increment.
I receive data from a source system with a Saturday date (1-7-13) and would like to push that data to fill all days of the previous week (1-6-13, 1-5-13, etc.).
So currently I am doing several unions:
insert into target
(date, name)
select date,name
from
(
SELECT date as date, name FROM SOURCE
UNION
SELECT date - 1 as date, name FROM SOURCE
UNION
SELECT date -2 as date, name FROM SOURCE
)
I only ask because it looks like close to 500 million records are going to be going through this SQL script. In case it matters, it is going to run in a BTEQ script in Teradata.
First, your code would be faster using union all rather than union. union removes duplicates, which does not seem to be needed in this case. If you do need them removed, then do it at the source level:
from (select distinct date, name from source)
Rather than doing it implicitly with union.
You can also try a cross join approach:
select date - i, name
from source cross join
(select 0 as i union all select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6
) const
This might be a bit faster, because it doesn't need to set up the reads to the table multiple times.
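The effect of the cross join is easy to picture in a few lines of Python: every source row is paired with each offset 0 to 6, giving seven dated copies per row. This is only a sketch of the row expansion, with made-up source rows, and says nothing about Teradata performance.

```python
from datetime import date, timedelta

# Hypothetical source rows: (date, name)
source = [
    (date(2013, 1, 7), "alpha"),
    (date(2013, 1, 14), "beta"),
]

offsets = range(7)  # the constant table: 0, 1, ..., 6

# Equivalent of: select date - i, name from source cross join const
expanded = [
    (d - timedelta(days=i), name)
    for d, name in source
    for i in offsets
]

for row in expanded:
    print(row)
```

The source list is read once and each row fans out into seven, which is the reason the cross join may need fewer table reads than seven separate UNION ALL branches.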
One option is to use a recursive query, but I don't think it would be much faster -- just perhaps easier to read:
WITH RECURSIVE recursiveCTE (date, name) AS (
SELECT date, name
FROM Source
UNION ALL
SELECT r.date-1, r.name
FROM recursiveCTE R
JOIN Source T ON R.name = T.name AND T.date < r.date+6
)
INSERT INTO Target (date,name)
SELECT date,name From recursiveCTE