Dynamic date queries in SQL

I have a big dataset and I want to reduce it so that Power BI can read it more easily. I need only the last 6 months of data, based on the date column FechaCarga in MyTable, which is refreshed daily and holds daily data.
Example:
select *
from Mytable
where FechaCarga between (
select max(FechaCarga)
from MyTable)
and
--THIS IS THE PART THAT IM MISSING, PROBABLY USING DATEADD.
I expect data between MaxDate - 6 months and today (MaxDate). Please help me.
Thanks in advance,
IC

Is this what you want?
select t.*
from (select t.*, max(fechacarga) over () as max_fechacarga
from mytable t
) t
where fechacarga > dateadd(month, -6, max_fechacarga);

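Like you said, just use DATEADD(). You can use current_date to get today's date where the DBMS supports it (SQL Server has traditionally used GETDATE() or CURRENT_TIMESTAMP instead). Note that BETWEEN expects the lower bound first:
select *
from Mytable
where
FechaCarga between
dateadd(month, -6, current_date)
and (select max(FechaCarga) from MyTable)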

The easiest way since you are always looking until the max date is:
select *
from Mytable
where FechaCarga >= dateadd(month, -6, (select max(FechaCarga) from MyTable))
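A minimal sketch of the same idea on sample data, assuming SQL Server 2008 or later. The table variable @MyTable, the Valor column, and the sample rows are hypothetical stand-ins for the real table. The cutoff is computed once into a variable and then used in a plain range predicate:

```sql
-- Hypothetical sample data; FechaCarga is assumed to be a date column.
DECLARE @MyTable TABLE (FechaCarga date, Valor int);

INSERT INTO @MyTable (FechaCarga, Valor)
VALUES ('2019-01-15', 1),
       ('2019-03-10', 2),
       ('2019-06-30', 3),
       ('2019-08-31', 4);

-- Latest load date and the cutoff six months before it.
DECLARE @MaxFecha date = (SELECT MAX(FechaCarga) FROM @MyTable);
DECLARE @Cutoff date = DATEADD(month, -6, @MaxFecha);

-- Keep only rows inside the six-month window ending at the latest load date.
SELECT FechaCarga, Valor
FROM @MyTable
WHERE FechaCarga >= @Cutoff
  AND FechaCarga <= @MaxFecha;
```

With these rows the latest date is 2019-08-31, so the cutoff falls at the end of February 2019 and the 2019-01-15 row is filtered out.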

Related

Selecting group of years from date field

I'm trying to get a list of years from a date field that's stored as an nvarchar. I'm thinking that a subquery to convert the date, followed by selecting the year, is the best way to go, but I'm having a hard time setting it up.
select datepart(yyyy,
(
SELECT convert(date,'21-02-12 6:10:00 PM',5) datenum
)
) as [year]
from SalesReport_AllDBs
group by datepart(yyyy, [datenum])
Any advice would be helpful to get this set up correctly
The subquery should go in your FROM clause:
SELECT datepart(yyyy, mydate) as datenum
FROM (SELECT convert(date, yourdatestringfield ,5) as myDate FROM SalesReport_AllDBs) as years
GROUP BY datepart(yyyy,mydate);
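Or in one query without a subquery, which is a lot nicer looking (the expression has to be repeated in the GROUP BY, since SQL Server does not accept a column alias there):
SELECT datepart(yyyy, convert(date, yourdatestringfield, 5)) as datenum
FROM SalesReport_AllDBs
GROUP BY datepart(yyyy, convert(date, yourdatestringfield, 5))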
You should really just fix the table to hold dates instead of strings though. This is just going to lead to some nightmare scenarios and a slow slow query.
select distinct year(cast([datenum] as date)) year
from SalesReport_AllDBs
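For illustration, here is a small self-contained sketch of the conversion, assuming the strings follow the dd-mm-yy layout that style 5 expects. The table variable @SalesReport, the column name datestring, and the sample values are hypothetical:

```sql
-- Hypothetical stand-in for SalesReport_AllDBs with dates stored as text.
DECLARE @SalesReport TABLE (datestring nvarchar(30));

INSERT INTO @SalesReport (datestring)
VALUES (N'21-02-12'), (N'05-11-12'), (N'14-07-13');

-- Style 5 parses dd-mm-yy; the expression is repeated in GROUP BY
-- because the alias is not visible there.
SELECT YEAR(CONVERT(date, datestring, 5)) AS [year]
FROM @SalesReport
GROUP BY YEAR(CONVERT(date, datestring, 5))
ORDER BY [year];
```

Each distinct year appears once in the result. If the real column also carries a time portion or inconsistent formats, the conversion may need a different style or some cleanup first.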

adding a row for missing data

Between a date range 2017-02-01 - 2017-02-10, I'm calculating a running balance.
There are days with missing data. How would I include these missing dates with the previous day's balance?
Example data:
We are missing data for 2017-02-04, 2017-02-05 and 2017-02-06. How would I add a row in the query with the previous balance?
The date range is a parameter, so it could change.
Can I use something like the lag function?
I would be inclined to use a recursive CTE and then fill in the values. Here is one approach using outer apply:
with dates as (
select mind as dte, mind, maxd
from (select min(date) as mind, max(date) as maxd from t) t
union all
select dateadd(day, 1, dte), mind, maxd
from dates
where dte < maxd
)
select d.dte, t.balance
from dates d outer apply
(select top 1 t.*
from t
where t.date <= d.dte
order by t.date desc
) t;
You can generate dates using tally table as below:
Declare @d1 date ='2017-02-01'
Declare @d2 date ='2017-02-10'
;with cte_dates as (
Select top (datediff(D, @d1, @d2)+1) Dates = Dateadd(day, Row_Number() over (order by (Select NULL))-1, @d1) from
master..spt_values s1, master..spt_values s2
)
Select * from cte_dates left join ....
Then left join to your table and compute the running total.
Adding to the date range & CTE solutions, I have created Date Dimension tables in numerous databases where I just left join to them.
There are free scripts online to create date dimension tables for SQL Server. I highly recommend them. Plus, it makes aggregation by other time periods much more efficient (e.g. Quarter, Months, Year, etc....)
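The approaches above can be combined into one runnable sketch: generate every date in the range, then pick the most recent balance on or before each date. The table variable @t, its column names, and the balances are hypothetical sample data, and the date list is built from master..spt_values as in the tally answer:

```sql
DECLARE @d1 date = '2017-02-01';
DECLARE @d2 date = '2017-02-10';

-- Hypothetical balances; 2017-02-04 through 2017-02-06 are missing.
DECLARE @t TABLE ([date] date, balance decimal(10, 2));
INSERT INTO @t ([date], balance)
VALUES ('2017-02-01', 100.00), ('2017-02-02', 120.00), ('2017-02-03', 90.00),
       ('2017-02-07', 150.00), ('2017-02-08', 160.00), ('2017-02-09', 140.00),
       ('2017-02-10', 155.00);

WITH cte_dates AS (
    -- One row per day between @d1 and @d2, inclusive.
    SELECT TOP (DATEDIFF(day, @d1, @d2) + 1)
           dte = DATEADD(day, CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) - 1, @d1)
    FROM master..spt_values s1
)
SELECT d.dte, b.balance
FROM cte_dates d
OUTER APPLY (
    -- Most recent balance on or before the generated date.
    SELECT TOP (1) t.balance
    FROM @t t
    WHERE t.[date] <= d.dte
    ORDER BY t.[date] DESC
) b
ORDER BY d.dte;
```

With this sample data, the rows for 2017-02-04, 2017-02-05 and 2017-02-06 carry forward the 2017-02-03 balance. A single reference to master..spt_values is enough for short ranges; longer ranges would need the cross join shown above or a date dimension table.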

Optimizing GROUP BY performance

Is there some way to GROUP BY a column that has been defined by an alias or that is the result of a calculation? I think the following code computes MyMonth twice: once in the SELECT list and again in the GROUP BY clause, which may be wasted work. A simple GROUP BY MyMonth is not accepted. Is it possible to force month([MyDate]) to be calculated only once?
Update of code. Aggregate function is added.
SELECT month([MyDate]) AS MyMonth, count([MyDate]) AS HowMany
FROM tableA
WHERE [MyDate] BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY month([MyDate])
ORDER BY MyMonth
Your real problem likely stems from calling MONTH(...) on every row. This prevents the optimizer from using an index to fulfill the count (it can use it for the WHERE clause, but this will still be many rows).
Instead, you should turn this into a range query, that the optimizer could use for comparisons against an index. First we build a simple range table:
WITH Months as (SELECT MONTH(d) AS month,
d AS monthStart, DATEADD(month, 1, d) AS monthEnd
FROM (VALUES(CAST('20140101' AS DATE))) t(d)
UNION ALL
SELECT MONTH(monthEnd),
monthEnd, DATEADD(month, 1, monthEnd)
FROM Months
WHERE monthEnd < CAST('20150101' AS DATE))
(if you have an existing calendar table, you can base your query on that, but sometimes a simple ad-hoc one works best)
Once we have the range-table, you can then use it to constrain and bucket your data, like so:
SELECT Months.month, COUNT(*)
FROM TableA
JOIN Months
ON TableA.MyDate >= Months.monthStart
AND TableA.MyDate < Months.monthEnd
GROUP BY Months.month
Note: The start of the date range was changed to 2014-01-01, as it seems strange that you'd only include one day from January, when aggregating months...
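Put together as a single statement, the range table and the bucketing query might look like the sketch below. The table variable @TableA and its rows are hypothetical sample data, and the column aliases MyMonth and HowMany follow the question:

```sql
-- Hypothetical sample data standing in for TableA.
DECLARE @TableA TABLE (MyDate date);
INSERT INTO @TableA (MyDate)
VALUES ('2014-01-05'), ('2014-01-20'), ('2014-03-02'), ('2014-12-31');

WITH Months AS (
    -- Anchor row: January 2014.
    SELECT MONTH(d) AS [month], d AS monthStart, DATEADD(month, 1, d) AS monthEnd
    FROM (VALUES (CAST('20140101' AS date))) t(d)
    UNION ALL
    -- Each recursive step advances the range by one month.
    SELECT MONTH(monthEnd), monthEnd, DATEADD(month, 1, monthEnd)
    FROM Months
    WHERE monthEnd < CAST('20150101' AS date)
)
SELECT m.[month] AS MyMonth, COUNT(*) AS HowMany
FROM @TableA a
JOIN Months m
  ON a.MyDate >= m.monthStart
 AND a.MyDate < m.monthEnd
GROUP BY m.[month]
ORDER BY m.[month];
```

Because this is an inner join, months with no rows are left out of the result; a LEFT JOIN from Months with COUNT(a.MyDate) would list them with a count of zero.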
No, you can't use column alias directly in the GROUP BY clause. Instead do a select in the from list, and use the result column in your group by.
select MyMonth, MAX(someothercolumn)
from
(
SELECT month([MyDate]) AS MyMonth,
someothercolumn
FROM tableA
WHERE [MyDate] BETWEEN '2014-01-31' AND '2014-12-31'
) AS x
GROUP BY MyMonth
ORDER BY MyMonth

Select "YYYY" component only from DateTime column

Using SQLCe, I have a column of DateTime type. I would like to filter just by year. Is it possible or should I store year separately, which seems to me redundant?
E.g. get distinct results of 2010,2011,2013.
Thanks
I think you have the DATEPART function (but not the YEAR function), so:
select DatePart(yyyy, <yourDateTime>)
or if that's for ordering, of course
order by DatePart(yyyy, <yourDatetime>)
EDIT
select max(InvoiceID)
from yourTable
where DatePart(yyyy, IssuedDate) = 2013
You can use the DATEPART function to return the year for that column:
SELECT DATEPART(yyyy, datetimecolumn) FROM YourTable
You can then filter on the same expression in a WHERE clause:
WHERE DATEPART(yyyy, datetimecolumn) = 2014
The usual way to do this is to use a range filter:
select *
from table
where datecolumn >= '2012/01/01' and datecolumn < '2013/01/01'
This has the benefit that any index you may have on datecolumn can be used.
Since the answer you accepted shows that you only care about one single year, your objection to this answer doesn't really apply.
select max(InvoiceID)
from table
where IssuedDate >= '2012/01/01' and IssuedDate < '2013/01/01'
will work just fine.

What is the fastest way to group a DateTime column by Date in T-SQL

I have an older sql 2005 box, and I need to do some summaries of a table with ~500m rows.
I have a datetime column in the table and I want to get just the date out of it for output and group by. I know there are a few ways to do this, but what is the absolute fastest?
Thanks
I suspect the fastest would be to:
SELECT
the_day = DATEADD(DAY, the_day, '19000101'),
the_count
FROM
(
SELECT
the_day = DATEDIFF(DAY, '19000101', [the_datetime_column]),
the_count = COUNT(*)
FROM dbo.the_table
WHERE ...
GROUP BY DATEDIFF(DAY, '19000101', [the_datetime_column])
) AS x;
But "fastest" is relative here, and it will depend largely on the indexes on the table, how you're filtering out rows, etc. You will want to test this against other typical date truncation methods, such as CONVERT(CHAR(8), [the_datetime_column], 112).
What you could consider - depending on whether this query is more important than write performance - is adding a persisted computed column with an index, or an indexed view, that would help this aggregation for you at write time instead of query time.
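As a rough sketch of the persisted computed column idea, the example below uses a hypothetical table dbo.the_table_demo in place of dbo.the_table. The column name the_day and the index name are assumptions made for illustration:

```sql
-- Hypothetical table standing in for dbo.the_table.
CREATE TABLE dbo.the_table_demo (
    id int IDENTITY(1, 1) PRIMARY KEY,
    the_datetime_column datetime NOT NULL,
    -- Integer 0 stands for the base date 1900-01-01; a string literal here
    -- may make the expression nondeterministic and block PERSISTED.
    the_day AS DATEADD(DAY, DATEDIFF(DAY, 0, the_datetime_column), 0) PERSISTED
);

-- Index the truncated date so the aggregation can scan a narrow index.
CREATE INDEX IX_the_table_demo_the_day ON dbo.the_table_demo (the_day);

-- The summary then groups on the stored value instead of recomputing it per row.
SELECT the_day, COUNT(*) AS the_count
FROM dbo.the_table_demo
GROUP BY the_day;
```

The truncation cost moves to insert and update time, so this trade-off only pays off when the summary query matters more than write throughput, as noted above.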
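I imagine you can get slightly better performance this way. Casting a datetime to int rounds to the nearest day, so subtract .5 first to truncate to the start of the day:
SELECT cast(cast([<yourdate>]-.5 as int) as datetime) as [yourdate], count(*) as count
FROM <yourtable>
GROUP BY cast([<yourdate>]-.5 as int)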
You can improve this once you upgrade to SQL Server 2008, which adds the date type:
SELECT cast([<yourdate>] as date) as [yourdate], count(*) as count
FROM <yourtable>
GROUP BY cast([<yourdate>] as date)