Is there a way to set all dates? - sql

I was wondering, in SQL/dbt is there a way to set all dates to be >= another date?
Say I have a 'createdat' date field and an 'updatedat' date field. I use them multiple times in my query (across multiple CTEs), along with other dates. I want to make sure all dates used are no later than the last day of last month (i.e. <= last_day(current_date()-30, month)).
Is there a way to set that in the beginning of the query?

This can definitely be done. You'll want to compare the greatest() of a number of columns with whatever date cut-off you want.
Effectively, it would be:
select *
from {{ ref('some_table') }}
where greatest(created_at,updated_at) < date_trunc('month', current_date)
You can obviously add as many columns to that query as you'd like.
N.B.: On some warehouses, greatest returns null if any of the columns in it are null. In that situation, you'll need to coalesce each date with some date placeholder, like '1970-01-01'.
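The null caveat is easy to check locally. Here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for the warehouse (an assumption; this is not dbt). SQLite has no greatest(), but its multi-argument max() plays the same role here and also returns NULL if any argument is NULL. The sample rows and the fixed cut-off date are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table some_table (id integer, created_at text, updated_at text)")
conn.executemany(
    "insert into some_table values (?, ?, ?)",
    [
        (1, "2024-01-10", "2024-02-05"),  # both dates before the cut-off
        (2, "2024-01-10", "2024-03-20"),  # updated_at is after the cut-off
        (3, "2024-01-15", None),          # never updated, so updated_at is NULL
    ],
)

# Hypothetical fixed cut-off standing in for date_trunc('month', current_date).
cutoff = "2024-03-01"

# Naive filter: max(...) is NULL for row 3, so the row silently disappears.
naive_ids = [
    row[0]
    for row in conn.execute(
        "select id from some_table "
        "where max(created_at, updated_at) < ? order by id",
        (cutoff,),
    )
]

# Null-safe filter: each date is coalesced to a placeholder before comparing.
safe_ids = [
    row[0]
    for row in conn.execute(
        "select id from some_table "
        "where max(coalesce(created_at, '1970-01-01'), "
        "          coalesce(updated_at, '1970-01-01')) < ? order by id",
        (cutoff,),
    )
]

print(naive_ids, safe_ids)
```

With the placeholder in place, the row with a missing updated_at is judged on its created_at alone instead of being dropped.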

Related

Is there a way to store multiple dates into a table with potential to grow?

I have a table like this in SQLITE3:
I need to query this table by ID|DOC_ID|TRANS_DOC_ID and most importantly by DATE because I need to get the data day by day. ex: TODAY|YESTERDAY|ETC
So far the query is easy, as I can just do this to get the rows by day:
SELECT * FROM CLIENTRECORD WHERE DATE = '2020-12-01'
The problem is when I need to display specific records on other dates:
For example, I have a row with DATE 2020-12-01, but I also want it displayed on 2020-01-01, or maybe 2020-01-02, and so on. What do I do in this situation? I thought about adding another column, DATES, holding a comma-separated list of dates, but I read that this is a bad solution. I also thought about adding a separate table just for dates, but since the dates aren't fixed (a row might have one date or maybe even ten), I am confused about what I am supposed to do.
The end goal is that a row may or may not contain more than 1 date, would look something like this if I want to query for the row with or without multiple dates:
SELECT * FROM CLIENTRECORD WHERE DATE = '2020-12-01' OR DATES LIKE '2020-12-01'
something similar to it.
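One way the separate-table idea mentioned above could look, sketched with Python's built-in sqlite3 module. The CLIENTRECORD_DATES table, its column names, and the sample rows are all hypothetical, and the real CLIENTRECORD table has more columns than shown here. A child table copes with a varying number of dates because each extra date is just one more row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table CLIENTRECORD (ID integer primary key, DATE text);

    -- Hypothetical child table: one row per extra display date.
    create table CLIENTRECORD_DATES (
        RECORD_ID integer references CLIENTRECORD(ID),
        DATE text,
        primary key (RECORD_ID, DATE)
    );

    insert into CLIENTRECORD values (1, '2020-12-01'), (2, '2020-12-02');
    insert into CLIENTRECORD_DATES values (1, '2021-01-01'), (1, '2021-01-02');
    """
)


def records_for(day):
    """Return the IDs of records whose own date or any extra date equals `day`."""
    rows = conn.execute(
        """
        select ID from CLIENTRECORD
        where DATE = :day
           or ID in (select RECORD_ID from CLIENTRECORD_DATES where DATE = :day)
        order by ID
        """,
        {"day": day},
    ).fetchall()
    return [row[0] for row in rows]


print(records_for("2020-12-01"), records_for("2021-01-01"))
```

A record with no extra dates simply has no rows in the child table, so the same query covers both cases.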

YYYY-MM column type in PostgreSQL

I need to store a value associated with a month and a user in a table, and I want to run queries on it. I don't know if there is a column data type for this kind of need. If not, should I:
Create a string field and build year-month concatenation (2017-01)
Create an int field and build year-month concatenation (201701)
Create two columns (one year and one month)
Create a date column at the beginning of the month (2017-01-01 00:00:00)
Something else?
The objective is to run queries like (pseudo-SQL):
SELECT val FROM t WHERE year_month = THIS_YEAR_MONTH and user_id='adc1-23...';
I would suggest not thinking too hard about the problem and just using the first date/time of the month. Postgres has plenty of date-specific functions -- from date_trunc() to age() to + interval -- to support dates.
You can readily convert them to the format you want, get the difference between two values, and so on.
If you phrase your query as:
where year_month = date_trunc('month', now()) and user_id = 'adc1-23...'
Then it can readily take advantage of an index on (user_id, year_month) or (year_month, user_id).
If you are interested in displaying values in YYYY-MM format, you can use to_char(your_datetime_column, 'YYYY-MM')
example:
SELECT to_char(now(),'YYYY-MM') as year_month
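For illustration, the "first day of the month" approach can be tried end to end in Python. This is only a sketch: sqlite3 stands in for Postgres, month_start() is a hypothetical helper that mirrors date_trunc('month', ...), strftime("%Y-%m") mirrors to_char(..., 'YYYY-MM'), and the user id, values, and fixed "today" are made up.

```python
import sqlite3
from datetime import date


def month_start(d):
    """Python analogue of date_trunc('month', d): the first day of d's month."""
    return d.replace(day=1)


conn = sqlite3.connect(":memory:")
conn.execute("create table t (user_id text, year_month text, val integer)")
# Composite index matching the lookup below.
conn.execute("create index t_user_month on t (user_id, year_month)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("user-1", "2017-01-01", 10),  # January stored as its first day
        ("user-1", "2017-02-01", 20),  # February stored as its first day
    ],
)

today = date(2017, 2, 14)  # fixed instead of date.today() so the result is stable
this_month = month_start(today)

row = conn.execute(
    "select val from t where year_month = ? and user_id = ?",
    (this_month.isoformat(), "user-1"),
).fetchone()
val = row[0]

# Display form, the counterpart of to_char(year_month, 'YYYY-MM').
label = this_month.strftime("%Y-%m")
print(val, label)
```

The stored value stays a real date, so range queries and date arithmetic keep working, and the YYYY-MM form is produced only when displaying.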

sqlalchemy select by date column only x newest days

Suppose I have a table MyTable with a column some_date (date type of course) and I want to select the newest 3 months of data (or x days).
What is the best way to achieve this?
Please notice that the date should not be measured from today but rather from the date range in the table (which might be older than today)
I need to find the maximum date and compare it to each row - if the difference is less than x days, return it.
All of this should be done with sqlalchemy and without loading the entire table.
What is the best way of doing it? Must I have a subquery to find the maximum date? How do I select the last X days?
Any help is appreciated.
EDIT:
The following query works in Oracle but seems inefficient (is max calculated for each row?) and I don't think that it'll work for all dialects:
select * from my_table where (select max(some_date) from my_table) - some_date < 10
You can do this in a single query and without resorting to creating datediff.
Here is an example I used for getting everything in the past day:
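from datetime import datetime, timedelta
one_day = timedelta(hours=24)
one_day_ago = datetime.now() - one_day
# Message is assumed to be a mapped model with a `created` datetime column
Message.query.filter(Message.created > one_day_ago).all()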
You can adapt the timedelta to whatever time range you are interested in.
UPDATE
Upon re-reading your question, it looks like I failed to take into account that you want to compare two dates which are both in the database, rather than comparing against today's date. I'm pretty sure that this sort of behavior is going to be database specific. In Postgres, you can use straightforward arithmetic.
Operations with DATEs
1. The difference between two DATES is always an INTEGER, representing the number of DAYS difference
DATE '1999-12-30' - DATE '1999-12-11' = INTEGER 19
2. You may add or subtract an INTEGER to a DATE to produce another DATE
DATE '1999-12-11' + INTEGER 19 = DATE '1999-12-30'
You're probably using timestamps if you are storing dates in Postgres. Doing math with timestamps produces an interval object. SQLAlchemy works with timedeltas as a representation of intervals. So you could do something like:
from datetime import timedelta
one_day = timedelta(hours=24)
Model.query.join(ModelB, Model.created - ModelB.created < one_day)
I haven't tested this exactly, but I've done things like this and they have worked.
I ended up doing two selects: one to get the max date and another to get the data.
Using the datediff recipe from this thread, I added a datediff function and used the query q = session.query(MyTable).filter(datediff(max_date, some_date) < 10)
I still don't think this is the best way, but until someone proves me wrong, it will have to do...
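The two-select idea can also be written without any dialect-specific datediff by computing the cut-off in Python. The sketch below is an assumption-laden illustration, not SQLAlchemy code: it uses the standard-library sqlite3 module, stores dates as ISO strings, and newest_rows() is a hypothetical helper. The same two statements could be issued through a SQLAlchemy session.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (id integer, some_date text)")
conn.executemany(
    "insert into my_table values (?, ?)",
    [
        (1, "2019-01-01"),
        (2, "2019-03-25"),
        (3, "2019-03-30"),
        (4, "2019-04-01"),
    ],
)


def newest_rows(conn, days):
    """IDs of rows whose some_date is less than `days` days before the newest date."""
    # First select: the newest date in the table (NULL if the table is empty).
    (max_date,) = conn.execute("select max(some_date) from my_table").fetchone()
    if max_date is None:
        return []
    # max_date - some_date < days  is the same as  some_date > max_date - days
    cutoff = date.fromisoformat(max_date) - timedelta(days=days)
    # Second select: a plain comparison against a constant.
    rows = conn.execute(
        "select id from my_table where some_date > ? order by id",
        (cutoff.isoformat(),),
    ).fetchall()
    return [row[0] for row in rows]


print(newest_rows(conn, 10))
```

Because the second statement compares the column to a constant, it should port across dialects and can use an index on some_date.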

Retrieving how many transactions were made on a date in SQL?

I have a table named Sales and a column within it named Date. I'm simply trying to find how many sales were made on a specific date. My intuition was to use something like this:
SELECT COUNT(Date) FROM Sales WHERE Date='2015-04-04'
This should count all sales that were made on that date, but it returns 0. What am I doing wrong?
While it is difficult to be precise without table definitions or an indication of what RDBMS you are using, it is likely that Date is a time/date stamp, and that the result you want would be obtained either by looking for a range from the beginning of the day to the end of the day in your WHERE clause, or by truncating Date down to a date without the time before comparing it to a date.
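The range approach described above can be sketched as follows. This is an illustration under assumptions: Python's sqlite3 module stands in for the unknown RDBMS, the Date column holds ISO-formatted date-time strings, and sales_on() and the sample rows are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("create table Sales (id integer, Date text)")
conn.executemany(
    "insert into Sales values (?, ?)",
    [
        (1, "2015-04-03 23:59:59"),
        (2, "2015-04-04 00:00:00"),
        (3, "2015-04-04 17:45:10"),
        (4, "2015-04-05 00:00:00"),
    ],
)

# Equality against a bare date matches nothing, because every value has a time part.
exact_matches = conn.execute(
    "select count(*) from Sales where Date = '2015-04-04'"
).fetchone()[0]


def sales_on(day):
    """Count sales from the start of `day` up to, but not including, the next day."""
    start = day.isoformat()
    end = (day + timedelta(days=1)).isoformat()
    (count,) = conn.execute(
        "select count(*) from Sales where Date >= ? and Date < ?",
        (start, end),
    ).fetchone()
    return count


print(exact_matches, sales_on(date(2015, 4, 4)))
```

The half-open range (>= start, < next day) avoids having to guess the last representable instant of the day.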
Try the following:
select count(*) from <t.n> where date like '2015-04-04%';
When you want to find the count of rows based on a field (Date), you need to GROUP BY it, like this:
SELECT Date, COUNT(*)
FROM Sales
GROUP BY Date
Now you have the count of rows for each Date.
The type and value of Date matter for the result of the above query.
For example, in SQL Server one option is to convert a DateTime field to varchar and compare against the result of CONVERT, like this:
SELECT COUNT(*)
FROM Sales
WHERE CONVERT(VARCHAR, Date, 111) = '2015/04/04'

PostgreSQL - GROUP BY timestamp values?

I've got a table with purchase orders stored in it. Each row has a timestamp indicating when the order was placed. I'd like to be able to create a report indicating the number of purchases each day, month, or year. I figured I would do a simple SELECT COUNT(xxx) FROM tbl_orders GROUP BY tbl_orders.purchase_time and get the value, but it turns out I can't GROUP BY a timestamp column.
Is there another way to accomplish this? I'd ideally like a flexible solution so I could use whatever timeframe I needed (hourly, monthly, weekly, etc.) Thanks for any suggestions you can give!
This does the trick without the date_trunc function (easier to read).
-- 2014
select created_on::DATE from users group by created_on::DATE
-- updated September 2018 (thanks to #wegry)
select created_on::DATE as co from users group by co
What we're doing here is casting the original value to a DATE, which makes the time portion of the value irrelevant for grouping.
Grouping by a timestamp column works fine for me here, keeping in mind that even a 1-microsecond difference will prevent two rows from being grouped together.
To group by larger time periods, group by an expression on the timestamp column that returns an appropriately truncated value. date_trunc can be useful here, as can to_char.
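As a rough illustration of grouping by a truncated value, here is a sketch in Python with the standard-library sqlite3 module standing in for PostgreSQL (an assumption). SQLite's date() and strftime() play the role that a ::DATE cast, date_trunc, or to_char would play in Postgres, and the sample orders are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl_orders (id integer, purchase_time text)")
conn.executemany(
    "insert into tbl_orders values (?, ?)",
    [
        (1, "2014-05-01 09:00:00"),
        (2, "2014-05-01 18:30:00"),
        (3, "2014-05-02 11:15:00"),
        (4, "2014-06-10 08:00:00"),
    ],
)

# Per day: drop the time part before grouping.
per_day = conn.execute(
    "select date(purchase_time) as d, count(*) "
    "from tbl_orders group by d order by d"
).fetchall()

# Per month: truncate further by formatting as YYYY-MM.
per_month = conn.execute(
    "select strftime('%Y-%m', purchase_time) as m, count(*) "
    "from tbl_orders group by m order by m"
).fetchall()

print(per_day)
print(per_month)
```

Changing the timeframe only means changing the grouping expression, so hourly or yearly reports follow the same pattern.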