Table Folder
Column | Type | Modifiers
-------------------+--------------------------+---------------------------------
ID | integer | not null default
Name | character varying | not null
Size | bigint | not null
Timestamp | timestamp with time zone |
I am attempting to get a count of all files uploaded in 2014, as well as a monthly count for that same year.
SELECT COUNT(*) FROM "File" WHERE "Timestamp" > '2014-01-01 21:53:23+08'
SELECT TO_CHAR("Timestamp", 'Mon') AS month,
       COUNT(*) AS fileCount
FROM "File"
WHERE EXTRACT(YEAR FROM "Timestamp") = 2014
GROUP BY TO_CHAR("Timestamp", 'Mon')
If you wanted a report which shows a monthly breakdown across multiple years, then you can slightly modify the above query:
SELECT TO_CHAR("Timestamp", 'Mon') AS month,
       EXTRACT(YEAR FROM "Timestamp") AS year,
       COUNT(*) AS fileCount
FROM "File"
WHERE EXTRACT(YEAR FROM "Timestamp") IN (2014, 2015, ...)
GROUP BY TO_CHAR("Timestamp", 'Mon'),
         EXTRACT(YEAR FROM "Timestamp")
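Note that grouping on TO_CHAR(..., 'Mon') alone returns the month labels in arbitrary order. A sketch that also orders the output chronologically (same table and column names as above, with 2014 and 2015 as example years) might be:
SELECT TO_CHAR("Timestamp", 'Mon') AS month,
       EXTRACT(YEAR FROM "Timestamp") AS year,
       COUNT(*) AS fileCount
FROM "File"
WHERE EXTRACT(YEAR FROM "Timestamp") IN (2014, 2015)
GROUP BY TO_CHAR("Timestamp", 'Mon'),
         EXTRACT(YEAR FROM "Timestamp"),
         EXTRACT(MONTH FROM "Timestamp")
ORDER BY EXTRACT(YEAR FROM "Timestamp"),
         EXTRACT(MONTH FROM "Timestamp");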
Seeing your own query, it seems you want to consider the timestamps of 2014 as seen from timezone GMT+8 (Singapore, Hong Kong or wherever) and count accordingly. I think the query should hence be:
SELECT EXTRACT(MONTH FROM "Timestamp" AT TIME ZONE 'GMT-8'), COUNT(*)
FROM "File"
WHERE "Timestamp" >= timestamp with time zone '2014-01-01 00:00:00+08'
  AND "Timestamp" < timestamp with time zone '2015-01-01 00:00:00+08'
GROUP BY EXTRACT(MONTH FROM "Timestamp" AT TIME ZONE 'GMT-8')
ORDER BY EXTRACT(MONTH FROM "Timestamp" AT TIME ZONE 'GMT-8');
As I don't use any function on "Timestamp" in the WHERE clause, an index is more likely to be used to speed up the query (provided, of course, there is an index on the "Timestamp" column).
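For reference, a minimal sketch of such an index (the index name here is only illustrative):
CREATE INDEX file_timestamp_idx ON "File" ("Timestamp");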
But I must admit time zones always confuse me: why is it '+08' (plus sign) when I refer to Hong Kong in datetime literals, but 'GMT-8' (minus sign) when I want to convert to Hong Kong time? It should be correct, though.
Related
I have this table timestamp_table and I'm using Presto SQL
timestamp | id
2021-01-01 10:00:00 | 2456
I would like to compute the number of unique IDs in the last 24 and 48 hours. I thought this could be achieved with window functions, but I'm struggling. This is my proposed solution, but it needs work:
SELECT COUNT(id) OVER (PARTITION BY timestamp ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
You're probably having trouble due to the PARTITION BY clause, since the COUNT will then only apply to rows with the same timestamp value.
Try something like this, as a starting point:
SELECT *
, COUNT(id) OVER (ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
, MIN(id) OVER (ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
FROM tbl
;
I think you can't get data for both time intervals with one table scan, because a row that is in the last 24 hours must be counted in both groups (24 hours and 48 hours). So you must run two queries or union them.
select 'h24', count(distinct id)
from timestamp_table
where timestamp < current_timestamp and timestamp >= date_add('day', -1, current_timestamp)
union all
select 'h48', count(distinct id)
from timestamp_table
where timestamp < current_timestamp and timestamp >= date_add('day', -2, current_timestamp)
It's been a while since I've touched SQL.
I'm working on a pretty large database.
In a certain table with some 30 million rows, I'm trying to figure out when the highest number of entries was made within a certain period (e.g. a year), down to the detail level of one hour.
What I do now is something like this:
For the year 2018:
Find month with highest entry number for 2018 (i.e. 12 queries):
select count(*) from sing
where to_char(create_time, 'YYYY-MM-DD') like '2018-01-%'
select count(*) from sing
where to_char(create_time, 'YYYY-MM-DD') like '2018-02-%'
After I find the month with the highest number I must find the day (i.e. up to 31 queries) :
select count(*) from sing
where to_char(create_time, 'YYYY-MM-DD') = '2018-01-01'
select count(*) from sing
where to_char(create_time, 'YYYY-MM-DD') = '2018-01-02'
After I find the day with the highest number I must find the hour (i.e. 24 queries):
select count(*) from sing
where to_char(create_time, 'YYYY-MM-DD HH24:MI:SS') >= '2018-01-02 08:00:00'
and to_char(create_time, 'YYYY-MM-DD HH24:MI:SS') <= '2018-01-02 08:59:59'
As you can see, this is a tedious task. So my question is if and how I can optimize this process.
The database is PostgreSQL, and I'm using pgAdmin.
Thanks in advance.
You can use GROUP BY and the date_part function to simplify things:
SELECT date_part('month', create_time), count(*)
FROM sing
WHERE date_part('year', create_time) = 2018
GROUP BY date_part('month', create_time)
and then for the day
SELECT date_part('day', create_time), count(*)
FROM sing
WHERE date_part('year', create_time) = 2018
AND date_part('month', create_time) = <month from previous query>
GROUP BY date_part('day', create_time)
and so on
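If the end goal is only the single busiest hour of a year, a sketch that skips the month/day drill-down entirely (using the sing table and create_time column from the question) could be:
SELECT date_trunc('hour', create_time) AS hour_bucket,
       count(*) AS entries
FROM sing
WHERE create_time >= '2018-01-01' AND create_time < '2019-01-01'
GROUP BY hour_bucket
ORDER BY entries DESC
LIMIT 1;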
For the year 2018 it would be one query:
select count(*) from sing where date_part('year', create_time) = '2018'
So I think date_part works better here than to_char.
https://www.w3resource.com/PostgreSQL/date_part-function.php
I am trying to select a specific time range within a specific range of days in Postgres SQL. Please see the code below, which gives an error on '10:00:00'.
The data type of each column is:
numeric for "balance",
character varying(255) for "currency",
timestamp without time zone for "created_at" (ex: 2018-03-20 00:00:00).
I tried this link without success.
MySQL select based on daily timestamp range
SELECT SUM(bl.balance) AS balance, bl.currency, bl.created_at
FROM balance_logs bl
WHERE bl.balance_scope = 'system' AND
created_at >= CURRENT_DATE - 2 AND
created_at < CURRENT_DATE AND
created_at BETWEEN '10:00:00' AND '11:00:00'
GROUP BY bl.currency, bl.created_at
ORDER BY created_at DESC
The comparison needs to be as a time:
SELECT SUM(bl.balance) AS balance, bl.currency, bl.created_at
FROM balance_logs bl
WHERE bl.balance_scope = 'system' AND
created_at >= CURRENT_DATE - 2 AND
created_at < CURRENT_DATE AND
created_at::time BETWEEN '10:00:00'::time AND '11:00:00'::time
GROUP BY bl.currency, bl.created_at
ORDER BY created_at DESC;
However, I think it is better to write the WHERE condition as:
extract(hour from created_at) = 10
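Plugged into the full query, that suggestion would look roughly like this (note it matches 10:00:00 through 10:59:59, whereas the BETWEEN version also includes exactly 11:00:00):
SELECT SUM(bl.balance) AS balance, bl.currency, bl.created_at
FROM balance_logs bl
WHERE bl.balance_scope = 'system' AND
      created_at >= CURRENT_DATE - 2 AND
      created_at < CURRENT_DATE AND
      extract(hour from created_at) = 10
GROUP BY bl.currency, bl.created_at
ORDER BY created_at DESC;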
I have the following database table on a Postgres server:
id     | date       | Product | Sales
-------+------------+---------+-------
1245   | 01/04/2013 | Toys    | 1000
1245   | 01/04/2013 | Toys    | 2000
1231   | 01/02/2013 | Bicycle | 50000
456461 | 01/01/2014 | Bananas | 4546
I would like to create a query that gives the SUM of the Sales column and groups the results by month and year as follows:
Apr 2013 3000 Toys
Feb 2013 50000 Bicycle
Jan 2014 4546 Bananas
Is there a simple way to do that?
I can't believe the accepted answer has so many upvotes -- it's a horrible method.
Here's the correct way to do it, with date_trunc:
SELECT date_trunc('month', txn_date) AS txn_month, sum(amount) as monthly_sum
FROM yourtable
GROUP BY txn_month
It's bad practice but you might be forgiven if you use
GROUP BY 1
in a very simple query.
You can also use
GROUP BY date_trunc('month', txn_date)
if you don't want to select the date.
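If you also want the month rendered like "Apr 2013", as in the question, one possible sketch (still assuming the txn_date and amount names used above) is:
SELECT to_char(date_trunc('month', txn_date), 'Mon YYYY') AS txn_month,
       sum(amount) AS monthly_sum
FROM yourtable
GROUP BY date_trunc('month', txn_date)
ORDER BY date_trunc('month', txn_date);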
select to_char(date,'Mon') as mon,
extract(year from date) as yyyy,
sum("Sales") as "Sales"
from yourtable
group by 1,2
At the request of Radu, I will explain that query:
to_char(date,'Mon') as mon, : converts the "date" attribute into the defined format of the short form of month.
extract(year from date) as yyyy : Postgresql's "extract" function is used to extract the YYYY year from the "date" attribute.
sum("Sales") as "Sales" : The SUM() function adds up all the "Sales" values, and supplies a case-sensitive alias, with the case sensitivity maintained by using double-quotes.
group by 1,2 : The GROUP BY clause must contain all columns from the SELECT list that are not part of an aggregate (i.e., all columns not inside SUM/AVG/MIN/MAX etc. functions). It tells the query that the SUM() should be applied for each unique combination of columns, which in this case are the month and year columns. The "1,2" part is shorthand for the column positions instead of the column aliases, though it is probably best to use the full "to_char(...)" and "extract(...)" expressions for readability.
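Written out with the full expressions instead of the ordinals, the same query would read:
select to_char(date, 'Mon') as mon,
       extract(year from date) as yyyy,
       sum("Sales") as "Sales"
from yourtable
group by to_char(date, 'Mon'), extract(year from date);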
to_char actually lets you pull out the Year and month in one fell swoop!
select to_char(date('2014-05-10'),'Mon-YY') as year_month; --'May-14'
select to_char(date('2014-05-10'),'YYYY-MM') as year_month; --'2014-05'
or in the case of the user's example above:
select to_char(date,'YY-Mon') as year_month,
       sum("Sales") as "Sales"
from some_table
group by 1;
There is another way to achieve the result using the date_part() function in postgres.
SELECT date_part('month', txn_date) AS txn_month, date_part('year', txn_date) AS txn_year, sum(amount) as monthly_sum
FROM yourtable
GROUP BY date_part('month', txn_date), date_part('year', txn_date)
Thanks
Why not just use the date_part function? https://www.postgresql.org/docs/8.0/functions-datetime.html
SELECT date_part('year', txn_date) AS txn_year,
date_part('month', txn_date) AS txn_month,
sum(amount) as monthly_sum
FROM payment
GROUP BY txn_year, txn_month
order by txn_year;
Take a look at example 6) of this tutorial -> https://www.postgresqltutorial.com/postgresql-group-by/
You need to repeat the function call in your GROUP BY instead of referring to the alias (the virtual attribute) you created in the SELECT. I was doing what all the answers above recommended and I was getting a column "year_month" does not exist error.
What worked for me was:
SELECT
    to_char(date_trunc('month', created_at), 'MM/YYYY') AS month
FROM
    "orders"
GROUP BY
    date_trunc('month', created_at)
Postgres has a few types of timestamps:
timestamp without time zone - (preferable for storing UTC timestamps) You find it in multinational database storage; the client in this case takes care of the timezone offset for each country.
timestamp with time zone - the timezone offset is already included in the timestamp.
In some cases, your database does not use the time zone, but you still need to group records with respect to the local time zone and Daylight Saving Time (e.g. https://www.timeanddate.com/time/zone/romania/bucharest).
To add the time zone you can use this example and replace the time zone offset with yours:
"your_date_column" at time zone '+03'
To add the +1 summer-time offset specific to DST you need to check whether your timestamp falls into summer DST. As those intervals vary by 1 or 2 days from year to year, I will use an approximation that does not affect end-of-month records, so in this case I can ignore each year's exact interval.
If a more precise query has to be built, then you have to add conditions to create more cases. But roughly, this will work fine for splitting data per month with respect to time zone and summer time when your database stores timestamp without time zone:
SELECT
    t."Product",
    SUM(t."Sale") AS "Sale",
    date_trunc('month',
        CASE WHEN
            ( Extract(month from t."date") > 3
              OR ( Extract(month from t."date") = 3
                   AND Extract(day from t."date") > 26
                   AND Extract(hour from t."date") >= 3 ) )
            AND
            ( Extract(month from t."date") < 10
              OR ( Extract(month from t."date") = 10
                   AND Extract(day from t."date") < 29
                   AND Extract(hour from t."date") < 4 ) )
        THEN
            t."date" at time zone '+03' -- Romania TimeZone offset + DST
        ELSE
            t."date" at time zone '+02' -- Romania TimeZone offset
        END) AS "date"
FROM
    public."Table" AS t
WHERE 1=1
    AND t."date" >= '01/07/2015 00:00:00'::TIMESTAMP WITHOUT TIME ZONE
    AND t."date" < '01/07/2017 00:00:00'::TIMESTAMP WITHOUT TIME ZONE
GROUP BY
    t."Product",
    date_trunc('month',
        CASE WHEN
            ( Extract(month from t."date") > 3
              OR ( Extract(month from t."date") = 3
                   AND Extract(day from t."date") > 26
                   AND Extract(hour from t."date") >= 3 ) )
            AND
            ( Extract(month from t."date") < 10
              OR ( Extract(month from t."date") = 10
                   AND Extract(day from t."date") < 29
                   AND Extract(hour from t."date") < 4 ) )
        THEN
            t."date" at time zone '+03' -- Romania TimeZone offset + DST
        ELSE
            t."date" at time zone '+02' -- Romania TimeZone offset
        END)
I also needed to find results grouped by year and month.
When I grouped them by the raw timestamp, the sum function grouped them down to dates and minutes, but that wasn't what I wanted.
Using this query may be helpful for you:
select sum(sum),
concat(year, '-', month, '-', '01')::timestamp
from (select sum(t.final_price) as sum,
extract(year from t.created_at) as year,
extract(month from t.created_at) as month
from transactions t
where status = 'SUCCESS'
group by t.created_at) t
group by year, month;
transactions table
query result
As you can see in the picture, for '2022-07-01' there are two rows in the table, and in the query result they are grouped together.
I want to select SQL like:
SELECT "year-month" from table group by "year-month" AND order by date, where
year-month is the date formatted as "1978-01", "1923-12".
select to_char of course works, but not in the "right" order:
to_char(timestamp_column, 'YYYY-MM')
to_char(timestamp, 'YYYY-MM')
You say that the order is not "right", but I cannot see why it is wrong (at least until year 10000 comes around).
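For completeness, a sketch of the whole query from the question (the table name here is a placeholder, timestamp_column is taken from the question):
SELECT to_char(timestamp_column, 'YYYY-MM') AS year_month,
       count(*)
FROM your_table
GROUP BY to_char(timestamp_column, 'YYYY-MM')
ORDER BY year_month;  -- for 'YYYY-MM', lexical order and chronological order coincide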
date_part(text, timestamp)
e.g.
date_part('month', timestamp '2001-02-16 20:38:40'),
date_part('year', timestamp '2001-02-16 20:38:40')
http://www.postgresql.org/docs/8.0/interactive/functions-datetime.html
Use the date_trunc method to truncate off the day (or whatever else you want, e.g., week, year, day, etc..)
Example of grouping sales from orders by month:
select
SUM(amount) as sales,
date_trunc('month', created_at) as date
from orders
group by date
order by date DESC;
You can truncate all information after the month using date_trunc(text, timestamp):
select date_trunc('month',created_at)::date as date
from orders
order by date DESC;
Example:
Input:
created_at = '2019-12-16 18:28:13'
Output 1:
date_trunc('day',created_at)
// 2019-12-16 00:00:00
Output 2:
date_trunc('day',created_at)::date
// 2019-12-16
Output 3:
date_trunc('month',created_at)::date
// 2019-12-01
Output 4:
date_trunc('year',created_at)::date
// 2019-01-01
1st Option
date_trunc('month', timestamp_column)::date
It will maintain the date format with all months starting at day one.
Example:
2016-08-01
2016-09-01
2016-10-01
2016-11-01
2016-12-01
2017-01-01
2nd Option
to_char(timestamp_column, 'YYYY-MM')
This solution proposed by @yairchu worked fine in my case. I really wanted to discard the 'day' info.
You can use the EXTRACT function in pgSQL.
E.g. for date = 1981-05-31:
EXTRACT(MONTH FROM date)
It will give 5.
For more details
PGSQL Date-Time
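Applied to the sales table from the question, an EXTRACT-based grouping might look like the following sketch (the "date" and "Sales" column names are taken from the question, the table name is a placeholder):
SELECT EXTRACT(YEAR FROM date) AS yyyy,
       EXTRACT(MONTH FROM date) AS mon,
       SUM("Sales") AS "Sales"
FROM yourtable
GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date)
ORDER BY yyyy, mon;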
It is working for "greater than" comparisons but not for "less than".
For example:
select date_part('year',txndt)
from "table_name"
where date_part('year',txndt) > '2000' limit 10;
is working fine.
but for
select date_part('year',txndt)
from "table_name"
where date_part('year',txndt) < '2000' limit 10;
I am getting an error.