Sub-query is Not Working for Date_Part()

I want to pass the subquery as an argument to the EXTRACT() function of Postgres to get the number of the day of the week but it is not working.
Working Code:
SELECT EXTRACT(dow FROM DATE '2018-06-07');
It returns:
+-------------+
| date_part |
|-------------|
| 4.0 |
+-------------+
Not Working Code:
SELECT EXTRACT(DOW FROM DATE
(SELECT start_date from leaves where submitted_by=245 and type_id = 16)
);
It returns
syntax error at or near "SELECT"
LINE 1: SELECT EXTRACT(DAY FROM DATE (SELECT submitted_on FROM leave...
I don't know why EXTRACT() function is not accepting subquery result as the query:
SELECT start_date from leaves where submitted_by=245 and type_id = 16;
returns the following, which I think is identical to the date string I passed in the working example.
+--------------+
| start_date |
|--------------|
| 2018-06-07 |
+--------------+
Can somebody correct it or let me know some other way to get the number of the day of the week?

Just apply it to the column of the select:
SELECT EXTRACT(DOW from start_date)
from leaves
where submitted_by=245 and type_id = 16
If you really want to use a scalar sub-query, then you must get rid of the DATE keyword, which is only needed to specify date constants:
SELECT EXTRACT(DOW FROM
(SELECT start_date from leaves where submitted_by=245 and type_id = 16)
);
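Both forms can be checked on a throwaway table. The sketch below is only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module instead of Postgres, so strftime('%w', ...) takes the place of EXTRACT(DOW FROM ...) (both number Sunday as 0), and the single row in leaves is invented to match the question.

```python
import sqlite3

# Assumption: SQLite stands in for Postgres; the table and its one row are
# made up to mirror the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leaves (submitted_by INTEGER, type_id INTEGER, start_date TEXT)"
)
conn.execute("INSERT INTO leaves VALUES (245, 16, '2018-06-07')")

# Form 1: apply the function directly to the column of the select.
direct = conn.execute(
    "SELECT CAST(strftime('%w', start_date) AS INTEGER) "
    "FROM leaves WHERE submitted_by = 245 AND type_id = 16"
).fetchone()[0]

# Form 2: pass a parenthesized scalar subquery as the argument, with no
# DATE keyword in front of it.
via_subquery = conn.execute(
    "SELECT CAST(strftime('%w', "
    "(SELECT start_date FROM leaves WHERE submitted_by = 245 AND type_id = 16)"
    ") AS INTEGER)"
).fetchone()[0]

print(direct, via_subquery)
```

Both values come out as 4 (Thursday), matching the working example in the question.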

Put the function inside the select:
select (select extract(dow from start_date)
from leaves
where submitted_by = 245 and type_id = 16
)
I don't see the advantage of using a subquery in the select for this (as opposed to, say, moving the subquery to the from clause). But this should do what you want.

Related

How do I create a new column showing difference between maximum date in table and date in row?

I need two columns: 1 showing 'date' and the other showing 'maximum date in table - date in row'.
I kept getting a zero in the 'datediff' column, and thought a nested select would work.
SELECT date, DATEDIFF(max_date, date) AS datediff
(SELECT MAX(date) AS max_date
FROM mytable)
FROM mytable
GROUP BY date
Currently getting this error from the above code : mismatched input '(' expecting {, ';'}(line 2, pos 2)
Correct format in the end would be:
date | datediff
--------------------------
2021-08-28 | 0
2021-07-26 | 33
2021-07-23 | 36
2021-08-11 | 17
If you want the date difference, you can use:
SELECT date, DATEDIFF(MAX(date) OVER (), date) AS datediff
FROM mytable
GROUP BY date
You can do this using the analytic function MAX() Over()
SELECT date, MAX(date) OVER() - date FROM mytable;
I tried this on SQL Fiddle.
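To see the window-function version run end to end, here is a minimal sketch. It assumes SQLite (via Python's sqlite3, version 3.25 or later for window functions) rather than the asker's engine, so julianday() differences stand in for DATEDIFF; the four dates are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("2021-08-28",), ("2021-07-26",), ("2021-07-23",), ("2021-08-11",)],
)

# MAX(date) OVER () attaches the table-wide maximum to every row, so neither
# a GROUP BY nor a nested SELECT is needed. Subtracting julianday() values
# gives the difference in days (an SQLite substitute for DATEDIFF).
rows = conn.execute(
    "SELECT date, "
    "CAST(julianday(MAX(date) OVER ()) - julianday(date) AS INTEGER) AS datediff "
    "FROM mytable ORDER BY date DESC"
).fetchall()

for row in rows:
    print(row)
```

The row holding the maximum date gets 0, and every other row gets its distance in days from that maximum.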

BigQuery - Nested Query with different WHERE parameters?

I'm trying to fetch user_counts and new_user_counts by date, where new_user_counts counts the users for whom the date of event_timestamp equals the date of user_first_touch_timestamp, while user_counts is the distinct count of the user_pseudo_id field over the same date range. How can I do this in the same query? Here's how my current query looks.
Eventually, I'd like the result to be as:
|Date | new_user_count | user_counts |
|20200820 | X | Y |
Here is the error I'm getting at line 8 of code:
Syntax error: Function call cannot be applied to this expression. Function calls require a path, e.g. a.b.c() at [8:5]
Thanks.
SELECT
event_date,
COUNT (DISTINCT(user_pseudo_id)) AS new_user_counts FROM
`my-google-analytics-table-name.*`
WHERE DATE(TIMESTAMP_MICROS(event_timestamp)) =
DATE(TIMESTAMP_MICROS(user_first_touch_timestamp))
AND event_date BETWEEN '20200820' AND '20200831'
(SELECT
COUNT (DISTINCT(user_pseudo_id)) AS user_counts
FROM `my-google-analytics-table-name.*`
WHERE event_date BETWEEN '20200820' AND '20200831'
)
GROUP BY event_date
ORDER BY event_date ASC
Try the query below (based solely on your original query, just fixing the syntax and logic):
SELECT
event_date,
COUNT(DISTINCT IF(
DATE(TIMESTAMP_MICROS(event_timestamp)) = DATE(TIMESTAMP_MICROS(user_first_touch_timestamp)),
user_pseudo_id,
NULL
)) AS new_user_counts,
COUNT(DISTINCT(user_pseudo_id)) AS user_counts
FROM `my-google-analytics-table-name.*`
WHERE event_date BETWEEN '20200820' AND '20200831'
GROUP BY event_date
ORDER BY event_date ASC
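The idea in that query is conditional aggregation: both counts come from one pass over the same rows, and the condition moves out of WHERE and into the aggregate. A small sketch of the pattern, with assumptions spelled out: SQLite replaces BigQuery, CASE WHEN replaces IF(), and is_first_day is a hypothetical precomputed flag standing in for the timestamp comparison. The rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_pseudo_id TEXT, is_first_day INTEGER)"
)
# is_first_day = 1 means "event date equals first-touch date" (hypothetical
# flag; the real query compares two timestamps instead).
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("20200820", "a", 1),
        ("20200820", "a", 1),  # repeat event, must not be double counted
        ("20200820", "b", 0),
        ("20200821", "a", 0),
        ("20200821", "c", 1),
        ("20200821", "d", 1),
    ],
)

# The CASE yields NULL for non-matching rows, and COUNT(DISTINCT ...) ignores
# NULLs, so only new users are counted in the first aggregate.
rows = conn.execute(
    "SELECT event_date, "
    "COUNT(DISTINCT CASE WHEN is_first_day = 1 THEN user_pseudo_id END) "
    "AS new_user_counts, "
    "COUNT(DISTINCT user_pseudo_id) AS user_counts "
    "FROM events "
    "WHERE event_date BETWEEN '20200820' AND '20200831' "
    "GROUP BY event_date ORDER BY event_date"
).fetchall()

print(rows)
```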

Select count for each specific date

I have the following need:
I need to count the number of times each id activated from all dates.
Let's say the table looks like this:
tbl_activates
PersonId int,
ActivatedDate datetime
The result set should look something like this:
counted_activation | ActivatedDate
5 | 2009-04-30
7 | 2009-04-29
5 | 2009-04-28
7 | 2009-04-27
... and so on
Anyone know how to do this the best possible way? The date comes in the following format '2011-09-06 15:47:52.110', I need to relate only to the date without the time. (summary for each date)
You can use count(distinct ...), and if ActivatedDate is a datetime you can take just the date part:
select Cast(ActivatedDate AS date), count(distinct PersonId)
from my_table
group by Cast(ActivatedDate AS date)
You can use the to_char function to remove the time from the date:
select count(*) counted_activation,
to_char(activatedDate,'yyyy-mm-dd') ActDate
from table1
group by to_char(activatedDate,'yyyy-mm-dd');
Use GROUP BY and COUNT. Use CONVERT to reduce the datetime to a date only:
SELECT CONVERT(DATE, ActivatedDate), COUNT(PersonId)
FROM [table]
GROUP BY CONVERT(DATE, ActivatedDate)
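All three answers do the same thing in different dialects: strip the time part, then group on what is left. As a rough, engine-neutral check, the sketch below does it in SQLite through Python's sqlite3, where date() plays the role of CAST / to_char / CONVERT. The three rows are invented, and activated_day is just an illustrative alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_activates (PersonId INTEGER, ActivatedDate TEXT)")
conn.executemany(
    "INSERT INTO tbl_activates VALUES (?, ?)",
    [
        (1, "2011-09-06 15:47:52.110"),
        (2, "2011-09-06 09:01:00.000"),
        (1, "2011-09-07 08:30:00.000"),
    ],
)

# date() drops the time of day, so both 2011-09-06 rows land in one group.
rows = conn.execute(
    "SELECT COUNT(*) AS counted_activation, date(ActivatedDate) AS activated_day "
    "FROM tbl_activates "
    "GROUP BY date(ActivatedDate) "
    "ORDER BY activated_day DESC"
).fetchall()

print(rows)
```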

PostgreSQL query to count/group by day and display days with no data

I need to create a PostgreSQL query that returns
a day
the number of objects found for that day
It's important that every single day appear in the results, even if no objects were found on that day. (This has been discussed before but I haven't been able to get things working in my specific case.)
First, I found a sql query to generate a range of days, with which I can join:
SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
Results in:
date
------------
2013-03-28
2013-03-27
2013-03-26
2013-03-25
...
2012-03-28
(366 rows)
Now I'm trying to join that to a table named 'sharer_emailshare' which has a 'created' column:
Table 'public.sharer_emailshare'
column | type
-------------------
id | integer
created | timestamp with time zone
message | text
to | character varying(75)
Here's the best GROUP BY query I have so far:
SELECT d.date, count(se.id) FROM (
select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
) d
JOIN sharer_emailshare se
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
GROUP BY d.date;
The results:
date | count
------------+-------
2013-03-27 | 11
2013-03-24 | 2
2013-02-14 | 2
(3 rows)
Desired results:
date | count
------------+-------
2013-03-28 | 0
2013-03-27 | 11
2013-03-26 | 0
2013-03-25 | 0
2013-03-24 | 2
2013-03-23 | 0
...
2012-03-28 | 0
(366 rows)
If I understand correctly this is because I'm using a plain (implied INNER) JOIN, and this is the expected behavior, as discussed in the postgres docs.
I've looked through dozens of StackOverflow solutions, and all the ones with working queries seem specific to MySQL/Oracle/MSSQL and I'm having a hard time translating them to PostgreSQL.
The guy asking this question found his answer, with Postgres, but put it on a pastebin link that expired some time ago.
I've tried to switch to LEFT OUTER JOIN, RIGHT JOIN, RIGHT OUTER JOIN, CROSS JOIN, use a CASE statement to sub in another value if null, COALESCE to provide a default value, etc, but I haven't been able to use them in a way that gets me what I need.
Any assistance is appreciated! And I promise I'll get around to reading that giant PostgreSQL book soon ;)
You just need a left outer join instead of an inner join:
SELECT d.date, count(se.id)
FROM
(
SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
FROM generate_series(0, 365, 1) AS offs
) d
LEFT OUTER JOIN sharer_emailshare se
ON d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD')
GROUP BY d.date;
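The effect of switching the join type can be reproduced on a few rows. This is a sketch under assumptions, not the Postgres query itself: it runs on SQLite through Python's sqlite3, a recursive CTE replaces generate_series, the date range is fixed to five days instead of current_date minus 365, and the three rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sharer_emailshare (id INTEGER PRIMARY KEY, created TEXT)")
conn.executemany(
    "INSERT INTO sharer_emailshare (created) VALUES (?)",
    [("2013-03-27 10:00:00",), ("2013-03-27 11:30:00",), ("2013-03-24 09:15:00",)],
)

# The recursive CTE walks back one day at a time from 2013-03-28 to 2013-03-24.
QUERY = """
WITH RECURSIVE d(day) AS (
    SELECT '2013-03-28'
    UNION ALL
    SELECT date(day, '-1 day') FROM d WHERE day > '2013-03-24'
)
SELECT d.day, COUNT(se.id)
FROM d
{join} sharer_emailshare se ON d.day = date(se.created)
GROUP BY d.day
ORDER BY d.day DESC
"""

# Inner join: days without matching rows vanish from the result.
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
# Left outer join: every generated day survives; COUNT(se.id) is 0 when
# nothing matched because se.id is NULL on those rows.
left_rows = conn.execute(QUERY.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```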
Extending Gordon Linoff's helpful answer, I would suggest a couple of improvements such as:
Use ::date instead of date_trunc('day', ...)
Join on a date type rather than a character type (it's cleaner).
Use specific date ranges so they're easier to change later. In this case I select a year before the most recent entry in the table - something that couldn't have been done easily with the other query.
Compute the totals for an arbitrary subquery (using a CTE). You just have to cast the column of interest to the date type and call it date_column.
Include a column for cumulative total. (Why not?)
Here's my query:
WITH dates_table AS (
SELECT created::date AS date_column FROM sharer_emailshare WHERE showroom_id=5
)
SELECT series_table.date, COUNT(dates_table.date_column), SUM(COUNT(dates_table.date_column)) OVER (ORDER BY series_table.date) FROM (
SELECT (last_date - b.offs) AS date
FROM (
SELECT GENERATE_SERIES(0, last_date - first_date, 1) AS offs, last_date from (
SELECT MAX(date_column) AS last_date, (MAX(date_column) - '1 year'::interval)::date AS first_date FROM dates_table
) AS a
) AS b
) AS series_table
LEFT OUTER JOIN dates_table
ON (series_table.date = dates_table.date_column)
GROUP BY series_table.date
ORDER BY series_table.date
I tested the query, and it produces the same results, plus the column for cumulative total.
I'll try to provide an answer that includes some explanation. I'll start with the smallest building block and work up.
If you run a query like this:
SELECT series.number FROM generate_series(0, 9) AS series(number)
You get output like this:
number
--------
0
1
2
3
4
5
6
7
8
9
(10 rows)
This can be turned into dates like this:
SELECT CURRENT_DATE + sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)
Which will give output like this:
date
------------
2019-09-29
2019-09-30
2019-10-01
2019-10-02
2019-10-03
2019-10-04
2019-10-05
2019-10-06
2019-10-07
2019-10-08
(10 rows)
Then you can do a query like this (for example), joining the original query as a subquery against whatever table you're ultimately interested in:
SELECT sequential_dates.date,
COUNT(calendar_items.*) AS calendar_item_count
FROM (SELECT CURRENT_DATE + sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)) sequential_dates
LEFT JOIN calendar_items ON calendar_items.starts_at::date = sequential_dates.date
GROUP BY sequential_dates.date
Which will give output like this:
date | calendar_item_count
------------+---------------------
2019-09-29 | 1
2019-09-30 | 8
2019-10-01 | 15
2019-10-02 | 11
2019-10-03 | 1
2019-10-04 | 12
2019-10-05 | 0
2019-10-06 | 0
2019-10-07 | 27
2019-10-08 | 24
Based on Gordon Linoff's answer I realized another problem was that I had a WHERE clause that I didn't mention in the original question.
Instead of a naked WHERE, I made a subquery:
SELECT d.date, count(se.id) FROM (
select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
) d
LEFT OUTER JOIN (
SELECT * FROM sharer_emailshare
WHERE showroom_id=5
) se
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
GROUP BY d.date;
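Why the naked WHERE hurts: it runs after the join, and on unmatched days se.showroom_id is NULL, so those rows fail the test and the left join behaves like an inner join again. A small sketch of the difference, assuming SQLite via Python's sqlite3, a hypothetical pre-built days table in place of the generated series, and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE days (day TEXT)")  # hypothetical stand-in for the series
conn.executemany(
    "INSERT INTO days VALUES (?)",
    [("2013-03-25",), ("2013-03-26",), ("2013-03-27",)],
)
conn.execute(
    "CREATE TABLE sharer_emailshare (id INTEGER, created TEXT, showroom_id INTEGER)"
)
conn.executemany(
    "INSERT INTO sharer_emailshare VALUES (?, ?, ?)",
    [(1, "2013-03-27 10:00:00", 5), (2, "2013-03-26 12:00:00", 9)],
)

# Filter in WHERE: applied after the join, so days with no showroom 5 row
# (NULL or another showroom) are thrown away.
where_rows = conn.execute(
    "SELECT d.day, COUNT(se.id) FROM days d "
    "LEFT OUTER JOIN sharer_emailshare se ON d.day = date(se.created) "
    "WHERE se.showroom_id = 5 "
    "GROUP BY d.day ORDER BY d.day"
).fetchall()

# Filter inside the joined subquery: applied before the join, so every day
# is kept and unmatched days count as 0.
subquery_rows = conn.execute(
    "SELECT d.day, COUNT(se.id) FROM days d "
    "LEFT OUTER JOIN (SELECT * FROM sharer_emailshare WHERE showroom_id = 5) se "
    "ON d.day = date(se.created) "
    "GROUP BY d.day ORDER BY d.day"
).fetchall()

print(where_rows)
print(subquery_rows)
```

Moving the condition into the ON clause (AND se.showroom_id = 5) is another common way to get the same result as the subquery.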
I like Jason Swett's SQL, but I ran into an issue where the count on some dates comes back as one when it should be zero.
Running select count(*) from public.post_call_info where timestamp::date = '2020-11-23' returns zero, but the query below returns one for that date.
Also, the + gives a forward schedule, so I changed it to a minus to get the data for the 9 days prior to the current date.
SELECT sequential_dates.date,
COUNT(*) AS call_count
FROM (SELECT CURRENT_DATE - sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)) sequential_dates
LEFT JOIN public.post_call_info ON public.post_call_info.timestamp::date =
sequential_dates.date
GROUP BY sequential_dates.date
order by date desc
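The stray 1 most likely comes from COUNT(*): a left join still produces one row for a day with no calls (with NULLs on the right-hand side), and COUNT(*) counts rows, not matches. Counting a column from the joined table, as the earlier answers do with COUNT(se.id), gives 0 instead. A minimal sketch of the difference, assuming SQLite via Python's sqlite3, a hypothetical days table in place of the generated series, and an invented id column and row in post_call_info:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE days (day TEXT)")  # hypothetical stand-in for the series
conn.executemany("INSERT INTO days VALUES (?)", [("2020-11-23",), ("2020-11-24",)])
conn.execute("CREATE TABLE post_call_info (id INTEGER, timestamp TEXT)")
conn.execute("INSERT INTO post_call_info VALUES (1, '2020-11-24 08:00:00')")

QUERY = (
    "SELECT d.day, {count_expr} AS call_count FROM days d "
    "LEFT JOIN post_call_info p ON date(p.timestamp) = d.day "
    "GROUP BY d.day ORDER BY d.day"
)

# COUNT(*) counts the NULL-extended row that the left join emits for
# 2020-11-23, so that day shows 1.
star_rows = conn.execute(QUERY.format(count_expr="COUNT(*)")).fetchall()
# COUNT(p.id) skips NULLs, so the same day shows 0.
column_rows = conn.execute(QUERY.format(count_expr="COUNT(p.id)")).fetchall()

print(star_rows)
print(column_rows)
```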

PostgreSQL: How to return rows with respect to a found row (relative results)?

Forgive my example if it does not make sense. I'm going to try with a simplified one to encourage more participation.
Consider a table like the following:
dt | mnth | foo
--------------+------------+--------
2012-12-01 | December |
...
2012-08-01 | August |
2012-07-01 | July |
2012-06-01 | June |
2012-05-01 | May |
2012-04-01 | April |
2012-03-01 | March |
...
1997-01-01 | January |
If you look for the record with dt closest to today w/o going over, what would be the best way to also return the 3 records beforehand and 7 records after?
I decided to try windowing functions:
WITH dates AS (
select row_number() over (order by dt desc)
, dt
, dt - now()::date as dt_diff
from foo
)
, closest_date AS (
select * from dates
where dt_diff = ( select max(dt_diff) from dates where dt_diff <= 0 )
)
SELECT *
FROM dates
WHERE row_number - (select row_number from closest_date) >= -3
AND row_number - (select row_number from closest_date) <= 7 ;
I feel like there must be a better way to return relative records with a window function, but it's been some time since I've looked at them.
create table foo (dt date);
insert into foo values
('2012-12-01'),
('2012-08-01'),
('2012-07-01'),
('2012-06-01'),
('2012-05-01'),
('2012-04-01'),
('2012-03-01'),
('2012-02-01'),
('2012-01-01'),
('1997-01-01'),
('2012-09-01'),
('2012-10-01'),
('2012-11-01'),
('2013-01-01')
;
select dt
from (
(
select dt
from foo
where dt <= current_date
order by dt desc
limit 4
)
union all
(
select dt
from foo
where dt > current_date
order by dt
limit 7
)) s
order by dt
;
dt
------------
2012-03-01
2012-04-01
2012-05-01
2012-06-01
2012-07-01
2012-08-01
2012-09-01
2012-10-01
2012-11-01
2012-12-01
2013-01-01
(11 rows)
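For anyone who wants to replay this outside Postgres, here is a sketch of the same two-sided LIMIT idea on SQLite through Python's sqlite3. Two assumptions to flag: SQLite does not accept parenthesized SELECTs around UNION ALL, so each half is wrapped in a subquery instead, and TODAY is a hypothetical fixed date replacing current_date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (dt TEXT)")
# Same dates as the insert above: every month of 2012 plus two outliers.
dates = [f"2012-{m:02d}-01" for m in range(1, 13)] + ["1997-01-01", "2013-01-01"]
conn.executemany("INSERT INTO foo VALUES (?)", [(d,) for d in dates])

TODAY = "2012-06-15"  # hypothetical stand-in for current_date

# First half: the closest row at or before TODAY plus the 3 rows before it.
# Second half: the 7 rows after TODAY. The outer ORDER BY sorts the union.
rows = [
    r[0]
    for r in conn.execute(
        """
        SELECT dt FROM (
            SELECT dt FROM foo WHERE dt <= :today ORDER BY dt DESC LIMIT 4
        )
        UNION ALL
        SELECT dt FROM (
            SELECT dt FROM foo WHERE dt > :today ORDER BY dt LIMIT 7
        )
        ORDER BY dt
        """,
        {"today": TODAY},
    )
]

print(rows)
```

With that reference date the 11 rows run from 2012-03-01 through 2013-01-01, the same window as the output shown above.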
You could use the window function lead():
SELECT dt_lead7 AS dt
FROM (
SELECT *, lead(dt, 7) OVER (ORDER BY dt) AS dt_lead7
FROM foo
) d
WHERE dt <= now()::date
ORDER BY dt DESC
LIMIT 11;
Somewhat shorter, but the UNION ALL version will be faster with a suitable index.
That leaves a corner case where "date closest to today" is within the first 7 rows. You can pad the original data with 7 rows of -infinity to take care of this:
SELECT d.dt_lead7 AS dt
FROM (
SELECT *, lead(dt, 7) OVER (ORDER BY dt) AS dt_lead7
FROM (
SELECT '-infinity'::date AS dt FROM generate_series(1,7)
UNION ALL
SELECT dt FROM foo
) x
) d
WHERE d.dt <= now()::date -- same as: WHERE dt <= now()::date [1]
ORDER BY d.dt_lead7 DESC -- same as: ORDER BY dt DESC [1]
LIMIT 11;
I table-qualified the columns in the second query to clarify what happens. See below.
The result will include NULL values if the "date closest to today" is within the last 7 rows of the base table. You can filter those with an additional sub-select if you need to.
[1] To address your doubts (raised in the comments) about output names versus column names, consider the following quotes from the manual.
Where to use an output column's name:
An output column's name can be used to refer to the column's value in
ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses;
there you must write out the expression instead.
Bold emphasis mine. WHERE dt <= now()::date references the column d.dt, not the output column of the same name, thereby working as intended.
Resolving conflicts:
If an ORDER BY expression is a simple name that matches both an output
column name and an input column name, ORDER BY will interpret it as
the output column name. This is the opposite of the choice that GROUP BY
will make in the same situation. This inconsistency is made to be
compatible with the SQL standard.
Bold emphasis mine again. ORDER BY dt DESC in the example references the output column's name, as intended. Anyway, either column would sort the same. The only difference could be with the NULL values of the corner case. But that falls flat, too, because:
the default behavior is NULLS LAST when ASC is specified or implied,
and NULLS FIRST when DESC is specified
As the NULL values come after the biggest values, the order is identical either way.
Or, without LIMIT (as per request in comment):
WITH x AS (
SELECT *
, row_number() OVER (ORDER BY dt) AS rn
, first_value(dt) OVER (ORDER BY (dt > '2011-11-02')
, dt DESC) AS dt_nearest
FROM foo
)
, y AS (
SELECT rn AS rn_nearest
FROM x
WHERE dt = dt_nearest
)
SELECT dt
FROM x, y
WHERE rn BETWEEN rn_nearest - 3 AND rn_nearest + 7
ORDER BY dt;
If performance is important, I would still go with @Clodoaldo's UNION ALL variant. It will be fastest. Database agnostic SQL will only get you so far. Other RDBMS do not have window functions at all, yet (MySQL), or use different function names (like first_val instead of first_value). You might just as well replace LIMIT with TOP n (MS SQL) or whatever the local dialect uses.
You could use something like this:
select * from foo
where dt between now()- interval '7 months' and now()+ interval '3 months'