SQL select unique elements and compare them on two periods


Let's say we have a table that looks like this:
I want to see how many records there are for each URL during period 1 and period 2,
where period 1 is "date > '2016-01-01' and date < '2017-01-01'"
and period 2 is "date > '2014-01-01' and date < '2015-01-01'".
Here is a visualization of what I expect:
I can easily do this for a single period using the following query:
SELECT URL, COUNT(URL) as Period1
FROM table WHERE date < '2017-01-01'
AND date > '2016-01-01' GROUP BY URL
But how do I add a second column with Period2?
Any thoughts would be much appreciated. Sorry if I explained myself incorrectly.

Just change the dates in the query below and it will count the hits per period:
SELECT [URL]
,SUM(CASE WHEN (date >= '12/22/2015' AND date <= '12/23/2015') THEN 1 ELSE 0 END) AS Period1
,SUM(CASE WHEN (date > '12/30/2015' AND date < '12/31/2015') THEN 1 ELSE 0 END) AS Period2
FROM LogFilesv2_Dataset.DE_Visits
GROUP BY URL
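
To make the conditional aggregation above concrete, here is a minimal sketch that runs the same pattern against SQLite from Python, using the two periods from the question. The table name visits, its columns, and the sample rows are assumptions made up for illustration; they are not the asker's real data.

```python
import sqlite3

# Hypothetical in-memory table standing in for the visits log (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (url TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("/home", "2016-03-01"),
        ("/home", "2016-07-15"),
        ("/home", "2014-05-20"),
        ("/about", "2014-02-02"),
        ("/about", "2015-06-30"),  # falls in neither period
    ],
)

# One pass over the table: each CASE contributes 1 only inside its own period.
QUERY = """
SELECT url,
       SUM(CASE WHEN date > '2016-01-01' AND date < '2017-01-01' THEN 1 ELSE 0 END) AS period1,
       SUM(CASE WHEN date > '2014-01-01' AND date < '2015-01-01' THEN 1 ELSE 0 END) AS period2
FROM visits
GROUP BY url
ORDER BY url
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because there is no WHERE clause, a URL that only appears outside both periods still gets a row, with 0 in both columns. Adding a WHERE that covers both ranges would drop such rows.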

For BigQuery Standard SQL:
#standardSQL
SELECT
URL AS uri,
COUNTIF(date BETWEEN '2016-01-01' AND '2016-12-31') AS Period1,
COUNTIF(date BETWEEN '2014-01-01' AND '2014-12-31') AS Period2
FROM `LogFilesv2_Dataset.DE_Visits`
GROUP BY URL

Not sure which database you are using, but for Oracle you can try scalar subqueries like this:
select url,
(select count(*) from test where url = t.url and date_col <= sysdate and date_col > sysdate - 20) period1,
(select count(*) from test where url = t.url and date_col <= sysdate and date_col > sysdate - 40) period2
from test t
group by url;
Replace the date ranges in the query above as appropriate.
It is always helpful to provide SQL scripts to create the table and populate the data.

Related

OR for two different columns

I am trying to select specific rows from an Oracle DB.
The table has the following structure:
Order   Date         Status
1       01.01.2018   10
2       01.01.2018   15
I would like to extract all rows where
Status <= 85, or
the order date is in this week.
Unfortunately, the Status column is declared as a text column.
How would you build a SQL query to extract these rows?

Hmmm . . . I don't know what you mean by "this week". Perhaps:
where status <= '85' or
orderdate >= trunc(sysdate, 'IW')
This does a string comparison on the status, so '9' would not be matched, and uses the ISO definition of week for the current week.
If you want a numeric comparison for status, then:
where to_number(status) <= 85 or
orderdate >= trunc(sysdate, 'IW')
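
The difference between the string comparison and the numeric comparison can be checked with a small sketch. It uses SQLite from Python rather than Oracle, so CAST stands in for to_number. The table name orders, the column order_no, and the two extra rows with status '9' and '100' are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, orderdate TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2018-01-01", "10"),
        (2, "2018-01-01", "15"),
        (3, "2018-01-01", "9"),    # hypothetical row: numerically <= 85
        (4, "2018-01-01", "100"),  # hypothetical row: numerically > 85
    ],
)

# Text comparison: values are compared character by character.
text_match = [
    r[0]
    for r in conn.execute(
        "SELECT order_no FROM orders WHERE status <= '85' ORDER BY order_no"
    )
]

# Numeric comparison: the text is converted to a number first.
num_match = [
    r[0]
    for r in conn.execute(
        "SELECT order_no FROM orders WHERE CAST(status AS INTEGER) <= 85 ORDER BY order_no"
    )
]

print("text comparison:", text_match)
print("numeric comparison:", num_match)
```

With the text comparison, '9' is missed and '100' is matched, because only the leading characters decide the ordering. The numeric version returns the rows one would expect.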

Maybe try the CAST function, like:
select * from yourtable where cast(Status as INT) <= 85 OR
to_char(to_date(date,'MM/DD/YYYY'),'WW') = to_char(sysdate,'WW')

SQL count distinct # of calls 6 months prior to create date

I am trying to figure out the SQL to:
count # of distinct calls
made on an account 6 months prior to the account being created
I also need to CAST the date field.
I'm thinking something like:
case when (call_date as date format 'MM/DD/YYYY')
between (create_date as date format 'MM/DD/YYYY') and
(ADD_MONTHS, (create_date as date format 'MM/DD/YYYY'), -6)
then COUNT (DISTINCT call_nbr) as calls
Here's a snippet of the data I am working with. The answer I require is 3 calls.
Note: both dates are flagged in the db table as DATE format.
Call_Nbr.....Call Date......Create Date
12345........03/14/2020....07/23/2020.....include in result set
12345........03/14/2020....07/23/2020.....exclude in result set
45678........02/14/2020....07/23/2020.....include in result set
91011........01/20/2020....07/23/2020.....include in result set
91211........01/24/2020....07/23/2020.....exclude in result set
12345........11/14/2019....07/23/2020.....exclude in result set

I think you want:
select count(distinct call_nbr) no_calls
from mytable
where call_date >= add_months(create_date, -6)
If you have a column that represents the account_id, then you can use a group by clause to get the count of calls per account:
select account_id, count(distinct call_nbr) no_calls
from mytable
where call_date >= add_months(create_date, -6)
group by account_id
Edit: it seems like you want conditional aggregation instead:
select
account_id,
count(distinct case when call_date >= add_months(create_date, -6) then call_nbr end) no_calls
from mytable
group by account_id
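
As a rough check of the conditional aggregation above, the sketch below loads the question's six sample rows into SQLite from Python. SQLite has no add_months, so date(create_date, '-6 months') is swapped in as its closest equivalent. The account_id value and the ISO date strings are assumptions, since the question shows neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "account_id INTEGER, call_nbr INTEGER, call_date TEXT, create_date TEXT)"
)
# The six rows from the question, with dates rewritten as ISO strings
# and a made-up account_id of 1.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 12345, "2020-03-14", "2020-07-23"),
        (1, 12345, "2020-03-14", "2020-07-23"),
        (1, 45678, "2020-02-14", "2020-07-23"),
        (1, 91011, "2020-01-20", "2020-07-23"),
        (1, 91211, "2020-01-24", "2020-07-23"),
        (1, 12345, "2019-11-14", "2020-07-23"),
    ],
)

# CASE yields NULL outside the window, and COUNT(DISTINCT ...) ignores NULLs.
QUERY = """
SELECT account_id,
       COUNT(DISTINCT CASE
                 WHEN call_date >= date(create_date, '-6 months') THEN call_nbr
             END) AS no_calls
FROM mytable
GROUP BY account_id
"""
result = conn.execute(QUERY).fetchall()
print(result)
```

The window here starts at 2020-01-23, six months before the create date. The duplicate call on 2020-03-14 is counted once because of DISTINCT.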

Athena greater than condition in date column

I have the following query that I am trying to run on Athena.
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > '2017-12-31'
GROUP BY observation_date
However it is producing this error:
SYNTAX_ERROR: line 3:24: '>' cannot be applied to date, varchar(10)
This seems odd to me. Is there an error in my query or is Athena not able to handle greater than operators on date columns?
Thanks!

You need to cast the string literal to a date before making this comparison, because Athena will not compare a date column with a varchar directly. Try the following:
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > CAST('2017-12-31' AS DATE)
GROUP BY observation_date
Check it out on SQL Fiddle.
UPDATE 17/07/2019
To reflect the comments, here is the same query using the DATE function:
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > DATE('2017-12-31')
GROUP BY observation_date

You can also use the date function which is a convenient alias for CAST(x AS date):
SELECT *
FROM date_data
WHERE trading_date >= DATE('2018-07-06');

select * from my_schema.my_table_name where date_column = cast('2017-03-29' as DATE) limit 5

One addition: if your date column is stored in ISO-8601 format, for example 2022-08-02T01:46:46.963120Z, you can use the parse_datetime function.
In my case, the query looks like this:
SELECT * FROM internal_alb_logs
WHERE elb_status_code >= 500
AND parse_datetime(time,'yyyy-MM-dd''T''HH:mm:ss.SSSSSS''Z') > parse_datetime('2022-08-01-23:00:00','yyyy-MM-dd-HH:mm:ss')
ORDER BY time DESC
See more examples here: https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html#query-alb-logs-examples

How to Average column on bottom of query

Ignore '?DATE1::?' it is just a prefix for users to input date range.
Select
STARTDATEKEY,
round(avg(Minutes),2) as Time /*average for 1 day */
from Table
where To_Date(to_char(StartDate, 'DD-MON-YYYY')) >= To_Date('?DATE1::?','MM/DD/YYYY')
and To_Date(to_char(RESTOREDDATETIME, 'DD-MON-YYYY')) <= To_Date('?DATE2::?','MM/DD/YYYY')
and FLAG = 0
group by STARTDATEKEY
The output will be
I need help showing the average for column Time at the bottom, below the last row: 20130110 52.67
note to editor/reviewer : I don't know if I should tag Oracle or SQL.

You can use the ROLLUP grouping function.
Should be something like this:
Select
STARTDATEKEY,
round(avg(Minutes),2) as Time /*average for 1 day */
from Table
where To_Date(to_char(StartDate, 'DD-MON-YYYY')) >= To_Date('?DATE1::?','MM/DD/YYYY')
and To_Date(to_char(RESTOREDDATETIME, 'DD-MON-YYYY')) <= To_Date('?DATE2::?','MM/DD/YYYY')
and FLAG = 0
group by ROLLUP(STARTDATEKEY)
Here is a simplified sqlfiddle demo
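
For engines that lack ROLLUP (SQLite is one), roughly the same result can be sketched with UNION ALL: one branch for the per-day averages and one for the overall average. This is a substitute technique, not the ROLLUP query itself. The table name outages and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outages (startdatekey TEXT, minutes REAL)")
conn.executemany(
    "INSERT INTO outages VALUES (?, ?)",
    [
        ("20130108", 30.0),
        ("20130108", 50.0),
        ("20130109", 20.0),
        ("20130110", 70.0),
    ],
)

# First branch: one row per day. Second branch: a single total row whose
# key is NULL, which is also how ROLLUP marks its grand-total row.
QUERY = """
SELECT startdatekey, avg_minutes
FROM (
    SELECT startdatekey, ROUND(AVG(minutes), 2) AS avg_minutes
    FROM outages
    GROUP BY startdatekey
    UNION ALL
    SELECT NULL, ROUND(AVG(minutes), 2)
    FROM outages
)
ORDER BY startdatekey IS NULL, startdatekey
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Note that the bottom row is the average over all underlying rows, not the average of the daily averages. The two differ whenever the days have different row counts.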

Efficient way to query separate days of data?

I want to query statistics using SQL from 3 different days (in a row). The display would be something like:
15 users created today, 10 yesterday, 12 two days ago
The SQL would be something like (for today):
SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11'
And then I would do 2 more queries for yesterday and the day before.
So in total I'm doing 3 queries against the entire database. The format for created_date is 2012-05-11 05:24:11 (date & time).
Is there a more efficient SQL way to do this, say in one query?
For specifics, I'm using PHP and SQLite (so the PDO extension).
The result should be 3 different numbers (one for each day).
Any chance someone could show some performance numbers in comparison?

You can use GROUP BY:
SELECT Count(*), created_date FROM Users GROUP BY created_date
That will give you a list of dates with the number of records found on that date. You can add criteria for created_date using a normal WHERE clause.
Edit: based on your edit:
SELECT Count(*), date(created_date) FROM Users WHERE created_date>='2012-05-09' GROUP BY date(created_date)

The best solution is to use GROUP BY DAY(created_date). Here is your query:
SELECT DATE(created_date), count(*)
FROM users
WHERE created_date > CURRENT_DATE - INTERVAL 3 DAY
GROUP BY DAY(created_date)

This should work, I believe, though I have no way to test it. Each day after today needs an upper bound so that the counts do not overlap:
SELECT
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11') as today,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-10' AND created_date < '2012-05-11') as yesterday,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-09' AND created_date < '2012-05-10') as day_before
;

Use GROUP BY like jeroen suggested, but if you're planning for other periods you can also set ranges like this:
SELECT SUM(IF(created_date BETWEEN '2012-05-01' AND NOW(), 1, 0)) AS `this_month`,
SUM(IF(DATE(created_date) = '2012-05-09', 1, 0)) AS `2_days_ago`
FROM ...
As noted below, SQLite doesn't have an IF function, but it does have CASE. SQLite also has no NOW(), so datetime('now') is used instead. This way it should work:
SELECT SUM(CASE WHEN created_date BETWEEN '2012-05-01' AND datetime('now') THEN 1 ELSE 0 END) AS `this_month`,
SUM(CASE date(created_date) WHEN '2012-05-09' THEN 1 ELSE 0 END) AS `2_days_ago`
FROM ...
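
Since the question targets SQLite, here is a minimal sketch of the single-query approach run from Python. It groups on date(created_date) so the time part is ignored. The sample rows are invented, and the cutoff date is passed as a parameter rather than computed from the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, created_date TEXT)")
conn.executemany(
    "INSERT INTO Users (created_date) VALUES (?)",
    [
        ("2012-05-11 05:24:11",),
        ("2012-05-11 09:00:00",),
        ("2012-05-10 23:59:59",),
        ("2012-05-09 00:00:00",),
        ("2012-05-09 12:30:00",),
        ("2012-05-09 18:45:10",),
        ("2012-05-08 08:00:00",),  # older than the three-day window
    ],
)

# The WHERE clause filters on the raw column; grouping strips the time part.
QUERY = """
SELECT date(created_date) AS day, COUNT(*) AS n
FROM Users
WHERE created_date >= ?
GROUP BY date(created_date)
ORDER BY day DESC
"""
counts = dict(conn.execute(QUERY, ("2012-05-09",)).fetchall())
print(counts)
```

A day with no new users simply has no row in the result, so the calling code should treat a missing date as 0. Filtering on the raw created_date column, rather than on date(created_date), leaves the database free to use an index on that column if one exists.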
