Impala list all dates between 2 dates - impala

Can Impala (in HUE) create a column that shows every date between a specified start date and end date?
I want a column of date values.

You can use this SQL:
select a.Date_Range
from (
select date1 + INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a) ) DAY as Date_Range
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
) a
where a.Date_Range <= date2
Explanation:
The cross joins first generate the numbers 0 to 9999. Each number is added to date1 as a day offset, which gives a series of dates starting at date1. The outer WHERE then keeps only the dates up to and including date2.
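The digit cross join described above can be tried out locally. The following is only a sketch: it runs the same idea against SQLite through Python's sqlite3 module rather than against Impala, so the date arithmetic uses SQLite's date() function instead of INTERVAL. FIRST_DAY and LAST_DAY are hypothetical stand-ins for date1 and date2, and only three digit tables are joined (offsets 0 to 999) to keep it small.

```python
import sqlite3

# Hypothetical stand-ins for date1 and date2 from the answer above.
FIRST_DAY, LAST_DAY = "2019-01-28", "2019-02-03"

# One derived table holding the digits 0..9, reused for each decimal place.
DIGITS = "(SELECT 0 AS a " + " ".join(f"UNION ALL SELECT {n}" for n in range(1, 10)) + ")"

SQL = f"""
SELECT date_range
FROM (
    -- units + 10 * tens + 100 * hundreds gives the offsets 0..999
    SELECT date(:first_day, '+' || (u.a + 10 * t.a + 100 * h.a) || ' days') AS date_range
    FROM {DIGITS} AS u
    CROSS JOIN {DIGITS} AS t
    CROSS JOIN {DIGITS} AS h
) AS series
WHERE date_range <= :last_day
ORDER BY date_range
"""

conn = sqlite3.connect(":memory:")
dates = [row[0] for row in conn.execute(SQL, {"first_day": FIRST_DAY, "last_day": LAST_DAY})]
print(dates)
```

Adding a fourth digit table extends the reach to 9999 days, which is what the Impala query does.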

Related

Is there any alternative to use MYSQL's ADDDATE() in ORACLE?

I have this query, originally written for MySQL, that needs to run on Oracle. It uses the ADDDATE() function, and the only alternative I can find is DateAdd, which takes more parameters than I really need.
Apart from that, when I try to execute it, Oracle also reports an error in the
SELECT 0 i UNION.................
part: ORA-00923: FROM keyword not found where expected
Maybe Oracle does not allow select 0 union select 1 union...?
Any suggestions or help would be appreciated, thanks.
SELECT
ADDDATE('1970-01-01', t4.i * 10000 + t3.i * 1000 + t2.i * 100 + t1.i * 10 + t0.i) selected_date
FROM
(
SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) t0,
(
SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) t1,
(
SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) t2,
(
SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) t3,
(
SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) t4
In Oracle you must select from the one-row table dual in order to select one row. You cannot select without a FROM clause, which is why SELECT 0 i UNION ... raises ORA-00923.
If you want to generate dates, you can write a standard SQL recursive CTE. (This is also the typical approach in MySQL since version 8.0.)
Here is an example selecting all days for 1970:
with dates (dt) as
(
select date '1970-01-01' from dual
union all
select dt + interval '1' day from dates where dt < date '1970-12-31'
)
select dt from dates;
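As a quick way to experiment with the recursive CTE pattern without an Oracle instance, here is a sketch that runs the equivalent query in SQLite via Python's sqlite3 module. This is an assumption-laden translation, not the Oracle syntax: SQLite needs no dual table, and it uses date(dt, '+1 day') in place of dt + interval '1' day.

```python
import sqlite3

SQL = """
WITH RECURSIVE dates(dt) AS (
    SELECT '1970-01-01'                 -- anchor row: first day of the range
    UNION ALL
    SELECT date(dt, '+1 day')           -- recursive step: advance one day
    FROM dates
    WHERE dt < '1970-12-31'             -- stop once the last day is produced
)
SELECT dt FROM dates
"""

conn = sqlite3.connect(":memory:")
days = [row[0] for row in conn.execute(SQL)]
print(len(days), days[0], days[-1])
```

The anchor row and the stop condition together define the range, so changing those two literals is enough to generate a different period.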
Here is another way to SELECT a list of dates for the year 1970. Adjust the starting and ending dates if you want different years or the INTERVAL if you want different periods like seconds, minutes, hours…
ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24:MI:SS';
with dt (dt, interv) as (
select date '1970-01-01', numtodsinterval(1,'DAY') from dual
union all
select dt.dt + interv, interv from dt
where dt.dt + interv <= date '1970-12-31')
select dt from dt;
/

Converting monthly to daily data

I have monthly data that I would like to transform to daily data. The data looks like this. The extraction_dt is in date format.
---------------------------------
isin | extraction_date | yield
---------------------------------
001 | 2013-01-31 | 100
001 | 2013-02-28 | 110
001 | 2013-03-31 | 105
... | ... | ...
002 | 2013-01-31 | 200
... | ... | ...
And I would like to have something like this
---------------------------------
isin | extraction_dt | yield
---------------------------------
001 | 2013-01-01 | 100
001 | 2013-01-02 | 100
001 | 2013-01-03 | 100
... | ... | ...
001 | 2013-02-01 | 110
... | ... | ...
I tried the following code but it does not work. I get the error message AnalysisException: Could not resolve table reference: 'cte'. How would you convert monthly to daily data?
with cte as
(select isin, extraction_dt, yield
from datashop
union all
select isin, extraction_dt, dateadd(d, 1, extraction_dt) AS date_dt, yield
from cte
where datediff(m,date_dt,dateadd(d, 1, date_dt))=0
)
select isin, date_dt,
1.0*isin / count(*) over (partition by isin, date_dt) AS daily_yield
from cte
order by 1,2
I can suggest an easy solution:
Generate a date series.
Join it with your data so that each monthly row is repeated for every day of its month.
Here is SQL you can use for Impala:
select isin, extraction_dt, a.dt AS date_dt, yield
from
datashop d,
(
select now() - INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a) ) DAY as dt
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
) a
WHERE
from_timestamp(a.dt,'yyyy/MM') =from_timestamp(d.extraction_dt,'yyyy/MM')
order by 1,2,3
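The subquery aliased a generates a series of dates: the 10,000 days counting back from now(), so only months inside that window are covered.
WHERE restricts each row to the dates that fall in the same month as its extraction_dt, so you get every generated day of that month.
ORDER BY sorts the output by isin, extraction_dt and date.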
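To make the intended result concrete, here is a small sketch of the same expansion outside the database: each monthly row is repeated once per calendar day of its month. It is plain Python, not Impala SQL, and expand_month plus the sample rows are hypothetical names and data modelled on the tables in the question.

```python
import calendar
from datetime import date


def expand_month(isin, extraction_dt, yield_value):
    """Repeat one monthly row for every day of the month of extraction_dt."""
    days_in_month = calendar.monthrange(extraction_dt.year, extraction_dt.month)[1]
    return [
        (isin, extraction_dt.replace(day=day), yield_value)
        for day in range(1, days_in_month + 1)
    ]


# Sample monthly rows: (isin, extraction_dt, yield), as in the question.
monthly = [
    ("001", date(2013, 1, 31), 100),
    ("001", date(2013, 2, 28), 110),
]

daily = [row for record in monthly for row in expand_month(*record)]
for row in daily[:3]:
    print(row)
```

Comparing a few of these rows with the SQL output is a cheap way to check that the month filter in the join matches what you expect.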
Your WITH clause contains a recursive (self-referencing) query. Some SQL dialects, such as PostgreSQL and MySQL, require WITH RECURSIVE rather than plain WITH for this. More importantly, according to the Impala SQL reference, Impala does not support recursive common table expressions at all:
The Impala WITH clause does not support recursive queries in the
WITH, which is supported in some other database systems.
In other words, you cannot do this with a recursive CTE in Impala.

Explode and Count all items from 2 dates column

I would like to get every possible date (event_day) together with the number of events that are active on it, i.e. events whose start_date to end_date range includes that date. Please see the table below:
---------------------------------
start_date | end_date | event
---------------------------------
2019-01-01 | 2019-01-04 | A
2019-01-02 | 2019-01-03 | B
2019-01-01 | 2019-01-06 | C
I want a query that returns the event_count for every date. Please see the following result:
----------------------------
event_day | event_count
----------------------------
2019-01-01 | 2
2019-01-02 | 3
2019-01-03 | 3
2019-01-04 | 2
2019-01-05 | 1
2019-01-06 | 1
I have read other sources but could only find how to explode the dates between two dates. Any help? Thanks.
You can use a calendar table to solve this:
SELECT date_value AS event_day, COUNT(*) AS event_count
FROM (
SELECT ADDDATE('1970-01-01', t4 * 10000 + t3 * 1000 + t2 * 100 + t1 * 10 + t0) AS date_value
FROM
(SELECT 0 t0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t0,
(SELECT 0 t1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t1,
(SELECT 0 t2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t2,
(SELECT 0 t3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t3,
(SELECT 0 t4 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) t4
) calendar INNER JOIN events ON calendar.date_value BETWEEN events.start_date AND events.end_date
WHERE calendar.date_value BETWEEN '2019-01-01' AND '2019-01-04' -- to filter for a specific date range.
GROUP BY date_value
If you are using PostgreSQL, you can generate the calendar with generate_series. Either way, you need a calendar table (or a generated series of dates) to be able to explode the dates.
WITH a AS(
Select '2019-01-01'::date as start_date ,'2019-01-04'::date as end_date union all
Select '2019-01-02'::date , '2019-01-03'::date union all
Select '2019-01-01'::date, '2019-01-06'::date
)
Select t.date_generated,count(*) as event
from a
JOIN(Select date_generated
from generate_series(date '2019-01-01',
date '2019-12-31',
interval '1 day') as t(date_generated)
) t
ON t.date_generated between a.start_date and a.end_date
group by t.date_generated
order by t.date_generated
select Calendar.Calndr_date , count(Calendar.Calndr_date) count_events
from event_table
join Calendar on
Calendar.Calndr_date between event_table.start_date and event_table.end_date
group by Calendar.Calndr_date
This assumes a Calendar table exists, so please create it and populate it with calendar dates first.
Please comment if there is any problem.
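The calendar join used in the answers above amounts to "expand every event into its days, then count per day". The following sketch checks that logic in plain Python against the sample data from the question; days_between is a hypothetical helper, and nothing here is specific to any one database.

```python
from collections import Counter
from datetime import date, timedelta

# (start_date, end_date, event) rows from the question.
events = [
    (date(2019, 1, 1), date(2019, 1, 4), "A"),
    (date(2019, 1, 2), date(2019, 1, 3), "B"),
    (date(2019, 1, 1), date(2019, 1, 6), "C"),
]


def days_between(start, end):
    """Yield every date from start to end, both inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


# Explode each event into one entry per day, then count entries per day.
event_count = Counter(
    day for start, end, _event in events for day in days_between(start, end)
)

for event_day in sorted(event_count):
    print(event_day, event_count[event_day])
```

The counts match the expected result in the question, which confirms that BETWEEN start_date AND end_date (inclusive on both ends) is the right join condition.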

Count in each row the number of second column

Here is the result of my query.
The question is how to count the rows for each selected_date, e.g.:
2012-02-10: 1
2012-02-15: 0
2012-02-14: 3
2012-02-11: 0
How do I write this query?
Here is the query that produces the result above:
select selected_date, date1 from
(select selected_date from
(select adddate('1970-01-01',t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) selected_date from
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
(select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) v
where selected_date between '2012-02-10' and '2012-02-15' ) vv left join clicker on clicker.date1=vv.selected_date
This might work:
SELECT selected_date, SUM(CASE WHEN date1 IS NULL THEN 0 ELSE 1 END) FROM table
GROUP BY selected_date
So, basically this?
SELECT t.selected_date, COUNT(t.date1)
FROM ( Your Query Here ) t
GROUP BY t.selected_date
COUNT(column) ignores NULL values, so dates without a match in the LEFT JOIN are counted as 0 and only real matches are counted.
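The difference between COUNT(column) and COUNT(*) after a LEFT JOIN is easy to see on a tiny data set. This sketch uses SQLite through Python's sqlite3 module; the calendar table stands in for the generated date series, and the clicker rows are made-up sample data chosen to reproduce the counts in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (selected_date TEXT);
CREATE TABLE clicker (date1 TEXT);
INSERT INTO calendar VALUES ('2012-02-10'), ('2012-02-11'), ('2012-02-14'), ('2012-02-15');
INSERT INTO clicker VALUES ('2012-02-10'), ('2012-02-14'), ('2012-02-14'), ('2012-02-14');
""")

rows = conn.execute("""
SELECT c.selected_date,
       COUNT(k.date1) AS matches,   -- skips the NULLs produced by unmatched dates
       COUNT(*)       AS joined     -- counts every joined row, matched or not
FROM calendar AS c
LEFT JOIN clicker AS k ON k.date1 = c.selected_date
GROUP BY c.selected_date
ORDER BY c.selected_date
""").fetchall()

for row in rows:
    print(row)
```

Dates with no clicks show 0 in the matches column but 1 in the joined column, which is why COUNT(*) would be the wrong choice here.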

Using Distinct with Top in Select Clause of Query

In SQL Server I am able to create a query that uses both Top and Distinct in the Select clause, such as this one:
Select Distinct Top 10 program_name
From sampleTable
Will the database return the distinct values from the top 10 results, or will it return the top 10 results of the distinct values? Is this behavior consistent in SQL or is it database dependent?
TOP is applied last, so DISTINCT runs first and TOP then limits the distinct rows.
http://blog.sqlauthority.com/2009/04/06/sql-server-logical-query-processing-phases-order-of-statement-execution/
Use
Select Top 10 program_name
From sampleTable group by program_name;
It returns the top 10 distinct program_name values.
Your query also returns 10 distinct program_name values.
Try this:
select distinct top 10 c from
(
select 1 c union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 2
) as T
order by c
Compare that result to these queries:
select distinct c from (
select top 10 c from
(
select 1 c union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 2
) as T
order by c
) as T2
select top 10 c from (
select distinct c from
(
select 1 c union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 1 union all
select 2
) as T
) as T2
order by c
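For readers without a SQL Server instance, the three queries above can be approximated in SQLite, with the caveat that SQLite has no TOP and LIMIT is used here as a stand-in. That substitution is an assumption of this sketch; it illustrates the same ordering of DISTINCT and row limiting, but it is not evidence about SQL Server itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c INTEGER)")
# Ten rows with the value 1 followed by a single 2, as in the queries above.
conn.executemany("INSERT INTO t VALUES (?)", [(1,)] * 10 + [(2,)])


def values(sql):
    """Run a query and return its single column as a list."""
    return [row[0] for row in conn.execute(sql)]


# DISTINCT and the row limit in the same SELECT.
combined = values("SELECT DISTINCT c FROM t ORDER BY c LIMIT 10")

# Limit first, then DISTINCT: the 2 is cut off before de-duplication.
limit_then_distinct = values(
    "SELECT DISTINCT c FROM (SELECT c FROM t ORDER BY c LIMIT 10) ORDER BY c"
)

# DISTINCT first, then limit.
distinct_then_limit = values(
    "SELECT c FROM (SELECT DISTINCT c FROM t) ORDER BY c LIMIT 10"
)

print(combined, limit_then_distinct, distinct_then_limit)
```

The combined form gives the same rows as "DISTINCT first, then limit", and differs from "limit first, then DISTINCT".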