Hive TimeStamp for Weekly and Quarterly data

The Hive command below,
select * from my_new_table
where month(time_stamp)= '03'
and year(time_stamp) = '2016'
and age = '1'
and gender = '0'
and income = '4'
and ethnicity = '3'
and marital_status = '1';
generates the following results, covering all the data (31 days) for March (03) 2016:
time_stamp age gender income ethnicity marital_status
2016-03-14#17:42:47.000 1 0 4 3 1
2016-03-14#16:10:51.000 1 0 4 3 1
2016-03-20#15:16:44.000 1 0 4 3 1
2016-03-14#17:13:51.000 1 0 4 3 1
2016-03-14#17:12:51.000 1 0 4 3 1
2016-03-14#18:24:51.000 1 0 4 3 1
2016-03-03#13:02:06.000 1 0 4 3 1
Similarly, I want to get the data for the 2nd quarter of 2016 (or the nth quarter; the data should span 3 months, starting March 1st till May 31st for this quarter) and for the 12th week of 2016 (or the nth week; the data should span the 7 days of that particular week). What is the correct Hive command for this?
I get an error if I replace month with quarter or week in the Hive command.
select * from my_new_table
where quarter(time_stamp)='03'
and year(time_stamp) = '2016';
returns
FAILED: SemanticException [Error 10011]: Line 1:71 Invalid function 'quarter'
and
select * from my_new_table
where week(time_stamp)='12'
and year(time_stamp) = '2016';
returns
FAILED: SemanticException [Error 10011]: Line 1:71 Invalid function 'week'
It looks like some calculation is needed to get the nth quarter or week, but I'm not sure. Please help. Thanks.

The documentation says that the month and weekofyear functions expect a "string date", not an actual date or timestamp type, as input. (Hive's week-number function is named weekofyear; week is not a built-in function name, which is what the second error reports.) You may need to use the date_format function to convert your timestamp to a string first, then apply the functions, like
month(date_format(time_stamp, 'yyyy-MM-dd'))
This doesn't explain why your first example, which uses these functions in the where clause, seems to work, however.
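For reference, the arithmetic behind a quarter or week filter can be sketched outside Hive. The Python below is only an illustration under stated assumptions: it uses calendar quarters (January to March is quarter 1, so a March to May window would need an explicit date range instead) and ISO 8601 week numbers. The function names quarter_of and iso_week_of are hypothetical. On a Hive version that lacks quarter, the same value can be derived from month(), for example with ceil(month(...) / 3.0).

```python
from datetime import date


def quarter_of(d: date) -> int:
    """Calendar quarter (1-4): months 1-3 -> 1, 4-6 -> 2, and so on."""
    return (d.month - 1) // 3 + 1


def iso_week_of(d: date) -> int:
    """ISO 8601 week number (weeks start on Monday)."""
    return d.isocalendar()[1]


# Sample timestamps taken from the result set in the question.
sample = [date(2016, 3, 14), date(2016, 3, 20), date(2016, 3, 3)]
first_quarter_rows = [d for d in sample if quarter_of(d) == 1]
week_11_rows = [d for d in sample if iso_week_of(d) == 11]
```

Filtering by quarter or week is then a comparison on these derived integers together with the year, which is the same shape as the month filter in the question.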

Related

SQLite strftime Query Issue

SELECT *
FROM BillData
WHERE date BETWEEN strftime('%d-%m-%Y','2020-05-04') AND strftime('%d-%m-%Y','2020-07-08')
Query result: it also picks a date where the year is 2019
SN Date
1 07-07-2020
2 07-06-2020
3 07-01-2020
4 08-07-2020
5 08-07-2019 <-- this 2019 row is returned too; I need a solution for this issue
Use a format that matches the date,
strftime('%Y-%m-%d','2020-05-04')
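To see why the 2019 row slips through, it can help to reproduce the query with Python's built-in sqlite3 module. This is a sketch that assumes the date column stores text in DD-MM-YYYY form, as the result listing suggests. Text in that form compares character by character, so '08-07-2019' sorts between '04-05-2020' and '08-07-2020'. Rearranging the stored text into YYYY-MM-DD with substr before comparing is one possible fix, not necessarily the one the answer above intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BillData (SN INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO BillData VALUES (?, ?)",
    [(1, "07-07-2020"), (2, "07-06-2020"), (3, "07-01-2020"),
     (4, "08-07-2020"), (5, "08-07-2019")],
)

# Original query: both bounds become DD-MM-YYYY text, compared as strings.
naive = [row[0] for row in conn.execute(
    "SELECT SN FROM BillData "
    "WHERE date BETWEEN strftime('%d-%m-%Y','2020-05-04') "
    "AND strftime('%d-%m-%Y','2020-07-08') ORDER BY SN"
)]

# Sketch of a fix: rebuild YYYY-MM-DD from the stored text, then compare.
iso_expr = "substr(date, 7, 4) || '-' || substr(date, 4, 2) || '-' || substr(date, 1, 2)"
fixed = [row[0] for row in conn.execute(
    f"SELECT SN FROM BillData WHERE {iso_expr} "
    "BETWEEN '2020-05-04' AND '2020-07-08' ORDER BY SN"
)]
```

Storing the dates as YYYY-MM-DD in the first place would avoid the substr step, because that format sorts chronologically as plain text.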

HIVE Obtain First Weekday of Current Month

Morning,
Title says it all. I cannot for the life of me figure out how to obtain the first weekday of the current month (or previous month etc.) in HQL.
So if today's date were to be evaluated, it should return 2/3/2020 as the date, since the 3rd was the first weekday of this month.
I've tried case statements that evaluate the first day of the month and add 2 days if it is a Saturday or 1 day if it is a Sunday, but it is not working and I receive the following error:
ERROR: Prepare error: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:452 extraneous input ',' expecting KW_THEN near ''line 1:626 extraneous input ',' expecting KW_THEN near ''
case when date_format(date_add(current_date, 1 - day(current_date)),'u')=6, then to_date(mydate) = date_add(date_add(current_date, 1 - day(current_date)+2))
when date_format(date_add(current_date, 1 - day(current_date)),'u')=7, then to_date(mydate) = date_add(date_add(current_date, 1 - day(current_date)+1))
else to_date(mydate) = date_add(date_add(current_date, 1 - day(current_date))) end
Please help!
@Vamsi Prabhala
I tried this however I receive a result of 1. I need a date returned, specifically 2/3/2020 for this month.
case when date_format(date_add(current_date, 1 - day(current_date)),'u')> 5 then
to_date(mydate)= to_date(date_add(date_add(current_date, 1 - day(current_date)),8-cast(date_format(date_add(current_date, 1 - day(current_date)),'u') as int)%8))
else to_date(mydate) = date_add(current_date, 1 - day(current_date)) end
One option is to get the first day of the month and then add days if that day falls on a weekend. With the 'u' pattern, date_format numbers the days of the week as 1 = Monday, 2 = Tuesday ... 6 = Saturday, 7 = Sunday.
select dt
,case when date_format(first_day_in_month,'u') > 5
then date_add(first_day_in_month,8-cast(date_format(first_day_in_month,'u') as int))
else first_day_in_month end as first_week_day
from (select dt
,date_sub(dt,cast(date_format(dt,'d') as int)-1) as first_day_in_month
from tbl
) t
;
Note that the date_format function works in Hive versions 1.2.0 and above.
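The same rule can be checked outside Hive with a small Python sketch. It assumes the ISO day numbering used above (1 = Monday through 7 = Sunday), which Python exposes as isoweekday(). The function name first_weekday_of_month is hypothetical.

```python
from datetime import date, timedelta


def first_weekday_of_month(d: date) -> date:
    """Return the first Monday-to-Friday date of the month containing d."""
    first = d.replace(day=1)
    dow = first.isoweekday()  # 1 = Monday ... 6 = Saturday, 7 = Sunday
    if dow > 5:
        # Saturday (6) moves forward 2 days, Sunday (7) moves forward 1 day.
        return first + timedelta(days=8 - dow)
    return first


example = first_weekday_of_month(date(2020, 2, 15))
```

For any date in February 2020 this yields 2020-02-03, the value the question asks for, because 1 February 2020 was a Saturday.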

Get monthly report in sql

My database table has the columns logs, data, and type.
How can I get the logs from the past year, broken down by month?
For example:
row
logs = 'error...'
data = 2012-11-05 11:24:08
type = 1
....
And I want to get them in this form:
month logs-count type
January 100 1
January 100 2
February 160 1
February 120 2
....
For mysql:
select monthname(data) as month, count(*) as `logs-count`, type
from table
group by month, type
You need to group by both month and type to get multiple rows per month.
Try this one (datename is SQL Server syntax):
select datename(month, data) as month, count(logs) as logscount, type from table
group by datename(month, data), type
Try this, if these columns are in the same table:
select monthname(data) as month, count(logs) as `logs-count`, type from table
group by month, type
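The grouping idea can be tried end to end with Python's built-in sqlite3 module. This is a sketch, not a MySQL or SQL Server answer: the table name app_logs and the sample rows are invented for illustration, and SQLite's strftime stands in for monthname or datename. Grouping on year and month together keeps the same month from two different years apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_logs (logs TEXT, data TEXT, type INTEGER)")
conn.executemany(
    "INSERT INTO app_logs VALUES (?, ?, ?)",
    [
        # Made-up rows in the shape described in the question.
        ("error a", "2012-11-05 11:24:08", 1),
        ("error b", "2012-11-17 09:00:00", 1),
        ("error c", "2012-11-20 10:00:00", 2),
        ("error d", "2012-12-01 08:30:00", 1),
    ],
)

# One output row per (month, type) pair, with the number of log entries.
report = conn.execute(
    "SELECT strftime('%Y-%m', data) AS month, type, COUNT(*) AS logs_count "
    "FROM app_logs "
    "GROUP BY month, type "
    "ORDER BY month, type"
).fetchall()
```

Restricting the report to the past year would be an extra WHERE condition on the data column, which is left out here to keep the sketch short.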

Determine how to calculate split weeks in SQL Server

I'm not quite sure what the term is; I have been calling it "Split Weeks", but here is what I need to find out.
Given:
User will input @StartDate and @EndDate
col_week_end_date will always end on a Saturday, and is a DateTime column.
I want to cycle through either multiple or a single month(s) and sum col_payment_amt
Using the month of September 2010, a col_payment_amt with a col_week_end_date falls on 09/04/2010, which covers the week of Aug 29 - Sep 04.
The payment month is September, but only 3 workdays fall w/i this week (Wed, Thurs, Fri). So only 3/5ths of the payment is made for that week.
The same thing happens with the end of a month. In this case, the col_week_end_date falls on 10/02/2010. Only 4/5ths of the payment will be made for this week.
I have a particular way to sum the col_payment_amt when this happens at the beginning of a month, and also at the end.
What I can't figure out is how to tell when I am at the start of a month and when I am at the end of a month, so that I can apply the appropriate function when running the report for multiple months (Aug - Oct).
Currently, if I just force them to run the report for a single month, there are no problems. I have been told it is only a monthly report, but I know that eventually I will be asked whether it can be run for multiple months.
I know it is basically something like:
SELECT sum(CASE WHEN at-top-of-month-with-split-week THEN....
WHEN at-bottom-of-month-with-split-week THEN...
ELSE col_payment_amt END) as PayTotal
FROM...
WHERE....
GROUP BY...
I'm trying to figure out the at-top-of-month and at-bottom-of-month parts. The other parts I have.
This code works in IBM Informix Dynamic Server (tested on 11.50.FC6 for Mac OS X 10.6.4, but it should work on any supported version of IDS, and on any platform). You will need to translate it into MS SQL Server notation.
Version 2 - reference month identified by month number only
CREATE FUNCTION NumWeekDaysInRefMonth(eow DATE DEFAULT TODAY, ref_month INTEGER)
RETURNING INT AS numdays;
DEFINE monday DATE;
DEFINE friday DATE;
DEFINE mon_month INTEGER;
DEFINE fri_month INTEGER;
IF eow IS NULL OR ref_month IS NULL THEN RETURN NULL; END IF;
LET monday = eow - WEEKDAY(eow) + 1;
LET friday = monday + 4;
LET mon_month = MONTH(monday);
LET fri_month = MONTH(friday);
IF mon_month = ref_month AND fri_month = ref_month THEN
-- All in same month: 5 days count.
RETURN 5;
END IF;
IF mon_month != ref_month AND fri_month != ref_month THEN
-- None in the same month: 0 days count.
RETURN 0;
END IF;
-- Some of the days are in the same month, some are not.
IF mon_month = ref_month THEN
-- End of month
RETURN 5 - DAY(friday);
ELSE
-- Start of month
RETURN DAY(friday);
END IF;
END FUNCTION;
Test cases
SELECT NumWeekDaysInRefMonth('2010-09-04', 9) answer, 3 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-09-04', 8) answer, 2 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', 9) answer, 4 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', 10) answer, 1 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', 8) answer, 0 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-09', 10) answer, 5 AS expected FROM dual;
This has much the same logic as the previous version (below); it takes just a month number to identify which billing month you are interested in, rather than a full date.
Version 1 - full reference date
CREATE FUNCTION NumWeekDaysInRefMonth(eow DATE DEFAULT TODAY, ref DATE DEFAULT TODAY)
RETURNING INT AS numdays;
DEFINE mon DATE;
DEFINE fri DATE;
DEFINE v_mon INTEGER;
DEFINE v_fri INTEGER;
DEFINE v_ref INTEGER;
IF eow IS NULL OR ref IS NULL THEN RETURN NULL; END IF;
LET mon = eow - WEEKDAY(eow) + 1;
LET fri = mon + 4;
LET v_mon = YEAR(mon) * 100 + MONTH(mon);
LET v_fri = YEAR(fri) * 100 + MONTH(fri);
LET v_ref = YEAR(ref) * 100 + MONTH(ref);
IF v_mon = v_ref AND v_fri = v_ref THEN
-- All in same month: 5 days count.
RETURN 5;
END IF;
IF v_mon != v_ref AND v_fri != v_ref THEN
-- None in the same month: 0 days count.
RETURN 0;
END IF;
-- Some of the days are in the same month, some are not.
IF v_mon = v_ref THEN
-- End of month
RETURN 5 - DAY(fri);
ELSE
-- Start of month
RETURN DAY(fri);
END IF;
-- Month-end wrapping
-- 26 27 28 29 30 31 1 2 3 4 5 6 Jan, Mar, May, Jul, Aug, Oct, Dec
-- 25 26 27 28 29 30 1 2 3 4 5 6 Apr, Jun, Sep, Nov
-- 24 25 26 27 28 29 1 2 3 4 5 6 Feb - leap year
-- 23 24 25 26 27 28 1 2 3 4 5 6 Feb
-- Mo Tu We Th Fr Sa Su Mo Tu We Th Fr
-- Su Mo Tu We Th Fr Sa Su Mo Tu We Th
-- Sa Su Mo Tu We Th Fr Sa Su Mo Tu We
-- Fr Sa Su Mo Tu We Th Fr Sa Su Mo Tu
-- Th Fr Sa Su Mo Tu We Th Fr Sa Su Mo
-- We Th Fr Sa Su Mo Tu We Th Fr Sa Su
-- Tu We Th Fr Sa Su Mo Tu We Th Fr Sa
END FUNCTION;
Test cases
These tests are equivalent to the previous set.
SELECT NumWeekDaysInRefMonth('2010-09-04', '2010-09-01') answer, 3 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-09-04', '2010-08-01') answer, 2 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', '2010-09-01') answer, 4 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', '2010-10-01') answer, 1 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-02', '2010-08-01') answer, 0 AS expected FROM dual;
SELECT NumWeekDaysInRefMonth('2010-10-09', '2010-10-01') answer, 5 AS expected FROM dual;
Explanation
The Informix DATE type counts in days, so adding 1 to a DATE gives the day after. The Informix WEEKDAY function returns 0 for Sunday, 1 for Monday, ..., 5 for Friday, 6 for Saturday. The DAY, MONTH and YEAR functions return the corresponding component of a DATE value.
The code allows any day of the week as the reference date for the payment (it does not have to be Saturday). Similarly, although the examples for version 1 use the first of the month as the reference day for the month, you can supply any date within the requisite month as the reference date; in version 2, that is simplified to passing in the requisite month number.
If anyone passes a NULL into the function, the answer is NULL.
Then we calculate the Monday of the week containing the 'end of week' date: if the week day is Sunday, it subtracts 0 and adds 1 to get Monday; if the week day is Saturday, it subtracts 6 and adds 1 to get Monday; and so on. Friday is 4 days later.
In version 1, we then calculate a representation for the year and month.
If both Monday and Friday fall in the reference month, then the answer is 5 days; if neither falls in the reference month, the answer is 0 days. If the Friday is within the reference month, then the DAY() value of the Friday date is the number of weekdays that fall in the reference month. Otherwise, the Monday is within the reference month, and the number of weekdays in the reference month is 5 - DAY(Friday).
Note that this calculation also deals with completely unrelated months, as shown by the last but one test case: there are zero days of a payment made in October that should be counted in August.
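For readers who want to check the logic before translating it, here is a rough Python port of version 1. It is a sketch rather than the SQL Server translation the question asks for. The name num_weekdays_in_ref_month is hypothetical, and the Informix WEEKDAY convention (0 = Sunday ... 6 = Saturday) is emulated from Python's Monday-based weekday().

```python
from datetime import date, timedelta


def num_weekdays_in_ref_month(eow: date, ref: date) -> int:
    """Count the Monday-to-Friday days of eow's week that fall in ref's month."""
    informix_weekday = (eow.weekday() + 1) % 7  # 0 = Sunday ... 6 = Saturday
    monday = eow - timedelta(days=informix_weekday) + timedelta(days=1)
    friday = monday + timedelta(days=4)

    mon_key = (monday.year, monday.month)
    fri_key = (friday.year, friday.month)
    ref_key = (ref.year, ref.month)

    if mon_key == ref_key and fri_key == ref_key:
        return 5  # whole working week lies in the reference month
    if mon_key != ref_key and fri_key != ref_key:
        return 0  # no part of the working week lies in the reference month
    if mon_key == ref_key:
        return 5 - friday.day  # end of month: days before the month rolls over
    return friday.day  # start of month: days since the 1st, up to Friday


# The same cases as the Informix test queries: (end of week, reference, expected).
cases = [
    (date(2010, 9, 4), date(2010, 9, 1), 3),
    (date(2010, 9, 4), date(2010, 8, 1), 2),
    (date(2010, 10, 2), date(2010, 9, 1), 4),
    (date(2010, 10, 2), date(2010, 10, 1), 1),
    (date(2010, 10, 2), date(2010, 8, 1), 0),
    (date(2010, 10, 9), date(2010, 10, 1), 5),
]
results = [num_weekdays_in_ref_month(eow, ref) for eow, ref, _ in cases]
```

Dividing the returned count by 5 gives the fraction of the weekly payment that belongs to the reference month, such as the 3/5 and 4/5 described in the question.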
Ok so here is what I ended up doing in a nutshell.
I set up a WHILE loop. I loop through the months for the date range, and insert the data into a TempTable along with a numeric Reference Month column (well, really an INT), and the month name (September, October etc...). From the TempTable I then select the data and have a CASE statement which determines which calculation to use. So it reads something like this...
select sum(case
-- Top of the month, Split Week
when datepart(d,col_week_end_date) <= 7 AND col_RefMonth = month(col_week_end_date)
then ...Top of Month calc...
-- Middle of the month, entire week is in Reference Month
when datepart(d,col_week_end_date) > 7 AND col_RefMonth = month(col_week_end_date)
then ...Just SUM the column as usual...
-- Bottom of month, Split Week
when col_RefMonth < month(col_week_end_date)
then ...Bottom of Month calc...
end) as MonthlySum,
col_RefMonth,
col_RefMonthName
from dbo.TempTable
group by col_RefMonth, col_RefMonthName
order by col_RefMonth
So far, with the limited amount of data I have, it appears to be working with the split weeks in the months I have available. It seems the "Reference Month" was the key, and I needed to loop through and put the data in a TempTable. I was hoping there would be a way to get it in a single pass, or at least a quick dump into a #TempTable and select from there.
I'm not sure how efficient it will be, but it is only for monthly reporting and not really as an online interactive report.

Oracle week calculation issue

I am using Oracle's to_char() function to convert a date to a week number (1-53):
select pat_id,
pat_enc_csn_id,
contact_date,
to_char(contact_date,'ww') week,
...
The 'ww' switch gives me these values for dates in January of this year:
Date Week
1-Jan-10 1
2-Jan-10 1
3-Jan-10 1
4-Jan-10 1
5-Jan-10 1
6-Jan-10 1
7-Jan-10 1
8-Jan-10 2
9-Jan-10 2
10-Jan-10 2
11-Jan-10 2
12-Jan-10 2
A quick look at the calendar indicates that these values should be:
Date Week
1-Jan-10 1
2-Jan-10 1
3-Jan-10 2
4-Jan-10 2
5-Jan-10 2
6-Jan-10 2
7-Jan-10 2
8-Jan-10 2
9-Jan-10 2
10-Jan-10 3
11-Jan-10 3
12-Jan-10 3
If I use the 'iw' switch instead of 'ww', the outcome is less desirable:
Date Week
1-Jan-10 53
2-Jan-10 53
3-Jan-10 53
4-Jan-10 1
5-Jan-10 1
6-Jan-10 1
7-Jan-10 1
8-Jan-10 1
9-Jan-10 1
10-Jan-10 1
11-Jan-10 2
12-Jan-10 2
Is there another Oracle function that will calculate weeks as I would expect or do I need to write my own?
EDIT
I'm trying to match the logic used by Crystal Reports. Each full week starts on a Sunday; the first week of the year starts on whichever day is represented by January 1st (e.g. in 2010, January 1st is a Friday).
When using IW, Oracle follows the ISO 8601 standard regarding week numbers (see http://en.wikipedia.org/wiki/ISO_8601). That is the same standard as the one we generally use here in Europe.
Your problem is also mentioned on the Oracle forum: http://forums.oracle.com/forums/thread.jspa?threadID=947291 and http://forums.oracle.com/forums/message.jspa?messageID=3318715#3318715. Maybe you can find a solution there.
I know this is old, but still a common question.
This should give you the correct results in the smallest amount of effort:
select pat_id,
pat_enc_csn_id,
contact_date,
to_char(contact_date + 1,'IW') week,
...
Since it looks like you are using your own special definition of the week number you'll need to write your own function.
It may help to know that NLS_TERRITORY determines which day the D format model treats as the first day of the week.
see also:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements004.htm#SQLRF00210
and
http://www.adp-gmbh.ch/ora/sql/to_char.html
Based on this question, How do I calculate the week number given a date?, I wrote the following Oracle logic:
CASE
--if [date field]'s day-of-week (e.g. Monday) is earlier than 1/1/YYYY's day-of-week
WHEN to_char(to_date('01/01/' || to_char([date field],'YYYY'),'mm/dd/yyyy'), 'D') - to_char([date field], 'D') > 1 THEN
--adjust the week
trunc(to_char([date field], 'DDD') / 7) + 1 + 1 --'+ 1 + 1' used for clarity
ELSE trunc(to_char([date field], 'DDD') / 7) + 1
END calendar_week
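The rule stated in the EDIT (week 1 starts on January 1st, and every later week starts on a Sunday) can be written as one integer formula. The Python below is a sketch of that rule only; it is not Oracle code, and the function name sunday_based_week is hypothetical. The idea is to shift the day of the year by the number of days between January 1st and the Sunday on or before it, then divide by 7.

```python
from datetime import date


def sunday_based_week(d: date) -> int:
    """Week number where week 1 begins on Jan 1 and later weeks begin on Sunday."""
    jan1 = date(d.year, 1, 1)
    # Days from the Sunday on or before Jan 1 up to Jan 1 (0 if Jan 1 is a Sunday).
    offset = (jan1.weekday() + 1) % 7
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1 + offset) // 7 + 1


# Week numbers for 1-Jan-10 through 12-Jan-10, the dates listed in the question.
january_2010 = [sunday_based_week(date(2010, 1, day)) for day in range(1, 13)]
```

The resulting list lines up with the "should be" column in the question. In Oracle the same arithmetic would need the day of year (the DDD format) and a day-of-week value that does not depend on NLS_TERRITORY.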