GROUP BY clause with CASE WHEN ... THEN ... END in HiveQL - hive

I am having trouble getting a query to run successfully.
select session from (select F_SESSION as session
FROM T_TEMP GROUP BY F_SESSION ) a ;
The query above runs successfully. However, the one below fails:
select session, count(total) from (select F_SESSION as session,
case when F_RECORDED_VALUE != 0 then F_RECORDED_VALUE end as total FROM T_TEMP GROUP BY F_SESSION ) a ;
The error is
FAILED: SemanticException [Error 10025]: Line 4:30 Expression not in GROUP BY key '0'
Can someone point out where I am going wrong?

select session, count(total) from (select F_SESSION as session,
case when F_RECORDED_VALUE != 0 then F_RECORDED_VALUE end as total FROM T_TEMP ) a
group by session;
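The rewrite works because the inner query no longer has a GROUP BY, so the CASE expression on F_RECORDED_VALUE is evaluated per row, and the grouping happens in the outer query where count(total) is a proper aggregate. Below is a minimal sketch of that behavior. It uses Python's sqlite3 module as a stand-in for Hive (an assumption made only so the example can run anywhere), and the rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for Hive here; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T_TEMP (F_SESSION TEXT, F_RECORDED_VALUE INTEGER)")
conn.executemany(
    "INSERT INTO T_TEMP VALUES (?, ?)",
    [("s1", 5), ("s1", 0), ("s1", 7), ("s2", 0), ("s2", 3)],
)

# The CASE has no ELSE branch, so zero values become NULL,
# and COUNT(total) skips NULLs when the outer query groups by session.
query = """
SELECT session, COUNT(total)
FROM (
    SELECT F_SESSION AS session,
           CASE WHEN F_RECORDED_VALUE != 0 THEN F_RECORDED_VALUE END AS total
    FROM T_TEMP
) a
GROUP BY session
ORDER BY session
"""
session_counts = conn.execute(query).fetchall()
print(session_counts)  # [('s1', 2), ('s2', 1)]
```

Each session is counted once per non-zero recorded value, which appears to be what the original query intended.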

Related

SQL - Flag rows till 0 value of each group

I am calculating a running balance and want to flag all rows up to the 0 value as 'MATCHED' and the rest as 'NOT-MATCHED', for each account ID.
Here is what I have tried, but it did not give the correct result:
SEL a.*,
CASE WHEN RUNNING_BALANCE OVER (PARTITION BY ACCT_SROGT_ID ROWS UNBOUNDED PRECEDING ) = 0 THEN 'M' ELSE 'NM' END R
FROM NON_MATCHING_RUNNING_BALANCE a
We can use a subquery to find the last acct_rank whose running balance is 0, then use CASE to test each row against it.
Select
a.*,
Case when a.acct_rank > z.last_zero
Then 'unmatched' else 'matched'
End as is_matched
From accounts a
Join ( select
account_id as id,
MAX(acct_rank) last_zero
From accounts
Where running_balance = 0
Group by account_id) z
On a.account_id = z.id;
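As a rough check of the query above, here is a sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for the asker's database, and the `accounts` rows are invented; the column names follow the answer's query.

```python
import sqlite3

# SQLite is a stand-in database; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (account_id INTEGER, acct_rank INTEGER, running_balance INTEGER)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (1, 1, 100), (1, 2, 0), (1, 3, 50), (1, 4, 0), (1, 5, 25),
        (2, 1, 10), (2, 2, 20),  # account 2 never reaches a zero balance
    ],
)

# z holds, per account, the highest acct_rank whose running balance is 0.
# Rows ranked after that point are flagged 'unmatched'.
query = """
SELECT a.account_id, a.acct_rank,
       CASE WHEN a.acct_rank > z.last_zero THEN 'unmatched' ELSE 'matched' END AS is_matched
FROM accounts a
JOIN (SELECT account_id AS id, MAX(acct_rank) AS last_zero
      FROM accounts
      WHERE running_balance = 0
      GROUP BY account_id) z
  ON a.account_id = z.id
ORDER BY a.account_id, a.acct_rank
"""
flags = conn.execute(query).fetchall()
for row in flags:
    print(row)
```

With this sample data, ranks 1 to 4 of account 1 come back as 'matched' and rank 5 as 'unmatched'. Account 2 has no zero balance, so the inner join drops it from the result; whether that is acceptable depends on the data.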
Try this one:
WITH SUB AS (
select NON_MATCHING_RUNNING_BALANCE.*,row_number()over(partition by [Account ID] order by [Account ID]) RN from NON_MATCHING_RUNNING_BALANCE
)
SELECT SUB.*,CASE WHEN SUB.RN<=SUB2.RN AND SUB.[Account ID]=SUB2.[Account ID] THEN 'MATCHED' ELSE 'Not-Matched' END AS 'MATCH' FROM SUB LEFT JOIN (
SELECT * FROM SUB WHERE value=0 ) SUB2 ON SUB.[Account ID]=SUB2.[Account ID]

How do I fix a select statement in the join in Athena?

This query was working in Redshift but isn't in Amazon Athena:
SELECT DISTINCT t1.place_id, t1.date, flg
FROM rec t1
LEFT JOIN (
SELECT date, place_id,
CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS flg
FROM rec
GROUP BY date, place_id
) t2 ON t1.one_year_ago_dow_day = t2.date AND t1.place_id = t2.place_id
The error message is:
------------------------------------------
Your query has the following error(s):
SYNTAX_ERROR: line 84:39: Column 't2.date' cannot be resolved
This query ran against the "~~~" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 29311d73-2cdb-4279-b88c-xxxxxxx
--------------------------------------
The error says "t2.date" is the problem. Is it because I am selecting from the same table in the LEFT JOIN?
How do I get this SQL to work in Athena?
Any insight would be appreciated.
If the date field is causing problems (as suggested by @Deepstop), you can try aliasing it:
SELECT DISTINCT t1.place_id, t1.date, flg
FROM rec t1
LEFT JOIN (
SELECT
date as date1, -- Changed here, and in the ON below
place_id,
CASE WHEN COUNT(*) > 0 THEN 1 ELSE 0 END AS flg
FROM rec
GROUP BY date, place_id
) t2 ON (t1.one_year_ago_dow_day = t2.date1) AND (t1.place_id = t2.place_id)

What is wrong with my select for update skip locked oracle query

I need to update multiple records in a table by matching a particular column (dest_file_path in this case). I'm joining the table with itself: the inner select picks one matching dest_file_path, and the join back to the same table returns all the records that share that value.
update job_queue
set status = 'RUNNING', last_updated_time = systimestamp
where rowid in
(
select jq.rowid from job_queue jq,
(select DEST_FILE_PATH from JOB_QUEUE
where status not in ('RUNNING','FINISHED','SKIPPED') and operation_type in ('copy')
order by case when status='FAILED' then 0 else 1 end desc, dest_file_path fetch first 1 rows only) dest
where jq.dest_file_path = dest.dest_file_path and jq.operation_type='copy'
)
for update skip locked;
Unfortunately I'm getting this error:
Error at Command Line : 25 Column : 1
Error report -
SQL Error: ORA-00933: SQL command not properly ended
00933. 00000 - "SQL command not properly ended"
*Cause:
*Action:
However, if I run a plain select instead of the update, it works just fine. Here is the select query:
select rowid,dest_file_path from job_queue
where rowid in
(
select jq.rowid from job_queue jq,
(select DEST_FILE_PATH from JOB_QUEUE
where dest_file_path='/file/path'
and status not in ('RUNNING','FINISHED','SKIPPED') and operation_type in ('copy')
order by case when status='FAILED' then 0 else 1 end desc, dest_file_path fetch first 1 rows only) dest
where jq.dest_file_path = dest.dest_file_path and jq.operation_type='copy'
)
for update skip locked;
Any suggestions on how to tackle the update?
The for update clause only exists for select statements, not for update statements (which have to lock the rows they update anyway). It sounds like you want to run the select for update first, load the key(s) into a local variable or collection, then do a subsequent update using the values you saved.
You can run your query using PL/SQL:
BEGIN
FOR R
IN (SELECT JQ.ROWID
FROM JOB_QUEUE JQ,
(SELECT *
FROM ( SELECT DEST_FILE_PATH
FROM JOB_QUEUE
WHERE STATUS NOT IN ('RUNNING',
'FINISHED',
'SKIPPED')
AND OPERATION_TYPE IN ('copy')
ORDER BY (CASE STATUS
WHEN 'FAILED' THEN 0
ELSE 1
END) DESC)
WHERE ROWNUM = 1) DEST
WHERE JQ.DEST_FILE_PATH = DEST.DEST_FILE_PATH
AND JQ.OPERATION_TYPE = 'copy'
FOR UPDATE
SKIP LOCKED)
LOOP
UPDATE JOB_QUEUE
SET STATUS = 'RUNNING', LAST_UPDATED_TIME = SYSTIMESTAMP
WHERE ROWID = R.ROWID;
END LOOP;
END;
/

syntax error case statement

I am trying to write an SQL statement but I am getting a syntax error. I know it has to do with my select and case statement, but I can't figure it out because the error is not descriptive. I am using Redshift.
select school_district_teacher_ind,customer_status,initial_pay_type,(select(
CASE
WHEN total_line_price = 0
THEN 'free'
ELSE 'paid'
END
)
from storiacloud.schl_storia_revenue_fact_a)as a,count(distinct convert(varchar(100),[Otc_Order_Number])+'_'+ convert(varchar(100),[Otc_Order_Line_Number]))
from storiacloud.schl_storia_revenue_fact_a as fact
inner join
storiacloud.schl_storia_school_status as status
on fact.school_ucn = status.ucn
where date = '11/2/2015'
group by school_district_teacher_ind,customer_status,initial_pay_type,a
Below is the error
ERROR: Invalid Query:
Detail:
-----------------------------------------------
error: Invalid Query:
code: 8001
context: single-row subquery returns more than one row
query: 5132289
location: 25.cpp:69
process: padbmaster [pid=29183]
-----------------------------------------------
Execution time: 0.16s
1 statement failed.
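The results that I expect are
Note that the first column, customer type, is school_district_teacher_ind in the select statement above.
Your CASE was wrapped in a scalar subquery that selects from the same table as your main query. That subquery is not tied to the current row, so it returns one value per row of the table, which is what "single-row subquery returns more than one row" is complaining about. Use the CASE expression directly instead. Try this: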
SELECT school_district_teacher_ind ,
customer_status ,
initial_pay_type ,
CASE WHEN total_line_price = 0 THEN 'free'
ELSE 'paid'
END AS a ,
COUNT(DISTINCT CONVERT(VARCHAR(100), [Otc_Order_Number]) + '_'
+ CONVERT(VARCHAR(100), [Otc_Order_Line_Number]))
FROM storiacloud.schl_storia_revenue_fact_a AS fact
INNER JOIN storiacloud.schl_storia_school_status AS status ON fact.school_ucn = status.ucn
WHERE date = '11/2/2015'
GROUP BY school_district_teacher_ind ,
customer_status ,
initial_pay_type ,
CASE WHEN total_line_price = 0 THEN 'free'
ELSE 'paid'
END
I think you just want conditional aggregation. The query is something like this:
select school_district_teacher_ind, customer_status, initial_pay_type,
sum(case when total_line_price = 0 then 1 else 0 end) as free,
sum(case when total_line_price = 0 then 0 else 1 end) as paid
from storiacloud.schl_storia_revenue_fact_a fact inner join
storiacloud.schl_storia_school_status status
on fact.school_ucn = status.ucn
where date = '2015-11-02'
group by school_district_teacher_ind,customer_status, initial_pay_type;
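The two answers produce differently shaped results: grouping by the CASE expression gives one row per free/paid class, while conditional aggregation gives one row per group with free and paid as columns. Here is a small sketch of both, run through Python's sqlite3 module. SQLite stands in for Redshift, and the `revenue` table with its `customer_type` column is a hypothetical simplification of the asker's schema.

```python
import sqlite3

# Hypothetical, simplified table; SQLite stands in for Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (customer_type TEXT, total_line_price REAL)")
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?)",
    [("school", 0), ("school", 9.99), ("school", 0), ("teacher", 4.5), ("teacher", 0)],
)

# Variant 1: repeat the CASE expression in GROUP BY.
# Result: one row per (customer_type, free/paid) combination.
by_case = conn.execute("""
SELECT customer_type,
       CASE WHEN total_line_price = 0 THEN 'free' ELSE 'paid' END AS pay_class,
       COUNT(*)
FROM revenue
GROUP BY customer_type,
         CASE WHEN total_line_price = 0 THEN 'free' ELSE 'paid' END
ORDER BY customer_type, pay_class
""").fetchall()

# Variant 2: conditional aggregation.
# Result: one row per customer_type, with free and paid as separate columns.
pivoted = conn.execute("""
SELECT customer_type,
       SUM(CASE WHEN total_line_price = 0 THEN 1 ELSE 0 END) AS free,
       SUM(CASE WHEN total_line_price = 0 THEN 0 ELSE 1 END) AS paid
FROM revenue
GROUP BY customer_type
ORDER BY customer_type
""").fetchall()

print(by_case)
print(pivoted)
```

Note that both variants in this sketch count rows. The original query counted distinct order-number and line-number combinations, so the SUM version would only match it if each row is already a unique order line.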

How do I group by generic field?

If I have a table with a status field and I want to know how many records have each status, I can do a simple group by. But what if I want the count for two specific statuses and a single combined count for all the others?
In other words I want this:
Status Count
-------- -----
Success X
Running Y
Failure Z
However, Failure is not stored as 'Failure' in the table; the field contains the actual error message. So I want to count everything other than Success and Running as Failure.
select case when Status <> 'Success'
and Status <> 'Running'
then 'Failure'
else Status
end Status,
count (*) [Count]
from atable
group by case when Status <> 'Success'
and Status <> 'Running'
then 'Failure'
else Status
end
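A quick way to see how this query behaves is to run it on a few rows. The sketch below uses Python's sqlite3 module as a stand-in database with made-up rows, and renames the output columns to `status_label` and `status_count` (hypothetical names) to keep the SQL portable. It also includes one NULL status, an edge case the question does not mention: both `<>` comparisons evaluate to NULL for that row, so it falls through to the ELSE branch and forms its own NULL group instead of being counted as 'Failure'.

```python
import sqlite3

# Stand-in database and invented rows; 'atable' and 'Status' follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (Status TEXT)")
conn.executemany(
    "INSERT INTO atable VALUES (?)",
    [("Success",), ("Running",), ("Disk full",), ("Timeout",), ("Success",), (None,)],
)

# Any status other than 'Success' or 'Running' is relabeled 'Failure'.
# A NULL status makes the WHEN condition NULL, so the ELSE branch keeps it NULL.
query = """
SELECT CASE WHEN Status <> 'Success' AND Status <> 'Running'
            THEN 'Failure'
            ELSE Status
       END AS status_label,
       COUNT(*) AS status_count
FROM atable
GROUP BY CASE WHEN Status <> 'Success' AND Status <> 'Running'
              THEN 'Failure'
              ELSE Status
         END
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

If NULL statuses should count as failures too, the condition would need an explicit `Status IS NULL` check; whether that applies depends on whether the column is nullable.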
Script (from the SQL Fiddle demo):
CREATE TABLE errormsgs
(
id INT NOT NULL IDENTITY
, statusmsg VARCHAR(30) NOT NULL
);
INSERT INTO errormsgs (statusmsg) VALUES
('Success'),
('This is error message 1.'),
('Running'),
('This is error message 2.'),
('This is error message 3.'),
('Success'),
('Success'),
('This is error message 4.'),
('Running'),
('failure, may be'),
('failure, absolutely.');
;WITH statuses AS
(
SELECT CASE
WHEN statusmsg NOT IN ('Success', 'Running') THEN 'Failure'
ELSE statusmsg
END status
FROM errormsgs
)
SELECT status
, COUNT(status) AS status_count
FROM statuses
GROUP BY status;
Output:
STATUS STATUS_COUNT
-------- ------------
Failure 6
Running 2
Success 3
SELECT DISTINCT CASE
WHEN [status]='s' OR [STATUS]='r' THEN [status]
ELSE 'OTHER'
END AS STATUS
,COUNT(1) OVER(
PARTITION BY CASE
WHEN [status]='s'
OR [STATUS]='r' THEN [status] ELSE 'OTHER' END
) AS 'count'
FROM tbl2