SQL Query not working

I seem to be getting this error while trying to run the below query:
SELECT
to_char(EFFECTIVE_DT,'YYYY-MM') as YYYYMM,
--EFFECTIVE_DT,
AH01_PAYMENT_STATUS_CTD,
TSYS_ACCT_ID
FROM OIS_TSYS.AH_CYCLE_HIST
WHERE 1=1
AND EFFECTIVE_DT BETWEEN '01-MAY-2017' AND '31-MAY-2017'
GROUP BY 2
ORDER BY 1
error: ORA-00979: not a GROUP BY expression
I am trying to group by date, because at the moment I get daily results for each individual account.
Result set:
65589 N 03-MAY-17
65590 S 03-MAY-17
65591 M 03-MAY-17
65592 F 03-MAY-17
65617 G 03-MAY-17
Any help would be amazing.
Best,
Saad

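When you use GROUP BY, every selected column that is not in the GROUP BY list must be wrapped in an aggregate function (SUM, AVG, MIN, MAX, ...). Also note that Oracle does not, by default, treat the 2 in "GROUP BY 2" as a column position, so it does not group by the second column.
The "1=1" condition is redundant and can be removed.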

When a query has a GROUP BY clause, every column in the SELECT list must either appear in the GROUP BY clause or be inside an aggregate function such as SUM, COUNT, MIN, or MAX. You cannot group by just one column while selecting other non-aggregated columns. So in your case you have to put all three selected columns in the GROUP BY.
To get the desired result, use the query below:
SELECT
TSYS_ACCT_ID,
AH01_PAYMENT_STATUS_CTD,
to_char(EFFECTIVE_DT,'YYYY-MM') as YYYYMM
FROM OIS_TSYS.AH_CYCLE_HIST
WHERE EFFECTIVE_DT BETWEEN '01-MAY-2017' AND '31-MAY-2017'
GROUP BY
TSYS_ACCT_ID,
AH01_PAYMENT_STATUS_CTD,
to_char(EFFECTIVE_DT,'YYYY-MM')
ORDER BY 1
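
The rule above can be checked on a small scale. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for Oracle, so the lowercase table ah_cycle_hist, its sample rows, the added COUNT(*) column, and strftime (in place of to_char) are all assumptions made for illustration, not the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ah_cycle_hist ("
    "tsys_acct_id INTEGER, ah01_payment_status_ctd TEXT, effective_dt TEXT)"
)
# Hypothetical sample rows: two daily records for account 65589, one for 65590.
conn.executemany(
    "INSERT INTO ah_cycle_hist VALUES (?, ?, ?)",
    [
        (65589, "N", "2017-05-03"),
        (65589, "N", "2017-05-04"),
        (65590, "S", "2017-05-03"),
    ],
)

# Every non-aggregated SELECT column is repeated in GROUP BY, so the daily
# rows collapse to one row per account, status, and month.
rows = conn.execute(
    """
    SELECT tsys_acct_id,
           ah01_payment_status_ctd,
           strftime('%Y-%m', effective_dt) AS yyyymm,
           COUNT(*) AS daily_rows
    FROM ah_cycle_hist
    WHERE effective_dt BETWEEN '2017-05-01' AND '2017-05-31'
    GROUP BY tsys_acct_id,
             ah01_payment_status_ctd,
             strftime('%Y-%m', effective_dt)
    ORDER BY 1
    """
).fetchall()

print(rows)
```

The two daily rows for account 65589 come back as a single monthly row, which is the collapse the asker was after.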

Related

MAX in Select statement not returning the highest value?

I have a question regarding MAX in a select statement.
Without MAX I have this select:
SELECT stockID, DATE, close, symbol
FROM ta_stockprice JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
WHERE stockid = 8648
ORDER BY close
In the end I only want the row with the maximum value in the close column, so I tried adding MAX() to the select.
Why didn't I get date = "2021-07-02" as output?
(I saw that I always get "2021-07-01" as output, no matter whether I use MAX / MIN / AVG...)
The MAX() turns the query into an aggregation query. With no GROUP BY, it returns one row. But the query is syntactically incorrect, because it mixes aggregated and unaggregated columns.
Once upon a time, MySQL allowed such syntax in violation of the SQL Standard, but it returned values from arbitrary rows for the unaggregated columns.
Use ORDER BY to do what you want:
SELECT stockID, DATE, close, symbol
FROM ta_stockprice JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
WHERE stockid = 8648
ORDER BY close DESC
LIMIT 1;
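
As a rough check of the ORDER BY ... LIMIT 1 approach, here is a sketch that runs the same query in SQLite through Python's sqlite3 module. The table layouts, the symbol 'XYZ', and the three prices are hypothetical; only the query shape comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ta_stockprice (stockID INTEGER, date TEXT, close REAL)")
conn.execute("CREATE TABLE ta_stock (id INTEGER, symbol TEXT)")
conn.execute("INSERT INTO ta_stock VALUES (8648, 'XYZ')")  # hypothetical symbol
conn.executemany(
    "INSERT INTO ta_stockprice VALUES (?, ?, ?)",
    [
        (8648, "2021-07-01", 10.5),  # hypothetical prices
        (8648, "2021-07-02", 12.0),
        (8648, "2021-07-05", 11.25),
    ],
)

# Sort by close descending and keep the first row, so all four columns
# come from the same row: the one holding the highest close.
top_row = conn.execute(
    """
    SELECT stockID, date, close, symbol
    FROM ta_stockprice JOIN ta_stock ON ta_stock.id = ta_stockprice.stockID
    WHERE stockID = 8648
    ORDER BY close DESC
    LIMIT 1
    """
).fetchone()

print(top_row)
```

Because no aggregate is involved, the date is guaranteed to belong to the row with the highest close rather than to an arbitrary row.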

Distinct count and group by in HIVE

I am very new to HIVE and have an issue with distinct count and GROUP BY.
I want to calculate the maximum temperature from the temperature_data table for those years that have at least 2 entries in the table.
I tried the query below, but it is not working:
select
SUBSTRING(full_date,7,4) as year,
MAX(temperature) as temperature
from temperature_data
where count(distinct(SUBSTRING(full_date,7,4))) >= 2
GROUP BY SUBSTRING(full_date,7,4);
I am getting an error:
FAILED: SemanticException [Error 10128]: Line 2:0 Not yet supported place for UDAF 'count'
Below is the input:
year,zip,temperature
10-01-1990,123112,10
14-02-1991,283901,11
10-03-1990,381920,15
10-01-1991,302918,22
12-02-1990,384902,9
10-01-1991,123112,11
14-02-1990,283901,12
10-03-1991,381920,16
10-01-1990,302918,23
12-02-1991,384902,10
10-01-1993,123112,11
You should use the HAVING keyword instead: it sets a condition on each group after aggregation, whereas WHERE filters individual rows before grouping.
You can also benefit from using a subquery. See below.
SELECT
year,
MAX(t1.temperature) as temperature
FROM
(select SUBSTRING(full_date,7,4) year, temperature from temperature_data) t1
GROUP BY
year
HAVING
count(t1.year) >= 2;
@R.Gold, we can simplify the above query by dropping the subquery, as below:
SELECT substring(full_date,7) as year, max(temperature)
FROM temperature_data
GROUP BY substring(full_date,7)
HAVING COUNT(substring(full_date,7)) >= 2
Also, FYI: aggregate functions cannot be used in a WHERE clause.
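
To see the HAVING filter at work, the sketch below loads the input rows from the question into SQLite through Python's sqlite3 module. SQLite stands in for Hive here, so substr replaces SUBSTRING, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperature_data (full_date TEXT, zip TEXT, temperature INTEGER)"
)
# Input rows copied from the question (dates are dd-mm-yyyy).
conn.executemany(
    "INSERT INTO temperature_data VALUES (?, ?, ?)",
    [
        ("10-01-1990", "123112", 10),
        ("14-02-1991", "283901", 11),
        ("10-03-1990", "381920", 15),
        ("10-01-1991", "302918", 22),
        ("12-02-1990", "384902", 9),
        ("10-01-1991", "123112", 11),
        ("14-02-1990", "283901", 12),
        ("10-03-1991", "381920", 16),
        ("10-01-1990", "302918", 23),
        ("12-02-1991", "384902", 10),
        ("10-01-1993", "123112", 11),
    ],
)

# WHERE cannot see aggregates; HAVING runs after grouping, so it can
# drop years with fewer than two rows (1993 has only one).
rows = conn.execute(
    """
    SELECT substr(full_date, 7, 4) AS year, MAX(temperature) AS temperature
    FROM temperature_data
    GROUP BY substr(full_date, 7, 4)
    HAVING COUNT(*) >= 2
    ORDER BY year
    """
).fetchall()

print(rows)
```

Only 1990 and 1991 survive the filter, each with its maximum temperature.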

Cannot Group by Year

Beginner SQL Question:
I'm trying to do a group by, by year and I'm getting funny results. I am using SQL Server 2008.
First, I tried
select count(applicationkey) , approveddate from ida.applications group by approveddate
To get a count of applications by date. However, I am interested in applications by year instead of by day, so I tried
select count(applicationkey) , approveddate from ida.applications group by year(approveddate)
When I do this, I get an error message: Column 'ida.applications.ApprovedDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
However, if I do this I get results
select count(applicationkey) from ida.applications group by year(approveddate)
It's just that I want to be able to see which year matches which count, which I cannot do for some reason. Does anyone know why I am having this problem?
select count(applicationkey),
year(approveddate)
from ida.applications
group by year(approveddate)
Any field in the SELECT list that is not inside an aggregate must match an expression in the GROUP BY.
You have all the correct parts there, just include the year(approveddate) in your select like so
select count(applicationkey), year(approveddate)
from ida.applications
group by year(approveddate)
In a group query, columns selected have to be aggregate functions or appear in the group-by list, because otherwise SQL wouldn't know which of the multiple values for the column in the group to use.
You can fix your query easily by using
select count(applicationkey) , year(approveddate)
from ida.applications group by year(approveddate)
-- the year displayed is from the group by list
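
The same fix can be tried out in SQLite through Python's sqlite3 module. This is only a sketch: the applications table without its ida schema, the three sample rows, and strftime('%Y', ...) in place of SQL Server's year() are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applications (applicationkey INTEGER, approveddate TEXT)")
# Hypothetical sample rows: two approvals in 2010, one in 2011.
conn.executemany(
    "INSERT INTO applications VALUES (?, ?)",
    [(1, "2010-01-15"), (2, "2010-06-30"), (3, "2011-03-02")],
)

# The selected year expression is the same expression used in GROUP BY,
# so each count is shown next to the year it belongs to.
rows = conn.execute(
    """
    SELECT COUNT(applicationkey), strftime('%Y', approveddate) AS approved_year
    FROM applications
    GROUP BY strftime('%Y', approveddate)
    ORDER BY approved_year
    """
).fetchall()

print(rows)
```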

Odd behavior doing join and

create table umd2
as select a.permno, a.date, a.realdate, exp(sum(log(1+b.ret))) - 1 as cum_return
from msex2 (obs=50 keep=permno date realdate) as a, msex2 (obs=50 keep=permno date ret) as b
where a.permno=b.permno and 0<=intck('month', b.date, a.date)<3
group by a.permno, a.date
having count(b.ret)=3;
This query calculates the momentum (the cumulative return of a stock over the past 3 months). However, it gives me duplicate rows. I thought group by would not return duplicate rows?
When I added the realdate column to my group by statement,
create table umd2
as select a.permno, a.date, a.realdate, exp(sum(log(1+b.ret))) - 1 as cum_return
from msex2 (obs=50 keep=permno date realdate) as a, msex2 (obs=50 keep=permno date ret) as b
where a.permno=b.permno and 0<=intck('month', b.date, a.date)<3
group by a.permno, a.date, a.realdate
having count(b.ret)=3;
those duplicate rows disappear. Why is this?
This is the way that SAS behaves. SAS recognizes the following query:
select a.permno, a.date, a.realdate, count(*)
from <whatever>
group by a.permno, a.date, a.realdate;
as being an aggregation query. That means that the rows are aggregated and reduced, with one result row per combination of the three columns. In particular, the non-aggregated columns in the select match (or are a subset of) the columns in the group by.
When you do this:
select a.permno, a.date, a.realdate, count(*)
from <whatever>
group by a.permno, a.date;
You are now using non-standard SQL. Most databases would generate an error. MySQL (with ONLY_FULL_GROUP_BY disabled) would accept this syntax, and assign an arbitrary value to a.realdate from the matching values. SAS does something different. SAS says: "Well, you clearly don't intend for this to be an aggregation query." So, it doesn't aggregate the rows, but it appends the aggregated values onto each row. In other databases, you would do this using window functions.
Technically, SAS calls this remerging summary data, which is described in the SAS PROC SQL documentation.
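
The difference between true aggregation and the window-function equivalent of remerging can be sketched in SQLite (3.25 or later, for window functions) through Python's sqlite3 module. This is not SAS, and the msex2 rows below are hypothetical; the point is only that the windowed query keeps every input row and attaches the group's aggregate to each one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msex2 (permno INTEGER, date TEXT, realdate TEXT)")
# Hypothetical rows: permno 1 has two realdate values for the same date.
conn.executemany(
    "INSERT INTO msex2 VALUES (?, ?, ?)",
    [
        (1, "2020-01", "2020-01-30"),
        (1, "2020-01", "2020-01-31"),
        (2, "2020-01", "2020-01-31"),
    ],
)

# True aggregation: one row per (permno, date) group.
aggregated = conn.execute(
    "SELECT permno, date, COUNT(*) FROM msex2 "
    "GROUP BY permno, date ORDER BY permno"
).fetchall()

# Window function: rows are not collapsed; each row carries its group's
# count, which resembles what SAS produces when it remerges.
remerged = conn.execute(
    """
    SELECT permno, date, realdate,
           COUNT(*) OVER (PARTITION BY permno, date) AS n
    FROM msex2
    ORDER BY permno, realdate
    """
).fetchall()

print(aggregated)
print(remerged)
```

The windowed result has three rows, two of which share the same permno and date, which is why the output looked like it contained duplicates.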

Using a timestamp function in a GROUP BY

I'm working with a large transaction data set and would like to group a count of individual customer transactions by month. I am unable to use the timestamp function in the GROUP BY and get the following error:
BAD_QUERY (expression STRFTIME_UTC_USEC([DATESTART], '%b') in GROUP BY is invalid)
Is there a simple workaround to achieve this or should I build a calendar table (which may be the simplest option)?
You have to use an alias:
SELECT STRFTIME_UTC_USEC(DATESTART, '%b') as month, COUNT(TRANSACTION)
FROM datasetId.tableId
GROUP BY month
@Charles is correct, but as an aside you can also group by column number.
SELECT STRFTIME_UTC_USEC(DATESTART, '%b') as month, COUNT(TRANSACTION) as count
FROM [datasetId.tableId]
GROUP BY 1
ORDER BY 2 DESC
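
Both forms, grouping by alias and grouping by column position, can be compared side by side in SQLite through Python's sqlite3 module. This sketch is not BigQuery: the transactions table, its rows, the transaction_id column, and strftime('%m', ...) in place of STRFTIME_UTC_USEC(..., '%b') are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (datestart TEXT, transaction_id INTEGER)")
# Hypothetical rows: two transactions in January, one in February.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("2015-01-05", 1), ("2015-01-20", 2), ("2015-02-11", 3)],
)

# Group by the alias given to the month expression.
by_alias = conn.execute(
    """
    SELECT strftime('%m', datestart) AS month, COUNT(transaction_id) AS n
    FROM transactions
    GROUP BY month
    ORDER BY n DESC
    """
).fetchall()

# Group by the position of the month expression in the SELECT list.
by_position = conn.execute(
    """
    SELECT strftime('%m', datestart) AS month, COUNT(transaction_id) AS n
    FROM transactions
    GROUP BY 1
    ORDER BY 2 DESC
    """
).fetchall()

print(by_alias)
print(by_position)
```

In this engine the two queries return the same rows. Positional grouping is dialect-specific, so it is worth confirming against the target database before relying on it.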