Issue with subquery in Oracle corrected the query? [duplicate] - sql

This question already has answers here:
ORA-00979 not a group by expression
(10 answers)
Closed 8 years ago.
SELECT COUNT(DISTINCT SEC.ERROR_GROUP_ID),
COUNT(DISTINCT SEC_DET.ERROR_GROUP_ID),
COUNT(DISTINCT MB.ERROR_GROUP_ID),
COUNT(DISTINCT OD.ERROR_GROUP_ID),
(SELECT COUNT (DISTINCT SEC_SCH.ERROR_GROUP_ID)
FROM SCHEMA.SECURITY SEC
LEFT OUTER JOIN SCHEMA.SECURITY_SCHEDULE SEC_SCH
ON SEC.MSD_SECURITY_ID =SEC_SCH.MSD_SECURITY_ID
WHERE SEC.MSD_SECURITY_ID IN
( SELECT DISTINCT main.MSD_SECURITY_ID
FROM SCHEMA2.Positions main
WHERE main.QUANTITY != 0
AND systimestamp >= main.eff_from_dt
AND main.eff_to_dt > systimestamp
AND systimestamp >= main.asrt_from_dt
AND main.asrt_to_dt > systimestamp
))
FROM SCHEMA.SECURITY SEC
JOIN SCHEMA.SECURITY_DETAIL SEC_DET
ON SEC.MSD_SECURITY_ID = SEC_DET.MSD_SECURITY_ID
LEFT OUTER JOIN SCHEMA.MUNI_BOND MB
ON SEC.MSD_SECURITY_ID=MB.MSD_SECURITY_ID
LEFT OUTER JOIN SCHEMA.OPTION_DETAIL OD
ON SEC.MSD_SECURITY_ID =OD.MSD_SECURITY_ID
WHERE SEC.MSD_SECURITY_ID IN
( SELECT DISTINCT main.MSD_SECURITY_ID
FROM SCHEMA2.Positions main
WHERE main.QUANTITY != 0
AND systimestamp >= main.eff_from_dt
AND main.eff_to_dt > systimestamp
AND systimestamp >= main.asrt_from_dt
AND main.asrt_to_dt > systimestamp
) ;
Error ORA-00936: missing expression
00936. 00000 - "missing expression"
*Cause:
*Action:
Error at Line: 365 Column: 3
The nested query syntax needs to be corrected for this to work; that is where I am stuck.

This is a partial answer; I do not fully understand some of your code, and I think it has issues.
Note: the ** markers were meant as bold formatting and are not part of the SQL.
You have to group by something. In this case:
(SELECT DISTINCT (SEC_SCH.ERROR_GROUP_ID)
FROM SCHEMA.SECURITY SEC
LEFT OUTER JOIN SCHEMA.SECURITY_SCHEDULE SEC_SCH
ON SEC.MSD_SECURITY_ID =SEC_SCH.MSD_SECURITY_ID
WHERE SEC.MSD_SECURITY_ID IN
( SELECT DISTINCT main.MSD_SECURITY_ID
FROM SCHEMA2.Positions main
WHERE main.QUANTITY != 0
AND systimestamp >= main.eff_from_dt
AND main.eff_to_dt > systimestamp
AND systimestamp >= main.asrt_from_dt
AND main.asrt_to_dt > systimestamp
)) **foo**
FROM SCHEMA.SECURITY SEC
JOIN SCHEMA.SECURITY_DETAIL SEC_DET
ON SEC.MSD_SECURITY_ID = SEC_DET.MSD_SECURITY_ID
LEFT OUTER JOIN SCHEMA.MUNI_BOND MB
ON SEC.MSD_SECURITY_ID=MB.MSD_SECURITY_ID
LEFT OUTER JOIN SCHEMA.OPTION_DETAIL OD
ON SEC.MSD_SECURITY_ID =OD.MSD_SECURITY_ID
WHERE SEC.MSD_SECURITY_ID IN
( SELECT DISTINCT main.MSD_SECURITY_ID
FROM SCHEMA2.Positions main
WHERE main.QUANTITY != 0
AND systimestamp >= main.eff_from_dt
AND main.eff_to_dt > systimestamp
AND systimestamp >= main.asrt_from_dt
AND main.asrt_to_dt > systimestamp
)
**group by foo** ;

Your overall query is an aggregation query without a group by, so it would be expected to return one row. You have a subquery in the select that can return multiple rows -- I suspect the problem is related to this structure.
I would suggest that you change the distinct to an aggregation function. But what? COUNT(DISTINCT SEC_SCH.ERROR_GROUP_ID)? MAX(SEC_SCH.ERROR_GROUP_ID)? LISTAGG(SEC_SCH.ERROR_GROUP_ID, ',') WITHIN GROUP (ORDER BY SEC_SCH.ERROR_GROUP_ID)? I don't know. It is not clear what you want for the third column.
Your entire query looks suspect. So many count(distinct) expressions often mean that you are joining along independent dimensions -- creating a cartesian product. Hard to say if that is a problem, because without sample data and desired results, your question doesn't actually say what you want to accomplish.
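To illustrate the cartesian-product concern, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The tables, columns, and rows are made-up stand-ins for the real schema, so treat it as an illustration of the effect, not as a fix for the original query.

```python
import sqlite3

# Hypothetical miniature schema: one security with 3 muni_bond rows
# and 2 option_detail rows. All names and data are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE security (id INTEGER PRIMARY KEY);
    CREATE TABLE muni_bond (security_id INTEGER, error_group_id INTEGER);
    CREATE TABLE option_detail (security_id INTEGER, error_group_id INTEGER);
    INSERT INTO security VALUES (1);
    INSERT INTO muni_bond VALUES (1, 10), (1, 11), (1, 12);
    INSERT INTO option_detail VALUES (1, 20), (1, 21);
""")

# Joining both child tables to the same parent multiplies the rows
# (3 x 2 = 6); COUNT(DISTINCT ...) hides that fan-out in the result.
joined_rows, mb_groups, od_groups = conn.execute("""
    SELECT COUNT(*),
           COUNT(DISTINCT mb.error_group_id),
           COUNT(DISTINCT od.error_group_id)
    FROM security s
    LEFT JOIN muni_bond mb ON mb.security_id = s.id
    LEFT JOIN option_detail od ON od.security_id = s.id
""").fetchone()

# Counting each dimension in its own scalar subquery avoids the fan-out.
separate = conn.execute("""
    SELECT (SELECT COUNT(DISTINCT error_group_id) FROM muni_bond),
           (SELECT COUNT(DISTINCT error_group_id) FROM option_detail)
""").fetchone()

print(joined_rows, mb_groups, od_groups, separate)
```

Both forms give the same distinct counts here, but the joined form has to build the multiplied row set first, which can become expensive on real data.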

Related

Subquery returned more than 1 value. This is not permitted when the subquery follows =

I am trying to find 12 months of AR amounts from a table, grouping by a calculated date which is in a separate table. This is the query:
SELECT ROUND(SUM(bdr_hfl),2) AS AmountDC , datum
FROM gbkmut with (NOLOCK)
WHERE reknr in (1300,1320)
AND kstdrcode BETWEEN '00' AND '10'
AND kstplcode = '00'
AND transtype IN ('N', 'C', 'P')
AND ISNULL(transsubtype, '') <> 'X'
AND datum <= (SELECT eddatum FROM perdat
WHERE bkjrcode = 2019
GROUP BY EDDATUM)
GROUP BY DATUM, BDR_HFL
I am getting the following error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.1, Line 1.
Presumably, you want some sort of aggregation. Instead of:
datum <= (SELECT eddatum FROM perdat WHERE bkjrcode = 2019 GROUP BY EDDATUM)
Perhaps you want:
datum <= (SELECT MIN(p.eddatum)
FROM perdat p
WHERE bkjrcode = 2019
)
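A small sketch of why the aggregate helps, written against Python's built-in sqlite3 module with invented rows (the real table contents are not shown in the question). Note that SQLite itself would not raise the SQL Server error; the point is only that the grouped subquery yields several rows while MIN yields exactly one. Dates are stored as ISO text, and the outer query groups by datum alone, both of which are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample data; ISO date strings compare correctly as text.
    CREATE TABLE perdat (bkjrcode INTEGER, eddatum TEXT);
    INSERT INTO perdat VALUES
        (2019, '2019-01-31'), (2019, '2019-02-28'), (2018, '2018-12-31');
    CREATE TABLE gbkmut (datum TEXT, bdr_hfl REAL);
    INSERT INTO gbkmut VALUES
        ('2019-01-15', 100.0), ('2019-02-10', 50.0), ('2018-12-20', 25.0);
""")

# The original subquery: one row per period end date in 2019.
grouped = conn.execute(
    "SELECT eddatum FROM perdat WHERE bkjrcode = 2019 GROUP BY eddatum"
).fetchall()

# The aggregated subquery: always exactly one row.
single = conn.execute(
    "SELECT MIN(eddatum) FROM perdat WHERE bkjrcode = 2019"
).fetchall()

# Using the single value as the comparison target.
totals = conn.execute("""
    SELECT datum, ROUND(SUM(bdr_hfl), 2)
    FROM gbkmut
    WHERE datum <= (SELECT MIN(p.eddatum) FROM perdat p WHERE p.bkjrcode = 2019)
    GROUP BY datum
    ORDER BY datum
""").fetchall()

print(len(grouped), single, totals)
```

Whether MIN or MAX is the right aggregate depends on which period boundary is wanted, which the question does not state.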
I think you want to use datepart, but Sample Data and desired output would help a lot.
AND datepart(year,datum) = 2019
instead of
AND datum <= (SELECT eddatum FROM perdat
WHERE bkjrcode = 2019
GROUP BY EDDATUM)
In your subquery you are getting all the dates from the second table for 2019, which is more than one row; this is the cause of the error.
As mentioned, if you are trying to get all the amounts for the past 12 months from table 1 using dates from table 2, you can try the following.
- Write a join (inner or left outer) between tab1 and tab2 on the date column, and in the WHERE clause provide the date condition using BETWEEN or the >= and <= operators.
Ex: select round(a.col1,2), a.col2 from tab1 a inner join tab2 b on a.datecol = b.datecol
where a.cond1 = xxxx and a.cond2 = yyyy ... etc.
and a.datecol >= '2018-01-01' and a.datecol <= '2018-12-31'
If you prefer not to use joins, you can try the MIN/MAX aggregates.
select col1, col2 from tab1 where datecol >= (select min(datecol) from tab2 where year(datecol) < 2019) and datecol <= (select max(datecol) from tab2 where year(datecol) < 2019)

HIVE: Error in GROUP BY Key

hive -e "select a.EMP_ID,
count(distinct c.SERIAL_NBR) as NUM_CURRENT_EMP,
count(distinct c.SERIAL_NBR)/count(distinct a.SERIAL_NBR) as DISTINCT_EMP
from ORDERS_COMBINED_EMPLOYEES as a
inner join ORDERS_EMPLOYEE_STATS as b
on a.CPP_ID = b.CPP_ID
left join ( select SERIAL_NBR, MIN(TRAN_DT) as TRAN_DT
from EMP_TXNS
group by SERIAL_NBR
) c
on c.SERIAL_NBR = a.SERIAL_NBR
where c.TRAN_DT > a.LAST_TXN_DT
group by a.EMP_ID
having (
(NUM_CURRENT_EMP >= 25 and DISTINCT_EMP > 0.01)
) ; " > EMPLOYEE_ORDERS.txt
Getting error message,
"FAILED: SemanticException [Error 10025]: Line 15:31 Expression not in GROUP BY key '0.01'".
When I ran the same query with just one condition in the HAVING clause, NUM_CURRENT_EMP >= 25, it ran fine without any issues. NUM_CURRENT_EMP is an int and DISTINCT_EMP is a float in the table where I am trying to insert the results. I am stuck.
Any help is appreciated.
What happens if you replace the aliases in the having with the expressions that define them?
having count(distinct c.SERIAL_NBR) >= 25 and
count(distinct c.SERIAL_NBR)/count(distinct a.SERIAL_NBR) > 0.01
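The same pattern, repeating the aggregate expressions in HAVING instead of referring to their aliases, can be sketched with Python's built-in sqlite3 module. This is not Hive: the table, the is_current flag, and the thresholds are hypothetical simplifications of the original joins, and the `* 1.0` is there because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented data: is_current stands in for the join against EMP_TXNS.
    CREATE TABLE emp_orders (emp_id INTEGER, serial_nbr TEXT, is_current INTEGER);
    INSERT INTO emp_orders VALUES
        (1, 'a', 1), (1, 'b', 1), (1, 'c', 0),
        (2, 'd', 0), (2, 'e', 0);
""")

# The HAVING clause spells out the aggregates rather than using the
# select-list aliases num_current and share_current.
rows = conn.execute("""
    SELECT emp_id,
           COUNT(DISTINCT CASE WHEN is_current = 1 THEN serial_nbr END) AS num_current,
           COUNT(DISTINCT CASE WHEN is_current = 1 THEN serial_nbr END) * 1.0
               / COUNT(DISTINCT serial_nbr) AS share_current
    FROM emp_orders
    GROUP BY emp_id
    HAVING COUNT(DISTINCT CASE WHEN is_current = 1 THEN serial_nbr END) >= 2
       AND COUNT(DISTINCT CASE WHEN is_current = 1 THEN serial_nbr END) * 1.0
               / COUNT(DISTINCT serial_nbr) > 0.5
""").fetchall()

print(rows)
```

Only emp_id 1 passes both conditions in this sample. Writing the expressions out is more verbose, but it does not depend on how a given engine resolves aliases inside HAVING.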

SQL subquery in the AND statement

A couple of problems.
Solved: valid_from_tsp <> max(valid_from_tsp). How can I get my query to filter on the date not being the max date? This idea doesn't work. The error returned is: "Improper use of an aggregate function in a WHERE clause"
My second issue is that when I run it without the date, I get a syntax error: Syntax error, expected something like 'IN' keyword or 'CONTAINS' keyword between ')' and ')'
What do you see that I don't? Thanks in advance
Edited Query
select
a.*,
b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a
inner join (select distinct etl_partition_id, coverage_typ_cde from
P_FAR_STG_VW.V_CLAIM_60_POLICY_STG where row_Create_tsp > '2013-11-30 23:23:59')b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59'
and a.coverage_typ_cde = ' '
and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
Trouble part: and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
I am trying to filter by the date on the service_type_id, and I am getting the error from my second issue.
As for sample data: this is kind of tricky, because the query returns many thousands of rows. Currently, when I do the inner join, I get a secondary unique index violation error. So I am trying to filter out everything but the most recent row for anything that could fall under that violation (service_type_id is the secondary index).
If I bring back three rows with the service_type_id with three different valid_from_tsp timestamps, I only want to keep the newest one, and in the query, not return the other two.
I don't know about your second question, but your first error is due to using an aggregate function max in a where clause. I'm not really sure what you want to do here, but a quick fix is to replace max(valid_from_tsp) with a subquery that only returns the maximum value.
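As a rough sketch of that quick fix, the comparison against max(valid_from_tsp) can be moved into a correlated subquery. The example below uses Python's built-in sqlite3 module with a made-up service_dim table, not the real Teradata views, and shows both the "not the max" filter and the "keep only the newest" variant described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the dimension table; timestamps are ISO text.
    CREATE TABLE service_dim (service_type_id INTEGER, valid_from_tsp TEXT);
    INSERT INTO service_dim VALUES
        (1, '2014-01-01 00:00:00'),
        (1, '2014-02-01 00:00:00'),
        (1, '2014-03-14 00:00:00'),
        (2, '2013-12-15 00:00:00');
""")

# Template: the aggregate lives in a correlated subquery, not in WHERE itself.
query = """
    SELECT d.service_type_id, d.valid_from_tsp
    FROM service_dim d
    WHERE d.valid_from_tsp {op} (SELECT MAX(d2.valid_from_tsp)
                                 FROM service_dim d2
                                 WHERE d2.service_type_id = d.service_type_id)
    ORDER BY d.service_type_id, d.valid_from_tsp
"""

newest = conn.execute(query.format(op="=")).fetchall()   # newest row per id
older = conn.execute(query.format(op="<>")).fetchall()   # everything else

print(newest)
print(older)
```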
This is your query:
select a.*, b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a inner join
(select distinct etl_partition_id, coverage_typ_cde
from P_FAR_STG_VW.V_CLAIM_60_POLICY_STG
where row_Create_tsp > '2013-11-30 23:23:59'
) b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
(select *
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);
In general, you cannot have a subquery just there in the where clause with no condition. Some databases might allow a scalar subquery in this context (one that returns one row and one column), but this isn't a scalar subquery. You can fix the syntax by using exists:
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
exists (select 1
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);

Join Oracle tables on an exact match, and a closest match

I am trying to join two tables of performance metrics, system stats and memory usage. Entries in these tables come in on differing time schedules. I need to join the tables by finding the exact match for the System_Name in both tables, and the closest for WRITETIME. Write time uses the systems own idea of time and is NOT a standard Oracle timestamp.
I can select the closest timestamp from one table with something like:
select "Unix_Memory"."WRITETIME", ABS ('1140408134015004' - "Unix_Memory"."WRITETIME")
as Diff from "Unix_Memory"
where "Unix_Memory"."WRITETIME" > '1140408104015004' order by Diff;
The constants there will be parameterised in my script.
However when I try to expand this into my larger query:
select "System"."System_Name", "System"."WRITETIME" as SysStamp,
from "System"
join "Unix_Memory" on "System"."System_Name" = "Unix_Memory"."System_Name"
and "Unix_Memory"."WRITETIME" = (
select Stamp from (
select "Unix_Memory"."WRITETIME" as Stamp,
ABS ( "System"."WRITETIME" - "Unix_Memory"."WRITETIME") as Diff
from "Unix_Memory" where "Unix_Memory"."WRITETIME" > '1140408104015004' and rownum = 1 order by Diff
)
)
WHERE "System"."System_Name" in ('this','that', 'more')
and "System"."WRITETIME" > '1140408124015004';
I get:
Error at Command Line:38 Column:72
Error report:
SQL Error: ORA-00904: "System"."WRITETIME": invalid identifier
00904. 00000 - "%s: invalid identifier"
I have tried a few variations, but I am not getting any closer.
You must state the System table in the inner Select as well.
select "System"."System_Name", "System"."WRITETIME" as SysStamp
from "System"
join "Unix_Memory" on "System"."System_Name" = "Unix_Memory"."System_Name"
and "Unix_Memory"."WRITETIME" = (
select Stamp from (
select "Unix_Memory"."WRITETIME" as Stamp,
ABS ( "System"."WRITETIME" - "Unix_Memory"."WRITETIME") as Diff
from "Unix_Memory"
-- THE NEXT TWO LINES ARE MISSING IN YOUR CODE
INNER JOIN "System" ON "System"."System_Name" = "Unix_Memory"."System_Name"
and "System"."WRITETIME" > '1140408124015004'
-- end of missing
where "Unix_Memory"."WRITETIME" > '1140408104015004' and rownum = 1 order by Diff
)
)
WHERE "System"."System_Name" in ('this','that', 'more')
and "System"."WRITETIME" > '1140408124015004';
Unfortunately, the outer column names are only visible one nesting level down. So "System"."WRITETIME" would be known in select Stamp from ..., but no longer in select "Unix_Memory"."WRITETIME" as Stamp ...
In any case, you would select a rather arbitrary stamp: the first "Unix_Memory"."WRITETIME" > '1140408104015004' found, to be precise, because rownum = 1 is applied before order by. You will have to rewrite your statement completely.
EDIT: Here is one possibility to re-write the statement using MIN/MAX KEEP:
select
s.system_name,
s.writetime as sysstamp,
min(um.id) keep (dense_rank first order by abs(s.writetime - um.writetime)) as closest_um_id
from system s
join unix_memory um on s.system_name = um.system_name
where s.system_name in ('this','that', 'more')
and s.writetime > '1140408124015004'
and um.writetime > '1140408104015004'
group by s.system_name, s.writetime
order by s.system_name, s.writetime;
If you need more than just the ID of unix_memory then surround this with another select:
select
sy.system_name,
sy.sysstamp,
mem.*
from
(
select
s.system_name,
s.writetime as sysstamp,
min(um.id) keep (dense_rank first order by abs(s.writetime - um.writetime)) as closest_um_id
from system s
join unix_memory um on s.system_name = um.system_name
where s.system_name in ('this','that', 'more')
and s.writetime > '1140408124015004'
and um.writetime > '1140408104015004'
group by s.system_name, s.writetime
) sy
join unix_memory mem on mem.id = sy.closest_um_id
order by sy.system_name, sy.sysstamp;

SQL: Using COUNT(*) Instead of EXISTS

Is it possible to use COUNT in place of EXISTS?
I have following query:
SELECT *
FROM Goals G
WHERE EXISTS (SELECT NULL FROM tfv_home_last6(G.Date, G.Home) WHERE GameNumber <= 6 AND
HomeGoals >= 3)
Instead of returning the row if at least one row exists in the subquery, I'd like to specify a number of rows that need to be returned in the subquery, something like
SELECT *
FROM Goals G
WHERE ROWCOUNT(*) >= 2 (SELECT NULL FROM tfv_home_last6(G.Date, G.Home) WHERE GameNumber <= 6 AND
HomeGoals >= 3)
I'm not sure how to go about it?
I'm using SQL Server 2012.
You can do the subquery pretty much just like you describe:
SELECT *
FROM Goals G
WHERE (SELECT count(*)
FROM tfv_home_last6(G.Date, G.Home)
WHERE GameNumber <= 6 AND HomeGoals >= 3
) > 0;
However, this requires calculating the entire count. The exists form is more efficient, because it stops at the first matching record.
In SQL Server 2012, you could also use cross apply:
SELECT *
FROM Goals G cross apply
(select count(*) as cnt
FROM tfv_home_last6(G.Date, G.Home)
WHERE GameNumber <= 6 AND HomeGoals >= 3
) a
WHERE a.cnt > 0;
I do not know which would have better performance, the correlated subquery in the where clause or the cross apply version.
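To require a minimum number of matching rows, as the question asks, the threshold in the count comparison can be raised from > 0 to >= N. Below is a minimal sketch with Python's built-in sqlite3 module. A plain last6 table stands in for the tfv_home_last6 table-valued function, and the helper name homes_with_at_least and all data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented data; last6 replaces the table-valued function.
    CREATE TABLE goals (id INTEGER PRIMARY KEY, home TEXT);
    CREATE TABLE last6 (home TEXT, game_number INTEGER, home_goals INTEGER);
    INSERT INTO goals VALUES (1, 'Arsenal'), (2, 'Leeds'), (3, 'Derby');
    INSERT INTO last6 VALUES
        ('Arsenal', 1, 3), ('Arsenal', 2, 4), ('Arsenal', 3, 0),
        ('Leeds', 1, 3), ('Leeds', 2, 1),
        ('Derby', 1, 0);
""")

def homes_with_at_least(n):
    """Return home teams with at least n recent games of 3+ home goals."""
    rows = conn.execute("""
        SELECT g.home
        FROM goals g
        WHERE (SELECT COUNT(*)
               FROM last6 l
               WHERE l.home = g.home
                 AND l.game_number <= 6
                 AND l.home_goals >= 3) >= ?
        ORDER BY g.home
    """, (n,)).fetchall()
    return [r[0] for r in rows]

print(homes_with_at_least(1))  # same rows an EXISTS test would keep
print(homes_with_at_least(2))  # stricter: at least two matching games
```

With a threshold of 1 this is equivalent to the EXISTS form, which remains the better choice when only existence matters.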