Group By & Having vs. SubQuery (Where Count is Greater Than 1)

Group By & Having vs. SubQuery (Where Count is Greater Than 1) - sql

I'm struggling here trying to write a script that finds where an order was returned multiple times by the same associate (count greater than 1). I'm guessing my syntax with the subquery is incorrect. When I run the script, I get a message back that the "SELECT failed.. [3669] More than one value was returned by the subquery."
I'm not tied to the subquery, and have tried using just the group by and having statements, but I get an error regarding a non-aggregate value. What's the best way to proceed here and how do I fix this?
Thank you in advance - code below:
SEL s.saletran
, s.saletran_dt SALE_DATE
, r.saletran_id RET_TRAN
, r.saletran_dt RET_DATE
, ra.user_id RET_ASSOC
FROM salestrans s
JOIN salestrans_refund r
ON r.orig_saletran_id = s.saletran_id
AND r.orig_saletran_dt = s.saletran_dt
AND r.orig_loc_id = s.loc_id
AND r.saletran_dt between s.saletran_dt and s.saletran_dt + 30
JOIN saletran rt
ON rt.saletran_id = r.saletran_id
AND rt.saletran_dt = r.saletran_dt
AND rt.loc_id = r.loc_id
JOIN assoc ra --Return Associate
ON ra.assoc_prty_id = rt.sls_assoc_prty_id
WHERE
(SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID
) > 1
AND s.saletran_dt between '2015-01-01' and current_date - 1

Based on what you've got so far, I think you want to use this instead:
where r.ORIG_SLTRN_ID in
(select
ORIG_SLTRN_ID
from
saletran_refund
group by ORIG_SLTRN_ID
having count (*) > 1)
That will give you the ORIG_SLTRN_IDs that have more than one row.

you don't give enough for a full answer but this is a start
group by s.saletran
, s.saletran_dt SALE_DATE
, r.saletran_id RET_TRAN
, r.saletran_dt RET_DATE
, ra.user_id RET_ASSOC
having count(distinct(ORIG_SLTRN_ID)) > 0
this does return more the an one row
run it
SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID

Related

Trying to combine 2 columns in sql query

SELECT SubscriberKey, COUNT(*) AS TotalSentLast180Days
FROM (
SELECT s.SubscriberKey
FROM ENT._Sent s
INNER JOIN ENT.AllSubscribershistroyland ROCS
ON ROCS.SubscriberKey = s.SubscriberKey
WHERE ROCS.SLSegment__c = 'S4 - real-love'
AND 'S6 - real-love'
AND s.OYBAccountID = '85208879'
AND s.EventDate >= DATEADD(DAY, -180, GETDATE())
) t
GROUP BY SubscriberKey
So in the " AllSubscribershistroyland " their are 2 columns that are called 'S4 - Dutch real-love' and 'S6 - real-love'. Im trying to run the query to see how many subscribers are in those 2 columns. I cant seem to combine them but when i run the query for example with one column i do get a result back. I tried the ' AND ' to combine the 2 columns but i get an error code of "
Error saving the Query field. An expression of non-boolean type specified in a context where a condition is expected, near 'AND'.** "
if anyone can help me i would be very grateful

I am not entirely sure what you're trying to do, but if you are just trying to include both of your titles in the where clause for the same column then you could use the IN Clause:
SELECT SubscriberKey, COUNT(*) AS TotalSentLast180Days
FROM (
SELECT s.SubscriberKey
FROM ENT._Sent s
INNER JOIN ENT.AllSubscribershistroyland ROCS
ON ROCS.SubscriberKey = s.SubscriberKey
WHERE ROCS.SLSegment__c in ('S4 - real-love', 'S6 - real-love')
AND s.OYBAccountID = '85208879'
AND s.EventDate >= DATEADD(DAY, -180, GETDATE())
) t
GROUP BY SubscriberKey

SQL CASE WHEN ELSE not working in AWS Athena

I have the script below setup in AWS Athena, the goal is to replace some budget numbers (total) with 0 if they are within a certain category (costitemid). I'm getting the following error in AWS Athena and could use some advice as to why it isn't working. Is the problem that I need to repeat everything in the FROM and GROUP BY in the WHEN and ELSE? Code below the error. Thank you!
SYNTAX_ERROR: line 6:9: 'projectbudgets.projectid' must be an aggregate expression or appear in GROUP BY clause
This query ran against the "acorn-prod-reports" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 077f007b-61a0-4f6b-aa1f-dd38bb401218
SELECT
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030) THEN (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((0 * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
ELSE (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
END
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
--WHERE (((projectbudgetlineitems.costitemid <> 478030) AND (projectbudgetlineitems.costitemid < 462561)) OR (projectbudgetlineitems.costitemid > 462566))
GROUP BY projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status

Your syntax is wrong (at least according to most SQL dialects.) You can't generally say "SELECT CASE WHEN (condition) THEN (this select clause) ELSE (that select clause) END FROM (tables)"
You can only use CASE to calculate a single value.
But it looks as if the only change between your two inner SELECT clauses is whether you use 0 or the quantity in the final multiplication. And that is perfect for a CASE!
I do not guarantee this will work right off the bat, because I don't have your setup or an idea of your table layout. However, it's a step in the right direction:
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(
((
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030)
THEN 0
ELSE projectbudgetlineitems.quantity
END * projectbudgetlineitems.unitcost
) * (
projectbudgetlineitems.costshare * 1E-2
))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN
"acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears
ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems
ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
GROUP BY
projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status

This could solve your problem if you want to sum the items for each project and year and status except for certain line items. Here, it is correct to use a "where" condition and not "case when" :
SELECT
projectbudgets.projectid,
projectbudgetyears.year,
projectbudgetyears.status,
"sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) *
(projectbudgetlineitems.costshare * 1E-2))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears
projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems
projectbudgetlineitems ON (projectbudgetyears.id =
projectbudgetlineitems.projectbudgetyearid))
WHERE projectbudgetlineitems.costitemid NOT IN
(462561,462562,462563,462564,462565,462566,478030)
GROUP BY projectbudgets.projectid, projectbudgetyears.year,
projectbudgetyears.status
;

Compare the same table and fetch the satisfied results

I am trying to achieve the below requirement and need some help.
I created the below query,
SELECT * from
(
select b.extl_acct_nmbr, b.TRAN_DATE, b.tran_time,
case when (a.amount > b.amount) then b.amount
end as amount
,b.ivst_grup, b.grup_prod, b.pensionpymt
from ##pps a
join #pps b
on a.extl_acct_nmbr = b.extl_acct_nmbr
where a.pensionpymt <=2 and b.pensionpymt <=2) rslt
where rstl.amount is not null
Output I am getting,
Requirement is to get
The lowest amount row having same account number. (Completed and getting in the output)
In case both the amounts are same for same account (get the pensionpymt =1) (not sure how to get)
In case only one pensionpymt there add that too in the result set. (not sure how to get)
could you please help, expected output should be like this,

you can use window function:
select * from (
select * , row_number() over (partition by extl_acct_nmbr order by amount asc,pensionpymt) rn
from ##pps a
join #pps b
on a.extl_acct_nmbr = b.extl_acct_nmbr
) t
where rn = 1

Use of MAX function in SQL query to filter data

The code below joins two tables and I need to extract only the latest date per account, though it holds multiple accounts and history records. I wanted to use the MAX function, but not sure how to incorporate it for this case. I am using My SQL server.
Appreciate any help !
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC

Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes than any given account would not have two records on the same EffDate. If that's the case, and there is no other objective means to determine the latest account, you could also use row_numer to pick a somewhat arbitrary record in the case of a tie.

Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate <= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.

Not tested
with temp1 as
(
select foo
from bar
whre xy = MAX(xy)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC

SQL subquery in the AND statement

A couple problems.
Solved valid_from_tsp <> max(valid_from_tsp) - how can I get my query to filter based on not being the max date? This idea doesn't work The error being returned is: "Improper use of an aggregate function in a WHERE clause"
My second issue is when I run it without the date, I am returned a syntax error: Syntax error, expected something like 'IN' keyword or 'CONTAINS' keyword between ')' and ')'
What do you see that I don't? Thanks in advance
Edited Query
select
a.*,
b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a
inner join (select distinct etl_partition_id, coverage_typ_cde from
P_FAR_STG_VW.V_CLAIM_60_POLICY_STG where row_Create_tsp > '2013-11-30 23:23:59')b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59'
and a.coverage_typ_cde = ' '
and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
Trouble part: and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
I am trying to filter by the date on the service_type_id, and I am getting the error in question 2
As for sample data: This is kinda tricky, This query returns many thousands of rows of data. Currently when I do the inner join, I get a secondary unique index violation error. So I am trying to filter out everything but the more recent which could be under that violation (service_type_id is the secondary index)
If I bring back three rows with the service_type_id with three different valid_from_tsp timestamps, I only want to keep the newest one, and in the query, not return the other two.

I don't know about your second question, but your first error is due to using an aggregate function max in a where clause. I'm not really sure what you want to do here, but a quick fix is to replace max(valid_from_tsp) with a subquery that only returns the maximum value.

This is your query:
select a.*, b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a inner join
(select distinct etl_partition_id, coverage_typ_cde
from P_FAR_STG_VW.V_CLAIM_60_POLICY_STG
where row_Create_tsp > '2013-11-30 23:23:59'
) b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
(select *
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);
In general, you cannot have a subquery just there in the where clause with no condition. Some databases might allow a scalar subquery in this context (one that returns one row and one column), but this isn't a scalar subquery. You can fix the syntax by using exists:
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
exists (select 1
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Group By & Having vs. SubQuery (Where Count is Greater Than 1) - sql

Based on what you've got so far, I think you want to use this instead: where r.ORIG_SLTRN_ID in (select ORIG_SLTRN_ID from saletran_refund group by ORIG_SLTRN_ID having count (*) > 1) That will give you the ORIG_SLTRN_IDs that have more than one row.

Related

Trying to combine 2 columns in sql query

SQL CASE WHEN ELSE not working in AWS Athena

Compare the same table and fetch the satisfied results

Use of MAX function in SQL query to filter data

SQL subquery in the AND statement

Categories

Resources