SQL subquery in the AND statement - sql

A couple of problems.
First (solved): valid_from_tsp <> max(valid_from_tsp). How can I get my query to filter on a row not being the max date? This idea doesn't work. The error returned is: "Improper use of an aggregate function in a WHERE clause"
My second issue: when I run it without the date, I get a syntax error: Syntax error, expected something like 'IN' keyword or 'CONTAINS' keyword between ')' and ')'
What do you see that I don't? Thanks in advance.
Edited Query
select
a.*,
b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a
inner join (select distinct etl_partition_id, coverage_typ_cde from
P_FAR_STG_VW.V_CLAIM_60_POLICY_STG where row_Create_tsp > '2013-11-30 23:23:59')b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59'
and a.coverage_typ_cde = ' '
and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
Trouble part: and (select * from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM where service_type_id = 136548255
and CAST(valid_from_tsp AS DATE) <> '2014-03-14')
I am trying to filter by the date on the service_type_id, and I am getting the error from my second issue.
As for sample data: this is kind of tricky. The query returns many thousands of rows. Currently, when I do the inner join, I get a secondary unique index violation error, so I am trying to filter out everything but the most recent row that could fall under that violation (service_type_id is the secondary index).
If I bring back three rows for a service_type_id with three different valid_from_tsp timestamps, I only want to keep the newest one and not return the other two.

I don't know about your second question, but your first error is due to using an aggregate function max in a where clause. I'm not really sure what you want to do here, but a quick fix is to replace max(valid_from_tsp) with a subquery that only returns the maximum value.
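As a rough illustration of that quick fix, here is a minimal sketch that runs the SQL through Python's sqlite3 module rather than the asker's database. The table name dim and its sample rows are made up for the example; only the column names come from the question. The aggregate moves out of the WHERE clause into a correlated scalar subquery, which keeps only the newest row per service_type_id.

```python
import sqlite3

# Hypothetical stand-in for the dimension table; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim (service_type_id INTEGER, valid_from_tsp TEXT);
    INSERT INTO dim VALUES
        (1, '2014-01-01 00:00:00'),
        (1, '2014-02-01 00:00:00'),
        (1, '2014-03-14 00:00:00'),
        (2, '2013-12-15 00:00:00');
""")

# MAX() is not allowed directly in WHERE, but a scalar subquery that
# computes it per service_type_id is.
newest = conn.execute("""
    SELECT a.service_type_id, a.valid_from_tsp
    FROM dim a
    WHERE a.valid_from_tsp = (SELECT MAX(b.valid_from_tsp)
                              FROM dim b
                              WHERE b.service_type_id = a.service_type_id)
    ORDER BY a.service_type_id
""").fetchall()
print(newest)
```

Flipping the = to <> would return the older rows instead, which is closer to the original valid_from_tsp <> max(valid_from_tsp) idea.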

This is your query:
select a.*, b.coverage_typ_cde as stg_ctc
from P_FAR_BI_VW.V_CLAIM_SERVICE_TYP_DIM a inner join
(select distinct etl_partition_id, coverage_typ_cde
from P_FAR_STG_VW.V_CLAIM_60_POLICY_STG
where row_Create_tsp > '2013-11-30 23:23:59'
) b
on (a.etl_partition_id = b.etl_partition_id)
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
(select *
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);
In general, you cannot have a subquery just there in the where clause with no condition. Some databases might allow a scalar subquery in this context (one that returns one row and one column), but this isn't a scalar subquery. You can fix the syntax by using exists:
where a.valid_from_tsp > '2013-11-30 23:23:59' and
a.coverage_typ_cde = ' ' and
exists (select 1
from P_FAR_SBXD.T_CLAIM_SERVICE_TYP_DIM
where service_type_id = 136548255 and
CAST(valid_from_tsp AS DATE) <> '2014-03-14'
);
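One thing worth checking with the exists version: the subquery does not refer to the outer row, so it is either true for every row or false for every row. The sketch below shows the difference between that and a correlated exists. It uses sqlite3 from Python with a made-up claims table, and SQLite's DATE() in place of CAST(... AS DATE); all names and rows are assumptions for illustration.

```python
import sqlite3

# Invented sample data; only the column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (service_type_id INTEGER, valid_from_tsp TEXT);
    INSERT INTO claims VALUES
        (10, '2014-03-14 08:00:00'),
        (10, '2014-01-05 09:30:00'),
        (20, '2014-03-14 10:00:00');
""")

# Uncorrelated: the subquery ignores the outer row, so it acts as an
# all-or-nothing switch for the whole result set.
uncorrelated = conn.execute("""
    SELECT c.service_type_id, c.valid_from_tsp
    FROM claims c
    WHERE EXISTS (SELECT 1 FROM claims
                  WHERE service_type_id = 10
                    AND DATE(valid_from_tsp) <> '2014-03-14')
    ORDER BY c.service_type_id, c.valid_from_tsp
""").fetchall()

# Correlated: the subquery is tied to the outer row's service_type_id,
# so each row is judged on its own group.
correlated = conn.execute("""
    SELECT c.service_type_id, c.valid_from_tsp
    FROM claims c
    WHERE EXISTS (SELECT 1 FROM claims x
                  WHERE x.service_type_id = c.service_type_id
                    AND DATE(x.valid_from_tsp) <> '2014-03-14')
    ORDER BY c.service_type_id, c.valid_from_tsp
""").fetchall()
print(len(uncorrelated), correlated)
```

If the intent is to filter per service_type_id, the correlated form is probably the one you want.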

Related

Trying to combine 2 columns in sql query

SELECT SubscriberKey, COUNT(*) AS TotalSentLast180Days
FROM (
SELECT s.SubscriberKey
FROM ENT._Sent s
INNER JOIN ENT.AllSubscribershistroyland ROCS
ON ROCS.SubscriberKey = s.SubscriberKey
WHERE ROCS.SLSegment__c = 'S4 - real-love'
AND 'S6 - real-love'
AND s.OYBAccountID = '85208879'
AND s.EventDate >= DATEADD(DAY, -180, GETDATE())
) t
GROUP BY SubscriberKey
So in "AllSubscribershistroyland" there are 2 columns called 'S4 - Dutch real-love' and 'S6 - real-love'. I'm trying to run the query to see how many subscribers are in those 2 columns. I can't seem to combine them, but when I run the query with just one of them I do get a result back. I tried ' AND ' to combine the 2 columns, but I get this error:
"Error saving the Query field. An expression of non-boolean type specified in a context where a condition is expected, near 'AND'."
If anyone can help me I would be very grateful.
I am not entirely sure what you're trying to do, but the error comes from AND 'S6 - real-love': a bare string literal is not a boolean condition. If you are just trying to match both of your values in the where clause for the same column, you could use the IN clause:
SELECT SubscriberKey, COUNT(*) AS TotalSentLast180Days
FROM (
SELECT s.SubscriberKey
FROM ENT._Sent s
INNER JOIN ENT.AllSubscribershistroyland ROCS
ON ROCS.SubscriberKey = s.SubscriberKey
WHERE ROCS.SLSegment__c in ('S4 - real-love', 'S6 - real-love')
AND s.OYBAccountID = '85208879'
AND s.EventDate >= DATEADD(DAY, -180, GETDATE())
) t
GROUP BY SubscriberKey
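To show that IN is just shorthand for a chain of OR comparisons on the same column, here is a small sketch using sqlite3 from Python. The subscribers and sent tables and their rows (including the 'S1 - other' segment) are invented for the example; only SubscriberKey and SLSegment__c are taken from the question.

```python
import sqlite3

# Toy tables standing in for the real data extensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscribers (SubscriberKey TEXT, SLSegment__c TEXT);
    CREATE TABLE sent (SubscriberKey TEXT);
    INSERT INTO subscribers VALUES
        ('a', 'S4 - real-love'), ('b', 'S6 - real-love'), ('c', 'S1 - other');
    INSERT INTO sent VALUES ('a'), ('a'), ('b'), ('c');
""")

query = """
    SELECT s.SubscriberKey, COUNT(*) AS TotalSent
    FROM sent s
    INNER JOIN subscribers r ON r.SubscriberKey = s.SubscriberKey
    WHERE {condition}
    GROUP BY s.SubscriberKey
    ORDER BY s.SubscriberKey
"""

# Each side of AND/OR must be a full comparison; IN packs several
# equality tests against one column into a single condition.
with_in = conn.execute(query.format(
    condition="r.SLSegment__c IN ('S4 - real-love', 'S6 - real-love')"
)).fetchall()
with_or = conn.execute(query.format(
    condition="(r.SLSegment__c = 'S4 - real-love' OR r.SLSegment__c = 'S6 - real-love')"
)).fetchall()
print(with_in)
```

Both forms return the same rows; subscriber 'c' is left out because its segment matches neither value.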

SQL CASE WHEN ELSE not working in AWS Athena

I have the script below setup in AWS Athena, the goal is to replace some budget numbers (total) with 0 if they are within a certain category (costitemid). I'm getting the following error in AWS Athena and could use some advice as to why it isn't working. Is the problem that I need to repeat everything in the FROM and GROUP BY in the WHEN and ELSE? Code below the error. Thank you!
SYNTAX_ERROR: line 6:9: 'projectbudgets.projectid' must be an aggregate expression or appear in GROUP BY clause
This query ran against the "acorn-prod-reports" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 077f007b-61a0-4f6b-aa1f-dd38bb401218
SELECT
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030) THEN (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((0 * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
ELSE (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
END
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
--WHERE (((projectbudgetlineitems.costitemid <> 478030) AND (projectbudgetlineitems.costitemid < 462561)) OR (projectbudgetlineitems.costitemid > 462566))
GROUP BY projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status
Your syntax is wrong (at least according to most SQL dialects.) You can't generally say "SELECT CASE WHEN (condition) THEN (this select clause) ELSE (that select clause) END FROM (tables)"
You can only use CASE to calculate a single value.
But it looks as if the only change between your two inner SELECT clauses is whether you use 0 or the quantity in the final multiplication. And that is perfect for a CASE!
I do not guarantee this will work right off the bat, because I don't have your setup or an idea of your table layout. However, it's a step in the right direction:
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(
((
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030)
THEN 0
ELSE projectbudgetlineitems.quantity
END * projectbudgetlineitems.unitcost
) * (
projectbudgetlineitems.costshare * 1E-2
))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN
"acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears
ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems
ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
GROUP BY
projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status
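The same CASE-inside-SUM idea on a tiny scale, as a sketch only: it uses sqlite3 from Python with a single made-up lineitems table instead of the three joined Athena tables, and a shortened list of cost item ids.

```python
import sqlite3

# One flat table with invented rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lineitems (projectid INTEGER, costitemid INTEGER,
                            quantity REAL, unitcost REAL, costshare REAL);
    INSERT INTO lineitems VALUES
        (1, 100,    2, 50.0, 100),
        (1, 462561, 5, 10.0, 100),
        (2, 478030, 3, 20.0,  50);
""")

# CASE picks one value per row (0 or the quantity); SUM then
# aggregates the per-row products for each project.
totals = conn.execute("""
    SELECT projectid,
           SUM(CASE WHEN costitemid IN (462561, 478030) THEN 0 ELSE quantity END
               * unitcost * (costshare * 1E-2)) AS total
    FROM lineitems
    GROUP BY projectid
    ORDER BY projectid
""").fetchall()
print(totals)
```

Project 2 has only a zeroed-out line item, so it still appears in the result with a total of 0.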
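This could solve your problem if you want to sum the items for each project, year and status while leaving out certain line items. In that case a "where" condition is the right tool rather than "case when". Note that, unlike the CASE version, the excluded rows are dropped before grouping, so a project/year/status that has only excluded cost items will not appear in the result at all: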
SELECT
projectbudgets.projectid,
projectbudgetyears.year,
projectbudgetyears.status,
"sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) *
(projectbudgetlineitems.costshare * 1E-2))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears
projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems
projectbudgetlineitems ON (projectbudgetyears.id =
projectbudgetlineitems.projectbudgetyearid))
WHERE projectbudgetlineitems.costitemid NOT IN
(462561,462562,462563,462564,462565,462566,478030)
GROUP BY projectbudgets.projectid, projectbudgetyears.year,
projectbudgetyears.status
;
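A quick way to see what the where filter does to the groups, again as a sketch with sqlite3 from Python and an invented single lineitems table (shortened id list, made-up rows):

```python
import sqlite3

# Invented rows: project 2 has only an excluded cost item.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lineitems (projectid INTEGER, costitemid INTEGER,
                            quantity REAL, unitcost REAL, costshare REAL);
    INSERT INTO lineitems VALUES
        (1, 100,    2, 50.0, 100),
        (1, 462561, 5, 10.0, 100),
        (2, 478030, 3, 20.0,  50);
""")

# WHERE removes the excluded rows before GROUP BY runs.
filtered = conn.execute("""
    SELECT projectid, SUM(quantity * unitcost * (costshare * 1E-2)) AS total
    FROM lineitems
    WHERE costitemid NOT IN (462561, 478030)
    GROUP BY projectid
    ORDER BY projectid
""").fetchall()
print(filtered)
```

Project 1 keeps its total of 100, while project 2 is missing entirely. Whether that is acceptable depends on whether you need a row for every project.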

ERROR: Subquery evaluated to more than one row. in SAS

I wrote the following code in SAS in order to select the record with egrefid not equal to 3 grouped by subjid and cpevent, but was told "ERROR: Subquery evaluated to more than one row."
case when (select count(egrefid) from INFMM.EDAT_EG004
group by subjid, cpevent
having count(egrefid) ne 3)
and cpevent in ('DAY1', 'DAY29', 'DAY85') then 'triplicate'
else ' ' end as flag
I think the problem is in the count() function, but don't know how to fix it.
Does anybody know how to solve this problem?
The case when is evaluated for each row, so a subquery inside it has to return a single value. Yours returns one count per subjid and cpevent group in your INFMM.EDAT_EG004 table, which is more than one row.
I think a join would be your best bet in this instance:
create table egrefid_counts as
select subjid, cpevent, count(egrefid) as egrefid_count
from INFMM.EDAT_EG004
where cpevent in ('DAY1', 'DAY29', 'DAY85')
group by subjid, cpevent
;
Then you join that to your table on subjid and cpevent:
select a.*, case when b.egrefid_count = 3 then 'triplicate'
else ' ' end as flag
from <whatever your table is> as a
left join
egrefid_counts as b
on a.subjid=b.subjid and a.cpevent = b.cpevent
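Here is the count-then-join pattern in runnable form. This is a sketch in SQLite via Python's sqlite3, not SAS PROC SQL, and the eg table with its rows is invented; the counts are computed in an inline subquery instead of a separate table.

```python
import sqlite3

# Made-up ECG rows: S1/DAY1 has three readings, S1/DAY29 has two,
# and S2/SCREEN is outside the visit list.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eg (subjid TEXT, cpevent TEXT, egrefid INTEGER);
    INSERT INTO eg VALUES
        ('S1', 'DAY1', 1), ('S1', 'DAY1', 2), ('S1', 'DAY1', 3),
        ('S1', 'DAY29', 1), ('S1', 'DAY29', 2),
        ('S2', 'SCREEN', 1);
""")

# The grouped counts are joined back so every detail row can see the
# count for its own (subjid, cpevent) group.
rows = conn.execute("""
    SELECT a.subjid, a.cpevent, a.egrefid,
           CASE WHEN b.egrefid_count = 3 THEN 'triplicate' ELSE ' ' END AS flag
    FROM eg a
    LEFT JOIN (SELECT subjid, cpevent, COUNT(egrefid) AS egrefid_count
               FROM eg
               WHERE cpevent IN ('DAY1', 'DAY29', 'DAY85')
               GROUP BY subjid, cpevent) b
      ON a.subjid = b.subjid AND a.cpevent = b.cpevent
    ORDER BY a.subjid, a.cpevent, a.egrefid
""").fetchall()
flags = [r[3] for r in rows]
print(flags)
```

Rows with no match in the counts (here S2/SCREEN) get a NULL count from the left join and fall through to the else branch.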

SQL: Filter records based on record creation date and other criteria

I am struggling to find a better solution to pick unique records from my user call data table.
My table structure is as follows:
SELECT [MarketName],
[WebsiteName] ,
[ID] ,
[UserID],
[CreationDate],
[CallDuration],
[FromPhone] ,
[ToPhone],
[IsAnswered],
[Source]
FROM [dbo].[UserCallData]
There are multiple entries in this table, some with the same ID and some with different IDs. I want to check whether a [FromPhone] and [ToPhone] pair occurs multiple times within the last 3 months. If it does, I want to pick the first record (by [CreationDate]) with all of its columns, count the number of occurrences as TotalCallCount, and sum the call durations as TotalCallDuration, all in a single record. If the pair does not occur multiple times, I want to pick all columns as they are. I have been able to put together the partial query below, but it can't return all columns without including them in the group by clause, and it doesn't satisfy all of my criteria. Any help would be highly appreciated.
select [FromPhone],
MIN([CreationDate]),
[ToPhone],
marketname,
count(*) as TotalCallCount ,
sum(CallDuration) as TotalCallDuration
from [dbo].[UserCallData]
where [CreationDate] >= DATEADD(MONTH, -3, GETDATE())
group by [FromPhone],[ToPhone], marketname
having count([FromPhone]) > 1 and count([ToPhone]) >1
Try using ROW_NUMBER():
;with cte as
(
select *, ROW_NUMBER() OVER(PARTITION BY FromPhone, ToPhone ORDER BY CreationDate) as RN
from UserCallData
where CreationDate >= DATEADD(MONTH, -3, GETDATE())
),
cte_totals as
(
select C1.FromPhone, C1.ToPhone, COUNT(*) as TotalCallCount, SUM(CallDuration) as TotalCallDuration
from cte C1
where exists(select * from cte C2 where C1.FromPhone = C2.FromPhone and C1.ToPhone = C2.ToPhone and C2.RN > 1)
group by C1.FromPhone, C1.ToPhone
)
select C1.*, TotalCallCount, TotalCallDuration
from cte C1
inner join cte_totals C2 on C1.FromPhone = C2.FromPhone and C1.ToPhone = C2.ToPhone
where C1.RN = 1
I wrote the query right here, so it may contain mistakes or typos, but the main idea should be clear. Note that the inner join to cte_totals returns only the pairs that occur more than once.
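A variant of the same idea, sketched with sqlite3 from Python (window functions need SQLite 3.25 or newer). It is not the query above: it computes the totals with COUNT(*) OVER and SUM(...) OVER in the same pass as ROW_NUMBER(), which also keeps pairs that occur only once. The calls table, its rows and dates are invented, and the 3-month filter is left out so the result does not depend on today's date.

```python
import sqlite3

# Invented call rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calls (ID INTEGER, FromPhone TEXT, ToPhone TEXT,
                        CreationDate TEXT, CallDuration INTEGER);
    INSERT INTO calls VALUES
        (1, '111', '999', '2024-01-03', 30),
        (2, '111', '999', '2024-01-01', 60),
        (3, '111', '999', '2024-01-02', 10),
        (4, '222', '999', '2024-01-05', 45);
""")

# RN = 1 marks the earliest call per pair; the two OVER aggregates
# attach the pair-level totals to every row of that pair.
rows = conn.execute("""
    WITH ranked AS (
        SELECT ID, FromPhone, ToPhone, CreationDate,
               ROW_NUMBER() OVER (PARTITION BY FromPhone, ToPhone
                                  ORDER BY CreationDate) AS RN,
               COUNT(*) OVER (PARTITION BY FromPhone, ToPhone) AS TotalCallCount,
               SUM(CallDuration) OVER (PARTITION BY FromPhone, ToPhone) AS TotalCallDuration
        FROM calls
    )
    SELECT ID, FromPhone, ToPhone, CreationDate, TotalCallCount, TotalCallDuration
    FROM ranked
    WHERE RN = 1
    ORDER BY FromPhone
""").fetchall()
print(rows)
```

The pair 111/999 collapses to its first call with a count of 3 and a duration of 100, and the single call from 222 comes back unchanged with a count of 1.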
I'm not entirely sure I've understood the question, but if I have, the following may be what you want (or be a useful starting point):
SELECT
ucd.FromPhone,
min(ucd.CreationDate) as MinCreationDate,
ucd.ToPhone,
ucd.MarketName,
count(*) as TotalCallCount,
sum(ucd.CallDuration) as TotalCallDuration,
case
when min(ucd.WebsiteName) = max(ucd.WebsiteName) then min(ucd.WebsiteName)
else '* Various'
end as WebsiteName,
case
when min(ucd.ID) = max(ucd.ID) then min(ucd.ID)
else '* Various'
end as ID,
case
when min(ucd.UserID) = max(ucd.UserID) then min(ucd.UserID)
else '* Various'
end as UserID,
case
when min(ucd.IsAnswered) = max(ucd.IsAnswered) then min(ucd.IsAnswered)
else '* Some'
end as IsAnswered,
case
when min(ucd.Source) = max(ucd.Source) then min(ucd.Source)
else '* Various'
end as Source
FROM
dbo.UserCallData ucd
WHERE
ucd.CreationDate >= DATEADD(MONTH, -3, GETDATE())
GROUP BY
ucd.FromPhone,
ucd.ToPhone,
ucd.MarketName
Where we are collapsing rows together, if all the rows agree on a given column (so min(Field) = max(Field)), I return the min(Field) value (which is the same as all the others, but avoids needing additional "group by" columns, which would interfere with the other cases). Where they don't all agree, I've returned "* something".
The code assumes that all the columns are text type columns (you haven't said); if they aren't, you may get conversion errors. It also assumes that none of these fields are null. You / we can adapt the code if those assumptions aren't correct. If you aren't able to do that yourself, let me know about any issues and I'll be happy to do what I can.
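A small sketch of the min = max test, including one way to handle a non-text column by casting it to text before mixing it with the '* Various' label. It runs in SQLite via Python's sqlite3, which is lenient about mixed types; on SQL Server the cast target would more likely be a VARCHAR type. The calls table and its rows are invented.

```python
import sqlite3

# Invented rows: the 111/999 pair agrees on Source but not on UserID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calls (FromPhone TEXT, ToPhone TEXT, Source TEXT, UserID INTEGER);
    INSERT INTO calls VALUES
        ('111', '999', 'web', 7),
        ('111', '999', 'web', 8),
        ('222', '999', 'app', 9);
""")

# If MIN and MAX agree, every row in the group holds the same value,
# so returning MIN is safe; otherwise a placeholder is returned.
rows = conn.execute("""
    SELECT FromPhone,
           CASE WHEN MIN(Source) = MAX(Source) THEN MIN(Source)
                ELSE '* Various' END AS Source,
           CASE WHEN MIN(UserID) = MAX(UserID) THEN CAST(MIN(UserID) AS TEXT)
                ELSE '* Various' END AS UserID
    FROM calls
    GROUP BY FromPhone, ToPhone
    ORDER BY FromPhone
""").fetchall()
print(rows)
```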

Group By & Having vs. SubQuery (Where Count is Greater Than 1)

I'm struggling here trying to write a script that finds where an order was returned multiple times by the same associate (count greater than 1). I'm guessing my syntax with the subquery is incorrect. When I run the script, I get a message back that the "SELECT failed.. [3669] More than one value was returned by the subquery."
I'm not tied to the subquery, and have tried using just the group by and having statements, but I get an error regarding a non-aggregate value. What's the best way to proceed here and how do I fix this?
Thank you in advance - code below:
SEL s.saletran
, s.saletran_dt SALE_DATE
, r.saletran_id RET_TRAN
, r.saletran_dt RET_DATE
, ra.user_id RET_ASSOC
FROM salestrans s
JOIN salestrans_refund r
ON r.orig_saletran_id = s.saletran_id
AND r.orig_saletran_dt = s.saletran_dt
AND r.orig_loc_id = s.loc_id
AND r.saletran_dt between s.saletran_dt and s.saletran_dt + 30
JOIN saletran rt
ON rt.saletran_id = r.saletran_id
AND rt.saletran_dt = r.saletran_dt
AND rt.loc_id = r.loc_id
JOIN assoc ra --Return Associate
ON ra.assoc_prty_id = rt.sls_assoc_prty_id
WHERE
(SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID
) > 1
AND s.saletran_dt between '2015-01-01' and current_date - 1
Based on what you've got so far, I think you want to use this instead:
where r.ORIG_SLTRN_ID in
(select
ORIG_SLTRN_ID
from
saletran_refund
group by ORIG_SLTRN_ID
having count (*) > 1)
That will give you the ORIG_SLTRN_IDs that have more than one row.
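To make the difference concrete, here is a sketch using sqlite3 from Python with a made-up refunds table (the real tables and Teradata syntax are not reproduced). The first query shows why the original comparison fails: a grouped count returns one row per group, so it cannot be compared to a single value. The second uses the IN plus HAVING form.

```python
import sqlite3

# Invented refund rows: originals 1 and 3 were returned more than once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE refunds (saletran_id INTEGER, orig_sltrn_id INTEGER);
    INSERT INTO refunds VALUES
        (501, 1), (502, 1), (503, 2), (504, 3), (505, 3), (506, 3);
""")

# One count per original transaction: several rows, not a scalar.
grouped_counts = conn.execute("""
    SELECT COUNT(*) FROM refunds
    GROUP BY orig_sltrn_id
    ORDER BY orig_sltrn_id
""").fetchall()

# HAVING filters the groups; IN then matches each refund row against
# the list of ids that survived.
repeat_refunds = conn.execute("""
    SELECT saletran_id, orig_sltrn_id
    FROM refunds
    WHERE orig_sltrn_id IN (SELECT orig_sltrn_id
                            FROM refunds
                            GROUP BY orig_sltrn_id
                            HAVING COUNT(*) > 1)
    ORDER BY saletran_id
""").fetchall()
print(grouped_counts, repeat_refunds)
```

To restrict this to returns handled by the same associate, the associate column would need to be part of the GROUP BY in the subquery and of the match; that column is not shown on saletran_refund in the question, so it is left out here.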
You don't give enough for a full answer, but this is a start (column aliases are not allowed in group by, so they are dropped here):
group by s.saletran
, s.saletran_dt
, r.saletran_id
, r.saletran_dt
, ra.user_id
having count(distinct(ORIG_SLTRN_ID)) > 0
Your subquery does return more than one row. Run it on its own to see:
SELECT count(*)
FROM saletran_refund
GROUP BY ORIG_SLTRN_ID