Redshift SQL Case Statement and WHERE Clause not working - sql

I'm running the following SQL script to create a table with information on which marketing efforts are driving people to my site for form fills.
The case statement for marketing source stamp does not work. My results still have values for search-brand-whatever for the marketing_source column when those values should now just read "google-adwords".
I'm also getting results with a createddate of 9/15/2015 and after when my filter clearly states createddate >= '1/1/2016'.
Any ideas?
create table temp.roi_inqs as (
Select
a.ID,
cast(a.createddate as date) as date,
CASE
WHEN a.marketing_source_stamp__c = 'search%' then 'google-adwords'
ELSE a.marketing_source_stamp__c
END AS marketing_source,
CASE
when a.contactid is null then b.email
when a.contactid is not null then c.email
end as prospect_email
from rjm_current.sf_campaignmember a
left join rjm_current.sf_lead b on b.id = a.leadid
left join rjm_current.sf_contact c on c.id = a.contactid
where cast(a.createddate as date) >= '1/1/2016'
AND (
campaign_marketing_type__c in ('A','C'))
OR
(campaign_marketing_type__c = 'B'
AND a.status in ('Registered','Attended','No Show')));

Your entire predicate is of the form
A AND B OR C
If you don't put any parentheses, this means you're querying
(A AND B) OR C
When you really meant to write
A AND (B OR C)
As a general rule: Always put parentheses if you're mixing AND and OR
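The precedence rule above can be checked on a tiny data set. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for Redshift (AND binds tighter than OR in both); the table `members` and its columns are hypothetical names, and dates are stored as ISO text so they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for sf_campaignmember.
conn.execute("CREATE TABLE members (id INTEGER, created TEXT, mtype TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [
        (1, "2016-02-01", "A", "Sent"),        # new row, type A
        (2, "2015-09-15", "B", "Registered"),  # old row, type B
        (3, "2016-03-01", "B", "Registered"),  # new row, type B
        (4, "2015-10-01", "A", "Sent"),        # old row, type A
    ],
)

# No grouping parentheses: parsed as (date AND type A/C) OR (type B AND status).
unparenthesized = [r[0] for r in conn.execute("""
    SELECT id FROM members
    WHERE created >= '2016-01-01'
      AND mtype IN ('A', 'C')
       OR mtype = 'B' AND status IN ('Registered', 'Attended', 'No Show')
    ORDER BY id
""")]

# Explicit parentheses: the date filter now applies to both branches.
parenthesized = [r[0] for r in conn.execute("""
    SELECT id FROM members
    WHERE created >= '2016-01-01'
      AND (mtype IN ('A', 'C')
           OR (mtype = 'B' AND status IN ('Registered', 'Attended', 'No Show')))
    ORDER BY id
""")]

print(unparenthesized)  # the 2015 type-B row (id 2) slips through
print(parenthesized)    # only rows from 2016 onward remain
```

Row 2 is the analogue of the 9/15/2015 records in the question: it fails the date test but satisfies the second OR branch.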

@Lukas Eder answered my question about the date issue.
For the CASE statement, I had to change marketing_source_stamp__c = 'search%' to marketing_source_stamp__c LIKE 'search%'. With =, the % is compared as a literal character; only LIKE treats it as a wildcard. That issue is now solved.
CASE
WHEN a.marketing_source_stamp__c LIKE 'search%' then 'google-adwords'
ELSE a.marketing_source_stamp__c
END AS marketing_source,
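To see the difference between = and LIKE side by side, here is a small sketch that assumes SQLite in place of Redshift; the table `stamps` and its sample values are made up for illustration. Note that SQLite's LIKE is case-insensitive for ASCII by default, which may differ from other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stamps (source TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO stamps VALUES (?)",
    [("search-brand",), ("search-generic",), ("email",), ("search%",)],
)

# '=' compares against the literal seven-character string 'search%'.
equals_result = [r[0] for r in conn.execute("""
    SELECT CASE WHEN source = 'search%' THEN 'google-adwords' ELSE source END
    FROM stamps ORDER BY rowid
""")]

# LIKE treats '%' as "any sequence of characters".
like_result = [r[0] for r in conn.execute("""
    SELECT CASE WHEN source LIKE 'search%' THEN 'google-adwords' ELSE source END
    FROM stamps ORDER BY rowid
""")]

print(equals_result)
print(like_result)
```

With =, only the row whose value is literally `search%` is relabelled; with LIKE, every value starting with `search` is.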

Related

select only one value in 1:N relation

I want the query to return only one value, either the first or the last. I have tried many things but couldn't resolve it. The query is too long, so I have picked only the part where I am stuck.
select eccev.extra_data , c.id,
case when (eccev.extra_data::json->'tns')::VARCHAR = 'false'
then 'NON'
else case when coalesce((eccev.extra_data::json->'tns')::VARCHAR, '') = '' then 'EMPTY VALUE' else 'OUI'
end end as tns
from endorsement_contract_covered_element_version eccev, endorsement_contract_covered_element ecce, endorsement_contract ec, contract c, endorsement e, party_party pp
WHERE ec.endorsement = e.id
and e.applicant = pp.id
and c.subscriber = pp.id
AND eccev.covered_element_endorsement = ecce.id
and ecce.contract_endorsement = ec.id
and c.contract_number = 'CT20200909112'
With this query I get the result:
{"qualite":"non_etu","tns":false} 199479 NON
{"qualite":"non_etu","tns":false} 199479 NON
{"qualite":"non_etu","tns":false} 199479 NON
I want to keep only the first or the last row so that the result has no repeated rows. I saw that we can use first_value(X over (XX)), but I couldn't make it work.
If you can help me, I would be grateful.
Thanks
You can try this:
select distinct on (eccev.extra_data , c.id) eccev.extra_data , c.id, case when ...
But your query does not look optimized, as it combines 6 tables with comma joins that are filtered only in the WHERE clause.

Substitute join to leave only one 'Table Scan'

I have financial data and want to calculate Shareholder's Equity. This is basically what it looks like:
I have the following query which works:
SELECT a.Ticker, a.Value - l.Value as 'ShareholdersEquity'
FROM FinData a
JOIN FinData l
ON a.Ticker = l.Ticker AND a.Date = l.Date
WHERE a.Type = 'assets'
AND l.Type = 'liabilities'
But for a table with many records this will be slow: when I check the query with Explain (I use Azure Data Studio), it shows 2 table scans, which means more time. How can I rewrite it to be faster?
You could try conditional aggregation rather than a self-join:
select ticker, date,
sum(case when type = 'assets' then value else - value end) as ShareholdersEquity
from findata
where type in ('assets', 'liabilities')
group by ticker, date
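As a quick check that conditional aggregation returns the same numbers as the self-join, here is a sketch run against SQLite rather than SQL Server. The sample rows are invented, and the type labels 'assets' and 'liabilities' are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FinData (Ticker TEXT, Date TEXT, Type TEXT, Value REAL)")
conn.executemany(
    "INSERT INTO FinData VALUES (?, ?, ?, ?)",
    [
        ("AAA", "2020-12-31", "assets", 100.0),
        ("AAA", "2020-12-31", "liabilities", 60.0),
        ("AAA", "2020-12-31", "revenue", 999.0),  # ignored by both queries
        ("BBB", "2020-12-31", "assets", 50.0),
        ("BBB", "2020-12-31", "liabilities", 20.0),
    ],
)

# Original approach: the table is read twice and joined to itself.
self_join = conn.execute("""
    SELECT a.Ticker, a.Date, a.Value - l.Value
    FROM FinData a
    JOIN FinData l ON a.Ticker = l.Ticker AND a.Date = l.Date
    WHERE a.Type = 'assets' AND l.Type = 'liabilities'
    ORDER BY a.Ticker
""").fetchall()

# Conditional aggregation: one pass, liabilities counted as negative.
cond_agg = conn.execute("""
    SELECT Ticker, Date,
           SUM(CASE WHEN Type = 'assets' THEN Value ELSE -Value END)
    FROM FinData
    WHERE Type IN ('assets', 'liabilities')
    GROUP BY Ticker, Date
    ORDER BY Ticker
""").fetchall()

print(self_join)
print(cond_agg)
```

The two forms are not equivalent in every case: if a ticker/date pair has only an assets row or only a liabilities row, the self-join drops it, while the aggregation returns a partial sum for it.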

Generating missing report submissions via reference table

I have a SQL query I'm currently working on which I would greatly appreciate some help with.
Here is a simplified version of the view I've been given to work on:
SELECT a.Organisation_Name
,a.Org_Id
,b.Activity_month
,SUM(b.Activity_Plan) 'Plan_Activity'
,SUM(b.Activity_Actual) 'Actual_Activity'
,SUM(b.Price_Actual) 'Actual_Price'
,SUM(b.Price_Plan) 'Plan_Price'
,COUNT(b.Instances) AS 'Record_Count'
,CASE WHEN COUNT(b.Instances) > 0 THEN 'Yes' ELSE 'No' END AS Submitted
FROM [ExampleDatabase].[dbo].[Organisation_Reference] a
LEFT JOIN [ExampleDatabase].[dbo].[Report_Submissions] b
ON a.Org_Id = b.Org_Id
AND ([Example_Code] LIKE ('X') or [Example_Code] = 'X')
WHERE a.Category_Flag = 1
AND a.Example_Code in ('X','X','X','X','X')
GROUP BY
a.Organisation_Name
,a.Org_Id
,b.Activity_month
--
The Activity Month field is an integer rather than a date, currently ranging from 1-8.
The problem I am facing is that the [Report_Submissions] table only contains organisations which have actually submitted the reports, whereas the [Organisation_Reference] table lists all the organisations which should be submitting.
Where the organisations have submitted the reports, the data is perfect and gives me a rundown of all the details I need for each individual month.
Obviously, if an organisation hasn't submitted then this detail won't be available, but I do need a complete list of all organisations from the reference table for each individual month, showing whether they have submitted the reports or not.
At the moment, where the 'Submitted' field = 'No', the query only brings back one record for each organisation that has never submitted (with Activity_month coming through as null). If an organisation has only submitted once or twice, the result includes those submissions but is still missing the rest of the months.
I've tried various different joins etc. but I seem to be drawing a blank for a solution. Is there a way of generating this information within the script? Any advice would be great!
Kind Regards,
Mark
Since you just need numbers 1-8, using a subquery in your join to cross apply(values ()) to your Organisation_Reference table works well and does not make the query much more complicated to read.
select
a.Organisation_Name
, a.Org_Id
, a.Activity_Month
, sum(b.Activity_Plan) 'Plan_Activity'
, sum(b.Activity_Actual) 'Actual_Activity'
, sum(b.Price_Actual) 'Actual_Price'
, sum(b.Price_Plan) 'Plan_Price'
, count(b.Instances) as 'record_count'
, case when count(b.Instances) > 0 then 'yes' else 'no' end as Submitted
from (
select o.*, t.Activity_Month
from [ExampleDatabase].[dbo].[Organisation_Reference] as o
cross apply (values (1),(2),(3),(4),(5),(6),(7),(8)) t(Activity_Month)
) as a
left join [ExampleDatabase].[dbo].[Report_Submissions] b
on a.Org_Id = b.Org_Id
and a.Activity_Month = b.Activity_Month
and ([Example_Code] like ('X') or [Example_Code] = 'X')
where a.Category_Flag = 1
and a.Example_Code in ('X','X','X','X','X')
group by
a.Organisation_Name
, a.Org_Id
, a.Activity_Month
You could also cross join with a numbers/tally table, or use a common table expression to generate the range of numbers you need. I would recommend either of those options as well, especially if your logic was more complicated.
If Report_Submissions contains all of the months you want in your query, you could cross join the distinct Activity_Months from that table to your Organisation_Reference table.
select
a.Organisation_Name
, a.Org_Id
, a.Activity_Month
, sum(b.Activity_Plan) 'Plan_Activity'
, sum(b.Activity_Actual) 'Actual_Activity'
, sum(b.Price_Actual) 'Actual_Price'
, sum(b.Price_Plan) 'Plan_Price'
, count(b.Instances) as 'record_count'
, case when count(b.Instances) > 0 then 'yes' else 'no' end as Submitted
from (
select o.*, t.Activity_Month
from [ExampleDatabase].[dbo].[Organisation_Reference] as o
cross join (select distinct Activity_Month from Report_Submissions) t
) as a
left join [ExampleDatabase].[dbo].[Report_Submissions] b
on a.Org_Id = b.Org_Id
and a.Activity_Month = b.Activity_Month
and ([Example_Code] like ('X') or [Example_Code] = 'X')
where a.Category_Flag = 1
and a.Example_Code in ('X','X','X','X','X')
group by
a.Organisation_Name
, a.Org_Id
, a.Activity_Month
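The "generate every organisation/month pair first, then left join the submissions" idea can be tried out in a few lines. This is only a sketch under several assumptions: it uses SQLite instead of SQL Server (so a VALUES-based CTE replaces cross apply), it covers months 1-3 rather than 1-8, and the sample organisations and column subset are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Organisation_Reference (Org_Id INTEGER, Organisation_Name TEXT);
    CREATE TABLE Report_Submissions (Org_Id INTEGER, Activity_Month INTEGER,
                                     Activity_Actual INTEGER);
    INSERT INTO Organisation_Reference VALUES (1, 'Alpha'), (2, 'Beta');
    -- Alpha submitted for months 1 and 2 (twice in month 2); Beta never submitted.
    INSERT INTO Report_Submissions VALUES (1, 1, 10), (1, 2, 20), (1, 2, 5);
""")

rows = conn.execute("""
    WITH months(Activity_Month) AS (VALUES (1), (2), (3))
    SELECT o.Organisation_Name,
           m.Activity_Month,
           COUNT(s.Org_Id) AS Record_Count,
           CASE WHEN COUNT(s.Org_Id) > 0 THEN 'Yes' ELSE 'No' END AS Submitted
    FROM Organisation_Reference o
    CROSS JOIN months m                      -- every organisation x every month
    LEFT JOIN Report_Submissions s
           ON s.Org_Id = o.Org_Id
          AND s.Activity_Month = m.Activity_Month
    GROUP BY o.Organisation_Name, m.Activity_Month
    ORDER BY o.Organisation_Name, m.Activity_Month
""").fetchall()

for row in rows:
    print(row)
```

Because the month comes from the generated list and not from the submissions table, organisations with no submission still get one row per month, with a count of 0 and Submitted = 'No'.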

Like SQL Statement SQL Server 2008R2

I have a simple query, but I have multiple records I need to filter out. I'm using the LIKE statement with wildcards. Is there a better way to do this than writing out each one? Can I create a UDF or a table that it references? If so, how? Thanks :)
SELECT a.SalesOrderNo ,
a.ShipExpireDate ,
a.CustomerNo ,
b.ItemCode ,
b.LineKey ,
b.QuantityOrdered ,
b.QuantityShipped ,
b.ItemCodeDesc ,
b.ExplodedKitItem
FROM dbo.SO_SalesOrderHeader a
LEFT JOIN dbo.SO_SalesOrderDetail b
ON a.SalesOrderNo = b.SalesOrderNo
WHERE b.ItemType = '1'
AND b.ItemCodeDesc NOT LIKE '%Cert%'
AND b.ItemCodeDesc NOT LIKE '%Fee%'
AND b.ItemCodeDesc NOT LIKE '%Tag%'
AND b.ItemCode NOT LIKE 'GF%'
AND b.ItemCode NOT LIKE 'PXDIALPREP'
AND b.ItemCode NOT LIKE '/C%'
AND a.ShipExpireDate = CONVERT(DATE, GETDATE(), 101)
Here's a different design that lets you put the ItemCodeDesc patterns in a separate table (this could also be a TVF). I can't comment on performance though.
On a different note, be aware that because you are outer joining to sales order detail, its columns can be NULL for headers with no detail rows. In turn, your b.ItemType = '1' will never be true when ItemType is NULL (the comparison evaluates to UNKNOWN, and the WHERE clause drops the row). So you may as well make it an inner join (and you might find your query plan is doing that anyway).
SELECT a.SalesOrderNo ,
a.ShipExpireDate ,
a.CustomerNo ,
b.ItemCode ,
b.LineKey ,
b.QuantityOrdered ,
b.QuantityShipped ,
b.ItemCodeDesc ,
b.ExplodedKitItem
FROM dbo.SO_SalesOrderHeader a
LEFT JOIN dbo.SO_SalesOrderDetail b
ON a.SalesOrderNo = b.SalesOrderNo
WHERE b.ItemType = '1'
AND b.ItemCode NOT LIKE 'GF%'
AND b.ItemCode NOT LIKE 'PXDIALPREP'
AND b.ItemCode NOT LIKE '/C%'
AND a.ShipExpireDate = CONVERT(DATE, GETDATE(), 101)
AND NOT EXISTS (
SELECT 1 FROM dbo.MappingTable MT
WHERE b.ItemCodeDesc LIKE MT.ItemCodeDesc
)
Note: I am guessing that your criteria are meant to filter out item types that can't be shipped (like Fees); adjust as per your requirements.
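A small sketch of the pattern-table idea, assuming SQLite instead of SQL Server 2008 R2. `MappingTable` is the hypothetical table from the query above, `Detail` is a cut-down stand-in for SO_SalesOrderDetail, and the sample rows are invented. SQLite's LIKE is case-insensitive for ASCII by default; on SQL Server that depends on the collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Detail (ItemCode TEXT, ItemCodeDesc TEXT);
    CREATE TABLE MappingTable (ItemCodeDesc TEXT);  -- one LIKE pattern per row
    INSERT INTO Detail VALUES
        ('P1', 'Blue Widget'),
        ('P2', 'Gift Certificate'),
        ('P3', 'Handling Fee'),
        ('P4', 'Name Tag'),
        ('P5', 'Red Widget');
    INSERT INTO MappingTable VALUES ('%Cert%'), ('%Fee%'), ('%Tag%');
""")

# Keep a detail row only if none of the stored patterns matches its description.
kept = [r[0] for r in conn.execute("""
    SELECT d.ItemCode
    FROM Detail d
    WHERE NOT EXISTS (
        SELECT 1 FROM MappingTable mt
        WHERE d.ItemCodeDesc LIKE mt.ItemCodeDesc
    )
    ORDER BY d.ItemCode
""")]

print(kept)
```

Adding a new exclusion then means inserting a row into the pattern table instead of editing the query.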
The problem you are encountering is a result of discrete values being stored in an ID. It looks like you should have a column IsShippable, or better yet a code table for ItemCodeType with rows of Cert, Fee, Tag, etc. and the IsShippable column there. If you had a code table, you'd be able to do:
inner join ItemCodeTypes ict on ict.ItemCodeTypeId = b.ItemCodeTypeId and ict.IsShippable = 1
Cert, Fee, Tag, rows in the ItemCodeTypes table would have IsShippable = 0:
Id | Name    | IsShippable
1  | Cert    | 0
2  | Fee     | 0
3  | Tag     | 0
4  | Product | 1
5  | Book    | 1
Edit: To more directly answer your question, you could make a view like this, and then when you query from it easily filter on Where IsShippable = 1:
Select CASE
When b.ItemCodeDesc LIKE '%Cert%' Then 0
When b.ItemCodeDesc LIKE '%Fee%' Then 0
--etc.
Else 1
END as IsShippable
,*
From dbo.SO_SalesOrderDetail b

"Invalid Identifier" when joining on a column I created in the SELECT Statement

Note: My research turned up this question, which provides a possible solution to my issue, but my question is more general: is that the kind of solution I should go for?
I would like to query an academic history database to give me a record for each pair of calculus classes a particular student has taken where one is the prerequisite of the other. If the database was set up nicely or the course numbering was reasonable, I could do:
SELECT ...
FROM Academic_History PrerequisiteCourse
JOIN Academic_History NextCourse
ON (NextCourse.CalculusLevel = PrerequisiteCourse.CalculusLevel + 1)
WHERE ...
Of course the CalculusLevel field doesn't exist, so this is nonsense. Also, there are several course numbers that qualify as Calculus I, and several that qualify as Calculus II, and so on, and these change fairly often. That makes hardcoding all the prerequisite pairings into the JOIN statement like this a really bad idea:
SELECT ...
FROM Academic_History PrerequisiteCourse
JOIN Academic_History NextCourse
ON (NextCourse.CourseNumber = '231' AND PrerequisiteCourse.CourseNumber = '220'
OR NextCourse.CourseNumber = '231' AND PrerequisiteCourse.CourseNumber = '221'
OR NextCourse.CourseNumber = '241' AND PrerequisiteCourse.CourseNumber = '231'
OR NextCourse.CourseNumber = '24-' AND PrerequisiteCourse.CourseNumber = '231'
...)
WHERE ...
What I feel like I should do is create my "CalculusLevel" field on the fly, which would be much easier to maintain:
SELECT CASE PrerequisiteCourse.CRS_NBR
WHEN '115' THEN '0'
WHEN '220' THEN '1'
WHEN '221' THEN '1'
...
END PrerequisiteCourseLevel,
CASE NextCourse.CRS_NBR
WHEN '115' THEN '0'
WHEN '220' THEN '1'
WHEN '221' THEN '1'
...
END NextCourseLevel,
FROM Academic_History PrerequisiteCourse
JOIN Academic_History NextCourse
ON (PrerequisiteCourseLevel + 1 = NextCourseLevel)
WHERE ...
But of course the join doesn't work, since those columns are not in those tables. Even if I move the condition out of the JOIN ON and into the WHERE clause, though, I get an "Invalid Identifier" error, presumably because these fields don't exist yet when the WHERE clause is being executed.
What's the right way to do this? I've come up with a couple solutions like the one I mentioned in the second code block, but they all feel like unprofessional hacks.
Thanks!
You could add reusable columns using a CTE:
with hist as
(
select case ... end as NextCourseLevel
, case ... end as PrerequisiteCourseLevel
, ah.*
from Academic_History ah
)
select *
from hist t1
join hist t2
on t1.PrerequisiteCourseLevel + 1 = t2.NextCourseLevel
EDIT: Per your comment, you can refactor the with statement by expanding it everywhere it's used:
select *
from (
select case ... end as PrerequisiteCourseLevel
, ah.*
from Academic_History ah
) t1
join (
select case ... end as NextCourseLevel
, ah.*
from Academic_History ah
) t2
on t1.PrerequisiteCourseLevel + 1 = t2.NextCourseLevel
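For a concrete picture of the CTE approach, here is a minimal sketch run against SQLite rather than the asker's database. The `StudentId` column, the sample rows, and the mapping of course '231' to level 2 are assumptions made for illustration; the levels for '115', '220' and '221' follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Academic_History (StudentId INTEGER, CourseNumber TEXT);
    INSERT INTO Academic_History VALUES
        (1, '115'), (1, '220'), (1, '231'),
        (2, '221'), (2, '999');   -- '999' is not a calculus course
""")

pairs = conn.execute("""
    WITH hist AS (
        -- Compute the level once; both sides of the join reuse it.
        SELECT StudentId,
               CourseNumber,
               CASE CourseNumber
                   WHEN '115' THEN 0
                   WHEN '220' THEN 1
                   WHEN '221' THEN 1
                   WHEN '231' THEN 2   -- assumed level, not from the question
               END AS CalculusLevel
        FROM Academic_History
    )
    SELECT p.StudentId, p.CourseNumber, n.CourseNumber
    FROM hist p
    JOIN hist n
      ON n.StudentId = p.StudentId
     AND n.CalculusLevel = p.CalculusLevel + 1
    ORDER BY p.StudentId, p.CourseNumber
""").fetchall()

print(pairs)
```

Courses without a mapping get a NULL level, so they never satisfy the join condition and drop out of the result. The course-to-level mapping lives in one place, which is the maintainability gain the question was after.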