Is there a way to make this run without a case statement? [duplicate] - sql

This question already has answers here:
TSQL Pivot without aggregate function
(9 answers)
Closed 1 year ago.
I'm relatively new to coding and SQL so please bear with me.
I'm currently working on a query and I have no idea how to get the infinite loop to stop without using a case statement. When I use the case statement I get each value on its own row rather than the values all together in the combination they're supposed to be in.
Case statement SQL
select
CASE
When Attribute_id = '5024923' Then attribute_value
END Page_Name,
CASE
When Attribute_id = '5024925' Then attribute_value
END Site_Name,
CASE
When Attribute_id = '5024924' Then attribute_value
END Last_Touch_Channel,
count(distinct MASTER_CONTACT_ID) known_contact_count,
count (distinct visitor_id) total_contact_Count,
ACTION_DATE
from Adobe_Analytics_Staging
where ATTRIBUTE_ID in ('5024925','5024924','5024923')
group by ATTRIBUTE_ID, ACTION_DATE, ATTRIBUTE_VALUE
Example:
Error with Case statement:
Column A | Column B | Column C
value1   | NULL     | NULL
NULL     | value2   | NULL
NULL     | NULL     | value3
In the data, however, value1, value2 and value3 belong together on the same row.
So I'm trying a new avenue. I suspect the loop is because I'm linking back to the table so many times but I have limited the amount of results to the best of my ability to reduce the amount of records being sent through. Each query works and works fast individually. It's collectively that it slows down a ton.
The reason for joining to the table so many times is because I have to distinguish different types of values within one column.
Note: Not sure if it's relevant, but the different values in the table correlate to a specific ID number within that table. Attribute value and attribute ID are different columns.
For example, in Table A the column looks like this:
Column
A
B
C
I have to make it look like this:
Column 1 | Column 2 | Column 3
A        | B        | C
select
a.ATTRIBUTE_VALUE,
b.ATTRIBUTE_VALUE,
c.ATTRIBUTE_VALUE,
count(distinct aas.MASTER_CONTACT_ID) known_contact_count,
count (distinct d.visitor_id) total_contact_Count,
aas.ACTION_DATE
from Adobe_Analytics_Staging aas
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024923') a on a.VISITOR_ID = aas.VISITOR_ID
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024925') b on b.VISITOR_ID = aas.VISITOR_ID
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024924') c on c.VISITOR_ID = aas.VISITOR_ID
inner join (select visitor_id from Adobe_Analytics_Staging
where ATTRIBUTE_ID in ('5024923','5024925','5024924')) d
on d.VISITOR_ID = aas.VISITOR_ID
--where aas.VISITOR_ID = '3438634761938550664_6795123974460253552'
group by a.ATTRIBUTE_VALUE, b.ATTRIBUTE_VALUE, c.ATTRIBUTE_VALUE, aas.ACTION_DATE

SELECT
VISITOR_ID,
MAX(CASE WHEN Attribute_id = '5024923' Then attribute_value END) Page_Name,
MAX(CASE WHEN Attribute_id = '5024925' Then attribute_value END) Site_Name,
MAX(CASE WHEN Attribute_id = '5024924' Then attribute_value END) Last_Touch_Channel,
COUNT(distinct MASTER_CONTACT_ID) known_contact_count,
COUNT(distinct visitor_id) total_contact_Count,
ACTION_DATE
FROM ContactTargeting.dbo.Adobe_Analytics_Staging
GROUP BY VISITOR_ID, ACTION_DATE
See this fiddle with some demo data
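Since the fiddle link may not be available, here is a minimal sketch of the same MAX(CASE ...) pivot that can be run locally. It uses Python's built-in sqlite3 module rather than SQL Server, and the sample rows (visitors v1 and v2, the page and site names) are invented purely for illustration. The WHERE filter on the three attribute IDs is carried over from the question and is optional.

```python
import sqlite3

# In-memory database with a hypothetical slice of the staging table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Adobe_Analytics_Staging (
        VISITOR_ID TEXT, MASTER_CONTACT_ID TEXT, ACTION_DATE TEXT,
        ATTRIBUTE_ID TEXT, ATTRIBUTE_VALUE TEXT)
""")
sample_rows = [  # invented demo data, one row per attribute
    ("v1", "m1", "2022-01-01", "5024923", "Home"),
    ("v1", "m1", "2022-01-01", "5024925", "SiteA"),
    ("v1", "m1", "2022-01-01", "5024924", "Email"),
    ("v2", None, "2022-01-01", "5024923", "Pricing"),
    ("v2", None, "2022-01-01", "5024925", "SiteA"),
]
conn.executemany(
    "INSERT INTO Adobe_Analytics_Staging VALUES (?, ?, ?, ?, ?)", sample_rows)

# Each CASE yields a value on one row and NULL on the others;
# MAX ignores the NULLs, so the group collapses to a single row.
pivot_sql = """
    SELECT VISITOR_ID,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024923' THEN ATTRIBUTE_VALUE END) AS Page_Name,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024925' THEN ATTRIBUTE_VALUE END) AS Site_Name,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024924' THEN ATTRIBUTE_VALUE END) AS Last_Touch_Channel,
           ACTION_DATE
    FROM Adobe_Analytics_Staging
    WHERE ATTRIBUTE_ID IN ('5024923', '5024925', '5024924')
    GROUP BY VISITOR_ID, ACTION_DATE
    ORDER BY VISITOR_ID
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

A visitor with no row for one of the attributes (v2 has no Last_Touch_Channel here) simply gets NULL in that column instead of an extra row.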

Related

Counting Booleans for Distinct and Non Distinct ID numbers

I have a simple table produced by the following join:
SELECT *
FROM tableA A
JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID
Table A contains Guest_ON and User_Controls (last 2 columns) and Table B contains Trigger_ON and DOCX_ON.
Issue:
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
The problem is that subspace_ID from table B lives within the main_space_id from table A and therefore creates a situation where I am double counting.
I only want to count the trues for a distinct Main_space ID
Current Data Model
Desired Output:
From the above screenshot, I am trying to get a count of true values without double counting in the case for tableA_MAIN_SPACE_ID.
As you can see, each row is counted for true values as it relates to the subspace_ID (table B) for totals of 12 and 8 (1 if True, 0 if False) and for tableA, I am only counting distinct values so we only count Trues for a single MainspaceID and avoid recounting them.
If someone can advise on how to get this output from my current data model that would be very helpful!
My attempt, shown below, double counts the trues for the Main_SPACE_ID columns:
SELECT
count(CASE WHEN B.TRIGGER_ON THEN 1 END) as TRIGGER_ON,
count(CASE WHEN B.DOCX_ON THEN 1 END) as DOCX_ON,
count(CASE WHEN A.GUEST_ON THEN 1 END) as GUEST_ON,
count(CASE WHEN A.USER_CONTROLS THEN 1 END) as USER_CONTROLS
FROM DataModel
What I am trying to do is count all the True's for each tableB.Subspace_ID and the DISTINCT trues for tableA.Main_SPACE_ID.
You can use conditional aggregation. In Snowflake, you can use the convenient COUNT_IF() for the first two columns. However, for the second two, you need COUNT(DISTINCT) with conditional logic:
SELECT COUNT_IF(B.TRIGGER_ON) AS TRIGGER_ON,
COUNT_IF(B.DOCX_ON) AS DOCX_ON,
COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END) AS GUEST_ON,
COUNT(DISTINCT CASE WHEN A.USER_CONTROLS THEN A.Main_SPACE_ID END) AS USER_CONTROLS
FROM tableA A JOIN
tableB B
ON B.Main_SPACE_ID = A.Main_SPACE_ID;
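To see the difference between the plain and the DISTINCT conditional count, here is a small sketch using Python's sqlite3 module. SQLite has no COUNT_IF(), so COUNT(CASE WHEN ... THEN 1 END) stands in for it. Booleans are stored as 0/1, and the table contents are made up for the demo (two main spaces, three subspaces).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (Main_SPACE_ID INTEGER, GUEST_ON INTEGER, USER_CONTROLS INTEGER);
    CREATE TABLE tableB (Subspace_ID INTEGER, Main_SPACE_ID INTEGER,
                         TRIGGER_ON INTEGER, DOCX_ON INTEGER);
    -- invented demo data: main space 1 has two subspaces, main space 2 has one
    INSERT INTO tableA VALUES (1, 1, 0), (2, 1, 1);
    INSERT INTO tableB VALUES (10, 1, 1, 0), (11, 1, 1, 1), (12, 2, 0, 1);
""")

count_sql = """
    SELECT COUNT(CASE WHEN B.TRIGGER_ON THEN 1 END)                       AS trigger_on,
           COUNT(CASE WHEN B.DOCX_ON THEN 1 END)                          AS docx_on,
           COUNT(CASE WHEN A.GUEST_ON THEN 1 END)                         AS guest_on_naive,
           COUNT(DISTINCT CASE WHEN A.GUEST_ON THEN A.Main_SPACE_ID END)  AS guest_on,
           COUNT(DISTINCT CASE WHEN A.USER_CONTROLS THEN A.Main_SPACE_ID END) AS user_controls
    FROM tableA A
    JOIN tableB B ON B.Main_SPACE_ID = A.Main_SPACE_ID
"""
counts = conn.execute(count_sql).fetchone()
print(counts)
```

The naive GUEST_ON count comes out as 3 because main space 1 is repeated once per subspace after the join. Counting the distinct Main_SPACE_ID values inside the CASE brings it back to 2.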
Maybe:
SELECT
COUNT(CASE WHEN B.TRIGGER_ON THEN 1 END) AS TRIGGER_ON,
COUNT(CASE WHEN B.DOCX_ON THEN 1 END) AS DOCX_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT MAIN_SPACE_ID, GUEST_ON FROM DataModel WHERE GUEST_ON = TRUE) G) AS GUEST_ON,
(SELECT COUNT(*) FROM (SELECT DISTINCT MAIN_SPACE_ID, USER_CONTROLS FROM DataModel WHERE USER_CONTROLS = TRUE) U) AS USER_CONTROLS
FROM DataModel

BigQuery(standard SQL) grouping values based on first CASE WHEN statement

Here is my query with the output below the syntax.
SELECT DISTINCT CASE WHEN id = 'RUS0261431' THEN value END AS sr_type,
COUNT(CASE WHEN id in ('RUS0290788') AND value in ('1','2','3','4') THEN respondentid END) AS sub_ces,
COUNT(CASE WHEN id IN ('RUS0290788') AND value in ('5','6','7') THEN respondentid END) AS pos_ces,
COUNT(*) as total_ces
FROM `some_table`
WHERE id in ( 'RUS0261431') AND id <> '' AND value IS NOT NULL
GROUP BY 1
As you can see in the output, I'm unable to group the values for id RUS0290788 by the distinct values that map to RUS0261431. Is there any way to pivot by altering my CASE WHEN statements so I can group sub_ces and pos_ces by sr_type? Thanks in advance.
You can simplify your WHERE condition to WHERE id = ('RUS0261431'). Only records with this value will be selected so you do not have to repeat this in the CASE statements.
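Note that with the WHERE clause restricted to id = 'RUS0261431', the CASE expressions testing for 'RUS0290788' can never match, so both counts stay at zero. One possible approach, which is not from the answer above, is to pivot per respondent first and then group by sr_type. The sketch below runs on SQLite via Python's sqlite3 instead of BigQuery, assumes each respondentid has at most one row per id, uses invented rows, and defines total_ces as the number of respondents with a score, which is my own assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (respondentid TEXT, id TEXT, value TEXT)")
demo_rows = [  # invented demo data
    ("r1", "RUS0261431", "A"), ("r1", "RUS0290788", "2"),
    ("r2", "RUS0261431", "A"), ("r2", "RUS0290788", "6"),
    ("r3", "RUS0261431", "B"), ("r3", "RUS0290788", "7"),
    ("r4", "RUS0261431", "B"), ("r4", "RUS0290788", "5"),
    ("r5", "RUS0261431", "A"),  # no score recorded for this respondent
]
conn.executemany("INSERT INTO some_table VALUES (?, ?, ?)", demo_rows)

grouped_sql = """
    WITH per_respondent AS (
        -- step 1: put sr_type and the score on the same row per respondent
        SELECT respondentid,
               MAX(CASE WHEN id = 'RUS0261431' THEN value END) AS sr_type,
               MAX(CASE WHEN id = 'RUS0290788' THEN value END) AS ces
        FROM some_table
        WHERE id IN ('RUS0261431', 'RUS0290788') AND value IS NOT NULL
        GROUP BY respondentid
    )
    -- step 2: count score buckets per sr_type
    SELECT sr_type,
           COUNT(CASE WHEN ces IN ('1', '2', '3', '4') THEN 1 END) AS sub_ces,
           COUNT(CASE WHEN ces IN ('5', '6', '7') THEN 1 END)      AS pos_ces,
           COUNT(ces)                                              AS total_ces
    FROM per_respondent
    WHERE sr_type IS NOT NULL
    GROUP BY sr_type
    ORDER BY sr_type
"""
by_sr_type = conn.execute(grouped_sql).fetchall()
for row in by_sr_type:
    print(row)
```

The same two-step shape (a CTE that pivots, then an outer GROUP BY) should carry over to BigQuery standard SQL, though I have not run it there.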

Selecting independent rows and displaying them into a single row (ORACLE SQL)

I have a table called requesttool.request_detail which is used to store attributes for entities identified by the value in column REQUEST_ID. The table requesttool.request_detail has a column called ATTRIBUTE_ID which indicates what is stored in the respective row of another column called ATTRIBUTE_VALUE. For instance, if ATTRIBUTE_ID='259' for a given row then name will be stored in that respective row of ATTRIBUTE_VALUE.
Here is what requesttool.request_detail looks like in practice:
What I want to do is to extract the value stored in ATTRIBUTE_VALUE for 3 different ATTRIBUTE_ID's and for a given REQUEST_ID, say 4500161635, and display them in a single row, like this:
I have tried the following code:
select
request_id,
case when attribute_id = '289' then attribute_value end as name,
case when attribute_id = '259' then attribute_value end as country,
case when attribute_id = '351' then attribute_value end as priority
from (
select a.request_id, a.attribute_id, a.attribute_value
from requesttool.request_detail a
where a.request_id='4500161635');
but from this I obtain a table with null values, not a single line:
You are on the right track. Only you'd have to aggregate your rows so as to get one result row per request_id:
select
request_id,
max(case when attribute_id = '289' then attribute_value end) as name,
max(case when attribute_id = '259' then attribute_value end) as country,
max(case when attribute_id = '351' then attribute_value end) as priority
from requesttool.request_detail
where request_id = '4500161635'
group by request_id;
Given an index on request_id + attribute_id, you might be able to speed this up by adding a condition to your where clause:
and attribute_id in ('289', '259', '351')
BTW: Are request_id and attribute_id really strings or why are you using quotes on the numbers?
Try this
select
request_id,
MIN(case when attribute_id = '289' then attribute_value end) as name,
MIN(case when attribute_id = '259' then attribute_value end) as country,
MIN(case when attribute_id = '351' then attribute_value end) as priority
from (
select a.request_id, a.attribute_id, a.attribute_value
from requesttool.request_detail a
where a.request_id='4500161635')
GROUP BY request_id
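Both answers rely on the same fact: aggregates skip NULLs, so with one value per attribute MAX and MIN return the same row. Here is a quick sketch to check that, run with Python's sqlite3 rather than Oracle. The table is named request_detail without the schema prefix, the three values are invented, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE request_detail (
        request_id TEXT, attribute_id TEXT, attribute_value TEXT)
""")
# Hypothetical index matching the request_id + attribute_id suggestion above.
conn.execute(
    "CREATE INDEX idx_request_attr ON request_detail (request_id, attribute_id)")
conn.executemany("INSERT INTO request_detail VALUES (?, ?, ?)", [
    ("4500161635", "289", "Alice"),   # invented demo values
    ("4500161635", "259", "France"),
    ("4500161635", "351", "High"),
])

case_cols = """
    {agg}(CASE WHEN attribute_id = '289' THEN attribute_value END) AS name,
    {agg}(CASE WHEN attribute_id = '259' THEN attribute_value END) AS country,
    {agg}(CASE WHEN attribute_id = '351' THEN attribute_value END) AS priority
"""
where = ("WHERE request_id = '4500161635' "
         "AND attribute_id IN ('289', '259', '351')")

# No aggregate: one output row per source row, mostly NULLs.
raw_rows = conn.execute(
    f"SELECT request_id, {case_cols.format(agg='')} "
    f"FROM request_detail {where} ORDER BY attribute_id").fetchall()

# MAX or MIN with GROUP BY: a single row per request_id.
max_rows = conn.execute(
    f"SELECT request_id, {case_cols.format(agg='MAX')} "
    f"FROM request_detail {where} GROUP BY request_id").fetchall()
min_rows = conn.execute(
    f"SELECT request_id, {case_cols.format(agg='MIN')} "
    f"FROM request_detail {where} GROUP BY request_id").fetchall()

print(raw_rows)
print(max_rows)
print(min_rows)
```

MAX and MIN only differ if a request has several rows for the same attribute_id, in which case you would have to decide which value you actually want.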

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) --originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows, because each row has its own ID which separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID, even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
So I have all my possible detailCode listed in the IN clause. So I have null returned when the value is none but I want it all summarised in 1 row. See image below.
If those are all the possible values of detailCode, you can use conditional aggregation with a CASE expression instead of PIVOT:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is one question and one problem:
1. Question: when I have around 100 columns in a table (and no key or unique index is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. Problem: the example below shows both the question above and my actual SQL statement problem.
Example:
-- pseudocode: A.* = B.* is not valid SQL, it stands for "all columns match"
SELECT
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1,
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement then I have to put the actual rowname into the GROUP BY and the GROUP BY doesn't group the NULL-value from the CASE but the actual value from the row. And because of that I would have to either join or subselect with all columns, since there is no key and no uindex, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though "order items" can have one of two different "departements"(ZD, EK), the fields/rows for "ZD" and "EK" are always both filled. I need the grouping to consider the "departement" and only if the designated "departement" (ZD or EK) is changing, then I want a new group to be created.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This ran with the CASE expressions in the SELECT and ZD, EK in the GROUP BY. The only problem was that even when EK was not the designated DEPARTEMENT, a change in EK still opened a new group, because the GROUP BY was using the real EK column value and not the NULL from the CASE, as I explained above.
And here, ladies and gentlemen, is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
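For anyone who wants to reproduce the difference, here is a small sketch using Python's sqlite3 instead of DB2. The table name order_items and the four rows are made up. The first query groups by the raw ZD and EK columns, the second groups by the CASE expressions as in the solution above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_items (
        departement INTEGER, zd TEXT, ek TEXT, distributor TEXT, something INTEGER)
""")
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?, ?)", [
    (1, "Z1", "E1", "D1", 10),  # invented demo data
    (1, "Z1", "E2", "D1", 5),   # EK changes, but departement 1 only cares about ZD
    (2, "Z1", "E1", "D1", 7),
    (2, "Z2", "E1", "D1", 3),   # ZD changes, but departement 2 only cares about EK
])

select_list = """
    SELECT CASE WHEN t.departement = 1 THEN t.zd END AS zd_out,
           CASE WHEN t.departement = 2 THEN t.ek END AS ek_out,
           t.distributor,
           SUM(t.something) AS something
    FROM order_items t
"""

# Grouping by the raw columns: every change in ZD or EK opens a new group.
raw_groups = conn.execute(
    select_list +
    "GROUP BY t.zd, t.ek, t.distributor, t.departement").fetchall()

# Grouping by the CASE expressions: the irrelevant column is NULL
# within each departement, so it no longer splits the group.
case_groups = conn.execute(
    select_list + """
    GROUP BY CASE WHEN t.departement = 1 THEN t.zd END,
             CASE WHEN t.departement = 2 THEN t.ek END,
             t.distributor,
             t.departement
    ORDER BY t.departement""").fetchall()

print(len(raw_groups), raw_groups)
print(len(case_groups), case_groups)
```

The output aliases are deliberately named zd_out and ek_out so they cannot be confused with the underlying columns in the GROUP BY.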
@t-clausen.dk: Thank you!
@others: ...
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
C.FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for data types that are not comparable.
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, GRP
, DISTRIBUTOR
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)
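As a rough illustration of the whole-row comparison, here is a sketch with Python's sqlite3 instead of DB2. SQLite only accepts plain INTERSECT (which already removes duplicates), and the tables a and b with three columns each are invented. It also shows one practical difference from a column-by-column join: the set operators treat NULLs as equal, while = does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (field1 INTEGER, field2 TEXT, field3 TEXT);
    CREATE TABLE b (field1 INTEGER, field2 TEXT, field3 TEXT);
    -- invented demo data: row 1 matches, row 2 differs, row 3 matches but has a NULL
    INSERT INTO a VALUES (1, 'x', 'p'), (2, 'y', 'q'), (3, NULL, 'r');
    INSERT INTO b VALUES (1, 'x', 'p'), (2, 'y', 'DIFF'), (3, NULL, 'r');
""")

# Rows of a that appear, column for column, in b. No column list needed.
matching = conn.execute(
    "SELECT * FROM a INTERSECT SELECT * FROM b ORDER BY 1").fetchall()

# Rows of a that have no identical row in b.
only_in_a = conn.execute(
    "SELECT * FROM a EXCEPT SELECT * FROM b ORDER BY 1").fetchall()

# The explicit join has to list every column and misses the NULL row,
# because NULL = NULL is not true.
joined = conn.execute("""
    SELECT a.* FROM a
    JOIN b ON a.field1 = b.field1
          AND a.field2 = b.field2
          AND a.field3 = b.field3
    ORDER BY 1""").fetchall()

print(matching)
print(only_in_a)
print(joined)
```

Both tables need the same number of columns with compatible types for the set operators to work.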