I have a single table in the following format:
STATE SURVEY_ANSWER
NC high
NC moderate
WA high
FL low
NC high
I am looking for a single query that will get me the following result:
STATE HIGH MODERATE LOW
NC 2 1 0
WA 1 0 0
FL 0 0 1
Unfortunately, these are the results I am getting:
STATE HIGH MODERATE LOW
NC 3 1 1
WA 3 1 1
FL 3 1 1
Here is the code I am using:
Select mytable.STATE,
(SELECT COUNT(*) FROM mytable WHERE mytable.survey_answer = 'low' and state = mytable.state) AS low,
(SELECT COUNT(*) FROM mytable WHERE mytable.survey_answer = 'moderate' and state = mytable.state) AS moderate,
(SELECT COUNT(*) FROM mytable WHERE mytable.survey_answer = 'high' and state = mytable.state) AS high
FROM mytable
GROUP BY mytable.state;
While this and other forums have been very helpful I am unable to figure out what I am doing wrong. PLEASE NOTE: I am using Access so CASE WHEN solutions do not work. Thank you for any advice.
It looks like this may be an issue caused by not using table aliases. Because you are doing sub-queries on the same table that the outer SELECT is using and not giving the outer table an alias, both of the conditions in the WHERE of the sub-query are only using data in the sub-query.
In other words, when you write:
SELECT COUNT(*) FROM mytable WHERE mytable.survey_answer = 'low' and state = mytable.state
It doesn't know anything about the outer query.
Try this:
SELECT t1.STATE,
(SELECT COUNT(*) FROM mytable t2 WHERE t2.state = t1.state AND t2.survey_answer = 'low') AS low,
(SELECT COUNT(*) FROM mytable t3 WHERE t3.state = t1.state AND t3.survey_answer = 'moderate') AS moderate,
(SELECT COUNT(*) FROM mytable t4 WHERE t4.state = t1.state AND t4.survey_answer = 'high') AS high
FROM mytable t1
GROUP BY t1.state;
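To see the difference concretely, here is a minimal sketch that runs both forms against the sample rows from the question. It uses Python's sqlite3 module as a stand-in for Access (an assumption: column names inside the subquery resolve to the subquery's own table in the same way, but Access syntax differs in places). Only the high column is shown to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (state TEXT, survey_answer TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("NC", "high"), ("NC", "moderate"), ("WA", "high"),
     ("FL", "low"), ("NC", "high")],
)

# No aliases: inside the subquery, mytable.state refers to the subquery's own
# table, so "state = mytable.state" is always true and the count covers all states.
unaliased = conn.execute("""
    SELECT mytable.state,
           (SELECT COUNT(*) FROM mytable
             WHERE mytable.survey_answer = 'high'
               AND state = mytable.state) AS high
    FROM mytable
    GROUP BY mytable.state
    ORDER BY mytable.state
""").fetchall()

# With aliases: t2.state is compared against the outer row's t1.state.
aliased = conn.execute("""
    SELECT t1.state,
           (SELECT COUNT(*) FROM mytable AS t2
             WHERE t2.state = t1.state
               AND t2.survey_answer = 'high') AS high
    FROM mytable AS t1
    GROUP BY t1.state
    ORDER BY t1.state
""").fetchall()

print(unaliased)  # same total repeated for every state
print(aliased)    # per-state counts
```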
Aiias's answer explains why your current query is not working, but I thought I'd point out that your assumption that you can't use CASE WHEN solutions is only partly right. Yes, you can't use CASE WHEN in Access, but that doesn't mean you need correlated subqueries. You could simply use:
SELECT mytable.STATE,
SUM(IIF(mytable.survey_answer = 'low', 1, 0)) AS low,
SUM(IIF(mytable.survey_answer = 'moderate', 1, 0)) AS moderate,
SUM(IIF(mytable.survey_answer = 'high', 1, 0)) AS high
FROM mytable
GROUP BY mytable.state;
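The same single-pass idea can be tried outside Access. The sketch below is an illustration only: it runs in Python's sqlite3 module and uses CASE WHEN where Access would use IIF (an assumption made so the example runs on any SQLite version), with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (state TEXT, survey_answer TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("NC", "high"), ("NC", "moderate"), ("WA", "high"),
     ("FL", "low"), ("NC", "high")],
)

# Each row contributes 1 to exactly one of the three sums, so the table is
# scanned once and no subqueries are needed.
rows = conn.execute("""
    SELECT state,
           SUM(CASE WHEN survey_answer = 'low'      THEN 1 ELSE 0 END) AS low,
           SUM(CASE WHEN survey_answer = 'moderate' THEN 1 ELSE 0 END) AS moderate,
           SUM(CASE WHEN survey_answer = 'high'     THEN 1 ELSE 0 END) AS high
    FROM mytable
    GROUP BY state
    ORDER BY state
""").fetchall()

for state, low, moderate, high in rows:
    print(state, high, moderate, low)
```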
I have a table like the one below, and I want 'Y' next to Ref 345 and 789 in the result set, on the basis of count(Ref) = 1 where the amount is less than 0. I am using this query to get the desired output. My question is: is there any other (and more efficient) way to do it in Teradata?
SELECT T.Ref,T.AMOUNT, R.Refund_IND as Refund_IND
FROM Table1 t
LEFT JOIN (select 'Y' as Refund_IND, Ref from Table1 where Ref in
(select Ref from Table1 where amount < 0)
group by Ref having count(Ref) = 1) R on t.Ref = R.Ref
You can use window functions to test these conditions:
SELECT
Ref,
Amount,
CASE WHEN COUNT(*) OVER (PARTITION BY REF) = 1 AND Amount < 0 THEN 'Y' ELSE '' END AS Refund_Ind
FROM Table1
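As a rough check of the window-function version, here is a sketch using Python's sqlite3 module in place of Teradata (an assumption; it needs SQLite 3.25 or later for window functions). The rows are hypothetical, since the original table is not shown: Ref 345 and 789 each have a single negative row, while Ref 123 has two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Ref INTEGER, Amount INTEGER)")
# Hypothetical sample data, invented for illustration only.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(123, 100), (123, -100), (345, -50), (456, 10), (789, -20)],
)

# COUNT(*) OVER (PARTITION BY Ref) attaches the per-Ref row count to every
# row, so the flag can be computed without joining the table to itself.
rows = conn.execute("""
    SELECT Ref,
           Amount,
           CASE WHEN COUNT(*) OVER (PARTITION BY Ref) = 1 AND Amount < 0
                THEN 'Y' ELSE '' END AS Refund_Ind
    FROM Table1
    ORDER BY Ref, Amount
""").fetchall()

for row in rows:
    print(row)
```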
I have a table of data and I would want to group 2 columns based on a logic formed from a few case statements in a new column. This is my data:
And this is my current sql:
select a.Action,st.State,ym.Year,sum(RatingCount) as LevelCount
from ActionTable a
left join StateTable st on a.ID = st.ActionID
left join YearMetrics ym on a.Name = ym.NameCategory and st.Name = ym.CategoryName
group by a.name,st.name,ym.Year,ym.Level
These are the case statements (not all of them) based on which the logic should apply:
case when level = 'high' and levelcount >= 1 then 'High'
when level = 'medium' and levelcount > 3 then 'High'
else 'Low'
end as Level
So, for example, in the case of Oregon (lines 20, 21, 22) I would want to group the data on Action, State, Year, based on the case statements. A new column named Level should be formed from the logic in the case statements. So in the case of line 20, because no case statement matches the data in the table, the result should be:
Non-Travel Oregon 2020 Low
The lines 21,22 should be:
Non-Travel Oregon 2021 High
because, according to the case statements, there is one levelcount >= 1 and Level is High. In the case of line 19 the result should be:
Non-Travel Nevada null null
What I have tried includes:
Partitions
CLR object to include the logic in a C# assembly
Stuff function
Group by case statements
I have not managed to obtain the desired result using any of the techniques.
This is the expected result:
Any help would be appreciated.
This appears to be the logic that you are describing:
select a.Action, st.State, ym.Year,
sum(RatingCount) as LevelCount,
(case when level = 'high' and sum(RatingCount) >= 1 then 'High'
when level = 'medium' and sum(RatingCount) > 3 then 'High'
when level = 'medium' then 'Low'
end) as Level
from ActionTable a left join
StateTable st
on a.ID = st.ActionID left join
YearMetrics ym
on a.Name = ym.NameCategory and st.Name = ym.CategoryName
group by a.name, st.name, ym.Year, ym.Level;
As far as I can tell, the stated expected results are not compatible with the rules you've given us for deriving them. It also doesn't help that your data is the output of your existing query rather than the raw data. As a result, it feels like we're guessing a bit here ...
The query I've given below doesn't return what you say you want, but it's close and I think agrees with your explanation.
WITH subquery AS
(
select a.Action,st.State,ym.Year,ym.Level,sum(RatingCount) as LevelCount
from ActionTable a
left join StateTable st on a.ID = st.ActionID
left join YearMetrics ym on a.Name = ym.NameCategory and st.Name = ym.CategoryName
group by a.name,st.name,ym.Year,ym.Level
) --This is just your original code with ym.Level added to the SELECT clause.
SELECT
s.Action,
s.State,
s.Year,
CASE WHEN s.Level = 'high' AND s.LevelCount >=1 THEN 'High'
WHEN s.Level = 'medium' AND s.LevelCount >0 THEN 'High'
WHEN s.Level IS NULL THEN NULL --If you don't do this, NULLs become 'Low'
ELSE 'Low'
END AS NewLevel
FROM
subquery s
GROUP BY
s.Action,
s.State,
s.Year,
CASE WHEN s.Level = 'high' AND s.LevelCount >=1 THEN 'High'
WHEN s.Level = 'medium' AND s.LevelCount >0 THEN 'High'
WHEN s.Level IS NULL THEN NULL
ELSE 'Low'
END
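The two-stage shape above (aggregate first, then derive the label and group again) can be tried in isolation. This is only a sketch under loud assumptions: it runs in Python's sqlite3 module, it replaces the three joined tables with one hypothetical flattened table named metrics, the rows are invented, and the medium threshold follows the question's "> 3" rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the three joined tables.
conn.execute(
    "CREATE TABLE metrics (action TEXT, state TEXT, year INTEGER, "
    "level TEXT, rating_count INTEGER)"
)
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?, ?, ?)",
    [
        ("Non-Travel", "Nevada", None, None, None),
        ("Non-Travel", "Oregon", 2020, "low", 2),
        ("Non-Travel", "Oregon", 2021, "high", 1),
        ("Non-Travel", "Oregon", 2021, "medium", 4),
    ],
)

# The derived label; reused in SELECT and GROUP BY so both stay in sync.
new_level = """CASE WHEN level = 'high' AND level_count >= 1 THEN 'High'
                    WHEN level = 'medium' AND level_count > 3 THEN 'High'
                    WHEN level IS NULL THEN NULL
                    ELSE 'Low' END"""

rows = conn.execute(f"""
    WITH subquery AS (
        SELECT action, state, year, level, SUM(rating_count) AS level_count
        FROM metrics
        GROUP BY action, state, year, level
    )
    SELECT action, state, year, {new_level} AS new_level
    FROM subquery
    GROUP BY action, state, year, {new_level}
    ORDER BY state, year
""").fetchall()

for row in rows:
    print(row)
```

With these rows, the two Oregon 2021 lines both map to 'High' and collapse into one row, and the Nevada row keeps its NULLs.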
I have multiple queries that look like this:
select count(*) from (
SELECT * FROM TABLE1 t
JOIN TABLE2 e
USING (EVENT_ID)
) s1
WHERE
s1.SOURCE_ID = 1;
where the only difference is that s1.SOURCE_ID = (some other number). I would like to turn these into a single query that just selects from the subquery using a different SOURCE_ID for each column in the result, like this:
+----------------+----------------+----------------+
| source_1_count | source_2_count | source_3_count | ... so on
+----------------+----------------+----------------+
I am trying to avoid using the multiple queries as the join is on a very large table and takes some time, so I would rather do it once and query the result multiple times.
This is on a Snowflake data warehouse which I think uses something similar to PostgreSQL (also I'm fairly new to SQL so feel free to suggest a completely different solution as well).
Use conditional aggregation:
SELECT sum(case when SOURCE_ID=1 then 1 else 0 end) source_1_count, sum(case when SOURCE_ID=2 then 1 else 0 end) source_2_count...
FROM TABLE1 t
JOIN TABLE2 e
USING (EVENT_ID)
You would put the results in separate rows, using group by:
SELECT SOURCE_ID, COUNT(*)
FROM TABLE1 t JOIN
TABLE2 e
USING (EVENT_ID)
GROUP BY SOURCE_ID;
Putting the separate sources in columns is troublesome, unless you know the exact list of sources that you want in the result set.
EDIT:
If you know the exact list of sources, you can use conditional aggregation or pivot:
SELECT SUM(CASE WHEN SOURCE_ID = 1 THEN 1 ELSE 0 END) as source_id_1,
SUM(CASE WHEN SOURCE_ID = 2 THEN 1 ELSE 0 END) as source_id_2,
SUM(CASE WHEN SOURCE_ID = 3 THEN 1 ELSE 0 END) as source_id_3
FROM TABLE1 t JOIN
TABLE2 e
USING (EVENT_ID);
All the comments so far ignore the fact that you won't have the possible benefits of pruning the data during the scan, as there are no WHERE predicates. Join can also be slower than it needs to be because of that.
This is a possible improvement:
SELECT SUM(CASE WHEN SOURCE_ID = 1 THEN 1 ELSE 0 END) as source_id_1,
SUM(CASE WHEN SOURCE_ID = 2 THEN 1 ELSE 0 END) as source_id_2,
SUM(CASE WHEN SOURCE_ID = 3 THEN 1 ELSE 0 END) as source_id_3
FROM TABLE1 t JOIN
TABLE2 e
USING (EVENT_ID)
WHERE SOURCE_ID IN (1, 2, 3);
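A small sketch of the pivot with and without the filter, to show that the WHERE clause does not change the counts and only limits which rows are read. It runs in Python's sqlite3 module rather than Snowflake (an assumption), and the two tables, the payload column, and the rows are all hypothetical. SOURCE_ID is assumed to live in TABLE1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (event_id INTEGER, source_id INTEGER)")
conn.execute("CREATE TABLE table2 (event_id INTEGER, payload TEXT)")
# Hypothetical data: sources 1..4, one matching row in table2 per event.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 3), (5, 4)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
)

pivot = """
    SELECT SUM(CASE WHEN source_id = 1 THEN 1 ELSE 0 END) AS source_1_count,
           SUM(CASE WHEN source_id = 2 THEN 1 ELSE 0 END) AS source_2_count,
           SUM(CASE WHEN source_id = 3 THEN 1 ELSE 0 END) AS source_3_count
    FROM table1 t
    JOIN table2 e USING (event_id)
"""

# One join, one pass: every source gets its own column.
unfiltered = conn.execute(pivot).fetchone()
# Same counts, but rows for other sources (here source 4) are excluded up front.
filtered = conn.execute(pivot + " WHERE source_id IN (1, 2, 3)").fetchone()

print(unfiltered, filtered)
```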
I have the following information in a table
Load Number Origin Destination
1 AR TX
2 AR AL
3 TX MS
4 WA AR
I need help with a SQL statement that will produce the following results
State  Origin  Destination
AR     2       1
TX     1       1
WA     1
MS             1
I have tried countless types of SELECT statements with various types of COUNTS in them and GROUP BY's at the end but I can't get the results I'm looking for. Any help is greatly appreciated.
How about:
select state,
sum(origin) origin,
sum(destination) destination
from (select origin as state,
0 as destination,
1 as origin
from my_table
union all
select destination,
1,
0
from my_table) t
group by state
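Here is a minimal runnable sketch of the UNION ALL approach, using Python's sqlite3 module (an assumption about the engine) and the four loads from the question. The helper column names is_origin and is_destination are mine, chosen to avoid reusing the table's own column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (load_number INTEGER, origin TEXT, destination TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "AR", "TX"), (2, "AR", "AL"), (3, "TX", "MS"), (4, "WA", "AR")],
)

# Each load is turned into two rows: one tagged as an origin, one as a
# destination. Summing the tags per state gives both counts in one pass.
rows = conn.execute("""
    SELECT state,
           SUM(is_origin)      AS origin,
           SUM(is_destination) AS destination
    FROM (SELECT origin AS state, 1 AS is_origin, 0 AS is_destination
          FROM my_table
          UNION ALL
          SELECT destination, 0, 1
          FROM my_table) AS u
    GROUP BY state
    ORDER BY state
""").fetchall()

for row in rows:
    print(row)
```

Note that AL also appears in the output, since load 2 is destined there, and that states with no loads on one side get 0 rather than a blank.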
You can first build an inner query to select all the distinct states from the Origin and Destination columns. Then you can join it back to the main table and do a conditional aggregation.
select x.state,
sum(case when x.state = t.Origin then 1 end),
sum(case when x.state = t.Destination then 1 end)
from tablename t
join (
select Origin as State
from tablename
union
select Destination
from tablename ) x
on t.Origin = x.state or t.Destination = x.state
group by x.state
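A quick way to try the join version is sketched below with Python's sqlite3 module (an assumption about the engine) and the question's four loads. Because the CASE expressions have no ELSE branch, a state with no matching rows on one side comes back as NULL (None in Python) instead of 0, which corresponds to the blank cells in the requested output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (load_number INTEGER, origin TEXT, destination TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(1, "AR", "TX"), (2, "AR", "AL"), (3, "TX", "MS"), (4, "WA", "AR")],
)

# x holds each state once (UNION removes duplicates); every load is then
# joined to the states it touches and counted on the matching side only.
rows = conn.execute("""
    SELECT x.state,
           SUM(CASE WHEN x.state = t.origin      THEN 1 END) AS origin,
           SUM(CASE WHEN x.state = t.destination THEN 1 END) AS destination
    FROM tablename t
    JOIN (SELECT origin AS state FROM tablename
          UNION
          SELECT destination FROM tablename) x
      ON t.origin = x.state OR t.destination = x.state
    GROUP BY x.state
    ORDER BY x.state
""").fetchall()

for row in rows:
    print(row)
```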
Here is a sample table I have
Logs
user_id, session_id, search_query, action
1, 100, dog, A
1, 100, dog, B
2, 101, cat, A
3, 102, ball, A
3, 102, ball, B
3, 102, kite, A
4, 103, ball, A
5, 104, cat, A
where
miss = for the same user_id and the same session_id, if action A is not followed by action B, it's termed a miss.
Note: action B can happen only after action A has happened.
I am able to find the count of misses for each unique search_query across all users and sessions.
SELECT l1.search_query, count(l1.*) as misses
FROM logs l1
WHERE NOT EXISTS
(SELECT NULL FROM logs l2
WHERE l1.user_id = l2.user_id
AND l1.session_id = l2.session_id
AND l1.session_id != ''
AND l2.action = 'B'
AND l1.action = 'A')
AND l1.action='A'
AND l1.search_query != ''
GROUP BY l1.search_query
order by misses desc;
I am trying to find the value of miss_percentage=(number of misses/total number of rows)*100 for each unique search_query. I couldn't figure out how to find the count with a condition and count without that condition in the same query. Any help would be great.
expected output:
cat 100
kite 100
ball 50
One way to do it is to move the EXISTS into the count:
SELECT l1.search_query, count(case when NOT EXISTS
(SELECT 1 FROM logs l2
WHERE l1.user_id = l2.user_id
AND l1.session_id = l2.session_id
AND l1.search_query = l2.search_query
AND l2.action = 'B'
AND l1.action = 'A') then 1 else null end
)*100.0/count(*) as misses
FROM logs l1
WHERE l1.action='A'
AND l1.search_query != ''
GROUP BY l1.search_query
order by misses desc;
This produces the desired results, but it also returns zeros for queries with no misses. These can be removed with a HAVING clause or in postprocessing.
Note that I also added the condition l1.search_query = l2.search_query, which was missing. Without it, kite is counted as a success because the same session contains a row with action B.
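The effect of the extra search_query condition can be checked with a small sketch. It uses Python's sqlite3 module (an assumption; the original engine is not stated) and the sample Logs rows from the question. The helper function miss_percentages and its match_query flag are hypothetical names introduced for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (user_id INTEGER, session_id INTEGER, "
    "search_query TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [(1, 100, "dog", "A"), (1, 100, "dog", "B"), (2, 101, "cat", "A"),
     (3, 102, "ball", "A"), (3, 102, "ball", "B"), (3, 102, "kite", "A"),
     (4, 103, "ball", "A"), (5, 104, "cat", "A")],
)

def miss_percentages(match_query):
    """Return {search_query: miss %}; optionally require B to match the query."""
    extra = "AND l1.search_query = l2.search_query" if match_query else ""
    sql = f"""
        SELECT l1.search_query,
               COUNT(CASE WHEN NOT EXISTS (
                         SELECT 1 FROM logs l2
                          WHERE l1.user_id = l2.user_id
                            AND l1.session_id = l2.session_id
                            {extra}
                            AND l2.action = 'B') THEN 1 END) * 100.0 / COUNT(*)
        FROM logs l1
        WHERE l1.action = 'A'
        GROUP BY l1.search_query
    """
    return dict(conn.execute(sql).fetchall())

with_match = miss_percentages(match_query=True)
without_match = miss_percentages(match_query=False)
print(with_match)     # kite is a miss
print(without_match)  # kite is hidden by the 'ball' B row in session 102
```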
I think you just need to use case statements here. If I have understood your problem correctly, the solution would be something like this:
WITH summary
AS (
SELECT user_id
,session_id
,search_query
,count(1) AS total_views
,sum(CASE
WHEN action = 'A'
THEN 1
ELSE 0
END) AS action_a
,sum(CASE
WHEN action = 'B'
THEN 1
ELSE 0
END) AS action_b
FROM logs l
GROUP BY user_id
,session_id
,search_query
)
SELECT search_query
,sum(action_a - action_b) * 100.0 / sum(action_a) AS miss_percentage
FROM summary
GROUP BY search_query;
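A ratio of two integer sums is truncated by engines that use integer division, so the order and type of the arithmetic in the final expression matter. The sketch below runs the summary CTE in Python's sqlite3 module (an assumption about the engine; SQLite truncates integer division) on the question's sample rows, and compares dividing first with multiplying by 100.0 first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (user_id INTEGER, session_id INTEGER, "
    "search_query TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [(1, 100, "dog", "A"), (1, 100, "dog", "B"), (2, 101, "cat", "A"),
     (3, 102, "ball", "A"), (3, 102, "ball", "B"), (3, 102, "kite", "A"),
     (4, 103, "ball", "A"), (5, 104, "cat", "A")],
)

# Per user/session/query counts of A and B actions, as in the CTE above.
summary = """
    WITH summary AS (
        SELECT user_id, session_id, search_query,
               SUM(CASE WHEN action = 'A' THEN 1 ELSE 0 END) AS action_a,
               SUM(CASE WHEN action = 'B' THEN 1 ELSE 0 END) AS action_b
        FROM logs
        GROUP BY user_id, session_id, search_query
    )
    SELECT search_query, {expr} FROM summary GROUP BY search_query
"""

# Integer division happens first, so 1 / 2 becomes 0 before the multiply.
truncated = dict(conn.execute(summary.format(
    expr="(SUM(action_a - action_b) / SUM(action_a)) * 100")).fetchall())
# Multiplying by 100.0 first forces floating-point arithmetic.
exact = dict(conn.execute(summary.format(
    expr="SUM(action_a - action_b) * 100.0 / SUM(action_a)")).fetchall())

print(truncated["ball"], exact["ball"])
```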
You can always create two queries and combine them into one with a join. Then you can do the calculations in the bridging (or joining) SQL statement.
In MS-SQL compatible SQL this would be:
SELECT ActiontypeA,countedA,isNull(countedB,0) as countedB,
(countedA-isNull(countedB,0))*100/CountedA as missed
FROM (SELECT search_query as actionTypeA, count(*) as countedA
FROM logs WHERE Action='A' GROUP BY search_query
) as TpA
LEFT JOIN
(SELECT search_query as actionTypeB, count(*) as countedB
FROM logs WHERE Action='B' GROUP BY search_query
) as TpB
ON TpA.ActionTypeA = TpB.ActiontypeB
The LEFT JOIN is required to select all activities (search_query) from the 'A' results, and join them to only those from the 'B' results where a B is available.
Since this is very basic SQL (and well optimized by SQL engines), I'd suggest avoiding WHERE EXISTS as much as possible. IsNull() is an MS-SQL function that here replaces a NULL count with the integer 0 so that it can be used in a calculation.
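Finally, you could wrap the query in an outer SELECT (MS-SQL does not allow a column alias in the WHERE clause of the same SELECT) and filter on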
WHERE missed>0
to get the final result.