T-SQL Crosstab count query

I have the following dataset:
... and I want to do a crosstab of sorts, counting the data against specific criteria e.g.:
Colour criteria: String contains "Blue", "Red", "Yellow" or "Green" (not case sensitive)
Type criteria: String contains "Car", "Lorry", or "Bus" (not case sensitive)
... and I would like the result to look like the following:
Is there an SQL query that I can run on the original data to produce the result I'm looking for?

You can use CROSS APPLY with conditional aggregation; CROSS APPLY simplifies the generation of the list of colours:
select c.colour,
sum(case when v.VehicleData like '%Car%' then 1 else 0 end) Car,
sum(case when v.VehicleData like '%Lorry%' then 1 else 0 end) Lorry,
sum(case when v.VehicleData like '%Bus%' then 1 else 0 end) Bus
from vehicles v
cross apply (values ('Blue'), ('Red'), ('Yellow'), ('Green')
) AS c(colour)
where v.VehicleData like '%' + c.colour + '%'
group by c.colour
Output:
colour Car Lorry Bus
Blue 3 1 0
Red 1 2 0
Yellow 0 1 1
Green 0 0 2
Demo on dbfiddle

With conditional aggregation:
select c.colour,
count(case when t.VehicleData like '%Car%' then 1 end) Car,
count(case when t.VehicleData like '%Lorry%' then 1 end) Lorry,
count(case when t.VehicleData like '%Bus%' then 1 end) Bus
from (
select 'Blue' colour union all
select 'Red' union all
select 'Yellow' union all
select 'Green'
) c left join tbl1 t
on t.VehicleData like '%' + c.colour + '%'
group by c.colour
See the demo.
Results:
> colour | Car | Lorry | Bus
> :----- | --: | ----: | --:
> Blue | 3 | 1 | 0
> Red | 1 | 2 | 0
> Yellow | 0 | 1 | 1
> Green | 0 | 0 | 2
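One practical difference between the two answers: the CROSS APPLY version filters in the WHERE clause, so a colour with no matching rows drops out of the result, while the LEFT JOIN version keeps it with zero counts. Here is a minimal sketch of the LEFT JOIN variant, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server. The sample rows are hypothetical, since the original dataset isn't shown, and string concatenation is `||` in SQLite rather than `+`.

```python
import sqlite3

# Hypothetical sample rows; the question's original dataset is not shown.
ROWS = ["Blue car", "blue Lorry", "RED car", "Yellow bus", "red lorry"]

QUERY = """
WITH colours(colour) AS (VALUES ('Blue'), ('Red'), ('Yellow'), ('Green'))
SELECT c.colour,
       COUNT(CASE WHEN v.VehicleData LIKE '%Car%'   THEN 1 END) AS Car,
       COUNT(CASE WHEN v.VehicleData LIKE '%Lorry%' THEN 1 END) AS Lorry,
       COUNT(CASE WHEN v.VehicleData LIKE '%Bus%'   THEN 1 END) AS Bus
FROM colours AS c
LEFT JOIN vehicles AS v ON v.VehicleData LIKE '%' || c.colour || '%'
GROUP BY c.colour
ORDER BY c.colour
"""


def crosstab(rows):
    """Load the rows into an in-memory table and run the crosstab query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vehicles (VehicleData TEXT)")
    conn.executemany("INSERT INTO vehicles VALUES (?)", [(r,) for r in rows])
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in crosstab(ROWS):
    print(row)  # 'Green' has no matching rows but still appears, with zeros
```

SQLite's LIKE ignores case for ASCII letters by default; in SQL Server, case sensitivity depends on the column's collation.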

Related

How to do multiple actions in case when then in sql?

I want to do something like this:
select sum(case when ttt.ind = 1 then 1 else 0 end) from ttt
I want to add a column to this query, called myresult which indicates if the value of ttt.istry is equal to 1.
Maybe like:
select
sum(case ttt.ind = 1 then 1, ttt.istry as myresult else 0 end)
from ttt
of course I got an error...
How would I do that?
My data is:
ttt.ind | ttt.istry
--------+----------
1 | 0
0 | 1
1 | 1
and so on...
Expected result:
ttt.ind | ttt.istry | myresult | sum
--------+-----------+----------+------
1 | 0 | 0 | 2
0 | 1 | null | 2
1 | 1 | 1 | 2
You don't say which database, so I'll assume it's a modern one that supports window functions. You can use a window function and a CASE expression to do this.
For example:
select
ind,
istry,
case when ind = 1 then istry end as myresult,
sum(ind) over() as sum
from ttt
See live example at SQL Fiddle.
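To check the window-function query against the three rows from the question, here is a small sketch using Python's sqlite3 module. SQLite (3.25 or later, for window functions) is my assumption here, not the asker's database; the column alias `total` and the `ORDER BY rowid` are also my additions, to avoid a keyword-like alias and to keep the row order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ttt (ind INTEGER, istry INTEGER)")
conn.executemany("INSERT INTO ttt VALUES (?, ?)", [(1, 0), (0, 1), (1, 1)])

rows = conn.execute("""
    SELECT ind,
           istry,
           CASE WHEN ind = 1 THEN istry END AS myresult,  -- NULL when ind <> 1
           SUM(ind) OVER () AS total                      -- same total on every row
    FROM ttt
    ORDER BY rowid
""").fetchall()

for row in rows:
    print(row)
```

A CASE without an ELSE branch returns NULL when no WHEN matches, which is what produces the null in the expected result.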
Your logic is a bit hard to follow, but your result set suggests:
select ind, istry,
(case when istry = 1 then 1
when sum(istry) over (partition by ind) = 1 then 0
end) as myresult,
sum(ttt.ind) over () as sum_ind
from ttt;

Is there a way in SQL Server to solve for this conditional problem for prioritization?

I have the following schema of a table
Name Number
----- -------
A 200
A 322
B 200
B 322
C 322
C 200
D 322
D 234
I need some conditional statement to add another label column.
The first condition is that if a name has number 200, that number takes priority over all others and the row should be labeled 'Apple'.
The second condition is that if a name does not have number 200, number 322 takes priority and the row should be labeled 'Mango'.
I want my final result to look something like this which is grouped by name.
Name Number Label
----- ------- ------
A 200 Apple
B 200 Apple
C 200 Apple
D 322 Mango
With conditional aggregation:
select
name,
case min(case number when 200 then 0 when 322 then 1 end)
when 0 then 'Apple'
when 1 then 'Mango'
end Label
from tablename
group by name
See the demo.
Results:
> name | Label
> :--- | :----
> A | Apple
> B | Apple
> C | Apple
> D | Mango
If you also want the Number column, do the aggregation inside a CTE:
with cte as (
select name, min(case number when 200 then 0 when 322 then 1 end) id
from tablename
group by name
)
select
name,
case id when 0 then 200 when 1 then 322 end Number,
case id when 0 then 'Apple' when 1 then 'Mango' end Label
from cte
See the demo.
Results:
> name | Number | Label
> :--- | -----: | :----
> A | 200 | Apple
> B | 200 | Apple
> C | 200 | Apple
> D | 322 | Mango
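The trick in both queries is to map each number to a priority rank and take the MIN per name. A sketch of the CTE version, run on SQLite via Python's sqlite3 as an assumed stand-in for SQL Server, with the question's data plus one hypothetical name E (my addition) that has neither 200 nor 322:

```python
import sqlite3

DATA = [("A", 200), ("A", 322), ("B", 200), ("B", 322),
        ("C", 322), ("C", 200), ("D", 322), ("D", 234),
        ("E", 234)]  # hypothetical extra row, not in the question

QUERY = """
WITH cte AS (
    -- rank 0 beats rank 1; other numbers map to NULL, which MIN ignores
    SELECT name, MIN(CASE number WHEN 200 THEN 0 WHEN 322 THEN 1 END) AS id
    FROM tablename
    GROUP BY name
)
SELECT name,
       CASE id WHEN 0 THEN 200     WHEN 1 THEN 322     END AS number,
       CASE id WHEN 0 THEN 'Apple' WHEN 1 THEN 'Mango' END AS label
FROM cte
ORDER BY name
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (name TEXT, number INTEGER)")
conn.executemany("INSERT INTO tablename VALUES (?, ?)", DATA)
labelled = conn.execute(QUERY).fetchall()

for row in labelled:
    print(row)
```

A name with neither number ends up with NULL in both columns; add an ELSE branch or a WHERE filter if such names should be handled differently.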
You can do something like this:
SELECT (CASE WHEN [Number]=200 THEN 'APPLE' WHEN [Number] =322 THEN 'MANGO' ELSE 'WHATEVER' END) [Label], [Number]
FROM [Yourtablename]
ORDER BY (CASE WHEN [Number]=200 THEN 2 WHEN [Number] =322 THEN 1 ELSE 0 END) DESC

Counting sum of items of type

What I want to do is go from this:
|type|quantity|
+----+--------+
|shoe| 10 |
|hat | 2 |
|shoe| 7 |
|shoe| 1 |
|hat | 5 |
to get this :
|shoes|hats|
+-----+----+
| 18 | 7 |
How can I do that? So far I haven't come up with a working query. I think it should look something like this:
SELECT
SUM(CASE type WHEN 'shoe' then quantity ELSE 0 END) AS "shoes",
SUM(CASE type WHEN 'hat' then quantity ELSE 0 END) AS "hats"
FROM items
GROUP BY type
Just drop the group by. You want only one row:
SELECT
SUM(CASE type WHEN 'shoe' then quantity ELSE 0 END) AS "shoes",
SUM(CASE type WHEN 'hat' then quantity ELSE 0 END) AS "hats"
FROM items ;
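To see why the GROUP BY gets in the way, here is a small sketch that runs the same select list with and without it on the question's data. It uses SQLite through Python's sqlite3 module, which is an assumption on my part since the question doesn't name a database; the `ORDER BY type` is added only to make the grouped output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (type TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("shoe", 10), ("hat", 2), ("shoe", 7), ("shoe", 1), ("hat", 5)],
)

SELECT_LIST = """
SELECT SUM(CASE type WHEN 'shoe' THEN quantity ELSE 0 END) AS shoes,
       SUM(CASE type WHEN 'hat'  THEN quantity ELSE 0 END) AS hats
FROM items
"""

# No GROUP BY: the aggregates run over the whole table, giving one row.
single_row = conn.execute(SELECT_LIST).fetchall()

# GROUP BY type: one row per type, each with a zero in the other column.
per_type = conn.execute(SELECT_LIST + " GROUP BY type ORDER BY type").fetchall()

print(single_row)
print(per_type)
```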

Division with Aggregate Functions in SQL Not Behaving as Expected

I'm trying to do some crosstabs in SQL Server 2008 R2. That part is alright, however, if I try to get percentages for each cell, I run into a problem.
Here is a distilled use case: A survey where people give their favorite color and their favorite fruit. I'd like to know how many like a given fruit AND a given color:
with survey as (
select 'banana' fav_fruit, 'yellow' fav_color
union select 'banana', 'red'
union select 'apple', 'yellow'
union select 'grape', 'red'
union select 'apple', 'blue'
union select 'orange', 'purple'
union select 'pomegranate', 'green'
)
select
s.fav_color,
sum(case
when s.fav_fruit = 'banana' then 1
else 0
end) as banana,
sum(case
when s.fav_fruit = 'banana' then 1
else 0
end) / sum(1) -- why does division always yield 0? "+", "-", and "*" all behave as expected.
* 100 as banana_pct,
sum(1) as total
from
survey s
group by
s.fav_color;
Results:
fav_color banana banana_pct total
------------------------------------
blue 0 0 1
green 0 0 1
purple 0 0 1
red 1 0 2
yellow 1 0 2
What I was expecting:
fav_color banana banana_pct total
------------------------------------
blue 0 0 1
green 0 0 1
purple 0 0 1
red 1 50 2
yellow 1 50 2
Please help me to get what I was expecting?
You are using SQL Server. Here is a much simpler example that replicates the issue:
select 1/2
SQL Server does integer division.
Replace the denominator with something like sum(1.0), sum(cast(1 as float)), or sum(1e0) instead of sum(1).
Contrary to my expectation at least, SQL Server stores numbers with decimal points as numeric/decimal type (see here) rather than float. The fixed number of decimal places might affect subsequent operations.
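The effect is easy to reproduce outside SQL Server. The sketch below uses SQLite through Python's sqlite3 module as an assumed stand-in: it also truncates integer division, although its decimal literals are floats rather than SQL Server's numeric/decimal type. The `TEMPLATE` string and the two result names are mine, for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer / integer truncates; a decimal operand switches to real division.
basics = conn.execute("SELECT 1 / 2, 1 / 2.0, 100 * 1 / 2").fetchone()
print(basics)

conn.execute("CREATE TABLE survey (fav_fruit TEXT, fav_color TEXT)")
conn.executemany("INSERT INTO survey VALUES (?, ?)", [
    ("banana", "yellow"), ("banana", "red"), ("apple", "yellow"),
    ("grape", "red"), ("apple", "blue"), ("orange", "purple"),
    ("pomegranate", "green"),
])

TEMPLATE = """
SELECT fav_color,
       SUM(CASE WHEN fav_fruit = 'banana' THEN 1 ELSE 0 END)
           / {denominator} * 100 AS banana_pct
FROM survey
GROUP BY fav_color
ORDER BY fav_color
"""

integer_pct = conn.execute(TEMPLATE.format(denominator="SUM(1)")).fetchall()
decimal_pct = conn.execute(TEMPLATE.format(denominator="SUM(1.0)")).fetchall()

print(integer_pct)  # every percentage is 0
print(decimal_pct)  # red and yellow come out as 50.0
```

Multiplying by 100 before dividing (the third expression in the first query) also avoids the zero while staying in integers, at the cost of truncated percentages.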
Query:
SQL Fiddle example
SELECT s.fav_color,
sum( CASE WHEN s.fav_fruit = 'banana' THEN 1 ELSE 0 END ) AS banana,
sum( CASE WHEN s.fav_fruit = 'banana' THEN 1 ELSE 0 END) / sum(1.00) -- decimal denominator avoids integer division
* 100 AS banana_pct,
sum(1) AS total
FROM survey s
GROUP BY s.fav_color
Result:
| FAV_COLOR | BANANA | BANANA_PCT | TOTAL |
-------------------------------------------
| blue | 0 | 0 | 1 |
| green | 0 | 0 | 1 |
| purple | 0 | 0 | 1 |
| red | 1 | 50 | 2 |
| yellow | 1 | 50 | 2 |
I've recently discovered the IIF function (SQL Server 2012 and later, so not available on 2008 R2). It makes things much cleaner. Taking Justin's example from above:
SELECT s.fav_color,
sum( IIF(s.fav_fruit = 'banana', 1, 0) ) AS banana,
sum( IIF(s.fav_fruit = 'banana', 1, 0) ) / sum(1.00)
* 100 AS banana_pct,
sum(1) AS total
FROM survey s
GROUP BY s.fav_color

How can I turn a bunch of rows into aggregated columns WITHOUT using pivot in SQL Server 2005?

Here is the scenario:
I have a table that records the user_id, the module_id, and the date/time the module was viewed.
eg.
Table: Log
------------------------------
User_ID Module_ID Date
------------------------------
1 red 2001-01-01
1 green 2001-01-02
1 blue 2001-01-03
2 green 2001-01-04
2 blue 2001-01-05
1 red 2001-01-06
1 blue 2001-01-07
3 blue 2001-01-08
3 green 2001-01-09
3 red 2001-01-10
3 green 2001-01-11
4 white 2001-01-12
I need to get a result set that has the user_id as the 1st column, and then a column for each module. The row data is then the user_id and the count of the number of times that user viewed each module.
eg.
---------------------------------
User_ID red green blue white
---------------------------------
1 2 1 2 0
2 0 1 1 0
3 1 2 1 0
4 0 0 0 1
I was initially thinking that I could do this with PIVOT, but no dice; the database is a converted SQL Server 2000 DB that is running in SQL Server 2005. I'm not able to change the compatibility level, so pivot is out.
The other catch is that the modules will vary, and it isn't feasible to re-write the query every time a module is added or removed. This means that I can't hard-code in the modules because I don't know in advance which will and will not be installed.
How can I accomplish this?
PIVOT can be simulated with CASE and GROUP BY
select
[user_id],
sum(case when [Module_ID] = 'red' then 1 else 0 end) as red,
sum(case when [Module_ID] = 'green' then 1 else 0 end) as green,
sum(case when [Module_ID] = 'blue' then 1 else 0 end) as blue,
sum(case when [Module_ID] = 'white' then 1 else 0 end) as white
from [log]
group by
[user_id]
Of course this doesn't work if the modules vary (as stated in the question) but then, PIVOT has the same problem.
Dynamically generating some sql overcomes this problem but this solution smells a bit!
declare @sql nvarchar(max)
set @sql = '
select
[user_id],'
select @sql = @sql + '
sum(case when [Module_ID] = ''' + replace([Module_ID], '''','''''') + ''' then 1 else 0 end) as [' + replace([Module_ID], '''','') + '],'
from (select distinct [Module_ID] from [log]) as moduleids
set @sql = substring(@sql,1,len(@sql)-1) + '
from [log]
group by
[user_id]
'
print @sql
exec sp_executesql @sql
Note that this may be vulnerable to sql-injection if the module id data can't be trusted.
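If the query is generated from application code instead, the module values can be passed as bound parameters, so only the column aliases need quoting. Below is a rough sketch of that idea in Python with sqlite3. SQLite and the helper names `quote_identifier` and `build_pivot` are my assumptions for illustration, not part of the T-SQL above.

```python
import sqlite3

LOG_ROWS = [
    (1, "red", "2001-01-01"), (1, "green", "2001-01-02"), (1, "blue", "2001-01-03"),
    (2, "green", "2001-01-04"), (2, "blue", "2001-01-05"), (1, "red", "2001-01-06"),
    (1, "blue", "2001-01-07"), (3, "blue", "2001-01-08"), (3, "green", "2001-01-09"),
    (3, "red", "2001-01-10"), (3, "green", "2001-01-11"), (4, "white", "2001-01-12"),
]


def quote_identifier(name):
    """Quote a column alias: wrap in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_pivot(conn):
    """Return (sql, params) for a pivot with one column per distinct module."""
    modules = [m for (m,) in conn.execute(
        "SELECT DISTINCT Module_ID FROM log ORDER BY Module_ID")]
    select_list = ["User_ID"] + [
        # The module value is bound as a parameter; only the alias is inlined.
        "SUM(CASE WHEN Module_ID = ? THEN 1 ELSE 0 END) AS " + quote_identifier(m)
        for m in modules
    ]
    sql = ("SELECT " + ", ".join(select_list)
           + " FROM log GROUP BY User_ID ORDER BY User_ID")
    return sql, modules


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (User_ID INTEGER, Module_ID TEXT, Date TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", LOG_ROWS)

sql, params = build_pivot(conn)
pivot = conn.execute(sql, params).fetchall()

print(params)  # column order after User_ID
for row in pivot:
    print(row)
```

The columns come out in alphabetical module order here (blue, green, red, white) because of the ORDER BY on the distinct list; the generated query picks up added or removed modules the next time it is built.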
SELECT User_ID, MAX(red) AS red, MAX(green) AS green, MAX(blue) AS blue,
MAX(white) AS white FROM
((SELECT User_ID, COUNT(Module_ID) AS red, 0 AS green, 0 AS blue,
0 AS white
FROM log
WHERE Module_ID = 'red'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, COUNT(Module_ID) AS green, 0 AS blue,
0 AS white
FROM log
WHERE Module_ID = 'green'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, COUNT(Module_ID) AS blue,
0 AS white
FROM log
WHERE Module_ID = 'blue'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, 0 AS blue,
COUNT(Module_ID) AS white
FROM log
WHERE Module_ID = 'white'
GROUP BY User_ID)) AS per_module
GROUP BY User_ID
ORDER BY User_ID
Using MySQL I did this:
Copied your data into Log_Table.sql
create table Log (User_ID mediumint, Module_ID CHAR(5), dte CHAR(10));
load data infile 'Log_Table.sql' INTO TABLE Log FIELDS TERMINATED BY ',';
Pivot:
select User_ID AS 'USER', sum(case
Module_ID WHEN 'red' then 1 else 0
END) AS 'red',
sum(case Module_ID WHEN 'green' then 1
else 0 END) AS 'green',
sum(case Module_ID WHEN 'blue' then 1
else 0 END) AS 'blue',
sum(case Module_ID WHEN 'white' then 1
else 0 END) AS 'white'
from Log
Group By User_ID;
> +------+------+-------+------+-------+
> | USER | red | green | blue | white |
> +------+------+-------+------+-------+
> | 1 | 2 | 1 | 2 | 0 |
> | 2 | 0 | 1 | 1 | 0 |
> | 3 | 1 | 2 | 1 | 0 |
> | 4 | 0 | 0 | 0 | 1 |
> +------+------+-------+------+-------+
> 4 rows in set (0.00 sec)
Hope this helps.
I believe characteristic functions are what you want.