Count Based on Columns in SQL Server - sql

I have 3 tables:
SELECT id, letter
FROM As
+--------+--------+
| id | letter |
+--------+--------+
| 1 | A |
| 2 | B |
+--------+--------+
SELECT id, letter
FROM Xs
+--------+------------+
| id | letter |
+--------+------------+
| 1 | X |
| 2 | Y |
| 3 | Z |
+--------+------------+
SELECT id, As_id, Xs_id
FROM A_X
+--------+-------+-------+
| id | As_id | Xs_id |
+--------+-------+-------+
| 9 | 1 | 1 |
| 10 | 1 | 2 |
| 11 | 2 | 3 |
| 12 | 1 | 2 |
| 13 | 2 | 3 |
| 14 | 1 | 1 |
+--------+-------+-------+
I can count all As and Bs with group by. But I want to count As and Bs based on X,Y and Z. What I want to get is below:
+-------+
| X,Y,Z |
+-------+
| 2,2,0 |
| 0,0,2 |
+-------+
X,Y,Z
A 2,2,0
B 0,0,2
What is the best way to do this at MSSQL? Is it an efficent way to use foreach for example?
edit: It is not a duplicate because I just wanted to know the efficent way not any way.

For what you're trying to do without knowing what is inefficient with your current code (because none was provided), a Pivot is best. There are a million resources online and here in the stack overflow Q/A forums to find what you need. This is probably the simplest explanation of a Pivot which I frequently need to remind myself of the complicated syntax of a pivot.
To specifically answer your question, this is the code that shows how the link above applies to your question
First Tables needed to be created
DECLARE #AS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE #XS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE #XA AS TABLE (ID INT, AsID INT, XsID INT)
Values were added to the tables
INSERT INTO #AS (ID, Letter)
SELECT 1,'A'
UNION
SELECT 2,'B'
INSERT INTO #XS (ID, Letter)
SELECT 1,'X'
UNION
SELECT 2,'Y'
UNION
SELECT 3,'Z'
INSERT INTO #XA (ID, ASID, XSID)
SELECT 9,1,1
UNION
SELECT 10,1,2
UNION
SELECT 11,2,3
UNION
SELECT 12,1,2
UNION
SELECT 13,2,3
UNION
SELECT 14,1,1
Then the query which does the pivot is constructed:
SELECT LetterA, [X],[Y],[Z]
FROM (SELECT A.LETTER AS LetterA
,B.LETTER AS LetterX
,C.ID
FROM #XA C
JOIN #AS A
ON A.ID = C.ASID
JOIN #XS B
ON B.ID = C.XSID
) Src
PIVOT (COUNT(ID)
FOR LetterX IN ([X],[Y],[Z])
) AS PVT
When executed, your results are as follows:
Letter X Y Z
A 2 2 0
B 0 0 2

As i said in comment ... just join and do simple pivot
if object_id('tempdb..#AAs') is not null drop table #AAs
create table #AAs(id int, letter nvarchar(5))
if object_id('tempdb..#XXs') is not null drop table #XXs
create table #XXs(id int, letter nvarchar(5))
if object_id('tempdb..#A_X') is not null drop table #A_X
create table #A_X(id int, AAs int, XXs int)
insert into #AAs (id, letter) values (1, 'A'), (2, 'B')
insert into #XXs (id, letter) values (1, 'X'), (2, 'Y'), (3, 'Z')
insert into #A_X (id, AAs, XXs)
values (9, 1, 1),
(10, 1, 2),
(11, 2, 3),
(12, 1, 2),
(13, 2, 3),
(14, 1, 1)
select LetterA,
ISNULL([X], 0) [X],
ISNULL([Y], 0) [Y],
ISNULL([Z], 0) [Z]
from (
select distinct a.letter [LetterA], x.letter [LetterX],
count(*) over (partition by a.letter, x.letter order by a.letter) [Counted]
from #A_X ax
join #AAs A on ax.AAs = A.ID
join #XXs X on ax.XXs = X.ID
)src
PIVOT
(
MAX ([Counted]) for LetterX in ([X], [Y], [Z])
) piv
You get result as you asked for
LetterA X Y Z
A 2 2 0
B 0 0 2

Related

SQL Server query for multiple conditions on the same column

Here's the schema and data that i am working with
CREATE TABLE tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into tbl values('a', 35);
insert into tbl values('a', 36);
insert into tbl values('b', 35);
insert into tbl values('c', 36);
insert into tbl values('d', 37);
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
| b | 35 |
| c | 36 |
| d | 37 |
now i need names of only those that are having group greater than or equal to 35
but also an additional is that i can only include a row for which group=35 when a corresponding groups=36 is also present
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
second condition is that it CAN include those names that are having groups greater than or equal to 36 without having a groups=35
| name | groups|
|------|-------|
| c | 36 |
| d | 37 |
the only case it should leave out is where a record has only groups=35 present without a corresponding groups=36
| name | groups|
|------|-------|
| b | 35 |
i have tried the following
select name from tbl
where groups>=35
group by name
having count(distinct(groups))>=2
or groups>=36;
this is the error i am facing Column 'tbl.groups' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
Try this:
DECLARE #tbl table ( [name] varchar(20) not null, groups int NOT NULL );
INSERT INTO #tbl VALUES
('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
DECLARE #group int = 35;
; WITH cte AS (
SELECT
[name]
, COUNT ( DISTINCT groups ) AS distinct_group_count
FROM #tbl
WHERE
groups >= #group
GROUP BY
[name]
)
SELECT t.* FROM #tbl AS t
INNER JOIN cte
ON t.[name] = cte.[name]
WHERE
cte.distinct_group_count > 1
OR t.groups > #group;
RETURNS
+------+--------+
| name | groups |
+------+--------+
| a | 35 |
| a | 36 |
| c | 36 |
| d | 37 |
+------+--------+
Basically, this restricts the name results to groups with a value >= 35 with more than one distinct group associated, or any name with a group value greater than 35. Several assumptions were made in regard to your data, but I believe the logic still applies.
So, as far as i can tell you just want to limit where groups 35 is by itself. I thought, lets try and isolate those names where they only have groups=35 and then not exists from there. Is this the correct output youre after?
Also, using complicated OR's in the where clause will often lead to your query not being SARGable. Better to UNION or some how building the query so that each part can use indexes (if they can).
if object_id('tempdb..#tbl') is not null drop table #tbl;
CREATE TABLE #tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into #tbl values('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
select *
from #tbl tbl
WHERE NOT EXISTS
(
SELECT COUNT(groups), name
FROM #tbl t
WHERE EXISTS
(
SELECT name
FROM #tbl tb
WHERE groups = 35
and tb.name=t.name
)
AND t.name = tbl.name
GROUP BY name
HAVING COUNT(groups)=1
)
;
It looks like you need an exists() condition. Try:
select *
from tbl t
where t.groups >= 35
and (
t.groups > 35
or exists(select * from tbl t2 where t2.name = t.name and t2.groups = 36)
)
There are other ways to arrange the where clause to achieve the same effect. Having the t.groups >= 35 condition up front should give the query optimizer the ability to leverage an index on groups.
You can use a windowed count for this
This avoids joining the table multiple times
SELECT
name,
groups
FROM (
SELECT *,
Count36 = COUNT(CASE WHEN groups = 36 THEN 1 END) OVER (PARTITION BY name)
FROM tbl
WHERE groups >= 35
) tbl
WHERE groups >= 36 OR Count36 > 0;
db<>fiddle

Roll up multiple rows into one when joining in SQL Server

I have a table, Foo
ID | Name
-----------
1 | ONE
2 | TWO
3 | THREE
And another, Bar:
ID | FooID | Value
------------------
1 | 1 | Alpha
2 | 1 | Alpha
3 | 1 | Alpha
4 | 2 | Beta
5 | 2 | Gamma
6 | 2 | Beta
7 | 3 | Delta
8 | 3 | Delta
9 | 3 | Delta
I would like a query that joins these tables, returning one row for each row in Foo, rolling up the 'value' column from Bar. I can get back the first Bar.Value for each FooID:
SELECT * FROM Foo f OUTER APPLY
(
SELECT TOP 1 Value FROM Bar WHERE FooId = f.ID
) AS b
Giving:
ID | Name | Value
---------------------
1 | ONE | Alpha
2 | TWO | Beta
3 | THREE | Delta
But that's not what I want, and I haven't been able to find a variant that will bring back a rolled up value, that is the single Bar.Value if it is the same for each corresponding Foo, or a static string something like '(multiple)' if not:
ID | Name | Value
---------------------
1 | ONE | Alpha
2 | TWO | (multiple)
3 | THREE | Delta
I have found some solutions that would bring back concatenated values (albeit not very elegant) 'Alpha' Alpha, Alpha', 'Beta, Gamma, Beta' &c, but that's not what I want either.
One method, using a a CASE expression and assuming that [Value] cannot have a value of NULL:
WITH Foo AS
(SELECT *
FROM (VALUES (1, 'ONE'),
(2, 'TWO'),
(3, 'THREE')) V (ID, [Name])),
Bar AS
(SELECT *
FROM (VALUES (1, 1, 'Alpha'),
(2, 1, 'Alpha'),
(3, 1, 'Alpha'),
(4, 2, 'Beta'),
(5, 2, 'Gamma'),
(6, 2, 'Beta'),
(7, 3, 'Delta'),
(8, 3, 'Delta'),
(9, 3, 'Delta')) V (ID, FooID, [Value]))
SELECT F.ID,
F.[Name],
CASE COUNT(DISTINCT B.[Value]) WHEN 1 THEN MAX(B.Value) ELSE '(Multiple)' END AS [Value]
FROM Foo F
JOIN Bar B ON F.ID = B.FooID
GROUP BY F.ID,
F.[Name];
You can also try below:
SELECT F.ID, F.Name, (case when B.Value like '%,%' then '(Multiple)' else B.Value end) as Value
FROM Foo F
outer apply
(
select SUBSTRING((
SELECT distinct ', '+ isnull(Value,',') FROM Bar WHERE FooId = F.ID
FOR XML PATH('')
), 2 , 9999) as Value
) as B

SELECT check the colum of the max row

Here my row with my first select:
SELECT
user.id, analytic_youtube_demographic.age,
analytic_youtube_demographic.percent
FROM
`user`
INNER JOIN
analytic ON analytic.user_id = user.id
INNER JOIN
analytic_youtube_demographic ON analytic_youtube_demographic.analytic_id = analytic.id
Result:
---------------------------
| id | Age | Percent |
|--------------------------
| 1 |13-17| 19,6 |
| 1 |18-24| 38.4 |
| 1 |25-34| 22.5 |
| 1 |35-44| 11.5 |
| 1 |45-54| 5.3 |
| 1 |55-64| 1.6 |
| 1 |65+ | 1.2 |
| 2 |13-17| 10 |
| 2 |18-24| 10 |
| 2 |25-34| 25 |
| 2 |35-44| 5 |
| 2 |45-54| 25 |
| 2 |55-64| 5 |
| 1 |65+ | 20 |
---------------------------
The max value by user_id:
---------------------------
| id | Age | Percent |
|--------------------------
| 1 |18-24| 38.4 |
| 2 |45-54| 25 |
| 2 |25-34| 25 |
---------------------------
And I need to filter Age in ['25-34', '65+']
I must have at the end :
-----------
| id |
|----------
| 2 |
-----------
Thanks a lot for your help.
Have tried to use MAX(analytic_youtube_demographic.percent). But I don't know how to filter with the age too.
Thanks a lot for your help.
You can use the rank() function to identify the largest percentage values within each user's data set, and then a simple WHERE clause to get those entries that are both of the highest rank and belong to one of the specific demographics you're interested in. Since you can't use windowed functions like rank() in a WHERE clause, this is a two-step process with a subquery or a CTE. Something like this ought to do it:
-- Sample data from the question:
create table [user] (id bigint);
insert [user] values
(1), (2);
create table analytic (id bigint, [user_id] bigint);
insert analytic values
(1, 1), (2, 2);
create table analytic_youtube_demographic (analytic_id bigint, age varchar(32), [percent] decimal(5, 2));
insert analytic_youtube_demographic values
(1, '13-17', 19.6),
(1, '18-24', 38.4),
(1, '25-34', 22.5),
(1, '35-44', 11.5),
(1, '45-54', 5.3),
(1, '55-64', 1.6),
(1, '65+', 1.2),
(2, '13-17', 10),
(2, '18-24', 10),
(2, '25-34', 25),
(2, '35-44', 5),
(2, '45-54', 25),
(2, '55-64', 5),
(2, '65+', 20);
-- First, within the set of records for each user.id, use the rank() function to
-- identify the demographics with the highest percentage.
with RankedDataCTE as
(
select
[user].id,
youtube.age,
youtube.[percent],
[rank] = rank() over (partition by [user].id order by youtube.[percent] desc)
from
[user]
inner join analytic on analytic.[user_id] = [user].id
inner join analytic_youtube_demographic youtube on youtube.analytic_id = analytic.id
)
-- Now select only those records that are (a) of the highest rank within their
-- user.id and (b) either the '25-34' or the '65+' age group.
select
id,
age,
[percent]
from
RankedDataCTE
where
[rank] = 1 and
age in ('25-34', '65+');

SQL: Pick highest and lowest value (int) from one row

I am looking for a way to pick the highest and lowest value (integer) from a single row in table. There are 4 columns that i need to compare together and get highest and lowest number there is.
The table looks something like this...
id | name | col_to_compare1 | col_to_compare2 | col_to_compare3 | col_to_compare4
1 | John | 5 | 5 | 2 | 1
2 | Peter | 3 | 2 | 4 | 1
3 | Josh | 3 | 5 | 1 | 3
Can you help me, please? Thanks!
You can do this using CROSS APPLY and the VALUES clause. Use VALUES to group all your compared columns and then select the max.
SELECT
MAX(d.data1) as MaxOfColumns
,MIN(d.data1) as MinOfColumns
,a.id
,a.name
FROM YOURTABLE as a
CROSS APPLY (
VALUES(a.col_to_compare1)
,(a.col_to_compare2)
,(a. col_to_compare3)
,(a.col_to_compare4)
,(a. col_to_compare5)
) as d(data1) --Name the Column
GROUP BY a.id
,a.name
Assuming you are looking for min/max per row
Declare #YourTable table (id int,name varchar(50),col_to_compare1 int,col_to_compare2 int,col_to_compare3 int,col_to_compare4 int)
Insert Into #YourTable values
(1,'John',5,5,2,1),
(2,'Peter',3,2,4,1),
(3,'Josh',3,5,1,3)
Select A.ID
,A.Name
,MinVal = min(B.N)
,MaxVal = max(B.N)
From #YourTable A
Cross Apply (Select N From (values(a.col_to_compare1),(a.col_to_compare2),(a.col_to_compare3),(a.col_to_compare4)) N(N) ) B
Group By A.ID,A.Name
Returns
ID Name MinVal MaxVal
1 John 1 5
3 Josh 1 5
2 Peter 1 4
These solutions keep the current rows and add additional columns of min/max.
select *
from t cross apply
(select min(col) as min_col
,max(col) as max_col
from (
values
(t.col_to_compare1)
,(t.col_to_compare2)
,(t.col_to_compare3)
,(t.col_to_compare4)
) c(col)
) c
OR
select *
,cast ('' as xml).value ('min ((sql:column("t.col_to_compare1"),sql:column("t.col_to_compare2"),sql:column("t.col_to_compare3"),sql:column("t.col_to_compare4")))','int') as min_col
,cast ('' as xml).value ('max ((sql:column("t.col_to_compare1"),sql:column("t.col_to_compare2"),sql:column("t.col_to_compare3"),sql:column("t.col_to_compare4")))','int') as max_col
from t
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| id | name | col_to_compare1 | col_to_compare2 | col_to_compare3 | col_to_compare4 | min_col | max_col |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 1 | John | 5 | 5 | 2 | 1 | 1 | 5 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 2 | Peter | 3 | 2 | 4 | 1 | 1 | 4 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
| 3 | Josh | 3 | 5 | 1 | 3 | 1 | 5 |
+----+-------+-----------------+-----------------+-----------------+-----------------+---------+---------+
A way to do this is to "break" apart the data
declare #table table (id int, name varchar(10), col1 int, col2 int, col3 int, col4 int)
insert into #table values (1 , 'John' , 5 , 5 , 2 , 1)
insert into #table values (2 , 'Peter' , 3 , 2 , 4 , 1)
insert into #table values (3 , 'Josh' , 3 , 5 , 1 , 3)
;with stretch as
(
select id, col1 as col from #table
union all
select id, col2 as col from #table
union all
select id, col3 as col from #table
union all
select id, col4 as col from #table
)
select
t.id,
t.name,
agg.MinCol,
agg.MaxCol
from #table t
inner join
(
select
id, min(col) as MinCol, max(col) as MaxCol
from stretch
group by id
) agg
on t.id = agg.id
Seems simple enough
SELECT min(col1), max(col1), min(col2), max(col2), min(col3), max(col3), min(col4), max(col4) FROM table
Gives you the Min and Max for each column.
Following OP's comment, I believe he may be looking for a min/max grouped by the person being queried against.
So that would be:
SELECT name, min(col1), max(col1), min(col2), max(col2), min(col3), max(col3), min(col4), max(col4) FROM table GROUP BY name

How to convert all columns value to rows and then check for a particular column?

i have a table like this:
student | group
1 | A
2 | B
1 | B
3 | C
i want to produce the output like the following:
Student | Group_A | Group_B | Group_C
1 | Yes | Yes |
2 | | Yes |
3 | | | Yes
Do anybody have any idea how can I produce this type of report? I tried several ways using Pivot and Unpivot but it's not working here
I believe you are looking for pivot.
http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
declare #students table(id int, type_ nvarchar)
insert into #students
values(1, 'A'),
(2, 'B'),
(3, 'C'),
(1, 'B');
select result.id Student,
iif(result.A = 1, 'Yes', 'No') Group_A,
iif(result.B = 1, 'Yes', 'No') Group_B,
iif(result.C = 1, 'Yes', 'No') Group_C
from (
select *
from #students a
pivot(count(type_) for type_ in (
[A],
[B],
[C]
)) as pivotExample
) result;
If a dynamic pivot is required, I would refer you to this:
SQL Server dynamic PIVOT query?