Count Unique Identifiers by a result in another column - sql

I am trying to return some values in SQL. Call column A UniqueIdentifier. A second column, action, lists one of two actions, a or b, that can be performed on that identifier.
The same action can occur more than once per unique identifier, so the table contains many rows with a duplicated UniqueIdentifier.
How do I get the count of UniqueIdentifier values where action a has been performed and the count where action a has not?
I did come up with a long and cumbersome way (listed below) using temp tables, but I feel like there must be a more direct way.
select
UniqueIdentifier
,case when action = 'a' then 1 else 0 end [actionflag]
into #actionflags
from mydatabase
select
UniqueIdentifier
,sum(actionflag) [actionflag]
into #actionflagscount
from #actionflags
group by UniqueIdentifier
select case when actionflag > 0 then 1 else 0 end [actionflag]
,count(uniqueidentifier)
from #actionflagscount
group by case when actionflag > 0 then 1 else 0 end

Do you want DISTINCT ?
select case when actionflag > 0 then 1 else 0 end [actionflag],
count(distinct uniqueidentifier)
from #actionflagscount
group by case when actionflag > 0 then 1 else 0 end

This is what I came up with
DROP TABLE IF EXISTS #temp
CREATE TABLE #temp ([UniqueIdentifier] INT, Action NVARCHAR(10))
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (1, 'a')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (2, 'a')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (3, 'a')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (4, 'b')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (5, 'b')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (6, 'c')
INSERT INTO #temp ([UniqueIdentifier], Action) VALUES (7, 'c')
SELECT CASE WHEN Action = 'a' THEN 1 ELSE 0 END AS ActionFlag, COUNT(*)
FROM #temp
GROUP BY CASE WHEN Action = 'a' THEN 1 ELSE 0 END
-- Results:
-- ActionFlag (No column name)
-- 0 4
-- 1 3

Considering you may have:
The same action performed several times on the same UniqueIdentifier
Different actions performed on the same UniqueIdentifier
I would do:
SELECT DISTINCT UniqueIdentifier, CASE WHEN EXISTS (SELECT 1 FROM MyDatabase WHERE action = 'a' and UniqueIdentifier = MDB.UniqueIdentifier) THEN 1 ELSE 0 END AS ActionPerformed
FROM MyDatabase MDB
If this looks right, wrapping the previous query in a simple COUNT will do the trick.
SELECT Count(*), ActionPerformed
FROM (
SELECT DISTINCT UniqueIdentifier, CASE WHEN EXISTS (SELECT 1 FROM MyDatabase WHERE action = 'a' and UniqueIdentifier = MDB.UniqueIdentifier) THEN 1 ELSE 0 END AS ActionPerformed
FROM MyDatabase MDB
) T
GROUP BY ActionPerformed
ActionPerformed = 1 means action a was performed (possibly action b was performed too on the same UniqueIdentifier). ActionPerformed = 0 means action a was not performed (but it does not say anything about action b)

I would use two levels of aggregation:
select sum(case when num_as > 0 then 1 else 0 end) as num_with_as,
sum(case when num_as = 0 then 1 else 0 end) as num_without_as
from (select uniqueidentifier,
sum(case when action = 'a' then 1 else 0 end) as num_as
from mydatabase
group by uniqueidentifier
) d;
The inner query counts the number of "a"s per uniqueidentifier. The outer query uses this to get the two values you want.
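The two-level aggregation above can be tried out without a SQL Server instance. The following is only a sketch: it runs the same query shape against an in-memory SQLite database through Python's sqlite3 module, and the sample rows are made up for illustration (identifiers 1 and 2 have action a, identifiers 3 and 4 only have b).

```python
import sqlite3

# Hypothetical sample data, not from the question:
# id 1 has 'a' twice, id 2 has 'a' and 'b', ids 3 and 4 only have 'b'.
sample_rows = [(1, "a"), (1, "a"), (2, "a"), (2, "b"), (3, "b"), (3, "b"), (4, "b")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydatabase (uniqueidentifier INTEGER, action TEXT)")
conn.executemany("INSERT INTO mydatabase VALUES (?, ?)", sample_rows)

# Inner query: one row per identifier with the number of 'a' actions.
# Outer query: count identifiers with at least one 'a' and with none.
query = """
SELECT SUM(CASE WHEN num_as > 0 THEN 1 ELSE 0 END) AS num_with_as,
       SUM(CASE WHEN num_as = 0 THEN 1 ELSE 0 END) AS num_without_as
FROM (SELECT uniqueidentifier,
             SUM(CASE WHEN action = 'a' THEN 1 ELSE 0 END) AS num_as
      FROM mydatabase
      GROUP BY uniqueidentifier) AS d
"""
num_with_as, num_without_as = conn.execute(query).fetchone()
print(num_with_as, num_without_as)  # 2 2
```

Because the inner query already collapses duplicates to one row per identifier, no DISTINCT is needed in the outer query.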

Related

How to INSERT into 1 table and UPDATE another table in one query

I have a question on how to write a single query to insert and update. Below is the scenario. I am trying to use 1 query for the part that is enclosed in (-----)
CREATE TABLE #TEMP
(
Ref VARCHAR(10),
Num INT,
[Status] VARCHAR(3)
)
INSERT INTO #TEMP
VALUES ('A123', 1, 'A3'), ('A123', 2, 'A3'), ('A123', 3, 'A3'),
('B123', 1, 'A1'), ('B123', 2, 'A3'),
('C123', 1, 'A1'), ('C123', 2, 'A2'), ('C123', 3, 'A3');
SELECT
Ref,
CASE WHEN A.TotalCount = A.DenialCount THEN 1 ELSE 0 END IsDenial
--CASE WHEN A.TotalCount <> A.DenialCount Then 1 else 0 end IsApproval
INTO
#TEMP1
FROM
(SELECT
Ref, COUNT(Num) TotalCount,
SUM(CASE WHEN [Status] = 'A1' THEN 1 ELSE 0 END) ApprovedCount,
SUM(CASE WHEN [Status] = 'A2' THEN 1 ELSE 0 END) PartialApprovalCount,
SUM(CASE WHEN [Status] = 'A3' THEN 1 ELSE 0 END) DenialCount
FROM
#temp
GROUP BY
Ref) A
UPDATE A
SET A.[Status] = CASE WHEN IsDenial = 1 THEN 'A3' ELSE 'A1' END
FROM #TEMP A
JOIN #TEMP1 B ON A.Ref = B.Ref
SELECT * FROM #TEMP
SELECT * FROM #TEMP1
DROP TABLE #TEMP
DROP TABLE #TEMP1
Any help would be appreciated.
"INSERT into 1 table and UPDATE another table in one query"
Nope. Some DBMSes support the idea of 'upsert' but that's insert/update in a single table.
You're looking for the MERGE statement. However, I see several issues with the SQL in your post. In short, it is generally more efficient to think in sets than to optimise statement by statement.
Rather than updating, why not join the data that's inserted into temp into the second query and produce the result you require?
hint ' SELECT 'ABC' as a, '123' as b, 456 as c UNION '
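One way to read the "join instead of update" suggestion is sketched below. It is an assumption about what the commenter had in mind, not their code: the per-Ref counts are computed in a derived table and joined back to the rows, so the final status comes out of a single SELECT with no second temp table and no UPDATE. It uses SQLite through Python's sqlite3 module, and the table name ref_status is a hypothetical stand-in for #TEMP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ref_status is a hypothetical stand-in for the #TEMP table in the question.
conn.execute("CREATE TABLE ref_status (Ref TEXT, Num INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO ref_status VALUES (?, ?, ?)",
    [("A123", 1, "A3"), ("A123", 2, "A3"), ("A123", 3, "A3"),
     ("B123", 1, "A1"), ("B123", 2, "A3"),
     ("C123", 1, "A1"), ("C123", 2, "A2"), ("C123", 3, "A3")],
)

# A Ref counts as denied only when every one of its rows has status 'A3';
# the derived status is joined back onto each original row.
query = """
SELECT t.Ref, t.Num,
       CASE WHEN d.TotalCount = d.DenialCount THEN 'A3' ELSE 'A1' END AS Status
FROM ref_status AS t
JOIN (SELECT Ref,
             COUNT(*) AS TotalCount,
             SUM(CASE WHEN Status = 'A3' THEN 1 ELSE 0 END) AS DenialCount
      FROM ref_status
      GROUP BY Ref) AS d
  ON d.Ref = t.Ref
ORDER BY t.Ref, t.Num
"""
derived_status = conn.execute(query).fetchall()
for row in derived_status:
    print(row)
```

With the question's sample data only A123 keeps 'A3'; every row of B123 and C123 comes out as 'A1'.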

How can I simplify this Query? I need to compare a temp variable value with a column value of multiple rows

I need to compare a temp variable value with a column value of multiple rows and perform Operations based on that.
intSeqID | Value
---------|-------
1        | 779.40
2        | 357.38
3        | NULL
4        | NULL
5        | NULL
6        | NULL
7        | NULL
8        | NULL
9        | NULL
10       | NULL
DECLARE @tmpRange NUMERIC(5,2)
SELECT @tmpRange = 636
Here I need to compare the value of @tmpRange with Value from the table and perform operations based on the result.
IF (@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 1)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 2)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 3)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 9)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 10))
BEGIN
SELECT 'All'
END
ELSE IF (@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 1)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 2)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 3)) AND
(@tmpRange < (SELECT ISNULL(Value, 0) FROM #tableA WHERE intSeqID = 9))
BEGIN
SELECT '10'
END
How can I simplify this query to compare the values? Or is there any other way to pick the values of multiple rows and compare them with the temp variable?
Here is one fairly simple way to do it:
Create and populate sample table (Please save us this step in your future questions)
DECLARE @tableA as table
(
intSeqID int identity(1,1),
Value numeric(5,2)
)
INSERT INTO @tableA VALUES
(779.40),
(357.38),
(256.32),
(NULL)
Declare and populate the variable:
DECLARE @tmpRange numeric(5, 2) = 636
The query:
;WITH CTE AS
(
SELECT TOP 1 intSeqId
FROM @tableA
WHERE @tmpRange < ISNULL(Value, 0)
ORDER BY Value
)
SELECT CASE WHEN intSeqId =
(
SELECT TOP 1 intSeqId
FROM @tableA
ORDER BY ISNULL(Value, 0)
) THEN 'All'
ELSE CAST(intSeqId as varchar(3))
END
FROM CTE
Result: 1.
See a live demo on rextester.
We can try to refactor your query using aggregations. We almost get away with no subquery except for just one, which is needed to distinguish the two conditions.
SELECT
CASE WHEN SUM(CASE WHEN @tmpRange < Value THEN 1 ELSE 0 END) = 4 AND
@tmpRange < (SELECT Value FROM @tableA WHERE intSeqID = 10)
THEN 'All'
WHEN SUM(CASE WHEN @tmpRange < Value THEN 1 ELSE 0 END) = 4
THEN '10'
ELSE 'NONE' END AS label
FROM @tableA
WHERE intSeqID IN (1, 2, 3, 9)
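As a variant of the aggregation idea, the check on row 10 can be folded into the same pass, which removes the remaining subquery. This is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than T-SQL (so COALESCE stands in for ISNULL), the helper name classify is hypothetical, and the second and third data sets are invented to exercise each branch.

```python
import sqlite3

def classify(values, tmp_range):
    """Return 'All', '10' or 'NONE' for a dict of intSeqID -> Value (None = NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableA (intSeqID INTEGER PRIMARY KEY, Value REAL)")
    conn.executemany("INSERT INTO tableA VALUES (?, ?)", values.items())
    # First branch: all five checked rows exceed the threshold.
    # Second branch: rows 1, 2, 3 and 9 exceed it (row 10 is excluded from the count).
    query = """
    SELECT CASE
             WHEN SUM(CASE WHEN :r < COALESCE(Value, 0) THEN 1 ELSE 0 END) = 5
               THEN 'All'
             WHEN SUM(CASE WHEN intSeqID <> 10 AND :r < COALESCE(Value, 0)
                           THEN 1 ELSE 0 END) = 4
               THEN '10'
             ELSE 'NONE'
           END
    FROM tableA
    WHERE intSeqID IN (1, 2, 3, 9, 10)
    """
    (label,) = conn.execute(query, {"r": tmp_range}).fetchone()
    conn.close()
    return label

# Data from the question: only row 1 exceeds 636.
question_data = {1: 779.40, 2: 357.38, **{i: None for i in range(3, 11)}}
# Hypothetical data sets to exercise the other two branches.
all_above = {i: 700.0 for i in range(1, 11)}
all_but_ten = {**all_above, 10: None}

print(classify(question_data, 636))  # NONE
print(classify(all_above, 636))      # All
print(classify(all_but_ten, 636))    # 10
```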
You want to find the record with the biggest Value that is also smaller than your variable, correct?
--DECLARE @tableA TABLE (intSeqID tinyint, [Value] decimal(5,2))
--INSERT INTO @tableA SELECT 1, 400 UNION SELECT 2, 300 UNION SELECT 3, 200
--DECLARE @tmpRange decimal(5,2) = 250
SELECT TOP 1 *
FROM (
SELECT intSeqID
FROM (
SELECT TOP 1 CONCAT('', intSeqID) AS intSeqID -- Can't UNION int to varchar.
FROM @tableA
WHERE ISNULL([Value], 0) < @tmpRange
ORDER BY intSeqID ASC
) AS S -- TOP ... ORDER BY needs its own derived table before a UNION
UNION
SELECT 'All' AS [?]
) AS T
ORDER BY intSeqID ASC

SQL Many to Many relationship

I'm having difficulties writing a SQL query. This is the structure of 3 tables, table Race_ClassificationType is many-to-many table.
Table Race
----------------------------
RaceID
Name
Table Race_ClassificationType
----------------------------
Race_ClassificationTypeID
RaceID
RaceClassificationID
Table RaceClassificationType
----------------------------
RaceClassificationTypeID
Name
What I'm trying to do is get the races with certain classifications. The results are returned by a stored procedure that has a table-valued parameter which holds the desired classifications:
CREATE TYPE [dbo].[RaceClassificationTypeTable]
AS TABLE
(
RaceClassificationTypeID INT NULL
);
GO
CREATE PROCEDURE USP_GetRaceList
(@RaceClassificationTypeTable AS [RaceClassificationTypeTable] READONLY,
@RaceTypeID INT = NULL,
@IsCompleted BIT = NULL,
@MinDateTime DATETIME = NULL,
@MaxDateTime DATETIME = NULL,
@MaxRaces INT = NULL)
WITH RECOMPILE
AS
BEGIN
SET NOCOUNT ON;
SELECT DISTINCT
R.[RaceID]
,R.[RaceTypeID]
,R.[Name]
,R.[Abbreviation]
,R.[DateTime]
,R.[IsCompleted]
FROM [Race] R,[Race_ClassificationType] R_CT, [RaceClassificationType] RCT
WHERE (R.[RaceTypeID] = @RaceTypeID OR @RaceTypeID IS NULL)
AND (R.[IsCompleted] = @IsCompleted OR @IsCompleted IS NULL)
AND (R.[DateTime] >= @MinDateTime OR @MinDateTime IS NULL)
AND (R.[DateTime] <= @MaxDateTime OR @MaxDateTime IS NULL)
AND (R.RaceID = R_CT.RaceID)
AND (R_CT.RaceClassificationTypeID = RCT.RaceClassificationTypeID)
AND (RCT.RaceClassificationTypeID IN (SELECT DISTINCT T.RaceClassificationTypeID FROM @RaceClassificationTypeTable T))
ORDER BY [DateTime] DESC
OFFSET 0 ROWS FETCH NEXT @MaxRaces ROWS ONLY
END
GO
As it is, this stored procedure doesn't work correctly, because it returns all races that have at least one classification type ID in the table-valued parameter of classification type IDs (because of the IN clause). I want the stored procedure to return only races that have all the classifications supplied in the table-valued parameter.
Example:
RaceClassificationTypeID RaceID
3 92728
3 92729
8 92729
29 92729
12 92729
2 92729
3 92730
8 92730
8 92731
1 92731
RaceClassificationTypeIDs in RaceClassificationTypeTable parameter: 3 and 8
OUTPUT: all the races with RaceClassificationID 3 and 8 and optionally any other (2, 29, 12)
That means only races 92729 and 92730 should be returned. As it is, all the races in the example are returned.
I've set up two tables, one stores your result set and the other represents the values in the table valued parameter of your stored procedure. See below.
CREATE TABLE ABC
(
RCTID INT,
RID INT
)
INSERT INTO ABC VALUES (3,92728)
INSERT INTO ABC VALUES (3,92729)
INSERT INTO ABC VALUES (8,92729)
INSERT INTO ABC VALUES (29,92729)
INSERT INTO ABC VALUES (12,92729)
INSERT INTO ABC VALUES (2,92729)
INSERT INTO ABC VALUES (3,92730)
INSERT INTO ABC VALUES (8,92730)
INSERT INTO ABC VALUES (8,92731)
INSERT INTO ABC VALUES (1,92731)
GO
CREATE TABLE TABLEVALUEPARAMETER
(
VID INT
)
INSERT INTO TABLEVALUEPARAMETER VALUES (3)
INSERT INTO TABLEVALUEPARAMETER VALUES (8)
GO
SELECT RID FROM ABC WHERE RCTID IN (SELECT VID FROM TABLEVALUEPARAMETER) GROUP BY
RID HAVING COUNT(RID) = (SELECT COUNT(VID) FROM TABLEVALUEPARAMETER)
GO
If you run this on your machine you'll notice it produces the two IDs that you're after.
Because your stored procedure selects a lot of columns, it would be necessary to use a CTE (Common Table Expression). If you tried to group in the current select statement, you would have to group by all the columns, and you would then get duplication.
If the first CTE delivers the result set and you then use a version of the select above, you should be able to produce only the IDs you want.
If you don't know CTEs, let me know!
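The counting approach above can be combined with a join back to the Race table so that the full race rows come out without grouping by every column. The following is a sketch under assumptions, not the poster's procedure: it uses SQLite through Python's sqlite3 module, the table wanted is a hypothetical stand-in for the table-valued parameter, and the race names are invented. COUNT(DISTINCT ...) is used on both sides so that duplicate link rows or duplicate parameter values would not throw the comparison off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Race (RaceID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Race_ClassificationType (RaceClassificationTypeID INTEGER, RaceID INTEGER);
-- 'wanted' is a hypothetical stand-in for the table-valued parameter.
CREATE TABLE wanted (RaceClassificationTypeID INTEGER);
""")
# Race names are invented; the link rows are the example from the question.
conn.executemany("INSERT INTO Race VALUES (?, ?)",
                 [(92728, "Race A"), (92729, "Race B"),
                  (92730, "Race C"), (92731, "Race D")])
conn.executemany("INSERT INTO Race_ClassificationType VALUES (?, ?)",
                 [(3, 92728), (3, 92729), (8, 92729), (29, 92729), (12, 92729),
                  (2, 92729), (3, 92730), (8, 92730), (8, 92731), (1, 92731)])
conn.executemany("INSERT INTO wanted VALUES (?)", [(3,), (8,)])

# Keep only link rows for wanted types, then require that a race matched
# as many distinct types as the parameter table contains.
query = """
SELECT r.RaceID, r.Name
FROM Race AS r
JOIN (SELECT rct.RaceID
      FROM Race_ClassificationType AS rct
      WHERE rct.RaceClassificationTypeID IN
            (SELECT RaceClassificationTypeID FROM wanted)
      GROUP BY rct.RaceID
      HAVING COUNT(DISTINCT rct.RaceClassificationTypeID) =
             (SELECT COUNT(DISTINCT RaceClassificationTypeID) FROM wanted)) AS m
  ON m.RaceID = r.RaceID
ORDER BY r.RaceID
"""
matching_races = conn.execute(query).fetchall()
print(matching_races)  # [(92729, 'Race B'), (92730, 'Race C')]
```

The procedure's other optional filters (race type, completion flag, date range) would go in a WHERE clause on the outer query.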
This is an example of a "set-within-sets" subquery. One way to solve this is with aggregation and a having clause. Here is how you get the RaceIds:
select RaceID
from RaceClassification rc
group by RaceID
having sum(case when RaceClassificationTypeId = 3 then 1 else 0 end) > 0 and
sum(case when RaceClassificationTypeId = 8 then 1 else 0 end) > 0;
Each condition in the having clause counts how many rows have each type. Only races with each (because of the > 0) are kept.
You can get all the race information by using this as a subquery:
select r.*
from Races r join
(select RaceID
from RaceClassification rc
group by RaceID
having sum(case when RaceClassificationTypeId = 3 then 1 else 0 end) > 0 and
sum(case when RaceClassificationTypeId = 8 then 1 else 0 end) > 0
) rc
on r.RaceID = rc.RaceId;
Your stored procedure seems to have other conditions. These can also be added in.

simple insert procedure, check for duplicate

I am creating a program that is going to insert data into a table, which is pretty simple.
But my issue is that I want my insert statement to make sure it isn't inserting duplicate data.
I want to somehow check the table the data is going into to make sure that there isn't a row with the same indivualid, categoryid and value.
So if I am inserting
indivualid = 1
categorid = 1
value = 1
and in my table there is a row with
indivualid = 1
categorid = 1
value = 2
my data would still be inserted
but if there was a row with
indivualid = 1
categorid = 1
value = 1
then it wouldnt
I tried this
IF @value = 'Y'
OR @value = 'A'
OR @value = 'P'
AND NOT EXISTS
(SELECT categoryid,
individualid
FROM ualhistory
WHERE categoryid = @cat
AND individualid = @id)
INSERT INTO individuory(categoryid, individualid, value, ts)
VALUES (@cat,
@id,
@yesorno,
getdate())
but it still inserts duplicates.
You can do that in the following manner:
insert into
individuory(categoryid, individualid, value, ts)
select @cat, @id, @yesorno, getdate()
where not exists
(select 1 from individuory where categoryid=@cat and individualid=@id)
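Now, the exact problem with your approach is that the ORs are not grouped. AND binds more tightly than OR, so without parentheses the NOT EXISTS check is only combined with the last comparison (the 'P' check). When the value is 'Y' or 'A', the condition is true regardless of the duplicate check, and the row is inserted. You can change your statement to this: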
if ((@value = 'Y' or @value = 'A' or @value = 'P')
and not EXISTS
(SELECT categoryid, individualid FROM ualhistory WHERE categoryid = @cat
and individualid = @id) )
INSERT INTO individuory(categoryid, individualid, value, ts)
VALUES (@cat, @id, @yesorno, getdate())
And I think it will work also.
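The precedence point can be checked directly. This is a small sketch using SQLite through Python's sqlite3 module (SQLite follows the same rule that AND binds more tightly than OR); the helper evaluate and the :dup flag, which stands in for the result of the EXISTS subquery, are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(condition, value, already_exists):
    """Evaluate a boolean SQL condition; :dup stands in for the EXISTS result."""
    sql = f"SELECT CASE WHEN {condition} THEN 1 ELSE 0 END"
    params = {"value": value, "dup": int(already_exists)}
    (result,) = conn.execute(sql, params).fetchone()
    return bool(result)

# Same shape as the original IF: the duplicate check only guards the 'P' branch.
unparenthesized = ":value = 'Y' OR :value = 'A' OR :value = 'P' AND NOT :dup"
# Corrected shape: the duplicate check guards all three values.
parenthesized = "(:value = 'Y' OR :value = 'A' OR :value = 'P') AND NOT :dup"

print(evaluate(unparenthesized, "Y", True))  # True: would insert despite the duplicate
print(evaluate(parenthesized, "Y", True))    # False: duplicate blocks the insert
print(evaluate(unparenthesized, "P", True))  # False: only 'P' was ever guarded
```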
ALTER TABLE individuory
ADD CONSTRAINT myConstarint
UNIQUE (categoryid, individualid, value)
Add a UNIQUE constraint on (individualid, categoryid, value) and the server won't let you insert a duplicate row.
http://msdn.microsoft.com/en-us/library/ms189862.aspx
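To see the constraint reject the third insert from the question's scenario, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module, where the violation surfaces as sqlite3.IntegrityError; the helper try_insert is hypothetical, and SQL Server would raise its own constraint-violation error instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE individuory (
    categoryid   INTEGER,
    individualid INTEGER,
    value        TEXT,
    UNIQUE (categoryid, individualid, value)
)
""")

def try_insert(categoryid, individualid, value):
    """Insert one row; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO individuory (categoryid, individualid, value) "
                "VALUES (?, ?, ?)",
                (categoryid, individualid, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Same ids with a different value is allowed; an exact repeat is rejected.
outcomes = [try_insert(1, 1, "1"), try_insert(1, 1, "2"), try_insert(1, 1, "1")]
print(outcomes)  # [True, True, False]
```

Unlike an IF NOT EXISTS check in application code, the constraint is enforced by the database itself, so it also holds when two sessions insert at the same time.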

Query: find rows that do not belong to a list of values

Let's say I have a table 'Tab' which has a column 'Col'.
The table 'Tab' has this data -
Col
1
2
3
4
5
If I have a set of values (2,3,6,7), I can query the values that are present in both the table and the list by using the query
Select Col from Tab where col IN (2,3,6,7)
But what if I want to return the values in the list that are not present in the table, i.e. only (6,7) in this case? What query should I use?
The problem, I believe, is that you're trying to find values from your IN statement. What you need to do is turn your IN list into a table, and then you can determine which values are different.
create table #temp
(
id int
)
insert into #temp values (2)
insert into #temp values (3)
insert into #temp values (6)
insert into #temp values (7)
select
id
from
#temp
where
not exists (select 1 from Tab where Col = id)
A better alternative would be to create a table-valued function to turn your comma-delimited string into a table. I don't have any code handy, but it should be easy to find on Google. In that case you would only need to use the syntax below.
select
id
from
dbo.SplitStringToTable('2,3,6,7')
where
not exists (select 1 from Tab where Col = id)
Hope this helps
A SQL Server 2008 method
SELECT N FROM (VALUES(2),(3),(6),(7)) AS D (N)
EXCEPT
Select Col from Tab
Or SQL Server 2005
DECLARE @Values XML
SET @Values =
'<r>
<v>2</v>
<v>3</v>
<v>6</v>
<v>7</v>
</r>'
SELECT
vals.item.value('.[1]', 'INT') AS Val
FROM @Values.nodes('/r/v') vals(item)
EXCEPT
Select Col from Tab
One way would be to use a table variable:
DECLARE @t1 TABLE (i INT)
INSERT @t1 VALUES(2)
INSERT @t1 VALUES(3)
INSERT @t1 VALUES(6)
INSERT @t1 VALUES(7)
SELECT i FROM @t1 WHERE i NOT IN (Select Col from Tab)
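One caveat with the NOT IN form is worth a sketch: if Col can contain NULL, NOT IN returns no rows at all, while NOT EXISTS and EXCEPT still return 6 and 7. The example below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module, the table wanted is a hypothetical stand-in for the list of values, and the NULL row is added on purpose (the question's sample data has none).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tab (Col INTEGER)")
# The question's rows 1..5, plus a NULL added deliberately to show the caveat.
conn.executemany("INSERT INTO Tab VALUES (?)",
                 [(1,), (2,), (3,), (4,), (5,), (None,)])
# 'wanted' is a hypothetical stand-in for the list (2, 3, 6, 7).
conn.execute("CREATE TABLE wanted (i INTEGER)")
conn.executemany("INSERT INTO wanted VALUES (?)", [(2,), (3,), (6,), (7,)])

# NOT IN: "6 NOT IN (..., NULL)" evaluates to unknown, so the row is filtered out.
not_in_rows = conn.execute(
    "SELECT i FROM wanted WHERE i NOT IN (SELECT Col FROM Tab) ORDER BY i"
).fetchall()

# NOT EXISTS: the NULL row simply never matches, so 6 and 7 survive.
not_exists_rows = conn.execute(
    "SELECT i FROM wanted AS w "
    "WHERE NOT EXISTS (SELECT 1 FROM Tab WHERE Tab.Col = w.i) ORDER BY i"
).fetchall()

# EXCEPT: set difference is likewise unaffected by the NULL in Tab.
except_rows = conn.execute(
    "SELECT i FROM wanted EXCEPT SELECT Col FROM Tab ORDER BY i"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(6,), (7,)]
print(except_rows)      # [(6,), (7,)]
```

If Col is declared NOT NULL, all three forms return the same rows.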
One method is
declare @table table(col int)
insert into @table
select 1 union all
select 2 union all
select 3 union all
select 4 union all
select 5
declare @t table(col int)
insert into @t
select 2 union all
select 3 union all
select 6 union all
select 7
select t1.col from @t as t1 left join @table as t2 on t1.col=t2.col
where t2.col is null
Do you have a [numbers] table in your database? (See Why should I consider using an auxiliary numbers table?)
SELECT
[numbers].[num]
FROM
[numbers]
LEFT JOIN [Tab]
ON [numbers].[num] = [Tab].[Col]
WHERE
[numbers].[num] IN (2, 3, 6, 7)
AND [Tab].[Col] IS NULL
I think there are many ways to achieve this; here is one.
SELECT a.col
FROM
(SELECT 2 AS col UNION ALL SELECT 3 UNION ALL SELECT 6 UNION ALL SELECT 7) AS a
WHERE a.col NOT IN (SELECT col FROM Tab)
Late to the party...
SELECT
'2s' = SUM(CASE WHEN Tab.Col = 2 THEN 1 ELSE 0 END),
'3s' = SUM(CASE WHEN Tab.Col = 3 THEN 1 ELSE 0 END),
'6s' = SUM(CASE WHEN Tab.Col = 6 THEN 1 ELSE 0 END),
'7s' = SUM(CASE WHEN Tab.Col = 7 THEN 1 ELSE 0 END)
FROM
(SELECT 1 AS Col, 'Nums' = 1 UNION SELECT 2 AS Col,'Nums' = 1 UNION SELECT 3 AS Col, 'Nums' = 1 UNION SELECT 4 AS Col, 'Nums' = 1 UNION SELECT 5 AS Col, 'Nums' = 1 ) AS Tab
GROUP BY Tab.Nums
BTW, mine also gives counts of each, useful if you need it. Like if you were checking a product list against what you have in inventory. Though you can write a pivot for that better, just don't know how off the top of my head.