SQL - Pulling data from the same table with two different "where" statements

I have a table with the following columns...
TestName - StepNumber - Data_1
I'm trying to write a query that can look for Data_1 results and average them for one day. The TestNames are unique tests we're running, and StepNumbers are the individual steps inside of the test. Normally, I would use something like
select Data_1 from table
where TestName in(1,2,3,4)
and StepNumber in(1)
to return all of the Data_1 results I need. However, sometimes the data I need is located in different steps across the table. Test 1 might have the required data in Step 2, Test 2 in Step 10, and so on. In the end, I need an average of the Data_1 results across all of these matching steps. I'm not sure how I can capture this data in a single query. There's a separate part of the query where I'm breaking it down by geography, and running it separately for each test would take a long time.
I'd be looking for something like...
select avg(Data_1) from table
where TestName = 1 and StepNumber = 2
and TestName = 2 and StepNumber = 10
and TestName = 3 and StepNumber = 5
If I can clarify, please let me know. Thank you!

select avg(Data_1)
from table
where (TestName = 1 and StepNumber = 2)
or (TestName = 2 and StepNumber = 10)
or (TestName = 3 and StepNumber = 5)
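To see why the parenthesised OR form works where the chained ANDs from the question do not, here is a minimal sketch using Python's built-in sqlite3 module. The table name `results` and the sample values are made up for illustration; the question does not show real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (TestName INTEGER, StepNumber INTEGER, Data_1 REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        (1, 1, 99.0),   # wrong step for test 1, must be ignored
        (1, 2, 10.0),
        (2, 10, 20.0),
        (3, 5, 30.0),
        (3, 6, 99.0),   # wrong step for test 3, must be ignored
    ],
)

# A single row cannot belong to test 1 AND test 2 at once,
# so chaining the pairs with AND matches no rows at all.
and_avg = conn.execute(
    """
    SELECT AVG(Data_1) FROM results
    WHERE TestName = 1 AND StepNumber = 2
      AND TestName = 2 AND StepNumber = 10
    """
).fetchone()[0]

# Each parenthesised pair selects one (test, step) combination; OR combines them.
or_avg = conn.execute(
    """
    SELECT AVG(Data_1) FROM results
    WHERE (TestName = 1 AND StepNumber = 2)
       OR (TestName = 2 AND StepNumber = 10)
       OR (TestName = 3 AND StepNumber = 5)
    """
).fetchone()[0]

print(and_avg)  # None, because AVG over zero rows is NULL
print(or_avg)   # 20.0
```

The parentheses are not strictly required, since AND binds more tightly than OR, but they make the intent obvious.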

If I have understood correctly, you have three (or more) sets of data in your table:
TestName = 1 and StepNumber = 2 - cardinality N1
TestName = 2 and StepNumber = 10 - cardinality N2
TestName = 3 and StepNumber = 5 - cardinality N3
If you want the average for all three separately, you need to select three columns. A single AVG cannot help here, as it would average over all N1 + N2 + N3 rows, that is, the union of the three groups. So you have to do this by hand. I do not know how this would perform vs. three separate queries. Note that IF() is MySQL syntax; other databases need a CASE expression instead:
SELECT
AVG(Data_1) AS OverallAverage,
SUM(IF((TestName = 1 AND StepNumber = 2), Data_1, 0))
/SUM(IF((TestName = 1 AND StepNumber = 2), 1, 0)) AS AvgGroup1,
SUM(IF((TestName = 2 AND StepNumber = 10), Data_1, 0))
/SUM(IF((TestName = 2 AND StepNumber = 10), 1, 0)) AS AvgGroup2,
SUM(IF((TestName = 3 AND StepNumber = 5), Data_1, 0))
/SUM(IF((TestName = 3 AND StepNumber = 5), 1, 0)) AS AvgGroup3
FROM table
WHERE (
( TestName = 1 AND StepNumber = 2)
OR
( TestName = 2 AND StepNumber = 10)
OR
( TestName = 3 AND StepNumber = 5)
);
This kind of query can be assembled from components, i.e. programmatically depending on the groups.
This is a SQLFiddle to show the results.
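The same per-group averages can also be sketched with a different technique than the SUM/SUM division above: AVG ignores NULLs, so a CASE expression without an ELSE branch restricts each AVG to its own group. The example below runs on SQLite through Python's sqlite3 module; the table name `results` and the data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (TestName INTEGER, StepNumber INTEGER, Data_1 REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, 2, 10.0), (1, 2, 20.0), (2, 10, 40.0), (3, 5, 70.0), (3, 6, 999.0)],
)

# CASE yields NULL for rows outside the group, and AVG skips NULLs,
# so each column averages only its own (test, step) pair.
row = conn.execute(
    """
    SELECT
        AVG(Data_1) AS OverallAverage,
        AVG(CASE WHEN TestName = 1 AND StepNumber = 2  THEN Data_1 END) AS AvgGroup1,
        AVG(CASE WHEN TestName = 2 AND StepNumber = 10 THEN Data_1 END) AS AvgGroup2,
        AVG(CASE WHEN TestName = 3 AND StepNumber = 5  THEN Data_1 END) AS AvgGroup3
    FROM results
    WHERE (TestName = 1 AND StepNumber = 2)
       OR (TestName = 2 AND StepNumber = 10)
       OR (TestName = 3 AND StepNumber = 5)
    """
).fetchone()

print(row)  # (35.0, 15.0, 40.0, 70.0)
```

CASE is standard SQL, so this form should carry over to most databases, whereas IF() is specific to MySQL.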

Related

Query based on multiple rows and multiple columns

Database : Azure SQL server 2019, .net core 3.0
I'm using stored procedure for querying data.
--Table Structure
create table yourtable
(
id int,
class int,
islab bit,
isschool bit
);
insert into yourtable
values (1, 1, 1, 1),
(1, 2, 1, 1),
(1, 3, 1, 1),
(2, 1, 1, 0),
(2, 2, 1, 1),
(2, 3, 1, 1)
Now I want a query that returns all unique ids where class is both 1 and 2, and islab = 1 and isschool = 1 in each of those classes. It should return only id = 1 because
a) id = 1 has both classes, i.e. (1, 2), and in both classes islab = 1 and isschool = 1
b) id = 2 does not satisfy the condition because for class 1 the value of isschool is 0
Can you help me write this query? Currently I fetch all rows for the input classes and then check all the conditions in C# using lists. It works, but I want to do it all in SQL.
I also think I could get the same result with a cursor in the stored procedure, but in C# it's easy because we have collections and methods such as Intersect between lists.
Assuming that islab and isschool are actually a bit (as bool doesn't exist in SQL Server), one method would be to use a HAVING with conditional aggregation. So, for the first one, you would do the following:
SELECT id
FROM dbo.yourtable
WHERE islab = 1
AND isschool = 1
GROUP BY id
HAVING COUNT(CASE class WHEN 1 THEN 1 END) = 1
AND COUNT(CASE class WHEN 2 THEN 1 END) = 1;
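As a quick check, the HAVING query above can be run against the sample rows from the question. This is only a sketch on SQLite via Python's sqlite3 module, so the `bit` columns are stored as integers and the `dbo.` schema prefix is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER, class INTEGER, islab INTEGER, isschool INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (1, 2, 1, 1), (1, 3, 1, 1),
     (2, 1, 1, 0), (2, 2, 1, 1), (2, 3, 1, 1)],
)

# WHERE drops rows that fail the flags; HAVING then requires
# exactly one surviving row for class 1 and one for class 2 per id.
ids = [r[0] for r in conn.execute(
    """
    SELECT id
    FROM yourtable
    WHERE islab = 1 AND isschool = 1
    GROUP BY id
    HAVING COUNT(CASE class WHEN 1 THEN 1 END) = 1
       AND COUNT(CASE class WHEN 2 THEN 1 END) = 1
    ORDER BY id
    """
)]

print(ids)  # [1]
```

Id 2 is excluded because its class 1 row has isschool = 0, so that row never reaches the HAVING count.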
The question is not clear, but based on your expected output, I think you want a query that, for a given set of classes, finds the ids where there are no records with isschool = 0 or islab = 0. You can do this with a NOT EXISTS condition:
WITH mytab AS
(
SELECT *
FROM yourtable
WHERE class IN (1,2,3) -- Change this line to get your 3 different outputs
)
SELECT DISTINCT id
FROM mytab t1
WHERE NOT EXISTS
(
SELECT *
FROM mytab t2
WHERE t1.id = t2.id
AND (t2.islab = 0 OR t2.isschool = 0)
)
For class IN (1,2) this returns id 1.
For class IN (2,3) this returns ids 1 and 2.
For class IN (1,2,3) this returns id 1.
The CTE limits to the classes we want to consider. The subquery in the NOT EXISTS finds ids that should be eliminated because either isschool = 0 or islab = 0.
An alternative way of doing this, using a LEFT JOIN instead of the NOT EXISTS condition is:
WITH mytab AS
(
SELECT *
FROM yourtable
WHERE class IN (1,2,3) -- Change this line to get your 3 different outputs
)
SELECT DISTINCT t1.id
FROM mytab t1 LEFT OUTER JOIN mytab t2
on t1.id = t2.id AND (t2.islab = 0 OR t2.isschool = 0)
WHERE t2.id is null
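The NOT EXISTS and LEFT JOIN variants can be compared side by side on the question's sample rows. The sketch below uses SQLite through Python's sqlite3 module; the helper names `ids_not_exists` and `ids_left_join` are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER, class INTEGER, islab INTEGER, isschool INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (1, 2, 1, 1), (1, 3, 1, 1),
     (2, 1, 1, 0), (2, 2, 1, 1), (2, 3, 1, 1)],
)

def ids_not_exists(classes):
    # One "?" placeholder per class keeps the IN list parameterised.
    marks = ",".join("?" * len(classes))
    sql = f"""
        WITH mytab AS (SELECT * FROM yourtable WHERE class IN ({marks}))
        SELECT DISTINCT id FROM mytab t1
        WHERE NOT EXISTS (
            SELECT 1 FROM mytab t2
            WHERE t1.id = t2.id AND (t2.islab = 0 OR t2.isschool = 0)
        )
        ORDER BY id
    """
    return [r[0] for r in conn.execute(sql, classes)]

def ids_left_join(classes):
    marks = ",".join("?" * len(classes))
    sql = f"""
        WITH mytab AS (SELECT * FROM yourtable WHERE class IN ({marks}))
        SELECT DISTINCT t1.id
        FROM mytab t1
        LEFT OUTER JOIN mytab t2
            ON t1.id = t2.id AND (t2.islab = 0 OR t2.isschool = 0)
        WHERE t2.id IS NULL
        ORDER BY t1.id
    """
    return [r[0] for r in conn.execute(sql, classes)]

for classes in [(1, 2), (2, 3), (1, 2, 3)]:
    print(classes, ids_not_exists(classes), ids_left_join(classes))
# (1, 2) [1] [1]
# (2, 3) [1, 2] [1, 2]
# (1, 2, 3) [1] [1]
```

Neither variant checks that an id actually has a row for every requested class; both only rule out ids with a failing row among the selected classes.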

Performing sum per row based on conditions SQL

I wanted to know how I can perform a sum per row based on conditions for the columns using SQL (I'm new to SQL).
For example, I have this table:
ID Col_1 Col_2 Col_3 ...
1 L L L ...
2 L Q Q ...
3 L L Q ...
4 Q Q L ...
The result that I'm looking for is:
ID count_L count_Q
1 3 0
2 1 2
3 2 1
4 1 2
I'm not sure on how I should approach this. Doing this using Count function if my table was transposed would be easier (I think) but performing the query in the way my data is organized is tricky for me. I think I need nested SQL statements and join them using UNION but not sure how to do it.
I wasn't able to find similar questions/solutions elsewhere. Would appreciate some help!
You can use iif() and +:
select id,
(iif(col_1 = 'L' , 1, 0) + iif(col_2 = 'L' , 1, 0) + iif(col_3 = 'L' , 1, 0) ) as count_l,
(iif(col_1 = 'Q' , 1, 0) + iif(col_2 = 'Q' , 1, 0) + iif(col_3 = 'Q' , 1, 0) ) as count_q
from t;
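The same per-row counting can be sketched with CASE expressions, which are standard SQL and so do not depend on iif() being available. The example below loads the four rows from the question into SQLite via Python's sqlite3 module; only three columns are used because the question elides the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col_1 TEXT, col_2 TEXT, col_3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "L", "L", "L"), (2, "L", "Q", "Q"), (3, "L", "L", "Q"), (4, "Q", "Q", "L")],
)

# Each CASE contributes 1 when its column matches and 0 otherwise;
# adding them gives the count within the row.
counts = conn.execute(
    """
    SELECT id,
           (CASE WHEN col_1 = 'L' THEN 1 ELSE 0 END
          + CASE WHEN col_2 = 'L' THEN 1 ELSE 0 END
          + CASE WHEN col_3 = 'L' THEN 1 ELSE 0 END) AS count_l,
           (CASE WHEN col_1 = 'Q' THEN 1 ELSE 0 END
          + CASE WHEN col_2 = 'Q' THEN 1 ELSE 0 END
          + CASE WHEN col_3 = 'Q' THEN 1 ELSE 0 END) AS count_q
    FROM t
    ORDER BY id
    """
).fetchall()

print(counts)  # [(1, 3, 0), (2, 1, 2), (3, 2, 1), (4, 1, 2)]
```

With many columns the expression gets long, but it can be generated programmatically from the column list.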

How to join tables where a variable number of conditions match

I have two tables, tblPerson and tblOffer. The person table has details about which products they currently have. The offers have multiple criteria for whether or not we should suggest that a person get the new product.
I'm trying to join these based on multiple criteria in tblOffer, but I'm getting stuck on making sure that all of the selected criteria are checked, not just any one of them. For example, here are some fields in my tables.
tblPerson
pkPersonId
HasCreditCard
HasEmail
HasLoan
tblOffer
pkOfferId
NeedsCreditCard
NeedsEmail
NeedsLoan
Sample Data:
tblPerson
1, 0, 0, 1
2, 0, 1, 0
3, 0, 0, 0
tblOffer
100, 1, 0, 1
200, 0, 0, 1
300, 1, 1, 0
I'm trying to return a result for person 1 which contains Offer 300, Person 2 gets offer 100 and 200, and Person 3 gets offers 100, 200, and 300.
I have tried a CROSS JOIN between the two tables, with a WHERE clause like this:
SELECT * FROM tblPerson prs
CROSS JOIN tblMrmOffer ofr
WHERE prs.pkPersonId = @PersonId AND (
(prs.HasEmail = 0 AND ofr.NeedsEmail = 1) OR
(prs.HasCreditCard = 0 AND ofr.NeedsCreditCard = 1) OR
(prs.HasLoan = 0 AND ofr.NeedsLoan = 1))
This will give me a row if any of the selected criteria are true but not limited to rows where all selected criteria are set. For example, Offer 300 would match if the person needs a Credit Card or Email, but not necessarily if they need both. I'm trying to work this out as a Cross Tab Pivot, but not clear how to JOIN this. Any help would be appreciated.
If you want an offer to show up only when the person lacks every item in it, so that the offer is disqualified as soon as the person already has at least one of its items, then here is some SQL to try. Each AND condition disqualifies offers that include an item the person already has.
SELECT * FROM tblPerson prs
CROSS JOIN tblMrmOffer ofr
WHERE prs.pkPersonId = @PersonId
AND
NOT (prs.HasEmail = 1 AND ofr.NeedsEmail = 1)
AND
NOT (prs.HasCreditCard = 1 AND ofr.NeedsCreditCard = 1)
AND
NOT (prs.HasLoan = 1 AND ofr.NeedsLoan = 1)
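The disqualification logic can be checked against the sample data from the question. This is a sketch on SQLite via Python's sqlite3 module; it uses the table name `tblOffer` from the question's schema (the queries above say `tblMrmOffer`), and the helper `offers_for` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblPerson (pkPersonId INTEGER, HasCreditCard INTEGER,
                            HasEmail INTEGER, HasLoan INTEGER);
    CREATE TABLE tblOffer  (pkOfferId INTEGER, NeedsCreditCard INTEGER,
                            NeedsEmail INTEGER, NeedsLoan INTEGER);
    INSERT INTO tblPerson VALUES (1, 0, 0, 1), (2, 0, 1, 0), (3, 0, 0, 0);
    INSERT INTO tblOffer  VALUES (100, 1, 0, 1), (200, 0, 0, 1), (300, 1, 1, 0);
    """
)

def offers_for(person_id):
    # An offer survives only if none of its items is already held by the person.
    rows = conn.execute(
        """
        SELECT ofr.pkOfferId
        FROM tblPerson prs
        CROSS JOIN tblOffer ofr
        WHERE prs.pkPersonId = ?
          AND NOT (prs.HasEmail = 1 AND ofr.NeedsEmail = 1)
          AND NOT (prs.HasCreditCard = 1 AND ofr.NeedsCreditCard = 1)
          AND NOT (prs.HasLoan = 1 AND ofr.NeedsLoan = 1)
        ORDER BY ofr.pkOfferId
        """,
        (person_id,),
    )
    return [r[0] for r in rows]

matches = {pid: offers_for(pid) for pid in (1, 2, 3)}
print(matches)  # {1: [300], 2: [100, 200], 3: [100, 200, 300]}
```

These are the offers the question asks for: 300 for person 1, 100 and 200 for person 2, and all three for person 3.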
If instead, you are looking for cases where the offer has ALL of the person's needs, then this SQL might be more up your alley. The theory with the case statements is if the criteria IS important (which is the case when the HasX = 0), then the filter needs to be 1 for the NeedsX. Otherwise, it just matches the NeedsX value (and so the filter wouldn't do anything).
SELECT * FROM tblPerson prs
CROSS JOIN tblMrmOffer ofr
WHERE prs.pkPersonId = @PersonId
AND
(
ofr.NeedsEmail
=
CASE
WHEN prs.HasEmail = 0
THEN 1
ELSE ofr.NeedsEmail
END
)
AND
(
ofr.NeedsCreditCard
=
CASE
WHEN prs.HasCreditCard = 0
THEN 1
ELSE ofr.NeedsCreditCard
END
)
AND
(
ofr.NeedsLoan
=
CASE
WHEN prs.HasLoan = 0
THEN 1
ELSE ofr.NeedsLoan
END
)
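Assuming none of the flag columns is NULL, each CASE comparison above is equivalent to the shorter condition `HasX = 1 OR NeedsX = 1`: the offer must include every item the person lacks. The sketch below runs both forms on the question's sample data with SQLite via Python's sqlite3 module; `tblOffer` is the table name from the question's schema, and the names `CASE_FORM`, `OR_FORM` and `offers` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblPerson (pkPersonId INTEGER, HasCreditCard INTEGER,
                            HasEmail INTEGER, HasLoan INTEGER);
    CREATE TABLE tblOffer  (pkOfferId INTEGER, NeedsCreditCard INTEGER,
                            NeedsEmail INTEGER, NeedsLoan INTEGER);
    INSERT INTO tblPerson VALUES (1, 0, 0, 1), (2, 0, 1, 0), (3, 0, 0, 0);
    INSERT INTO tblOffer  VALUES (100, 1, 0, 1), (200, 0, 0, 1), (300, 1, 1, 0);
    """
)

# The CASE form from the answer, written compactly.
CASE_FORM = """
    SELECT ofr.pkOfferId FROM tblPerson prs CROSS JOIN tblOffer ofr
    WHERE prs.pkPersonId = ?
      AND ofr.NeedsEmail      = CASE WHEN prs.HasEmail = 0      THEN 1 ELSE ofr.NeedsEmail END
      AND ofr.NeedsCreditCard = CASE WHEN prs.HasCreditCard = 0 THEN 1 ELSE ofr.NeedsCreditCard END
      AND ofr.NeedsLoan       = CASE WHEN prs.HasLoan = 0       THEN 1 ELSE ofr.NeedsLoan END
    ORDER BY ofr.pkOfferId
"""

# Equivalent form: for every item, either the person has it or the offer includes it.
OR_FORM = """
    SELECT ofr.pkOfferId FROM tblPerson prs CROSS JOIN tblOffer ofr
    WHERE prs.pkPersonId = ?
      AND (prs.HasEmail = 1      OR ofr.NeedsEmail = 1)
      AND (prs.HasCreditCard = 1 OR ofr.NeedsCreditCard = 1)
      AND (prs.HasLoan = 1       OR ofr.NeedsLoan = 1)
    ORDER BY ofr.pkOfferId
"""

def offers(sql, person_id):
    return [r[0] for r in conn.execute(sql, (person_id,))]

case_result = {pid: offers(CASE_FORM, pid) for pid in (1, 2, 3)}
or_result = {pid: offers(OR_FORM, pid) for pid in (1, 2, 3)}

print(case_result)               # {1: [300], 2: [100], 3: []}
print(case_result == or_result)  # True
```

On this data person 3 gets no offers under this interpretation, because no single offer covers all three items that person lacks.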
If you want to get really fancy, you can, in your select statement, put a ranking that ranks cases that perfectly match the person's needs above those that just match some of the user's unmet needs. For this, it would look something like this, which prioritizes perfect matches first, and good matches second (and within good matches, those with more items in the offer), and then within those categories, latest offers first:
SELECT
...
DENSE_RANK() OVER (
PARTITION BY prs.pkPersonID
ORDER BY
CASE
WHEN <logic from 2nd SQL above>
THEN 1
WHEN <logic from 1st SQL above>
THEN ofr.NeedsEmail + ofr.NeedsCreditCard + ofr.NeedsLoan
ELSE NULL
END,
ofr.pkOfferId DESC
) AS [Order]
...

Get all missing numbers in the sequence

The numbers are originally alpha numeric so I have a query to parse out the numbers:
My query here gives me a list of numbers:
select distinct cast(SUBSTRING(docket,7,999) as INT) from
[DHI_IL_Stage].[dbo].[Violation] where InsertDataSourceID='40' and
ViolationCounty='Carroll' and SUBSTRING(docket,5,2)='TR' and
LEFT(docket,4)='2011' order by 1
Returns the list of numbers parsed out.
For example, the number will be 2012TR557. After using the query it will be 557.
I need to write a query that will give back the missing numbers in a sequence.
Here is one approach
The following should return one row for each sequence of missing numbers. So, if your series is 3, 5, 6, 9, then it should return:
4 4
7 8
The query is:
with nums as (
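select distinct cast(SUBSTRING(docket, 7, 999) as INT) as n,
-- dense_rank (not row_number) gives duplicate numbers the same rank, so DISTINCT can collapse them
dense_rank() over (order by cast(SUBSTRING(docket, 7, 999) as INT)) as seqnum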
from [DHI_IL_Stage].[dbo].[Violation]
where InsertDataSourceID = '40' and
ViolationCounty = 'Carroll' and
SUBSTRING(docket,5,2) = 'TR' and
LEFT(docket, 4) = '2011'
)
select (nums_prev.n + 1) as first_missing, nums.n - 1 as last_missing
from nums left outer join
nums nums_prev
on nums.seqnum = nums_prev.seqnum + 1
where nums.n <> nums_prev.n + 1 ;
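For comparison, here is a sketch of a different technique that finds the same gaps without window functions: take every number whose successor is absent (other than the maximum) and look up the next number that is present. It runs on SQLite via Python's sqlite3 module; the table `nums` with a single column `n` is an assumption standing in for the parsed docket numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (n INTEGER)")
# The series from the example, with one duplicate to show it does no harm.
conn.executemany("INSERT INTO nums VALUES (?)", [(3,), (5,), (6,), (9,), (5,)])

gaps = conn.execute(
    """
    SELECT cur.n + 1 AS first_missing,
           (SELECT MIN(nxt.n) FROM nums nxt WHERE nxt.n > cur.n) - 1 AS last_missing
    FROM (SELECT DISTINCT n FROM nums) cur
    WHERE NOT EXISTS (SELECT 1 FROM nums x WHERE x.n = cur.n + 1)  -- a gap starts after cur.n
      AND cur.n < (SELECT MAX(n) FROM nums)                        -- nothing is missing after the max
    ORDER BY first_missing
    """
).fetchall()

print(gaps)  # [(4, 4), (7, 8)]
```

Like the query above, this reports only gaps between the smallest and largest numbers present; it cannot know about numbers missing before the first or after the last.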

SQL sub query with complex criteria

I have a table like this:
TransId. LayerNo. AccountId.
100. 1. 2.
100. 2. 3.
120. 1. 5.
120. 2. 6.
120. 3. 12.
70. 1. 2.
I want to find transId(s) where:
(LayerNo = 1 and (accountId = 2 or 5))
and
(LayerNo = 2 and (accountId = 3 or 6))
The result set would be rows 1, 2, 3 and 4.
How could I write query to get the result?
My database is SQL server 2008 r2
Thanks in advance
Nima
SELECT TransId
FROM your_table
WHERE ( layerno = 1
AND accountid IN ( 2, 5 ) )
INTERSECT
SELECT TransId
FROM your_table
WHERE ( layerno = 2
AND accountid IN ( 3, 6 ) )
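A quick way to check the INTERSECT query is to load the six sample rows into SQLite through Python's sqlite3 module. The table name `layers` is an assumption; the second query, which wraps the INTERSECT in an IN subquery to fetch the matching rows, is likewise only a sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (TransId INTEGER, LayerNo INTEGER, AccountId INTEGER)")
conn.executemany(
    "INSERT INTO layers VALUES (?, ?, ?)",
    [(100, 1, 2), (100, 2, 3), (120, 1, 5), (120, 2, 6), (120, 3, 12), (70, 1, 2)],
)

INTERSECT_SQL = """
    SELECT TransId FROM layers WHERE LayerNo = 1 AND AccountId IN (2, 5)
    INTERSECT
    SELECT TransId FROM layers WHERE LayerNo = 2 AND AccountId IN (3, 6)
"""

# TransIds that have a qualifying layer-1 row AND a qualifying layer-2 row.
trans_ids = sorted(r[0] for r in conn.execute(INTERSECT_SQL))
print(trans_ids)  # [100, 120]

# The qualifying rows themselves, in insertion order.
rows = conn.execute(
    f"""
    SELECT TransId, LayerNo, AccountId
    FROM layers
    WHERE TransId IN ({INTERSECT_SQL})
      AND ((LayerNo = 1 AND AccountId IN (2, 5))
        OR (LayerNo = 2 AND AccountId IN (3, 6)))
    ORDER BY rowid
    """
).fetchall()
print(rows)  # [(100, 1, 2), (100, 2, 3), (120, 1, 5), (120, 2, 6)]
```

TransId 70 drops out because it has no layer-2 row, which leaves rows 1 to 4 of the sample table.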
One approach is to ensure that each transID must have two records that satisfy the conditions you outlined.
SELECT * FROM
TABLE
WHERE TransID IN(
SELECT TransId
FROM table
WHERE ( layerno = 1
AND accountid IN ( 2, 5 ) )
OR ( layerno = 2
AND accountid IN( 3, 6 ) )
GROUP BY
TransId
HAVING Count(*) = 2
)
However, this could be a problem if you can have multiple records where layerno = 1. So you can use self joins instead to enforce the criteria.
SELECT DISTINCT a.transid
FROM table a
INNER JOIN table b
ON a.transid = b.transid
INNER JOIN table c
ON a.transid = c.transid
WHERE b.layerno = 1
AND b.accountid IN ( 2, 5 )
AND c.layerno = 2
AND c.accountid IN ( 3, 6 )
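The duplicate problem with HAVING COUNT(*) = 2 can be illustrated with a small sketch. It uses SQLite via Python's sqlite3 module, a hypothetical table `layers`, and sample data modified so that TransId 70 has two qualifying layer-1 rows. Counting distinct layers instead of rows is one possible alternative to the self joins; it is not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (TransId INTEGER, LayerNo INTEGER, AccountId INTEGER)")
conn.executemany(
    "INSERT INTO layers VALUES (?, ?, ?)",
    [(100, 1, 2), (100, 2, 3), (120, 1, 5), (120, 2, 6), (120, 3, 12),
     (70, 1, 2), (70, 1, 5)],  # two layer-1 rows for 70, no layer-2 row
)

FILTER = """
    (LayerNo = 1 AND AccountId IN (2, 5))
    OR (LayerNo = 2 AND AccountId IN (3, 6))
"""

# COUNT(*) = 2 is fooled by two matching rows on the same layer.
by_row_count = [r[0] for r in conn.execute(
    f"SELECT TransId FROM layers WHERE {FILTER} "
    "GROUP BY TransId HAVING COUNT(*) = 2 ORDER BY TransId"
)]

# COUNT(DISTINCT LayerNo) = 2 requires both layers to be present.
by_layer_count = [r[0] for r in conn.execute(
    f"SELECT TransId FROM layers WHERE {FILTER} "
    "GROUP BY TransId HAVING COUNT(DISTINCT LayerNo) = 2 ORDER BY TransId"
)]

print(by_row_count)    # [70, 100, 120]
print(by_layer_count)  # [100, 120]
```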
That said, Martin's INTERSECT approach is probably the best.
Do you mean:
SELECT
TransId,
LayerNo,
AccountId
FROM Table
WHERE (LayerNo = 1 AND AccountId IN (2, 5)) OR
(LayerNo = 2 AND AccountId IN (3, 6))
create table #temp
( rowId Int Identity(1,1), transId int)
INSERT INTO #temp(transId)
select TransId
from TableName
where (layerNo = 1 and accountID IN (2, 5))
OR (layerNo = 2 and accountId IN (3, 6))
select * from #temp
SELECT
base.TransId,
base.LayerNo,
base.AccountId
FROM TableX AS base
JOIN TableX AS a
ON a.TransId = base.TransId
AND a.LayerNo = 1 AND a.AccountId IN (2, 5)
JOIN TableX AS b
ON b.TransId = base.TransId
AND b.LayerNo = 2 AND b.AccountId IN (3, 6)
WHERE (base.LayerNo = 1 AND base.AccountId IN (2, 5))
OR (base.LayerNo = 2 AND base.AccountId IN (3, 6))
Applied to a single row, this predicate can never be true. LayerNo = 1 and LayerNo = 2 are mutually exclusive, so the set of rows that satisfy both halves at once is empty. I believe this comes from how the question was originally stated. I might be wrong, but the predicate should have been
(LayerNo = 1 and (accountId = 2 or 5)) OR (LayerNo = 2 and (accountId = 3 or 6))
Replace the AND with an OR. If the predicate was stated correctly, a row-by-row filter on it will always return nothing.
SELECT *
FROM table
WHERE (LayerNo = 1 AND (AccountID = 2 OR AccountID = 5))
OR (LayerNo = 2 AND (AccountID = 3 OR AccountID = 6))
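To see what the row-level AND and OR filters actually return, the sketch below runs both on the six sample rows from the question, using SQLite via Python's sqlite3 module. The table name `layers` is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (TransId INTEGER, LayerNo INTEGER, AccountId INTEGER)")
conn.executemany(
    "INSERT INTO layers VALUES (?, ?, ?)",
    [(100, 1, 2), (100, 2, 3), (120, 1, 5), (120, 2, 6), (120, 3, 12), (70, 1, 2)],
)

# Row-level AND: no row can be on layer 1 and layer 2 at the same time.
and_rows = conn.execute(
    """
    SELECT TransId, LayerNo, AccountId FROM layers
    WHERE (LayerNo = 1 AND AccountId IN (2, 5))
      AND (LayerNo = 2 AND AccountId IN (3, 6))
    """
).fetchall()

# Row-level OR: every row matching either half, whatever its transaction.
or_rows = conn.execute(
    """
    SELECT TransId, LayerNo, AccountId FROM layers
    WHERE (LayerNo = 1 AND AccountId IN (2, 5))
       OR (LayerNo = 2 AND AccountId IN (3, 6))
    ORDER BY rowid
    """
).fetchall()

print(and_rows)  # []
print(or_rows)   # [(100, 1, 2), (100, 2, 3), (120, 1, 5), (120, 2, 6), (70, 1, 2)]
```

On this data the OR filter also returns the row for TransId 70, which has no matching layer-2 row. If the intent is "transactions that satisfy both conditions", the OR filter needs to be combined with a per-transaction check.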