Selecting only if one column value is distinct by other column - sql

I have a linking table with rule_id to sub_rule_id as such:
rule_id | sub_rule_id
---------------------
1 | 1
2 | 1
2 | 2
2 | 3
3 | 3
3 | 4
I want to get all the sub_rule_ids that are linked to only one rule_id, filtered by rule_id. So if my rule_id = 1, I expect no rows, and if rule_id = 2, I should get just one row. I tried playing with DISTINCT and HAVING, and won't trouble you with a bad query. I am sure there is an easy, elegant way to do it.
Thanks in advance

You can group by sub_rule_id and set the condition in the HAVING clause:
select sub_rule_id
from tablename
group by sub_rule_id
having count(distinct rule_id) = 1
Or with NOT EXISTS if you want full rows:
select t.* from tablename t
where not exists (
select 1 from tablename
where sub_rule_id = t.sub_rule_id
and rule_id <> t.rule_id
)
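To restrict this to a single rule, as the question asks, one option is to add a rule_id filter to the NOT EXISTS form. This is only a sketch: tablename is the placeholder used above, the alias o is made up for illustration, and 2 stands in for whichever rule_id you are checking.

```sql
-- Sub-rules of one rule that no other rule links to.
-- tablename is a placeholder; 2 is an example rule_id.
select t.sub_rule_id
from tablename t
where t.rule_id = 2
  and not exists (
      select 1
      from tablename o
      where o.sub_rule_id = t.sub_rule_id
        and o.rule_id <> t.rule_id   -- another rule uses the same sub-rule
  );
```

With the sample data this returns only sub_rule_id 2 for rule_id = 2, and no rows for rule_id = 1, because sub_rule_id 1 is also linked to rule 2.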

Related

SQL Server : update boolean column based on conditions from 2 other columns

I need to set PrimaryInvoiceFile to 1 (true) on every row whose InvoiceId has only one InvoiceFileId.
However, if an InvoiceId has multiple InvoiceFileIds, I need to set PrimaryInvoiceFile to 0 (false) on all of its rows except the most recently added InvoiceFileId, based on the date added.
For example it should look like this:
|CreatedDate|InvoiceId|InvoiceFileId|PrimaryInvoiceFile|
+-----------+---------+-------------+------------------+
|2019-01-16 | 1 | 1 | 1 |
|2019-01-17 | 2 | 2 | 1 |
|2019-01-18 | 3 | 3 | 0 |
|2019-01-19 | 3 | 4 | 0 |
|2019-01-20 | 3 | 5 | 1 |
|2019-01-21 | 4 | 6 | 1 |
I just added the PrimaryInvoiceFile column migration and set the default value to 0.
Any help with this would be greatly appreciated! I have been racking my brain trying to get my update statements to perform this update.
You can use ROW_NUMBER() in your update to achieve the desired result. Order by the date descending so that the most recent row in each group gets row number 1.
Let's create a table with PrimaryInvoiceFile as null and update it afterwards.
select '2019-01-16' as CreatedDate, 1 as invoiceID, 1 as Invoicefield, null as PrimaryInvoiceFile
into #temp
union all
select '2019-01-17' as CreatedDate, 2 as invoiceID, 2 as Invoicefield, null as Primaryinvoicefile union all
select '2019-01-18' as CreatedDate, 3 as invoiceID, 3 as Invoicefield, null as Primaryinvoicefile union all
select '2019-01-19' as CreatedDate, 3 as invoiceID, 4 as Invoicefield, null as Primaryinvoicefile union all
select '2019-01-20' as CreatedDate, 3 as invoiceID, 5 as Invoicefield, null as Primaryinvoicefile union all
select '2019-01-21' as CreatedDate, 4 as invoiceID, 6 as Invoicefield, null as Primaryinvoicefile
update t
set Primaryinvoicefile = tst.Rownum
from #temp t
join
(Select invoiceID, Invoicefield,CreatedDate,
case when ROW_NUMBER() over (partition by invoiceID order by createddate desc) = 1
then 1 else 0 end as Rownum from #temp) tst
on tst.CreatedDate = t.CreatedDate
and tst.invoiceID = t.invoiceID
and tst.Invoicefield = t.Invoicefield
The CASE expression makes sure the value is 1 only for the most recent row of each invoice ID, which is also the only row when an invoice ID has a single row.
select * from #temp
Output:
CreatedDate invoiceID Invoicefield PrimaryInvoiceFile
2019-01-16 1 1 1
2019-01-17 2 2 1
2019-01-18 3 3 0
2019-01-19 3 4 0
2019-01-20 3 5 1
2019-01-21 4 6 1
Please try this:
;WITH Data AS (
SELECT t.CreatedDate,t.InvoiceId,t.InvoiceFieldId,t.PrimaryInvoiceFile
-- rn = 1 is the most recent row per InvoiceId (the only row when there is just one)
,ROW_NUMBER()OVER(PARTITION BY t.InvoiceId ORDER BY t.CreatedDate DESC) AS [rn]
FROM [YourTableName] t
)
UPDATE d SET d.PrimaryInvoiceFile = CASE WHEN d.rn = 1 THEN 1 ELSE 0 END
FROM Data d
;
Query to play around:
DROP TABLE IF EXISTS #YourTableName;
CREATE TABLE #YourTableName(CreatedDate DATETIME2,InvoiceId INT, InvoiceFieldId INT,PrimaryInvoiceFile BIT);
INSERT INTO #YourTableName(CreatedDate,InvoiceId,InvoiceFieldId)VALUES
('2019-01-16',1,1),('2019-01-17',2,2),('2019-01-18',3,3),('2019-01-19',3,4),('2019-01-20',3,5),('2019-01-21',4,6)
;WITH Data AS (
SELECT t.CreatedDate,t.InvoiceId,t.InvoiceFieldId,t.PrimaryInvoiceFile,ROW_NUMBER()OVER(PARTITION BY t.InvoiceId ORDER BY t.CreatedDate DESC) AS [rn]
FROM #YourTableName t
)
UPDATE d SET d.PrimaryInvoiceFile = CASE WHEN d.rn = 1 THEN 1 ELSE 0 END
FROM Data d
;
SELECT t.CreatedDate,t.InvoiceId,t.InvoiceFieldId,t.PrimaryInvoiceFile
FROM #YourTableName t
;
DROP TABLE IF EXISTS #YourTableName;
If the default PrimaryInvoiceFile is set to 0, you need to update the PrimaryInvoiceFile of the maximum CreatedDate grouped by InvoiceId.
UPDATE inv1
SET inv1.PrimaryInvoiceFile = 1
FROM invoices inv1 JOIN
(SELECT max(CreatedDate) as maxDate, InvoiceId
FROM invoices
GROUP BY InvoiceId ) as inv2
ON inv1.CreatedDate=inv2.maxDate and inv1.InvoiceId= inv2.InvoiceId
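Whichever update you run, a quick sanity check is to look for invoices that do not end up with exactly one primary file. The query below is only a sketch; it assumes the table is called invoices as in the answer above, and the PrimaryCount alias is made up for illustration.

```sql
-- Lists every InvoiceId that does not have exactly one primary file.
-- An empty result means each invoice has a single row flagged as primary.
SELECT InvoiceId,
       SUM(CASE WHEN PrimaryInvoiceFile = 1 THEN 1 ELSE 0 END) AS PrimaryCount
FROM invoices
GROUP BY InvoiceId
HAVING SUM(CASE WHEN PrimaryInvoiceFile = 1 THEN 1 ELSE 0 END) <> 1;
```

Rows can show up here if two files of the same invoice share the same CreatedDate and the update matched on the maximum date, since both would be flagged.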

Select IDs from multiple rows where column values satisfy one condition but not another

Hello I have the following problem.
I have a table like the one in this sql fiddle
This table defines a relationship and it contains IDs from two other tables
example values
| FirstID | SecondID |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 2 | 4 |
| 2 | 5 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
I want to select all the FirstIDs that satisfy the following criteria.
Their corresponding SecondIDs are in the range 1-3 AND NOT in the range 4-5
For example in this case we would want FirstIDs 1 and 3.
I have tried the following queries
SELECT FirstID from table
WHERE SecondID IN (1,2,3) AND SecondID NOT IN (4,5)
SELECT FirstID,SecondID
FROM(
SELECT FirstID, SecondID
FROM table
WHERE SecondID in (1,2,3,4,5) )
WHERE SecondID NOT IN (4,5)
but I don't get the correct results I am aiming for.
What is the correct query to get the data I want?
SELECT DISTINCT FirstID
FROM test
WHERE SecondId IN (1,2,3) --Included values
AND FirstID NOT IN (SELECT FirstID FROM test
WHERE SecondId IN (4,5)) --Excluded values
How about min() and max():
select firstid
from t
group by firstid
having min(secondId) between 1 and 3 and
max(secondid) between 1 and 3;
Assuming 1 is the minimum, then this can be simplified to:
having max(secondid) <= 3;
For arbitrary ranges, you can use sum(case):
having sum(case when secondId between 1 and 3 then 1 else 0 end) > 0 and
sum(case when secondId between 4 and 5 then 1 else 0 end) = 0;
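For reference, here is the sum(case) variant written out as a complete query against the sample rows from the question. It is a sketch: the rows are inlined with a VALUES derived table (SQL Server 2008+ and PostgreSQL accept this form), and the alias t with columns firstid and secondid is only there for illustration.

```sql
-- Sample rows from the question, inlined so the query runs on its own.
select firstid
from (values (1,1),(1,2),(1,3),
             (2,1),(2,2),(2,3),(2,4),(2,5),
             (3,1),(3,2),(3,3)) as t(firstid, secondid)
group by firstid
having sum(case when secondid between 1 and 3 then 1 else 0 end) > 0   -- has a wanted value
   and sum(case when secondid between 4 and 5 then 1 else 0 end) = 0;  -- has no excluded value
```

This returns firstid 1 and 3; firstid 2 is dropped because it has rows with secondid 4 and 5.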
I think Gonzalo Lorieto probably has the best answer to this question already, but depending on the size of your data, subqueries in a WHERE clause can get really slow, and the query below might be significantly faster (although it's not clear that is worth the reduced readability):
SELECT inrange.FirstId FROM
t inrange
LEFT OUTER JOIN
(SELECT FirstID FROM t
WHERE SEcondId IN (4,5)) outrange
ON inrange.firstID = outrange.firstId
WHERE SecondID IN (1,2,3)
AND outrange.firstId IS NULL
GROUP BY inrange.FirstId
You will want to use a NOT EXISTS clause to exclude the FirstIDs that have an invalid SecondID. Here is an example:
SELECT FirstID from test Has123
WHERE SecondID IN (1,2,3)
AND NOT EXISTS (
SELECT 1 FROM test Not45
WHERE Has123.FirstID = Not45.FirstID
AND Not45.SecondID IN (4,5)
)
GROUP BY FirstID
SqlFiddle

SQL group by can't find correct phrase

I have a simple design
id | grpid | main
-----------------
1 | 1 | 1
2 | 1 | 0
3 | 1 | 0
4 | 2 | 0
5 | 2 | 1
6 | 2 | 0
The question to answer is
What is the "id" of the main in each group?
The result should be
id
---
1
5
Seriously at the moment, I'm not able to answer it on my own. Pls assist me.
Maybe I'm oversimplifying it here, but couldn't you just do this:
select id,
grpid
from table
where main = 1;
The simplest way to do this is:
select id from <table_name> where main=1
But since you mentioned you want the id per group, the query below groups by grpid:
select max(id) as id from <table_name> where main = 1 group by grpid
This keeps only the rows where main is 1 and then groups them by the group id, so you get one id per group. With a single main row per group, max(id) is simply that row's id.
If you want to add a column for its corresponding "MainId" then you can do this perhaps?
SELECT f.id, f.grpid, f.main, t.MainId
FROM foo f
CROSS APPLY (
SELECT grpid, id AS MainId
FROM foo f1
WHERE main = 1
AND f.grpid = f1.grpid) t

Query to skip first row after id changes in SQL Server

I have a long table like the following. The table adds two similar rows after the id changes. For example, in the following table, when the ID changes from 1 to 2 a duplicate record is added. All I need is a SELECT query that skips this and all other duplicate records, but only where the ID changes.
# | name| id
--+-----+---
1 | abc | 1
2 | abc | 1
3 | abc | 1
4 | abc | 1
5 | abc | 1
5 | abc | 2
6 | abc | 2
7 | abc | 2
8 | abc | 2
9 | abc | 2
and so on
You could use NOT EXISTS to eliminate the duplicates:
SELECT *
FROM yourtable AS T
WHERE NOT EXISTS
( SELECT 1
FROM yourtable AS T2
WHERE T.[#] = T2.[#]
AND T2.ID > T.ID
);
This will return:
# name ID
------------------
. ... .
4 abc 1
5 abc 2
6 abc 2
. ... .
... (Some irrelevant rows have been removed from the start and the end)
If you wanted the first record to be retained, rather than the last, then just change the condition T2.ID > T.ID to T2.ID < T.ID.
You can use the following CTEs to simulate LAG window function not available in SQL Server 2008:
;WITH CTE_RN AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY [#], id) AS rn
FROM #mytable
), CTE_LAG AS (
SELECT t1.[#], t1.name,
t1.id AS curId, t2.id AS prevId,
t1.[#] AS cur#, t2.[#] AS lag#
FROM CTE_RN t1
LEFT JOIN CTE_RN t2 ON t1.rn = t2.rn + 1 )
You can now filter out the 'duplicate' records using the above CTE_LAG and the following predicate in your WHERE clause:
;WITH CTE_RN AS ( ... ), CTE_LAG AS ( ... ) -- the two CTE definitions shown above
SELECT *
FROM CTE_LAG
WHERE (NOT ((prevId <> curId) AND (cur# = lag#))) OR (prevId IS NULL)
If prevId <> curId and cur# = lag#, then there is a change in the value of the id column and the following record has the same [#] value as the previous one, i.e. it is a duplicate.
Hence, using NOT on (prevId <> curId) AND (cur# = lag#), filters out all 'duplicate' records. This means record (5, abc, 2) will be eliminated.
SQL Fiddle Demo here
P.S. You can also add column name in the logical expression of the WHERE clause, depending on what defines a 'duplicate'.
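On SQL Server 2012 or later the LAG function is available, so the self-join on row numbers is not needed. The following is only a sketch of the same filter using LAG: the temp table #mytable and its column types are assumptions based on the sample data, and the names lagged, prevId and prevNo are made up for illustration.

```sql
-- Sample data from the question (column types are assumed).
CREATE TABLE #mytable ([#] INT, name VARCHAR(10), id INT);
INSERT INTO #mytable ([#], name, id) VALUES
(1,'abc',1),(2,'abc',1),(3,'abc',1),(4,'abc',1),(5,'abc',1),
(5,'abc',2),(6,'abc',2),(7,'abc',2),(8,'abc',2),(9,'abc',2);

-- Look at the previous row (ordered by [#], then id) and drop a row
-- when the id changed but the [#] value stayed the same.
WITH lagged AS (
    SELECT [#], name, id,
           LAG(id)  OVER (ORDER BY [#], id) AS prevId,
           LAG([#]) OVER (ORDER BY [#], id) AS prevNo
    FROM #mytable
)
SELECT [#], name, id
FROM lagged
WHERE prevId IS NULL                          -- first row has no predecessor
   OR NOT (prevId <> id AND prevNo = [#]);    -- not a duplicate at an id change

DROP TABLE #mytable;
```

As with the CTE version above, this drops the record (5, abc, 2) and keeps the other nine rows.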
So I achieved it by using the following query in SQL Server.
select [#], name, id
from yourtable
group by [#], name, id
having count(*) > 0

SQL Server: Select only one row of rows that has the same ID on some column

I have a table that has 3 columns:
- ID
- FROM
- TO
And I have data like this:
-----------------------
ID | FROM | TO
1 | 2 | 1
2 | 5 | 1
3 | 7 | 1
4 | 2 | 1
5 | 2 | 1
6 | 9 | 1
7 | 3 | 1
8 | 4 | 1
9 | 5 | 1
I would like to create a query that selects all rows where TO = 1, without displaying rows that were already retrieved. For example, I have multiple rows where FROM = 2 and TO = 1, and I need to retrieve that combination only once.
My table doesn't really look like this, but I am giving a small example because my aim is to collect all FROM numbers without any redundancy.
Use the DISTINCT keyword (FROM and TO are reserved words, so they need brackets in SQL Server):
select distinct m.[from],m.[to] from mytable as m;
Use DISTINCT
SELECT DISTINCT [from],[to] FROM yourTable WHERE [to] = 1
You just have to group by the columns you want to display:
select [from] from mytable group by [from]
If you want to see how many froms you have all you have to do is:
select [from], count(*) from mytable group by [from]
You could use DISTINCT instead, but in some cases it can be slower than GROUP BY and require more memory.
Please read here if you want an explanation on the difference between group by and distinct:
Huge performance difference when using group by vs distinct
Not sure what exactly you meant:
select distinct [FROM] from TableName where [TO] = 1
Or maybe you need a single row for every distinct [FROM] value for a given [TO]?
;with cte as (
select ID, [FROM], [TO],
rn = row_number() over (partition by [FROM] order by ID)
from TableName
where [TO] = 1
)
select ID, [FROM], [TO]
from cte
where rn=1