SQL query distinct - group by

I am having issues with a SQL query.
I have the following table (I simplified it a bit):
ChainId ComponentId TransmitEndpointId
1 156 NULL
1 156 7
1 157 7
2 510 6
2 510 6
2 511 6
2 511 8
What I need is the number of distinct TransmitEndpointId values for each unique ComponentId within each unique ChainId, added up over all of them.
So the result for the above data would be 5: Chain 1 has 2 unique ComponentIds with 1 TransmitEndpointId each, and Chain 2 has 2 unique ComponentIds with 1 and 2 different TransmitEndpointIds respectively. NULL values don't count.
This is quite complex and I have no idea how to start with this query.
Can anybody help?
Thanks in advance.

This gives you the distinct count per ChainId/ComponentId combination:
SELECT ChainId,
       ComponentId,
       count(distinct TransmitEndpointId)
FROM chain
where TransmitEndpointId is not null
GROUP BY ChainId, ComponentId
To get the total, sum those counts:
select sum(endPointNr) from
(
    SELECT count(distinct TransmitEndpointId) as endPointNr
    FROM chain
    where TransmitEndpointId is not null
    GROUP BY ChainId, ComponentId
) X
EDIT (why add TransmitEndpointId is not null): without that condition, a ChainId/ComponentId group whose only values are NULL would be returned with a count of 0.

This will give you the unique count for each ChainID/ComponentId combination.
SELECT ChainId,
       ComponentId,
       count(distinct TransmitEndpointId)
FROM your_table
GROUP BY ChainId, ComponentId
Now you can use that inside a derived table to get the total count:
SELECT sum(distinct_count)
FROM (
    SELECT ChainId,
           ComponentId,
           count(distinct TransmitEndpointId) as distinct_count
    FROM your_table
    GROUP BY ChainId, ComponentId
) t
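The derived-table query above can be checked against the sample data from the question. The following is a minimal sketch that runs the same SQL through Python's built-in sqlite3 module; the table name chain, the function name total_distinct_endpoints and the use of SQLite are assumptions made for illustration, not part of the original answers.

```python
import sqlite3

# Sample rows from the question: (ChainId, ComponentId, TransmitEndpointId)
ROWS = [
    (1, 156, None), (1, 156, 7), (1, 157, 7),
    (2, 510, 6), (2, 510, 6), (2, 511, 6), (2, 511, 8),
]


def total_distinct_endpoints(rows):
    """Sum the distinct endpoint counts of every (ChainId, ComponentId) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chain (ChainId INT, ComponentId INT, TransmitEndpointId INT)"
    )
    conn.executemany("INSERT INTO chain VALUES (?, ?, ?)", rows)
    (total,) = conn.execute(
        """
        SELECT SUM(distinct_count)
        FROM (
            -- COUNT(DISTINCT col) skips NULLs, so an all-NULL group counts as 0
            SELECT COUNT(DISTINCT TransmitEndpointId) AS distinct_count
            FROM chain
            GROUP BY ChainId, ComponentId
        ) t
        """
    ).fetchone()
    conn.close()
    return total


print(total_distinct_endpoints(ROWS))
```

For the sample rows this yields 5, the value the question expects. A group that contains only NULL endpoints contributes 0, so the total is the same with or without the IS NOT NULL filter.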

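I think this is what you want. Please try this code (note that COUNT(DISTINCT ...) with several columns is MySQL syntax; SQL Server, for example, does not accept it):
SELECT COUNT(DISTINCT ChainId, ComponentId, TransmitEndpointId)
FROM your_table
WHERE TransmitEndpointId IS NOT NULL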

Related

SQL Server (terminal result) hierarchy map

In SQL Server 2016, I have a table with the following chaining structure:
dbo.Item
OriginalItem ItemID
NULL 7
1 2
NULL 1
5 6
3 4
NULL 8
NULL 5
9 11
2 3
EDIT NOTE: Bold numbers were added in response to @lemon's comments below.
Importantly, this example is a trivialized version of the real data; the neatly ascending entries are not present in the actual data, I'm only using them to simplify understanding.
I've constructed a query to get what I'm calling the TerminalItemID, which in this example is ItemID 4, 6, and 7, and populated it into a temporary table #TerminalItems, whose result set looks like:
#TerminalItems
TerminalItemID
4
6
7
8
11
What I need, is a final mapping table that would look something like this (using the above example -- note that it also contains for 4, 6, and 7 mapping to themselves, this is needed by the business logic):
#Mapping
ItemID TerminalItemID
1 4
2 4
3 4
4 4
5 6
6 6
7 7
8 8
9 11
11 11
What I need help with is how to build this last #Mapping table. Any assistance in this direction is greatly appreciated!

This should do:
with MyTbl as (
    select *
    from (values
         (NULL, 1)
        ,(1, 2)
        ,(2, 3)
        ,(3, 4)
        ,(NULL, 5)
        ,(5, 6)
        ,(NULL, 7)
    ) T(OriginalItem, ItemID)
)
, TerminalItems as (
    /* Find all leaf level items: those not appearing under OriginalItem column */
    select LeafItem=ItemId, ImmediateOriginalItem=M.OriginalItem
    from MyTbl M
    where M.ItemId not in
        (select distinct OriginalItem
         from MyTbl AllParn
         where OriginalItem is not null
        )
), AllLevels as (
    /* Use a recursive CTE to find and report all parents */
    select ThisItem=LeafItem, ParentItem=ImmediateOriginalItem
    from TerminalItems
    union all
    select ThisItem=AL.ThisItem, M.OriginalItem
    from AllLevels AL
    inner join
        MyTbl M
        on M.ItemId=AL.ParentItem
)
select ItemId=coalesce(ParentItem,ThisItem), TerminalItemId=ThisItem
from AllLevels
order by 1,2
Beware of the MAXRECURSION setting: by default SQL Server allows 100 levels of recursion, which means the depth of your tree (the number of nodes between a terminal item and its ultimate original item) can be at most 100. You can raise the limit with OPTION (MAXRECURSION nnn), where nnn is adjusted as needed. You can also remove the limit entirely with 0, but this is not recommended because cyclic data would then cause an infinite loop.
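To make the loop concern concrete, here is a small sketch of the same mapping done outside the database. It is not the recursive CTE itself but a plain iterative walk over a dictionary, with an explicit guard that stops on cyclic data. The function name terminal_mapping is hypothetical, and the sketch assumes each item has at most one successor.

```python
# (OriginalItem, ItemID) pairs from the question's dbo.Item table
ITEMS = [
    (None, 7), (1, 2), (None, 1), (5, 6), (3, 4),
    (None, 8), (None, 5), (9, 11), (2, 3),
]


def terminal_mapping(items):
    """Map every known item id to the last item of its chain."""
    # successor[x] is the item that replaced x (assumed unique per item)
    successor = {orig: item for orig, item in items if orig is not None}
    # Include ids that only ever appear as OriginalItem (orphans such as 9)
    all_ids = {item for _, item in items} | set(successor)
    mapping = {}
    for start in all_ids:
        seen = {start}
        current = start
        while current in successor:
            current = successor[current]
            if current in seen:  # cyclic data would otherwise loop forever
                raise ValueError(f"cycle detected at item {current}")
            seen.add(current)
        mapping[start] = current
    return mapping


print(sorted(terminal_mapping(ITEMS).items()))
```

On the question's data this reproduces the #Mapping table, including the self-mappings and the orphaned item 9. The seen set plays the role that a finite MAXRECURSION value plays in SQL Server: it turns a cycle into an error instead of an endless loop.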

This is a typical gaps-and-islands problem and can also be solved without recursion in three steps:
1. assign 1 at the beginning of each partition
2. compute a running sum over your flag value (generated at step 1)
3. extract the max "ItemID" on your partition (generated at step 2)
WITH cte1 AS (
    SELECT *, CASE WHEN OriginalItem IS NULL THEN 1 ELSE 0 END AS changepartition
    FROM Item
), cte2 AS (
    SELECT *, SUM(changepartition) OVER(ORDER BY ItemID) AS parts
    FROM cte1
)
SELECT ItemID, MAX(ItemID) OVER(PARTITION BY parts) AS TerminalItemID
FROM cte2
Check the demo here.
Assumption: Your terminal id items correspond to the "ItemID" value preceding a NULL "OriginalItem" value.
EDIT: "Fixing orphaned records."
The query works correctly when records are not orphaned. The only way to deal with orphaned records is to add the missing records back, so that the query can work on the full data.
This is done by an extra subquery at the beginning, which applies a UNION ALL between:
the available records of the original table
the missing records
WITH fix_orphaned_records AS(
    SELECT * FROM Item
    UNION ALL
    SELECT NULL AS OriginalItem,
           i1.OriginalItem AS ItemID
    FROM Item i1
    LEFT JOIN Item i2 ON i1.OriginalItem = i2.ItemID
    WHERE i1.OriginalItem IS NOT NULL AND i2.ItemID IS NULL
), cte AS (
...
Missing records correspond to "OriginalItem" values that are never found within the "ItemID" field. A self left join will uncover these missing records.
Check the demo here.
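The answer elides how the orphan fix feeds into the three steps, so the following sketch shows one plausible way to chain them. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later); the way the CTEs are wired together and the function name terminal_items are my assumptions. Like the answer, it only works when ItemID values ascend within each chain.

```python
import sqlite3

# (OriginalItem, ItemID) pairs from the question's dbo.Item table
ITEMS = [
    (None, 7), (1, 2), (None, 1), (5, 6), (3, 4),
    (None, 8), (None, 5), (9, 11), (2, 3),
]

QUERY = """
WITH fix_orphaned_records AS (
    SELECT OriginalItem, ItemID FROM Item
    UNION ALL
    -- OriginalItem values that never occur as an ItemID become chain starts
    SELECT NULL AS OriginalItem, i1.OriginalItem AS ItemID
    FROM Item i1
    LEFT JOIN Item i2 ON i1.OriginalItem = i2.ItemID
    WHERE i1.OriginalItem IS NOT NULL AND i2.ItemID IS NULL
), cte1 AS (
    -- step 1: flag the first row of each partition
    SELECT *, CASE WHEN OriginalItem IS NULL THEN 1 ELSE 0 END AS changepartition
    FROM fix_orphaned_records
), cte2 AS (
    -- step 2: the running sum of the flags numbers the partitions
    SELECT *, SUM(changepartition) OVER (ORDER BY ItemID) AS parts
    FROM cte1
)
-- step 3: the largest ItemID of a partition is its terminal item
SELECT ItemID, MAX(ItemID) OVER (PARTITION BY parts) AS TerminalItemID
FROM cte2
ORDER BY ItemID
"""


def terminal_items(items):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Item (OriginalItem INT, ItemID INT)")
    conn.executemany("INSERT INTO Item VALUES (?, ?)", items)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(terminal_items(ITEMS))
```

With the orphaned record (NULL, 9) added back, item 8 ends up in its own partition and 9 and 11 share one, which gives the mapping requested in the question.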

You can use a recursive CTE to compute the last item in the sequence. For example:
with
n (orig_id, curr_id, lvl) as (
    select itemid, itemid, 1 from item
    union all
    select n.orig_id, i.itemid, n.lvl + 1
    from n
    join item i on i.originalitem = n.curr_id
)
select *
from (
    select *, row_number() over(partition by orig_id order by lvl desc) as rn from n
) x
where rn = 1
Result:
orig_id curr_id lvl rn
-------- -------- ---- --
1 4 4 1
2 4 3 1
3 4 2 1
4 4 1 1
5 6 2 1
6 6 1 1
7 7 1 1
See running example at db<>fiddle.

SQL query: Get rows where column A not exists with same value and column value B

Sorry for the confusing title, but I could not find a better one. Here is the situation:
CREATE TABLE orders (
order_id int NOT NULL,
company_id int NOT NULL,
last_update date NULL
)
Table Data:
ORDER_ID COMPANY_ID LAST_UPDATE
1 1 2020/06/08
2 1 2020/06/08
3 1 2020/06/08
4 2 2020/06/08
5 2 2020/01/27
6 3 2020/06/08
7 3 2020/06/08
8 3 2020/06/08
9 3 NULL
10 4 2020/06/08
11 4 2020/06/08
12 4 2020/06/08
13 4 2020/06/08
14 4 2020/06/08
I want all rows of a company for which there is no row with the same company and a LAST_UPDATE older than 3 months (or NULL).
What does not work:
I cannot use a simple WHERE clause on the date, because that only filters out rows 5 and 9. I only want rows 1-3 and 10-14.
What works, but is too slow:
I can use a subquery (AND company_ID NOT IN (SELECT DISTINCT company_id [...])), but this completely kills my performance. In the prod environment I have nearly 50M rows, and the result set of the subquery is too large.
My current workaround:
I order my results by company_id, last_update and use a "continue" in my Java code if a last_update is too old. But that's also not optimal.
Question:
Is there a performant SQL-only way to achieve this? Maybe with a "group by ... having" clause?
Thanks in advance!

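You could use window functions:
select o.*
from (select o.*,
             min(last_update) over (partition by company_id) as min_last_update,
             -- min() ignores NULLs, so count the NULL last_update rows separately
             count(*) over (partition by company_id) - count(last_update) over (partition by company_id) as null_updates
      from orders o
     ) o
where min_last_update >= add_months(sysdate, -3)
  and null_updates = 0;
But a simple not exists should also be fine:
select o.*
from orders o
where not exists (select 1
                  from orders o2
                  where o2.company_id = o.company_id and
                        -- a NULL last_update also disqualifies the company
                        (o2.last_update < add_months(sysdate, -3) or o2.last_update is null)
                 );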
Either of these can take advantage of an index on orders(company_id, last_update).
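As a rough check of the NOT EXISTS approach against the question's sample data, here is a sketch that runs on SQLite through Python's sqlite3 module. Several things are assumptions on my side: SQLite's date(?, '-3 months') stands in for Oracle's add_months(sysdate, -3), the reference date is passed in explicitly so the result is reproducible, dates are stored as ISO text, and the names fresh_company_orders and idx_orders_company_update are hypothetical. The sketch treats a NULL last_update as disqualifying, as the question asks.

```python
import sqlite3

# (order_id, company_id, last_update) rows from the question, dates as ISO text
ORDERS = [
    (1, 1, "2020-06-08"), (2, 1, "2020-06-08"), (3, 1, "2020-06-08"),
    (4, 2, "2020-06-08"), (5, 2, "2020-01-27"),
    (6, 3, "2020-06-08"), (7, 3, "2020-06-08"), (8, 3, "2020-06-08"),
    (9, 3, None),
    (10, 4, "2020-06-08"), (11, 4, "2020-06-08"), (12, 4, "2020-06-08"),
    (13, 4, "2020-06-08"), (14, 4, "2020-06-08"),
]


def fresh_company_orders(orders, today):
    """Return order ids of companies with no stale or NULL last_update."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INT NOT NULL, company_id INT NOT NULL, last_update TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    # Composite index mentioned in the answer, so the subquery can seek
    conn.execute(
        "CREATE INDEX idx_orders_company_update ON orders (company_id, last_update)"
    )
    rows = conn.execute(
        """
        SELECT o.order_id
        FROM orders o
        WHERE NOT EXISTS (
            SELECT 1
            FROM orders o2
            WHERE o2.company_id = o.company_id
              AND (o2.last_update IS NULL
                   OR o2.last_update < date(?, '-3 months'))
        )
        ORDER BY o.order_id
        """,
        (today,),
    ).fetchall()
    conn.close()
    return [order_id for (order_id,) in rows]


print(fresh_company_orders(ORDERS, "2020-06-10"))
```

With a reference date of 2020-06-10 only the orders of companies 1 and 4 remain, that is rows 1-3 and 10-14.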

You can use window functions as follows:
SELECT T.* FROM (
    SELECT T.*,
           -- replace a NULL date with one that is certainly older than 3 months
           MIN(CASE WHEN LAST_UPDATE IS NULL THEN SYSDATE - 100
                    ELSE LAST_UPDATE END) OVER(
               PARTITION BY T.COMPANY_ID
           ) AS MIN_LAST_UPDATE
    FROM ORDERS T ) T
WHERE T.MIN_LAST_UPDATE >= ADD_MONTHS(SYSDATE, - 3);

SQL, Check if Rows are in another Table

I have two tables, Stock and Warehouse.
I need to get the Items which are available in all Warehouses.
Here is an example:
Stock Table:
ItemID WarehouseID ItemAmount
-------------------------------------------
1043 1 20
1043 2 2
1043 3 16
1043 4 17
1044 1 32
1044 2 12
1044 4 7
1055 2 6
 
Warehouse Table:
WarehouseID WarehouseName
-------------------------------
1 Name1
2 Name2
3 Name3
4 Name4
For this example the result should be item 1043, because it is available in all warehouses, unlike the others.
I couldn't find a solution; can anyone help me?

You could also use this "double negative" query using NOT EXISTS:
SELECT DISTINCT s.ItemID
FROM StockTable s
WHERE NOT EXISTS
(
    SELECT 1 FROM Warehouse w
    WHERE NOT EXISTS(SELECT 1 FROM StockTable s2
                     WHERE w.WarehouseID = s2.WarehouseID
                     AND s.ItemID = s2.ItemID)
)
Demo fiddle
This approach looks more verbose but it has some benefits:
you can change it easily if the rules are getting more complex
you can remove the DISTINCT to see all rows
you can add all columns since GROUP BY was not used
it has no issues with null values

select itemid
from stock
group by itemid
having count(distinct warehouseid) = (select count(*) from warehouse);
SQLFiddle: http://sqlfiddle.com/#!15/e4273/1
If the stock table may also contain items with an amount = 0 you need to add a where clause:
select itemid
from stock
where itemamount > 0
group by itemid
having count(distinct warehouseid) = (select count(*) from warehouse);
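Both the counting query and the double NOT EXISTS query from the earlier answer can be compared on the question's sample data. The sketch below runs them on SQLite through Python's sqlite3 module; the table names Stock and Warehouse follow the question, while the function name items_in_all_warehouses and the choice of SQLite are assumptions for illustration.

```python
import sqlite3

# (ItemID, WarehouseID, ItemAmount) and (WarehouseID, WarehouseName) from the question
STOCK = [
    (1043, 1, 20), (1043, 2, 2), (1043, 3, 16), (1043, 4, 17),
    (1044, 1, 32), (1044, 2, 12), (1044, 4, 7),
    (1055, 2, 6),
]
WAREHOUSES = [(1, "Name1"), (2, "Name2"), (3, "Name3"), (4, "Name4")]

COUNT_QUERY = """
SELECT ItemID
FROM Stock
GROUP BY ItemID
HAVING COUNT(DISTINCT WarehouseID) = (SELECT COUNT(*) FROM Warehouse)
ORDER BY ItemID
"""

NOT_EXISTS_QUERY = """
SELECT DISTINCT s.ItemID
FROM Stock s
WHERE NOT EXISTS (
    -- a warehouse in which this item is missing
    SELECT 1 FROM Warehouse w
    WHERE NOT EXISTS (
        SELECT 1 FROM Stock s2
        WHERE s2.WarehouseID = w.WarehouseID AND s2.ItemID = s.ItemID
    )
)
ORDER BY s.ItemID
"""


def items_in_all_warehouses(stock, warehouses):
    """Run both formulations and return their results as two lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Stock (ItemID INT, WarehouseID INT, ItemAmount INT)")
    conn.execute("CREATE TABLE Warehouse (WarehouseID INT, WarehouseName TEXT)")
    conn.executemany("INSERT INTO Stock VALUES (?, ?, ?)", stock)
    conn.executemany("INSERT INTO Warehouse VALUES (?, ?)", warehouses)
    by_count = [row[0] for row in conn.execute(COUNT_QUERY)]
    by_not_exists = [row[0] for row in conn.execute(NOT_EXISTS_QUERY)]
    conn.close()
    return by_count, by_not_exists


print(items_in_all_warehouses(STOCK, WAREHOUSES))
```

Both formulations return only item 1043 for this data. They can diverge when Stock references warehouse IDs that do not exist in Warehouse, because the counting version compares numbers of IDs rather than the IDs themselves.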

NOT EXISTS combined with EXCEPT:
select distinct ItemID
from stock s1
where not exists (select warehouseid from warehouse
except
select warehouseid from stock s2 where s2.ItemID = s1.ItemID);
You can even replace select distinct ItemID with select * to get all those items.

I use this query:
SELECT
ItemID
FROM
stock
GROUP BY
ItemID
HAVING
SUM(DISTINCT warehouseid) = (SELECT SUM(WarehouseID) from warehouse)
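The idea is that this is more reliable than COUNT in the rare situation where no foreign key is defined, in which case COUNT could return invalid results. Be aware, though, that different sets of warehouse IDs can add up to the same sum (for example 1 + 4 = 2 + 3), so this comparison can also match by coincidence.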

Is there a way to update groups of rows with separate incrementing values in one query

Let's say you have the following table:
Id Index
1 3
1 1
2 1
3 3
1 5
what I would like to have is the following:
Id Index
1 0
1 1
2 0
3 0
1 2
As you might notice, the goal is to incrementally renumber the Index column, starting from zero, across all rows that share the same Id.
Now, I know this is fairly simple using cursors, but out of curiosity: is there a way to do this with a single UPDATE query, somehow combined with temp tables, common table expressions or something similar?

Yes, assuming that you don't really care about the order of the new index values. SQL Server offers updatable CTEs and window functions that do exactly what you want:
with toupdate as (
      -- minus 1 so that numbering starts at zero; [table] and [index] are reserved words
      select t.*, row_number() over (partition by id order by (select NULL)) - 1 as newindex
      from [table] t
     )
update toupdate
    set [index] = newindex;
If you want them in a specific order, then you need another column to specify the ordering. The existing index column doesn't work.
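To illustrate what an explicit ordering column buys you, here is a sketch on SQLite via Python's sqlite3 module. SQLite has no updatable CTEs, so this swaps in a different technique: a correlated COUNT(*) over the earlier rows of the same Id. It uses SQLite's implicit rowid (insertion order) as the ordering column, which is an assumption of the sketch; the table name t and the function name renumber are hypothetical.

```python
import sqlite3

# (Id, Index) rows from the question, in their original order
ROWS = [(1, 3), (1, 1), (2, 1), (3, 3), (1, 5)]


def renumber(rows):
    """Renumber "Index" from zero within each Id, following insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t (Id INT, "Index" INT)')
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    conn.execute(
        """
        UPDATE t
        SET "Index" = (
            -- number of rows with the same Id that were inserted earlier
            SELECT COUNT(*)
            FROM t AS earlier
            WHERE earlier.Id = t.Id AND earlier.rowid < t.rowid
        )
        """
    )
    result = conn.execute('SELECT Id, "Index" FROM t ORDER BY rowid').fetchall()
    conn.close()
    return result


print(renumber(ROWS))
```

Because the subquery reads only Id and rowid, which the UPDATE does not change, the numbering is deterministic and matches the table the question asks for.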

With Row_number() -1 and CTE you can write as:
CREATE TABLE #temp1(
Id int,
[Index] int)
INSERT INTO #temp1 VALUES (1,3),(1,1),(2,1),(3,3),(1,5);
--select * from #temp1;
With CTE as
(
select t.*, row_number() over (partition by id order by (select null))-1 as newindex
from #temp1 t
)
Update CTE
set [Index] = newindex;
select * from #temp1;
Demo

I'm not sure why you would want to do this really, but I had fun figuring it out!
This solution relies on your table having a primary key for the self join, but you could always create an auto-increment column if none exists and this is a one-off job. This also has the added benefit of making you think about the precise ordering you want, as currently there is no way of saying in which order the rows of an [ID] will get their [Index].
UPDATE a
SET [Index] = b.newIndex
FROM dbo.Example a
INNER JOIN (
    select
        z.ID,
        z.[Index],
        (row_number() over (partition by ID order by (select NULL))) - 1 as newIndex -- minus 1 so numbering starts at zero
    from Example z
) b ON a.ID = b.ID AND a.[Index]=b.[Index] --Is this a unique self join for your table?.. no PK provided. You might need to make an index first.

Probably, this is what you want
SELECT *,RANK() OVER(PARTITION BY Id ORDER BY [Index])-1 AS NewIndex FROM
(
SELECT 1 AS Id,3 [Index]
UNION
SELECT 1,1
UNION
SELECT 2,1
UNION
SELECT 3,3
UNION
SELECT 1,5
) AS T
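This returns each row with a zero-based NewIndex per Id.
Now, if you want to update the table, execute this script (a window function cannot appear directly in a SET clause, so it goes through a CTE):
WITH ranked AS (
SELECT [Index], RANK() OVER(PARTITION BY Id ORDER BY [Index])-1 AS NewIndex FROM tblname
)
UPDATE ranked SET [Index] = NewIndex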
In case I am missing something or any further assistance is required please let me know.

CREATE TABLE #temp1(
Id int,
Value int)
INSERT INTO #temp1 VALUES (1,2),(1,3),(2,3),(4,5)
SELECT
Id
,Value
,ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) AS [Count]
FROM #temp1
Start with this :)
It gave me results like:
Id Value Count
1 2 1
1 3 2
1 2 3
1 3 4
1 2 5
1 3 6
1 2 7
1 3 8
2 3 1
2 4 2
2 5 3
2 3 4
2 4 5
2 5 6
2 4 7
2 5 8
2 3 9
2 3 10
3 4 1
4 5 1
4 5 2
4 5 3
4 5 4

How to declare a row as an Alternate Row

id Name claim priority
1 yatin 70 5
6 yatin 1 10
2 hiren 30 3
3 pankaj 40 2
4 kavin 50 1
5 jigo 10 4
7 jigo 1 10
This is my table, and I want to arrange it as shown below:
id Name claim priority AlternateFlag
1 yatin 70 5 0
6 yatin 1 10 0
2 hiren 30 3 1
3 pankaj 40 2 0
4 kavin 50 1 1
5 jigo 10 4 0
7 jigo 1 10 0
Rows with the same name form a group, and the flag alternates from one group to the next.
I am using SQL Server 2005. The alternate flag starts with '0'. In my example the first record has the name "yatin", so its AlternateFlag is '0'.
The second record has the same name, "yatin", so its alternate flag is also '0'.
The third record, with the name "hiren", is a single record, so '1' is assigned to it.
In short, I want to identify alternating groups of rows with the same name.
I hope you understand my problem.
Thanks in advance.

Try
SELECT t.*, f.AlternateFlag
FROM tbl t
JOIN (
SELECT [name],
AlternateFlag = ~CAST(ROW_NUMBER() OVER(ORDER BY MIN(ID)) % 2 AS BIT)
FROM tbl
GROUP BY name
) f ON f.name = t.name
demo

You could probably use the aggregate function COUNT() with HAVING and then UNION ALL both result sets, like:
SELECT id, A.Name, Claim, Priority, 0 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) > 1 ) A
ON YourTable.Name = A.Name
UNION ALL
SELECT id, B.Name, Claim, Priority, 1 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) = 1 ) B
ON YourTable.Name = B.Name
Now, this assumes that the names are unique, meaning that a name like Yatin, although it occurs in two rows, is associated with only one person.
See my SqlFiddle Demo

You can use the Row_Number() function with OVER to enumerate the groups, then take the remainder of integer division by 2, so you'll get 1s and 0s in your SELECT or in a view.
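The row-number-and-remainder idea can be sketched as follows, run on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The groups are numbered by the smallest id of each name, which is my reading of the expected output; the table name tbl follows the first answer, while the function name alternate_flags and the column alias first_id are hypothetical.

```python
import sqlite3

# (id, name, claim, priority) rows from the question
ROWS = [
    (1, "yatin", 70, 5), (6, "yatin", 1, 10), (2, "hiren", 30, 3),
    (3, "pankaj", 40, 2), (4, "kavin", 50, 1), (5, "jigo", 10, 4),
    (7, "jigo", 1, 10),
]


def alternate_flags(rows):
    """Return (id, name, AlternateFlag) with the flag alternating per name group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INT, name TEXT, claim INT, priority INT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.id, t.name, f.AlternateFlag
        FROM tbl t
        JOIN (
            -- number the name groups 1, 2, 3, ... and map them to 0, 1, 0, ...
            SELECT name, first_id,
                   (ROW_NUMBER() OVER (ORDER BY first_id) + 1) % 2 AS AlternateFlag
            FROM (SELECT name, MIN(id) AS first_id FROM tbl GROUP BY name) AS g
        ) f ON f.name = t.name
        ORDER BY f.first_id, t.id
        """
    ).fetchall()
    conn.close()
    return result


for row in alternate_flags(ROWS):
    print(row)
```

The first group (yatin) gets 0, the next (hiren) gets 1, and so on, which matches the AlternateFlag column in the question's expected output.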
