Group by unknown values using SQL - sql

I have a table called "shipment" and a table called "order". Orders and shipments are related through the table "order_movement", which holds the shipment_gid and the order_release_gid.
The shipment table holds the name of the carrier (servprov_gid). I want to group all the orders by carrier name. Simple up to this point. Here is my query:
select count(distinct order_release_gid) X, servprov_gid Y
from
(select distinct ord.order_release_gid, ship.servprov_gid
from order_release ord,
shipment ship,
order_movement om
where ship.shipment_gid = om.shipment_gid
and om.order_release_gid = ord.order_release_gid
and ship.servprov_gid in ('CNHILA.CAVL_CCWB','CNHILA.PRLG_CCPL','CNHILA.TCXS_CCWB','CNHILA.RDWY_CCWB', 'CNHILA.WAWL_CCWB'))
group by servprov_gid
Please ignore the form of the query; it's not the focus of the question. So now I have the orders for each carrier chosen from that list. But I'd also like to know, in the same query, the number of orders handled by all the other carriers. What I'd expect is a table containing
X | Y
1 | CNHILA.CAVL_CCWB
...
6 | OTHER
Is it possible? Thank you
EDIT
My expected output is a six-row table containing the number of orders for each of the 5 carriers specified in the "IN" clause, plus the number of all the other orders (the ones with a different carrier):
X | Y
1 | CNHILA.CAVL_CCWB
2 | CNHILA.PRLG_CCPL
0 | CNHILA.TCXS_CCWB
2 | CNHILA.RDWY_CCWB
12 | CNHILA.WAWL_CCWB
6 | OTHER

Skip the IN list in the WHERE clause; you are going to read everything anyway. Instead, use a CASE expression to map every carrier that is not in the list to OTHER:
select count(order_release_gid) X, servprov_gid Y
from
(select distinct ord.order_release_gid,
case
when ship.servprov_gid in ('CNHILA.CAVL_CCWB','CNHILA.PRLG_CCPL','CNHILA.TCXS_CCWB','CNHILA.RDWY_CCWB', 'CNHILA.WAWL_CCWB')
then ship.servprov_gid
else 'OTHER'
end servprov_gid
from order_release ord,
shipment ship,
order_movement om
where ship.shipment_gid = om.shipment_gid
and om.order_release_gid = ord.order_release_gid
)
group by servprov_gid
order by case servprov_gid when 'OTHER' then 2 else 1 end
, servprov_gid
The case in the order by is only there to ensure that the OTHER row is always the last row.
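As a rough illustration of the CASE-based grouping, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The rows, the `ACME.*` carriers and the shortened two-carrier list are made up for the demonstration, and the `order_release` table is left out because `order_movement` already carries the order id.

```python
import sqlite3

# Shortened list of "named" carriers; everything else is folded into OTHER.
LISTED = ("CNHILA.CAVL_CCWB", "CNHILA.PRLG_CCPL")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (shipment_gid TEXT, servprov_gid TEXT);
    CREATE TABLE order_movement (shipment_gid TEXT, order_release_gid TEXT);
    -- Hypothetical sample data; the ACME carriers are invented.
    INSERT INTO shipment VALUES
        ('S1', 'CNHILA.CAVL_CCWB'),
        ('S2', 'CNHILA.PRLG_CCPL'),
        ('S3', 'ACME.CARRIER_A'),
        ('S4', 'ACME.CARRIER_B');
    INSERT INTO order_movement VALUES
        ('S1', 'O1'), ('S2', 'O2'), ('S2', 'O3'),
        ('S3', 'O4'), ('S4', 'O5'), ('S4', 'O5');
""")

query = """
    SELECT COUNT(*) AS x, carrier AS y
    FROM (SELECT DISTINCT om.order_release_gid,
                 CASE WHEN ship.servprov_gid IN (?, ?)
                      THEN ship.servprov_gid
                      ELSE 'OTHER'
                 END AS carrier
          FROM shipment ship
          JOIN order_movement om ON om.shipment_gid = ship.shipment_gid)
    GROUP BY carrier
    ORDER BY CASE carrier WHEN 'OTHER' THEN 2 ELSE 1 END, carrier
"""
rows = conn.execute(query, LISTED).fetchall()
for count, carrier in rows:
    print(count, "|", carrier)
```

The two unlisted carriers collapse into a single OTHER row, and the duplicate movement for order O5 is counted once because of the DISTINCT in the inner query.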

You need to manually provide the same value to all of the OTHER providers so that you can group them. One way would be to use the DECODE function:
select
count(distinct order_release_gid) X,
ShipmentGroupID Y
from
(select distinct
ord.order_release_gid,
decode(ship.servprov_gid,
'CNHILA.CAVL_CCWB', 'CNHILA.CAVL_CCWB',
'CNHILA.PRLG_CCPL', 'CNHILA.PRLG_CCPL',
'CNHILA.TCXS_CCWB', 'CNHILA.TCXS_CCWB',
'CNHILA.RDWY_CCWB', 'CNHILA.RDWY_CCWB',
'CNHILA.WAWL_CCWB', 'CNHILA.WAWL_CCWB',
'OTHER') ShipmentGroupID
from
order_release ord,
shipment ship,
order_movement om
where
ship.shipment_gid = om.shipment_gid
and om.order_release_gid = ord.order_release_gid
)
group by
ShipmentGroupID
The decode function works like a CASE expression. The first parameter is the value to be compared; it is followed by pairs of values. The first of each pair is compared to the first parameter, and if it matches, the second of the pair is returned. The extra parameter at the end is the default, returned if no match is found.
So if the provider is 'CNHILA.PRLG_CCPL' it will return 'CNHILA.PRLG_CCPL', but if the provider is 'CNHILA.IJustMadeThisUp' it will return 'OTHER' because none of the pairs given in the decode function matched it.
Your query, though, won't return a shipment provider that is never used, and your sample results contain a shipment provider with a count of 0.
This query can be rewritten to get those results, and you don't even need the order table:
select
count(distinct order_release_gid) X,
ShipmentGroupID Y
from
(select distinct
om.order_release_gid,
decode(ship.servprov_gid,
'CNHILA.CAVL_CCWB', 'CNHILA.CAVL_CCWB',
'CNHILA.PRLG_CCPL', 'CNHILA.PRLG_CCPL',
'CNHILA.TCXS_CCWB', 'CNHILA.TCXS_CCWB',
'CNHILA.RDWY_CCWB', 'CNHILA.RDWY_CCWB',
'CNHILA.WAWL_CCWB', 'CNHILA.WAWL_CCWB',
'OTHER') ShipmentGroupID
from
shipment ship
LEFT JOIN order_movement om ON ship.shipment_gid = om.shipment_gid
)
group by
ShipmentGroupID
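One caveat worth checking: a LEFT JOIN that starts from shipment can only report 0 for a carrier that has shipments without order movements. If a listed carrier might have no shipment rows at all, one option is to start from the list of carriers itself. The sketch below assumes a helper table, here called `wanted_carrier`, which is not part of the original schema, and uses SQLite from Python with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (shipment_gid TEXT, servprov_gid TEXT);
    CREATE TABLE order_movement (shipment_gid TEXT, order_release_gid TEXT);
    -- Hypothetical helper table listing the carriers to report on.
    CREATE TABLE wanted_carrier (servprov_gid TEXT);
    INSERT INTO wanted_carrier VALUES ('CNHILA.CAVL_CCWB'), ('CNHILA.TCXS_CCWB');
    -- Invented sample data: TCXS has no shipments at all.
    INSERT INTO shipment VALUES ('S1', 'CNHILA.CAVL_CCWB'), ('S2', 'ACME.CARRIER_A');
    INSERT INTO order_movement VALUES ('S1', 'O1'), ('S1', 'O2'), ('S2', 'O3');
""")

# Driving the query from the carrier list keeps carriers with no shipments,
# and COUNT(DISTINCT ...) ignores the NULLs produced by the outer joins.
rows = conn.execute("""
    SELECT COUNT(DISTINCT om.order_release_gid) AS x, w.servprov_gid AS y
    FROM wanted_carrier w
    LEFT JOIN shipment ship ON ship.servprov_gid = w.servprov_gid
    LEFT JOIN order_movement om ON om.shipment_gid = ship.shipment_gid
    GROUP BY w.servprov_gid
    ORDER BY w.servprov_gid
""").fetchall()
for count, carrier in rows:
    print(count, "|", carrier)
```

This sketch only covers the listed carriers; the OTHER row would still come from a query like the ones above.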

Related

Clean up 'duplicate' data while preserving most recent entry

I want to display each crew member, basic info, and the most recent start date from their contracts. With my basic query, it returns a row for each contract, duplicating the basic info with a distinct start and end date.
I only need one row per person, with the latest start date (or null if they have never yet had a start date).
I have limited understanding of group by and partition functions. Queries I have reverse engineered for similar date use partition and create temp tables where they select from. Ultimately I could reuse that but it seems more convoluted than what we need.
select
Case when P01.EMPLOYMENTENDDATE < getdate() then 'Y'
else ''
end as "Deactivate",
concat(p01.FIRSTNAME,' ',p01.MIDDLENAME) as "First and Middle",
p01.LASTNAME,
p01.PIN,
(select top 1 TELENO FROM PW001P0T WHERE PIN = P01.PIN and TELETYPE = 6 ORDER BY TELEPRIORITY) as "EmailAddress",
org.NAME AS Vessel,
case
WHEN c02.CODECATEGORY= '20' then 'MARINE'
WHEN c02.CODECATEGORY= '10' then 'MARINE'
ELSE 'HOTEL' end as "Department",
c02.name as RankName,
c02.Alternative RankCode,
convert(varchar, ACT.DATEFROM,101) EmbarkDate,
convert(varchar,(case when ACT.DATEFROM is null then p03.TODATEESTIMATED else ACT.DATEFROM end),101) DebarkDate
FROM PW001P01 p01
JOIN PW001P03 p03
ON p03.PIN = p01.PIN
LEFT JOIN PW001C02 c02
ON c02.CODE = p03.RANK
/*LEFT JOIN PW001C02 CCIRankTbl
ON CCIRankTbl.CODE = p01.RANK*/
LEFT JOIN PWORG org
ON org.NUMORGID = dbo.ad_scanorgtree(p03.NUMORGID, 3)
LEFT JOIN PWORGVESACT ACT
ON ACT.numorgid=dbo.ad_scanorgtree(p03.numorgid,3)
where P01.EMPLOYMENTENDDATE > getdate()-10 or P01.EMPLOYMENTENDDATE is null
I only need to show one row per person. The first 5 columns will always be the same. The last columns depend on the contract, and we just need data from the most recent one.
<table><tbody><tr><th>Deactivate</th><th>First and Middle</th><th>Lastname</th><th>PIN</th><th>Email</th><th>Vessel</th><th>Department</th><th>Rank</th><th>RankCode</th><th>Embark</th><th>Debark</th></tr><tr><td> </td><td>Martin</td><td>Smith</td><td>123</td><td>msmith#fake.com</td><td>Ship1</td><td>Marine</td><td>ViceCaptain</td><td>VICE</td><td>9/1/2008</td><td>9/20/2008</td></tr><tr><td> </td><td>Matin</td><td>Smith</td><td>123</td><td>msmith#fake.com</td><td>Ship2</td><td>Marine</td><td>Captain</td><td>CAP</td><td>12/1/2008</td><td>12/20/2008</td></tr><tr><td> </td><td>Steve Mark</td><td>Dude</td><td>98765</td><td>sdude#fake.com</td><td>Ship1</td><td>Hotel</td><td>Chef</td><td>CHEF</td><td>5/1/2009</td><td>8/1/2009</td></tr><tr><td> </td><td>Steve Mark</td><td>Dude</td><td>98765</td><td>sdude#fake.com</td><td>Ship3</td><td>Hotel</td><td>Chef</td><td>CHEF</td><td>10/1/2010</td><td>12/20/2010</td></tr></tbody></table>
Change your query to a SELECT DISTINCT on the main query and use a sub-select for DebarkDate column:
(SELECT TOP 1 A.DATEFROM FROM PWORGVESACT A WHERE A.numorgid = ACT.numorgid ORDER BY A.DATEFROM DESC) AS DebarkDate
You can do whatever conversions on the date you need to from the result of that sub-query.
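To make the "one row per person, latest contract or NULL" requirement concrete, here is a minimal sketch using SQLite from Python. The `person` and `contract` tables are simplified, hypothetical stand-ins for PW001P01 and PW001P03, and the third person is invented to show the no-contract case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical simplified tables standing in for the real schema.
    CREATE TABLE person (pin INTEGER, lastname TEXT);
    CREATE TABLE contract (pin INTEGER, vessel TEXT, start_date TEXT);
    INSERT INTO person VALUES (123, 'Smith'), (98765, 'Dude'), (555, 'Newhire');
    INSERT INTO contract VALUES
        (123, 'Ship1', '2008-09-01'), (123, 'Ship2', '2008-12-01'),
        (98765, 'Ship1', '2009-05-01'), (98765, 'Ship3', '2010-10-01');
""")

# The LEFT JOIN keeps people without contracts; the correlated MAX picks
# only the contract with the latest start date for each person.
rows = conn.execute("""
    SELECT p.pin, p.lastname, c.vessel, c.start_date
    FROM person p
    LEFT JOIN contract c
           ON c.pin = p.pin
          AND c.start_date = (SELECT MAX(c2.start_date)
                              FROM contract c2
                              WHERE c2.pin = p.pin)
    ORDER BY p.pin
""").fetchall()
for row in rows:
    print(row)
```

If two contracts for the same person share the latest start date, this form returns both, so a tie-breaker would be needed in that case.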

SQL: Return single value with multiple criteria

Maybe a simple question, but I don't get the right results, so I hope you can help. I have two different tables: one is filled with order data (OrderID, Supplier and Order Value), the other with invoice data (Invoice ID, Supplier, Invoice value, Invoice value -10%, Invoice value +10%).
What I need is an overview based on the order table that shows whether the order supplier matches an invoice supplier and the order value lies between -10% and +10% of that invoice's value. It doesn't matter which order belongs to which invoice; I only need to know whether there is a match, 'yes' or 'no'.
Example: In the order table you can see row 1 (order 100). It belongs to supplier A and has a value of 10. In the invoice table you can see that row 4 meets the requirements (Supplier = A and order value: 10 -> range between 9 and 11). This should result in a 'Yes'.
Hope you can help!
Thanks in advance,
Greets!
Order table:
Invoice table:
try:
select *
from Order as o
join Invoice as i
on (o.Supplier = i.Supplier and o.Value between i.ValueMinus10Percent and i.ValuePlus10Percent);
If you just need to know if there is a match, use case with exists. However, I'm going to propose returning a matching invoice id. The value will be NULL if there are no matches:
select o.*,
(select i.id
from invoice i
where i.supplier = o.supplier and
o.value between i.value * 0.9 and i.value * 1.1
fetch first 1 row only
) as a_matching_invoice_id
from orders o
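For the plain yes/no variant, a CASE wrapped around EXISTS might look like the sketch below, run here against SQLite from Python. The table and column names are assumptions in the style of the query above, and apart from order 100 (supplier A, value 10) all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows; only order 100 mirrors the question's example.
    CREATE TABLE orders (order_id INTEGER, supplier TEXT, value REAL);
    CREATE TABLE invoice (id INTEGER, supplier TEXT, value REAL);
    INSERT INTO orders VALUES (100, 'A', 10), (101, 'B', 50), (102, 'C', 7);
    INSERT INTO invoice VALUES (1, 'A', 30), (2, 'B', 80), (4, 'A', 10);
""")

# EXISTS stops at the first matching invoice, so each order yields one row
# no matter how many invoices fall inside the +/-10% window.
rows = conn.execute("""
    SELECT o.order_id,
           CASE WHEN EXISTS (SELECT 1
                             FROM invoice i
                             WHERE i.supplier = o.supplier
                               AND o.value BETWEEN i.value * 0.9 AND i.value * 1.1)
                THEN 'Yes' ELSE 'No'
           END AS has_match
    FROM orders o
    ORDER BY o.order_id
""").fetchall()
for row in rows:
    print(row)
```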
You can do this with a subselect:
select o.*, case when exists (select 1 from Invoice i where i.Supplier = o.Supplier and o.Value between i.ValueMinus10Percent and i.ValuePlus10Percent) then 'Yes' else 'No' end as YesOrNo from Order o;
select a.* into temp from order a,invoice b
where a.supplier=b.supplier and a.value=b.value
select *,case when value between min and max then 'yes'
else 'no'
end as state
from temp

finding specific customers in contract table

I am working with MS SQL and need help creating a query.
I have a table called CustomerContracts. It contains multiple rows per item number, one for each customer.
For example real data
item cust_num
x 1156
x 3924
x 7565
x 84339
x 104365
x 106066
x 107377
x 118691
y 1156
y 3924
y 7565
y 84339
y 104365
y 106066
y 107377
So what I need to do is search the table by item number and specific customer numbers, and return the item if one of those customer numbers does not exist as a record for that item.
In this case I am checking all item records for the cust_num values 106066 and 118691. If the item does not contain both customers, I want it included in my results, so in this case item X would not show up, but item Y would.
I think I need to do some type of count. I have tried using NOT IN(002,003) no luck.
Suggestions?
To show my attempt at this: I have tried at least 8 different ways, and this is the latest attempt.
select 'Cust Does not exist' as Status,
i.item as item,
i.description as description,
t.numcusts
From
item i inner join (select count(cust_num) as numcusts,item
from itemcust
where cust_num NOT IN ('106066','118691')
group by item) t on t.item = i.item
where i.stat = 'A' and t.numcusts > 0
order by i.item,i.description
It did not work, so I am still trying to resolve it. I was able to develop a sort of solution using embedded queries in Access, but can't get the SQL it created to port over.
I am guessing that you have multiple records for a customer. You want customers where the table has no record for a given item. One approach is to use group by and having:
select cc.cust_num
from CustomerContracts cc
where cc.cust_num in ('001', '002')
group by cc.cust_num
having sum(case when cc.cust_item_num in ('aaa') then 1 else 0 end) = 0;
Another approach is to use not exists:
select cc.*
from CustomerContracts cc
where cc.cust_num in ('001', '002') and
not exists (select 1 from CustomerContracts cc2 where cc2.cust_num = cc.cust_num and cc2.cust_item_num in ('aaa'));
The first gives a list of customer numbers. The second gives all the records for those customers.
EDIT (based on edit to question):
The question is about items not customers. A modification to the first approach will work. If you only want items that have at least one of the customers, then use a where clause:
select cc.cust_item_num
from CustomerContracts cc
where cc.cust_num in (106066, 118691)
group by cc.cust_item_num
having count(distinct cc.cust_num) < 2;
If you want all items (even where neither customer appears), then count each customer match separately in the having clause:
select cc.cust_item_num
from CustomerContracts cc
group by cc.cust_item_num
having sum(case when cc.cust_num = 106066 then 1 else 0 end) = 0 or
sum(case when cc.cust_num = 118691 then 1 else 0 end) = 0;
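As a quick check of the conditional-sum approach, the sketch below loads the sample rows from the question into SQLite via Python. It assumes the columns are named `item` and `cust_num`, as in the question's sample, rather than `cust_item_num`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomerContracts (item TEXT, cust_num INTEGER)")

# Sample data from the question: item x has both customers, item y lacks 118691.
x_custs = [1156, 3924, 7565, 84339, 104365, 106066, 107377, 118691]
y_custs = [1156, 3924, 7565, 84339, 104365, 106066, 107377]
conn.executemany(
    "INSERT INTO CustomerContracts VALUES (?, ?)",
    [("x", c) for c in x_custs] + [("y", c) for c in y_custs],
)

# An item qualifies when at least one of the two customers never appears for it.
rows = conn.execute("""
    SELECT cc.item
    FROM CustomerContracts cc
    GROUP BY cc.item
    HAVING SUM(CASE WHEN cc.cust_num = 106066 THEN 1 ELSE 0 END) = 0
        OR SUM(CASE WHEN cc.cust_num = 118691 THEN 1 ELSE 0 END) = 0
""").fetchall()
print(rows)
```

Only item y comes back, which matches the result the question asks for.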
Thanks for the help, but after 5 cups of coffee and 2 playlists I figured it out
select 'Cust Does not exist' as Status,
i.item as item,
i.description as description,expr1
From
item i RIGHT join (select itemcust.item,
sum(case when LTRIM(RTRIM(cust_num)) = '106066' or LTRIM(RTRIM(cust_num)) ='118691'then 1 else 0 end) as expr1
from itemcust
group by itemcust.item) t on t.item = i.item
where i.stat = 'A' AND expr1 <> 2
order by i.item,i.description
This gives me all the items that do not have both records or actually have them more than once.

Comparing a list of values

For example, I have a head table with one column, id, and a position table with id, head-id (a reference to the head table, 1 to N), and a value. Now I select one row in the head table, say id 1. I look into the position table and find 2 rows that reference that head and have the values 1337 and 1338. Now I want to select all heads that also have 2 positions with the values 1337 and 1338. The position ids are not the same, only the values, because it is not an M to N relation. Can anyone suggest a SQL statement? I have no idea how to get it done.
Assuming that the value is not repeated for a given headid in the position table, and that it is never NULL, then you can do this using the following logic. Do a full outer join on the position table to the specific head positions you care about. Then check whether there is a full match.
The following query does this:
select *
from (select p.headid,
sum(case when p.value is not null then 1 else 0 end) as pmatches,
sum(case when ref.value is not null then 1 else 0 end) as refmatches
from (select p.headid, p.value
from position p
where p.headid = <whatever>
) ref full outer join
position p
on p.value = ref.value and
p.headid <> ref.headid
group by p.headid
) t
where t.pmatches = t.refmatches
If you do have NULLs in the values, you can accommodate these using coalesce. If you have duplicates, you need to specify more clearly what to do in this case.
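An alternative way to express "same set of values" is to count matches per head and compare the counts, which is a different technique from the full outer join above. The sketch below uses SQLite from Python under the same assumptions (no repeated values per head, no NULLs); the rows and the `REFERENCE_HEAD` name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE position (id INTEGER, headid INTEGER, value INTEGER);
    -- Hypothetical data: head 1 is the reference, head 2 has the same values,
    -- head 3 has only a subset, head 4 has an extra value.
    INSERT INTO position VALUES
        (1, 1, 1337), (2, 1, 1338),
        (3, 2, 1337), (4, 2, 1338),
        (5, 3, 1337),
        (6, 4, 1337), (7, 4, 1338), (8, 4, 9);
""")

REFERENCE_HEAD = 1

# A head matches when every one of its values is found in the reference head
# (COUNT(*) = COUNT(ref.value)) and it covers all of the reference's values.
rows = conn.execute("""
    SELECT p.headid
    FROM position p
    LEFT JOIN position ref
           ON ref.headid = :ref AND ref.value = p.value
    WHERE p.headid <> :ref
    GROUP BY p.headid
    HAVING COUNT(*) = COUNT(ref.value)
       AND COUNT(ref.value) = (SELECT COUNT(*) FROM position WHERE headid = :ref)
    ORDER BY p.headid
""", {"ref": REFERENCE_HEAD}).fetchall()
print(rows)
```

Heads with only a subset or with extra values are rejected, so only head 2 is returned.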
Assuming you have:
Create table head
(
id int
)
Create table pos
(
id int,
head_id int,
value int
)
and you need to find duplicates by value, then I'd use:
Select distinct p.head_id, p1.head_id
from pos p
join pos p1 on p.value = p1.value and p.head_id<>p1.head_id
where p.head_id = 1
This is for a specific head_id; drop the final where clause to get the matches for every head_id.

Variant use of the GROUP BY clause in TSQL

Imagine the following schema and sample data (SQL Server 2008):
OriginatingObject
----------------------------------------------
ID
1
2
3
ValueSet
----------------------------------------------
ID OriginatingObjectID DateStamp
1 1 2009-05-21 10:41:43
2 1 2009-05-22 12:11:51
3 1 2009-05-22 12:13:25
4 2 2009-05-21 10:42:40
5 2 2009-05-20 02:21:34
6 1 2009-05-21 23:41:43
7 3 2009-05-26 14:56:01
Value
----------------------------------------------
ID ValueSetID Value
1 1 28
etc (a set of rows for each related ValueSet)
I need to obtain the ID of the most recent ValueSet record for each OriginatingObject. Do not assume that the higher the ID of a record, the more recent it is.
I am not sure how to use GROUP BY properly in order to make sure the set of results grouped together to form each aggregate row includes the ID of the row with the highest DateStamp value for that grouping. Do I need to use a subquery or is there a better way?
You can do it with a correlated subquery or using IN with multiple columns and a GROUP-BY.
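Please note, a simple GROUP-BY can only bring you to the list of OriginatingObjectIDs and DateStamps. In order to pull the relevant ValueSet IDs, the cleanest solution is to use a subquery.
Multiple-column IN with GROUP-BY (probably faster, but SQL Server does not accept a multi-column IN, so this form only applies to databases such as Oracle, PostgreSQL or MySQL):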
SELECT O.ID, V.ID
FROM OriginatingObject AS O, ValueSet AS V
WHERE O.ID = V.OriginatingObjectID
AND
(V.OriginatingObjectID, V.DateStamp) IN
(
SELECT OriginatingObjectID, Max(DateStamp)
FROM ValueSet
GROUP BY OriginatingObjectID
)
Correlated Subquery:
SELECT O.ID, V.ID
FROM OriginatingObject AS O, ValueSet AS V
WHERE O.ID = V.OriginatingObjectID
AND
V.DateStamp =
(
SELECT Max(DateStamp)
FROM ValueSet V2
WHERE V2.OriginatingObjectID = O.ID
)
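The grouped maximum can also be joined back as a derived table, which avoids relying on a multi-column IN. The sketch below runs that variant against the question's sample ValueSet rows in SQLite from Python; the alias names `latest` and `MaxStamp` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ValueSet (ID INTEGER, OriginatingObjectID INTEGER, DateStamp TEXT)"
)
# Sample rows from the question; ISO-formatted text sorts chronologically.
conn.executemany("INSERT INTO ValueSet VALUES (?, ?, ?)", [
    (1, 1, "2009-05-21 10:41:43"),
    (2, 1, "2009-05-22 12:11:51"),
    (3, 1, "2009-05-22 12:13:25"),
    (4, 2, "2009-05-21 10:42:40"),
    (5, 2, "2009-05-20 02:21:34"),
    (6, 1, "2009-05-21 23:41:43"),
    (7, 3, "2009-05-26 14:56:01"),
])

# Step 1 (derived table): latest DateStamp per OriginatingObjectID.
# Step 2 (join): recover the ID of the row carrying that DateStamp.
rows = conn.execute("""
    SELECT v.OriginatingObjectID, v.ID
    FROM ValueSet v
    JOIN (SELECT OriginatingObjectID, MAX(DateStamp) AS MaxStamp
          FROM ValueSet
          GROUP BY OriginatingObjectID) latest
      ON latest.OriginatingObjectID = v.OriginatingObjectID
     AND latest.MaxStamp = v.DateStamp
    ORDER BY v.OriginatingObjectID
""").fetchall()
print(rows)
```

Note that object 1 resolves to ID 3 rather than the higher ID 6, which is the case the question warns about.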
SELECT OriginatingObjectID, id
FROM (
SELECT id, OriginatingObjectID, RANK() OVER(PARTITION BY OriginatingObjectID
ORDER BY DateStamp DESC) as ranking
FROM ValueSet) AS ranked
WHERE ranking = 1;
This can be done with a correlated sub-query. No GROUP-BY necessary.
SELECT
vs.ID,
vs.OriginatingObjectID,
vs.DateStamp,
v.Value
FROM
ValueSet vs
INNER JOIN Value v ON v.ValueSetID = vs.ID
WHERE
NOT EXISTS (
SELECT 1
FROM ValueSet
WHERE OriginatingObjectID = vs.OriginatingObjectID
AND DateStamp > vs.DateStamp
)
This works only if there cannot be two equal DateStamps for an OriginatingObjectID in the ValueSet table.