SUM only distinct values given certain criteria SQL - sql

I have a table whose data looks like this:
LayerID  Company id  Company name  Layer Name  Price
1        1           x             x1          20
2        1           x             x2          10
3        2           y             y1          50
4        2           y             y2          50
5        2           y             y3          50
6        3           z             z1          15
What I want is the following table after the SQL query is applied:
Company id  Company name  Price
1           x             30
2           y             50
3           z             15
i.e. the following rules apply:
If the prices of a company's layers differ, sum them up.
Example: for company x it would be 20 + 10 = 30.
If the prices of a company's layers are all the same, take that number.
Example: for company y it would be 50, and for z it would be 15.
I'm not sure how I would do this in SQL (for Access/VBA), and I have been trying to figure it out to no avail.
Thanks for your help in advance
Claudy

The SQL query that would produce the result you are looking for:
SELECT m.Company_id, m.Company_name, SUM(m.Price)
FROM
(
SELECT DISTINCT Company_id, Company_name, Price
FROM MyTable
) AS m
GROUP BY m.Company_id, m.Company_name
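To check the query against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for Access here, which is an assumption on my part, and the column names with underscores are the ones the answer uses rather than the ones in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Access table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (LayerID INTEGER, Company_id INTEGER, "
    "Company_name TEXT, Layer_name TEXT, Price INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "x", "x1", 20),
        (2, 1, "x", "x2", 10),
        (3, 2, "y", "y1", 50),
        (4, 2, "y", "y2", 50),
        (5, 2, "y", "y3", 50),
        (6, 3, "z", "z1", 15),
    ],
)

# The inner SELECT DISTINCT collapses repeated prices per company,
# the outer SUM adds up whatever distinct prices remain.
rows = conn.execute(
    """
    SELECT m.Company_id, m.Company_name, SUM(m.Price) AS Price
    FROM (
        SELECT DISTINCT Company_id, Company_name, Price
        FROM MyTable
    ) AS m
    GROUP BY m.Company_id, m.Company_name
    ORDER BY m.Company_id
    """
).fetchall()

for row in rows:
    print(row)
```

The three rows printed match the table the question asks for: 30 for x, 50 for y and 15 for z.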

You can do this as:
SELECT m.Company_id, m.Company_name, SUM(DISTINCT m.Price)
FROM MyTable m
GROUP BY m.Company_id, m.Company_name;
As a warning: I never use sum(distinct). It generally indicates an error in the underlying data structure or subquery generating the data.
EDIT:
Why is it bad to do this? Generally, what you really want is:
SUM(m.Price) where <some id> is distinct
But you can't phrase that in SQL without a subquery. If the above is what you want, then you have a problem when two "id"s have the same price: sum(distinct) counts that price only once and so produces the wrong value.
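The warning can be made concrete with a small sketch. The tables `Layers` and `Tags` below are hypothetical and not from the question: two different layers share the price 20, and a join to `Tags` repeats layer 1. SQLite via Python is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: layers 1 and 2 are different rows with the same price,
# and layer 1 has two tags, so a join repeats it.
conn.executescript(
    """
    CREATE TABLE Layers (LayerID INTEGER PRIMARY KEY, Price INTEGER);
    CREATE TABLE Tags (LayerID INTEGER, Tag TEXT);
    INSERT INTO Layers VALUES (1, 20), (2, 20), (3, 10);
    INSERT INTO Tags VALUES (1, 'a'), (1, 'b'), (2, 'a'), (3, 'a');
    """
)

joined = "FROM Layers l JOIN Tags t ON t.LayerID = l.LayerID"


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Plain SUM over the join counts layer 1 twice.
plain_sum = scalar(f"SELECT SUM(l.Price) {joined}")
# SUM(DISTINCT) merges layers 1 and 2 because their prices are equal.
distinct_sum = scalar(f"SELECT SUM(DISTINCT l.Price) {joined}")
# Deduplicating on the id in a subquery sums each layer exactly once.
per_id_sum = scalar(
    f"SELECT SUM(d.Price) FROM (SELECT DISTINCT l.LayerID, l.Price {joined}) AS d"
)

print(plain_sum, distinct_sum, per_id_sum)
```

With this data the plain sum is too high (70), the distinct sum is too low (30), and only the subquery that deduplicates by `LayerID` gives the per-layer total (50).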

Related

Select rows where a value on table x is 1 greater than the same value on table y

I need to create a report of all the rows where a value in table (x) is 1 greater than another value in table (y).
For example, I want to select all rows from TABLE X where the 'Total' is 1 greater than the 'Sum' in TABLE Y. So here I want to select ONLY Dai's record:
TABLE X:
Name  Total
Dai   1001
Cam   1001
TABLE Y:
Name  Sum
Dai   1000
Cam   1001
I am running this SQL in an older version of SQL*Plus so any newer methods probably won't work.
Thanks in advance!
I think the solution could be like this:
select *
from X join Y on X.Total = Y.Sum + 1 and X.Name = Y.Name;
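Here is a quick sketch of that join on the sample data, run in SQLite from Python as a stand-in for Oracle (an assumption). It also shows an equivalent comma-join form with the conditions in WHERE, which may help if the database behind that older SQL*Plus predates the JOIN ... ON syntax (as far as I know that arrived with Oracle 9i). The column `Sum` is quoted here only because SQLite needs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE X (Name TEXT, Total INTEGER);
    CREATE TABLE Y (Name TEXT, "Sum" INTEGER);
    INSERT INTO X VALUES ('Dai', 1001), ('Cam', 1001);
    INSERT INTO Y VALUES ('Dai', 1000), ('Cam', 1001);
    """
)

# ANSI join, as in the answer above.
ansi_rows = conn.execute(
    """
    SELECT X.Name, X.Total
    FROM X JOIN Y ON X.Total = Y."Sum" + 1 AND X.Name = Y.Name
    """
).fetchall()

# Same conditions written as an old-style comma join.
comma_rows = conn.execute(
    """
    SELECT X.Name, X.Total
    FROM X, Y
    WHERE X.Name = Y.Name AND X.Total = Y."Sum" + 1
    """
).fetchall()

print(ansi_rows)
print(comma_rows)
```

Both forms return only Dai's row, since Cam's Total equals the Sum instead of exceeding it by 1.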

JOIN on aggregate function

I have a table showing production steps (PosID) for a production order (OrderID) and which machine (MachID) they will be run on; I’m trying to reduce the table to show one record for each order – the lowest position (field “PosID”) that is still open (field “Open” = Y); i.e. the next production step for the order.
Example data I have:
OrderID  PosID  MachID  Open
1        1      A       N
1        2      B       Y
1        3      C       Y
2        4      C       Y
2        5      D       Y
2        6      E       Y
Example result I want:
OrderID  PosID  MachID
1        2      B
2        4      C
I’ve tried two approaches, but I can’t seem to get either to work:
I don’t want to put “MachID” in the GROUP BY because that gives me all the records that are open, but I also don’t think there is an appropriate aggregate function for the “MachID” field to make this work.
SELECT "OrderID", MIN("PosID"), "MachID"
FROM Table T0
WHERE "Open" = 'Y'
GROUP BY "OrderID"
With this approach, I keep getting error messages that T1."PosID" (in the JOIN clause) is an invalid column. I've also tried T1.MIN("PosID") and MIN(T1."PosID").
SELECT T0."OrderID", T0."PosID", T0."MachID"
FROM Table T0
JOIN
    (SELECT "OrderID", MIN("PosID")
     FROM Table
     WHERE "Open" = 'Y'
     GROUP BY "OrderID") T1
ON T0."OrderID" = T1."OrderID"
AND T0."PosID" = T1."PosID"
Try this:
SELECT "OrderID", "PosID", "MachID" FROM (
    SELECT
        T0."OrderID",
        T0."PosID",
        T0."MachID",
        ROW_NUMBER() OVER (PARTITION BY "OrderID" ORDER BY "PosID") RNK
    FROM Table T0
    WHERE "Open" = 'Y'
) AS A
WHERE RNK = 1
I've kept the quotes around the column names as you wrote them in the question, but in general they are not needed.
The inner query first keeps only the open rows and then numbers the rows of each OrderID from 1 upward, ordered by PosID:
OrderID  PosID  MachID  Open  RNK
1        2      B       Y     1
1        3      C       Y     2
2        4      C       Y     1
2        5      D       Y     2
2        6      E       Y     3
The outer query then filters on RNK = 1, which marks the lowest PosID per OrderID. ROW_NUMBER() in the select clause is called a window function, and there are many more which are quite useful.
P.S. Above solution should work for MSSQL
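For what it's worth, the join approach from the question can also be made to work. The "invalid column" error comes from the derived table T1 having no column named "PosID", because the MIN("PosID") expression was never given an alias. Below is a minimal sketch under some assumptions: SQLite via Python instead of MSSQL, a hypothetical table name `Steps` in place of the placeholder `Table`, and a made-up alias "MinPosID".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Steps" is a hypothetical name standing in for the question's table.
conn.executescript(
    """
    CREATE TABLE Steps ("OrderID" INTEGER, "PosID" INTEGER,
                        "MachID" TEXT, "Open" TEXT);
    INSERT INTO Steps VALUES
        (1, 1, 'A', 'N'), (1, 2, 'B', 'Y'), (1, 3, 'C', 'Y'),
        (2, 4, 'C', 'Y'), (2, 5, 'D', 'Y'), (2, 6, 'E', 'Y');
    """
)

# The aggregate gets an alias so the outer join condition can refer to it.
next_steps = conn.execute(
    """
    SELECT T0."OrderID", T0."PosID", T0."MachID"
    FROM Steps T0
    JOIN (
        SELECT "OrderID", MIN("PosID") AS "MinPosID"
        FROM Steps
        WHERE "Open" = 'Y'
        GROUP BY "OrderID"
    ) T1
      ON T0."OrderID" = T1."OrderID"
     AND T0."PosID" = T1."MinPosID"
    ORDER BY T0."OrderID"
    """
).fetchall()

for row in next_steps:
    print(row)
```

On the sample data this returns position 2 on machine B for order 1 and position 4 on machine C for order 2, the same rows as the ROW_NUMBER() version.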

Count different groups in the same query

Imagine I have a table like this:
# | A | B | MoreFieldsHere
1 1 1
2 1 3
3 1 5
4 2 6
5 2 7
6 3 9
B is associated with A in a 1:n relationship. The table could have been created with a join, for example.
I want to get both the total count and the count of different A.
I know I can use a query like this:
SELECT v1.cnt AS total, v2.cnt AS num_of_A
FROM
(
SELECT COUNT(*) AS cnt
FROM SomeComplicatedQuery
WHERE 1=1
-- AND SomeComplicatedCondition
) v1,
(
SELECT COUNT(A) AS cnt
FROM SomeComplicatedQuery
WHERE 1=1
-- AND SomeComplicatedCondition
GROUP BY A
) v2
However, SomeComplicatedQuery would be a complicated and slow query, and SomeComplicatedCondition would be the same in both cases, so I want to avoid running it unnecessarily. Aside from that, if the query changes, you need to make sure to change it in the other place too, which is error-prone and creates (probably unnecessary) work.
Is there a way to do this more efficiently?
Are you looking for this?
SELECT COUNT(*) AS total, COUNT(DISTINCT A) AS num_of_A
FROM (. . . ) q
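As a small check of the idea, here is a sketch on the sample rows from the question. A plain table `q` stands in for the complicated subquery (an assumption, since the real query is not shown), and SQLite via Python is used only to make it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table q stands in for the result of SomeComplicatedQuery (assumption).
conn.executescript(
    """
    CREATE TABLE q (A INTEGER, B INTEGER);
    INSERT INTO q VALUES (1, 1), (1, 3), (1, 5), (2, 6), (2, 7), (3, 9);
    """
)

# One pass over the data yields both the row count and the number of distinct A.
total, num_of_A = conn.execute(
    "SELECT COUNT(*) AS total, COUNT(DISTINCT A) AS num_of_A FROM q"
).fetchone()

print(total, num_of_A)
```

The source query is written once, so the condition cannot drift between two copies. For the six sample rows this gives a total of 6 and 3 distinct values of A.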

Eliminating duplicate rows with null values using with clause

How do we eliminate duplicates by only selecting those with values in a certain field using with clause statement?
Query is something like this:
with x as (--queries with multiple join tables, etc.)
select distinct * from x
Output below:
Com_no Company Loc Rewards
1 Mccin India 50
1 Mccin India
2 Rowle China 18
3 Draxel China 11
3 Draxel China
4 Robo UK
As you can see, I get duplicate records. I want to get rid of the null values that are NOT unique. That is to say, Robo is unique since it only has 1 record with a null value in Rewards, so I want to keep that.
I tried this:
with x as (--queries with multiple join tables, etc.)
select distinct * from x where Rewards is not null
And of course that wasn't right, since it also got rid of 4 Robo UK
Expected output should be:
1 Mccin India 50
2 Rowle China 18
3 Draxel China 11
4 Robo UK
The problem is you're calling those rows duplicates, but they're not duplicates. They're different. So what you want to do is exclude rows where Rewards is null UNLESS there aren't any rows with a not null value, and then select the distinct rows. So something like:
select distinct *
from x a
where Rewards is not null
   or (Rewards is null and not exists (select 1 from x b where a.Com_no = b.Com_no
                                       and b.Rewards is not null))
Now your Robo row will still be included as there isn't a row in x for Robo where Rewards is not null, but the rows for the other Companies with null Rewards will be excluded as there are not null rows for them.
This is a prioritization query. One method is to use row_number(). If you want only one value per Com_no/Company/Loc, then:
select x.*
from (select x.*,
row_number() over (partition by Com_no, Company, Loc order by Rewards nulls last) as seqnum
from x
) x
where seqnum = 1;
Or even:
select Com_no, Company, Loc, max(Rewards)
from x
group by Com_no, Company, Loc;
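Two of the approaches above (the NOT EXISTS filter and the MAX with GROUP BY) can be tried on the sample output with a short sketch. The base table `source_rows` is hypothetical and only feeds the `with x as (...)` clause, and SQLite via Python is an assumption made so the example runs. The row_number() variant is left out because `nulls last` is not available in every SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table that plays the role of the multi-join query inside x.
conn.executescript(
    """
    CREATE TABLE source_rows (Com_no INTEGER, Company TEXT, Loc TEXT, Rewards INTEGER);
    INSERT INTO source_rows VALUES
        (1, 'Mccin', 'India', 50), (1, 'Mccin', 'India', NULL),
        (2, 'Rowle', 'China', 18),
        (3, 'Draxel', 'China', 11), (3, 'Draxel', 'China', NULL),
        (4, 'Robo', 'UK', NULL);
    """
)

# Keep a NULL row only when the company has no non-NULL Rewards row at all.
not_exists_rows = conn.execute(
    """
    WITH x AS (SELECT * FROM source_rows)
    SELECT DISTINCT *
    FROM x a
    WHERE a.Rewards IS NOT NULL
       OR NOT EXISTS (SELECT 1 FROM x b
                      WHERE b.Com_no = a.Com_no AND b.Rewards IS NOT NULL)
    ORDER BY a.Com_no
    """
).fetchall()

# MAX ignores NULLs, and returns NULL only when every value in the group is NULL.
max_rows = conn.execute(
    """
    WITH x AS (SELECT * FROM source_rows)
    SELECT Com_no, Company, Loc, MAX(Rewards)
    FROM x
    GROUP BY Com_no, Company, Loc
    ORDER BY Com_no
    """
).fetchall()

for row in not_exists_rows:
    print(row)
```

Both queries give the four expected rows here, with Robo keeping its empty Rewards. They can differ on other data: if a company had two different non-null Rewards values, the NOT EXISTS version would return both rows, while MAX would return only the larger one.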

Finding duplicates on one column using select where in SQL Server 2008

I am trying to select rows from a table that have duplicates in one column but also restrict the rows based on another column. It does not seem to be working correctly.
select Id,Terms from QueryData
where Track = 'Y' and Active = 'Y'
group by Id,Terms
having count(Terms) > 1
If I remove the where it works fine but I need to restrict it to these rows only.
ID Terms Track Active
100 paper Y Y
200 paper Y Y
100 juice Y Y
400 orange N N
1000 apple Y N
Ideally the query should return the first 2 rows.
SELECT Id, Terms, Track, Active
FROM QueryData
WHERE Terms IN (
SELECT Terms
FROM QueryData
WHERE Track = 'Y' and Active = 'Y'
GROUP BY Terms
HAVING COUNT(*) > 1
)
Demo on SQLFiddle
Data:
ID Terms Track Active
100 paper Y Y
200 paper Y Y
100 juice Y Y
400 orange N N
1000 apple Y N
Results:
Id Terms Track Active
100 paper Y Y
200 paper Y Y
I don't exactly get what you're doing. You use count(Terms) in the having clause, but Terms is also in your group by, so each (Id, Terms) group holds a single row here and count(Terms) is 1. You probably have to take Terms out of the group by and the select list.
Honestly, I reproduced your table and query and it doesn't work.
Probably this is what you're looking for(?):
select Id, count(Terms) from QueryData
where Track = 'Y' and Active = 'Y'
group by Id
having count(Terms) > 1
This will return all duplicated terms meeting the criteria:
select Terms
from QueryData
where Track = 'Y' and Active = 'Y'
group by Terms
having count(*) > 1
http://sqlfiddle.com/#!3/18a57/2
If you want all the details for these terms, you can join to this result.
;with dups as (
select Terms
from QueryData
where Track = 'Y' and Active = 'Y'
group by Terms
having count(*) > 1
)
select
qd.ID, qd.Terms, qd.Track, qd.Active
from
QueryData qd join
dups d on qd.terms = d.terms
http://sqlfiddle.com/#!3/18a57/5
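One thing to watch with both the IN version and the join version: the Track/Active filter sits only in the query that finds the duplicated terms, not in the outer query. With the sample data that makes no difference, but a sketch with one hypothetical extra row (Id 300, 'paper', N, N, which is not in the question) shows the effect. SQLite via Python stands in for SQL Server 2008 here, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE QueryData (Id INTEGER, Terms TEXT, Track TEXT, Active TEXT);
    INSERT INTO QueryData VALUES
        (100, 'paper', 'Y', 'Y'), (200, 'paper', 'Y', 'Y'),
        (100, 'juice', 'Y', 'Y'), (400, 'orange', 'N', 'N'),
        (1000, 'apple', 'Y', 'N'),
        (300, 'paper', 'N', 'N');  -- hypothetical extra row, not in the question
    """
)

# Terms that occur more than once among the tracked, active rows.
dup_terms = """
    SELECT Terms FROM QueryData
    WHERE Track = 'Y' AND Active = 'Y'
    GROUP BY Terms
    HAVING COUNT(*) > 1
"""

# Outer query without the filter: every row with a duplicated term comes back.
unfiltered = conn.execute(
    f"SELECT Id, Terms, Track, Active FROM QueryData "
    f"WHERE Terms IN ({dup_terms}) ORDER BY Id"
).fetchall()

# Outer query with the filter repeated: only tracked, active rows come back.
filtered = conn.execute(
    f"SELECT Id, Terms, Track, Active FROM QueryData "
    f"WHERE Track = 'Y' AND Active = 'Y' AND Terms IN ({dup_terms}) ORDER BY Id"
).fetchall()

print(unfiltered)
print(filtered)
```

Without the repeated filter the inactive 'paper' row (Id 300) is returned as well. Repeating the Track and Active conditions in the outer query keeps only rows 100 and 200.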