Partition By - Sum all values Excluding Maximum Value - sql

I have data as follows
+----+------+--------+
| ID | Code | Weight |
+----+------+--------+
| 1 | M | 200 |
| 1 | 2A | 50 |
| 1 | 2B | 50 |
| 2 | | 350 |
| 2 | M | 350 |
| 2 | 3A | 120 |
| 2 | 3B | 120 |
| 3 | 5A | 100 |
| 4 | | 200 |
| 4 | | 100 |
+----+------+--------+
For ID 1 the max weight is 200, I want to subtract sum of all weights from ID 1 except the max value that is 200.
There might be a case when there are 2 rows containing max values for same id. Example for ID 2 we have 2 rows containing max value i.e. 350 . In such scenario I want to sum all values except the max value. But I would mark weight 0 for 1 of the 2 rows containing max value. That row would be the one where Code is NULL/Blank.
Case where there is only 1 row for an ID the row would be kept as is.
Another scenario could be one where there is only row containing max weight but Code is NULL/Blank in such case we would simply do what we did for ID 1. Sum all values except max value and subtract from row containing max value.
Desired Output
+----+------+--------+---------------+
| ID | Code | Weight | Actual Weight |
+----+------+--------+---------------+
| 1 | M | 200 | 100 |
| 1 | 2A | 50 | 50 |
| 1 | 2B | 50 | 50 |
| 2 | | 350 | 0 |
| 2 | M | 350 | 110 |
| 2 | 3A | 120 | 120 |
| 2 | 3B | 120 | 120 |
| 3 | 5A | 100 | 100 |
| 4 | | 200 | 100 |
| 4 | | 100 | 100 |
+----+------+--------+---------------+
I want to create column Actual Weight as shown above. I can't find a way to apply partition by excluding max value and create column Actual Weight.

dense_rank() to identify the row with max weight, dr = 1 is rows with max weight
row_number() to differentiate the max weight row for Code = blank from others
with cte as
(
select *,
dr = dense_rank() over (partition by ID order by [Weight] desc),
rn = row_number() over (partition by ID order by [Weight] desc, Code desc)
from tbl
)
select *,
ActWeight = case when dr = 1 and rn <> 1
then 0
when dr = 1 and rn = 1
then [Weight]
- sum(case when dr <> 1 then [Weight] else 0 end) over (partition by ID)
else [Weight]
end
from cte
dbfiddle demo

Hmmm . . . I think you just want window functions and conditional logic:
select t.*,
(case when 1 = row_number() over (partition by id order by weight desc, (case when code <> '' then 2 else 1 end))
then weight - sum(case when weight <> max_weight then weight else 0 end) over (partition by id)
else weight
end) as actual_weight
from (select t.*,
max(weight) over (partition by id, code) as max_weight
from t
) t

Related

SQL select with preference on column values

I am new to SQL and I would like to ask about how to select entries based on preferences and grouping.
+----------+----------+------+
| ENTRY_ID | ROUTE_ID | TYPE |
+----------+----------+------+
| 1 | 15 | 0 |
| 1 | 26 | 1 |
| 1 | 39 | 1 |
| 2 | 22 | 1 |
| 2 | 15 | 1 |
| 3 | 30 | 1 |
| 3 | 35 | 0 |
| 3 | 40 | 1 |
+----------+----------+------+
With the table above, I would like to select 1 entry for each ENTRY_ID with the following preference for the returned ROUTE_ID:
IF TYPE = 0 is available
for any one of the entries with the same ENTRY_ID, return the minimum ROUTE_ID for all entries with TYPE = 0
IF for the same ENTRY_ID only TYPE = 1 is available, return the minimum ROUTE_ID
The expected outcome for the query will be the following:
+----------+----------+------+
| ENTRY_ID | ROUTE_ID | TYPE |
+----------+----------+------+
| 1 | 15 | 0 |
| 2 | 15 | 1 |
| 3 | 35 | 0 |
+----------+----------+------+
Thank you for your help!
You can group by both TYPE and ENTRY_ID, and then use the HAVING clause to filter out those where TYPE is not the minimal value for that record.
SELECT ENTRY_ID, MIN(ROUTE_ID), TYPE
FROM MyTable
GROUP BY ENTRY_ID, TYPE
HAVING TYPE = (SELECT MIN(s.TYPE) FROM MyTable s WHERE s.ENTRY_ID = MyTable.ENTRY_ID)
This relies on type only being able to be 0 or 1. If there are more possible values, it will only return the lowest type.
If you want complete rows, use a correlated subquery:
select t.*
from t
where t.route_id = (select top 1 t2.route_id
from t as t2
where t2.entry_id = t.entry_id
order by iif(t2.type = 0, 1, 2), -- put type 0 first
t2.route_id asc -- then the first route_id
);
This has the advantage that it can return more than just the three columns you show in the question.

Distribute sequential SQL results evenly based on count

I have SQL results that I need to break into item ranges and the count distributed evenly across a number of tasks. What is a good way to do this?
My data looks like this.
+------+-------+----------+
| Item | Count | ItmGroup |
+------+-------+----------+
| 1A | 100 | 1 |
| 1B | 25 | 1 |
| 1C | 2 | 1 |
| 1D | 6 | 1 |
| 2A | 88 | 2 |
| 2B | 10 | 2 |
| 2C | 122 | 2 |
| 2D | 12 | 2 |
| 3A | 4 | 3 |
| 3B | 103 | 3 |
| 3C | 1 | 3 |
| 3D | 22 | 3 |
| 4A | 55 | 4 |
| 4B | 42 | 4 |
| 4C | 100 | 4 |
| 4D | 1 | 4 |
+------+-------+----------+
Item = the item code.
Count = this context it is determining the popularity of the item. This can be used to RANK items if need be.
ItmGroup - this is a parent value for the Itm column. Item is contained in a Group.
What differentiates this from other similar questions I'veviewed is that the ranges I need to determine cannot be taken out of the order they show in this table. We can do Item Range from A1 to B3, in other words, they can cross over ItmGroups, but they must remain in alphanumeric order by Item.
The expected result would be item ranges that evenly distribute the total count.
+------+-------+----------+
| FrItem | ToItem | TotCount|
+------+-------+----------+
| 1A | 2D | 134 |
| 3A | 3D | 130 |
(etc)
Provided you've happy with a rough estimate, this will split the data in to two groups.
The first group will always have as many records as possible, but no more than half of the total count (and group 2 will have the rest).
WITH
cumulative AS
(
SELECT
*,
SUM([Count]) OVER (ORDER BY Item) AS cumulativeCount,
SUM([Count]) OVER () AS totalCount
FROM
yourData
)
SELECT
MIN(item) AS frItem,
MAX(item) AS toItem,
SUM([Count]) AS TotCount
FROM
cumulative
GROUP BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
ORDER BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
To split the data in to 5 portions, it's similar...
GROUP BY
CASE WHEN cumulativeCount <= totalCount * 1/5 THEN 0
WHEN cumulativeCount <= totalCount * 2/5 THEN 1
WHEN cumulativeCount <= totalCount * 3/5 THEN 2
WHEN cumulativeCount <= totalCount * 4/5 THEN 3
ELSE 4 END
Depending on your data this isn't necessarily ideal
Item | Count GroupAsDefinedAbove IdealGroup
------+-------
1A | 4 1 1
2A | 5 2 1
3A | 8 2 2
If you want something that can get the two groups as close in size as possible, that's a lot more complex.
Same as the accepted answer, except declaring a batch number and an addition to the select statement in the WITH cumulativeCte to prevent a remainder.
DECLARE #BatchCount NUMERIC(4,2) = 5.00;
WITH
cumulativeCte AS
(
SELECT
*,
SUM(r.[Count]) OVER (ORDER BY Item) AS cumulativeCount,
SUM(r.[Count]) OVER () AS totalCount
,CEILING(SUM(r.[Count]) OVER (ORDER BY IM.MMITNO ASC) / (SUM(r.[Count]) OVER () / #BatchCount)) AS BatchNo
FROM
records r
)
SELECT
MIN(c.Item) AS frItem,
MAX(c.Item) AS toItem,
SUM(c.[Count]) AS TotCount,
c.BatchNo
FROM
cumulativeCte c
GROUP BY
c.BatchNo
ORDER BY
c.BatchNo

Horizontal Count SQL

I apologize if this is a duplicate question but I could not find my answer.
I am trying to take data that is horizontal, and get a count of how many times a specific number appears.
Example table
+-------+-------+-------+-------+
| Empid | KPI_A | KPI_B | KPI_C |
+-------+-------+-------+-------+
| 232 | 1 | 3 | 3 |
| 112 | 2 | 3 | 2 |
| 143 | 3 | 1 | 1 |
+-------+-------+-------+-------+
I need to see the following:
+-------+--------------+--------------+--------------+
| EmpID | (1's Scored) | (2's Scored) | (3's Scored) |
+-------+--------------+--------------+--------------+
| 232 | 1 | 0 | 2 |
| 112 | 0 | 2 | 1 |
| 143 | 2 | 0 | 1 |
+-------+--------------+--------------+--------------+
I hope that makes sense. Any help would be appreciated.
Since you are counting data across multiple columns, it might be easier to unpivot your KPI columns first, then count the scores.
You could use either the UNPIVOT function or CROSS APPLY to convert your KPI columns into multiple rows. The syntax would be similar to:
select EmpId, KPI, Val
from yourtable
cross apply
(
select 'A', KPI_A union all
select 'B', KPI_B union all
select 'C', KPI_C
) c (KPI, Val)
See SQL Fiddle with Demo. This gets your multiple columns into multiple rows, which is then easier to work with:
| EMPID | KPI | VAL |
|-------|-----|-----|
| 232 | A | 1 |
| 232 | B | 3 |
| 232 | C | 3 |
| 112 | A | 2 |
Now you can easily count the number of 1's, 2's, and 3's that you have using an aggregate function with a CASE expression:
select EmpId,
sum(case when val = 1 then 1 else 0 end) Score_1,
sum(case when val = 2 then 1 else 0 end) Score_2,
sum(case when val = 3 then 1 else 0 end) Score_3
from
(
select EmpId, KPI, Val
from yourtable
cross apply
(
select 'A', KPI_A union all
select 'B', KPI_B union all
select 'C', KPI_C
) c (KPI, Val)
) d
group by EmpId;
See SQL Fiddle with Demo. This gives a final result of:
| EMPID | SCORE_1 | SCORE_2 | SCORE_3 |
|-------|---------|---------|---------|
| 112 | 0 | 2 | 1 |
| 143 | 2 | 0 | 1 |
| 232 | 1 | 0 | 2 |

Select dynamic couples of lines in SQL (PostgreSQL)

My objective is to make dynamic group of lines (of product by TYPE & COLOR in fact)
I don't know if it's possible just with one select query.
But : I want to create group of lines (A PRODUCT is a TYPE and a COLOR) as per the number_per_group column and I want to do this grouping depending on the date order (Order By DATE)
A single product with a NB_PER_GROUP number 2 is exclude from the final result.
Table :
-----------------------------------------------
NUM | TYPE | COLOR | NB_PER_GROUP | DATE
-----------------------------------------------
0 | 1 | 1 | 2 | ...
1 | 1 | 1 | 2 |
2 | 1 | 2 | 2 |
3 | 1 | 2 | 2 |
4 | 1 | 1 | 2 |
5 | 1 | 1 | 2 |
6 | 4 | 1 | 3 |
7 | 1 | 1 | 2 |
8 | 4 | 1 | 3 |
9 | 4 | 1 | 3 |
10 | 5 | 1 | 2 |
Results :
------------------------
GROUP_NUMBER | NUM |
------------------------
0 | 0 |
0 | 1 |
~~~~~~~~~~~~~~~~~~~~~~~~
1 | 2 |
1 | 3 |
~~~~~~~~~~~~~~~~~~~~~~~~
2 | 4 |
2 | 5 |
~~~~~~~~~~~~~~~~~~~~~~~~
3 | 6 |
3 | 8 |
3 | 9 |
If you have another way to solve this problem, I will accept it.
What about something like this?
select max(gn.group_number) group_number, ip.num
from products ip
join (
select date, type, color, row_number() over (order by date) - 1 group_number
from (
select op.num, op.type, op.color, op.nb_per_group, op.date, (row_number() over (partition by op.type, op.color order by op.date) - 1) % nb_per_group group_order
from products op
) sq
where sq.group_order = 0
) gn
on ip.type = gn.type
and ip.color = gn.color
and ip.date >= gn.date
group by ip.num
order by group_number, ip.num
This may only work if your nb_per_group values are the same for each combination of type and color. It may also require unique dates, but that could probably be worked around if required.
The innermost subquery partitions the rows by type and color, orders them by date, then calculates the row numbers modulo nb_per_group; this forms a 0-based count for the group that resets to 0 each time nb_per_group is exceeded.
The next-level subquery finds all of the 0 values we mapped in the lower subquery and assigns group numbers to them.
Finally, the outermost query ties each row in the products table to a group number, calculated as the highest group number that split off before this product's date.

Create a sub query for sum data as a new column in SQL Server

Suppose that I have a table name as tblTemp which has data as below:
| ID | AMOUNT |
----------------
| 1 | 10 |
| 1-1 | 20 |
| 1-2 | 30 |
| 1-3 | 40 |
| 2 | 50 |
| 3 | 60 |
| 4 | 70 |
| 4-1 | 80 |
| 5 | 90 |
| 6 | 100 |
ID will be format as X (without dash) if it's only one ID or (X-Y) format if new ID (Y) is child of (X).
I want to add a new column (Total Amount) to output as below:
| ID | AMOUNT | Total Amount |
---------------------------------
| 1 | 10 | 100 |
| 1-1 | 20 | 100 |
| 1-2 | 30 | 100 |
| 1-3 | 40 | 100 |
| 2 | 50 | 50 |
| 3 | 60 | 60 |
| 4 | 70 | 150 |
| 4-1 | 80 | 150 |
| 5 | 90 | 90 |
| 6 | 100 | 100 |
The "Total Amount" column is the calculate column which sum value in Amount column that the (X) in ID column is the same.
In order to get parent ID (X), I use the following SQL:
SELECT
ID, SUBSTRING (ID, 1,
IIF (CHARINDEX('-', ID) = 0,
len(ID),
CHARINDEX('-', ID) - 1)
), Amount
FROM
tblTemp
How Can I query like this in SQL Server 2012?
You can use sqlfiddle here to test it.
Thank You
Pengan
You have already done most of the work. To get the final result you can use your existing query and make it a subquery or use a CTE, then use sum() over() to get the result:
;with cte as
(
SELECT
ID,
SUBSTRING (ID, 1,
IIF (CHARINDEX('-', ID) = 0,
len(ID),
CHARINDEX('-', ID) - 1)
) id_val, Amount
FROM tblTemp
)
select id, amount, sum(amount) over(partition by id_val) total
from cte
See SQL Fiddle with Demo
You can do this using the sum() window function:
select id, amount,
SUM(amount) over (partition by (case when id like '%-%'
then left (id, charindex('-', id) - 1)
else id
end)
) as TotalAmount
from tblTemp t