SQL query to group based on sum - sql

I have a simple table with values that I want to chunk/partition into distinct groups based on the sum of those values (up to a certain limit group sum total).
e.g.,. imagine a table like the following:
Key Value
-----------
A 1
B 4
C 2
D 2
E 5
F 1
And I would like to group into sets such that no one grouping's sum will exceed some given value (say, 5).
The result would be something like:
Group Key Value
-------------------
1 A 1
B 4
--------
Total: 5
2 C 2
D 2
--------
Total: 4
3 E 5
--------
Total: 5
4 F 1
--------
Total: 1
Is such a query possible?

While I am inclined to agree with the comments that this is best done outside of SQL, here is some SQL which would seem to do roughly what you're asking:
with mytable AS (
select 'A' AS [Key], 1 AS [Value] UNION ALL
select 'B', 4 UNION ALL
select 'C', 2 UNION ALL
select 'D', 2 UNION ALL
select 'E', 5 UNION ALL
select 'F', 1
)
, Sums AS (
select T1.[Key] AS T1K
, T2.[Key] AS T2K
, (SELECT SUM([Value])
FROM mytable T3
WHERE T3.[Key] <= T2.[Key]
AND T3.[Key] >= T1.[Key]) AS TheSum
from mytable T1
inner join mytable T2
on T2.[Key] >= T1.[Key]
)
select S1.T1K AS StartKey
, S1.T2K AS EndKey
, S1.TheSum
from Sums S1
left join Sums S2
on (S1.T1K >= S2.T1K and S1.T2K <= S2.T2K)
and S2.TheSum > S1.TheSum
and S2.TheSum <= 5
where S1.TheSum <= 5
AND S2.T1K IS NULL
When I ran this code on SQL Server 2008 I got the following results:
StartKey EndKey Sum
A B 5
C D 4
E E 5
F F 1
It should be straightforward to construct the required groups from these results.

If you want to have only two members or less in each set, you can use the following query:
Select
A.[Key] as K1 ,
B.[Key] as K2 ,
isnull(A.value,0) as V1 ,
isnull(B.value,0) as V2 ,
(A.value+B.value)as Total
from Table_1 as A left join Table_1 as B
on A.value+B.value<=5 and A.[Key]<>B.[Key]
For finding sets having more members, you can continue to use joins.

Related

Reorder the rows of a table according to the numbers of similar cells in a specific column using SQL

I have a table like this:
D
S
2
1
2
3
4
2
4
3
4
5
6
1
in which the code of symptoms(S) of three diseases(D) are shown. I want to rearrange this table (D-S) such that the diseases with more symptoms come up i.e. order it by decreasing the numbers of symptoms as below:
D
S
4
2
4
3
4
5
2
1
2
3
6
1
Can anyone help me to write a SQL code for it in SQL server?
I had tried to do this as the following but this doesn't work:
SELECT *
FROM (
select D, Count(S) cnt
from [D-S]
group by D
) Q
order by Q.cnt desc
select
D,
S
from
D-S
order by
count(*) over(partition by D) desc,
D,
S;
Two easy ways to approach this:
--==== Sample Data
DECLARE #t TABLE (D INT, S INT);
INSERT #t VALUES(2,1),(2,3),(4,2),(4,3),(4,5),(6,1);
--==== Using Window Function
SELECT t.D, t.S
FROM (SELECT t.*, Rnk = COUNT(*) OVER (PARTITION BY t.D) FROM #t AS t) AS t
ORDER BY t.Rnk DESC;
--==== Using standard GROUP BY
SELECT t.*
FROM #t AS t
JOIN
(
SELECT t2.D, Cnt = COUNT(*)
FROM #t AS t2
GROUP BY t2.D
) AS t2 ON t.D = t2.D
ORDER BY t2.Cnt DESC;
Results:
D S
----------- -----------
4 2
4 3
4 5
2 1
2 3
6 1

SQL query to identify rows not contained contained in another table across multiple subsets

I have two tables as seen below.
Table 1:
Day Group
---------
1 A
1 B
1 C
2 B
2 C
2 D
3 C
3 D
3 E
Table 2:
Group
-------
A
B
C
D
E
I would like to create a SQL query that identifies each Group that exists in Table 2 but does not exist in Table 1 partitioned by Day.
The desired result would look like this:
Day Group
---------
1 D
1 E
2 A
2 E
3 A
3 B
Use a cross join to generate all combinations and then weed out what you have:
select d.day, t1.group
from (select distinct day from table1) d cross join
table2 g left join
table1 t1
on t1.day = d.day and t1.group = g.group
where t1.day is null;
SELECT
*
FROM
(
SELECT DISTINCT
Day
FROM
Table1
) AS Days
,
Table2
WHERE
NOT EXISTS (
SELECT
*
FROM
Table1
WHERE
Table1.day=Days.Day AND
Table1.Group=Table2.Group
)

SQL get the closest two rows within duplicate rows

I have following table
ID Name Stage
1 A 1
1 B 2
1 C 3
1 A 4
1 N 5
1 B 6
1 J 7
1 C 8
1 D 9
1 E 10
I need output as below with parameters A and N need to select closest rows where difference between stage is smallest
ID Name Stage
1 A 4
1 N 5
I need to select rows where difference between stage is smallest
This query can make use of an index on (name, stage) efficiently:
WITH cte AS (
SELECT TOP 1
a.id AS a_id, a.name AS a_name, a.stage AS a_stage
, n.id AS n_id, n.name AS n_name, n.stage AS n_stage
FROM tbl a
CROSS APPLY (
SELECT TOP 1 *, stage - a.stage AS diff
FROM tbl
WHERE name = 'N'
AND stage >= a.stage
ORDER BY stage
UNION ALL
SELECT TOP 1 *, a.stage - stage AS diff
FROM tbl
WHERE name = 'N'
AND stage < a.stage
ORDER BY stage DESC
) n
WHERE a.name = 'A'
ORDER BY diff
)
SELECT a_id AS id, a_name AS name, a_stage AS stage FROM cte
UNION ALL
SELECT n_id, n_name, n_stage FROM cte;
SQL Server uses CROSS APPLY in place of standard-SQL LATERAL.
In case of ties (equal difference) the winner is arbitrary, unless you add more ORDER BY expressions as tiebreaker.
dbfiddle here
This solution works, if u know the minimum difference is always 1
SELECT *
FROM myTable as a
CROSS JOIN myTable as b
where a.stage-b.stage=1;
a.ID a.Name a.Stage b.ID b.Name b.Stage
1 A 4 1 N 5
Or simpler if u don't know the minimum
SELECT *
FROM myTable as a
CROSS JOIN myTable as b
where a.stage-b.stage in (SELECT min (a.stage-b.stage)
FROM myTable as a
CROSS JOIN myTable as b)

SELECT all rows where sum of count for this id is not 0

I'm querying an access db from excel. I have a table similar to this one:
id Product Count
1 A 0
1 B 5
3 C 0
2 A 0
2 B 0
2 C 5
3 A 6
3 B 5
3 C 7
From which I'd like to return all the rows (including the ones where count for that product is 0) where the sum of the count for this ID is not 0 and the product is either A or B. So from the above table, I would get:
id Product Count
1 A 0
1 B 5
3 A 6
3 B 5
The following query gives the right output, but is quite slow (takes almost a minute when querying from a somewhat small 7k row db), so I was wondering if there is a more efficient way of doing it.
SELECT *
FROM [BD$] BD
WHERE (BD.Product='A' or BD.Product='B')
AND BD.ID IN (
SELECT BD.ID
FROM [BD$] BD
WHERE (Product='A' or Product='B')
GROUP BY BD.ID
HAVING SUM(BD.Count)<>0)
Use your GROUP BY approach in a subquery and INNER JOIN that back to the [BD$] table.
SELECT BD2.*
FROM
(
SELECT BD1.ID
FROM [BD$] AS BD1
WHERE BD1.Product IN ('A','B')
GROUP BY BD1.ID
HAVING SUM(BD1.Count) > 0
) AS sub
INNER JOIN [BD$] AS BD2
ON sub.ID = BD2.ID;
IN() statement can perform badly a lot of times, you can try EXISTS() :
SELECT * FROM [BD$] BD
WHERE BD.Product in('A','B')
AND EXISTS(SELECT 1 FROM [BD$] BD2
WHERE BD.id = BD2.id
AND BD2.Product in('A','B')
AND BD2.Count > 0)
If you are looking for the records where the sum of the count for the id is non-zero, then at least one non-unique id must have a count that is non-zero.
SELECT *
FROM [BD$] BD
WHERE BD.Product IN ('A', 'B')
AND BD.ID IN (
SELECT DISTINCT b.ID
FROM [BD$] b
WHERE b.Product IN ('A', 'B')
AND b.Count<>0
)

access query to filter and combine count

i have two access tables
tableA
num count
1 7
2 8
3 9
4 9
5 13
6 6
tableB
num count
0 1
1 14
2 12
3 5
4 5
5 11
6 5
how can i create an access query that will ignore the numbers which have count less than 6 in any of the two tables. i.e. 0,3,4 & 6 and create a table with the rest of the numbers sorted by combined count
tableC
num count
5 24
1 21
2 20
any help appreciated
Maybe....
SELECT a.num, a.count + b.count
FROM tableA a
JOIN tableB b on b.num = a.num
WHERE a.count >= 6
AND b.count >= 6
this will include numbers which are in both A and B. To include numbers with count >= 6 that are in one table and not the other you'll have to add a Join and a "isnull" for the a.count and b.count values. ie; isnull(a.count,0) + isnull(b.count,0)
You can try something like this
SELECT DISTINCT tableA.num, [tableA].[val]+[tableB].[val] AS Expr1
FROM tableA INNER JOIN tableB ON tableA.num = tableB.num
WHERE (((tableA.val)>=6) AND ((tableB.val)>=6));
How about
SELECT x.Num, x.Count FROM (
SELECT Num, Count(*)
FROM tableA
GROUP BY Num
HAVING Count(*)>6
UNION ALL
SELECT Num, Count(*)
FROM tableB
GROUP BY Num
HAVING Count(*)>6) x
Or if count is a field, rather than a calculation:
SELECT x.Num, x.Count FROM (
SELECT Num, Count
FROM tableA
WHERE Count>6
UNION ALL
SELECT Num, Count
FROM tableB
WHERE Count>6) x