SQL sum rows and select one unique id - sql

I am looking for a way to sum multiple rows and use the unique ID of one of those rows as the ID of the summed row, if that makes sense.
For example if I have a table like this
ID | Value1 Value2 Text1 Text2
---------|-------------------------------------------
1 | 100 150 Bananas Hawaii
2 | 200 100 Bananas Hawaii
3 | 300 200 Bananas Hawaii
---------|--------------------------------------------
1,2 or 3 | 600 450 Bananas Hawaii
To get the result row I would do something like this
SELECT
sum(Value1) as Value1
,sum(Value2) as Value2
FROM
db..table
GROUP BY
Text1
,Text2
However, I need to retrieve just one of the IDs to put on my result row; I don't care whether it is 1, 2 or 3.
The reason is that I have a massive database and a big program that retrieves data, but due to some new programming I can now suddenly have several copies of the same row with different unique IDs. Hence I am interested in summing the rows and keeping just one of the unique IDs.
Assigning a new Unique ID to the result row will not help me because of the way everything is designed right now.
I am using Microsoft SQL Server 2005.

You could just use another aggregate like MIN or MAX
SELECT min(ID) as ID
,sum(Value1) as Value1
,sum(Value2) as Value2
FROM
db..table
GROUP BY
Text1
,Text2

Please try:
SELECT
min(ID) as ID,
sum(Value1) as Value1,
sum(Value2) as Value2
FROM
db..table
GROUP BY
Text1
,Text2
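
As a quick check of the min(ID) approach, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module rather than SQL Server 2005, and the table name fruit is made up for the example; the aggregate query itself is standard SQL.

```python
import sqlite3

# In-memory database holding the three sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fruit (ID INTEGER, Value1 INTEGER, Value2 INTEGER, Text1 TEXT, Text2 TEXT)"
)
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?, ?, ?, ?)",
    [
        (1, 100, 150, "Bananas", "Hawaii"),
        (2, 200, 100, "Bananas", "Hawaii"),
        (3, 300, 200, "Bananas", "Hawaii"),
    ],
)

# min(ID) picks one existing ID per group; max(ID) would work equally well.
rows = conn.execute(
    """
    SELECT min(ID) AS ID, sum(Value1) AS Value1, sum(Value2) AS Value2, Text1, Text2
    FROM fruit
    GROUP BY Text1, Text2
    """
).fetchall()
print(rows)  # [(1, 600, 450, 'Bananas', 'Hawaii')]
```

Because the ID comes from an aggregate, it is always an ID that already exists in the group, which is what the question asks for.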

Related

SQL - Need to find duplicates where one column can have multiple values

I am pretty sure this SQL requires using GROUP BY and HAVING, but not sure how to write it.
I have a table similar to this:
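ID | Cust# | Order# | ItemCode | DataPoint1 | DataPoint2
---|-------|--------|----------|------------|-----------
1 | 001 | 123 | I | xxxyyyxxx | 123456
2 | 001 | 123 | Insert | xxxyyyxxx | 123456
3 | 001 | 123 | Delete | asdf | 9999
4 | 001 | 123 | D | asdf | 9999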
In this table Rows 1 & 2 are effectively duplicates, as are rows 3 & 4.
This is determined by the ItemCode having the value of 'I' or 'Insert' in rows 1 & 2. And 'D' or 'Delete' in rows 3 & 4.
How could I write a SQL select statement to return rows 2 and 4? I am interested in pulling out the duplicated rows with the higher ID value.
Thanks for any help.
Replace the "offending" column with a consistent value. Then, you can use row_number() or a similar mechanism:
select t.*
from (select t.*,
row_number() over (partition by Cust#, Order#, left(ItemCode, 1), DataPoint1, DataPoint2
order by id asc
) as seqnum
from t
) t
where seqnum > 1;
Note: Not all databases support left(), but all support the functionality somehow. This does assume that the first character of the ItemCode is sufficient to identify identical rows, regardless of the value.
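
A minimal sketch of this idea on the sample data, using Python's sqlite3 module (window functions need SQLite 3.25 or later). The column names cust, order_no, item_code, data_point1 and data_point2 are stand-ins for the question's Cust#, Order#, and so on, and substr(item_code, 1, 1) plays the role of left(), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE t (id INTEGER, cust TEXT, order_no TEXT,
                    item_code TEXT, data_point1 TEXT, data_point2 TEXT)
    """
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "001", "123", "I", "xxxyyyxxx", "123456"),
        (2, "001", "123", "Insert", "xxxyyyxxx", "123456"),
        (3, "001", "123", "Delete", "asdf", "9999"),
        (4, "001", "123", "D", "asdf", "9999"),
    ],
)

# Rows that share everything except the spelling of item_code fall into the
# same partition; numbering by id means every row after the first is a
# duplicate with a higher id.
rows = conn.execute(
    """
    SELECT id, item_code
    FROM (SELECT t.*,
                 row_number() OVER (
                     PARTITION BY cust, order_no, substr(item_code, 1, 1),
                                  data_point1, data_point2
                     ORDER BY id
                 ) AS seqnum
          FROM t) AS numbered
    WHERE seqnum > 1
    ORDER BY id
    """
).fetchall()
print(rows)  # [(2, 'Insert'), (4, 'D')]
```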

SQL compares the value of 2 columns and select the column with max value row-by-row

I have table something like:
GROUP | NAME | Value_1 | Value_2
------|------|---------|--------
1 | ABC | 0 | 0
1 | DEF | 4 | 4
50 | XYZ | 6 | 6
50 | QWE | 6 | 7
100 | XYZ | 26 | 2
100 | QWE | 26 | 2
What I would like to do is group by GROUP and select the name with the highest Value_1. If the Value_1 values are the same, compare Value_2 and select the row with the higher one. If they're still the same, select the first one.
The output will be something like:
GROUP | NAME | Value_1 | Value_2
------|------|---------|--------
1 | DEF | 4 | 4
50 | QWE | 6 | 7
100 | XYZ | 26 | 2
The challenge for me here is that I don't know how many categories there are in NAME, so a simple CASE WHEN does not work. Thanks for any help.
You can use window functions to solve the bulk of your problem:
select t.*
from (select t.*,
row_number() over (partition by "GROUP" order by value_1 desc, value_2 desc) as seqnum
from t
) t
where seqnum = 1;
The one caveat is the condition:
If they're still the same, select the first one.
SQL tables represent unordered (multi-) sets. There is no "first" one unless a column specifies the ordering. The best you can do is choose an arbitrary value when all the other values are the same.
That said, you might have another column that has an ordering. If so, add that as a third key to the order by.
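
To illustrate the third ordering key, here is a minimal sketch in Python's sqlite3 module (SQLite 3.25 or later for window functions). The seq column is hypothetical: it stands in for whatever column records which row counts as "first". The table name scores and the column name grp (used to avoid the reserved word GROUP) are also assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (seq INTEGER, grp INTEGER, name TEXT, value_1 INTEGER, value_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "ABC", 0, 0),
        (2, 1, "DEF", 4, 4),
        (3, 50, "XYZ", 6, 6),
        (4, 50, "QWE", 6, 7),
        (5, 100, "XYZ", 26, 2),
        (6, 100, "QWE", 26, 2),
    ],
)

# Highest value_1 wins, then highest value_2; seq breaks any remaining tie
# so the result is deterministic instead of arbitrary.
rows = conn.execute(
    """
    SELECT grp, name, value_1, value_2
    FROM (SELECT s.*,
                 row_number() OVER (
                     PARTITION BY grp
                     ORDER BY value_1 DESC, value_2 DESC, seq
                 ) AS seqnum
          FROM scores AS s) AS ranked
    WHERE seqnum = 1
    ORDER BY grp
    """
).fetchall()
print(rows)  # [(1, 'DEF', 4, 4), (50, 'QWE', 6, 7), (100, 'XYZ', 26, 2)]
```

Without seq in the ORDER BY, group 100 could return either XYZ or QWE, since both rows tie on value_1 and value_2.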

Creating an ID to go along with my data when querying SQL Server

I have a little problem when querying data. I have a table that looks like this:
Value1 Value2
--------------------
ABC 123
BCD 123
DCE 123
EFG 123
What I'm hoping to do is, basically only select VALUE1 from this table, however, I'd like to assign an ID to go along with each value...
Desired end result
ID Value1
--------------
1 ABC
2 BCD
3 DCE
4 EFG
Is something like that possible? I'd like the end result to be in alphabetical order, with IDs assigned in ascending order. I'd also like the query to keep working when more VALUE1 values are added to the table (dynamic).
Any ideas?
You can use window function row_number while selecting:
select row_number() over (
order by value1
) as id,
value1
from your_table;
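
A small sketch of how this behaves when rows are added later, using Python's sqlite3 module (SQLite 3.25 or later). The extra value 'AAA' is invented for the demonstration. The point it shows is that the numbers are computed at query time, so they follow the current alphabetical order rather than staying attached to a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (value1 TEXT, value2 INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("DCE", 123), ("ABC", 123), ("EFG", 123), ("BCD", 123)],
)

QUERY = """
    SELECT row_number() OVER (ORDER BY value1) AS id, value1
    FROM your_table
    ORDER BY value1
"""

before = conn.execute(QUERY).fetchall()
print(before)  # [(1, 'ABC'), (2, 'BCD'), (3, 'DCE'), (4, 'EFG')]

# A new value that sorts first shifts every existing id up by one.
conn.execute("INSERT INTO your_table VALUES ('AAA', 123)")
after = conn.execute(QUERY).fetchall()
print(after)  # [(1, 'AAA'), (2, 'ABC'), (3, 'BCD'), (4, 'DCE'), (5, 'EFG')]
```

If the IDs need to stay fixed once assigned, they would have to be stored in the table instead of generated in the select.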

Cumulative count of duplicates

For a table looking like
ID | Value
-------------
1 | 2
2 | 10
3 | 3
4 | 2
5 | 0
6 | 3
7 | 3
I would like to calculate the number of IDs with a higher Value, for each Value that appears in the table, i.e.
Value | Position
----------------
10 | 0
3 | 1
2 | 4
0 | 6
This equates to the offset of the Value in an ORDER BY Value DESC ordering.
I have considered doing this by calculating the number of duplicates with something like
SELECT Value, count(*) AS ct FROM table GROUP BY Value;
And then cumulating the result, but I guess that is not the optimal way to do it (nor have I managed to combine the commands accordingly)
How would one go about calculating this efficiently (for several dozens of thousands of rows)?
This seems like a perfect opportunity for the window function rank() (not the related dense_rank()):
SELECT DISTINCT ON (value)
value, rank() OVER (ORDER BY value DESC) - 1 AS position
FROM tbl
ORDER BY value DESC;
rank() starts with 1, while your count starts with 0, so subtract 1.
A DISTINCT step (DISTINCT ON, which is PostgreSQL-specific, is slightly cheaper here) removes the duplicate rows after the ranks have been computed, because DISTINCT is applied after window functions. Details in this related answer:
Best way to get result count before LIMIT was applied
Result exactly as requested.
An index on value will help performance.
You might also try this if you're not comfortable with window functions:
SELECT t1.value, COUNT(DISTINCT t2.id) AS position
FROM tbl t1 LEFT OUTER JOIN tbl t2
ON t1.value < t2.value
GROUP BY t1.value
Note the self-join.
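
As a sanity check, the two approaches can be run side by side on the sample data. This is a sketch using Python's sqlite3 module (SQLite 3.25 or later); since SQLite has no DISTINCT ON, the rank() variant uses plain SELECT DISTINCT, which is an adaptation rather than the exact query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, 2), (2, 10), (3, 3), (4, 2), (5, 0), (6, 3), (7, 3)],
)

# Window-function variant: rank() leaves gaps after ties, so rank - 1 equals
# the number of rows with a strictly higher value.
rank_rows = conn.execute(
    """
    SELECT DISTINCT value, rank() OVER (ORDER BY value DESC) - 1 AS position
    FROM tbl
    ORDER BY value DESC
    """
).fetchall()

# Self-join variant: count the rows whose value is strictly higher.
join_rows = conn.execute(
    """
    SELECT t1.value, COUNT(DISTINCT t2.id) AS position
    FROM tbl AS t1
    LEFT OUTER JOIN tbl AS t2 ON t1.value < t2.value
    GROUP BY t1.value
    ORDER BY t1.value DESC
    """
).fetchall()

print(rank_rows)  # [(10, 0), (3, 1), (2, 4), (0, 6)]
print(join_rows)  # [(10, 0), (3, 1), (2, 4), (0, 6)]
```

Both return the requested result here. The self-join compares every row with every other row, so it is likely to scale worse than the window function on tens of thousands of rows.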

Efficient ways to count the number of times two items are ordered together

I am currently stuck on a problem where I have to write a SQL query to count the number of times a pair of items is ordered together.
The table that I have at my disposal is something like:
ORDER_ID | PRODUCT_ID | QUANTITY
1 1 10
1 2 20
1 3 10
2 1 10
2 2 20
3 3 50
4 2 10
I am looking to write a SQL query that can, for every unique pair of items, count the number of times they were ordered together and tell me the quantities when they were in the same order.
The resulting table should look like:
PRODUCT_ID_1 | PRODUCT_ID_2 | NUM_JOINT_ORDERS | SUM_QUANTITY_1 | SUM_QUANTITY_2
1 2 2 20 40
1 3 1 10 10
2 3 1 20 10
Some things to exploit are that:
Some orders only contain 1 item and so are not relevant in counting the pairwise relationship (not sure how to exclude these but maybe it makes sense to filter them first)
We only need to list the pairwise relationship once in the final table (so maybe a WHERE PRODUCT_ID_1 < PRODUCT_ID_2)
There is a similar post here, though I have reposted the question because
I really want to know the fastest way to do this since my original table is huge and my computational resources are limited, and
in this case I only have a single table and no table that lists the number.
You may use the following approach, which gives you the result shown above.
select
PRODUCT1, PRODUCT2,
count(*) as NUM_JOINT_ORDERS,
sum(QUANTITY1) as SUM_QUANTITY_1,
sum(QUANTITY2) as SUM_QUANTITY_2
from (
select
T1.PRODUCT_ID AS PRODUCT1,
T2.PRODUCT_ID AS PRODUCT2,
T1.QUANTITY AS QUANTITY1,
T2.QUANTITY AS QUANTITY2
from TABLE as T1, TABLE as T2
where T1.ORDER_ID=T2.ORDER_ID
and T1.PRODUCT_ID<T2.PRODUCT_ID
) as PAIRS
group by PRODUCT1, PRODUCT2
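
Here is a minimal sketch of the self-join run against the sample data with Python's sqlite3 module. The table name order_items is made up for the example (TABLE itself is a reserved word), and the join is written with explicit JOIN ... ON syntax, which is equivalent to the comma join with a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (ORDER_ID INTEGER, PRODUCT_ID INTEGER, QUANTITY INTEGER)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 10), (2, 1, 10), (2, 2, 20), (3, 3, 50), (4, 2, 10)],
)

# Pair each item with every higher-numbered item in the same order.
# Single-item orders produce no pairs, so they drop out without a filter.
rows = conn.execute(
    """
    SELECT T1.PRODUCT_ID    AS PRODUCT_ID_1,
           T2.PRODUCT_ID    AS PRODUCT_ID_2,
           count(*)         AS NUM_JOINT_ORDERS,
           sum(T1.QUANTITY) AS SUM_QUANTITY_1,
           sum(T2.QUANTITY) AS SUM_QUANTITY_2
    FROM order_items AS T1
    JOIN order_items AS T2
      ON T1.ORDER_ID = T2.ORDER_ID
     AND T1.PRODUCT_ID < T2.PRODUCT_ID
    GROUP BY T1.PRODUCT_ID, T2.PRODUCT_ID
    ORDER BY PRODUCT_ID_1, PRODUCT_ID_2
    """
).fetchall()
print(rows)  # [(1, 2, 2, 20, 40), (1, 3, 1, 10, 10), (2, 3, 1, 20, 10)]
```

On a large table, an index on (ORDER_ID, PRODUCT_ID) would probably help the join, though how much depends on the database and the data.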