Assign new value to every unique number in SQL Server - sql

I am new to SQL Server and am trying to do some operations.
Sample data:
Amount | BillID
-------+-------
500 | 10009
500 | 1492
350 | 15892
222 | 15596
899 | 20566
350 | 9566
How can I create a new column that holds a serial number according to the Amount column so the output looks like:
Amount | BillID | unique
-------+--------+-------
500 | 10009 | 1
500 | 1492 | 1
350 | 15892 | 2
222 | 15596 | 3
899 | 20566 | 4
350 | 9566 | 2

I would recommend dense_rank():
select t.*, dense_rank() over(order by amount) rn
from mytable t
This assigns a unique, incremental number to each distinct amount. The smallest amount gets rank 1, and the numbers are assigned incrementally by increasing amount. This is not exactly the output you showed (where there is no apparent logic to the order of the ranks), but I think that's the logic you want in essence.
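A minimal sketch of how dense_rank() numbers the sample data. To stay self-contained it runs the query through Python's built-in sqlite3 module instead of SQL Server (an assumption made only for illustration; SQLite 3.25 or later is needed for window functions), but the dense_rank() expression is the same as in the answer:

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (Amount INTEGER, BillID INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(500, 10009), (500, 1492), (350, 15892),
     (222, 15596), (899, 20566), (350, 9566)],
)

# Same window expression as in the answer: equal amounts share one rank,
# and ranks increase without gaps as the amount increases.
rows = conn.execute("""
    SELECT Amount, BillID, DENSE_RANK() OVER (ORDER BY Amount) AS rn
    FROM mytable
""").fetchall()

# Map each BillID to its rank for easy inspection.
ranks = {bill_id: rn for _amount, bill_id, rn in rows}
print(ranks)
```

Both 500 rows get the same number, as do both 350 rows; the numbering follows ascending Amount rather than the order in which the rows appear.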

Related

How to SUM() OVER() by pentaho?

My data looks like this:
| ID | Values |
|:---:|:------:|
| 1 | 200 |
| 2 | 300 |
| 3 | 650 |
| 4 | 120 |
| 5 | 830 |
I want the equivalent of this T-SQL: SUM(Values) OVER(ORDER BY ID) AS Sum
| ID | Values | Sum |
|:---:|:------:|:----:|
| 1 | 200 | 200 |
| 2 | 300 | 500 |
| 3 | 650 | 1150 |
| 4 | 120 | 1270 |
| 5 | 830 | 2100 |
How can I do this in Pentaho?
You use the "Group by" step with the Cumulative sum option, leaving the Group field section empty so it performs the sum over all the rows.
You'll have to feed it the data ordered by ID, using a Sort step. In my screenshot I haven't included the Sort step because I filled the Data grid with data that was already ordered, but in your case you may need to make sure the data is sorted first.
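The logic of those two steps (sort by ID, then accumulate) can be sketched outside Pentaho. This is plain Python, not Pentaho configuration, and the variable names are made up for illustration; it only shows why the rows must be ordered before the cumulative sum is taken:

```python
from itertools import accumulate

# Sample rows as (ID, Values), deliberately out of order.
rows = [(3, 650), (1, 200), (5, 830), (2, 300), (4, 120)]

# Equivalent of the Sort step: order the rows by ID.
ordered = sorted(rows, key=lambda row: row[0])

# Equivalent of the cumulative sum with no grouping fields:
# a running total over all rows in their sorted order.
running = list(accumulate(value for _id, value in ordered))

result = [(row_id, value, total)
          for (row_id, value), total in zip(ordered, running)]
for line in result:
    print(line)
```

If the sort is skipped, the final total is unchanged but the intermediate sums no longer match SUM(Values) OVER(ORDER BY ID).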

Get distinct values and sum their respective quantities

I have a problem.
I have a query result with order numbers, item numbers, and a quantity for each item.
I want to get the distinct item numbers and sum the quantities for each item number.
Here is an example table (Query output):
| OrderNo | ItemNo | Qty |
--------------------------------
| XY123 | 3000 | 4 |
| XY123 | 2000 | 2 |
| ZZ999 | 3000 | 6 |
| ZZ999 | 1000 | 3 |
| PP333 | 1000 | 5 |
The distinct values for all sold items with their item numbers would be:
1000 -> Count/Sum the Qty
2000 -> Count/Sum the Qty
3000 -> Count/Sum the Qty
Result:
| ItemNo | QtyTotal |
-------------------------
| 1000 | 8 |
| 2000 | 2 |
| 3000 | 10 |
My problem is that when I apply DISTINCT to ItemNo, I don't know how to SUM the corresponding quantities. I need some advice, please.
You can use group by:
select ItemNo, sum(Qty) as QtyTotal
from QueryOutput q
group by ItemNo;
You can replace QueryOutput with a query that produces your example table.
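A small sketch of that substitution, run through Python's built-in sqlite3 module so it is self-contained. The base table name order_lines is hypothetical; it stands in for whatever query produces the example table, and is used here as a derived table in place of QueryOutput:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table standing in for the source of the query output.
conn.execute("CREATE TABLE order_lines (OrderNo TEXT, ItemNo INTEGER, Qty INTEGER)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [("XY123", 3000, 4), ("XY123", 2000, 2), ("ZZ999", 3000, 6),
     ("ZZ999", 1000, 3), ("PP333", 1000, 5)],
)

# The inner SELECT plays the role of QueryOutput; GROUP BY yields one row
# per distinct ItemNo, and SUM adds up that item's quantities.
totals = conn.execute("""
    SELECT ItemNo, SUM(Qty) AS QtyTotal
    FROM (SELECT OrderNo, ItemNo, Qty FROM order_lines) AS q
    GROUP BY ItemNo
    ORDER BY ItemNo
""").fetchall()
print(totals)
```

GROUP BY already returns each ItemNo once, so no separate DISTINCT is needed.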

Clean Data Using SQL - Take Column Difference

I have data in SQL as follows:
Actual Table
+-------------+--------+------+
| Id | Weight | Type |
+-------------+--------+------+
| 00011223344 | 35 | A |
| 00011223344 | 10 | A |
| 12311223344 | 100 | B |
| 00034343434 | 25 | A |
| 00034343434 | 25 | A |
| 99934343434 | 200 | C |
| 88855667788 | 100 | D |
+-------------+--------+------+
Column ID always has a length of 11 and has data type varchar. I need to create the columns Actual Weight and Actual ID from the table above.
Actual Id depends on column ID. If the ID starts with 000, then we need to find an ID in column ID that does not start with 000 but whose remaining characters (i.e. the 8 characters from the right) are the same. The matched ID is the Actual Id. For example, of the first 3 IDs, the first 2 start with 000, and an ID that does not start with 000 but has the same 8 characters from the right is found in the 3rd row, i.e. 12311223344. Therefore, in the derived column Actual ID, the first 2 rows have Actual Id 12311223344.
Actual Weight depends on the values in 2 columns, ID and Weight. We need to group column Id by the criteria above. If an Id that does not start with 000 has other entries that do start with 000, then we need to recalculate the Weight for the Id that does not start with 000 by summing the Weights of the entries starting with 000 and subtracting that sum from the Weight of the one that does not start with 000.
For example, in the first 3 rows, the 3rd row has an Id starting with 123, and rows 1 and 2 have the same 8 digits from the right but start with 000 instead of 123. For the rows starting with 000, Actual Weight is the same as Weight, but for the one starting with 123, Actual Weight is 100-(35+10).
I am looking for a query that can create these 2 derived columns without needing to create any other table/view.
Desired Output
+-------------+-------------+--------+---------------+------+
| Id | Actual ID | Weight | Actual Weight | Type |
+-------------+-------------+--------+---------------+------+
| 00011223344 | 12311223344 | 35 | 35 | A |
| 00011223344 | 12311223344 | 10 | 10 | A |
| 12311223344 | 12311223344 | 100 | 55 | B |
| 00034343434 | 99934343434 | 25 | 25 | A |
| 00034343434 | 99934343434 | 25 | 25 | A |
| 99934343434 | 99934343434 | 200 | 150 | C |
| 88855667788 | 88855667788 | 100 | 100 | D |
+-------------+-------------+--------+---------------+------+
Hmmmm . . . If I'm following this:
select t.*,
       (case when id like '000%' then weight
             else weight - sum(case when id like '000%' then weight else 0 end) over (partition by actual_id)
        end) as actual_weight
from (select t.*,
             max(id) over (partition by stuff(id, 1, 3, '')) as actual_id
      from t
     ) t;
Here is a db<>fiddle.
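As a rough check of the query against the sample data, here is a sketch that runs an adapted version through Python's built-in sqlite3 module. Two things are assumptions for illustration: SQLite is used instead of SQL Server, and because SQLite has no stuff() function, substr(id, 4) is swapped in to drop the first three characters. Note that max(id) only picks the right Actual ID because '000…' sorts below any other prefix and each suffix has at most one non-000 ID in the sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, weight INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("00011223344", 35, "A"), ("00011223344", 10, "A"),
     ("12311223344", 100, "B"), ("00034343434", 25, "A"),
     ("00034343434", 25, "A"), ("99934343434", 200, "C"),
     ("88855667788", 100, "D")],
)

# Inner query: rows sharing the last 8 characters form one partition, and
# the largest id in it (the non-000 one) becomes actual_id.
# Outer query: 000 rows keep their weight; the other row has the summed
# weight of its 000 rows subtracted.
rows = conn.execute("""
    SELECT id, actual_id, weight,
           CASE WHEN id LIKE '000%' THEN weight
                ELSE weight - SUM(CASE WHEN id LIKE '000%' THEN weight ELSE 0 END)
                              OVER (PARTITION BY actual_id)
           END AS actual_weight,
           type
    FROM (SELECT t.*,
                 MAX(id) OVER (PARTITION BY substr(id, 4)) AS actual_id
          FROM t) AS x
""").fetchall()

result = sorted(rows)
for row in result:
    print(row)
```

The row for 88855667788 has no matching 000 rows, so nothing is subtracted and its weight stays at 100.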

SQL - SELECT all households by last value

I'm facing a problem that I can't wrap my head around, so maybe you can help me solve it?
I have one table:
id | datetime | property | house_id | household_id | plug_id | value
---+--------------------+----------+----------+--------------+---------+--------
1 |2013-08-31 22:00:01 | 0 | 1 | 1 | 1 | 15
2 |2013-08-31 22:00:01 | 0 | 1 | 1 | 3 | 3
3 |2013-08-31 22:00:01 | 0 | 1 | 2 | 1 | 21
4 |2013-08-31 22:00:01 | 0 | 1 | 2 | 2 | 1
5 |2013-08-31 22:00:01 | 0 | 2 | 1 | 3 | 53
6 |2013-08-31 22:00:02 | 0 | 2 | 2 | 4 | 34
7 |2013-08-31 22:00:02 | 0 | 1 | 1 | 1 | 16
...
The table holds electricity consumption measurements per second for multiple houses that have multiple households (apartments) in them. Each household has multiple electricity plugs. None of the houses or households have a unique id but are identified by a combination of house_id and household_id.
1) I need a SQL query that can give me a list of all the unique households.
2) I want to use the list from 1) to create a SQL query that gives me a list of the highest value for each household (the value is cumulative, so the latest datetime holds the highest value). I need a total value (SUM) for each household (sum of all the plugs in that household), i.e. a list of households with their total electricity consumption.
Is this even possible? I'm using SQL Server 2012 and the table has 100.000.000 rows.
If I understand correctly, you want the sum of the highest values of value, for house/household/plug combinations. This may do what you want:
select house_id, household_id, sum(maxvalue)
from (select house_id, household_id, plug_id, max(value) as maxvalue
      from consumption
      group by house_id, household_id, plug_id
     ) c
group by house_id, household_id;
According to your description, I think you can use this query:
select house_id,household_id, max(value), sum(value) from your_table_name group by house_id,household_id
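The two queries above are not equivalent when a plug has more than one reading. A sketch that runs both on the seven sample rows, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption for illustration) and the table name consumption from the first answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE consumption (
        id INTEGER, datetime TEXT, property INTEGER,
        house_id INTEGER, household_id INTEGER, plug_id INTEGER, value INTEGER)
""")
conn.executemany(
    "INSERT INTO consumption VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, "2013-08-31 22:00:01", 0, 1, 1, 1, 15),
     (2, "2013-08-31 22:00:01", 0, 1, 1, 3, 3),
     (3, "2013-08-31 22:00:01", 0, 1, 2, 1, 21),
     (4, "2013-08-31 22:00:01", 0, 1, 2, 2, 1),
     (5, "2013-08-31 22:00:01", 0, 2, 1, 3, 53),
     (6, "2013-08-31 22:00:02", 0, 2, 2, 4, 34),
     (7, "2013-08-31 22:00:02", 0, 1, 1, 1, 16)],
)

# First query: highest (latest, since values are cumulative) reading per
# plug, then summed per household.
per_plug_max = conn.execute("""
    SELECT house_id, household_id, SUM(maxvalue) AS total
    FROM (SELECT house_id, household_id, plug_id, MAX(value) AS maxvalue
          FROM consumption
          GROUP BY house_id, household_id, plug_id) AS c
    GROUP BY house_id, household_id
    ORDER BY house_id, household_id
""").fetchall()

# Second query: MAX and SUM taken directly over every reading of a household.
plain = conn.execute("""
    SELECT house_id, household_id, MAX(value), SUM(value)
    FROM consumption
    GROUP BY house_id, household_id
    ORDER BY house_id, household_id
""").fetchall()

print(per_plug_max)
print(plain)
```

For house 1, household 1, plug 1 has two readings (15 and 16). The first query counts only the later one and gives 16 + 3 = 19, while sum(value) in the second adds every reading (15 + 3 + 16 = 34). With cumulative values and one row per second, that difference grows with every additional reading.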

SQL Group By Having Where Statements

I have an MS Access table tracking product quantities at month end, as shown below.
I need to get the latest quantity for a specified ProductId as of a specified date, e.g.
The Quantity for ProductId 1 on 15-Feb-12 is 100; the Quantity for ProductId 1 on 15-Mar-12 is 150.
ProductId | ReportingDate | Quantity|
1 | 31-Jan-12 | 100 |
2 | 31-Jan-12 | 200 |
1 | 28-Feb-12 | 150 |
2 | 28-Feb-12 | 250 |
1 | 31-Mar-12 | 180 |
2 | 31-Mar-12 | 280 |
My SQL statement below brings back all previous values instead of only the latest one. Could anyone help me troubleshoot the query?
SELECT Sheet1.ProductId, Max(Sheet1.ReportingDate) AS MaxOfReportingDate, Sheet1.Quantity
FROM Sheet1
GROUP BY Sheet1.ProductId, Sheet1.Quantity, Sheet1.ReportingDate, Sheet1.ProductId
HAVING (((Sheet1.ReportingDate)<#3/15/2012#) AND ((Sheet1.ProductId)=1))
Here's @naveen's idea:
SELECT TOP 1 Sheet1.ProductId, Sheet1.ReportingDate AS MaxOfReportingDate, Sheet1.Quantity
FROM Sheet1
WHERE (Sheet1.ProductId = 1)
AND (Sheet1.ReportingDate < #2012/03/15#)
ORDER BY Sheet1.ReportingDate DESC
Note, though, that MS Access selects TOP with ties, so this won't return a single row if you have more than one row per ReportingDate, ProductId combo. (But at the same time, that would mean the data isn't deterministic anyway.)
Edit - I meant that if you have a contradiction in your data like below, you'll get 2 rows back.
ProductId | ReportingDate | Quantity|
1 | 31-Jan-12 | 100
1 | 31-Jan-12 | 200
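The "filter, sort descending, take the first row" pattern can be sketched as below. This is not Access SQL: it uses Python's built-in sqlite3 module, where LIMIT 1 takes the place of TOP 1, and it stores dates as ISO strings so they sort correctly. The helper name latest_quantity is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sheet1 (ProductId INTEGER, ReportingDate TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO Sheet1 VALUES (?, ?, ?)",
    [(1, "2012-01-31", 100), (2, "2012-01-31", 200),
     (1, "2012-02-28", 150), (2, "2012-02-28", 250),
     (1, "2012-03-31", 180), (2, "2012-03-31", 280)],
)

def latest_quantity(product_id, before_date):
    """Return the newest row for product_id dated before before_date, or None."""
    return conn.execute(
        """
        SELECT ProductId, ReportingDate, Quantity
        FROM Sheet1
        WHERE ProductId = ? AND ReportingDate < ?
        ORDER BY ReportingDate DESC
        LIMIT 1
        """,
        (product_id, before_date),
    ).fetchone()

feb = latest_quantity(1, "2012-02-15")
mar = latest_quantity(1, "2012-03-15")
print(feb, mar)
```

Unlike TOP 1 in Access, LIMIT 1 always returns at most one row, so with duplicate ReportingDate rows like those above it would pick one of them arbitrarily instead of returning both.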