MS-SQL get difference value - sql

I have this query that calculates current gallons value from all fuel tanks in my database.
SELECT DISTINCT y.TankNumber as TankNumber
, y.Gallons as Gallons
, y.timeUpdated
, y.FuelType as FuelType
FROM (
SELECT TankNumber, max(timeUpdated) as maxdate
FROM someTable
GROUP BY TankNumber) as x
JOIN someTable y
ON x.TankNumber = y.TankNumber
AND x.maxdate = y.timeUpdated
ORDER BY y.TankNumber
Data is written to my database automatically whenever fuel is used, so updates can arrive at any time. The query above gives me only the current gallons value for each fuel tank:
TankNumber | Gallons | timeUpdated | FuelType
1 | 14 | 2012-10-22 04:16 | 89
2 | 8 | 2012-10-22 04:14 | 93
and so on.
My problem is that I am trying to add another output value to my page that shows how much the fuel level has changed since the last update. It would look something like this:
TankNumber | Gallons | timeUpdated | FuelType | GallonsUsed
1 | 14 | 2012-10-22 04:16 | 89 | 5
2 | 8 | 2012-10-22 04:14 | 93 | -11
Unfortunately, my SQL experience is not solid enough for this type of problem, and I have spent about two days trying to figure it out or find something similar online. Any help will be greatly appreciated.

Assuming you're using SQL Server 2005 or later, you can use the ROW_NUMBER function:
WITH cteOrderedUpdates As
(
SELECT
TankNumber,
Gallons,
TimeUpdated,
FuelType,
ROW_NUMBER() OVER
(
PARTITION BY
TankNumber
ORDER BY
TimeUpdated DESC
) As RowNumber
FROM
someTable
)
SELECT
x.TankNumber,
x.Gallons,
x.TimeUpdated,
x.FuelType,
x.Gallons - IsNull(y.Gallons, 0) As GallonsUsed
FROM
cteOrderedUpdates As x
LEFT JOIN cteOrderedUpdates As y
ON x.TankNumber = y.TankNumber
And x.RowNumber = y.RowNumber - 1
WHERE
x.RowNumber = 1
ORDER BY
x.TankNumber
;
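On SQL Server 2012 or later, LAG can read the previous row directly, which avoids the self-join. The sketch below is only an illustration of that idea: it runs against an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions), and the earlier readings of 9 and 19 gallons are invented sample data chosen to reproduce the differences shown in the question.

```python
import sqlite3

# In-memory stand-in for someTable. The earlier readings (9 and 19 gallons)
# are invented sample data, not values from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE someTable ("
    "TankNumber INTEGER, Gallons INTEGER, timeUpdated TEXT, FuelType INTEGER)"
)
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?, ?, ?)",
    [
        (1, 9, "2012-10-22 03:00", 89),
        (1, 14, "2012-10-22 04:16", 89),
        (2, 19, "2012-10-22 02:30", 93),
        (2, 8, "2012-10-22 04:14", 93),
    ],
)

query = """
WITH ordered AS (
    SELECT TankNumber, Gallons, timeUpdated, FuelType,
           -- current reading minus the previous one (0 if there is none)
           Gallons - LAG(Gallons, 1, 0) OVER (
               PARTITION BY TankNumber ORDER BY timeUpdated
           ) AS GallonsUsed,
           ROW_NUMBER() OVER (
               PARTITION BY TankNumber ORDER BY timeUpdated DESC
           ) AS rn
    FROM someTable
)
SELECT TankNumber, Gallons, timeUpdated, FuelType, GallonsUsed
FROM ordered
WHERE rn = 1          -- keep only the latest reading per tank
ORDER BY TankNumber
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

As in the answer above, the difference is the current reading minus the previous one, so a drop in the tank level shows up as a negative number.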

Related

Self join to create a new column with updated records

I am trying to write a SQL query to get the start date for employees in a store. As seen in the first screenshot, employee number 5041 previously had the number A0EH, but when the number was updated, the employee's start date was updated as well. This affects the metric of total duration in the store.
I am trying to get to the output below but haven't been able to figure out how to get this view.
This is the code I was trying but I am not getting the correct output.
select
esd.employee_number,
(case when esd.old_employee_number is null then es.employee_number else es.old_employee_number end) as old_employee_number,
esd.entity_id,
esd.original_start_date
from earliest_start_date as esd
left join earliest_start_date as es
on (es.employee_number = esd.old_employee_number)
How do I solve this in SQL?
Redshift reportedly supports recursion via the WITH clause, and MariaDB 10.5 has similar support. For details, see the Amazon Redshift documentation on the WITH clause and on window functions.
Here's an example, with a fully working test case run on MariaDB 10.5:
WITH RECURSIVE cte (employee_number, original_no, entity_id, original_start_date, n) AS (
SELECT employee_number, employee_number, entity_id, original_start_date, 1
FROM earliest_start_date
WHERE old_employee_number IS NULL
UNION ALL
SELECT new_tbl.employee_number, cte.original_no, cte.entity_id, cte.original_start_date, n+1
FROM earliest_start_date new_tbl
JOIN cte
ON cte.employee_number = new_tbl.old_employee_number
)
, xrows AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY n DESC) AS rn
FROM cte
)
SELECT * FROM xrows WHERE rn = 1
;
Result:
+-----------------+-------------+-----------+---------------------+------+----+
| employee_number | original_no | entity_id | original_start_date | n | rn |
+-----------------+-------------+-----------+---------------------+------+----+
| XXXX | XXXX | 88 | 2021-09-02 | 1 | 1 |
| 5041 | A0EH | 96 | 2021-09-05 | 2 | 1 |
+-----------------+-------------+-----------+---------------------+------+----+
2 rows in set
Raw test data:
SELECT * FROM earliest_start_date;
+-----------------+---------------------+-----------+---------------------+
| employee_number | old_employee_number | entity_id | original_start_date |
+-----------------+---------------------+-----------+---------------------+
| 5041 | A0EH | 96 | 2021-09-10 |
| A0EH | NULL | 96 | 2021-09-05 |
| XXXX | NULL | 88 | 2021-09-02 |
+-----------------+---------------------+-----------+---------------------+
Note that the logic assumes employee_number is unique. In its current form, it can't handle cases where an employee_number is reused by the same employee, or used again for a different employee, without adjusting prior data. There may not be enough detail in the current structure to handle those cases.

How to aggregate based on various conditions

Let's say I have a table which stores ItemID, Date and Total_shipped over a period of time:
ItemID | Date | Total_shipped
__________________________________
1 | 1/20/2000 | 2
2 | 1/20/2000 | 3
1 | 1/21/2000 | 5
2 | 1/21/2000 | 4
1 | 1/22/2000 | 1
2 | 1/22/2000 | 7
1 | 1/23/2000 | 5
2 | 1/23/2000 | 6
Now I want to aggregate over several periods of time. For example, I want to know how many of each item were shipped in each two-day period and in total. The desired output should look something like:
ItemID | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
_____________________________________________
1 | 7 | 6 | 13
2 | 7 | 13 | 20
How do I do that in the most efficient way?
I know I can write three different subqueries, but I think there should be a better way. My real data is large and there are several different time periods to consider: in my real problem I want the shipped items for current_week, last_week, two_weeks_ago, three_weeks_ago, last_month, two_months_ago and three_months_ago, so I do not think writing 7 different subqueries would be a good idea.
Here is the general idea of what I can already run, but it is very expensive for the database:
WITH
sq1 as (
SELECT ItemID, sum(Total_shipped) sum1
FROM table
WHERE Date BETWEEN '1/20/2000' and '1/21/2000'
GROUP BY ItemID),
sq2 as (
SELECT ItemID, sum(Total_Shipped) sum2
FROM table
WHERE Date BETWEEN '1/22/2000' and '1/23/2000'
GROUP BY ItemID),
sq3 as(
SELECT ItemID, sum(Total_Shipped) sum3
FROM Table
GROUP BY ItemID)
SELECT ItemID, sq1.sum1, sq2.sum2, sq3.sum3
FROM Table
JOIN sq1 on Table.ItemID = sq1.ItemID
JOIN sq2 on Table.ItemID = sq2.ItemID
JOIN sq3 on Table.ItemID = sq3.ItemID
I don't know why you have tagged this question with multiple databases.
Anyway, you can use conditional aggregation, as follows in Oracle:
select
item_id,
sum(case when "date" between date'2000-01-20' and date'2000-01-21' then total_shipped end) as "Jan20-Jan21",
sum(case when "date" between date'2000-01-22' and date'2000-01-23' then total_shipped end) as "Jan22-Jan23",
sum(case when "date" between date'2000-01-20' and date'2000-01-23' then total_shipped end) as "Jan20-Jan23"
from my_table
group by item_id
Cheers!!
Use FILTER:
select
item_id,
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-21') as "Jan20-Jan21",
sum(total_shipped) filter (where date between '2000-01-22' and '2000-01-23') as "Jan22-Jan23",
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-23') as "Jan20-Jan23"
from my_table
group by 1
item_id | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
---------+-------------+-------------+-------------
1 | 7 | 6 | 13
2 | 7 | 13 | 20
(2 rows)
Db<>fiddle.
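Both answers scan the table once and let each SUM pick out its own date range. A minimal sketch to check that idea against the question's sample data, assuming an in-memory SQLite database driven from Python's sqlite3 module; the CASE form is used here because it is the more portable of the two, and the column name ship_date and the ISO date strings are assumptions made for this illustration.

```python
import sqlite3

# In-memory copy of the question's sample rows. ship_date is a hypothetical
# column name, stored as ISO text so that BETWEEN compares correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (item_id INTEGER, ship_date TEXT, total_shipped INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "2000-01-20", 2), (2, "2000-01-20", 3),
        (1, "2000-01-21", 5), (2, "2000-01-21", 4),
        (1, "2000-01-22", 1), (2, "2000-01-22", 7),
        (1, "2000-01-23", 5), (2, "2000-01-23", 6),
    ],
)

# One pass over the table: each SUM only counts rows inside its own range,
# because CASE yields NULL elsewhere and SUM ignores NULLs.
query = """
SELECT item_id,
       SUM(CASE WHEN ship_date BETWEEN '2000-01-20' AND '2000-01-21'
                THEN total_shipped END) AS "Jan20-Jan21",
       SUM(CASE WHEN ship_date BETWEEN '2000-01-22' AND '2000-01-23'
                THEN total_shipped END) AS "Jan22-Jan23",
       SUM(CASE WHEN ship_date BETWEEN '2000-01-20' AND '2000-01-23'
                THEN total_shipped END) AS "Jan20-Jan23"
FROM my_table
GROUP BY item_id
ORDER BY item_id
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Adding another period, such as last_week or last_month, means adding one more SUM(CASE ...) column rather than another subquery and join.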

POSTGRESQL : How to select the first row of each group?

With this query:
WITH responsesNew AS
(
SELECT DISTINCT responses."studentId", notation, responses."givenHeart",
SUM(notation + responses."givenHeart") OVER (partition BY responses."studentId"
ORDER BY responses."createdAt") AS total, responses."createdAt"
FROM responses
)
SELECT responsesNew."studentId", notation, responsesNew."givenHeart", total,
responsesNew."createdAt"
FROM responsesNew
WHERE total = 3
GROUP BY responsesNew."studentId", notation, responsesNew."givenHeart", total,
responsesNew."createdAt"
ORDER BY responsesNew."studentId" ASC
I get this data table:
studentId | notation | givenHeart | total | createdAt |
----------+----------+------------+-------+--------------------+
374 | 1 | 0 | 3 | 2017-02-13 12:43:03
374 | null | 0 | 3 | 2017-02-15 22:22:17
639 | 1 | 2 | 3 | 2017-04-03 17:21:30
790 | 1 | 0 | 3 | 2017-02-12 21:12:23
...
My goal is to keep only the earliest row of each group in my data table, as shown below:
studentId | notation | givenHeart | total | createdAt |
----------+----------+------------+-------+--------------------+
374 | 1 | 0 | 3 | 2017-02-13 12:43:03
639 | 1 | 2 | 3 | 2017-04-03 17:21:30
790 | 1 | 0 | 3 | 2017-02-12 21:12:23
...
How can I get there?
I've read many topics here, but nothing I've tried with DISTINCT, DISTINCT ON, subqueries in WHERE, LIMIT, etc. has worked for me (surely due to my poor understanding). I've run into errors related to window functions, a missing column in ORDER BY, and a few others I can't remember.
You can do this with distinct on. The query would look like this:
WITH responsesNew AS (
SELECT DISTINCT r."studentId", notation, r."givenHeart",
SUM(notation + r."givenHeart") OVER (partition BY r."studentId"
ORDER BY r."createdAt") AS total,
r."createdAt"
FROM responses r
)
SELECT DISTINCT ON (r."studentId") r."studentId", notation, r."givenHeart", total,
r."createdAt"
FROM responsesNew r
WHERE total = 3
ORDER BY r."studentId" ASC, r."createdAt";
I'm pretty sure this can be simplified. I just don't understand the purpose of the CTE. Using SELECT DISTINCT in this way is very curious.
If you want a simplified query, ask another question with sample data, desired results, and explanation of what you are doing and include the query or a link to this question.
Use the ROW_NUMBER() window function to number the rows in each partition, and then show only row 1.
There is no need to fully qualify names when only one table is involved, and aliases make qualified names easier to read.
WITH responsesNew AS
(
SELECT "studentId"
, notation
, "givenHeart"
, SUM(notation + "givenHeart") OVER (PARTITION BY "studentId" ORDER BY "createdAt") AS total
, "createdAt"
FROM responses
),
ranked AS
(
-- Number only the rows that reach total = 3, earliest first per student
SELECT RN.*
, ROW_NUMBER() OVER (PARTITION BY "studentId" ORDER BY "createdAt") AS RNum
FROM responsesNew RN
WHERE total = 3
)
SELECT "studentId"
, notation
, "givenHeart"
, total
, "createdAt"
FROM ranked
WHERE RNum = 1
ORDER BY "studentId" ASC

SQL GROUP BY and differences on same field (for MS Access)

Hi, I have the following style of table under MS Access (I didn't make the table and can't change it):
Date_r | Id_Person |Points |Position
25/05/2015 | 120 | 2000 | 1
25/05/2015 | 230 | 1500 | 2
25/05/2015 | 100 | 500 | 3
21/12/2015 | 120 | 2200 | 1
21/12/2015 | 230 | 2000 | 4
21/12/2015 | 100 | 200 | 20
What I am trying to do is get a list of players (identified by Id_Person) ordered by the points difference between 2 dates.
So for example if I pick date1=25/05/2015 and date2=21/12/2015 I would get:
Id_Person |Points_Diff
230 | 500
120 | 200
100 |-300
I think I need to make something like
SELECT Id_Person , MAX(Points)-MIN(Points)
FROM Table
WHERE date_r = #25/05/2015# or date_r = #21/12/2015#
GROUP BY Id_Person
ORDER BY MAX(Points)-MIN(Points) DESC
But my problem is that I don't really want to order by (MAX(Points)-MIN(Points)) but rather by (points at date2 - points at date1), which can be different because points can decrease over time.
One method is to use FIRST and LAST. However, this can sometimes produce strange results, so I think that conditional aggregation is best:
SELECT Id_Person,
(MAX(IIF(date_r = #21/12/2015#, Points, Null)) -
MAX(IIF(date_r = #25/05/2015#, Points, Null))
) as PointsDiff
FROM Table
WHERE date_r IN (#25/05/2015#, #21/12/2015#)
GROUP BY Id_Person
ORDER BY (MAX(IIF(date_r = #21/12/2015#, Points, Null)) -
MAX(IIF(date_r = #25/05/2015#, Points, Null))
) DESC;
Because you have two dates, this is more easily written as:
SELECT Id_Person,
SUM(IIF(date_r = #21/12/2015#, Points, -Points)) as PointsDiff
FROM Table
WHERE date_r IN (#25/05/2015#, #21/12/2015#)
GROUP BY Id_Person
ORDER BY SUM(IIF(date_r = #21/12/2015#, Points, -Points)) DESC;
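The question asks for points at date2 minus points at date1, so the later date has to carry the positive sign. A minimal sketch of that signed sum, checked against the question's sample rows, assuming an in-memory SQLite database driven from Python's sqlite3 module; SQLite's CASE stands in for Access's IIF, and the table name player_points and the ISO date strings are assumptions made for this illustration.

```python
import sqlite3

# In-memory copy of the question's sample rows. player_points is a
# hypothetical table name; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_points ("
    "date_r TEXT, Id_Person INTEGER, Points INTEGER, Position INTEGER)"
)
conn.executemany(
    "INSERT INTO player_points VALUES (?, ?, ?, ?)",
    [
        ("2015-05-25", 120, 2000, 1),
        ("2015-05-25", 230, 1500, 2),
        ("2015-05-25", 100, 500, 3),
        ("2015-12-21", 120, 2200, 1),
        ("2015-12-21", 230, 2000, 4),
        ("2015-12-21", 100, 200, 20),
    ],
)

# Points on the later date count positively, points on the earlier date
# negatively, so the sum per person is (date2 points - date1 points).
query = """
SELECT Id_Person,
       SUM(CASE WHEN date_r = '2015-12-21' THEN Points ELSE -Points END)
           AS Points_Diff
FROM player_points
WHERE date_r IN ('2015-05-25', '2015-12-21')
GROUP BY Id_Person
ORDER BY Points_Diff DESC
"""
diffs = conn.execute(query).fetchall()
for row in diffs:
    print(row)
```

This relies on each person having exactly one row per date; a person missing from one of the two dates would show their raw points, positive or negative, instead of a difference.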

SQL Query Compare values in per 15 minutes and display the result per hour

I have a table with 2 columns: UTCTime and Value.
UTCTime is in 15-minute increments. I want a query that compares each value to the previous value within a one-hour span and displays a number between 0 and 4, depending on how many consecutive values are constant. In other words, there is an entry for every 15-minute increment, the value can stay constant, and I just need to compare each value to the previous one, hour by hour.
For example
+---------|-------+
| UTCTime | Value |
------------------|
| 12:00 | 18.2 |
| 12:15 | 87.3 |
| 12:30 | 55.91 |
| 12:45 | 55.91 |
| 1:00 | 37.3 |
| 1:15 | 47.3 |
| 1:30 | 47.3 |
| 1:45 | 47.3 |
| 2:00 | 37.3 |
+---------|-------+
In this case, I just want a query that compares the 12:45 value to the 12:30 value, 12:30 to 12:15, and so on. Since we are comparing within a one-hour span only, the count of constant values must be between 0 and 4 (0 means there are no constant values; 1 means there is one, as in the 12:00 hour of the example above).
The query should display:
+----------+----------------+
| UTCTime | ConstantValues |
----------------------------|
| 12:00 | 1 |
| 1:00 | 2 |
+----------|----------------+
I just wanted to mention that I am new to SQL programming.
Thank you.
See SQL fiddle here
Below is the query you need, along with a working solution. Note: I changed the timeframe to 24 hrs.
;with SourceData(HourTime, Value, RowNum)
as
(
select
datepart(hh, UTCTime) HourTime,
Value,
row_number() over (partition by datepart(hh, UTCTime) order by UTCTime) RowNum
from foo
union
select
datepart(hh, UTCTime) - 1 HourTime,
Value,
5
from foo
where datepart(mi, UTCTime) = 0
)
select cast(A.HourTime as varchar) + ':00' UTCTime, sum(case when A.Value = B.Value then 1 else 0 end) ConstantValues
from SourceData A
inner join SourceData B on A.HourTime = B.HourTime and
(B.RowNum = (A.RowNum - 1))
group by cast(A.HourTime as varchar) + ':00'
select SUBSTRING_INDEX(UTCTime,':',1) as time,value, count(*)-1 as total
from foo group by value,time having total >= 1;
fiddle
Mine isn't much different from Vasanth's, same idea different approach.
The idea is that a self-join on a ranked set of rows carries it out simply. You could also use the LEAD() function to look at rows ahead of your current row, but in this case that would require a big case statement to cover every outcome.
;WITH T
AS (
SELECT a.UTCTime,b.VALUE,ROW_NUMBER() OVER(PARTITION BY a.UTCTime ORDER BY b.UTCTime DESC)'RowRank'
FROM (SELECT *
FROM #Table1
WHERE DATEPART(MINUTE,UTCTime) = 0
)a
JOIN #Table1 b
ON b.UTCTIME BETWEEN a.UTCTIME AND DATEADD(hour,1,a.UTCTIME)
)
SELECT T.UTCTime, SUM(CASE WHEN T.Value = T2.Value THEN 1 ELSE 0 END)
FROM T
JOIN T T2
ON T.UTCTime = T2.UTCTime
AND T.RowRank = T2.RowRank -1
GROUP BY T.UTCTime
If you run the portion inside the ;WITH T AS ( ), you'll see that it gives us the hour we're looking at and the values ordered by time. The outer query then joins T to itself and compares each row with the next row in that ordering (hence the RowRank - 1 in the JOIN).
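For comparison, here is one possible way to express the same pairing with LAG instead of a self-join, where each comparison is credited to the hour of the earlier reading so that the top-of-the-hour value is compared with the last reading of the previous hour. It is a sketch under stated assumptions, not a drop-in replacement for either query above: it runs on an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or later is assumed), and the date 2020-01-01 and the 24-hour timestamps are invented so that the rows sort correctly.

```python
import sqlite3

# In-memory copy of the question's sample values. The date and the 24-hour
# clock (13:00 for "1:00") are assumptions added so the rows sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (UTCTime TEXT, Value REAL)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [
        ("2020-01-01 12:00", 18.2),
        ("2020-01-01 12:15", 87.3),
        ("2020-01-01 12:30", 55.91),
        ("2020-01-01 12:45", 55.91),
        ("2020-01-01 13:00", 37.3),
        ("2020-01-01 13:15", 47.3),
        ("2020-01-01 13:30", 47.3),
        ("2020-01-01 13:45", 47.3),
        ("2020-01-01 14:00", 37.3),
    ],
)

query = """
WITH paired AS (
    -- attach the previous reading to every row
    SELECT UTCTime, Value,
           LAG(UTCTime) OVER (ORDER BY UTCTime) AS PrevTime,
           LAG(Value)   OVER (ORDER BY UTCTime) AS PrevValue
    FROM foo
)
SELECT strftime('%H:00', MIN(PrevTime)) AS HourStart,
       SUM(CASE WHEN Value = PrevValue THEN 1 ELSE 0 END) AS ConstantValues
FROM paired
WHERE PrevTime IS NOT NULL                    -- the first row has no predecessor
GROUP BY strftime('%Y-%m-%d %H', PrevTime)    -- hour of the earlier reading
ORDER BY MIN(PrevTime)
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Each hour gets at most four comparisons (:15, :30, :45 and the following :00), which matches the 0 to 4 range the question describes.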