SQL Server: most efficient way to update multiple records depending on each other - sql

I want to update multiple records from table "a" depending on each other. The values of the table "a" look like:
+------------+---------------+-------+
| date | transfervalue | value |
+------------+---------------+-------+
| 01.03.2018 | 0 | 10 |
| 02.03.2018 | 0 | 6 |
| 03.03.2018 | 0 | 13 |
+------------+---------------+-------+
After the update the values of the table "a" should look like:
+------------+---------------+-------+
| date | transfervalue | value |
+------------+---------------+-------+
| 01.03.2018 | 0 | 10 |
| 02.03.2018 | 10 | 6 |
| 03.03.2018 | 16 | 13 |
+------------+---------------+-------+
What is the most efficient way to do this? I've tried three different solutions, but the last solution doesn't work.
Solution 1: do a loop and iterate over each day to do the update statement
Solution 2: do an update statement statement for each day
Solution 3: do the update for the whole timespan in one statement
The output of solution 3 was:
+------------+---------------+-------+
| date | transfervalue | value |
+------------+---------------+-------+
| 01.03.2018 | 0 | 10 |
| 02.03.2018 | 10 | 6 |
| 03.03.2018 | 6 | 13 |
+------------+---------------+-------+

You seem to want a cumulative sum:
with toupdate as (
select t.*, sum(value) over (order by date rows between unbounded preceding and 1 preceding) as running_value
from t
)
update toupdate
set transfervalue = coalesce(running_value, 0);

This should work:
select t1.*,
coalesce((select sum(value) from table1 t2 where t2.date < t1.date), 0) MyNewValue
from table1 t1

Related

How to de-duplicate SQL table rows by multiple columns with hierarchy?

I have a table with multiple records for each patient.
My end goal is a table that is 1-to-1 between Patient_id and Value.
I would like to de-duplicate (in respect to patient_id) my rows based on "a hierarchical series of aggregate functions" (if someone has a better way to phrase this, I'd appreciate that as well.)
+----+------------+------------+------------+----------+-----------------+-------+
| ID | patient_id | Date | Date2 | Priority | Source | Value |
+----+------------+------------+------------+----------+-----------------+-------+
| 1 | 1 | 2017-09-09 | 2018-09-09 | 1 | 'verified' | 55 |
| 2 | 1 | 2017-09-09 | 2018-11-11 | 2 | 'verified' | 78 |
| 3 | 1 | 2017-11-11 | 2018-09-09 | 3 | 'verified' | 23 |
| 4 | 1 | 2017-11-11 | 2018-11-11 | 1 | 'self_reported' | 11 |
| 5 | 1 | 2017-09-09 | 2018-09-09 | 2 | 'self_reported' | 90 |
| 5 | 1 | 2017-09-09 | 2018-09-09 | 3 | 'self_reported' | 34 |
| 6 | 2 | 2017-11-11 | 2018-09-09 | 2 | 'self_reported' | 21 |
+----+------------+------------+------------+----------+-----------------+-------+
For each patient_id, I would like to get the row(s) that has/have the MAX(Date). In the case that there are still duplicated patient_id, I would like to get the row(s) with the MIN(Priority). In the case that there are still duplicated rows I would like to get the row(s) with the MIN(Date2).
The way I've approached this problem is using a series of queries like this to de-duplicate on the columns one at a time.
SELECT *
FROM #table t1
LEFT JOIN
(SELECT
patient_id,
MIN(priority) AS min_priority
FROM #table
GROUP BY patient_id) t2 ON t2.patient_id = t1.patient_id
WHERE t2.min_priority = t1.priority
Is there a way to do this that allows me to de-dup on multiple columns at once? Is there a more elegant way to do this?
I'm able to get my results, but my solution feels very inefficient, and I keep running into this. Thank you for any input.
You could use row_number(), if your RDBMS supports it:
select ID, patient_id, Date, Date2, Priority, Source, Value
from (
select
t.*,
row_number() over(partition by patient_id order by Date desc, Priority, Date2) rn
from mytable t
) where rn = 1
Another option is to filter with a correlated subquery that sorts the record according to your criteria, like so:
select t.*
from mytable t
where id = (
select id
from mytable t1
where t1.patient_id = t.patient_id
order by t1.Date desc, t1.Priority, t1.Date2
limit 1
)
The actual syntax for limit varies accross RDBMS.

How to select values, where each one depends on a previously aggregated state?

I have the following table:
|-----|-----|
| i d | val |
|-----|-----|
| 1 | 1 |
|-----|-----|
| 2 | 4 |
|-----|-----|
| 3 | 3 |
|-----|-----|
| 4 | 7 |
|-----|-----|
Can I get the following output:
|-----|
| sum |
|-----|
| 1 |
|-----|
| 5 |
|-----|
| 8 |
|-----|
| 1 5 |
|-----|
using a single SQLite3 SELECT-query? I know it could be easily achieved using variables, but SQLite3 lacks those. Maybe some recursive query? Thanks.
No.
In a relational database table rows do not have any order. If you specify an order for the rows, then it's possible to write a query.
Now, you could add an extra column to sort the rows. For example:
| val | sort
|-----|-----
| 1 | 10
| 4 | 20
| 3 | 30
| 7 | 40
The query could be:
select
sum(val) over(order by sort)
from my_table
For the updated question, you can write:
select
sum(val) over(order by id)
from my_table
By using the order of the id column and if you want only the sum column, you can do this:
select (select sum(val) from tablename where id <= t.id) sum
from tablename t

calculating sum of rows with identical id

Let's imagine a table with two columns ex:
| Value | ID |
+-------+----+
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 2 |
What I am trying to do is to calculate the sum of those with similar id and display them in different table like:
| Sum | ID |
+-----+----+
| 9 | 1 |
| 5 | 2 |
and so on.
I could find a sum of a known id by
SELECT SUM(VALUE) FROM MYTABLE WHERE ID = 1;
However not sure on how to find sum of different id's separately, could you give an idea on how to proceed?
Select SUM(VALUE),ID FROM MYTABLE GROUP BY ID
Use GROUP BY clause:
SELECT SUM(VALUE) Sum, ID FROM MYTABLE GROUP BY ID;
SELECT SUM(VALUE),ID FROM MYTABLE Group By ID

Union in outer query

I'm attempting to combine multiple rows using a UNION but I need to pull in additional data as well. My thought was to use a UNION in the outer query but I can't seem to make it work. Or am I going about this all wrong?
The data I have is like this:
+------+------+-------+---------+---------+
| ID | Time | Total | Weekday | Weekend |
+------+------+-------+---------+---------+
| 1001 | AM | 5 | 5 | 0 |
| 1001 | AM | 2 | 0 | 2 |
| 1001 | AM | 4 | 1 | 3 |
| 1001 | AM | 5 | 3 | 2 |
| 1001 | PM | 5 | 3 | 2 |
| 1001 | PM | 5 | 5 | 0 |
| 1002 | PM | 4 | 2 | 2 |
| 1002 | PM | 3 | 3 | 0 |
| 1002 | PM | 1 | 0 | 1 |
+------+------+-------+---------+---------+
What I want to see is like this:
+------+---------+------+-------+
| ID | DayType | Time | Tasks |
+------+---------+------+-------+
| 1001 | Weekday | AM | 9 |
| 1001 | Weekend | AM | 7 |
| 1001 | Weekday | PM | 8 |
| 1001 | Weekend | PM | 2 |
| 1002 | Weekday | PM | 5 |
| 1002 | Weekend | PM | 3 |
+------+---------+------+-------+
The closest I've come so far is using UNION statement like the following:
SELECT * FROM
(
SELECT Weekday, 'Weekday' as 'DayType' FROM t1
UNION
SELECT Weekend, 'Weekend' as 'DayType' FROM t1
) AS X
Which results in something like the following:
+---------+---------+
| Weekday | DayType |
+---------+---------+
| 2 | Weekend |
| 0 | Weekday |
| 2 | Weekday |
| 0 | Weekend |
| 10 | Weekday |
+---------+---------+
I don't see any rhyme or reason as to what the numbers are under the 'Weekday' column, I suspect they're being grouped somehow. And of course there are several other columns missing, but since I can't put a large scope in the outer query with this as inner one, I can't figure out how to pull those in. Help is greatly appreciated.
It looks like you want to union all a pair of aggregation queries that use sum() and group by id, time, one for Weekday and one for Weekend:
select Id, DayType = 'Weekend', [time], Tasks=sum(Weekend)
from t
group by id, [time]
union all
select Id, DayType = 'Weekday', [time], Tasks=sum(Weekday)
from t
group by id, [time]
Try with this
select ID, 'Weekday' as DayType, Time, sum(Weekday)
from t1
group by ID, Time
union all
select ID, 'Weekend', Time, sum(Weekend)
from t1
group by ID, Time
order by order by 1, 3, 2
Not tested, but it should do the trick. It may require 2 proc sql steps for the calculation, one for summing and one for the case when statements. If you have extra lines, just use a max statement and group by ID, Time, type_day.
Proc sql; create table want as select ID, Time,
sum(weekday) as weekdayTask,
sum(weekend) as weekendTask,
case when calculated weekdaytask>0 then weekdaytask
when calculated weekendtask>0 then weekendtask else .
end as Task,
case when calculated weekdaytask>0 then "Weekday"
when calculated weekendtask>0 then "Weekend"
end as Day_Type
from have
group by ID, Time
;quit;
Proc sql; create table want2 as select ID, Time, Day_Type, Task
from want
;quit;

sql - select row from group based on multiple values

I have a table like:
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| abc-2 | 30 |
| cde-1 | 10 |
| cde-2 | 10 |
| efg-1 | 20 |
| efg-2 | 11 |
and would like to get the result based on the substring(ID, 1, 3) and minimum value and ist must be only the first in case the Val has duplicates
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| cde-1 | 10 |
| efg-2 | 11 |
the problem is that I am stuck, because I cannot use group by substring(id,1,3), ID since it will then have again 2 rows (each for abc-1 and abc-2)
WITH
sorted
AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id
FROM
yourTable
)
SELECT
*
FROM
sorted
WHERE
sequence_id = 1
SELECT SUBSTRING(id,1,3),MIN(val) FROM Table1 GROUP BY SUBSTRING(id,1,3);
You were grouping the columns using both SUBSTRING(id,1,3),id instead of just SUBSTRING(id,1,3). It works perfectly fine.Check the same example in this below link.
http://sqlfiddle.com/#!3/fd9fc/1