SQL - Delete value if incremental pattern not met - sql

I have a table with a column of values with the following sample data that has been pulled for 1 user:
ID | Data
5 Record1
12 NULL
13 NULL
15 Record1
20 Record12
28 NULL
31 NULL
35 Record12
37 Record23
42 Record34
51 NULL
53 Record34
58 Record5
61 Record17
63 NULL
69 Record17
What I would like to do is delete any values in the Data column where the Data value does not have both a start and a finish record. So in the above, Record23 and Record5 would be deleted.
Please note that Record(n) may appear more than once, so it's not as straightforward as doing a count on the Data value. It needs to be incremental: a record should always start and finish before another one starts. If it starts and doesn't finish, then I want to remove it.

Sadly, SQL Server 2008 does not have LAG or LEAD, which would make the operation simpler.
You could use a common table expression to find the non-consecutive (non-NULL) values and delete them:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY id) rn FROM table1 WHERE data IS NOT NULL
)
DELETE c1 FROM cte c1
LEFT JOIN cte c2 ON (c1.rn = c2.rn+1 OR c1.rn = c2.rn-1) AND c1.data = c2.data
WHERE c2.id IS NULL
An SQLfiddle to test with.
If you just want to see which rows would be deleted, replace DELETE c1 with SELECT c1.*.
...and as always, remember to back up before running potentially destructive SQL for random people on the Internet.
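For SQL Server 2012 or later (or any engine that has LAG and LEAD), the same check can be sketched with window functions. This is a minimal, read-only sketch that assumes the table1(id, data) layout used in the answer above; it only lists the rows that have no matching neighbour and deletes nothing.

```sql
-- Assumed layout: table1(id, data), as in the answer above.
CREATE TABLE table1 (id INT PRIMARY KEY, data VARCHAR(20));

INSERT INTO table1 (id, data) VALUES
  (5, 'Record1'),  (12, NULL),       (13, NULL),       (15, 'Record1'),
  (20, 'Record12'), (28, NULL),      (31, NULL),       (35, 'Record12'),
  (37, 'Record23'), (42, 'Record34'), (51, NULL),      (53, 'Record34'),
  (58, 'Record5'),  (61, 'Record17'), (63, NULL),      (69, 'Record17');

-- Look at the previous and next non-NULL value for every non-NULL row.
WITH cte AS (
  SELECT id,
         data,
         LAG(data)  OVER (ORDER BY id) AS prev_data,
         LEAD(data) OVER (ORDER BY id) AS next_data
  FROM table1
  WHERE data IS NOT NULL
)
-- A row is unpaired when neither neighbour carries the same value.
SELECT id, data
FROM cte
WHERE (prev_data IS NULL OR prev_data <> data)
  AND (next_data IS NULL OR next_data <> data)
ORDER BY id;
-- With the sample data this returns ids 37 (Record23) and 58 (Record5).
```

Like the ROW_NUMBER version, this treats any two adjacent equal values as a start/finish pair, so it is worth previewing the result before turning it into a DELETE.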

Related

Having a hard time building an aggregate SQL query

I am new to SQL and have a pretty good knowledge of the basics, but I am stuck with my query.
My query returns the following table (except for the last column on the right-hand side):
Team | Variable | Date | Value | Column_I_need_to_add
A | aa | 2022/05/01 | 100 | 0
A | aa | 2022/06/01 | 25 | 0
A | aa | 2022/07/01 | 580 | 0
A | ad | 2022/08/01 | 50 | 605
B | aa | 2021/05/01 | 75 | 0
B | aa | 2021/06/01 | 110 | 0
B | aa | 2021/07/01 | 514 | 0
B | ad | 2021/08/01 | 213 | 624
What I cannot wrap my head around is how to code the last column, which fills the rows for the ad variable by summing the values of the aa variable for the same team, but only for the two months prior to the date of the ad row.
Here is the script I have so far, that gets me the first four columns:
SELECT
team.Team,
Var.Variable,
TO_DATE(Var.Year||'-'||LPAD(Var.Month,2,'00')||'-'||'01','YYYY-MM-DD')AS Date ,
Var.value
FROM table1 as Var
join table2 as team
on Var.code=team.code
---This last join with table3 is only there to add other columns that are not relevant to this problem.
---join table3 as detail_var on Var.variable=detail_var.code_var
After some further reading, I was not content with my previous answer, which used OUTER APPLY. So I did a bit more work, and this is what I came up with (now for Postgres 13).
It is cleaner and does the job more concisely. I've also added a fiddle link. If you want to see the previous answer, please look at the edit history.
SELECT
team.Team
,var.Variable
,var.Date
,var.value
,CASE
WHEN var.Variable='ad' THEN
(SELECT sum(value) FROM table1
WHERE
(TO_DATE(Year||'-'||LPAD(Month::varchar(2),2,'0')||'-'||'01','YYYY-MM-DD')
BETWEEN (var.Date - INTERVAL '2 month') AND var.Date)
AND Variable = 'aa'
AND code = var.code)
ELSE null
END as past2monthsValue
FROM (
-- this sub query to change Year & Month to Date Type Value
-- this Date Type Value (Date) will be used to compare dates
-- (var.Date) in the above sub-query
SELECT
code,
Variable,
TO_DATE(Year||'-'||LPAD(Month::varchar(2),2,'0')||'-'||'01','YYYY-MM-DD') AS Date,
value
FROM table1
) var
JOIN table2 AS team ON var.code=team.code
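To try the idea end to end, here is a self-contained PostgreSQL sketch. It assumes that Year and Month are integer columns (so make_date can replace the TO_DATE string building), and the code values 1 and 2 are hypothetical stand-ins for whatever links table1 to table2. It returns 0 for non-ad rows, as in the table in the question.

```sql
-- Hypothetical schema and codes; only the column names come from the question.
CREATE TABLE table2 (code INT PRIMARY KEY, team TEXT);
CREATE TABLE table1 (code INT, variable TEXT, year INT, month INT, value INT);

INSERT INTO table2 (code, team) VALUES (1, 'A'), (2, 'B');
INSERT INTO table1 (code, variable, year, month, value) VALUES
  (1, 'aa', 2022, 5, 100), (1, 'aa', 2022, 6, 25),
  (1, 'aa', 2022, 7, 580), (1, 'ad', 2022, 8, 50),
  (2, 'aa', 2021, 5, 75),  (2, 'aa', 2021, 6, 110),
  (2, 'aa', 2021, 7, 514), (2, 'ad', 2021, 8, 213);

SELECT t.team,
       v.variable,
       make_date(v.year, v.month, 1) AS obs_date,
       v.value,
       CASE
         WHEN v.variable = 'ad' THEN
           -- Sum the aa rows of the same code in the two months before the ad row.
           (SELECT COALESCE(SUM(p.value), 0)
            FROM table1 p
            WHERE p.code = v.code
              AND p.variable = 'aa'
              AND make_date(p.year, p.month, 1) >= make_date(v.year, v.month, 1) - INTERVAL '2 month'
              AND make_date(p.year, p.month, 1) <  make_date(v.year, v.month, 1))
         ELSE 0
       END AS past2months_value
FROM table1 v
JOIN table2 t ON t.code = v.code
ORDER BY t.team, obs_date;
-- With the sample data the two ad rows get 605 (team A) and 624 (team B).
```

The half-open range (>= start, < ad date) excludes the month of the ad row itself, whereas BETWEEN would include it; with the sample data both give the same totals.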

POSTGRESQL : How do I select values A that have multiple pairing values for B?

Background
I have a table which is used to track changes in users' accounts.
Column a is the primary key for this accounthistory-table, column b is a foreign key which contains the primary key for an account from account-table, column c contains the username at the time of change, column d has timestamp from the time of the change and column e describes what the performed action was, from options INSERT / UPDATE / DELETE.
At any given time there can be only one account b with username c, but over time multiple accounts b can have identical username values c (see b=20 and b=07 in the example table). As there are other columns as well, there can be multiple UPDATEs, or an INSERT plus UPDATEs, for every account, so each c value should have at least 2 rows once some time has passed since the insert.
Question:
Below is an example of the data. What I need to figure out is "accounts that have had their username changed at least once", that is, values of b that have multiple rows with differing values for c. I'm only interested in the value of column b, as I need to use the result in further selection queries.
Table accounthistory:
a | b | c | d | e
100 | 15 | toma | 2021-11-15 16:22:40.747766 | UPDATE
99 | 20 | valt | 2021-11-13 08:22:40.747766 | UPDATE
98 | 17 | mitk | 2021-11-12 15:22:40.747766 | INSERT
97 | 15 | tomia | 2021-11-10 08:22:40.747766 | UPDATE
96 | 20 | valt | 2021-11-09 07:22:40.747766 | INSERT
95 | 15 | tomia | 2021-10-21 20:22:40.747766 | INSERT
94 | 12 | alek | 2021-10-18 18:22:40.747766 | INSERT
93 | 07 | valt | 2021-10-15 10:22:40.747766 | DELETE
92 | 04 | juur | 2021-10-12 10:22:40.747766 | DELETE
91 | 07 | valt | 2021-10-05 10:22:40.747766 | INSERT
The expected result would be 15, as it has had both usernames 'tomia' and 'toma'. The other b's have only matching values for c, and it doesn't matter that both 07 and 20 have had the username 'valt', since 07 was deleted before 20 was added.
So is there a way to select these values of b? I tried forming different GROUP BYs and other messy queries, but as I'm quite a novice with PostgreSQL and SQL in general, I haven't been able to get this to work.
Thank you in advance!
You can use EXISTS to find each value of b that has at least one other row with a different value of c, then use DISTINCT ON to eliminate the duplicates. Demo here.
select distinct on (b) b
from accounthistory ah1
where exists (
select null
from accounthistory ah2
where ah2.b = ah1.b
and ah2.c <> ah1.c
)
order by b;
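An alternative sketch of the same idea groups by b and counts the distinct usernames. The schema below is an assumption based on the column descriptions in the question (b is stored as an integer here, so 07 becomes 7).

```sql
-- Assumed schema, reconstructed from the question's description.
CREATE TABLE accounthistory (
  a INT PRIMARY KEY,
  b INT,
  c VARCHAR(50),
  d TIMESTAMP,
  e VARCHAR(10)
);

INSERT INTO accounthistory (a, b, c, d, e) VALUES
  (100, 15, 'toma',  '2021-11-15 16:22:40.747766', 'UPDATE'),
  (99,  20, 'valt',  '2021-11-13 08:22:40.747766', 'UPDATE'),
  (98,  17, 'mitk',  '2021-11-12 15:22:40.747766', 'INSERT'),
  (97,  15, 'tomia', '2021-11-10 08:22:40.747766', 'UPDATE'),
  (96,  20, 'valt',  '2021-11-09 07:22:40.747766', 'INSERT'),
  (95,  15, 'tomia', '2021-10-21 20:22:40.747766', 'INSERT'),
  (94,  12, 'alek',  '2021-10-18 18:22:40.747766', 'INSERT'),
  (93,  7,  'valt',  '2021-10-15 10:22:40.747766', 'DELETE'),
  (92,  4,  'juur',  '2021-10-12 10:22:40.747766', 'DELETE'),
  (91,  7,  'valt',  '2021-10-05 10:22:40.747766', 'INSERT');

-- Accounts whose history contains more than one distinct username.
SELECT b
FROM accounthistory
GROUP BY b
HAVING COUNT(DISTINCT c) > 1
ORDER BY b;
-- With the sample data this returns a single row: 15.
```

Because it returns exactly one row per account, the result can be dropped into an IN (...) subquery for the further selection queries mentioned in the question.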

SQL complex grouping "in column"

I have a table with 3 columns (sorted by the first two):
letter
number (sorted for each letter)
difference between current number and previous number of the same letter
I'd like to calculate (with vanilla SQL) a fourth, new column RESULT that groups these data: whenever the third column (the difference between contiguous records, e.g. for #2 it is 4 = 5-1) is greater than 30, a new group starts, and all the records of that interval are marked with the letter-number of its first record (e.g. A1 for #1, #2, #3).
Since the difference between contiguous numbers only makes sense for records with the same letter, the first record of a new letter gets a difference of 31 (meaning that it starts a new group, e.g. #6).
Here is what I'd like to get as result:
# Letter Number Difference RESULT (new column)
1 A 1 1 A1
2 A 5 4 A1
3 A 7 2 A1
4 A 40 33 A40 (*)
5 A 43 3 A40
6 B 1 31 B1 (*)
7 B 25 24 B1
8 B 27 2 B1
9 B 70 43 B70 (*)
10 B 75 5 B70
Now I can only find the "breaking values" (*) with this query where they get a value of 1:
select letter
,number
,cast(difference/30 as int) break
from table
where cast(difference/30 as int) = 1
Even though I'm able to find these breaking values I can't finish my task.
Can anyone help me finding a way to obtain the column RESULT?
Thanks in advance
FF
As I understand you need to construct the last result column. You can use concat to do that:
SELECT letter
,number
,concat(letter, cast(difference/30 as int)) result
FROM table
HAVING result = 'A1'
After some practice and a little help from a friend of mine, I've found a possible solution to my SQL problem.
The only requirement for the solution is that the first record must have a value of 31 in the Difference field (since I need a "break" whenever Difference is greater than 30).
Here is the query to get the column RESULT I needed:
select alls.letter
,alls.number
,ints.letter||ints.number as result
from competition.lag alls
,(select letter
,number
,difference
,result
from (select letter
,number
,difference
,case when difference>30 then 1 else 2 end as result
from competition.lag
) temp
where result = 1
) ints
where ints.letter=alls.letter
and alls.number>=ints.number
and alls.number-30<=ints.number
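Where window functions are available, the grouping can also be sketched without the stored Difference column. The sketch below assumes a hypothetical table named readings with the letter and number columns from the question; it marks each row that starts a group, then carries the most recent start forward with a running MAX.

```sql
-- Hypothetical table name; columns as in the question.
CREATE TABLE readings (letter VARCHAR(1), number INT);

INSERT INTO readings (letter, number) VALUES
  ('A', 1), ('A', 5), ('A', 7), ('A', 40), ('A', 43),
  ('B', 1), ('B', 25), ('B', 27), ('B', 70), ('B', 75);

WITH with_prev AS (
  -- Previous number within the same letter (NULL for the first row).
  SELECT letter,
         number,
         LAG(number) OVER (PARTITION BY letter ORDER BY number) AS prev_number
  FROM readings
),
marked AS (
  -- A row starts a group if it is the first of its letter or the gap exceeds 30.
  SELECT letter,
         number,
         CASE
           WHEN prev_number IS NULL OR number - prev_number > 30 THEN number
         END AS group_start
  FROM with_prev
)
SELECT letter,
       number,
       -- Numbers ascend within a letter, so the running MAX is the latest start.
       letter || CAST(MAX(group_start) OVER (PARTITION BY letter ORDER BY number)
                      AS VARCHAR(10)) AS result
FROM marked
ORDER BY letter, number;
-- Sample output: A1, A1, A1, A40, A40, B1, B1, B1, B70, B70.
```

Unlike the range join above, this does not require a group to span fewer than 30 units in total: a chain such as 1, 25, 50 stays in one group because every individual gap is at most 30.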

Sql: have a column return 1 or 0 denoting if an id exists in a table for a predetermined groupID

I wrote the following SQL to create a column that I can use to populate check boxes in a Grid to manage user permissions.
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
WHERE ( access_group.group_id = 10
OR access_group.group_id IS NULL )
However, it does not select all of the entries from access_b2b. The issue is with the last line:
where (access_group.group_id=10 or access_group.group_id is null)
Without it, I get duplicate entries returned with different active values. Also, I realized that this is not the proper condition, because an entry in access_group might exist for a different access_group.group_id, meaning that not all of the remaining entries will be pulled in by the access_group.group_id is null check.
I am trying to write my condition so that it does something along the lines of:
Where For Each unique access_id in access_group
select the one where group_id=10
if no group_id=10
select any other one
end
end
Ultimately, the goal is to have a column returned with 1 or 0 denoting if the access_id exists for a predetermined group id.
Please note that throughout this explanation I used group_id=10 for simplification, it will be later replaced with a SqlParameter.
Any help is appreciated, thank you so much!
SAMPLE DATA (only useful columns shown to simplify data)
access_group
access_id group_id
27 1
27 11
28 1
28 11
33 1
33 3
33 11
43 11
44 1
44 10
44 11
...
access_b2b
access_id description
1 Add
2 Edit
3 Delete
4 List
5 Payments
6 Open Files
7 Order
8 Mod
...
Change the query to the following and it should work:
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
AND ( access_group.group_id = 10
OR access_group.group_id IS NULL )
If you don't want the records to be filtered out by the WHERE clause, move the condition into the JOIN's ON clause.
The LEFT JOIN keeps every row from access_b2b and fills the access_group columns with NULL when the condition is not met, whereas the WHERE clause filters the joined result set.
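The difference can be seen on a small data set. The rows below are hypothetical (the sample data in the question is truncated), and the sketch assumes each (access_id, group_id) pair appears at most once in access_group. It uses the portable CASE ... END AS active form and drops the OR ... IS NULL branch, which is not needed in the ON clause because unmatched rows get NULLs anyway.

```sql
-- Hypothetical rows for illustration only.
CREATE TABLE access_b2b (access_id INT PRIMARY KEY, description VARCHAR(50));
CREATE TABLE access_group (access_id INT, group_id INT);

INSERT INTO access_b2b (access_id, description) VALUES
  (1, 'Add'), (2, 'Edit'), (3, 'Delete');
INSERT INTO access_group (access_id, group_id) VALUES
  (1, 1), (1, 10), (2, 1), (2, 11);

SELECT b.access_id,
       b.description,
       CASE WHEN g.group_id IS NOT NULL THEN 1 ELSE 0 END AS active
FROM access_b2b b
LEFT JOIN access_group g
       ON g.access_id = b.access_id
      AND g.group_id = 10          -- filter inside the join, not in WHERE
ORDER BY b.access_id;
-- Returns one row per access_id: (1, Add, 1), (2, Edit, 0), (3, Delete, 0).
```

With the group filter in a WHERE clause instead, access_id 2 would disappear entirely, because it has group rows but none for group 10.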

how to compare to similar rows in two identical sql tables

I have two identical tables as following:
Table 1
Student#|name|Course1#|Course2#|Course3#
456 abc 12 76 89
789 def 09 13 76
345 ghi 56 34 14
Table 2
Student#|name|Course1#|Course2#|Course3#
456 abc 12 76 89
789 def 90 13 76
345 ghi 56 34 14
Table 1 will contain the latest data and table 2 will keep a copy of table 1. Table 2 is updated every time table 1 is updated, and I do not want a complete truncation and re-insertion. I want to run a query which will compare these two tables and return only those rows in which a value has changed. On the basis of these values I can run an update on table 2.
For example: in table 1, student# 789 has had the value of Course1# changed to 90 from 09, but table 2 still has the old value. When I run the query I should get a result like:
Student#|name|Course1#|Course2#|Course3#
789 def 90 13 76
It would rarely make sense to have two copies of the same data let alone try to keep two copies of data and periodically try to keep them in sync. So the premise seems rather suspect.
It sounds like you are looking for something like
UPDATE table2 t2
SET (course1, course2, course3) = (SELECT course1, course2, course3
FROM table1 t1
WHERE t1.student = t2.student)
WHERE EXISTS( SELECT 1
FROM table1 t1
WHERE t1.student = t2.student
AND ( t1.course1 != t2.course1
OR t1.course2 != t2.course2
OR t1.course3 != t2.course3) );
This won't account for cases where either table has a NULL value. If you want to replace a NULL value in table2 with a non-NULL value from table1 if it is available, and assuming -1 is not a valid value for that column, the predicate in the EXISTS clause would change to look something like t1.course1 != nvl(t2.course1, -1).
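If only the read-only comparison from the question is needed (rows in table1 that differ from table2), a set difference is one way to sketch it. The column names follow the UPDATE above; the values are illustrative and follow the expected output in the question. EXCEPT is the standard spelling; Oracle traditionally spells this operator MINUS.

```sql
-- Illustrative data; table1 holds the latest values, table2 the stale copy.
CREATE TABLE table1 (student INT PRIMARY KEY, name VARCHAR(20),
                     course1 INT, course2 INT, course3 INT);
CREATE TABLE table2 (student INT PRIMARY KEY, name VARCHAR(20),
                     course1 INT, course2 INT, course3 INT);

INSERT INTO table1 VALUES (456, 'abc', 12, 76, 89);
INSERT INTO table1 VALUES (789, 'def', 90, 13, 76);
INSERT INTO table1 VALUES (345, 'ghi', 56, 34, 14);

INSERT INTO table2 VALUES (456, 'abc', 12, 76, 89);
INSERT INTO table2 VALUES (789, 'def', 9, 13, 76);
INSERT INTO table2 VALUES (345, 'ghi', 56, 34, 14);

-- Rows present in table1 that have no identical row in table2.
SELECT student, name, course1, course2, course3 FROM table1
EXCEPT
SELECT student, name, course1, course2, course3 FROM table2;
-- With the sample data this returns the single row (789, def, 90, 13, 76).
```

Set operators compare NULLs as equal to each other, so this form also picks up changes to or from NULL without the nvl workaround.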
Create a trigger on T1 for INSERT, DELETE and UPDATE. In the trigger, record the keys of the dirty rows in another table, then periodically check that tracking table. Alternatively, update T2 directly in the trigger.