Adding daily changes to database table - sql

I am trying to build a simple database that keeps track of any changes to a user's location attribute. Each day I generate the current User, Date and Location information and upload it to a temporary table in SQL Server. I am trying to figure out the correct SQL to query for new users, modified users and deleted users.
Finding new users is easy with:
SELECT table1.UserGuid,table1.Location
FROM table1
WHERE table1.UserGuid NOT IN
(
SELECT DISTINCT table2.UserGuid
FROM table2
)
The problem I am having is finding modified locations and deleted users.
For modified users I am trying to return users whose last location in the database doesn't match the current location in the daily temp table. This is what I have, but I don't think it is correct:
SELECT table1.UserGuid,table1.Location
FROM table1
WHERE EXISTS
(
SELECT TOP 1 table2.UserGuid,table2.Location
FROM table2
WHERE (table2.UserGuid = table1.UserGuid) AND (table2.Location != table1.Location)
ORDER BY table2.Date DESC
)
For deleted users, I am trying the following SQL to identify any users in the main table that don't exist in the daily temp table and don't have a location of deleted. (If this returns any deleted users, I add them to the main table with a location of deleted so they are not returned the next time.)
SELECT table2.UserGuid,table2.Location
FROM table2
WHERE table2.UserGuid NOT IN
(
SELECT UserGuid
FROM table1
)
AND table2.Location != 'deleted'
After I run all 3 queries to find the adds, modifications and deletes, I add them to the main table along with the current date and repeat the next day. So the main table has 3 columns (UserGuid, Date, Location) and new rows get added each day with changed information. So far my new-user SQL is the only one working reliably. Is there an easier way to do this?

So I think this captures all your requirements.
select
coalesce(table1.UserGuid, table2.UserGuid) as UserGuid, -- table1 columns are NULL for deleted users
table1.Location,
case when table2.UserGuid is null then 'INSERT'
when table1.UserGuid is null and table2.Location != 'deleted' then 'DELETE'
when table1.Location != table2.Location then 'UPDATE'
end as ChangeType -- NULL when nothing changed
from table1
full join (select max(Date) as lastEntry, UserGuid from table2 group by UserGuid) lastRecords
inner join table2 on table2.Date = lastRecords.lastEntry and table2.UserGuid = lastRecords.UserGuid
on lastRecords.UserGuid = table1.UserGuid

For your second query try:
SELECT table1.UserGuid,table1.Location
FROM table1
WHERE table1.UserGuid IN
(
SELECT table2.UserGuid
FROM table2
WHERE table2.UserGuid = table1.UserGuid AND table2.Location <> table1.Location
)

I tend to use EXISTS for these sorts of checks
--INSERTS
SELECT table1.UserGuid,table1.Location
FROM table1
WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE table2.UserGuid = table1.UserGuid)
UNION ALL
--UPDATES
SELECT table1.UserGuid,table1.Location
FROM table1
WHERE EXISTS
(
SELECT 1 FROM table2
WHERE table2.UserGuid = table1.UserGuid
AND ISNULL(table2.Location,'') <> ISNULL(table1.Location,'')
)
UNION ALL
--DELETES
SELECT table2.UserGuid,table2.Location
FROM table2
WHERE NOT EXISTS (SELECT 1 FROM table1 WHERE table2.UserGuid = table1.UserGuid)
I included ISNULL checks in the event your location could be null; they're not needed if that's not possible.
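The question asks for a comparison against each user's last recorded location, not against every historical row. Here is a minimal sketch of that idea, written with Python's built-in sqlite3 so it can be run without a SQL Server instance. The function names classify_changes and demo, the sample rows, and the ChangeType labels are made up for illustration. It assumes one history row per user per day and non-NULL locations.

```python
import sqlite3

def classify_changes(conn):
    """Return (UserGuid, Location, ChangeType) rows for today's snapshot.

    table1 is the daily snapshot; table2 is the history (UserGuid, Date, Location).
    Each user is compared only against their most recent history row.
    """
    sql = """
    WITH latest AS (
        -- most recent history row per user
        SELECT h.UserGuid, h.Location
        FROM table2 h
        WHERE h.Date = (SELECT MAX(Date) FROM table2 WHERE UserGuid = h.UserGuid)
    )
    SELECT t.UserGuid, t.Location, 'INSERT'
    FROM table1 t
    WHERE NOT EXISTS (SELECT 1 FROM latest l WHERE l.UserGuid = t.UserGuid)
    UNION ALL
    SELECT t.UserGuid, t.Location, 'UPDATE'
    FROM table1 t JOIN latest l ON l.UserGuid = t.UserGuid
    WHERE l.Location <> t.Location
    UNION ALL
    SELECT l.UserGuid, 'deleted', 'DELETE'
    FROM latest l
    WHERE l.Location <> 'deleted'
      AND NOT EXISTS (SELECT 1 FROM table1 t WHERE t.UserGuid = l.UserGuid)
    ORDER BY 1
    """
    return conn.execute(sql).fetchall()

def demo():
    # Hypothetical sample data: 'a' is unchanged since yesterday, 'b' moved,
    # 'c' disappeared, 'd' was already marked deleted, 'e' is new.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (UserGuid TEXT, Location TEXT);
        CREATE TABLE table2 (UserGuid TEXT, Date TEXT, Location TEXT);
        INSERT INTO table2 VALUES
            ('a', '2024-01-01', 'Paris'),
            ('a', '2024-01-02', 'Rome'),
            ('b', '2024-01-01', 'Oslo'),
            ('c', '2024-01-01', 'Lima'),
            ('d', '2024-01-01', 'Kyiv'),
            ('d', '2024-01-02', 'deleted');
        INSERT INTO table1 VALUES ('a', 'Rome'), ('b', 'Bergen'), ('e', 'Cairo');
    """)
    return classify_changes(conn)
```

With these rows, demo() returns the update for 'b', the delete for 'c' and the insert for 'e'. User 'a' is not reported even though an older history row says Paris, because only the latest row (Rome) is compared.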

Related

SQL - How to "filter out" people who has more than 1 status

I tried to find this question here but I probably didn't know the exact term to search for.
Here is the problem:
I have this set of customers (see image). I need to filter only those with status "user_paused" or "interval_paused". The same customer_id may have more than 1 status, and sometimes one of these statuses is "active". If so, this customer should not appear in my final result.
See customer 809: he shouldn't appear in my final result since he has an "active" status. All the others are fine, because they only have paused statuses.
I still couldn't figure out how to proceed from here.
Thank you so much.
SELECT DISTINCT customer_id FROM TABLE
WHERE status IN ( 'user_paused','interval_paused')
EXCEPT
SELECT DISTINCT customer_id FROM TABLE
WHERE status = 'active'
One method uses group by and having:
select customer_id
from t
group by customer_id
having sum(case when status not in ('user_paused', 'interval_paused') then 1 else 0 end) = 0;
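To see the group by / having approach on actual rows, here is a small sketch using Python's built-in sqlite3. The table name t comes from the query above; the function names and the sample rows (apart from customer 809 having an active row) are invented for illustration.

```python
import sqlite3

def paused_only_customers(conn):
    """Return customer_ids whose every row has a paused status."""
    sql = """
    SELECT customer_id
    FROM t
    GROUP BY customer_id
    HAVING SUM(CASE WHEN status NOT IN ('user_paused', 'interval_paused')
                    THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
    """
    return [row[0] for row in conn.execute(sql)]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (customer_id INTEGER, status TEXT)")
    # Hypothetical rows: 809 has an 'active' row, 810 and 811 are paused only.
    conn.executemany("INSERT INTO t VALUES (?, ?)", [
        (809, "user_paused"), (809, "active"),
        (810, "user_paused"), (810, "interval_paused"),
        (811, "interval_paused"),
    ])
    return paused_only_customers(conn)
```

Here demo() returns [810, 811]. Customer 809 is dropped because one of its rows falls outside the two paused statuses, so the sum for that group is not zero.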
select * from table
where customer_id in
(select customer_id from table
where status in ('interval_paused','user_paused') )
and customer_id not in
(select customer_id from table
where status = 'active')
You can find all customers with a status of 'active' quite easily:
SELECT customerid FROM table WHERE status = 'active'
If you want to exclude any customer from your results if they have an active row, you can do this in a subquery:
SELECT * FROM table WHERE /* your other query restrictions */
AND customerID NOT IN
(
SELECT customerid FROM table WHERE status = 'active'
)
This will let you eliminate any row with a customerid that has any 'active' row.
Please note that subqueries are not always the most efficient solution - there could be cases where a subquery would make your query very slow.
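To exclude any row with 'active' in either column, use the following (note that this filters individual rows, so a customer who also has paused rows will still appear through those rows):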
select * from customers
where paused_statuses != 'active'
and status != 'active';
Not sure if you need distinct or not, but here are 2 approaches. I think both will work in Impala, but this way you have an option just in case. The first uses a "left excluding join" (make the join, then exclude the matched rows), which enables us to ignore the active status customers. The second uses an even more traditional "not exists" approach to remove customer_ids that have an active status.
select /* distinct */ t1.customer_id
from table t1
left join table t2 on t1.customer_id = t2.customer_id and t2.status = 'active'
where t2.customer_id IS NULL
and t1.status in ('interval_paused','user_paused')
;
select /* distinct */ t1.customer_id
from table t1
where t1.status in ('interval_paused','user_paused')
and NOT EXISTS (
select null
from table t2
where t1.customer_id = t2.customer_id
and t2.status = 'active'
)
;
If your existing query is complex, then to simplify these additions, use a WITH clause like this:
WITH MyCTE AS (
-- place the whole existing query here
)
select /* distinct */ t1.customer_id
from MyCTE t1
left join MyCTE t2 on t1.customer_id = t2.customer_id and t2.status = 'active'
where t2.customer_id IS NULL
and t1.status in ('interval_paused','user_paused')
;
Notice that the name you give it ("MyCTE") can be reused in the subsequent query, which is a very useful feature indeed.
In general the structures created by WITH are called common table expressions (CTE) if you are wondering why I use "MyCTE" as a name.
-- customers with a paused status and no 'active' row
SELECT customer_id
FROM Customer
WHERE status IN ('user_paused', 'interval_paused')
AND customer_id NOT IN (SELECT customer_id
FROM Customer
WHERE status = 'active')
GROUP BY customer_id
or
-- same result using conditional counts per customer
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING SUM(CASE WHEN status IN ('user_paused', 'interval_paused') THEN 1 ELSE 0 END) > 0
AND SUM(CASE WHEN status = 'active' THEN 1 ELSE 0 END) = 0

Select from different tables using CONDITION using PL SQL

So I have a table with info that I want
TABLE_1
id
name
And a lot of other tables with the same type of data
TABLE_x
id
order
TABLE_y
id
order
...
TABLE_z
id
order
What I want: depending on the name from TABLE_1, I want to access the order attribute of another table.
Example:
TABLE_1
name = x
I want to access TABLE_x and check the order.
But if
TABLE 1
name = z
I want to access TABLE_z etc.
I thought about this piece:
SELECT *
FROM TABLE_1
INNER JOIN (
CASE (SELECT name FROM TABLE_1)
WHEN 'x' THEN TABLE_x on TABLE_1.id=TABLE_x.id
WHEN 'y' THEN TABLE_y on TABLE_1.id=TABLE_y.id
WHEN 'z' THEN TABLE_z on TABLE_1.id=TABLE_z.id)
I can't use the case statement in here.
Need your help!
I don't recommend what immediately follows, but look closely at one possible solution:
-- NOT RECOMMENDED...
WITH t1 AS (
SELECT MIN(name) name FROM table_1
)
SELECT table_x.ordr FROM t1 JOIN table_x ON ('x' = t1.name)
UNION ALL
SELECT table_y.ordr FROM t1 JOIN table_y ON ('y' = t1.name)
UNION ALL
SELECT table_z.ordr FROM t1 JOIN table_z ON ('z' = t1.name)
/
If you think about what the database needs to do to satisfy that query, you'll realize it's a lot of individual queries, with at most 1 of them returning data. That could be a lot of work, which is why it's not recommended. You're likely much better off running an initial query against table_1, with the result driving the target of a subsequent query. With sqlplus, you might opt for something like:
COLUMN name NEW_VALUE name NOPRINT
SELECT MIN(name) name FROM table_1
/
SELECT ordr FROM table_&&name
/
Do give some thought to error situations, such as when the name from query 1 doesn't match any existing table, or if there are no rows in table_1.
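The two-step idea (read the name first, then query the matching table) and the error situations mentioned above can be sketched from a client program as well. This is only an illustration using Python's built-in sqlite3 rather than Oracle and sqlplus. The whitelist ALLOWED_SUFFIXES, the function names and the sample rows are assumptions, and the column is called ordr as in the answer above.

```python
import sqlite3

# Hypothetical whitelist of the table_<name> targets that are known to exist.
ALLOWED_SUFFIXES = {"x", "y", "z"}

def orders_for_current_name(conn):
    """Read the name from table_1, then query the table it points at."""
    name = conn.execute("SELECT MIN(name) FROM table_1").fetchone()[0]
    if name is None:
        # MIN() over an empty table yields a single NULL
        raise LookupError("table_1 has no rows")
    if name not in ALLOWED_SUFFIXES:
        raise LookupError(f"no table registered for name {name!r}")
    # Table names cannot be bound as parameters, so the name is interpolated
    # only after it has been checked against the whitelist.
    query = f"SELECT ordr FROM table_{name} ORDER BY ordr"
    return [row[0] for row in conn.execute(query)]

def demo(name=None):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_1 (id INTEGER, name TEXT);
        CREATE TABLE table_x (id INTEGER, ordr INTEGER);
        CREATE TABLE table_y (id INTEGER, ordr INTEGER);
        CREATE TABLE table_z (id INTEGER, ordr INTEGER);
        INSERT INTO table_x VALUES (1, 100);
        INSERT INTO table_y VALUES (1, 20), (2, 10);
    """)
    if name is not None:
        conn.execute("INSERT INTO table_1 VALUES (1, ?)", (name,))
    return orders_for_current_name(conn)
```

With this sample data, demo("y") returns [10, 20], while demo() (empty table_1) and demo("q") (no such table) both raise LookupError instead of running a query against a table that does not exist.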

Delete rows out of table that is innerjoined and unioned with 2 others

We have 3 tables (table1, table2, table3), and I need to delete all the rows from table1 that have the same ID in table2 OR table3. To see a list of all of these rows I have this code:
(
select
table2.ID,
table2.name_first,
table2.name_last,
table2.Collected
from
table2
inner join
table1
on
table1.ID = table2.ID
where
table2.Collected = 'Y'
)
union
(
select
table3.ID,
table3.name_first,
table3.name_last,
table3.Collected
from
table3
inner join
table1
on
table1.ID = table3.ID
where
table3.Collected = 'Y'
)
I get back about 200 rows. How do I delete them from table1? I don't have a way to test if my query will work, so I'm nervous about modifying something I found online and potentially deleting data (we do have backups, but I'd rather not test out their integrity).
TIA!
EDIT
You are correct, we are on MSSQL 2008. Thanks so much for all the replies, I will try it out tomorrow!
Try this:
DELETE FROM Table1 WHERE
ID IN
(
SELECT ID FROM Table2 WHERE Collected = 'Y'
UNION ALL
SELECT ID FROM Table3 WHERE Collected = 'Y'
)
To test this query you can create duplicate tables using the INTO clause (I assume this is SQL Server):
SELECT * INTO DUP_Table1 FROM Table1;
SELECT * INTO DUP_Table2 FROM Table2;
SELECT * INTO DUP_Table3 FROM Table3;
and then run this query:
DELETE FROM DUP_Table1 WHERE
ID IN
(
SELECT ID FROM DUP_Table2 WHERE Collected = 'Y'
UNION ALL
SELECT ID FROM DUP_Table3 WHERE Collected = 'Y'
)
EDIT: Added the Collected criteria and used UNION ALL (@Thomas: thanks). I think UNION ALL performs better than UNION when there is no need for uniqueness in the result.
DELETE FROM table1
WHERE EXISTS (SELECT 1 FROM table2 WHERE table2.id = table1.id AND table2.collected = 'Y')
OR EXISTS (SELECT 1 FROM table3 WHERE table3.id = table1.id AND table3.collected = 'Y')
If you're feeling nervous about a big delete like this, put it into a transaction: that way you can at least check the row count before running commit (or rollback, of course :p)
Make sure foreign keys are set up properly and turn on cascade deletes. But if that's not an option, the correct SQL query is as follows:
begin tran
delete from table1
where exists(select * from table2 where table1.id = id and collected='Y')
or exists(select * from table3 where table1.id = id and collected='Y')
-- commit tran
-- rollback tran
If the SQL runs as expected, execute the "commit tran"; otherwise execute the "rollback tran".
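The "check the row count before committing" idea can also be automated from a client. Below is a rough sketch using Python's built-in sqlite3 instead of SQL Server. The function names, the expected_max threshold and the sample rows are assumptions made for the example.

```python
import sqlite3

DELETE_SQL = (
    "DELETE FROM table1 "
    "WHERE EXISTS (SELECT 1 FROM table2 "
    "              WHERE table2.id = table1.id AND table2.collected = 'Y') "
    "   OR EXISTS (SELECT 1 FROM table3 "
    "              WHERE table3.id = table1.id AND table3.collected = 'Y')"
)

def delete_collected(conn, expected_max):
    """Run the delete in a transaction and keep it only if the count looks sane.

    Returns the number of rows actually removed (0 if the delete was rolled back).
    """
    cur = conn.execute(DELETE_SQL)  # sqlite3 opens a transaction implicitly here
    if cur.rowcount > expected_max:
        conn.rollback()             # more rows than expected: undo everything
        return 0
    conn.commit()
    return cur.rowcount

def demo(expected_max):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER);
        CREATE TABLE table2 (id INTEGER, collected TEXT);
        CREATE TABLE table3 (id INTEGER, collected TEXT);
        INSERT INTO table1 VALUES (1), (2), (3), (4);
        INSERT INTO table2 VALUES (1, 'Y'), (2, 'N');
        INSERT INTO table3 VALUES (1, 'Y'), (3, 'Y');
    """)
    deleted = delete_collected(conn, expected_max)
    remaining = [r[0] for r in conn.execute("SELECT id FROM table1 ORDER BY id")]
    return deleted, remaining
```

With the sample rows, ids 1 and 3 match a collected row. demo(2) returns (2, [2, 4]), while demo(1) rolls back and returns (0, [1, 2, 3, 4]) because two rows is more than the allowed maximum.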

SQL Delete Query

I need to write an SQL script that selects one record in table1, then does a lookup in the remaining tables in the database. If it doesn't find the record, I need to delete the record from table1. Can anyone provide a sample script?
One example
delete table1
where not exists (select 1
from Table2
where table1.SomeColumn = Table2.SomeColumn)
AND table1.SomeColumn = 5 --just an example,
Leave the AND out if you want to delete all the rows from table 1 that do not exist in table 2
You can also use LEFT JOIN or NOT IN.
I have done things like this:
DELETE table1
FROM table1
WHERE table1.ID NOT IN (
SELECT RefID FROM Table2
UNION
SELECT RefID FROM Table3
...
)
Assuming RefID are FK's to table1.ID. Is this what you need?
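One thing to watch with NOT IN: if the subquery can return a NULL RefID, the NOT IN test is never true and no rows qualify, whereas NOT EXISTS is not affected. The sketch below shows the difference with Python's built-in sqlite3, using a SELECT in place of the DELETE so it only lists what would be removed. The function names and sample rows are invented for illustration.

```python
import sqlite3

def orphan_ids(conn, use_not_exists):
    """List table1 IDs that have no matching RefID in Table2."""
    if use_not_exists:
        sql = """
        SELECT ID FROM table1
        WHERE NOT EXISTS (SELECT 1 FROM Table2 WHERE Table2.RefID = table1.ID)
        ORDER BY ID
        """
    else:
        sql = """
        SELECT ID FROM table1
        WHERE ID NOT IN (SELECT RefID FROM Table2)
        ORDER BY ID
        """
    return [row[0] for row in conn.execute(sql)]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (ID INTEGER);
        CREATE TABLE Table2 (RefID INTEGER);
        INSERT INTO table1 VALUES (1), (2), (3);
        INSERT INTO Table2 VALUES (1), (NULL);  -- note the NULL reference
    """)
    return orphan_ids(conn, False), orphan_ids(conn, True)
```

Here demo() returns ([], [2, 3]): the NOT IN version finds nothing because of the NULL, while NOT EXISTS finds IDs 2 and 3. If RefID is declared NOT NULL, both forms give the same result.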
DELETE FROM Table1 WHERE id=10 AND NOT EXISTS (SELECT * FROM Table2 WHERE id=10);
Very generally, (since you gave little details)
Delete t1 From Table1 t1
Where [Criteria to find table1 Record]
And Not Exists(Select * From Table2
Where pk = t1.Pk)
And Not Exists(Select * From Table3
Where pk = t1.Pk)
And Not Exists(Select * From Table4
Where pk = t1.Pk)
... etc. for all other tables

How to remove duplicate records in a table?

I've got a table in a testing DB that someone apparently got a little too trigger-happy on when running INSERT scripts to set it up. The schema looks like this:
ID UNIQUEIDENTIFIER
TYPE_INT SMALLINT
SYSTEM_VALUE SMALLINT
NAME VARCHAR
MAPPED_VALUE VARCHAR
It's supposed to have a few dozen rows. It has about 200,000, most of which are duplicates in which TYPE_INT, SYSTEM_VALUE, NAME and MAPPED_VALUE are all identical and ID is not.
Now, I could probably make a script to clean this up that creates a temporary table in memory, uses INSERT .. SELECT DISTINCT to grab all the unique values, TRUNCATE the original table and then copy everything back. But is there a simpler way to do it, like a DELETE query with something special in the WHERE clause?
You don't give your table name, but I think something like this should work. It just leaves the record which happens to have the highest ID in each group of duplicates. You might want to test with the ROLLBACK in first!
BEGIN TRAN
DELETE <table_name>
FROM <table_name> T1
WHERE EXISTS(
SELECT * FROM <table_name> T2
WHERE
T1.TYPE_INT = T2.TYPE_INT AND
T1.SYSTEM_VALUE = T2.SYSTEM_VALUE AND
T1.NAME = T2.NAME AND
T1.MAPPED_VALUE = T2.MAPPED_VALUE AND
T2.ID > T1.ID
)
SELECT * FROM <table_name>
ROLLBACK
Here is a great article on that: Deleting duplicates, which basically uses this pattern:
WITH q AS
(
SELECT d.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS rn
FROM t_duplicate d
)
DELETE
FROM q
WHERE rn > 1
SELECT *
FROM t_duplicate
WITH Duplicates(ID , TYPE_INT, SYSTEM_VALUE, NAME, MAPPED_VALUE )
AS
(
SELECT Min(Id) ID, TYPE_INT, SYSTEM_VALUE, NAME, MAPPED_VALUE
FROM T1
GROUP BY TYPE_INT, SYSTEM_VALUE, NAME, MAPPED_VALUE
HAVING Count(Id) > 1
)
DELETE FROM T1
WHERE ID IN (
SELECT T1.Id
FROM T1
INNER JOIN Duplicates
ON T1.TYPE_INT = Duplicates.TYPE_INT
AND T1.SYSTEM_VALUE = Duplicates.SYSTEM_VALUE
AND T1.NAME = Duplicates.NAME
AND T1.MAPPED_VALUE = Duplicates.MAPPED_VALUE
AND T1.Id <> Duplicates.ID
)
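For trying out the "keep one row per group" idea on throwaway data, here is a compact sketch using Python's built-in sqlite3. It is not SQL Server syntax, and it stores ID as text so that MIN(ID) is well defined; the table name T1 follows the last answer, while the function names and sample rows are made up.

```python
import sqlite3

def delete_duplicates(conn):
    """Keep the row with the smallest ID in each group; delete the rest."""
    cur = conn.execute(
        "DELETE FROM T1 WHERE ID NOT IN ("
        " SELECT MIN(ID) FROM T1"
        " GROUP BY TYPE_INT, SYSTEM_VALUE, NAME, MAPPED_VALUE)"
    )
    conn.commit()
    return cur.rowcount

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T1 (ID TEXT, TYPE_INT INTEGER, SYSTEM_VALUE INTEGER,"
        " NAME TEXT, MAPPED_VALUE TEXT)"
    )
    # Hypothetical rows: 'a', 'b' and 'c' are duplicates of each other.
    conn.executemany("INSERT INTO T1 VALUES (?, ?, ?, ?, ?)", [
        ("a", 1, 1, "n", "m"),
        ("b", 1, 1, "n", "m"),
        ("c", 1, 1, "n", "m"),
        ("d", 2, 1, "n", "m"),
    ])
    removed = delete_duplicates(conn)
    remaining = [r[0] for r in conn.execute("SELECT ID FROM T1 ORDER BY ID")]
    return removed, remaining
```

With these four rows, demo() returns (2, ['a', 'd']): two duplicates are removed and one row per distinct combination survives.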