Deleting a row from a table based on the existence - sql

This might have a trivial solution. I have searched similar posts but couldn't find a proper answer.
I'm trying to delete a row from a table if it exists.
Say I have this table:
Table1
----------------------------
|Database| Schema | Number |
----------------------------
| DB1 | S1 | 1 |
| DB2 | S2 | 2 |
| DB3 | S3 | 3 | <--- Want to delete this row
| DB4 | S4 | 4 |
----------------------------
Here is my query
DELETE FROM Table1
WHERE EXISTS
(SELECT * FROM Table1 WHERE Database = 'DB3' and Schema = 'S3');
When I ran the above SQL, it left me with an empty table, and I don't understand why.
There are similar posts on Stack Overflow, but none of them explain why I'm getting an empty table.

Why are you using a subquery? Just use a where clause:
DELETE FROM Table1
WHERE Database = 'DB3' and Schema = 'S3';
Your code will delete either all rows or no rows. The where condition is saying "delete all rows from this table where this subquery returns at least one row". So, if the subquery returns one row, everything is deleted. Otherwise, nothing is deleted.
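To see the difference concretely, here is a small sketch that runs both statements against the example data. It uses Python's built-in sqlite3 module purely as a convenient test bed (an assumption; the question does not say which database is in use), and it quotes the Database and Schema column names in case they clash with keywords.

```python
import sqlite3

def make_table():
    """Build a fresh in-memory copy of Table1 from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE Table1 ("Database" TEXT, "Schema" TEXT, Number INTEGER)')
    conn.executemany(
        "INSERT INTO Table1 VALUES (?, ?, ?)",
        [("DB1", "S1", 1), ("DB2", "S2", 2), ("DB3", "S3", 3), ("DB4", "S4", 4)],
    )
    return conn

def remaining(conn):
    """Return the Number values still present in Table1."""
    return [row[0] for row in conn.execute("SELECT Number FROM Table1 ORDER BY Number")]

# The subquery does not refer to the row being deleted, so it is either
# true for every row or false for every row.
conn_exists = make_table()
conn_exists.execute(
    """DELETE FROM Table1
       WHERE EXISTS (SELECT * FROM Table1
                     WHERE "Database" = 'DB3' AND "Schema" = 'S3')"""
)
after_exists = remaining(conn_exists)

# The plain WHERE clause is evaluated against each row individually.
conn_where = make_table()
conn_where.execute(
    """DELETE FROM Table1 WHERE "Database" = 'DB3' AND "Schema" = 'S3'"""
)
after_where = remaining(conn_where)

print(after_exists)  # []
print(after_where)   # [1, 2, 4]
```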

Related

Update part of the table for a specific category

Let's imagine I have a similar table to this one:
ID | Country | time | location 1 | location 2 | count_clients
------------------------------------------------------------------
1 | PL |2019-01-01 | JAK | ADD3 | 23
2 | PL |2019-03-01 | GGF | ADD5 | 34
3 | PL |2019-01-01 | J3K | 55D3 | 67
4 | NL |2019-04-01 | FDK | AGH3 | 2
5 | NL |2019-01-01 | GGK | AFF3 | 234
It's an aggregated table. The source contains one row per client; my table is aggregated, showing the number of clients per country, time, location 1 and location 2. It's updated by loading new rows only (new dates). They are first loaded to a stage table and then, after some modifications, to the final table. The values loaded to the stage table are already aggregated, and the stage table contains only new rows.
BUT I just learned that rows in the source table can be deleted, which means a "count_clients" value can change or disappear entirely. What's also important: I know which COUNTRY, location 1 and location 2 are affected, but I don't know WHEN they were changed (was it before or after the last load? I don't know).
Do you know any smart ways to handle it? I currently load the new rows plus the rows affected by the change to the stage table, then remove the affected rows from the final table and load the stage rows into the final table.
The source table is huge. I'm looking for a solution that will let me update only the part of the table affected by the changes. Please remember that the stage table holds only the new rows that need to be inserted plus the rows that were changed. I wanted to use the MERGE statement, but to do that I would need to use part of the table as the target, not the whole table. I tried to do it but it didn't work.
I tried to do something like:
MERGE INTO (select country, time, location1, location 2, count from myFinalTable join stage table on country=country and location=location) --target = only rows affected by change
USING myStageTable
ON country = country and location=location
WHEN MATCHED THEN
UPDATE
SET count = count
WHEN NOT MATCHED BY TARGET then INSERT --insert new uploads
WHEN NOT MATCHED BY SOURCE then DELETE
but it looks like I can't use the 'select' statement in target..?
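For reference, the delete-then-reload process described in the question can be sketched as follows. This is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module rather than on the asker's database, the table names final_table and stage_table are placeholders, and affected_keys is a hypothetical helper table holding the country/location combinations known to have changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE final_table (country TEXT, time TEXT, location1 TEXT,
                          location2 TEXT, count_clients INTEGER);
CREATE TABLE stage_table (country TEXT, time TEXT, location1 TEXT,
                          location2 TEXT, count_clients INTEGER);
-- Hypothetical helper: the key combinations known to be affected.
CREATE TABLE affected_keys (country TEXT, location1 TEXT, location2 TEXT);

INSERT INTO final_table VALUES
  ('PL', '2019-01-01', 'JAK', 'ADD3', 23),
  ('PL', '2019-03-01', 'GGF', 'ADD5', 34),
  ('NL', '2019-04-01', 'FDK', 'AGH3', 2);
INSERT INTO affected_keys VALUES ('PL', 'JAK', 'ADD3');
-- Stage holds the recomputed rows for affected keys plus brand-new dates.
INSERT INTO stage_table VALUES
  ('PL', '2019-01-01', 'JAK', 'ADD3', 20),
  ('NL', '2019-05-01', 'FDK', 'AGH3', 5);
""")

with conn:  # one transaction: either both steps apply or neither does
    # Step 1: remove every final row that belongs to an affected key.
    conn.execute("""
        DELETE FROM final_table
        WHERE EXISTS (SELECT 1 FROM affected_keys a
                      WHERE a.country = final_table.country
                        AND a.location1 = final_table.location1
                        AND a.location2 = final_table.location2)
    """)
    # Step 2: load everything from stage (recomputed rows and new rows).
    conn.execute("INSERT INTO final_table SELECT * FROM stage_table")

final_rows = conn.execute(
    "SELECT * FROM final_table ORDER BY country, time, location1"
).fetchall()
for row in final_rows:
    print(row)
```

Deleting by the affected keys rather than by whatever appears in the stage table matters here: a key that only received a new date would otherwise lose its history.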

New column referencing second table - do I need a join?

I have two tables (first two shown) and need to make a third from the first two - do I need to do a join or can you reference a table without joining?
The third table shown is the desired output. Thanks for any help!
+-----+-----------+
| ACC | CALL DATE |
+-----+-----------+
| 1 1 | 2/1/18    |
+-----+-----------+
+-----+---------------+
| ACC | PURCHASE DATE |
+-----+---------------+
| 1 1 | 1/1/18        |
+-----+---------------+
+-----+-----------+----------------------+
| ACC | CALL DATE | PRIOR MONTH PURCHASE |
+-----+-----------+----------------------+
| 1 1 | 2/1/18    | YES                  |
+-----+-----------+----------------------+
Of course you can have a query that references multiple tables without joining. union all is an example of an operator that does that.
There is also the question of what you mean by "joining" in the question. If you mean explicit joins, there are ways around that -- such as correlated subqueries. However, these are implementing some form of "join" in the database engine.
As for your query, you would want to use exists with a correlated subquery:
select t1.*,
(case when exists (select 1
from table2 t2
where t2.acc = t1.acc and
datediff(month, t2.purchase_date, t1.call_date) = 1
)
then 'Yes' else 'No'
end) as prior_month_purchase
from table1 t1;
This is "better" than a join because it does not multiply or remove rows. The result set has exactly the rows in the first table, with the additional column.
The syntax assumes SQL Server (which was an original tag). Similar logic can be expressed in other databases, although date functions are notoriously database-dependent.
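Because date functions differ between databases, here is a rough sketch of the same idea on SQLite, run through Python's sqlite3 module. SQLite has no datediff, so the month difference is computed from strftime; the helper month_index, the ISO date format and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (acc INTEGER, call_date TEXT);
CREATE TABLE table2 (acc INTEGER, purchase_date TEXT);
INSERT INTO table1 VALUES (1, '2018-02-01'), (2, '2018-02-01');
-- Account 1 has two purchases in the prior month; account 2 has none.
INSERT INTO table2 VALUES (1, '2018-01-01'), (1, '2018-01-15'), (2, '2017-11-20');
""")

def month_index(column):
    """SQL expression for year * 12 + month, so that subtracting two of them
    counts calendar-month boundaries, like datediff(month, ...) in SQL Server."""
    return (
        f"(CAST(strftime('%Y', {column}) AS INTEGER) * 12"
        f" + CAST(strftime('%m', {column}) AS INTEGER))"
    )

query = f"""
SELECT t1.acc, t1.call_date,
       CASE WHEN EXISTS (SELECT 1
                         FROM table2 t2
                         WHERE t2.acc = t1.acc
                           AND {month_index('t1.call_date')}
                             - {month_index('t2.purchase_date')} = 1)
            THEN 'Yes' ELSE 'No'
       END AS prior_month_purchase
FROM table1 t1
ORDER BY t1.acc
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, '2018-02-01', 'Yes'), (2, '2018-02-01', 'No')]
```

Account 1 still appears only once even though it has two matching purchases, which is the "does not multiply rows" property mentioned above.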
Let's check the options.
If you were to create a new third table based on the data in the first two, then every update, insert and delete on either of those tables would also have to be propagated into the third table.
If you instead have a view that does what you need, there is no need to maintain that third table, and it gets you the data from the first two tables each time you query it.
create view third_table as
select a.acc, a.call_date,
case when dateadd(mm, -1, a.call_date) = b.purchase_date then 'Yes' else 'No' end as prior_month_purchase
from first_table a
left join second_table b
on a.acc = b.acc

How to delete Hive table records?

How do I delete records from a Hive table? We have 100 records there and I need to delete only 10.
When I use
dfs -rmr table_name
the whole table is deleted.
If it is possible to delete in HBase instead, I could send the data to HBase.
You cannot delete directly from a Hive table.
However, as a workaround you can overwrite the table with only the rows you want to keep:
insert overwrite table table_name
select * from table_name
where id in (1,2,3,...)
You can't delete data from Hive tables since it is already written in the files in HDFS. You can only drop partitions which deletes directories in HDFS. So best practice is to have partitions if you want to delete in the future.
To delete records in a table, you can use the following SQL syntax from your Hive client (DELETE is only supported on ACID transactional tables):
DELETE FROM tablename [WHERE expression]
Try it with a where clause and an in clause on your key:
DELETE FROM tablename where id in (select id from tablename limit 10);
Example:
I had an ACID transactional table in Hive:
select * from trans;
+-----+-------+--+
| id | name |
+-----+-------+--+
| 2 | hcc |
| 1 | hi |
| 3 | hdp |
+-----+-------+--+
Now I want to delete only the first record (id 2), so my delete statement would be
delete from trans where id in (select id from trans limit 1);
Result:
select * from trans;
+-----+-------+--+
| id | name |
+-----+-------+--+
| 1 | hi |
| 3 | hdp |
+-----+-------+--+
So we have just deleted the first record. In the same way, you can specify limit 10 and Hive will delete the first 10 records.
You can specify order by and other clauses in your subquery if you need the first 10 in a specific order (for example, deleting ids 1 to 10).
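The ordered variant can be tried out locally with the sketch below. It runs on SQLite through Python's sqlite3 module, not on Hive, so it only illustrates the logic of the statement; whether a given Hive version accepts ORDER BY and LIMIT inside this subquery is something to verify on your own cluster. The 100 sample rows are made up to match the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (id INTEGER, name TEXT)")
# 100 made-up records with ids 1..100, as in the question.
conn.executemany(
    "INSERT INTO trans VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1, 101)],
)

# Delete the 10 records with the smallest ids; ORDER BY makes "first 10"
# well defined instead of depending on storage order.
conn.execute(
    "DELETE FROM trans WHERE id IN (SELECT id FROM trans ORDER BY id LIMIT 10)"
)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM trans ORDER BY id")]
print(len(remaining_ids), remaining_ids[0])  # 90 11
```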

SQL Statement - Select from two tables, create column if secondary table has related record

I'm posting here because I have not been able to find what I'm looking for, or even the correct keywords to search on. If there are better answers that I was unable to find, please feel free to point me in that direction.
I have two tables, of which Table 1 is the primary table. I need to SELECT all records from it and add an additional column to the SELECT that indicates whether there are any related records in Table 2.
I have boiled the problem down to the following and any help would be much appreciated.
Table 1 has a one-to-many relationship to Table 2
SELECT must return all rows from Table 1
SELECT must have an additional column (preferably BOOLEAN/INTEGER) that represents whether there are any related records in Table 2.
SELECT must work in both Access and SQL Server
TABLE 1
--------
GUID1 | DATA FIELD | DATA FIELD
GUID2 | DATA FIELD | DATA FIELD
GUID3 | DATA FIELD | DATA FIELD
TABLE 2
--------
GUID1 | TABLE 1 GUID | DATA FIELD | DATA FIELD
GUID2 | TABLE 1 GUID | DATA FIELD | DATA FIELD
GUID3 | TABLE 2 GUID | DATA FIELD | DATA FIELD
GUID4 | TABLE 2 GUID | DATA FIELD | DATA FIELD
SELECTED TABLE ( 1 JOINED ON TABLE 2 )
--------
GUID1 | DATA FIELD | DATA FIELD | 1 (EXISTS IN TABLE 2)
GUID2 | DATA FIELD | DATA FIELD | 1 (EXISTS IN TABLE 2)
GUID3 | DATA FIELD | DATA FIELD | 0 (DOES NOT EXIST IN TABLE 2)
You can use a LEFT OUTER JOIN with a case statement to check if the data in the second table is null. Here is an example:
SELECT First.*,
CASE
WHEN Second.DATA3 IS NULL
THEN 0
ELSE 1
END
FROM First
LEFT OUTER JOIN Second ON First.GUID1 = Second.GUID1
SQL Fiddle: http://sqlfiddle.com/#!6/ab17a/1
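A runnable sketch of the LEFT OUTER JOIN approach, using Python's sqlite3 module as a stand-in for Access or SQL Server (neither dialect is tested here). The table names, column names and sample rows are invented for the illustration. Since Table 1 can have many related rows in Table 2, the sketch adds DISTINCT so that each Table 1 row comes back once; that is my addition, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (guid TEXT PRIMARY KEY, data TEXT);
CREATE TABLE table2 (guid TEXT PRIMARY KEY, table1_guid TEXT, data TEXT);
INSERT INTO table1 VALUES ('A', 'a-data'), ('B', 'b-data'), ('C', 'c-data');
-- 'A' has two related rows, 'B' has one, 'C' has none.
INSERT INTO table2 VALUES ('S1', 'A', 'x'), ('S2', 'A', 'y'), ('S3', 'B', 'z');
""")

rows = conn.execute("""
    SELECT DISTINCT t1.guid, t1.data,
           -- t2.guid is NULL only when the join found no related row.
           CASE WHEN t2.guid IS NULL THEN 0 ELSE 1 END AS has_related
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t2.table1_guid = t1.guid
    ORDER BY t1.guid
""").fetchall()
print(rows)  # [('A', 'a-data', 1), ('B', 'b-data', 1), ('C', 'c-data', 0)]
```

Without DISTINCT, row 'A' would be returned twice, once per related record.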

sql insert value from another table with original nulls but not unmatched entries

OK. So this is a hard one to explain, but I am replacing the type of a foreign key in a database. To do this I need to update the values in a table that references it. That is all fine and good, and nice and easy to do.
I'm inserting this stuff into a temporary table which will replace the original table, but the insert query isn't at all difficult, it's the select that I get the values from.
However, I also want to keep any entries where the original reference was NULL. Also not hard, I could use a left join for that.
But we're not done yet: I don't want the entries for which there is no match in the second table. I've been dinking around with this for 2 hours now, and am no closer to figuring this out than I am to the moon.
Let me give you an example data set:
____________________________
| Inventory || Customer |
|============||============|
| ID Cust || ID Name |
|------------||------------|
| 1 A || 1 A |
| 2 B || 2 B |
| 3 E || 3 C |
| 4 NULL || 4 D |
|____________||____________|
Let's say the database used to use the Customer.Name field as its Primary Key, and I need to change it to a standard int identity(1,1) not null ID. I've added the field with no issues in the Customer table, and kept the Name because I need it for other stuff. I have had no trouble with this in all the tables that do not allow NULLs, but since the "Inventory" table allows something to be associated with No customer, I'm running into troubles.
If I did a left join, my results would be:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
| 3 NULL |
| 4 NULL |
|____________|
However, Inventory #3 was referencing a customer which does not exist. I want that to be filtered out.
This database is my development database, where I hack, slash, and destroy things with wanton disregard for validity. So a lot of links in these tables are no longer valid.
The next step is replicating this process in the beta-testing environment, where bad records shouldn't exist, but I can't guarantee that. So I'd like to keep the filter, if possible.
The query I have right now is using a sub-query to find all rows in Inventory whose CustID either exists in Customers, or is null. It then tries to only grab the value from those rows which the subquery found. Here's the translated query:
insert into results
(
ID,
Cust
)
select
inv.ID, cust.ID
from Inventory inv, Customer cust
where inv.ID in
(
select inv.ID from Inventory inv, Customer cust
where inv.Cust is null
or cust.Name = inv.Cust
)
and cust.Name = inv.Cust
But, as I'm sure you can see, this query isn't right. I've tried using 2, 3 subqueries, inner joins, left joins, bleh. The results of this query, and many others I've tried (that weren't horribly, horribly wrong) are:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
|____________|
Which is essentially an inner-join. Considering my actual data has around 1100 records which have NULL values in that field, I don't think truncating them is the answer.
The answer I'm looking for is:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
| 4 NULL |
|____________|
The trickiest part of this insert into select is the fact that I'm looking to insert either a value from another table, or essentially a value from this table or the literal NULL. That just isn't something I know how to do; I'm still getting the hang of SQL.
Since I'm inserting the results of this query into a table, I've considered doing the insert using a select which leaves out the NULL values and un-matched records, then going back through and adding in all the NULL records, but I really want to learn how to do the more advanced queries like this.
So do any of yous folks have any ideas? 'Cause I'm lost.
How about a union?
Select all records where Inventory.Cust matches Customer.Name, and union that with all Inventory records where Cust is null (selecting NULL for the customer ID).
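The union suggestion can be sketched against the example data like this. It uses Python's sqlite3 module as a test bed, which is an assumption; the question mentions identity(1,1), so the real database is probably SQL Server, where the same SELECT should look very similar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (ID INTEGER PRIMARY KEY, Cust TEXT);
CREATE TABLE Customer  (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Results   (ID INTEGER PRIMARY KEY, Cust INTEGER);
INSERT INTO Inventory VALUES (1, 'A'), (2, 'B'), (3, 'E'), (4, NULL);
INSERT INTO Customer  VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
""")

conn.execute("""
    INSERT INTO Results (ID, Cust)
    -- Branch 1: inventory rows whose customer name has a match.
    SELECT inv.ID, cust.ID
    FROM Inventory inv
    JOIN Customer cust ON cust.Name = inv.Cust
    UNION
    -- Branch 2: inventory rows that never referenced a customer.
    SELECT inv.ID, NULL
    FROM Inventory inv
    WHERE inv.Cust IS NULL
""")

results = conn.execute("SELECT ID, Cust FROM Results ORDER BY ID").fetchall()
print(results)  # [(1, 1), (2, 2), (4, None)]
```

Inventory #3 references customer 'E', which has no match and is not NULL, so it falls into neither branch and is filtered out.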