How to get the differences between two (kind of) duplicated tables in SQL

Prologue:
I have two tables in two different databases; one is an updated version of the other. For example, imagine that a year ago I copied table 1 into the new database (as table 2), and since then I have worked only on table 2, never updating table 1.
I would like to compare the two tables to find the differences that have accumulated over this period (the tables have kept the same structure, so the comparison is meaningful).
My approach was to create a third table, copy both table 1 and table 2 into it, and then count how many times each entry is repeated.
I think this, together with a new attribute recording which table each entry came from, would do the job.
Problem:
When copying the two tables into the third table, I get the (obvious) error about duplicate key values violating a unique or primary key constraint.
How can I get around the error, or how could I do the same job better? Any ideas are appreciated.

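Something like this should do what you want if A and B have the same structure; otherwise, just select and rename the columns you want to compare. The subquery has to be correlated with the outer row, otherwise NOT EXISTS is false for every row as soon as A contains anything:
SELECT
*
FROM
B
WHERE NOT EXISTS (SELECT * FROM A WHERE A.col = B.col and ....)
If NOT EXISTS doesn't work in your DBMS, you could also use a left outer join that compares the column values and keeps only the rows with no match. Note that this version returns the rows of A that are missing from B; swap the tables to get the other direction.
SELECT
A.*
from
A left outer join B
on A.col = B.col and ....
WHERE B.col IS NULL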
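For anyone who wants to try the two anti-join forms on toy data, here is a minimal sketch using Python's built-in sqlite3 module. The tables A and B, their columns (id, name) and the sample rows are made up for illustration; SQLite stands in for whatever DBMS you actually use.

```python
import sqlite3

# Hypothetical stand-ins for the old table (A) and the updated copy (B).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO A VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO B VALUES (1, 'alpha'), (2, 'BETA'), (4, 'delta');
""")

# Rows of B with no identical row in A (new or changed since the copy).
only_in_b = conn.execute("""
    SELECT * FROM B
    WHERE NOT EXISTS (
        SELECT 1 FROM A WHERE A.id = B.id AND A.name = B.name
    )
    ORDER BY id
""").fetchall()

# Rows of A with no identical row in B (deleted or changed since the copy).
only_in_a = conn.execute("""
    SELECT A.* FROM A
    LEFT OUTER JOIN B ON A.id = B.id AND A.name = B.name
    WHERE B.id IS NULL
    ORDER BY A.id
""").fetchall()

print(only_in_b)  # [(2, 'BETA'), (4, 'delta')]
print(only_in_a)  # [(2, 'beta'), (3, 'gamma')]
```

Comparing every column catches changed rows as well as added or removed ones; comparing only the key column would report just the additions and deletions.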

Related

Null values not being returned in Postgresql

I have two tables in PostgreSQL that I need to join while keeping the rows that have no match (the null values), so that I can fill in another column of the joined result.
Table one:
I filter it by date, because this data is generated daily and I only need the rows for current_date.
Table two: all names.
In table two I have 9 names that are not found in table one.
When I perform the join with the date from table one restricted to current_date, I only get the 9 names from table one as a result.
But if I don't filter table one by date, the null value is returned, that is, the name that is in table two but not in table one.
What I need is to join the two tables and, where a name from the second table has no matching asset, fill it with 0 (zero).
For this part I understand that I must use COALESCE(vcm.ativo,0).
But first I need the names from the second table to appear as well.
The result should be like this:
If someone could help me, I'd be grateful.
As pointed out in a comment by the asker, the solution turned out to be
with todays_data as (
select vcm.cooperativa, vcm.ativo
from sga_bi.veiculos_coop_mensal as vcm
where data = current_date
)
select coop.nome, COALESCE(vcmm.ativo,0)
from sga.cooperativas as coop
left outer join todays_data as vcmm
on coop.nome = vcmm.cooperativa
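The asker's original failing query is not shown, but a common cause of this symptom is putting the date filter in the WHERE clause of the outer query, which discards the NULL-extended rows after the join. The sketch below illustrates that assumption with Python's sqlite3 module; the tables names and daily, their columns and the fixed date are hypothetical stand-ins, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (nome TEXT);
    CREATE TABLE daily (cooperativa TEXT, ativo INTEGER, data TEXT);
    INSERT INTO names VALUES ('north'), ('south'), ('west');
    INSERT INTO daily VALUES
        ('north', 5, '2024-01-02'),
        ('south', 7, '2024-01-02'),
        ('west',  3, '2024-01-01');
""")
TODAY = '2024-01-02'  # fixed date standing in for current_date

# Filter in WHERE: applied after the join, so names without a row for
# TODAY are dropped and the LEFT JOIN behaves like an inner join.
filtered_in_where = conn.execute("""
    SELECT n.nome, COALESCE(d.ativo, 0)
    FROM names AS n
    LEFT OUTER JOIN daily AS d ON n.nome = d.cooperativa
    WHERE d.data = ?
    ORDER BY n.nome
""", (TODAY,)).fetchall()

# Filter inside the CTE: applied before the join, so every name survives
# and COALESCE fills the missing value with 0.
filtered_before_join = conn.execute("""
    WITH todays_data AS (
        SELECT cooperativa, ativo FROM daily WHERE data = ?
    )
    SELECT n.nome, COALESCE(t.ativo, 0)
    FROM names AS n
    LEFT OUTER JOIN todays_data AS t ON n.nome = t.cooperativa
    ORDER BY n.nome
""", (TODAY,)).fetchall()

print(filtered_in_where)     # [('north', 5), ('south', 7)]
print(filtered_before_join)  # [('north', 5), ('south', 7), ('west', 0)]
```

Moving the date condition into the ON clause of the join would have the same effect as the CTE.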

How to update numerical column of one table based on matching string column from another table in SQL

I want to update the numerical columns of one table based on matching string columns from another table, i.e.:
I have a table (let's say table1) with 100 records, containing 5 string (or text) columns and 10 numerical columns. I have another table (table2) with the same structure (columns) and 20 records. A few of its records contain updated data for table1, i.e., the numerical column values have changed for these records, and the rest are new (both text and numerical columns).
I want to update the numerical columns in table1 for records with the same text columns, and insert into table1 the rows from table2 whose text columns are new.
I thought of taking the intersection of the two tables and then updating, but I couldn't figure out how to update the numerical columns.
Note: I don't have any primary or unique key columns.
Please help.
Thanks in advance.
The simplest solution would be to use two separate queries, such as:
UPDATE b
SET b.[NumericColumn] = a.[NumericColumn],
etc...
FROM [dbo].[SourceTable] a
JOIN [dbo].[DestinationTable] b
ON a.[StringColumn1] = b.[StringColumn1]
AND a.[StringColumn2] = b.[StringColumn2] etc...
INSERT INTO [dbo].[DestinationTable] (
[NumericColumn],
[StringColumn1],
[StringColumn2],
etc...
)
SELECT a.[NumericColumn],
a.[StringColumn1],
a.[StringColumn2],
etc...
FROM [dbo].[SourceTable] a
LEFT JOIN [dbo].[DestinationTable] b
ON a.[StringColumn1] = b.[StringColumn1]
AND a.[StringColumn2] = b.[StringColumn2] etc...
WHERE b.[NumericColumn] IS NULL
--assumes that [NumericColumn] is non-nullable.
--If there are no non-nullable columns then you
--will have to structure your query differently
This will be effective if you are working with a small dataset that does not change very frequently and you are not worried about high contention.
There are still a number of issues with this approach - most notably what happens if either the source or destination table is accessed and/or modified while the update statement is running. Some of these issues can be worked around other ways but so much depends on the context of how the tables are used that it is difficult to provide a more effective generically-applicable solution.
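As a small runnable illustration of the update-then-insert pattern, here is a sketch using Python's sqlite3 module. The table and column names (destination, source, s1, s2, num) are hypothetical, and it deliberately uses portable correlated subqueries rather than the T-SQL UPDATE ... FROM syntax; the insert uses NOT EXISTS so it does not rely on any column being non-nullable. It assumes the string columns identify at most one source row each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destination (s1 TEXT, s2 TEXT, num INTEGER);
    CREATE TABLE source (s1 TEXT, s2 TEXT, num INTEGER);
    INSERT INTO destination VALUES ('a', 'x', 1), ('b', 'y', 2);
    INSERT INTO source VALUES ('a', 'x', 10), ('c', 'z', 30);
""")

with conn:  # commit both statements together, roll back if either fails
    # Step 1: overwrite the numeric column where the string columns match.
    conn.execute("""
        UPDATE destination
        SET num = (SELECT s.num FROM source AS s
                   WHERE s.s1 = destination.s1 AND s.s2 = destination.s2)
        WHERE EXISTS (SELECT 1 FROM source AS s
                      WHERE s.s1 = destination.s1 AND s.s2 = destination.s2)
    """)
    # Step 2: insert source rows whose string columns are not present yet.
    conn.execute("""
        INSERT INTO destination (s1, s2, num)
        SELECT s.s1, s.s2, s.num
        FROM source AS s
        WHERE NOT EXISTS (SELECT 1 FROM destination AS d
                          WHERE d.s1 = s.s1 AND d.s2 = s.s2)
    """)

rows = conn.execute(
    "SELECT s1, s2, num FROM destination ORDER BY s1"
).fetchall()
print(rows)  # [('a', 'x', 10), ('b', 'y', 2), ('c', 'z', 30)]
```

Running both statements in one transaction narrows, but does not remove, the concurrency concerns mentioned above.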

Compare a column in one table against a whole other table?

I know that SQL joins exist, but as far as I know they only compare one column against another column in another table. Is there any way to do something similar with one column against a whole table? I'm trying to figure out whether the people in one organization are a certain kind of employee. The problem is that all the people in an organization are listed in a single column of one table, while the classification of people is scattered across various columns of another table.
While I will answer this, I recommend you do one of those schoolkids tutorials on SQL. This question is at such a basic level you'll probably just get confused by the answers anyway...
From your question I would gather that the tables are probably modeled incorrectly to start with (not normalized well enough). But if you want to join a column to all the columns in another table, there are a few ways to do it:
1) Simple: write one join per column of table 2 and stack the results:
SELECT T1.COLUMN_1 FROM TABLE_1 T1 INNER JOIN TABLE_2 T2 ON T1.COLUMN_1 = T2.COLUMN_1
UNION ALL
SELECT T1.COLUMN_1 FROM TABLE_1 T1 INNER JOIN TABLE_2 T2 ON T1.COLUMN_1 = T2.COLUMN_2
UNION ALL
... (just change the column name on each row)
(this is easiest to generate by pasting it into Excel with a macro and a list of the column names from table 2).
2) More complex: create a view or subquery where you first union all the columns in table 2 one by one (hopefully they all have the same type!) and then join to that subquery, which now acts as a table with just one column.
3) Start pivoting table 2. Not going into that one, too complex for your current level.
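Option 2 can be sketched as follows with Python's sqlite3 module. The tables org_people and classifications and their columns are invented for the example, and the extra kind column (recording which classification column matched) is an addition that goes slightly beyond the one-column subquery described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE org_people (person TEXT);
    CREATE TABLE classifications (manager TEXT, engineer TEXT, contractor TEXT);
    INSERT INTO org_people VALUES ('ann'), ('bob'), ('cy'), ('dee');
    INSERT INTO classifications VALUES
        ('ann', 'bob', 'zed'),
        ('yan', 'dee', NULL);
""")

# Stack every classification column into one (person, kind) list,
# then join the single people column against that list.
matches = conn.execute("""
    SELECT p.person, c.kind
    FROM org_people AS p
    INNER JOIN (
        SELECT manager AS person, 'manager' AS kind FROM classifications
        UNION ALL
        SELECT engineer, 'engineer' FROM classifications
        UNION ALL
        SELECT contractor, 'contractor' FROM classifications
    ) AS c ON p.person = c.person
    ORDER BY p.person
""").fetchall()

print(matches)  # [('ann', 'manager'), ('bob', 'engineer'), ('dee', 'engineer')]
```

People with no classification (here 'cy') simply drop out of the inner join; a LEFT JOIN would keep them with a NULL kind.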
Yes, you can. Basically, if you have one table and want to filter certain names out of it, you can use joins.
As for the second part: yes, you can add multiple conditions with a WHERE clause,
and if you want to join on multiple conditions, you can combine them with AND.

Copy records missing from one table to a new table

I managed to delete 4,000 rows from a table in my 129,000-row production database (Postgres 9.4 on Heroku), but only identified the problem a few days later.
I have a backup from before the loss, but only want to selectively restore the missing rows back to the table, preserving their id's. (A complete restore is not an option as new data has since been added to the table.)
In a local testing database I have imported the backed-up table as articles_backups, alongside the actual articles table. I want to find all the rows in articles_backups that are missing from articles and copy them to a new table, articles_restores, which I will then restore to the production database, back into the articles table (preserving record id's).
This query successfully returns all the id's of the deleted records:
select articles_backups.id
from articles_backups
left outer join articles on (articles_backups.id = articles.id)
where articles.id is null
But I have not been able to copy the result to a new table. I have unsuccessfully tried:
select *
into articles_restores
from articles_backups
left outer join articles on (articles_backups.id = articles.id)
where articles.id is null;
Which gives:
ERROR: column "id" specified more than once
Basically your query with LEFT JOIN / IS NULL does what you are after:
Select rows which are not present in other table
You get the error because you select all columns from both tables, and there is an id column in both. It's not possible to create a new table with duplicate column names, and it's not what you want to begin with. Only select columns from articles_backups:
CREATE TABLE articles_restores AS
SELECT ab.*
FROM articles_backups ab
LEFT JOIN articles a USING (id)
WHERE a.id IS NULL;
While at it, I simplified your query syntax with table aliases. The USING clause is just for the convenience of shorter code. It folds the two id columns into one, but all other columns are still in there twice if you SELECT *.
Use CREATE TABLE AS. SELECT INTO is also defined by the SQL standard and implemented in Postgres, but its use is discouraged. It's used in PL/pgSQL functions for a different purpose. Details:
Creating temporary tables in SQL
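To see the effect of selecting only the backup table's columns, here is a small sketch with Python's sqlite3 module. The title column and the sample rows are made up; SQLite stands in for Postgres here and will not reproduce the exact "specified more than once" error, so the sketch only shows the working form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_backups (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO articles_backups VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO articles VALUES (1, 'one'), (4, 'four');

    -- Only ab.* is selected, so each column name appears exactly once.
    CREATE TABLE articles_restores AS
    SELECT ab.*
    FROM articles_backups AS ab
    LEFT JOIN articles AS a ON a.id = ab.id
    WHERE a.id IS NULL;
""")

restored = conn.execute(
    "SELECT id, title FROM articles_restores ORDER BY id"
).fetchall()
# Column names of the new table, read from SQLite's table_info pragma.
columns = [row[1] for row in conn.execute("PRAGMA table_info(articles_restores)")]

print(restored)  # [(2, 'two'), (3, 'three')]
print(columns)   # ['id', 'title']
```

The original ids are carried over unchanged, which is what the restore into production needs.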
You could use EXCEPT to retrieve all the rows from articles_backup that are missing from, or different in, articles (assuming both tables have the same columns in the same order).
You could also store the result in a temp table to make your repair statements easier:
create table temp_articles as
select * from articles_backup
except
select * from articles
step 1 - update the rows from 'articles_backup' that are still present in articles.
This step needs attention... you will have to establish a rule to choose between the data present in articles and the data present in temp_articles.
UPDATE articles a
SET col1=b.col1,
col2=b.col2,
(... other columns ...)
FROM temp_articles AS b
WHERE a.id = b.id and /* your rule for data to be (or not) updated goes here */
step 2 - insert rows from 'articles_backup' not present in articles (your deleted records):
insert into articles
select * from temp_articles where id not in (select id from articles)
Let us know if you need more help.
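A quick way to see what EXCEPT collects, and why step 1 needs a rule, is the sketch below (Python's sqlite3 module, with an invented title column and sample rows). It only performs step 2; the changed row is left alone because the choice between the two versions is up to you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_backup (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO articles_backup VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO articles VALUES (1, 'one'), (2, 'two (edited)'), (4, 'four');

    CREATE TABLE temp_articles AS
    SELECT * FROM articles_backup
    EXCEPT
    SELECT * FROM articles;
""")

# EXCEPT keeps backup rows that are missing (id 3) AND rows whose
# content changed after the backup (id 2).
differing = conn.execute(
    "SELECT id, title FROM temp_articles ORDER BY id"
).fetchall()

# Step 2 only: re-insert the rows whose id no longer exists.
with conn:
    conn.execute("""
        INSERT INTO articles
        SELECT * FROM temp_articles
        WHERE id NOT IN (SELECT id FROM articles)
    """)

final = conn.execute("SELECT id, title FROM articles ORDER BY id").fetchall()

print(differing)  # [(2, 'two'), (3, 'three')]
print(final)      # [(1, 'one'), (2, 'two (edited)'), (3, 'three'), (4, 'four')]
```

Row 2 shows up in temp_articles only because it was edited after the backup, so blindly updating from the backup would undo that edit.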

Oracle Select Query Load Optimization

I have two tables, A and B. A has a column b_id which acts as a foreign key reference for a many-to-one relationship.
So is there any load difference in Oracle between executing a query like
select A.* from A, B where A.b_id=B.ID and B.ID=? -- auto-generated by hibernate
and
select * from A where b_id = ? -- Created manually
UPDATE : I need data from only table A
There will certainly be a difference between the two queries: the first one reads from two tables, while the second one queries a single table.
Even if you don't return any columns from table B in the first query, its data is still used for the join condition (which is not the case in the second query).
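Besides the load, the two queries are only guaranteed to return the same rows when every b_id actually exists in B. The sketch below checks that with Python's sqlite3 module; SQLite stands in for Oracle, no foreign key is enforced, and the sample rows are invented, so it says nothing about Oracle's execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE B (ID INTEGER PRIMARY KEY);
    CREATE TABLE A (id INTEGER PRIMARY KEY, b_id INTEGER);  -- no FK enforced
    INSERT INTO B VALUES (1), (2);
    INSERT INTO A VALUES (10, 1), (11, 1), (12, 2), (13, 99);  -- 99 is orphaned
""")

def via_join(b_id):
    """Hibernate-style query: the join needs a matching row in B."""
    return conn.execute(
        "SELECT A.* FROM A, B WHERE A.b_id = B.ID AND B.ID = ? ORDER BY A.id",
        (b_id,),
    ).fetchall()

def direct(b_id):
    """Hand-written query: reads table A only."""
    return conn.execute(
        "SELECT * FROM A WHERE b_id = ? ORDER BY id", (b_id,)
    ).fetchall()

print(via_join(1) == direct(1))  # True
print(via_join(99))              # []
print(direct(99))                # [(13, 99)]
```

With an enforced foreign key the orphaned row cannot exist, and the two queries then differ only in the work the database has to do.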