I have a Destination table with 3 columns: ID, Name, and Source.
I have 10+ source tables, each with multiple columns, but I only need the ID, the Name, and the name of the table itself to be appended to the Destination table.
Note that the column names differ from table to table, but the required ID and Name columns have the same data types and fit within the length of the destination's ID and Name fields.
I already have a query to add the data, and I have no issues with the first run that loads the required data into the destination table, since I just need one query per table. Here is one of them:
INSERT INTO dest_table
SELECT ID, Name, 'source_table' as Source
FROM source_table
The issue now is that I need to schedule this to run on a daily basis.
Each run should append only the new data from the source tables to the destination table, not re-add every record from each source table.
Another condition is that the data already in the destination table must stay intact. This means that records removed from the source tables must not be removed from the destination table.
Thanks people!
You can exclude the rows that were already copied by using a WHERE clause, as shown below. I am assuming that the ID is unique across all source tables; otherwise the check also needs to match on a column in the destination table that identifies which table the ID came from (the Source column can serve this purpose).
INSERT INTO dest_table
SELECT ID, Name, 'source_table' as Source FROM source_table
WHERE NOT EXISTS (SELECT 1 FROM dest_table dt WHERE dt.id = source_table.id)
Another non-optimal approach would be to create a trigger on insert in the source tables and push the data to the destination table.
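To make the NOT EXISTS approach concrete, here is a minimal sketch of a repeatable daily load. It uses Python's built-in sqlite3 module only as a stand-in for SQL Server so that it can be run anywhere. The customers table, its cust_id and cust_name columns, and the append_new_rows helper are hypothetical names chosen for illustration. The check matches on both ID and Source, which covers the case where IDs are not unique across source tables.

```python
import sqlite3

def append_new_rows(conn, source_table, id_col, name_col):
    """Copy only rows that dest_table does not yet hold for this source."""
    # Table and column names cannot be bound as parameters, so they are
    # interpolated; pass only trusted, hard-coded identifiers here.
    sql = f"""
        INSERT INTO dest_table (ID, Name, Source)
        SELECT s.{id_col}, s.{name_col}, ?
        FROM {source_table} AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM dest_table AS dt
            WHERE dt.ID = s.{id_col} AND dt.Source = ?
        )
    """
    conn.execute(sql, (source_table, source_table))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dest_table (ID INTEGER, Name TEXT, Source TEXT);
    CREATE TABLE customers (cust_id INTEGER, cust_name TEXT);  -- hypothetical source
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
""")

# Day 1: everything is new, so both rows are copied.
append_new_rows(conn, "customers", "cust_id", "cust_name")

# Day 2: one source row was deleted and one was added.
conn.execute("DELETE FROM customers WHERE cust_id = 1")
conn.execute("INSERT INTO customers VALUES (3, 'Cy')")
append_new_rows(conn, "customers", "cust_id", "cust_name")

dest_rows = conn.execute(
    "SELECT ID, Name, Source FROM dest_table ORDER BY ID"
).fetchall()
```

After the second run the destination still holds the row that was deleted from the source, and only the genuinely new row was appended. The same helper would be called once per source table from whatever scheduler runs the daily job.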
Related
I have a newly created table in SQL Server with 3 columns: ID, Name, and Source.
This table will be populated with data from several other tables, taking the record ID and record Name from each. I believe this can easily be achieved with an INSERT INTO SELECT statement.
I would like to find out how to populate the Source column. This column is supposed to indicate which table the data came from. For example, table A has 3 records; I copy the ID and Name columns from this table and put them into my destination table.
At the same time, the 3 new records should have their Source column set, indicating that they came from table A. Then I will do the same for the other tables.
You can use a constant string as follows:
INSERT INTO your_table
SELECT id, name, 'TableA' as source
FROM tableA
I have 2 tables in my SQL database:
And I want to merge them in a way the result will be:
This is just an example with 2 tables that need to be merged into one new table. (The tables contain example data; the statement should work for any amount of data in the tables.)
Where an ID has a different value in CSV, the CSV value should go into the new table. For example:
if ID 3's value is 'KKK' in CSV and 'CCC' in table T, then the value to use is the one from the CSV table.
You seem to want a left join and to match to the second table if available:
select t.id, coalesce(csv.value, t.value) as value
from t
left join csv
  on t.id = csv.id;
If you want this in a new table, use the appropriate construct for your database, or use insert to insert into an existing table.
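As a quick check of the left join with COALESCE, the sketch below builds two small tables and materializes the result. SQLite (via Python's sqlite3 module) is used only so the example is runnable; CREATE TABLE ... AS is SQLite's construct and other databases use their own. The rows for IDs 1 and 2 and the merged table name are made up for illustration; only ID 3 ('CCC' in T, 'KKK' in CSV) comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t   (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE csv (id INTEGER PRIMARY KEY, value TEXT);

    -- IDs 1 and 2 are hypothetical; ID 3 follows the question's example.
    INSERT INTO t   VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
    INSERT INTO csv VALUES (3, 'KKK');

    -- Keep every row of t, preferring the csv value when one exists.
    CREATE TABLE merged AS
        SELECT t.id, COALESCE(csv.value, t.value) AS value
        FROM t
        LEFT JOIN csv ON t.id = csv.id;
""")

merged = conn.execute("SELECT id, value FROM merged ORDER BY id").fetchall()
```

Rows without a match in csv keep the value from t, because the left join yields NULL for csv.value and COALESCE falls through to t.value.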
I have the scenario described below and need to create an SSIS package for it.
I have 3 columns in a source table whose values need to be entered into a destination table.
But each of these columns has to be looked up in a lookup table in the destination database, and the matching IDs are what should be entered in the destination columns.
For example
Source table has 3 columns with values
idnum  static type  timedimension  geography  modified date
1      price        daydate        france     8/12/2015
2      RetailpRICE  WEEK           ITALY      9/12/2014
I want a package that looks up each column value, finds the matching ID, and populates the destination table with it.
I know the Lookup transform can do this for a single column in the destination table, but what about the other columns that I need to insert along with the looked-up value?
How can I achieve this? Also, is there a way to pull only the recent data from the source table using the modified date column?
Use a different lookup for each lookup table that you need to reference to get the Ids. So if each of your columns that you want IDs for gets its ID from a different table, then you need to use three lookups, one after the other, until you have all three IDs.
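In plain SQL terms, chaining three lookups is roughly equivalent to joining the source to three lookup tables in turn. The sketch below illustrates that idea, together with a modified-date filter for pulling only recent rows. It is not SSIS code: it uses Python's sqlite3 module so it can be run, and the lk_type, lk_time, lk_geo, and destination tables, the load_since helper, and the ISO-formatted dates are all assumptions made for illustration. The case-insensitive match is also an assumption, added because the sample values mix upper and lower case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (idnum INTEGER, statictype TEXT, timedimension TEXT,
                         geography TEXT, modifieddate TEXT);
    -- Dates rewritten in ISO form under an assumed month/day order.
    INSERT INTO source VALUES
        (1, 'price',       'daydate', 'france', '2015-08-12'),
        (2, 'RetailpRICE', 'WEEK',    'ITALY',  '2014-09-12');

    -- Hypothetical lookup tables, one per source column.
    CREATE TABLE lk_type (id INTEGER, name TEXT);
    CREATE TABLE lk_time (id INTEGER, name TEXT);
    CREATE TABLE lk_geo  (id INTEGER, name TEXT);
    INSERT INTO lk_type VALUES (10, 'price'),   (11, 'retailprice');
    INSERT INTO lk_time VALUES (20, 'daydate'), (21, 'week');
    INSERT INTO lk_geo  VALUES (30, 'france'),  (31, 'italy');

    CREATE TABLE destination (idnum INTEGER, type_id INTEGER,
                              time_id INTEGER, geo_id INTEGER);
""")

def load_since(conn, last_load):
    """Resolve all three IDs and insert rows modified after last_load."""
    conn.execute("""
        INSERT INTO destination (idnum, type_id, time_id, geo_id)
        SELECT s.idnum, ty.id, ti.id, g.id
        FROM source AS s
        JOIN lk_type AS ty ON LOWER(ty.name) = LOWER(s.statictype)
        JOIN lk_time AS ti ON LOWER(ti.name) = LOWER(s.timedimension)
        JOIN lk_geo  AS g  ON LOWER(g.name)  = LOWER(s.geography)
        WHERE s.modifieddate > ?
    """, (last_load,))

load_since(conn, "2015-01-01")
loaded = conn.execute("SELECT * FROM destination").fetchall()
```

Only the row modified after the cutoff is loaded, with each text value replaced by its lookup ID. In a package, the same cutoff could be applied in the source query so that the lookups only see recent rows.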
I have two tables in an Amazon Redshift cluster that both use a timestamp as the sort key. The first table is sorted and contains only data from timepoint 1 to timepoint 2. The second table is only temporary, but it is also sorted and contains data from timepoint 3 to timepoint 4. Is there any way to insert all the data from the first table into the second without having to run VACUUM on the table afterwards? As far as I know, a normal INSERT from one table into another always needs a VACUUM afterwards.
I know it would be possible if I used COPY on a pre-sorted flat file. But is there also a solution for two pre-sorted tables that does not need a VACUUM?
Option 1:
Create a new table, say final_table, with the same schema as table2, since you want to combine the contents of table1 and table2.
To check the schema, run:
select "column", type, encoding
from pg_table_def where tablename = 'table2';
This gives the encoding used for each column of table2. Create final_table with the same encoding for each column.
Then load the data into final_table in sorted order:
insert into final_table (select * from table1 order by timepoint asc);
and then run:
insert into final_table (select * from table2 order by timepoint asc);
Option 2:
Create the final table and load the data for timepoint 1, then for timepoint 2, and so on, until all time points have been loaded in sorted order.
Option 3:
You can also look at the Redshift deep copy option:
http://docs.aws.amazon.com/redshift/latest/dg/performing-a-deep-copy.html
When doing the deep copy, copy the data from table 1 first, then load table 2.
I have tried this query in SQL Server:
SELECT * INTO table_name FROM old_table_name
I am currently working on a database where I am trying to find all the transactional tables (those whose names do not start with _Result or _History) and then display how many times each table is used in a calculation for the database model.
The purpose of finding out this is to determine the most important tables in the calculation, so that these certain tables will have a priority when updating statistics.
There are currently two tables I am working with.
1) The first table, tmpCalcSources, lists the names of all the source tables in the database (column 'Source') along with the associated calculation ID (column 'CalculationID').
2) The second table, tmpCalcSourceRows, lists each source table name (column 'TableName') along with the number of rows in that source table (column 'RowNum').
I currently have this query:
SELECT Source,
COUNT(CalculationID) AS NumberOfUses
FROM tmpCalcSources
WHERE Source not like '%_Result%'
GROUP BY Source
ORDER BY NumberOfUses DESC;
The above query provides me with the following table:
plPeriods 292
plMeasures 10
Time 43
etc...
I am now trying to add one more thing to the above result. I want it to also show the number of rows contained in each table (so, another column). I would like to take the 'RowNum' column from the tmpCalcSourceRows table and display it alongside the result shown above.
Is this what you're looking for:
SELECT tableUses.Source,
       tableUses.NumberOfUses,
       rowCounts.RowNum AS NumberOfRows
FROM (
    -- Usage count per source table, as in the original query
    SELECT Source, COUNT(CalculationID) AS NumberOfUses
    FROM tmpCalcSources
    WHERE Source NOT LIKE '%_Result%'
    GROUP BY Source
) tableUses
JOIN tmpCalcSourceRows rowCounts
  ON rowCounts.TableName = tableUses.Source
ORDER BY tableUses.NumberOfUses DESC;
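The join can be tried end to end with the sketch below, which uses Python's sqlite3 module as a runnable stand-in for the real database. The table and column names (tmpCalcSources, tmpCalcSourceRows, Source, CalculationID, TableName, RowNum) come from the question; every data row, including the Sales_Result table, is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmpCalcSources (Source TEXT, CalculationID INTEGER);
    CREATE TABLE tmpCalcSourceRows (TableName TEXT, RowNum INTEGER);

    -- Hypothetical sample data.
    INSERT INTO tmpCalcSources VALUES
        ('plPeriods', 1), ('plPeriods', 2), ('plPeriods', 3),
        ('Time', 1), ('Time', 2),
        ('plMeasures', 1),
        ('Sales_Result', 1);
    INSERT INTO tmpCalcSourceRows VALUES
        ('plPeriods', 120), ('Time', 3650), ('plMeasures', 15),
        ('Sales_Result', 999);
""")

rows = conn.execute("""
    SELECT u.Source, u.NumberOfUses, r.RowNum AS NumberOfRows
    FROM (
        -- Count uses per source table, skipping result tables.
        SELECT Source, COUNT(CalculationID) AS NumberOfUses
        FROM tmpCalcSources
        WHERE Source NOT LIKE '%_Result%'
        GROUP BY Source
    ) AS u
    JOIN tmpCalcSourceRows AS r ON r.TableName = u.Source
    ORDER BY u.NumberOfUses DESC
""").fetchall()
```

Because this is an inner join, a source table with no entry in tmpCalcSourceRows would drop out of the result; a LEFT JOIN would keep it with an empty row count instead.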