We have a use case where we have to combine data from three tables in Redshift. The data size is around 1 million records.
The first table contains client details - table1
The second table contains visits - table2 (M2O relation to table1)
The third table contains events - table3 (M2O relation to table2)
Now we have to aggregate these three tables and prepare one record per visit, such that it contains the client details from table1, the visit details from table2, and all the event attributes from table3.
What is the best way to read data in the above format from Redshift using a batch job?
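One way to produce visit-level records is to join all three tables and aggregate the events per visit in Redshift itself, so the batch job reads pre-flattened rows. A rough sketch, assuming hypothetical column names (client_id, visit_id, event_type, event_ts - adjust to your actual schema) and using Redshift's LISTAGG to collapse events:

```sql
-- Hypothetical column names; one output row per visit.
SELECT
    c.client_id,
    c.client_name,
    v.visit_id,
    v.visit_date,
    LISTAGG(e.event_type, ',') WITHIN GROUP (ORDER BY e.event_ts) AS event_types
FROM table1 c
JOIN table2 v ON v.client_id = c.client_id
JOIN table3 e ON e.visit_id  = v.visit_id
GROUP BY c.client_id, c.client_name, v.visit_id, v.visit_date;
```

The batch job can then page through this result set (e.g. with a keyset condition on visit_id) instead of joining in application code, which avoids pulling all three tables over the wire.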
Related
I have 2 tables in my SQL database:
And I want to merge them in a way the result will be:
This is just an example of 2 tables which need to be merged into one new table (the tables contain example data; the statement should work for any amount of data inside the tables).
Where an ID has a different value in the CSV table, the new table should take the CSV value. For example:
ID 3's value in CSV is 'KKK' while in table T it is 'CCC'; in that case the CSV value is the one that should be used.
You seem to want a left join and to match to the second table if available:
select t.id, coalesce(csv.value, t.value) as value
from t
left join csv on t.id = csv.id;
If you want this in a new table, use the appropriate construct for your database, or use insert to insert into an existing table.
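For instance, if the database supports CREATE TABLE ... AS (most do; the table and column names below are from the example above), the query can materialize its result directly. A sketch, not tied to any specific vendor:

```sql
-- CTAS form; on SQL Server, use SELECT ... INTO merged FROM ... instead.
CREATE TABLE merged AS
SELECT t.id, COALESCE(csv.value, t.value) AS value
FROM t
LEFT JOIN csv ON t.id = csv.id;
```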
Using Microsoft SQL server 2017, I want to merge two tables into a 3rd table.
Is there a way to have it defined automatically, without a query?
If it is not possible, I would like to mention that both tables contain a large number of columns, so I am looking for an efficient query that does not make me write out each column name.
Demonstration:
Original tables are Table1 and Table2; the column structure is as below:
Table1:
(Column1,Column2,Column3)
Table2:
(Column4,Column5,Column6)
My goal is to create Table3. Table3 is based on Table1 LEFT JOIN Table2 ON Column1=Column4.
Table 3:
(Column1,Column2,Column3,Column4,Column5,Column6)
Create it as a view.
CREATE VIEW table3 AS
SELECT Column1, Column2, Column3, Column4, Column5, Column6
FROM Table1
LEFT JOIN Table2 ON Column1 = Column4;
You can then reference table3 in other queries exactly like an ordinary table.
I'm facing a problem with fetching data from multiple sources. It would be great if you can provide your ideas to design the SQL query.
I have to take data from two tables and INSERT it into a third table.
INPUT
TABLE 1
- TaskOrderNumer
- MemberID
TABLE 2
- ReferenceID
- MemberID
OUTPUT
TABLE 3
- TaskRefID
- PatID
My input table has TaskOrderNumber and MemberID. Right now I'm joining TABLE1 and TABLE2 on MemberID. I'm getting the corresponding ReferenceID from TABLE2 and mapping it to PatID in TABLE3. TaskOrderNumber in TABLE1 maps to TaskRefID in TABLE3.
I'm currently doing this using SSIS components. I want to make sure that the correct data is MERGED, but I'm not able to map TaskOrderNumber to TaskRefID. Can you please help me design the solution?
You can simply query the information you are trying to display. I am not sure I would even bother with SSIS here unless your sources aren't SQL.
select t1.TaskOrderNumber as TaskRefID
      ,t2.ReferenceID as PatID
into Table3 -- creates Table3 and inserts the results
from Table1 t1
join Table2 t2 on t1.MemberID = t2.MemberID
I have 100s of millions of unique rows spread across 12 tables in the same database. They all have the same schema/columns. Is there a relatively easy way to combine all of the separate tables into 1 table?
I've tried importing the tables into a single table, but given the HUGE number of files/rows, SQL Server is making me wait a long time, as if I were importing from a flat file. There has to be an easier/faster way, no?
You haven't given much info about your table structure, but you can probably just do a plain old insert from a select, like below. The example takes all records from Table2 and Table3 that don't already exist in Table1 and inserts them into Table1. You could do the same to merge everything from all 12 of your tables into a single table.
INSERT INTO Table1
SELECT * FROM Table2
WHERE SomeUniqueKey
NOT IN (SELECT SomeUniqueKey FROM Table1)
UNION
SELECT * FROM Table3
WHERE SomeUniqueKey
NOT IN (SELECT SomeUniqueKey FROM Table1)
--...
Do what Jim says, but first:
1) Drop (or disable) all indices in the destination table.
2) Insert rows from each table, one table at a time.
3) Commit the transaction after each table is appended; otherwise a lot of disk space will be held in case of a possible rollback.
4) Re-enable or recreate the indices after you are done.
If there is a possibility of duplicate keys, you may need to retain an index on the key field and have a NOT EXISTS clause to hold back the duplicate records from being added.
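A NOT EXISTS version of the per-table insert might look like the sketch below (SomeUniqueKey is a placeholder from the earlier example; substitute your actual key column). NOT EXISTS usually optimizes better than NOT IN on large tables and is not tripped up by NULL keys:

```sql
-- Repeat once per source table (Table2 .. Table12), one at a time.
INSERT INTO Table1
SELECT s.*
FROM Table2 s
WHERE NOT EXISTS (
    SELECT 1
    FROM Table1 d
    WHERE d.SomeUniqueKey = s.SomeUniqueKey
);
```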
We have seven tables in a Postgres database: t1, t2, t3, t4, t5, t6, t7. Each table contains various columns with duplicate product_id numbers.
The product_id number exists in each table. That means:
t1 --> 123 (product_id)
t1 --> 123 (with various other column data)
t2 --> 123, and so on up to t7
This "123" product_id will exist in every table up to t7. Also, a table can contain the same product_id more than once.
The current requirement is to process all product_ids on my server, so I need to create an intermediate table with unique product_ids.
Whenever I update the tables (t1..t7), the intermediate table has to be updated by a trigger.
Edit1:
The intermediate view has to be generated by combining all seven tables.
When I import more rows from CSV (COPY tablename FROM csvpath ...) into these seven tables, the intermediate view also needs to be recomputed and updated by the trigger,
because this is a frequent operation: updating the tables from CSV, then recomputing and updating the intermediate view.
So, how should the trigger be written so that it fires when the seven tables are updated by importing from CSV?
Don't create a table, create a view that selects from those tables.
create or replace view all_product_ids
as
select product_id
from t1
union
select product_id
from t2
union
... you get the picture ...
Once you have done that, re-think your database model. From the little information you have provided, it sure sounds like your model is not ideal.
Take a look at the PostgreSQL documentation on PL/pgSQL triggers, and especially at this example; your use case looks similar.
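If you do end up needing a physically stored result rather than a plain view (for example a materialized view), one approach is a statement-level trigger that refreshes it after each bulk load. A sketch under those assumptions; the view, function, and trigger names below are illustrative, and COPY does fire INSERT triggers in PostgreSQL:

```sql
-- Materialized counterpart of the view from the answer above.
CREATE MATERIALIZED VIEW all_product_ids_mat AS
SELECT product_id FROM t1
UNION
SELECT product_id FROM t2;
-- ... UNION the remaining tables up to t7 ...

CREATE OR REPLACE FUNCTION refresh_all_product_ids()
RETURNS trigger AS $$
BEGIN
    REFRESH MATERIALIZED VIEW all_product_ids_mat;
    RETURN NULL;  -- return value is ignored for AFTER ... FOR EACH STATEMENT
END;
$$ LANGUAGE plpgsql;

-- One trigger per source table; repeat for t2 .. t7.
CREATE TRIGGER trg_refresh_product_ids
AFTER INSERT OR UPDATE OR DELETE ON t1
FOR EACH STATEMENT
EXECUTE FUNCTION refresh_all_product_ids();
```

Note that refreshing on every statement can get expensive if imports are frequent; a plain view (as suggested above) or a scheduled refresh may serve you better.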