How to duplicate data in a relation table? - sql

Currently, I have 2 tables:
Table A (master) with 3 columns: tblA (tblA_ID, name, desc) (tblA_ID is identity key)
Table B (detail) with 4 columns: tblB (tblB_ID, tblA_ID, name, desc)
Table A has 100 records, and each of them has 10 related records in Table B.
What I want is for Table B to hold 1,000,000 records for each of these 100 tblA_IDs. Or rather, to add 999,990 records per tblA_ID, since Table B already has 10 records for each of those IDs.
My current solution uses cursors: go through Table A one row at a time, take each tblA_ID, find its data in Table B, and then do the inserts.
So, is it possible? Do you have any suggestions for this case?

From what I understand, it seems like you are trying to insert dummy data for testing. There is software available to accomplish what you are after. One tool which I found to be extremely good is:
RedGate SQL Data Generator: http://www.red-gate.com/products/sql-development/sql-data-generator/
It has a 14-day trial so you can test it out.
There are some free options available, but they are not as good, since Redgate's software handles table relationships on its own.
One of the free generators is: http://www.generatedata.com
You can download a CSV file and add it to your tables by right-clicking on the database, hovering over Tools, and clicking on Import data.
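If you would rather do this in plain SQL without a cursor or an external tool, a set-based insert can generate the rows in one statement. Below is a minimal T-SQL sketch (assuming SQL Server, and that tblB_ID is an identity column): it cross-joins the existing 10 detail rows per tblA_ID with a tally of 99,999 numbers, producing 999,990 new rows per parent.

-- Sketch only: duplicates each of the 10 existing detail rows
-- 99,999 times, i.e. 999,990 new rows per tblA_ID.
WITH tally AS (
    SELECT TOP (99999)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO tblB (tblA_ID, name, [desc])
SELECT b.tblA_ID, b.name, b.[desc]
FROM tblB AS b
CROSS JOIN tally;

Be aware this inserts roughly 100 million rows in total, so you may want to run it in batches per tblA_ID and size the transaction log accordingly.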


MS Access - Counting Occurrences of a word in multiple columns

I have a database with a couple of tables that track personnel errors which require rework by another person. Basically, a person on the job could rework up to 10 different work packages by other people throughout their shift. To keep it easy, I just have columns in the table for rework_1/original_worker_1/rework_comment_1 (repeated up to 10) and the person who had to rework it. All of my workers' names are in a separate table so I can add people and my forms update dynamically with their names. What I want to do is this:
Pull a person from my workers' name table.
Search for all occurrences of their name in another table, in the columns original_worker_X (where X is 1-10).
Output the values: worker's name / how many times I found it in the original_worker_X columns.
From there I would need to make a bar graph so that each person's name has a bar showing how many times someone had to rework something they did originally.
If I could do this with PHP and MySQL I would be in the money, because I could brute-force something with some PHP variables, queries, and loops, but I am an Access novice at best! I appreciate any help you wizards can provide.
(Table 1, Table 2, and the expected output numbers were shown as images here.)
So I will suggest you do the following.
Create a new table, let's say Table3, with three fields:
A. ID, pkey, autonumber
B. original_worker, text field
C. Person_doing_rework, text field
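In Access SQL the table could be created like this (a sketch; the field sizes are assumptions):
CREATE TABLE Table3 (
    ID AUTOINCREMENT PRIMARY KEY,
    original_worker TEXT(255),
    Person_doing_rework TEXT(255)
);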
You will need ten insert statements, one for each of original_worker1 through original_worker10, each paired with the person doing the rework; this gets the data into a normalised table.
Currently the design of your table is a bit crude, and a single select statement grouping across ten columns is not achievable.
Below are samples of the insert statements:
INSERT INTO Table3 (original_worker, Person_doing_rework)
SELECT original_worker1, Person_doing_rework
FROM Table2 WHERE original_worker1 IS NOT NULL;

INSERT INTO Table3 (original_worker, Person_doing_rework)
SELECT original_worker2, Person_doing_rework
FROM Table2 WHERE original_worker2 IS NOT NULL;

Replicate this for original_worker3 through original_worker10.
Third step
You need a delete statement that removes everything from Table3; this ensures the records in Table3 are not duplicated on each run, since we don't have a pkey/fkey relationship between Table2 and Table3.
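For example:
DELETE FROM Table3;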
Fourth step
Place all the queries into a macro in the following order
A. Delete query to run first
B. Insert queries to run next
Fifth step
Add a msgbox to the macro that runs last; this informs you that the other macro steps, i.e. A and B above, have run successfully.
Sixth step
You can now have a select statement on Table3 that counts the number of times an original worker's work was reworked, because Table3 now holds the two fields you need: one for the original worker and one for the person who did the rework.
So any time you want to find out how many times someone's work has been reworked, you just click the macro button; this runs all the queries and puts the values you need into Table3, after which you can view the details via the query below:
SELECT original_worker, COUNT(Person_doing_rework) AS rework_count
FROM Table3
GROUP BY original_worker;

How to check if a set of rows already exists in the database and skip migrating them?

I need to create a package to migrate a large amount of data from one database table into a table in a different database. The source table will keep receiving new data for the next 4 or 5 days, so I will run my package again and again.
I need to migrate all data from this table to the other table, but I don't want to re-migrate the rows I have already migrated. What kind of transformation do I need to use, or what SQL command do I need to write, to do this?
The usual way this is done is by having "audit" timestamps on the source table and migrating only records updated or inserted after the last migration.
for example:
Table Sales
sale_id
sale_date
sale_amount
...............
dw_create_date
dw_update_date
Your source extraction could be something along the lines of:
select sales.sale_id,
       sales.sale_date,
       ....
from sales
where sales.dw_update_date > {last_migration_date}
last_migration_date is usually read from a config file or table.
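A minimal sketch of such a control table (the table and job names here are assumptions, not something from your environment):

-- One row per ETL job, holding the high-water mark of the last run.
CREATE TABLE etl_control (
    job_name            VARCHAR(100) PRIMARY KEY,
    last_migration_date TIMESTAMP
);

-- Read the marker before extracting:
SELECT last_migration_date FROM etl_control WHERE job_name = 'sales_load';

-- After a successful load, advance the marker:
UPDATE etl_control
SET last_migration_date = CURRENT_TIMESTAMP
WHERE job_name = 'sales_load';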
Other approaches
There are a few other approaches that you could use, but all of them develop bigger performance problems as your data size grows.
1) Do a (source minus target) to get the changed rows in the source:
select *
from source
minus
select * from target
You could do the same using a join between source and target:
select src.*
from src
left join tgt on (src.id = tgt.id)
where (src.column1 <> tgt.column1 or
       src.column2 <> tgt.column2
       ............
      )
Note that neither of these approaches takes care of deletes in the source. If you want the tables to be in sync, the only way to do that is a (source minus target) to get insert/update changes plus a (target minus source) to get deleted rows, and to apply both to the target. (Also note that in the join version, brand-new source rows need an extra "or tgt.id is null", since the column comparisons evaluate to NULL for unmatched rows.)
2) Insert and ignore the primary key constraint error:
This has serious issues if the data can change in the source and you want the updates propagated to the target. You'd also be querying the entire source each time. It is usually better to use Merge/Upsert along with filtered source data, instead.
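For illustration, here is a sketch of that merge/upsert approach against the Sales example above (assuming Oracle-style MERGE syntax, a target table named sales_target, and that sale_id uniquely identifies a row):

MERGE INTO sales_target t
USING (SELECT *
       FROM sales
       WHERE dw_update_date > {last_migration_date}) s
ON (t.sale_id = s.sale_id)
WHEN MATCHED THEN
    UPDATE SET t.sale_date   = s.sale_date,
               t.sale_amount = s.sale_amount
WHEN NOT MATCHED THEN
    INSERT (sale_id, sale_date, sale_amount)
    VALUES (s.sale_id, s.sale_date, s.sale_amount);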
I would assume both tables have some unique identifier, no?
Table A has:
1
2
3
4
You're moving that to Table B, but keeping the data in Table A at the same time, yes?
So you've run your job once. Now Table B has:
1
2
3
4
Table A gets updated. It now has:
1
2
3
4
5
6
7
You run your job again, but you only want to send over 5,6,7.
SELECT *
FROM TableA
LEFT OUTER JOIN TableB ON TableA.ID = TableB.ID
WHERE TableB.ID IS NULL
If you have some sample data it would help. Does this give you a good idea?
See joins: http://i.stack.imgur.com/1UKp7.png

Creating history of flows_030100.wwv_flow_activity_log

Quick Version: I have 4 tables (TableA, TableB, TableC, TableD) identical in design. TableC is a complete History of TableA & B. I want to periodically update TableC with new data from TableA & B. TableD contains a copy of the row most recently transferred from A/B to C. I need to select all records from TablesA/B that are more recent than the record in TableD. Any advice?
Long Version: I'm trying to ETL (Extract, Transform, Load) some information from a few different tables into some other tables for quicker, easier reporting... kind of like a data warehouse, but within the same database (don't ask).
Basically we want to record and report on system performance. Oracle keeps logs for this in the tables flows_030100.wwv_flow_activity_log1$ and flows_030100.wwv_flow_activity_log2$ - I believe these tables are filled and cleared every two weeks or so...
I have created a table:
CREATE TABLE dw_log_hist AS
SELECT * FROM flows_030100.wwv_flow_activity_log WHERE 1=0;
and filled it with the current information:
INSERT INTO dw_log_hist
SELECT *
FROM flows_030100.wwv_flow_activity_log1$;

INSERT INTO dw_log_hist
SELECT *
FROM flows_030100.wwv_flow_activity_log2$;
HOWEVER, these log files record EVERY click in the APEX screens. As such, they are continually growing.
I want to periodically update my DW_Log_Hist table with only new information (I am fully aware my history table will grow to be ridiculously sized but I'll deal with that later).
Unfortunately, these tables have no primary key, so I've had to create another table to store marker records that will tell me the latest logs I copied over -_-
CREATE TABLE dw_log_temp AS
SELECT * FROM flows_030100.wwv_flow_activity_log
WHERE time_stamp = (SELECT MAX (time_stamp)
FROM flows_030100.wwv_flow_activity_log2$)
NOW THEN after all that waffle... this is what I need your help with:
Does anyone know whether one of the log tables (wwv_flow_activity_log1$ or wwv_flow_activity_log2$) always has the latest logs? Is it a case of log1$ filling up, log2$ filling then log1$ being overwritten with log2$ so that log2$ always has the latest data? Or do they both fill up and then get filled up again?
Can anyone advise how I would go about populating the DW_Log_Hist table using the DW_Log_Temp marker records?
Conceptually it would be something like:
insert everything into dw_log_hist from activity_log1$ and activity_log2$ where the time_stamp is > (time_stamp of the record in dw_log_temp)
Super sorry for such a long post.
Got the answer :-)
A chap on Reddit helped me realise my overcomplication...
insert into dw_log_hist
select *
from flows_030100.wwv_flow_activity_log1$
where time_stamp > (select max(time_stamp)
from dw_log_hist)
union
select *
from flows_030100.wwv_flow_activity_log2$
where time_stamp > (select max(time_stamp)
from dw_log_hist)
Hurrah! Always feel like such an idiot when you see the simple answer...

Need SQL to shift entries from one table to another

Here's the situation. I have 2 tables here with the following schemas:
ID | COMPANY_NAME | DESC | CONTACT
ID | COMPANY_ID | X_COORDINATE | Y_COORDINATE
The first table contains a list of companies and the second contains the coordinates of those companies, as shown.
The thing is that I want to merge the data in these tables with the data in another set of tables which already have data. The other tables have a similar structure but are already prepopulated. The IDs are autoincremental.
So if we have, let's say, companies numbered 1-1000 in table 1 and companies numbered 1-500 in table 2, we need them merged such that ID number 1 in table 2 becomes ID 1001 when migrated to the other table. And side by side we would also want to migrate the entries in the coordinates table in such a way that they map to the new IDs. Can this be done in SQL, or do I need to resort to a script for this kind of work?
I'm not sure I understand how many tables there are and which is table 1 or 2, but the problem is pretty clear. I think the easy way is:
Back up your whole database before you start this process.
Add a column to the destination table that will contain the original ID.
Insert all the records you want to merge (source) into the destination table, putting the original ID in the column you added.
Now you can update the geo X,Y data using the old ID.
After all is done and good, you can remove the original ID column.
EDIT: in reply to your comment, I'll add the code here, since it's more readable.
adapted from SQL Books Online: insert rows from another table
INSERT INTO MyNewTable (TheOriginalID, [Desc])
SELECT ID, [Desc]
FROM OldTable;
Then you can do an update to the new table based on values from the old table, like so:
UPDATE MyNewTable
SET X = OldTable.X, Y = OldTable.Y
FROM MyNewTable
INNER JOIN OldTable ON MyNewTable.TheOriginalID = OldTable.ID;
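For completeness, the surrounding steps from the list above might look like this (a sketch; the column type is an assumption):

-- Step 2: add the helper column that carries the original ID.
ALTER TABLE MyNewTable ADD TheOriginalID INT;

-- ... run the INSERT and UPDATE above ...

-- Final step: drop the helper column once the coordinates are remapped.
ALTER TABLE MyNewTable DROP COLUMN TheOriginalID;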

Copy data between tables in different databases without PK's ( like synchronizing )

I have a table (A) in a database that doesn't have PKs; it has about 300k records.
I have a subset copy (B) of that table in another database; it has only 50k records and contains a backup for a given time range (July data).
I want to copy the missing records from table B into table A, without duplicating existing records of course. (I can create a database link to make things easier.)
What strategy can I follow to successfully insert into A the missing rows from B?
These are the table columns:
IDLETIME NUMBER
ACTIVITY NUMBER
ROLE NUMBER
DURATION NUMBER
FINISHDATE DATE
USERID NUMBER
.. 40 extra varchar columns here ...
My biggest concern is the lack of PK. Can I create something like a hash or a PK using all the columns?
What could be a possible way to proceed in this case?
I'm using Oracle 9i in table A and Oracle XE ( 10 ) in B
The approximate number of elements to copy is 20,000
Thanks in advance.
If the data volumes are small enough, I'd go with the following:
CREATE DATABASE LINK A CONNECT TO ... IDENTIFIED BY ... USING ....;
INSERT INTO COPY
SELECT * FROM table@A
MINUS
SELECT * FROM COPY;
You say there are about 20,000 to copy, but not how many in the entire dataset.
The other option is to delete the current contents of the copy and insert the entire contents of the original table.
If the full datasets are large, you could go with a hash, but I suspect that it would still try to drag the entire dataset across the DB link to apply the hash in the local database.
As long as no duplicate rows should exist in the table, you could apply a unique or primary key across all the columns (though note that Oracle limits a composite index to 32 columns, which this table exceeds). If the overhead of maintaining such a key/index would be too much, you could also query the database from your application to see whether a row already exists, and only perform the insert if it is absent.
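A sketch of that existence check done set-wise over the database link, instead of row by row from the application (the link name xe_link is an assumption, and only a few identifying columns are compared; NULLs in the compared columns would need extra handling):

INSERT INTO table_a
SELECT b.*
FROM table_b@xe_link b
WHERE NOT EXISTS (
    SELECT 1
    FROM table_a a
    WHERE a.idletime   = b.idletime
      AND a.activity   = b.activity
      AND a.role       = b.role
      AND a.duration   = b.duration
      AND a.finishdate = b.finishdate
      AND a.userid     = b.userid
      -- ... compare the remaining varchar columns as needed ...
);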