How to fill database with data - sql

I would like to ask how I can create an insert script without having to write it manually. Is there some software with a GUI where I can enter the data and it will generate the script for me? I build my database in Oracle SQL Developer, but I can't find anything like that there. Thank you in advance.

If you mean you want to fill a table with dummy test data, there are dozens of ways to do it:
Provided you have access to the data dictionary, here is one easy way to generate 20,000 records:
insert into my_table
select
  -- one expression per column of my_table; this example assumes an id,
  -- a text column and a numeric column, using dbms_random for variety
  rownum,
  dbms_random.string('U', 10),
  trunc(dbms_random.value(1, 1000))
from
  dba_objects a,
  dba_objects b
where
  rownum <= 20000;
This makes use of a cartesian join with one of the large dictionary views that comes installed with Oracle, dba_objects.
PS: Cartesian join on large tables/views can become very slow, so use good judgement to restrict the result set.
OTOH, if you want specific data and not some random stuff to be inserted into the table, you are stuck with Oracle's INSERT..VALUES syntax, wherein you create an INSERT statement for each record. You might reduce the effort of converting your data (in CSV or some other standard format) by automating the copy/paste work with features like the macros available in editors such as Notepad++ or Sublime Text.
There is also the option of SQL*Loader, where you write a "Control File" to tell the tool how to load the data from an external file into the table. This approach is faster than the INSERT..VALUES approach.
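As a rough sketch of what such a control file might look like (the file name, table name and columns here are hypothetical):
LOAD DATA
INFILE 'my_data.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(id, name, created_date DATE "YYYY-MM-DD")
You would then run it with something like sqlldr userid=your_user control=my_data.ctl.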

Related

How can I perform the same query on multiple tables in Redshift

I'm working in SQL Workbench in Redshift. We have daily event tables for customer accounts, the same format each day just with updated info. There are currently 300+ tables. For a simple example, I would like to extract the top 10 rows from each table and place them in 1 table.
Table name format is Events_001, Events_002, etc. Typical values are Customer_ID and Balance.
Redshift does not appear to support declaring variables, so I'm a bit stuck.
You've effectively invented a kind of pseudo-partitioning, where you manually partition the data by day.
To manually recombine the tables, create a view that unions everything together...
CREATE VIEW events_combined AS
SELECT 1 AS partition_id, * FROM events_001
UNION ALL
SELECT 2 AS partition_id, * FROM events_002
UNION ALL
SELECT 3 AS partition_id, * FROM events_003
-- and so on for every daily table
That's a hassle, though; you need to recreate the view every time you add a new table.
That's why most modern databases have partitioning schemes built in to them, so all the boiler-plate is taken care of for you.
But RedShift doesn't do that. So, why not?
In general because RedShift has many alternative mechanisms for dividing and conquering data. It's columnar, so you can avoid reading columns you don't use. It's horizontally partitioned across multiple nodes (sharded), to share the load with large volumes of data. It's sorted and compressed in pages to avoid loading rows you don't want or need. It has dirty pages for newly arriving data, which can then be cleaned up with a VACUUM.
So, I would agree with others that it's not normal practice. Yet, Amazon themselves do have a help page (briefly) describing your use case.
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-time-series-tables.html
So, I'd disagree with "never do this". Still, it is a strong indication that you've accidentally walked in to an anti-pattern and should seriously re-consider your design.
As others have pointed out, having many small tables in Redshift is really inefficient, and terrible if taken to the extreme. But that is not your question.
You want to know how to perform the same query on multiple tables from SQL Workbench. I'm assuming you are referring to SQL Workbench/J. If so, you can define variables in the bench and use these variables in queries. Then you just need to update the variable and rerun the query. Note that SQL Workbench/J doesn't offer any looping or scripting capabilities, so if you want to loop you will need to wrap the bench in a script (like a BAT file or a bash script).
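For example, a quick sketch using SQL Workbench/J's variable substitution (the WbVarDef syntax is the bench's own; the table and column names below are just illustrative):
WbVarDef current_table=events_001;

SELECT Customer_ID, Balance
FROM $[current_table]
LIMIT 10;

-- change the variable and re-run the same query for the next table
WbVarDef current_table=events_002;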
My preference is to write a Jinja template with the SQL in it, along with any looping and variable substitution. Then apply a JSON with the table names and presto, you have all the SQL for all the tables in one file. I just need to run this - usually with the psql CLI, but at times I import it into my bench.
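A minimal sketch of such a template (Jinja syntax; it assumes a tables list is supplied when rendering, for example from that JSON file):
{% for t in tables %}
SELECT {{ loop.index }} AS partition_id, * FROM {{ t }}
{% if not loop.last %}UNION ALL{% endif %}
{% endfor %}
Rendering it with {"tables": ["events_001", "events_002"]} and so on produces the combined query in one file.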
My advice is to treat Redshift as a query execution engine and use an external environment (Lambda, EC2, etc) for the orchestration of what queries to run and when. Many other databases (try to) provide a full operating environment inside the database functionality. Applying this pattern to Redshift often leads to problems. Use Redshift for what it is great at and perform the other actions elsewhere. In the end you will find that the large AWS ecosystem provides extended capabilities as compared to other databases, it's just that these aren't all done inside of Redshift.

Iterative union SQL query

I'm working with CA (Broadcom) UIM. I want the most efficient method of pulling distinct values from several views. I have views that start with "V_" for every QOS that exists in the S_QOS_DATA table. I specifically want to pull data for any view that starts with "V_QOS_XENDESKTOP."
The inefficient method that gave me quick results was the following:
select * from s_qos_data where qos like 'QOS_XENDESKTOP%';
Take that data and put it in Excel.
Use CONCAT to turn just the qos names into queries such as:
SELECT DISTINCT samplevalue, 'QOS_XENDESKTOP_SITE_CONTROLLER_STATE' AS qos
FROM V_QOS_XENDESKTOP_SITE_CONTROLLER_STATE union
Copy the formula cell down for all rows, remove UNION from the last query, and add a semicolon.
This worked and I got the output, but there has to be a more elegant solution. Most of the answers I've found about iterating in SQL use numbers or don't seem to be quite what I'm looking for. Examples: Multiple select queries using while loop in a single table? Is it Possible? and Syntax of for-loop in SQL Server
The most efficient method to do what you want to do is to do something like what CA's scripts do (the ones you linked to). That is, use dynamic SQL: create a string containing the SQL you want from system tables, and execute it.
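A sketch of that idea, assuming the UIM backend is SQL Server and the views live in the default schema (adjust accordingly if your UIM database runs on Oracle or MySQL):
DECLARE @sql NVARCHAR(MAX) = N'';

SELECT @sql = @sql
    + CASE WHEN @sql = N'' THEN N'' ELSE N' UNION ' END
    -- strip the leading V_ so the qos literal matches the name in S_QOS_DATA
    + N'SELECT DISTINCT samplevalue, ''' + STUFF(name, 1, 2, '') + N''' AS qos FROM ' + QUOTENAME(name)
FROM sys.views
WHERE name LIKE 'V[_]QOS[_]XENDESKTOP%';

EXEC sp_executesql @sql;
Concatenating into a variable like this is the usual quick-and-dirty pattern; STRING_AGG or a cursor would do the same job.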
A more efficient method would be to write a different query based on the underlying tables, mimicking the criteria in the views you care about.
Unless your view definitions are changing frequently, though, I recommend against dynamic SQL. (I doubt they change frequently. You regenerate the views no more frequently than you get a new script, right? CA isn't adding tables willy nilly.) AFAICT, that's basically what you're doing already.
Get yourself a list of the view names, and write your query against a union of them, explicitly. Job done: easy to understand, not much work to modify, and you give the server its best opportunity to optimize.
I can imagine that it's frustrating and error-prone not to be able to put all that work into your own view and query against it at your convenience. It's too bad most organizations don't let users write their own views and procedures (owned by their own accounts, not dbo). The best I can offer is to save what would be the view body to a file, and insert it into a WITH clause in your queries:
WITH V AS (... query ...) SELECT ... FROM V

How to read a text file using pl/sql and do string search for each line

I want to read a text file and store its data in a table (using a CLOB datatype). I then want to do string comparisons on the loaded data.
The loaded text file contains DDL scripts, and I want to work out from it which tables are new or modified, and likewise for indexes and constraints.
This can be done as Tom suggested in this Ask Tom article.
The challenge I'm facing is that I have to get the above details before running those scripts; otherwise I would have used a DDL trigger to audit schema changes.
My question is: is it feasible to do string comparison on large text, or is there a better alternative? Please share your views/ideas on this.
Example file
Create table table_one
Alter table table_two
create index index_table_one_idx table_one (column_one)
etc etc... 100s of statements
From the above code I want to get table_one and table_two as modified tables, and index_table_one_idx as a newly created index.
I want to achieve this by looking for the strings 'create table' and 'alter table' in the large text and extracting the table name with substring.
It is perfectly feasible to do string comparison on large text.
There are a couple of different approaches. One is to read the file line by line using UTL_FILE. The other would be to load it into a temporary CLOB and process it in chunks. The second way is probably the better option. Make sure to use the DBMS_LOB functions for string manipulation, because they will perform better. Find out more.
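As a rough sketch of the CLOB route (the ddl_scripts table and script_text column here are just assumptions about where the file has been loaded):
DECLARE
  l_clob  CLOB;
  l_pos   INTEGER := 1;
  l_chunk VARCHAR2(4000);
BEGIN
  -- LOWER() so the search is not tripped up by "Alter table" vs "alter table"
  SELECT LOWER(script_text) INTO l_clob FROM ddl_scripts WHERE ROWNUM = 1;
  LOOP
    l_pos := DBMS_LOB.INSTR(l_clob, 'alter table', l_pos, 1);
    EXIT WHEN NVL(l_pos, 0) = 0;
    -- read a small chunk from the match and take the third token as the table name
    l_chunk := DBMS_LOB.SUBSTR(l_clob, 200, l_pos);
    DBMS_OUTPUT.PUT_LINE(REGEXP_SUBSTR(l_chunk, '\S+', 1, 3));
    l_pos := l_pos + 1;
  END LOOP;
END;
/
Repeat with 'create table', 'create index' and so on, or drive the search terms from a small lookup table.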
Your main problem is one of specification. You need to isolate all the different starting points of SQL statements. If your script has just CREATE, ALTER and DROP then it's not too difficult, depending on how much subsequent detail you need (extract object type? extract object name? etc.) and what additional processing you need to do.
If your scripts contain DML statements the task becomes harder. If the DDL encompasses programmatic objects (TYPE, PACKAGE, etc) then it's an order of magnitude harder.

Can data and schema be changed with DB2/z load/unload?

I'm trying to find an efficient way to migrate tables with DB2 on the mainframe using JCL. When we update our application such that the schema changes, we need to migrate the database to match.
What we've been doing in the past is basically creating a new table, selecting from the old table into that, deleting the original and renaming the new table to the original name.
Needless to say, that's not a very high-performance solution when the tables are big (and some of them are very big).
With later versions of DB2, I know you can do simple things like altering column types, but we have migration jobs which need to do more complicated things to the data.
Consider for example the case where we want to combine two columns into one (firstname + lastname -> fullname). Never mind that it's not necessarily a good idea to do that, just take it for granted that this is the sort of thing we need to do. There may be arbitrarily complicated transformations to the data, basically anything you can do with a select statement.
My question is this. The DB2 unload utility can be used to pull all of the data out of a table into a couple of data sets (the load JCL used for reloading the data, and the data itself). Is there an easy way (or any way) to massage this output of unload so that these arbitrary changes are made when reloading the data?
I assume that I could modify the load JCL member and the data member somehow to achieve this but I'm not sure how easy that would be.
Or, better yet, can the unload/load process itself do this without having to massage the members directly?
Does anyone have any experience of this, or have pointers to redbooks or redpapers (or any other sources) that describe how to do this?
Is there a different (better, obviously) way of doing this other than unload/load?
As you have noted, SELECTing from the old table into the new table will have very poor performance. Poor performance here is generally due to the relatively high cost of insertion INTO the target table (index building and RI enforcement). The SELECT itself is generally not a performance issue. This is why the LOAD utility is generally preferred when large tables need to be populated from scratch: indices may be built more efficiently and RI may be deferred.
The UNLOAD utility allows unrestricted use of SELECT. If you can SELECT data using scalar and/or column functions to build a result set that is compatible with your new table's column definitions, then UNLOAD can be used to do the data conversion. Specify a SELECT statement in SYSIN for the UNLOAD utility. Something like:
//SYSIN DD *
SELECT CONCAT(FIRST_NAME, LAST_NAME) AS "FULLNAME"
FROM OLD_TABLE
/*
The resulting SYSRECxx file will contain a single column that is a concatenation of the two identified columns (the result of the CONCAT function), and SYSPUNCH will contain a compatible column definition for FULLNAME - the converted column name for the new table. All you need to do is edit the new table name in SYSPUNCH (this will have defaulted to TBLNAME) and LOAD it. Try not to fiddle with the SYSRECxx data or the SYSPUNCH column definitions - a goof here could get ugly.
Use the REPLACE option when running the LOAD utility to create the new table (I think the default is LOAD RESUME, which won't work here). Often it is a good idea to leave RI off when running the LOAD; this will improve performance and save the headache of figuring out the order in which LOAD jobs need to be run. Once finished you need to verify the RI and build the indices.
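As a very rough sketch of what the edited SYSIN for the LOAD might end up looking like (the schema, table and column layout are illustrative - the real column POSITION clauses come from the SYSPUNCH that UNLOAD generated, and ENFORCE NO is the option that defers RI checking):
//SYSIN DD *
  LOAD DATA REPLACE LOG NO ENFORCE NO
    INTO TABLE MYSCHEMA.NEW_TABLE
    (FULLNAME POSITION(*) VARCHAR)
/*
Afterwards, CHECK DATA clears the check-pending state left by ENFORCE NO, and REBUILD INDEX builds the indices.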
The LOAD utility is documented here
I assume that I could modify the load JCL member and the data member somehow to achieve this but I'm not sure how easy that would be.
I believe you have provided the answer within your question. As to the question of "how easy that would be," it would depend on the nature of your modifications.
SORT utilities (DFSORT, SyncSort, etc.) now have very sophisticated data manipulation functions. We use these to move data around, substitute one value for another, combine fields, split fields, etc. albeit in a different context from what you are describing.
You could do something similar with your load control statements, but that might not be worth the trouble. It will depend on the extent of your changes. It may be worth your time to attempt to automate modification of the load control statements if you have a repetitive modification that is necessary. If the modifications are all "one off" then a manual solution may be more expedient.

Dynamic SQL examples

I have recently learned what dynamic SQL is, and one of its most interesting features to me is that we can use dynamic column names and tables. But I cannot think of useful real-life examples. The only one that came to my mind is a statistical table.
Let's say that we have a table with name, type and created_data columns. Then we want a result whose columns are the years from the created_data column, and whose rows show, per type, the number of names created in each year. (sorry for my English)
What other useful real-life examples are there of using dynamic SQL with columns and tables as parameters? How do you use it?
Thanks for any suggestions and help :)
regards
Gabe
/edit
Thanks for the replies. I am particularly interested in examples that do not involve administrative tasks, database conversion or things like that; I am looking for examples where the equivalent code in, say, Java would be more complicated than using dynamic SQL in, for example, a stored procedure.
An example of dynamic SQL is to fix a broken schema and make it more usable.
For example if you have hundreds of users and someone originally decided to create a new table for each user, you might want to redesign the database to have only one table. Then you'd need to migrate all the existing data to this new system.
You can query the information schema for table names with a certain naming pattern or containing certain columns then use dynamic SQL to select all the data from each of those tables then put it into a single table.
INSERT INTO users (name, col1, col2)
SELECT 'foo', col1, col2 FROM user_foo
UNION ALL
SELECT 'bar', col1, col2 FROM user_bar
UNION ALL
...
Then hopefully after doing this once you will never need to touch dynamic SQL again.
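A sketch of how that statement could be generated (MySQL syntax; it assumes the per-user tables are all named user_<something> and share the same columns):
-- raise group_concat_max_len first if there are many tables
SELECT GROUP_CONCAT(
         CONCAT('SELECT ''', SUBSTRING(table_name, 6), ''' AS name, col1, col2 FROM ', table_name)
         SEPARATOR ' UNION ALL ')
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name LIKE 'user\_%';
The string it returns is the body of the INSERT above; execute it once (directly or via a prepared statement) and the migration is done.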
Long ago I worked on an application where users had their own tables in a common database.
Imagine that each user can create their own table in the database from the UI. To get access to the data in these tables, the developer needs to use dynamic SQL.
I once had to write an Excel import where the Excel sheet was not like a CSV file but laid out like a matrix. So I had to deal with an unknown number of columns for 3 temporary tables (columns, rows, "infield"). The rows were also a short form of tree. Sounds weird, but it was fun to do.
In SQL Server there was no way to handle this without dynamic SQL.
Another example comes from a situation I recently came up against: a MySQL database of about 250 tables, all using the MyISAM engine, with no database design schema, chart or other explanation at all - well, except the not-so-helpful table and column names.
To plan the conversion to InnoDB and find possible foreign keys, we either had to manually check all queries (and the conditions used in JOIN and WHERE clauses) created from the web frontend code, or write a script that uses dynamic SQL to check all combinations of columns with compatible datatypes and compare the data stored in those column combinations (and then manually accept or reject the candidates).
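A sketch of the kind of query that script was built around (MySQL; the filter on column names containing 'id' is just an assumption to keep the candidate list manageable):
SELECT a.table_name AS child_table,  a.column_name AS child_column,
       b.table_name AS parent_table, b.column_name AS parent_column
FROM information_schema.columns a
JOIN information_schema.columns b
  ON  a.data_type = b.data_type
  AND a.table_name <> b.table_name
WHERE a.table_schema = DATABASE()
  AND b.table_schema = DATABASE()
  AND b.column_name LIKE '%id%';
-- each candidate pair then feeds a generated query that compares the actual data
-- in the two columns before the pair is accepted or rejected manually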