I'm new to Oracle. I've worked with Microsoft SQL Server for years, though. I was brought into a project that was already overdue and over budget, and I need to be "the Oracle expert." I've got a table that has 14 million rows in it. And I've got a complicated query that updates the table. But I'm going to start with the simple queries. When I issue a simple update that modifies a small number of records (100 to maybe 10,000 or so) it takes no more than 2 to 5 minutes to table scan the table and update the affected records. (Less time if the query can use an index.) But if I update the entire table with:
UPDATE MyTable SET MyFlag = 1;
Then it takes 3 hours!
If the table scan completes in minutes, why should this take hours? I could certainly use some advice on how to troubleshoot this, since I don't have enough experience with Oracle to know what diagnostic queries to run. (I'm on Oracle 11g and using Oracle SQL Developer as the client.)
Thanks.
When you run an UPDATE in Oracle, the changes are recorded sequentially in the redo log, and the modified data blocks are written back to the data files later by the database writer (around checkpoints).
In addition, the old version of the data is copied into UNDO space to support possible transaction rollbacks and to give concurrent sessions a consistent view of the old data.
This all may take significantly more time than pure read operations which don't modify data.
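If you want to see what the statement is doing while it runs, a hedged starting point is to check how much undo the open transaction has generated. These are standard Oracle dynamic performance views (you need SELECT privileges on them); the username filter is a placeholder you would adjust:

-- How much undo has the running UPDATE generated so far?
SELECT s.sid,
       s.serial#,
       t.used_ublk AS undo_blocks_used,
       t.used_urec AS undo_records_used,
       t.start_time
FROM   v$transaction t
       JOIN v$session s ON s.taddr = t.addr
WHERE  s.username = 'MY_USER';  -- placeholder: your session's username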
I want to migrate more than 10 million rows of data from a source (Oracle) to a target (Oracle) in a shorter time using Oracle Data Integrator 12c. I have tried creating multiple scenarios, assigning each scenario a million records, and running a package of the 10 scenarios. The time was reduced, but is there any other way I can increase the performance of my ODI mapping with more than 10 million records?
I expect the mapping to execute in less time, for better performance.
Further to add: in ODI 12c you have the option to place source-side indexes. If the query is performing badly on the source side, you will find that option under the joins section of the physical tab; you can also deploy optimizer hints in the physical tab (a sketch of a parallel hint follows below). Please let me know if these work.
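To illustrate, this is roughly the shape of hint-driven load you could end up with; it is not ODI-generated code, and the table names, aliases, and degree of parallelism are all hypothetical:

ALTER SESSION ENABLE PARALLEL DML;  -- required for the INSERT side to run in parallel

INSERT /*+ APPEND PARALLEL(tgt, 8) */ INTO target_table tgt
SELECT /*+ PARALLEL(src, 8) */ src.*
FROM   source_table src;

COMMIT;  -- a direct-path (APPEND) insert must be committed before the table is queried again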
We are redesigning a very large (~100Gb, partitioned) table in SQL Server 2012.
For that we need to convert data from the old (existing) table into the newly designed table on the production server. The new table is also partitioned. Rows are only appended to the table.
The problem is a lot of users work on this server, and we can do this conversion process only in chunks and when the server is not under heavy load (a couple of hours a day).
I wonder if there is a better & faster way?
This time we will finish the conversion process in a few days (and then switch our application to use the new table), but what would we do if the table was 1Tb? Or 10Tb?
PS. More details on the current process:
The tables are partitioned based on the CloseOfBusinessDate column (DATE). Currently we run this query when the server is under low load:
INSERT INTO
NewTable
...
SELECT ... FROM
OldTable -- this SELECT involves xml parsing and CROSS APPLY
WHERE
CloseOfBusinessDate = #currentlyMigratingDate
Every day, about 1M rows from the old table get converted into 200M rows in the new table.
When we finish the conversion process we will simply update our application to use NewTable.
Everybody who took the time to read the question and tried to help me: I'm sorry, I didn't have enough details myself. It turns out the query that selects data from the old table and converts it is VERY slow (thanks to @Martin Smith, I decided to check the SELECT query). The query involves parsing XML and uses CROSS APPLY. I think the better way in our case would be to write a small application that simply loads the data from the old table for each day, converts it in memory, and then uses Bulk Copy to insert it into the new table.
I've written a SQL query that looks like this:
SELECT * FROM MY_TABLE WHERE ID=123456789;
When I run it in the Query Analyzer in SQL Server Management Studio, the query never returns; instead, after about ten minutes, I get the following error: System.OutOfMemoryException
My server is Microsoft SQL Server (not sure what version).
SELECT * FROM MY_TABLE; -- return 44258086
SELECT * FROM MY_TABLE WHERE ID=123456789; -- return 5
The table has over forty million rows! However, I need to fetch five specific rows!
How can I work around this frustrating error?
Edit: The server suddenly started working fine for no discernable reason, but I'll leave this question open for anyone who wants to suggest troubleshooting steps for anyone else with this problem.
According to http://support.microsoft.com/kb/2874903:
This issue occurs because SSMS has insufficient memory to allocate for large results.
Note: SSMS is a 32-bit process. Therefore, it is limited to 2 GB of memory.
The article suggests trying one of the following:
Output the results as text
Output the results to a file
Use sqlcmd
You may also want to check the server to see if it's in need of a service restart--perhaps it has gobbled up all the available memory?
Another suggestion would be to select a smaller subset of columns (if the table has many columns or includes large blob columns).
If you need specific data use an appropriate WHERE clause. Add more details if you are stuck with this.
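For instance, a small illustration of narrowing the query to just the rows and columns you need (the column names here are made up):

SELECT Id, CustomerName, CreatedDate
FROM   MY_TABLE
WHERE  ID = 123456789;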
Alternatively, write a small application that fetches the rows through a cursor instead of trying to load the entire result set into memory.
My company is cursed by a symbiotic partnership turned parasitic. To get our data from the parasite, we have to use a painfully slow odbc connection. I did notice recently though that I can get more throughput by running queries in parallel (even on the same table).
There is a particularly large table that I want to extract data from and move it into our local table. Running queries in parallel I can get data faster, but I also imagine that this could cause issues with trying to write data from multiple queries into the same table at once.
What advice can you give me on how to best handle this situation so that I can take advantage of the increased speed of using queries in parallel?
EDIT: I've gotten some great feedback here, but I think I wasn't completely clear on the fact that I'm pulling the data via a linked server (which uses the ODBC drivers). In other words, I can run normal INSERT statements, and I believe that would provide better performance than either SqlBulkCopy or BULK INSERT (actually, I don't believe BULK INSERT would even be an option).
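For example, something like this, where the linked server name, table, column names, and key range are all made up, and each parallel session would pull a different range:

INSERT INTO dbo.LocalTable WITH (TABLOCK)
SELECT col1, col2
FROM OPENQUERY(REMOTE_SRV, 'SELECT col1, col2 FROM RemoteTable WHERE id BETWEEN 1 AND 1000000');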
Have you read Load 1TB in less than 1 hour?
Run as many load processes as you have available CPUs. If you have 32 CPUs, run 32 parallel loads. If you have 8 CPUs, run 8 parallel loads.
If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. Also make sure all records belong to one partition if you want to use the switch partition strategy.
Use BULK insert instead of BCP if you are running the process on the SQL Server machine.
Use table partitioning to gain another 8-10%, but only if your input files are GUARANTEED to match your partitioning function, meaning that all records in one file must be in the same partition.
Use TABLOCK to avoid row at a time locking.
Use ROWS PER BATCH = 2500, or something near this if you are importing multiple streams into one table.
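As a rough sketch of several of those points combined (the file path, table name, and format options are hypothetical):

BULK INSERT dbo.TargetTable
FROM 'D:\loads\chunk_01.dat'
WITH (
    TABLOCK,                -- avoid row-at-a-time locking
    ROWS_PER_BATCH = 2500,  -- as suggested when importing multiple streams into one table
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);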
For SQL Server 2008, there are certain circumstances where you can utilize minimal logging for a standard INSERT SELECT:
SQL Server 2008 enhances the methods that it can handle with minimal logging. It supports minimally logged regular INSERT SELECT statements. In addition, turning on trace flag 610 lets SQL Server 2008 support minimal logging against a nonempty B-tree for new key ranges that cause allocations of new pages.
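A hedged sketch of what that looks like in practice (the table and column names are hypothetical; minimal logging for INSERT ... SELECT also generally requires a TABLOCK hint on the target and the simple or bulk-logged recovery model):

DBCC TRACEON (610);  -- optional, per the quoted guidance on nonempty B-trees

INSERT INTO dbo.NewTable WITH (TABLOCK)
SELECT col1, col2
FROM dbo.OldTable;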
If you're looking to do this in code, i.e. C#, there is the option to use SqlBulkCopy (in the System.Data.SqlClient namespace), and as this article suggests, it's possible to do this in parallel.
http://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
If by any chance you've upgraded to SQL 2014, you can insert in parallel (compatibility level must be 110). See this:
http://msdn.microsoft.com/en-us/library/bb510411%28v=sql.120%29.aspx
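Assuming this refers to the parallel SELECT ... INTO feature added in SQL Server 2014, a minimal sketch (the database and table names are hypothetical) would be:

ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 110;  -- 110 or higher is required

SELECT *
INTO   dbo.NewLocalCopy
FROM   dbo.SourceTable;  -- the insert side of SELECT ... INTO is now eligible for a parallel plan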
Does anybody know what the impact on an MSSQL 2008 database is of executing INSERT and DELETE statements for around 100,000 records per run, repeatedly over a period of time?
I heard from my client that in MySQL, for certain specific data types, loading and clearing the database over a period of time causes the data to become fragmented/corrupted. I wonder if this also happens with MS SQL? And what would be the possible impact on the database?
Right now the statements we use to load and reload the data in to all the tables in the database are simple INSERT and DELETE statements.
Please advise. Thank you in advance! :)
-Shen
The transaction log will likely grow due to all the inserts/deletes, and depending on the data being deleted/inserted and the table structure, there will likely be data fragmentation.
The data won't be 'corrupted' - if that is happening in MySQL, it sounds like a bug in that particular storage engine. Fragmentation shouldn't corrupt a database, but it does hamper performance.
You can combat this using a table rebuild, a table recreate or a reorganise.
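For example, a minimal sketch (the table name is hypothetical) that checks index fragmentation and then reorganises or rebuilds:

SELECT i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'LIMITED') AS ips
JOIN   sys.indexes AS i
  ON   i.object_id = ips.object_id AND i.index_id = ips.index_id;

ALTER INDEX ALL ON dbo.MyTable REORGANIZE;   -- lighter-weight option
-- ALTER INDEX ALL ON dbo.MyTable REBUILD;   -- heavier, removes fragmentation more thoroughly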
There's plenty of good info regarding fragmentation online. A good article is here:
http://www.simple-talk.com/sql/database-administration/defragmenting-indexes-in-sql-server-2005-and-2008/