I am trying to insert 1 million records into an Oracle table after performing some calculation on each row. 15 hours have gone by and it is still running. When I write a SELECT query on this table it shows nothing; I don't know where my inserted data is going on each insert.
So my question is: is there any way to check how many rows have been inserted so far during a long-running insert into an Oracle table? Thanks.
It depends on whether you are doing the insertion in SQL or PL/SQL. In PL/SQL you have your own ways to track the number of rows already processed; you can of course write your own instrumentation.
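For example, a minimal PL/SQL sketch (the source/target table names and the expected row count are assumptions, not from your code): count the rows yourself and publish progress through DBMS_APPLICATION_INFO, so it also shows up in V$SESSION_LONGOPS.
DECLARE
  l_rindex BINARY_INTEGER := DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS_NOCACHE;
  l_slno   BINARY_INTEGER;
  l_total  NUMBER := 1000000;  -- expected total rows (assumption)
  l_done   NUMBER := 0;
BEGIN
  FOR rec IN (SELECT * FROM source_table) LOOP  -- hypothetical source table
    -- ... your per-row calculation and INSERT INTO target_table here ...
    l_done := l_done + 1;
    IF MOD(l_done, 10000) = 0 THEN
      DBMS_APPLICATION_INFO.SET_SESSION_LONGOPS(
        rindex    => l_rindex,
        slno      => l_slno,
        op_name   => 'Load target_table',
        sofar     => l_done,
        totalwork => l_total,
        units     => 'rows');
    END IF;
  END LOOP;
  COMMIT;
END;
/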
Coming to SQL, I can think of two ways:
V$SESSION_LONGOPS
V$TRANSACTION
Most GUI-based tools have a nice graphical representation of the long operations view. You can query:
SELECT ROUND(SOFAR*100/TOTALWORK) AS pct_done
FROM V$SESSION_LONGOPS
WHERE USERNAME = '<username>'
AND TIME_REMAINING > 0;
The V$TRANSACTION view can tell you whether a transaction is still pending. If your INSERT has completed and a COMMIT has been issued, the transaction no longer appears there. You can join it with V$SESSION and query:
SELECT ...
FROM v$transaction t
INNER JOIN v$session s
ON t.addr = s.taddr;
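If you just want a rough idea of how many rows a running INSERT has written so far, here is a sketch (assuming you have access to the V$ views and a conventional-path insert): USED_UREC counts undo records, which for a plain INSERT is roughly proportional to the rows processed so far (indexes add extra undo records, and a direct-path INSERT bypasses this).
SELECT s.sid,
       s.username,
       t.start_time,
       t.used_ublk,  -- undo blocks consumed so far
       t.used_urec   -- undo records: a rough proxy for rows changed
FROM v$transaction t
INNER JOIN v$session s ON t.addr = s.taddr
WHERE s.username = '<username>';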
Related
I want to understand how transactions work in SQL, specifically in PostgreSQL.
Imagine I have a very large table (first_table), and that the query below takes about 2 seconds. I execute it via psql:
sudo -u postgres psql -f database/query.sql
This is the query:
TRUNCATE TABLE second_table;
INSERT INTO second_table (
foo1
,foo2
)
SELECT foo1
, foo2
FROM first_table;
What can happen if I execute another query that selects from second_table while the previous query is still executing? Notice the TRUNCATE TABLE at the start of the previous query.
Example:
SELECT * FROM second_table;
EDIT: I mean, would I get zero or non-zero records in the second query?
I mean, would I get zero or non-zero records in the second query?
Under reasonable transaction isolation levels, the database does not allow dirty reads, meaning no transaction can see changes from other transactions that have not yet been committed. (In PostgreSQL it is not even an option to turn that off, a very sensible choice in my book.)
That means that the second query will either see the contents of the table before the TRUNCATE, or it will see the new records added after the TRUNCATE. But it will not see something in between, i.e. it will not get an empty table (assuming there were records in the table before the TRUNCATE) and it will not see an incomplete half of the new records (or even a weird mix).
If you say that the second query returns before the first query has committed, then it will have seen the state of the table before any changes from the first query have been applied.
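As a hedged sketch of making that explicit: the all-or-nothing behaviour described above relies on the TRUNCATE and the INSERT committing together. With psql -f and no explicit transaction, each statement autocommits on its own, so wrapping query.sql in a transaction (or running psql with -1/--single-transaction) keeps a concurrent SELECT from catching the table empty between the two statements:
BEGIN;

TRUNCATE TABLE second_table;

INSERT INTO second_table (foo1, foo2)
SELECT foo1, foo2
FROM first_table;

COMMIT;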
Let's say we have a table bookings which contains a billion records. We write a simple SELECT query to select some records from this table with some WHERE clauses (it doesn't matter what is in the WHERE clause; the query takes several seconds). After executing this SELECT query (and before it finishes), we then insert a record into our bookings table (this record satisfies the WHERE clause of the first SELECT query).
The question: "Will this new record be selected when first SELECT query finishes its work?"
Preferably I want answer about PostgreSQL case, but would be glad to hear about how MySQL, SQL Server and others would behave in such a situation.
Thanks.
Will this new record be selected when the first SELECT query finishes its work?
No.
Every statement is an atomic operation, and sees a consistent state (=snapshot) of the database as it was at the moment when the statement started.
The above applies to Postgres and Oracle (maybe to other DBMSs as well, but I can't say for sure; some support dirty reads, where this wouldn't be guaranteed).
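A small sketch of that snapshot behaviour in PostgreSQL (the bookings columns here are hypothetical), run from two separate sessions:
-- Session 1: start the slow SELECT.
SELECT count(*) FROM bookings WHERE status = 'confirmed';

-- Session 2: insert a matching row while session 1 is still running.
INSERT INTO bookings (status) VALUES ('confirmed');

-- Session 1's count does not include the new row, because the statement
-- keeps using the snapshot taken when it started; running the SELECT
-- again afterwards does include it.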
Is there a way to define the maximum size per table or the maximum number of rows in a table? (for a SELECT INTO new_table operation)
I was working with SELECT INTO and JOINs on tables with approximately 70 million rows, and I made a mistake in the ON condition. As a consequence, the result of this join operation created a table larger than the database size limit. The DB crashed and went into recovery mode (which lasted for 2 days).
I would like to know how to avoid this kind of problem in the future. Is there any "good manners manual" when working with huge tables? Any kind of pre-defined configuration to prevent this problem?
I don't have the code, but as I said, it was basically a left join with the result inserted into a new table through SELECT INTO.
PS: I don't have much experience with SQL Server or any other relational database.
Thank you.
SET ROWCOUNT 10000 would have made it so that no more than 10,000 rows were inserted. However, while that can prevent damage, it would also mask the mistake you made in your SQL query.
I'd say that before running any SELECT INTO, I would do a SELECT COUNT(*) to see how many rows my JOIN and WHERE clauses produce. If the count is huge, or if it takes hours even to come up with a count, that's your warning sign.
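A sketch of that in T-SQL (table and column names are illustrative, not from your schema): check the count first, and cap the materialized result with TOP while you are still validating the ON condition.
-- Verify the join condition and the expected size first.
SELECT COUNT(*)
FROM big_a a
LEFT JOIN big_b b ON b.key_col = a.key_col;

-- Then materialize, with an upper bound as a safety net during testing.
SELECT TOP (10000) a.*, b.other_col
INTO new_table
FROM big_a a
LEFT JOIN big_b b ON b.key_col = a.key_col;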
I have a program in Qt which first inserts values into a table with an auto-increment column, "Table A". It then joins the values of Table A with Table B to insert into Table C (I need to use the recently created IDs for Table C).
My issue is that the program starts inserting into Table C before the inserts on Table A are finished, hence the number of records in Table C is incomplete.
If I run the insert on Table A manually, then wait and insert into Table C, all the records are present, so the queries are correct.
CODE
_query.exec("Insert into tableA values(...........)"); --This inserts 3000 records
_query.exec("Insert into tableC (select * from tableA)"); --This inserts 2500 values
If I comment out the second query in the code and execute it:
_query.exec("Insert into tableA values(...........)"); --This inserts 3000 records
//_query.exec("Insert into tableC (select * from tableA)");
Then run the query on pgphpadmin
Insert into tableC (select * from tableA) --This inserts 3000 records
It does insert the 3000 records.
My guess is that the program doesn't wait for the first insert to finish before continuing to the next query.
My guess is that the program doesn't wait for the first insert to finish before continuing to the next query.
As of Qt 5.3, the QPSQL Postgres driver uses the synchronous PQexec() call to implement QSqlQuery::exec(), so it's impossible for it to return before a query is entirely finished.
Besides, even if it used an asynchronous call like PQsendQuery(), this situation would still not be possible, because the same connection cannot start a new SQL statement until it's done with the previous one.
Your program would need to have multiple open connections in multiple threads to submit parallel queries. That's not the kind of thing that happens inadvertently.
And even if it had these, the effect of a half-finished INSERT cannot be seen from any other connection because of the isolation properties of transactions. The second INSERT would either see no rows at all from the first INSERT if it is not yet finished/committed, or all the rows if it is.
The rare situation when half-loaded data may be seen from outside the server is with COPY FREEZE, but that's not INSERT.
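To illustrate the isolation point above, a sketch with hypothetical single-column tables (tableA(name) and tableC(name)), run from two separate connections:
-- Connection 1:
BEGIN;
INSERT INTO tableA (name) VALUES ('alpha'), ('beta');  -- not committed yet

-- Connection 2, meanwhile:
INSERT INTO tableC SELECT * FROM tableA;  -- copies 0 rows from connection 1

-- Connection 1:
COMMIT;

-- Connection 2, afterwards:
INSERT INTO tableC SELECT * FROM tableA;  -- now copies both rows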
I kept investigating, and a piece of my code used multiple threads. There was an issue with the threads: the program wasn't waiting for all of them to complete. That is why it seemed that the program was not waiting long enough.
Thanks for your help
I have a table that is used for reporting purposes, and data is inserted each time a user runs a report from the web. The insert can vary from a single row to several thousand depending on the parameters of the report. The SELECT statement used for the insert can run for up to 60 seconds. It has been optimized, but due to the complexity of the database I cannot tune it further. My question is: when is the table locked for the insert? Is it when the stored procedure is called, when the SELECT statement starts executing, or when the SELECT statement finishes executing? I would like to limit the time the table is locked so other users are not affected when a large report, up to 50,000 rows, is run.
Example:
INSERT INTO reportTable
SELECT a.columnA,
       b.columnB
FROM tableA a
INNER JOIN tableB b
ON b.ident = a.bIdent;
Thank you
Just run this:
ALTER DATABASE [<dbname>] SET READ_COMMITTED_SNAPSHOT ON;
And stop worrying about insert locks blocking reports. See Choosing Row Versioning-based Isolation Levels.
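To confirm the setting took effect afterwards (same <dbname> placeholder as above):
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = '<dbname>';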