SQL_ID, CBO, Optimizer_Mode and Plan Change History - sql

This is a 4 part question
What is the logic behind SQL_ID. . . Does the value change for the same SQL over time? Does it persists between DB Restarts? Or every plan change gives a new SQL_ID?
How can i check the plan change history for a particular query? Given the SQL_ID i tried querying dba_hist_sqlstat table but it does not give the time of plan change and other details so as to be able to match with the v$sql_plan table.
I have the parameter optimizer_mode set to FIRST_ROWS. Even then when is see the table dba_hist_sqlstat, it indicate ALL_ROWS for some SQL_IDs . . . Can oracle disregard the instance level parameter to use what it deems most suitable?
Between 8PM and 2 PM a query was performing badly. Taking 6 seconds for its execution. After 3 PM the query started responding in < 1 Second. I have the AWR report for the periods that shows this detail. There was no difference in load on the DB in these 2 periods. How could i arrive at the root of this? I am trying to find the History of the plan change but appreciate more feedback to best analyze such issues.
The DB Version is Oracle 10.1.0.4 running on AIX 5.3 64b

1.- SQLID is calculated with a hash function based on the actual SQL text, it shouldnt change with restart or between databases at least the same versions, different oracle versions could have different hashing functions right?, so as long as you do not change the sql text (this includes blanks commas and everything) SQLID will remain the same.
2.- Use DBMS_XPLAN.DISPLAY_AWR to display all plans for a SQL_ID: select * from table(dbms_xplan.display_awr(sql_id => '[your SQL_ID]'));
3.- Oracle will only do this if a query has an OPTIMIZER GOAL hint in it.
4.- There are many things in play for this one. I would start by looking at top 5 timed events in AWR for both periods of time. If they are alike, I would then go investigate the PLAN history for the statement, see if it changed during periods and how data behaved during the periods as well. One of these three should give you the answer.

Related

Problems with TempDb on the SQL Server

I got some problems with my SQL Server. Some external queries write into the Temp db and every 2-3 days it is full and we have to restart the SQL database. I got who is active on it. And also we can check monitor it over grafana. So I get a exact time when the query starts to write a lot of data into the temp db. Can someone give me a tip on how I can search for the user when I get the exact time?
select top 40 User_Account, start_date, tempdb_allocations
from Whoisactive
order by tempdb_allocation, desc
where start_date between ('15-02-2023 14:12:14.13' and '15-02-2023 15:12:14.13')
User_Account
Start_Date
tempdb_allocations
kkarla1
15-02-2023 14:12:14.13
12
bbert2
11-02-2023 12:12:14.13
0
ubert5
15-02-2023 15:12:14.13
888889
I would add this as a comment but I don’t have the necessary reputation points.
At any rate - you might find this helpful.
https://dba.stackexchange.com/questions/182596/temp-tables-in-tempdb-are-not-cleaned-up-by-the-system
It isn’t without its own drawbacks but I think that if the alternative is restarting the server every 2 or 3 days this may be good enough.
It might also be helpful if you add some more details about the jobs that are blowing up your tempdb.
Is this problematic job calling your database once a day? Once a minute? More?
I ask because if it’s more like once a day then I think the answer in the link is more likely to be helpful.

How to get a count of the number of times a sql statement has executed in X hours?

I'm using oracle db. I want to be able to count the number of times that a SQL statement was executed in X hours. For instance, how many times has the statement Select * From ExampleTable been executed in the past 5 hours?
I tried looking in V$SQL, V$SQLSTATS, V$SQLAREA, but they only keep a record of a statement's total amount of executions. They don't store what times the individual executions occurred. Is there any view I missed, or something else that does keep track of each individual statement execution + timestamp so that I can query by which have occurred X hours ago? Thanks for the help.
The views in the Active Workload Repository store historical SQL execution information, specifically the view DBA_HIST_SQLSTAT.
The view is not perfect; it contains a summary of the top SQL statements. This is almost perfect information for performance tuning - in practice, sampling will catch any performance problem. But if you're looking for a perfect record of every SQL execution, as far as I know the only way to get that information is through tracing, which is buggy and slow.
Hopefully this query is good enough:
select begin_interval_time, end_interval_time, executions_delta, dba_hist_sqlstat.*
from dba_hist_sqlstat
join dba_hist_snapshot
on dba_hist_sqlstat.snap_id = dba_hist_snapshot.snap_id
and dba_hist_sqlstat.instance_number = dba_hist_snapshot.instance_number
order by begin_interval_time desc, sql_id;
Apologies for putting this in an answer instead of a comment (I don't have the required reputation), but I think you may be out of luck. Here is an AskTOM asking basically the same question: AskTOM. Tom says unless you are using ASH that just isn't something the database is designed to do.

Is there a way to check if a query hit a cached result or not in BigQuery?

We are performance tuning, both in terms of cost and speed, some of our queries and the results we get are a little bit weird. First, we had one query that did an overwrite on an existing table, we stopped that one after 4 hours. Running the same query to an entirely new table and it only took 5 minutes. I wonder if the 5 minute query maybe used a cached result from the first run, is that possible to check? Is it possible to force BigQuery to not use cache?
If you run query in UI - expand Options and make sure Use Cached Result properly set
Also, in UI, you can check Job Details to see if cached result was used
If you run your query programmatically - you should use respective attributes - configuration.query.useQueryCache and statistics.query.cacheHit

mysqldumpslow: What does these fields indicate..?

Recently we have started on optimizing live slow queries. As part of that, we thought to use mysqldumpslow to prioritize slow queries. I am new to this tool. I am able to understand some basic info, but I would like to know what exactly the below fields in the out put will tell us.
OUTPUT: Count: 6 Time=22.64s (135s) Lock=0.00s (0s) Rows=1.0 (6)
What about the below fields ?
Time : Is it the average time taken of all these 6 times of occurance...?
135s : What is this 135 seconds....?
Rows=1.0 (6): again what does this mean...?
I didn't find a better explanation. Really thanks in advance.
Regards,
UDAY
I made a research for that coz i wanted to know that too.
I have a log from a pretty highly used DB server.
The command mysqldumpslow has several optional parameters (https://dev.mysql.com/doc/refman/5.7/en/mysqldumpslow.html), including sort by (-s)
thanks to many queries I can work with, I can tell, that:
value before brackets represents an average value from all the same queries within to group ('count' in total) and the value within brackets is the maximum value of one of the queries. Meaning, in your case:
you have a query that was called 6 times, it is executed within 22.64 seconds (average), but once it took about 135 seconds to execute it. The same applies for locks (if provided) and rows. So most of the time it returns about one row, however it returned 6 rows at least once

long running queries: observing partial results?

As part of a data analysis project, I will be issuing some long running queries on a mysql database. My future course of action is contingent on the results I obtain along the way. It would be useful for me to be able to view partial results generated by a SELECT statement that is still running.
Is there a way to do this? Or am I stuck with waiting until the query completes to view results which were generated in the very first seconds it ran?
Thank you for any help : )
In general case the partial result cannot be produced. For example, if you have an aggregate function with GROUP BY clause, then all data should be analysed, before the 1st row is returned. LIMIT clause will not help you, because it is applied after the output is computed. Maybe you can give a concrete data and SQL query?
One thing you may consider is sampling your tables down. This is good practice in data analysis in general to get your iteration speed up when you're writing code.
For example, if you have table create privelages and you have some mega-huge table X with key unique_id and some data data_value
If unique_id is numeric, in nearly any database
create table sample_table as
select unique_id, data_value
from X
where mod(unique_id, <some_large_prime_number_like_1013>) = 1
will give you a random sample of data to work your queries out, and you can inner join your sample_table against the other tables to improve speed of testing / query results. Thanks to the sampling your query results should be roughly representative of what you will get. Note, the number you're modding with has to be prime otherwise it won't give a correct sample. The example above will shrink your table down to about 0.1% of the original size (.0987% to be exact).
Most databases also have better sampling and random number methods than just using mod. Check the documentaion to see what's available for your version.
Hope that helps,
McPeterson
It depends on what your query is doing. If it needs to have the whole result set before producing output - such as might happen for queries with group by or order by or having clauses, then there is nothing to be done.
If, however, the reason for the delay is client-side buffering (which is the default mode), then that can be adjusted using "mysql-use-result" as an attribute of the database handler rather than the default "mysql-store-result". This is true for the Perl and Java interfaces: I think in the C interface, you have to use an unbuffered version of the function that executes the query.