MySQL Performance inquiry

I have moved my Drupal site from one MySQL server to another.
The old machine has 1 CPU and 1 GB of RAM; the new machine has 4 CPUs and 4 GB of RAM.
I see a huge negative difference in performance with this query (2 minutes vs. 2 seconds):
SELECT DISTINCT c.client
FROM client_table c
LEFT JOIN reps r ON (c.client = r.client)
WHERE r.user_id IS NULL
  AND c.client NOT IN (SELECT DISTINCT client FROM billing WHERE first_purchase = 1)
Server variables that differ between the two machines:
+----------------------------+----------------------+------------------------+
| Variable                   | NEW                  | OLD                    |
+----------------------------+----------------------+------------------------+
| connect_timeout            | 10                   | 5                      |
| have_federated_engine      | DISABLED             | YES                    |
| max_connections            | 100                  | 400                    |
| max_seeks_for_key          | 18446744073709551615 | 4294967295             |
| max_write_lock_count       | 18446744073709551615 | 4294967295             |
| myisam_max_sort_file_size  | 9223372036853727232  | 2147483647             |
| max_binlog_cache_size      | 18446744073709547520 | 4294967295             |
| myisam_recover_options     | BACKUP               | OFF                    |
| range_alloc_block_size     | 4096                 | 2048                   |
| table_cache                | 457                  | 307                    |
| version                    | 5.0.67-0ubuntu6-log  | 5.0.51a-3ubuntu5.4-log |
| version_compile_machine    | x86_64               | i486                   |
| relay_log                  |                      | (not present)          |
| relay_log_index            |                      | (not present)          |
| relay_log_info_file        | relay-log.info       | (not present)          |
| innodb_adaptive_hash_index | ON                   | (not present)          |
+----------------------------+----------------------+------------------------+
Any ideas on how to identify what is causing the problem or how to fix it?

You might need to rebuild your indexes on the new instance.

Make triple-sure you've rebuilt your indices; they don't really carry over.
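For instance, a minimal sketch using the table names from the query (ANALYZE refreshes index statistics; OPTIMIZE rebuilds the table and its indexes, which can be slow on large tables):

ANALYZE TABLE client_table, reps, billing;
OPTIMIZE TABLE client_table, reps, billing;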

Try using the MySQL Query Profiler.
I would profile in both environments.
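A minimal sketch of using it (SHOW PROFILE exists from MySQL 5.0.37 on, so it should be available on both of these servers):

SET profiling = 1;
-- run the slow query from the question here
SHOW PROFILES;            -- lists recent statements with their durations
SHOW PROFILE FOR QUERY 1; -- per-stage timing breakdown for statement 1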
So how do you go about analyzing database performance? There are three forms of performance analysis that are used to troubleshoot and tune database systems:
Bottleneck analysis - focuses on answering the questions: What is my database server waiting on; what is a user connection waiting on; what is a piece of SQL code waiting on?
Workload analysis - examines the server and who is logged on to determine the resource usage and activity of each.
Ratio-based analysis - utilizes a number of rule-of-thumb ratios to gauge the performance of a database, user connection, or piece of code.
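In MySQL terms, the quickest bottleneck check is to compare the execution plan on both servers. A sketch using the query from the question:

EXPLAIN
SELECT DISTINCT c.client
FROM client_table c
LEFT JOIN reps r ON (c.client = r.client)
WHERE r.user_id IS NULL
  AND c.client NOT IN (SELECT DISTINCT client FROM billing WHERE first_purchase = 1);

If the key column is NULL on the new server but not on the old one, missing or unused indexes are the likely culprit.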

Related

How to join between table DurationDetails and Table cost per program

How do I design a database for a tourism company to calculate the cost of flight and hotel for every tour program, based on date?
What I have done so far is:
Table - program
+-----------+-------------+
| ProgramID | ProgramName |
+-----------+-------------+
| 1         | Alexia      |
| 2         | Amon        |
| 3         | Sfinx       |
+-----------+-------------+
Every program has more than one duration; it has exactly two possible periods, 8 days or 15 days.
So I created a ProgramDuration table with a one-to-many relationship to program.
Table - ProgramDuration
+------------+-----------+---------------+
| DurationNo | programID | Duration      |
+------------+-----------+---------------+
| 1          | 1         | 8 for Alexia  |
| 2          | 1         | 15 for Alexia |
+------------+-----------+---------------+
The same applies to the Amon and Sfinx programs (8 and 15 days each).
Every 8- or 15-day program has fixed details for every day, as follows:
Table - DurationDetails
+------+--------+--------------------+-------------------+
| Days | Hotel  | Flight             | Transfers         |
+------+--------+--------------------+-------------------+
| Day1 | Hilton | Amsterdam to Luxor | airport to hotel  |
| Day2 | Hilton |                    | Abu Simbel museum |
| Day3 | Hilton |                    |                   |
| Day4 | Hilton |                    |                   |
| Day5 | Hilton | Luxor to Amsterdam |                   |
+------+--------+--------------------+-------------------+
Every program's start is determined by the flight date, so if the flight date is 25/06/2017 for the 8-day Alexia program, the costs per day will be as follows:
+------------+-------+--------+----------+
| Date       | Hotel | Flight | Transfer |
+------------+-------+--------+----------+
| 25/06/2017 | 25    | 500    | 20       |
| 26/06/2017 | 25    |        | 55       |
| 27/06/2017 | 25    |        |          |
| 28/06/2017 | 25    |        |          |
| 29/06/2017 | 25    | 500    |          |
+------------+-------+--------+----------+
This is actually what I need: how do I create the relationships to join costs with a program, for the flight and hotel costs shown above?
For the 5 days the total cost will be 1200:
25 is the cost per day for the Hilton hotel,
500 is the cost per flight,
20 and 55 are the costs per transfer.
The image shows what I need: the relation between duration and cost.
Truthfully, I don't fully understand exactly what you're trying to accomplish. Your description is not clear, your tables seem to be missing some information and to contain information that should not be there, and the way I'm reading your description doesn't really make sense based on the UI screenshot that you shared.
It looks like you're working on an application for a travel agency which will allow agents to create an itinerary for a trip. They can give this trip a name (so if a particular package is a hit with customers, they can just offer the "Alexia" package), and the utility will calculate the total estimated cost of the trip. If I understand correctly, the trips will be either 8 or 15 days long.
Personally, I would delete the "ProgramDuration" table altogether. If there are two versions of the Alexia trip at index 1, then you're going to run into all manner of issues. I can get into the details of why this is a bad idea, but unless you're really hung up on having this ProgramDuration table, it's not worth the time. You should add a "duration" field to your "program" table, and assign a new ProgramID to each different-duration version of the "Alexia" program.
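A minimal sketch of what I mean (MySQL syntax; column names and types are my assumptions):

CREATE TABLE program (
    ProgramID    INT PRIMARY KEY,
    ProgramName  VARCHAR(50),
    DurationDays INT  -- 8 or 15; one row per duration variant
);

-- "Alexia" becomes two rows, one per duration:
INSERT INTO program VALUES (1, 'Alexia', 8), (2, 'Alexia', 15);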
Your table "Duration details" also misses the mark. Your fields in this table will make it harder to add new features to your application down the line. You should have a field "ProgramID," which we will use to join this table against the program table later. You should have a field "Day" which obviously indicates the day in the itinerary. You should have only one more field "ItemID." We're going to use the "ItemID" field to join your itinerary against a new items table we're going to create.
Your items table is where you define all of the items that can possibly appear in an itinerary. Your current itinerary table has three possible "types" of expenses: flights, hotels, and transfers. What if your travel agents want to start adding meal expenditures to their itineraries and budgets? What about activities that cost money? What about currency exchange fees? What about items that your clientele will need before their trip (wall adapters, luggage, etc.)? In your items table, you will have fields for an ItemID, ItemName, ItemUnitPrice, and ItemType. A possible item is as follows:
ItemID: 1, ItemName: Night At The Hilton, ItemUnitPrice: 300, ItemType: Lodging
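A sketch of the two tables described above (types and sizes are my assumptions):

CREATE TABLE items (
    ItemID        INT PRIMARY KEY,
    ItemName      VARCHAR(100),
    ItemUnitPrice DECIMAL(10,2),
    ItemType      VARCHAR(30)
);

CREATE TABLE DurationDetails (
    ProgramID INT,  -- joins against the program table
    Day       INT,  -- day number within the itinerary
    ItemID    INT   -- joins against the items table
);

INSERT INTO items VALUES (1, 'Night At The Hilton', 300.00, 'Lodging');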
Using the "SELECT [Column] AS [Alias]" syntax with some CTEs or subqueries and the JOIN operator, we can easily reconstitute a table that looks like your "Program Duration Details" table, but we will be afforded considerably more flexibility to add or remove things later down the line.
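For example, a hypothetical pair of queries against the tables sketched above (a day can carry several items, so the sum gives the trip's total estimated cost):

SELECT d.Day, i.ItemName, i.ItemType, i.ItemUnitPrice
FROM DurationDetails d
JOIN items i ON i.ItemID = d.ItemID
WHERE d.ProgramID = 1
ORDER BY d.Day;

SELECT SUM(i.ItemUnitPrice) AS TotalCost
FROM DurationDetails d
JOIN items i ON i.ItemID = d.ItemID
WHERE d.ProgramID = 1;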
In the interests of security and programmability, I would also add a table called "ItemTypeTable" with a single field "TypeName." You can use this table to prevent unauthorized users from defining new item types, and you can use it to create drop-down menus, navigation, and all manner of other useful features. There might be cleaner implementations, but this shouldn't represent a serious performance or size hit.
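A sketch of that lookup table and the constraint tying items to it (MySQL/InnoDB syntax; names are again my assumptions):

CREATE TABLE ItemTypeTable (
    TypeName VARCHAR(30) PRIMARY KEY
);

INSERT INTO ItemTypeTable VALUES ('Lodging'), ('Flight'), ('Transfer');

ALTER TABLE items
    ADD FOREIGN KEY (ItemType) REFERENCES ItemTypeTable (TypeName);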
All in all, at the risk of being somewhat rude, it seems like you're trying to take on a rather large, sophisticated task with a very rudimentary understanding of basic relational database design and implementation. If you are doing this in a professional context, I would strongly encourage you to consult with another professional who may be more experienced in this area.

BigQuery pricing for temporary storage

Background
I am loading data into BQ. Each time I load data into BQ using the schema auto-detect option, BQ creates a table with a schema. Then I download the auto-generated schema and delete the created table.
I need to do this task a few times on a scheduled basis.
I have read in the documentation that storing 100 MB for a month costs some amount, and I have read that loading data is free.
Query
Will this storage cost me anything?
Apart from storage, will I be charged for this activity?
I need your suggestions on these!
There is no such thing as "temporary storage" in BigQuery pricing. As you say, you will pay for storage for the time that you keep the data. There is a cost if you do streaming inserts, but if you do a load job from a file there is no cost for the bandwidth; you just pay the storage price as normal. You also don't pay for cached queries.
The following table summarizes BigQuery pricing. BigQuery's quota policy applies for these operations.
+---------------------+-------------------------+-----------------------------------------------------------------+
| Action | Cost | Notes |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Storage | $0.02 per GB, per month | See Storage pricing. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Long Term Storage | $0.01 per GB, per month | See Long term storage pricing. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Streaming Inserts | $0.01 per 200 MB | See Storage pricing. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Queries | $5 per TB | First 1 TB per month is free, subject to query pricing details. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Loading data | Free | See Loading data into BigQuery. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Copying data | Free | See Copying an existing table. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Exporting data | Free | See Exporting data from BigQuery. |
+---------------------+-------------------------+-----------------------------------------------------------------+
| Metadata operations | Free | List, get, patch, update and delete calls. |
+---------------------+-------------------------+-----------------------------------------------------------------+
All this is explained better on the official page: https://cloud.google.com/bigquery/pricing
You can check out your current usage on this page:
https://console.cloud.google.com/billing/unbilledinvoice
And there is a pricing calculator here:
https://cloud.google.com/products/calculator/
Update
Storage pricing is prorated per MB, per second. For example, if you store:
100 MB for half a month, you pay $0.001 (a tenth of a cent)
500 GB for half a month, you pay $5
1 TB for a full month, you pay $20
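The arithmetic behind those numbers (a rough check, treating 1 TB as 1,000 GB and 100 MB as 0.1 GB):

0.1 GB  x 0.5 month x $0.02 per GB-month = $0.001
500 GB  x 0.5 month x $0.02 per GB-month = $5
1000 GB x 1 month   x $0.02 per GB-month = $20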

CQRS and race conditions: how to handle race requirements

While there are articles saying that race conditions do not occur in the business world and that a business solution is what we should look for, I am not sure that is the case here.
I have a capacity constraint and am doing event ticketing. When demand for an event is high, many concurrent booking commands arrive within the same microsecond. The traditional way to handle this is to use locking to prevent race conditions; otherwise we end up selling tickets for seats that are not available, which is a strict business no-no.
The table below shows the sequence of steps that occur concurrently:
+------+----------------+----------+-----------+-----------------+-----------------+
| Time | Total Capacity | Consumed | Available | Customer 1      | Customer 2      |
+------+----------------+----------+-----------+-----------------+-----------------+
| 1    | 100            | 99       | 1         | seat available? |                 |
| 2    |                |          |           | apply           | seat available? |
| 3    |                |          |           | event handle    | apply           |
| 4    |                | 100      | 0         | update state    | event handle    |
| 5    |                | 101      | -1        |                 | update state    |
+------+----------------+----------+-----------+-----------------+-----------------+
If "selling tickets for seats that are not available" is a strict business no-no, then model it that way. What this requirement tells you is that "selling/reserving a seat" and "number of seats available" should end up in the same transaction and be consistent. You can't take a reservation and then fire an event to change the number of available seats; it has to happen in a single transaction. This way, when you try to decrease the "number of seats available" (Time 5 in your table) you will receive an optimistic concurrency exception, because someone modified it in the meantime. You can then process the command again, and this time the number of available seats has been exhausted, so you can publish an "application/reservation rejected" event and notify the user.
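If the availability is kept in a relational table, a minimal sketch of the idea in SQL (table and column names are my assumptions; the version check is what surfaces the conflict):

-- read first: SELECT consumed, version FROM seats_availability WHERE event_id = 42;
-- then reserve, guarding on the version you read:
UPDATE seats_availability
SET    consumed = consumed + 1,
       version  = version + 1
WHERE  event_id = 42
  AND  version  = 7                  -- the version read earlier
  AND  consumed < total_capacity;

-- 0 rows affected means either a concurrent writer got there first (retry)
-- or capacity is exhausted (publish the "reservation rejected" event).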
Project "a CQRS Journey" is something you should have a look at:
The reference implementation will be a conference management system
that you will be able to easily deploy and run in your own
environment. This will enable you to explore and experiment with a
realistic application built following a CQRS-based approach.
Especially have a look at SeatsAvailability.MakeReservation and SeatsAvailabilityHandler.Handle(MakeSeatReservation command)

Doing BULK Insert SUSPENDED With Wait Type LCK_M_RIn_LN

I'm having awful problems doing a BULK insert. I'm actually using SqlBulkCopy to insert a number of rows into a table. At first, I would get a timeout exception, so I set SqlBulkCopy's BulkCopyTimeout to a ridiculous[?] 1800 seconds. The exception wouldn't be thrown (yet). So I checked the Activity Monitor (as suggested here: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. The statement has been terminated) in SQL Server Management Studio and saw that my BULK INSERT's task status is SUSPENDED with a wait type of LCK_M_RIn_LN. My code goes like this:
Using sqlCon As SqlConnection = connection.Connect()
    ' The transaction must be created on this same connection:
    Dim sqlTran As SqlTransaction = sqlCon.BeginTransaction()
    ' Note: the option flags must be combined with Or, not And
    ' (And-ing these bit flags yields SqlBulkCopyOptions.Default):
    Dim sqlBulkCopy As New SqlBulkCopy(sqlCon, SqlBulkCopyOptions.CheckConstraints Or
                                               SqlBulkCopyOptions.FireTriggers Or
                                               SqlBulkCopyOptions.KeepNulls Or
                                               SqlBulkCopyOptions.KeepIdentity, sqlTran)
    sqlBulkCopy.BulkCopyTimeout = 1800 ' is this ridiculous?
    sqlBulkCopy.BatchSize = 1000
    sqlBulkCopy.DestinationTableName = destinationTable
    sqlBulkCopy.WriteToServer(dataTableObject)
    sqlTran.Commit()
End Using
I have been searching the web for solutions, but to no avail, although I have found this definition of LCK_M_RIn_LN:
Occurs when a task is waiting to acquire a NULL lock on the current key value, and an Insert Range lock between the current and previous key. A NULL lock on the key is an instant release lock. For a lock compatibility matrix, see sys.dm_tran_locks (Transact-SQL).
from http://msdn.microsoft.com/en-us/library/ms179984.aspx
But it's not helping. Can someone help me out? My deepest gratitude.
Edit
I think it's because of the KeepIdentity option, because the primary key is auto-incremented. This is according to SqlBulkCopy Insert with Identity Column. I'll see if that fixes my issue.
Edit 2
I don't know what's happening, but the BULK insert worked fine when I tested it in Management Studio (using direct Transact-SQL). Maybe it's something with SqlBulkCopy. When I checked the Activity Monitor, the query it generated was this:
insert bulk TableName ([ColumnName] Int)
Edit 3
I forgot to mention that I'm actually using Entity Framework, so I copied code (translated from C# to VB, actually) that creates a DataTable from an entity object, since EntityDataReader is only available for C# (which distressed me). Anyway, I trashed the SqlBulkCopy approach and just stored the values in XML, because when I looked at it I realized I did not need the values inside a database.
I hit something similar trying to bulk insert from Java, but with wait type ASYNC_NETWORK_IO, e.g.:
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
| Status | BlkBy | Command | CPUTime | DiskIO | LastBatch | ProgramName |
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
| SUSPENDED | . | BULK INSERT | 15 | 4 | 09/16 02:42:04 | Microsoft JDBC Driver for SQL Server |
+-----------+-------+-------------+---------+--------+----------------+--------------------------------------+
It's hard to say what the exact issue was; there are a few things I observed:
Either the driver swallows errors or you only get them when the copy completes; e.g. when I tried to insert a single row, I got exceptions thrown with the errors I needed to fix.
Tuning can be important, specifically the batch size (see https://dba.stackexchange.com/questions/165966/how-does-one-investigate-the-performance-of-a-bulk-insert-statement).
Once I'd addressed these, the full load worked as expected.
Some stats for the batch sizes/row counts I generated (note the data is going across the Atlantic); the point is that the performance is very variable:
+------------+------+----------+----------+----------+
| batch size | rows | start | end | duration |
+------------+------+----------+----------+----------+
| 100 | 2500 | 09:15:45 | 09:18:17 | 00:02:32 |
| 1000 | 2500 | 09:23:34 | 09:25:35 | 00:02:00 |
| 2500 | 2500 | 09:32:53 | 09:34:55 | 00:02:01 |
| 2500 | 7500 | 10:27:18 | 10:30:49 | 00:03:31 |
| 7500 | 7500 | 10:38:10 | 10:45:57 | 00:07:47 |
+------------+------+----------+----------+----------+

My query runs faster the second time around; how do I stop that?

I am running a query in Oracle 10: select A from B where C = D
B has millions of records and there is no index on C.
The first time I run it, it takes about 30 seconds; the second time, about 1 second.
Obviously it's caching something, and I want it to stop that: each time I run the query I want it to take 30 seconds, just like the first run.
I am over-simplifying the issue I am having for the sake of making the question readable.
Thanks
Clearing the caches to measure performance is possible but very unwieldy.
A very good measure for tracking the performance of tuning efforts is counting the number of blocks read during query execution. One of the easiest ways to do this is using sqlplus with autotrace, like so:
set autotrace traceonly
<your query>
outputs
...
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1 consistent gets
0 physical reads
0 redo size
363 bytes sent via SQL*Net to client
364 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
The number of blocks read, be it from cache or from disk, is consistent gets.
Another way is running the query with extra statistics, i.e. with the hint gather_plan_statistics, and then looking at the query plan from the cursor cache:
set autotrace off
set serveroutput off
<your query with hint gather_plan_statistics>
select * from table(dbms_xplan.display_cursor(null,null,'typical allstats'));
The number of blocks read is shown in the Buffers column:
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | | 1 (100)| | 3 |00:00:00.01 | 3 |
| 1 | SORT AGGREGATE | | 3 | 1 | | | 3 |00:00:00.01 | 3 |
| 2 | INDEX FULL SCAN| ABCDEF | 3 | 176 | 1 (0)| 00:00:01 | 528 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------------------------------------
See this question...
It shows how to clear caches for data and execution plans, but also expands on whether it's a good idea or not.
The obvious answer is for each test case to run the query multiple times, and throw out the first result.
It's not easy to completely replicate the conditions of the first run, because of the various caches involved: some are Oracle caches (cursor, buffer, etc.); some are OS-level (disk cache, depending on Oracle configuration); some are hardware (SAN, RAID, disk).
Rebooting the database server before each trial will probably come pretty close to consistent conditions.
This might help you: http://www.mssqltips.com/tip.asp?tip=1360
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
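Note that tip (CHECKPOINT / DBCC DROPCLEANBUFFERS) is for SQL Server; on Oracle, which this question is about, the rough equivalents are the statements below. A sketch, to be run only on a test instance with suitable privileges:

ALTER SYSTEM FLUSH BUFFER_CACHE;  -- Oracle 10g and later
ALTER SYSTEM FLUSH SHARED_POOL;   -- drops parsed SQL and execution plans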