Is sequential disk read actually sequential? - sql

I'm using PostgreSQL 9.4.
First, I have PostgreSQL installed on a system with a single SSD drive.
I'm trying to understand what a sequential read is, and I've run into a question. For instance, if we ask the SQL server for some unindexed data, a seq scan is likely to happen. But what if two different clients ask for data from two different tables simultaneously? In this case, the server creates a separate process for each client and executes the queries concurrently.
But if the queries are executed concurrently, the head of the drive needs to jump from the area where the first table is stored to the area where the second one is.
So we actually have no sequential read, just jumping between the tables' areas. Where am I wrong? Could you explain these things a bit?

"sequential scan" means a table was read from the beginning to the end, sequentially row by row. It means nothing in terms of how data is read from physical storage.
So the term is about logical reads.
Not sure if the answer needs more explanation.


Different execution plan for the same query

I have two identical SQL Databases that contain nearly the same records in each of their tables. The only difference between them is that one lives on my local machine and the other one is in Azure. Yet, after investigating a performance issue I found out that the two databases produce different execution plans for some of the queries. To give you an example, here is a simple query that takes approximately 1 second to run.
select count(*) from Deposits
inner join households on = deposits.HouseholdId
where CashierId = 'd89c8029-4808-4cea-b505-efd8279dc66d'
It is obvious that the inner join needs to be omitted as it doesn't contribute to the end result. Indeed, it is omitted on my local machine but this is not the fact for Azure. Here are two pictures visualizing the execution plans for my local machine and Azure, respectively.
Just to give you some background on what happened, everything worked perfectly until I scaled down my Azure database to Basic 5 DTUs. Afterwards, some queries became extremely slow and I had no idea why. I scaled up the Db instance again but I saw no improvement. Then I rebuilt the indexes and noticed that if I rebuild them in the correct order, the queries will once again start performing as expected. Yet, I have absolutely no idea why I need to rebuild them in some specific order and, even less, how to determine the correct order. Now, I have issues with virtually all queries related to the Deposits table. I tried rebuilding the indexes but I saw no improvement whatsoever. I suspect that there is something to do with the PK index but I am not quite sure. The table contains approximately 300k rows.
Your databases may have the same schemas and approximate number of records, but it's tough to make them identical. Are you sure your databases are identical?
What about the hardware they are running on? CPU? Memory? Disks? I mean it's Azure, right? It's hard to know what actual server hardware you are using. SQL Server's query optimizer will adjust for hardware differences. Additionally, even if the hardware and software were identical, the simple fact that the databases are used differently can give them different statistics. The first time you run a query, it is evaluated and optimized using statistics. All subsequent calls of that query will use that initially cached query plan. Tables change over time; they get taller. The shape of the data changes, meaning an old cached query plan can eventually fall out of favor. Certain things, such as rebuilding your indices, can reshape the data and change the statistics, which in turn invalidates the query plan cache. Try this. To force a fresh query plan on each statement, add a
statement to the bottom of your queries. Does that help or stabilize performance? Furthermore (this is a stretch), can I assume that you aren't running that exact same query over and over? It makes more sense that you haven't hardcoded that GUID and that the query plan was really created for something that has @CashierID as a parameter. If so, then your existing query plan could be a victim of parameter sniffing, where the plan you're pulling was optimized for some specific GUID and does poorly when you pass in anything else.
For more info about what that statement does, have a look here. For more understanding on why it's hard to have identical databases take a look here and here.
Good luck! Hope you can get it sorted.

Is it a good idea to create tables dynamically to store user-content?

I'm currently designing an application where users can create/join groups, and then post content within a group. I'm trying to figure out how best to store this content in a RDBMS.
Option 1: Create a single table for all user content. One of the columns in this table will be the groupID, designating which group the content was posted in. Create an index using the groupID, to enable fast searching of content within a specific group. All content reads/writes will hit this single table.
Option 2: Whenever a user creates a new group, we dynamically create a new table. Something like group_content_{groupName}. All content reads/writes will be routed to the group-specific dynamically created table.
Pros for Option 1:
It's easier to search for content across multiple groups, using a single simple query, operating on a single table.
It's easier to construct simple cross-table queries, since the content table is static and well-defined.
It's easier to implement schema changes and changes to indexing/triggers etc, since there's only one table to maintain.
Pros for Option 2:
All reads and writes will be distributed across numerous tables, thus avoiding any bottlenecks that can result from a lot of traffic hitting a single table (though admittedly, all these tables are still in a single DB)
Each table will be much smaller in size, allowing for faster lookups, faster schema-changes, faster indexing, etc
If we want to shard the DB in future, the transition would be easier if all the data is already "sharded" across different tables.
What are the general recommendations between the above 2 options, from performance/development/maintenance perspectives?
One of the cardinal sins in computing is optimizing too early. It is the opinion of this DBA of 20+ years that you're overestimating the I/O these groups are going to generate. RDBMSs are very good at querying and writing this type of data within a standard set of tables. Worst case, you can partition them later. You'll have a lot more search capability and management ease with one set of tables instead of a set per user.
Imagine the schema needs to change. Do you really want to update hundreds or thousands of tables, or write some long script to fix a mundane issue? Stick with a single set of tables and ignore sharding. Instead, think "maybe we'll partition the tables someday, if necessary".
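A minimal sketch of Option 1, using Python's built-in sqlite3 module as a stand-in for whatever RDBMS is actually used. The table name group_content, its columns, and the index name are hypothetical, chosen only for illustration. It shows that both a per-group read and a cross-group search are single queries on one indexed table.

```python
import sqlite3

# Hypothetical schema: one shared table for all groups' content.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE group_content ("
    " id INTEGER PRIMARY KEY,"
    " group_id INTEGER NOT NULL,"
    " author TEXT NOT NULL,"
    " body TEXT NOT NULL)"
)
# The index on group_id keeps per-group lookups fast as the table grows.
conn.execute("CREATE INDEX idx_group_content_group ON group_content (group_id)")

rows = [
    (1, "alice", "hello from group 1"),
    (1, "bob", "second post in group 1"),
    (2, "carol", "hello from group 2"),
]
conn.executemany(
    "INSERT INTO group_content (group_id, author, body) VALUES (?, ?, ?)", rows
)

# Per-group read: one parameterized query, no dynamic table names.
group_1_posts = conn.execute(
    "SELECT author, body FROM group_content WHERE group_id = ? ORDER BY id", (1,)
).fetchall()

# Cross-group search: still a single query on a single table.
hello_posts = conn.execute(
    "SELECT group_id, author FROM group_content WHERE body LIKE ? ORDER BY id",
    ("hello%",),
).fetchall()

# Ask SQLite how it would run the per-group query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT body FROM group_content WHERE group_id = ?", (1,)
).fetchall()
plan_text = " ".join(str(r[-1]) for r in plan)
```

With a table per group, the cross-group search would instead need one query per table (or a generated UNION), and the table name would have to be spliced into the SQL text rather than passed as a parameter.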
It is a no-brainer. (1) is the way to go.
You list these as optimizations for the second method. All of these are misconceptions. See comments below:
All reads and writes will be distributed across numerous tables, thus
avoiding any bottlenecks that can result from a lot of traffic hitting
a single table (though admittedly, all these tables are still in a
single DB)
Reads and writes can just as easily be distributed within a table. The only issue would be write conflicts within a page. That is probably a pretty minor consideration, unless you are dealing with more than dozens of transactions per second.
Because of the next item (partially filled pages), you are actually much better off with a single table and pages that are mostly filled.
Each table will be much smaller in size, allowing for faster lookups,
faster schema-changes, faster indexing, etc
Smaller tables can be a performance disaster. Tables are stored on data pages, so each small table ends up occupying at least one partially filled page. What you end up with is:
A lot of wasted space on disk.
A lot of wasted space in your page cache -- space that could be used to store records.
A lot of wasted I/O reading in partially filled pages.
If we want to shard the DB in future, the transition would be easier
if all the data is already "sharded" across different tables.
Postgres supports table partitioning, so you can store different parts of a table in different places. That should be sufficient for your purpose of spreading the I/O load.
Option 1: Performance=Normal Development=Easy Maintenance=Easy
Option 2: Performance=Fast Development=Complex Maintenance=Hard
I suggest choosing Option 1. For the big table, you can manage performance with better indexes or cached indexes (in some databases). Finally, nothing really justifies Option 2, because development and maintenance time is the deciding factor.

What are the benefits of using database cursor?

It is based on the interview question that I faced.
Very short definition can be
It can be used to manipulate the rows
returned by a query.
Besides the use of the cursor (Points are listed here on MSDN), I have a question in my mind that if we can perform all the operations using query or stored procedure (if I'm not wrong, Like we can use Transact-SQL for ms-sql), is there any concrete point that we should use cursor?
Using cursors compared to big resultsets is like using video streaming instead of downloading a video in one go and watching it once it has downloaded.
If you download, you have to have a few gigs of space and the patience to wait until the download finished. Now, no matter how fast your machine or network may be, everyone watches a movie at the same speed.
Normally any query gets sent to the server, executed, and the resultset sent over the network to you, in one burst of activity.
The cursor will give you access to the data row by row and stream every row only when you request it (can actually view it).
A cursor can save you time - because you don't need to wait for the processing and download of your complete recordset
It will save you memory, both on the server and on the client because they don't have to dedicate a big chunk of memory to resultsets
Load-balance both your network and your server - Working in "burst" mode is usually more efficient, but it can completely block your server and your network. Such delays are seldom desirable for multiuser environments. Streaming leaves room for other operations.
Allows operations on queried tables (under certain conditions) that do not affect your cursor directly. So while you are holding a cursor on a row, other processes are able to read, update and even delete other rows. This helps especially with very busy tables, many concurrent reads and writes.
Which brings us to some caveats, however:
Consistency: Using a cursor, you do (usually) not operate on a consistent snapshot of the data, but on a row. So your concurrency/consistency/isolation guarantees drop from the whole database (ACID) to only one row. You can usually inform your DBMS what level of concurrency you want, but if you are too nitpicky (locking the complete table you are in), you will throw away many of the resource savings on the server side.
Transmitting every row by itself can be very inefficient, since every packet has negotiation overhead that you might avoid by sending big, maybe compressed, chunks of data per packet. (No DB server or client library is stupid enough to transmit every row individually; there's caching and chunking on both ends. Still, it is relevant.)
Cursors are harder to do right. Consider a query with a big resultset, motivating you to use a cursor, that uses a GROUP BY clause with aggregate functions. (Such queries are common in data warehouses). The GROUP BY can completely trash your server, because it has to generate and store the whole resultset at once, maybe even holding locks on other tables.
Rule of thumb:
If you work on small, quickly created resultsets, don't use cursors.
Cursors excel at ad hoc, (referentially) complex queries of a sequential nature with big resultsets and low consistency requirements.
"Sequential nature" means there are no aggregate functions in heavy GROUP BY clauses in your query. The server can lazily decide to compute 10 rows for your cursor to consume from a cache and do other stuff meanwhile.
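The streaming behaviour described above can be sketched with Python's built-in sqlite3 module. This is only an illustration of the client-side pattern: SQLite runs in-process, so there is no network involved, and the events table and the stream_rows helper are hypothetical names. The point is that the client pulls rows in chunks on demand and can stop early without materializing the whole resultset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [(f"event-{i}",) for i in range(1000)],
)


def stream_rows(connection, query, chunk_size=100):
    """Yield rows one at a time while fetching them from the cursor in chunks."""
    cursor = connection.cursor()
    cursor.execute(query)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        for row in chunk:
            yield row


# Consume only the first 5 rows; rows beyond the first chunk are never
# fetched into the client.
first_five = []
for row in stream_rows(conn, "SELECT id, payload FROM events ORDER BY id"):
    first_five.append(row)
    if len(first_five) == 5:
        break
```

The chunk size is the trade-off mentioned in the caveats: larger chunks mean fewer round trips, smaller chunks mean less memory held on the client.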
A cursor is a tool that allows you to iterate the records in a set. It has concepts of order and current record.
Generally, SQL operates with multisets: these are sets of possibly repeating records in no given order, taken as a whole.
Say, this query:
ON b.a =
, operates on multisets a and b.
Nothing in this query makes any assumptions about the order of the records, how they are stored, in which order they should be accessed, etc.
This allows to abstract away implementation details and let the system try to choose the best possible algorithm to run this query.
However, after you have transformed all your data, ultimately you will need to access the records in an ordered way and one by one.
You don't care how exactly the entries of a phonebook are stored on a hard drive, but a printer does require them to be fed in alphabetical order, and the formatting tags should be applied to each record individually.
That's exactly where the cursors come into play. Each time you are processing a resultset on the client side, you are using a cursor. You don't get megabytes of unsorted data from the server: you just get a tiny variable: a resultset descriptor, and just write something like this:
while (!rs.EOF) {
It is the cursor that implements all of this for you.
This of course concerns database-client interaction.
As for the database itself: inside the database, you rarely need cursors since, as I said above, almost all data transformations can be implemented more efficiently using set operations.
However, there are exceptions:
Analytic operations in SQL Server are implemented very poorly. A cumulative sum, for instance, could be calculated much more efficiently with a cursor than using the set-based operations
Processing data in chunks. There are cases when a set based operation should be sequentially applied to a portion of a set and the results of each chunk should be committed independently. While it's still possible to do it using set-based operations, a cursor is often a more preferred way to do this.
Recursion in the systems that do not support it natively.
You also may find this article worth reading:
The Island of Misfit Cursors
Using a cursor it is possible to read sequentially through a set of data, programmatically, so it behaves in a similar manner to conventional file access, rather than the set-based behaviour characteristic of SQL.
There are a couple of situations where this may be of use:
Where it is necessary to simulate file-based record access behaviour - for example, where a relational database is being used as the data storage mechanism for a piece of code that was previously written to use indexed files for data storage.
Where it is necessary to process data sequentially - a simple example might be to calculate a running total balance for a specific customer. (A number of relational databases, such as Oracle and SQLServer, now have analytical extensions to SQL that should greatly reduce the need for this.)
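The running-total case can be sketched as follows, using Python's built-in sqlite3 module. The ledger table and its sample rows are made up for illustration. The cursor version carries the balance forward in a single ordered pass; the set-based version shown for comparison uses a correlated subquery, which re-sums the earlier rows for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ledger (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO ledger (customer, amount) VALUES (?, ?)",
    [("acme", 100), ("acme", -30), ("other", 999), ("acme", 50)],
)

# Cursor style: walk the customer's rows in order, carrying the balance forward.
running = []
balance = 0
cursor = conn.execute(
    "SELECT id, amount FROM ledger WHERE customer = ? ORDER BY id", ("acme",)
)
for row_id, amount in cursor:
    balance += amount
    running.append((row_id, balance))

# Set-based style for comparison: a correlated subquery recomputes the
# sum of all earlier rows for each row.
set_based = conn.execute(
    "SELECT l.id,"
    "       (SELECT SUM(p.amount) FROM ledger p"
    "        WHERE p.customer = l.customer AND p.id <= l.id)"
    " FROM ledger l WHERE l.customer = ? ORDER BY l.id",
    ("acme",),
).fetchall()
```

Both produce the same balances. Where the database supports window functions, a windowed SUM is usually the simpler set-based alternative.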
Inevitably, wikipedia has more:
With a cursor you access one row at a time. So it is good to use when you want to manipulate a lot of rows, but only one at any given time.
I was told in my classes that the reason to use a cursor is that you want to access more rows than you can fit in memory, so you can't just load all the rows into a collection and then loop through it.
Sometimes a set based logic can get quite complex and opaque. In these cases and if the performance is not an issue a server side cursor can be used to replace the relational logic with a more manageable and familiar (to a non relational thinker) procedural logic resulting in easier maintenance.

Would this method work to scale out SQL queries?

I have a database containing a single huge table. At the moment a query can take anything from 10 to 20 minutes and I need that to go down to 10 seconds. I have spent months trying different products like GridSQL. GridSQL works fine, but is using its own parser which does not have all the needed features. I have also optimized my database in various ways without getting the speedup I need.
I have a theory on how one could scale out queries, meaning that I use several nodes to run a single query in parallel. A precondition is that the data is partitioned (vertically), with one partition placed on each node. The idea is to take an incoming SQL query and simply run it exactly as it is on all the nodes. When the results are returned to a coordinator node, the same query is run on the union of the result sets. I realize that an aggregate function like average needs to be rewritten into a count and a sum sent to the nodes, and that the coordinator divides the sum of the sums by the sum of the counts to get the average.
What kinds of problems could not easily be solved using this model? I believe one issue would be the count distinct function.
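A small sketch of the scheme, assuming two in-memory SQLite databases stand in for the nodes and a hypothetical sales table is split between them. It shows the AVG rewrite into SUM and COUNT, why averaging the per-node averages would be wrong, and why COUNT(DISTINCT ...) cannot be combined by simple addition.

```python
import sqlite3


def make_node(rows):
    """Create one in-memory 'node' holding a partition of the table."""
    node = sqlite3.connect(":memory:")
    node.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
    node.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return node


nodes = [
    make_node([("a", 10), ("b", 20)]),
    make_node([("a", 30), ("c", 40), ("c", 50)]),
]

# AVG is rewritten: each node returns a partial SUM and COUNT,
# and the coordinator divides the sum of sums by the sum of counts.
partials = [
    node.execute("SELECT SUM(amount), COUNT(amount) FROM sales").fetchone()
    for node in nodes
]
total_sum = sum(s for s, _ in partials)
total_count = sum(c for _, c in partials)
global_avg = total_sum / total_count

# Naive approach for comparison: averaging the per-node averages is wrong
# whenever the partitions differ in size.
avg_of_avgs = sum(
    node.execute("SELECT AVG(amount) FROM sales").fetchone()[0] for node in nodes
) / len(nodes)

# COUNT(DISTINCT ...) cannot be combined by adding per-node counts,
# because the same value may live on several nodes.
summed_distinct = sum(
    node.execute("SELECT COUNT(DISTINCT customer) FROM sales").fetchone()[0]
    for node in nodes
)
# One workaround: ship the distinct values and deduplicate on the coordinator.
true_distinct = len(
    {
        row[0]
        for node in nodes
        for row in node.execute("SELECT DISTINCT customer FROM sales")
    }
)
```

The workaround for the distinct count moves every distinct value to the coordinator, so for high-cardinality columns the coordinator does most of the work.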
Edit: I am getting so many nice suggestions, but none have addressed the method.
It's a data volume problem, not necessarily an architecture problem.
Whether on 1 machine or 1000 machines, if you end up summarizing 1,000,000 rows, you're going to have problems.
Rather than normalizing your data, you need to de-normalize it.
You mention in a comment that your data base is "perfect for your purpose", when, obviously, it's not. It's too slow.
So, something has to give. Your perfect model isn't working, as you need to process too much data in too short of a time. Sounds like you need some higher level data sets than your raw data. Perhaps a data warehousing solution. Who knows, not enough information to really say.
But there are a lot of things you can do to satisfy a specific subset of queries with a good response time, while still allowing ad hoc queries that respond in "10-20 minutes".
Edit regarding comment:
I am not familiar with "GridSQL", or what it does.
If you send several, identical SQL queries to individual "shard" databases, each containing a subset, then the simple selection query will scale to the network (i.e. you will eventually become network bound to the controller), as this is a truly, parallel, stateless process.
The problem becomes, as you mentioned, the secondary processing, notably sorting and aggregates, as this can only be done on the final, "raw" result set.
That means that your controller ends up, inevitably, becoming your bottleneck and, in the end, regardless of how "scaled out" you are, you still have to contend with a data volume issue. If you send your query out to 1000 nodes and inevitably have to summarize or sort the 1000-row result set from each node, resulting in 1M rows, you still have a long result time and a large data processing demand on a single machine.
I don't know what database you are using, and I don't know the specifics about individual databases, but you can see how if you actually partition your data across several disk spindles, and have a decent, modern, multi-core processor, the database implementation itself can handle much of this scaling in terms of parallel disk spindle requests for you. Which implementations actually DO do this, I can't say. I'm just suggesting that it's possible for them to (and some may well do this).
But, my general point, is if you are running, specifically, aggregates, then you are likely processing too much data if you're hitting the raw sources each time. If you analyze your queries, you may well be able to "pre-summarize" your data at various levels of granularity to help avoid the data saturation problem.
For example, if you are storing individual web hits, but are more interested in activity based on each hour of the day (rather than the subsecond data you may be logging), summarizing to the hour of the day alone can reduce your data demand dramatically.
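As a rough sketch of that pre-summarization, using Python's built-in sqlite3 module: the hits and hits_by_hour tables and the sample timestamps are made up for illustration. The raw rows are rolled up once into a small per-hour table, and reports then read the summary instead of the raw data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (hit_at TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [
        ("2024-01-01 09:00:01", "/a"),
        ("2024-01-01 09:15:42", "/b"),
        ("2024-01-01 09:59:59", "/a"),
        ("2024-01-01 10:00:00", "/a"),
        ("2024-01-02 09:30:00", "/c"),
    ],
)

# Build the summary once (or on a schedule); reports then read the small table.
conn.execute(
    "CREATE TABLE hits_by_hour AS "
    "SELECT CAST(strftime('%H', hit_at) AS INTEGER) AS hour_of_day, "
    "       COUNT(*) AS hit_count "
    "FROM hits GROUP BY hour_of_day"
)

hourly = conn.execute(
    "SELECT hour_of_day, hit_count FROM hits_by_hour ORDER BY hour_of_day"
).fetchall()
```

The summary table has at most 24 rows no matter how many hits are logged, which is the reduction in data demand the paragraph above refers to. The cost is that the summary has to be refreshed when new raw rows arrive.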
So, scaling out can certainly help, but it may well not be the only solution to the problem, rather it would be a component. Data warehousing is designed to address these kinds of problems, but does not work well with "ad hoc" queries. Rather you need to have a reasonable idea of what kinds of queries you want to support and design it accordingly.
One huge table - can this be normalised at all?
If you are doing mostly select queries, have you considered either normalising to a data warehouse that you then query, or running analysis services and a cube to do your pre-processing for you?
From your question, what you are doing sounds like the sort of thing a cube is optimised for, and could be done without you having to write all the plumbing.
By trying a custom solution (grid) you introduce a lot of complexity. Maybe it's your only option, but did you first try partitioning the table (the native solution)?
I'd seriously be looking into an OLAP solution. The trick with the cube is that, once built, it can be queried in lots of ways that you may not have considered. And as @HLGEM mentioned, have you addressed indexing?
Even with millions of rows, a good search should be logarithmic, not linear. If you have even one query which results in a scan, then your performance will be destroyed. We might need an example of your structure to see if we can help more.
I also agree fully with @Mason: have you profiled your query and investigated the query plan to see where your bottlenecks are? The fact that adding nodes improves speed makes me think that your query might be CPU bound.
Are you using all of the features of GridSQL? You can also use constraint exclusion partitioning, effectively breaking out your big table into several smaller tables. Depending on your WHERE clause, when the query is processed it may look at a lot less data and return results much faster.
Also, are you using multiple logical nodes per physical server? Configuring it that way can take advantage of otherwise idle cores.
If you monitor the servers during execution, is the bottleneck IO or CPU?
Also alluded to here is that you may want to roll up rows in your fact table into summary tables/cubes. I do not know enough about Tableau, will it automatically use the appropriate cube and drill down only when necessary? If so, it seems like you would get big gains doing something like this.
My guess (based on nothing but my gut) is that any gains you might see from parallelization will be eaten up by reaggregation and subsequent queries of the results. Further, I would think that writing might get more complicated with pk/fk/constraints. If this were my world, I would probably create many indexed views on top of my table (and other views) that optimized for the particular queries I need to execute (which I have worked with successfully on 10million+ row tables.)
If you run the incoming query, unpartitioned, on each node, why will any node finish before a single node running the same query would finish? Am I misunderstanding your execution plan?
I think this is, in part, going to depend on the nature of the queries you're executing and, in particular, how many rows contribute to the final result set. But surely you'll need to partition the query somehow among the nodes.
Your method to scale out queries works fine.
In fact, I've implemented such a method in:
It uses a parser, but it supports most SQL constructs.
It doesn't yet support count(distinct expr) but this is doable and I plan to add support in the future.
I also have a tool called Flexviews (google for flexviews materialized views)
This tool lets you create materialized views (summary tables) which include various aggregate functions and joins.
Those tools combined together can yield massive scalability improvements for OLAP type queries.

Stored Procedure Execution Plan - Data Manipulation

I have a stored proc that processes a large amount of data (about 5m rows in this example). The performance varies wildly. I've had the process running in as little as 15 minutes and seen it run for as long as 4 hours.
For maintenance, and in order to verify that the logic and processing is correct, we have the SP broken up into sections:
TRUNCATE and populate a work table (indexed) we can verify later with automated testing tools.
Join several tables together (including some of these work tables) to produce another work table.
Repeat 1 and/or 2 until a final output is produced.
My concern is that this is a single SP and so gets an execution plan when it is first run (even WITH RECOMPILE). But at that time, the work tables (permanent tables in a Work schema) are empty.
I am concerned that, regardless of the indexing scheme, the execution plan will be poor.
I am considering breaking up the SP and calling separate SPs from within it so that they could take advantage of a re-evaluated execution plan after the data in the work tables is built. I have also seen reference to using EXEC to run dynamic SQL which, obviously might get a RECOMPILE also.
I'm still trying to get SHOWPLAN permissions, so I'm flying quite blind.
Are you able to determine whether there are any locking problems? Are you running the SP in sufficiently small transactions?
Breaking it up into subprocedures should have no benefit.
Somebody should be concerned about your productivity, working without basic optimization resources. That suggests there may be other possible unseen issues as well.
Grab the free copy of "Dissecting Execution Plan" in the link below and maybe you can pick up a tip or two from it that will give you some idea of what's really going on under the hood of your SP.
Are you sure that the variability you're seeing is caused by "bad" execution plans? This may be a cause, but there may be a number of other reasons:
"other" load on the db machine
when using different data, there may be "easy" and "hard" data
issues with having to allocate more memory/file storage
Have you tried running the SP with the same data a few times?
Also, in order to figure out what is causing the runtime/variability, I'd try to do some detailed measuring to pin the problem down to a specific section of the code. (The easiest way would be to insert some log calls at various points in the SP.) Then try to explain why that section is slow (other than "5M rows" ;-)) and figure out a way to make that faster.
For now, I think there are a few questions to answer before going down the "splitting up the sp" route.
You're right it is quite difficult for you to get a clear picture of what is happening behind the scenes until you can get the "actual" execution plans from several executions of your overall process.
One point to consider, perhaps: are your work tables physical or temporary tables? If they are physical, you will get a performance gain by inserting new data into a new table without an index (i.e. a heap), which you can then build an index on after all the data has been inserted.
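The load-then-index order can be sketched like this, with Python's built-in sqlite3 module standing in for SQL Server. The work_orders table and the index name are hypothetical, and whether this order is actually faster depends on the engine and the data, so it is worth measuring both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: create the work table with no secondary index and bulk-load it.
conn.execute("CREATE TABLE work_orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO work_orders VALUES (?, ?)",
    [(i, i % 100) for i in range(10_000)],
)

# Step 2: build the index once, after all rows are in place.
conn.execute("CREATE INDEX idx_work_orders_customer ON work_orders (customer_id)")
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM work_orders").fetchone()[0]
index_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master"
        " WHERE type = 'index' AND tbl_name = 'work_orders'"
    )
]
```

Building the index after the load means it is constructed in one pass over the finished data instead of being updated on every insert.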
Also, what is the purpose of your process? It sounds like you are moving quite a bit of data around, in which case you may wish to consider the use of partitioning. You can switch data in and out of your main table with relative ease.
Hope what I have detailed is clear but please feel free to pose further questions.
Cheers, John
In several cases I've seen this level of diversity of execution times / query plans comes down to statistics. I would recommend some tests running update stats against the tables you are using just before the process is run. This will both force a re-evaluation of the execution plan by SQL and, I suspect, give you more consistent results. Additionally you may do well to see if the differences in execution time correlate with re-indexing jobs by your dbas. Perhaps you could also gather some index health statistics before each run.
If not, as other answerers have suggested, you are more likely suffering from locking and/or contention issues.
Good luck with it.
The only thing I can think that an execution plan would do wrong when there's no data is err on the side of using a table scan instead of an index, since table scans are super fast when the whole table will fit into memory. Are there other negatives you're actually observing or are sure are happening because there's no data when an execution plan is created?
You can force usage of indexes in your query...
Seems to me like you might be going down the wrong path.
Is this an infeed or outfeed of some sort or are you creating a report? If it is a feed, I would suggest that you change the process to use SSIS which should be able to move 5 million records very fast.