Poor performance on ldapsearch - ldap

I'm migrating an OpenLDAP database backend from HDB to MDB. To migrate, I repopulate all the data in the DB. Write performance is 3x faster, which is great, but search performance is very poor: with HDB I get responses within a second, but with MDB a search takes 4 seconds (and sometimes times out!). I tried several flags for MDB, but they only improved write operations.
My database has a million entries, and every entry has many attributes. A single search can return 13K of data (or more). I have tried indexing almost all attributes that are accessed more than once per query, but the only index that seems to affect performance is "objectClass".
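For reference, the kind of index configuration I've been adding looks roughly like this (the database DN and attribute names are placeholders, not my actual schema):

    # cn=config style: index the attributes used in search filters on the MDB database
    dn: olcDatabase={1}mdb,cn=config
    changetype: modify
    add: olcDbIndex
    olcDbIndex: cn,sn eq,sub
    olcDbIndex: uid eq
    olcDbIndex: mail eq
    # (with a slapd.conf-style setup the equivalent directive is "index <attr> eq,sub";
    #  there, run slapindex with slapd stopped to build indexes for existing entries)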
Any ideas or pointers to documentation?
Thanks in advance!

Related

What is a good way to manage large ever growing tables in a database?

I am building a web application for medical record keeping. A requirement for this application is logging all changes (view, create, update, delete) to a patient's data, and pretty much any other useful info in the system (logins, cron runs, data exports, etc.).
I am currently storing the data in a database table, which is working fine. However, it is likely this table will grow unwieldy very quickly and bloat the database. I am not allowed to delete log entries.
My current plan is to choose an arbitrary size (such as 1 million entries: large, but still manageable). When the table hits 1 million entries, I move the 100,000 oldest entries into a file and store it on our file server.
Does anyone with experience of this issue have other/better ideas on how to handle it?
Additional info:
My primary concern is that nothing will ever be deleted from this data. However, the data does not necessarily need to be accessed after several months. Since this table could plausibly hit 1 billion entries within a couple of years (and I have 300 copies of this DB that all include this table), what is a good way to manage its size and performance? This table needs to be displayed through a pager, which is obviously going to be an issue when it passes 1 million entries, let alone 1 billion.
Cases like this are tailor-made for partitioning. With a partitioning strategy, you spread your data across multiple tables. This helps to balance I/O, speed up access times for partition-specific queries, etc. Partitioning is a discipline in and of itself, and the choice of partitioning key is crucial. For log data like this, people often partition on a datetime value.
Partitioned Tables and Indexes (SQL Server)
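As a rough illustration of date-based partitioning in SQL Server (table, column, and boundary values below are made up for the example, not taken from the question):

    -- Partition by month on the log timestamp
    CREATE PARTITION FUNCTION pf_LogByMonth (datetime2)
        AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01', '2023-03-01');

    CREATE PARTITION SCHEME ps_LogByMonth
        AS PARTITION pf_LogByMonth ALL TO ([PRIMARY]);

    -- The audit table is created on the partition scheme; note the partitioning
    -- column must be part of the clustered key for the index to be aligned.
    CREATE TABLE dbo.AuditLog (
        AuditId   bigint IDENTITY NOT NULL,
        LoggedAt  datetime2 NOT NULL,
        EventType varchar(50) NOT NULL,
        Detail    nvarchar(max) NULL,
        CONSTRAINT PK_AuditLog PRIMARY KEY CLUSTERED (AuditId, LoggedAt)
    ) ON ps_LogByMonth (LoggedAt);

With the old data in its own partitions, the oldest partition can later be switched out to an archive table almost instantly instead of moving rows one by one.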

BigQuery is too slow

I am just starting with BigQuery. My DB is small (10K rows, 1 table) and my queries are simple counts and group-bys.
It takes an average of 3-4 seconds per request, but sometimes it jumps to 10 or even 15 seconds.
I am querying from an Amazon Linux server in Ireland using the bq tool.
Is it possible to get results faster (under 1 second) so that I can render my web pages faster?
1) BigQuery is a highly scalable database before it is a "super fast" database. It's designed to process HUGE amounts of data by distributing the processing among several different machines, using a technology named Dremel. Because it's designed to use several machines and parallel processing, you should expect super-scalability with good performance.
2) BigQuery is an asset when you want to analyze billions of rows.
For example: analyzing all the Wikipedia revisions in 5-10 seconds isn't bad, is it? But even a much smaller table would take about the same time, even if it has only 10K rows.
3) Below this size, you'll be better off using more traditional data storage solutions such as Cloud SQL or the App Engine Datastore. If you want to keep SQL capability, Cloud SQL is the best bet.
Sybase IQ is often installed as a single database and doesn't use Dremel. That said, it's going to be faster than BigQuery in many scenarios... as designed.
4) Certainly the performance differs from a dedicated environment. You can get your own dedicated environment for $20K a month.
That's the expected behaviour. In BigQuery you are using a shared infrastructure, so depending on the load at the moment you will get better or worse response times. In fact, batch queries (those not needing interactivity) are encouraged and rewarded by not counting against your quota.
You typically don't use BigQuery as your main database to show data in your web application. Depending on what you want to do, BigQuery can be your Big Data store, and you should have another intermediate store where you keep computed results to display to your users. Or maybe in your use case you don't really need BigQuery and there is a better solution.
In any case, you are not going to be able to avoid a wait of a few seconds (even if you go Premium you get more guarantees about the service, but in no case is it fast enough to be the main backend for a web app).
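For example, a common pattern (table and column names below are purely illustrative) is to precompute the aggregates periodically in BigQuery and serve the small result set from a conventional store:

    -- Run on a schedule (e.g. a cron job) against BigQuery:
    SELECT category, COUNT(*) AS event_count
    FROM events
    GROUP BY category;

    -- Copy the handful of result rows into Cloud SQL (or any fast store),
    -- and let the web app read that precomputed table instead:
    SELECT category, event_count
    FROM cached_event_counts;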

What data load on my DB should I expect if I get more users?

Currently, as a single user, it takes 260ms for a certain query to run from start to finish.
What will happen if 1000 queries are sent at the same time? Should I expect the same query to take ~4 minutes (260ms × 1000)?
It is not possible to make predictions without any knowledge of the situation. There will be a number of factors which affect this time:
Resources available to the server (if it is able to hold data in memory, things run quicker than if disk is being accessed)
What is involved in the query (e.g. a repeated query will usually execute quicker the second time around, assuming the underlying data has not changed)
What other bottlenecks are in the system (e.g. if the webserver and database server are on the same system, the two processes will be fighting for available resource under heavy load)
The only way to properly answer this question is to perform load testing on your application.

Can different users in SQL Server help performance?

We have one table receiving continuous inserts from 3 Windows services in SQL Server 2008, all using the same SQL user. This puts a heavy load on the table, and retrieval operations and IO become slow.
We have decided to split this table into two: one for the latest data and another for history data. My question is whether I would gain a performance benefit by creating a separate user for each Windows service (3 users in total, in our case) for the insert operations. I think there would then be 3 sessions, i.e. a separate session for each user, and that might improve performance.
Am I right?
3 sessions don't necessarily require 3 different user IDs. Each of the three data-loading processes could establish a session using the same credentials, and you'd still have 3 sessions.
However, you may now run into contention, where each process locks the other out, which may result in slower overall performance. This could be avoided by configuring row-level locking on the table space, which itself will usually cost a slight performance hit.
You might still get better performance by batching your inserts into groups of, say, 5, 10 or 25 records before committing the operation. The downside to this approach is that when an exception forces you to roll back and redo, it takes longer because the unit of work is bigger.
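A rough sketch of what that batching looks like (table and values are made up for the example):

    -- Commit several rows per transaction instead of one transaction per row.
    BEGIN TRANSACTION;
        INSERT INTO dbo.Readings (SensorId, ReadAt, Value) VALUES (1, '2024-01-01T10:00:00', 20.5);
        INSERT INTO dbo.Readings (SensorId, ReadAt, Value) VALUES (2, '2024-01-01T10:00:00', 21.1);
        INSERT INTO dbo.Readings (SensorId, ReadAt, Value) VALUES (3, '2024-01-01T10:00:00', 19.8);
        -- ... up to 5, 10 or 25 rows per batch ...
    COMMIT TRANSACTION;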
I don't think there will be any performance increase by using different users. There will still be three different database sessions.
Depending on your setup and database, I think you may alleviate the IO load with one or several of the following tips:
Batch the inserts. Build a single service that can batch inserts. Fewer insert operations are much better (IO-wise) than many small ones.
Depending on your scenario, you may gain performance by lowering the transaction isolation level for reads and inserts to that table (see the snippet after this list).
Minimize the number of indexes on that table. Inserts to tables with indexes are more expensive.
Make sure the tables are stored on a disk that is fast enough for the IO throughput you need. Make sure the disk is not being used by other services.
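As a sketch of the isolation-level idea (database and table names are only examples):

    -- Let reporting/read queries avoid blocking on the constant inserts.
    -- Dirty reads are only acceptable if slightly stale or uncommitted data is tolerable.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT TOP (100) * FROM dbo.LatestData ORDER BY CreatedAt DESC;

    -- Alternatively, row versioning at the database level reduces reader/writer blocking
    -- without dirty reads (switching it on needs exclusive access to the database):
    ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;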
I hope that helps.

SQL Server slow down after duplicating database

I recently moved a bunch of tables from an existing database into a new database due to the database getting rather large. After doing so I noticed a dramatic decrease in performance of my queries when running against the new database.
The process I took to recreate the new database is this:
Generate table CREATE scripts using SQL Server's automatic script generator.
Run the CREATE TABLE scripts.
Insert all data into the new database using INSERT INTO with a SELECT from the existing database.
Run all the ALTER scripts to create the foreign keys and any indexes.
Does anyone have any ideas of possible problems with my process, or some key step I'm missing that is causing this performance issue?
Thanks.
First, I would at a minimum make sure that auto create statistics is enabled; you can also set auto update statistics to true.
After that I would update the stats by running
sp_updatestats
or
UPDATE STATISTICS
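For reference, the corresponding commands look roughly like this (database and table names are placeholders):

    ALTER DATABASE MyNewDb SET AUTO_CREATE_STATISTICS ON;
    ALTER DATABASE MyNewDb SET AUTO_UPDATE_STATISTICS ON;

    -- Refresh statistics for the whole database...
    EXEC sp_updatestats;

    -- ...or for a single table, with a full scan for the most accurate stats:
    UPDATE STATISTICS dbo.SomeLargeTable WITH FULLSCAN;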
Also realize that the first time you hit the queries they will be slower because nothing is cached in RAM yet. The second hit should be much faster.
Did you script the indexes from the tables in the original database? Missing indexes could certainly account for poor performance.
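A quick way to check is to count the indexes per table in each database and compare the output (a rough sketch):

    SELECT t.name AS table_name, COUNT(i.index_id) AS index_count
    FROM sys.tables AS t
    LEFT JOIN sys.indexes AS i
           ON i.object_id = t.object_id
          AND i.index_id > 0            -- index_id 0 is the heap, not a real index
    GROUP BY t.name
    ORDER BY t.name;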
Have you tried looking at the execution plans on each server when running these queries? That should let you see easily whether they're doing something different, e.g. table scanning due to a missing index, poor statistics, etc.
Are both DBs sitting on the same box, with their data files on the same drive arrays?
Can you tell what about those queries got slower? New access plans? The same plans, but performing slower? Do they execute slower, or are they suspended more? Did all queries get slower, or just some? And last but not least, how do you know, i.e. what exactly did you measure and how?
Some of the usual suspects could be:
The new storage is much slower (.mdf on slow disk, or on a busy disk)
You changed the data structure during the move (i.e. some indexes did not get ported)
You changed the data size (i.e. compression options), resulting in more pages for the same data
Did anything else change at the same time, new app code or anything of the like?
By extending the data size (you don't mention deleting the old tables) you are now thrashing the buffer pool (did the page life expectancy decrease in the performance counters?)
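Page life expectancy can be checked straight from the DMVs, e.g.:

    SELECT counter_name, cntr_value AS page_life_expectancy_seconds
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Page life expectancy'
      AND object_name LIKE '%Buffer Manager%';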
Look at how you set up the initial size and growth options. If you didn't give the database enough space to begin with, or if you are growing by 1MB at a time, that could be a cause of performance issues.
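You can check the current settings and, if needed, switch to a larger fixed growth increment; a sketch (logical file and database names are illustrative):

    -- Current size and growth settings, run in the new database:
    SELECT name,
           size / 128 AS size_mb,
           CASE WHEN is_percent_growth = 1
                THEN CAST(growth AS varchar(10)) + '%'
                ELSE CAST(growth / 128 AS varchar(10)) + ' MB'
           END AS growth_setting
    FROM sys.database_files;

    -- Example: grow the data file in 256MB steps instead of 1MB:
    ALTER DATABASE MyNewDb
        MODIFY FILE (NAME = MyNewDb_Data, FILEGROWTH = 256MB);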