Currently, I have a highly transactional database with appx 100,000 inserts daily. Do I need to be concerned if I start allowing a large number of concurrent reads from my main transaction table? I am not concerned about concurrency, so much as performance.
At present there are 110+ million transactions in this table, and I am using SQL 2005
In 2002, a dell server with 2 GB of RAM, and 1.3 GHz CPU served 25 concurrent users as a File Server, a Database Server, and ICR server (very CPU intensive). Users and ICR server continuously insert, read and update one data table with 80+ million records where each operation requires 25 to 50 insert or update statements. It worked like a charm for 24/7 for almost a year. If you use decent indexes, and your selects use these indexes, it will work.
As #huadianz proposed, a read-only copy will do even better.
Related
We are currently migrating to in-memory tables on SQL Server 2019 Standard Edition. The disk based table is 55GB data + 54Gb of indexes (71M records). RAM is 900 GB. But during data migration (INSERT statement) we get an error message:
Msg 41823, Level 16, State 109, Line 150
Could not perform the operation because the database has reached its quota for in-memory tables. This error may be transient. Please retry the operation.
The in-memory file is “unlimited”, so it looks strange since SQL Server 2019 should not have any size restrictions for in-memory tables.
Why do you think in-memory data size in a single mem-opt table is unlimited on standard edition?
From Memory Limits in SQL Server 2016 SP1 (all of which still applies according to 2019 docs):
Each user database on the instance can have an additional 32GB allocated to memory-optimized tables, over and above the buffer pool limit.
So, you can do what you want, I suppose, but you'll have to spread it across multiple databases. You won't be able to store more than 32GB in a single mem-opt table or even in multiple mem-opt tables in a single database.
Cropped and probably inappropriately-scaled screenshot from the 2019 docs:
I have a SAP B1 system that's being migrated from Microsoft SQL to HANA DB. Our solution in the staging environment is producing huge transaction logs, tens of gigabytes in an hour, but the system isn't receiving production workloads yet. SAP have indicated that the database is fine, and that it's our software that's at fault, but I'm not clear on how to identify this. As far as I can tell each program is sleeping between poll intervals, and the intervals are not high (one query per minute). We just Traced SQL for an hour, and there were only in the region of 700 updates, but still tens of gigabytes of transaction log.
Does anybody have an idea how to debug the transaction log? - I'd like to see what's being recorded.
Thanks.
The main driver of high transaction log data is not the number of SQL commands executed but the size/number of records affected by those commands.
In addition to DML commands (DELETE/INSERT/UPDATE) also DDL commands like CREATE and ALTER table produce redo log data. For example, re-partitioning a large table will produce a large volume of redo logs.
For HANA there are tools (hdblogdiag) that allow inspecting the log volume structures. However, the usage and output of this (and similar tools) absolutely require extensive knowledge of the internals of how HANA operates redo logs.
For the OPs situation, I recommend checking for the volume of data changes caused by both DML and DDL.
We had the same issue.
There is a bug in SAP HANA SPS11 < 112.06 and SPS12 < 122.02 in the LOB garbage collector for the row store.
You can take a look at the SAP Note 2351467
In short, you can either
upgrade HANA
or convert the rowstore tables containing LOB columns into columnstore with the query ALTER TABLE "<schema_name>"."<table_name>" COLUMN;
You can find the list with this query :
select
distinct lo.schema_name,
lo.table_name
from sys.m_table_lob_files lo
inner join tables ta on lo.table_oid = ta.table_oid
and ta.table_type = 'ROW'
or disable the row store lob garbage collector by editing the indexserver.ini to set "garbage_lob_file_handler_enabled = false" under the [row_engine] section.
I have 2 MSSQL servers (lets call then SQL1 and SQL2) running a total of 1866 databases
SQL1 has 993 databases (993203 registered users)
SQL2 have 873 databases (931259 registered users)
Each SQL server has a copy of a InternalMaster database (for some shared table data) and then multiple customers, 1 database per customer (Customer/client not registered user).
At the time of writing this we had just over 10,000 users online using our software.
SQL2 behaves as expected and Database I/O is generally 0.2MB/sec and goes up and down in a normal flow, IO's goes up on certain reports and queries and so on in a random fashion.
However SQL1 has a constant pattern almost like a life support machine.
I don't understand why both servers which have the same infrastructure, work so differently? The spike starts at around 2MB/sec and then increases to a max of around 6MB/sec. Both servers have identical IOPS provisions of data, log and transaction partitions and identical AWS specs. The Data file I/O shows that tempdb is the culprit of this spike.
Any advice would be great as I just can't get my head around how 1 tempdb would act different to another when running the same software and setup on both servers.
Regards
Liam
Liam,
Please see this website that explains how to configure TEMPDB. By looking at the image, you only have one file for the TEMPDB database.
http://www.brentozar.com/sql/tempdb-performance-and-configuration/
Hope this helps
I have an application that produce approximately 15000 rows int a table named ExampleLog for each Task. The task has a taskID, that is saved in a table named TaskTable, thus it's possible to retrieve data from the ExampleLog table to run some queries.
The problem is that the ExampleLog table is getting very big, since I run everyday at least 1 task. At the time being my ExampleLog table is over 60 GB.
I would like to compress the 15000 rows which belong to a TaskID, and compress them or just Zip them and then save the compressed data somewhere inside the database as Blob or as Filestream. But it is important for me to be able to query easily the compressed or zipped file and proccess some query in a efficient manner inside the compressed or zipped data. (I don't know, if it's possible or I may lost in term of performance)
PS: The compressed data should not be considered as backup data.
Did someone can recommend an good approach or technique to resolve this problem. My focus is on the speed and of the query running on the ExampleLog and the place taken on the disk.
I'm using SQL Server 2008 on Windows 7
Consider Read-Only Filegroups and Compression.
Using NTFS Compression with Read-Only User-defined Filegroups and Read-Only Databases
SQL Server supports NTFS compression of read-only
user-defined filegroups and read-only databases. You should consider
compressing read-only data in the following situations: You have a
large volume of static or historical data that must be available for
limited read-only access. You have limited disk space.
Also, you can try and estimate the gains from page compression applied to the log table using Data Compression Wizard.
The answer of Denis could not solve my Problem completely, however I will use it for some optimization inside the DB.
Regarding the problem of storing data in package/group, there are 2 solutions of my problem:
The first solution is the use of the Partitioned Table and Index Concepts.
For example, if a current month of data is primarily used for INSERT, UPDATE, DELETE, and MERGE operations while previous months are used primarily for SELECT queries, managing this table may be easier if it is partitioned by month. This benefit can be especially true if regular maintenance operations on the table only have to target a subset of the data. If the table is not partitioned, these operations can consume lots of resources on an entire data set. With partitioning, maintenance operations, such as index rebuilds and defragmentations, can be performed on a single month of write-only data, for example, while the read-only data is still available for online access.
The second solution it to insert from the code (C# in my case) a List or Dictionary of row from a Task, then save them inside a FILESTREAM (SQL Server) on the DB server. Data will later by retrived by Id; the zip will be decompressed and data will be ready to use.
We have decided to use the second solution.
I found this script on SQL Authority:
USE MyDatabase
GO
EXEC sp_MSforeachtable #command1=“print ’?' DBCC DBREINDEX (’?', ’ ’, 80)”
GO
EXEC sp_updatestats
GO
It has reduced my insert fail rate from 100% failure down to 50%.
It is far more effective than the reindexing Maintenance Plan.
Should I also be reindexing master and tempdb?
How often? (I write to this database 24 hrs a day)
Any cautions that I should be aware of?
RAID 5 on your NAS? That's your killer.
An INSERT is logged: it writes to the .LDF (log) file. This file is 100% write (well, close enough).
This huge write to read ratio generates a lot of extra writes per disk in RAID 5.
I have an article in work (add later): RAID 5 writes 4 times as much per disk than RAID 10 in 100% write situations.
Solutions
You need to split your data and log files for your database at least.
Edit: Clarified this line:
The log files need go to RAID 1 or RAID 10 drives. It's not so important for data (.MDF) files. Log files are 100% write so benefit from RAID 1 or RAID 10.
There are other potential isues too such as fragmented file system or many Vlog segments (depending on how your database has grown), but I'd say your main issue is RAID 5.
For a 3TB DB, I'd also stuff as much RAM as possible in (32GB if Windows Advanced/Enterprise) and set PAE/AWE etc. This will mitigate some disk issues but only for data caching.
Fill factor 85 or 90 is the usual rule of thumb. If your inserts are wide and not strictly monotonic (eg int IDENTITY column) then you'll have lots of page splits with anything higher.
I'm not the only one who does not like RAID 5: BAARF
Edit again:
Look for "Write-Ahead Logging (WAL) Protocol" in this SQL Server 2000 article. It's still relevant: it explains why the log file is important.
I can't find my article on how RAID 5 suffers compared to RAID 10 under 100% write loads.
Finally, SQL Server does I/O in 64k chunks: so format NTFS with 64k clusters.
This could be anything at all. Is the system CPU bound? IO bound? Is the disk too fragemented? Is the system paging too much? Is the network overloaded?
Have you tuned the indexes? I don't recall if there was an index tuning wizard in 2000, but at the least, you could run the profiler to create a workload that could be used by the SQL Server 2005 index tuning wizard.
Check out your query plans also. Some indexes might not be getting used or the SQL could be wholly inefficient.
What table maintenance do you have?
is all the data in the tables relevant to todays processing?
Can you warehouse off some data?
What is your locking like? Are you locking the whole table?
EDIT:
The SQL profiler shows all interactions with the SQL Server. It should be a DBAs lifeblood.
Thanks for all of the help. I'm not there yet, but getting better.
I can't do much about hardware constraints.
All available RAM is allowed to SQL
Fillfactor is set at 95
Using profiler, an hour's trace offered index tuning with suggested increase of 27% efficiency.
As a result, I doubled the amount of successful INSERTS. Now only 1 out of 4 are failing.
Tracing now and will tune after to see if it gets better.
Don't understand locking yet.
For those who maintain SQL Server as a profession, am I on the right track?