Oracle Database In-Memory - compare compression sizes

I'm playing with in-memory storage in Oracle SQL. I would like to compare the results of compression, that is, the amount of space used. For example, I'm running these statements:
ALTER TABLE RENTING INMEMORY MEMCOMPRESS FOR QUERY LOW(RETURN_DATE);
vs
ALTER TABLE RENTING INMEMORY MEMCOMPRESS FOR CAPACITY HIGH(RETURN_DATE);
Is there an easy way to check the size used by these compression levels in SQL Developer?
I found this article https://blogs.oracle.com/in-memory/database-in-memory-compression, which has a table showing the 'space used' for each type of compression. This is exactly what I am trying to do on my own. Thanks for any advice.

Querying v$im_segments after population will show you how many bytes from the table were loaded and how much of the in-memory store was utilised.
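A minimal sketch of that comparison, assuming the table is called RENTING as in the question and that your user can read the v$ views. The column aliases and the ratio expression are my own additions; the ratio is only a rough indicator.

```sql
-- Apply one compression level at table level (repeat later with FOR CAPACITY HIGH).
ALTER TABLE renting INMEMORY MEMCOMPRESS FOR QUERY LOW;

-- A full scan is one common way to trigger population when no PRIORITY is set.
SELECT /*+ FULL(r) */ COUNT(*) FROM renting r;

-- Once populate_status shows COMPLETED, compare on-disk bytes with in-memory bytes.
SELECT segment_name,
       inmemory_compression,
       populate_status,
       bytes               AS on_disk_bytes,
       bytes_not_populated,
       inmemory_size       AS in_memory_bytes,
       ROUND((bytes - bytes_not_populated) / NULLIF(inmemory_size, 0), 2)
                           AS approx_compression_ratio
FROM   v$im_segments
WHERE  segment_name = 'RENTING';
```

Record the result, run ALTER TABLE renting NO INMEMORY to clear the segment, then repeat with the other compression level and compare the in_memory_bytes values.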

Since the column space is part of the In-Memory Compression Units (IMCU), there is no way to see how much space is consumed by individual columns. It is possible to display the individual column level compression setting in the view v$im_column_level though. The closest you could come would be to compare the populated size between the two compression levels. As Connor said, you can do this with v$im_segments or you can display individual IMCU information for an object with the view v$im_header.
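As a hedged illustration, the per-column setting (not the per-column size) can be listed like this, again assuming the RENTING table from the question:

```sql
-- Shows which compression setting applies to each column of the table.
-- This reports the setting only; it does not report space used per column.
SELECT table_name,
       column_name,
       inmemory_compression
FROM   v$im_column_level
WHERE  table_name = 'RENTING'
ORDER  BY segment_column_id;
```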

Related

Vertica Large Objects

I am migrating a table from Oracle to Vertica that contains an LOB column. The maximum actual size of the LOB column amounts to 800MB. How can this data be accommodated in Vertica? Is it appropriate to use the Flex Table?
Vertica's documentation says that data loaded into a Flex table is stored in the __raw__ column, which is a LONG VARBINARY data type. By default, it has a maximum size of 32MB, which, according to the documentation, can be changed (i.e. increased) using the parameter FlexTablesRawSize.
I'm thinking this is the approach for storing large objects in Vertica. We would just need to update the FlexTablesRawSize parameter to handle 800MB of data. I'd like to ask whether this is the optimal way or whether there's a better one. Or will this conflict with Vertica's table row constraint, which only allows up to 32MB of data per row?
Thank you in advance.
If you use Vertica for what it's built for, running a Big Data database, you would, as in any analytical database, try to avoid large objects in your table. BLOBs and CLOBs are usually used to store unstructured data: large documents, image files, audio files, video files. You can't filter by such a column, run functions on it, sum it, or group by it.
A safe and performant design should lead to storing the file name in a Vertica table column, storing the file maybe even in Hadoop, and letting the front end (usually a BI tool, and all BI tools support that) retrieve the file to bring it to a report screen ...
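A rough sketch of such a design, with a hypothetical table and column names chosen purely for illustration (none of them come from the question):

```sql
-- Hypothetical reference table: holds metadata and a pointer to the file,
-- while the file itself lives outside the database.
CREATE TABLE document_ref (
    doc_id     INT           NOT NULL PRIMARY KEY,
    file_name  VARCHAR(255)  NOT NULL,
    file_uri   VARCHAR(1024) NOT NULL,  -- assumed: location in HDFS or another file store
    size_bytes INT,
    loaded_at  TIMESTAMP
);

-- The front end looks up the location and fetches the file itself.
SELECT file_name, file_uri
FROM   document_ref
WHERE  doc_id = 42;
```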
Good luck ...
Marco

How to find out how much space a SQL Server table uses?

Is it possible to get the amount of space on disk that a particular table uses? Let's say I have a million users stored in my table and I want to know how much space is required to store all users and/or one of them.
Update:
I'm planning to use Redis to cache some fields from one particular table in memory so that I can quickly retrieve the needed data afterwards. So I need to calculate approximately how much space it will take, and thus whether it will fit in memory. It certainly depends on the data types used in the table, but if a table consists of several dozen fields it would take too much time to count them one by one.
There is exactly such an answer for MySQL, though it's not suitable for SQL Server: How can you determine how much disk space a particular MySQL table is taking up? You can check it to see what I mean.
If you have SSMS, you can right-click on the table in the Object Explorer, go to Properties, and then look at the Storage page. The field, Data space, is the size of the data in that table, but it probably does not include some of the overhead costs of the table.
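If you prefer a query to the SSMS dialog, something along these lines should give comparable numbers. The table name dbo.Users is an assumption taken from the question's example; pages are 8 KB each.

```sql
-- Row count plus reserved and used space for one table, including its indexes.
SELECT OBJECT_SCHEMA_NAME(ps.object_id)  AS schema_name,
       OBJECT_NAME(ps.object_id)         AS table_name,
       SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
       SUM(ps.reserved_page_count) * 8   AS reserved_kb,
       SUM(ps.used_page_count) * 8       AS used_kb
FROM   sys.dm_db_partition_stats AS ps
WHERE  ps.object_id = OBJECT_ID(N'dbo.Users')
GROUP  BY ps.object_id;
```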
This is really an extended comment, because it does not directly answer the question.
For most purposes, you just use the size of the columns, add them together, and multiply by the number of rows. This lowballs the estimate, but it is reasonable. Depending on how you handle the types, it might also be a reasonable estimate of the size of the exported data.
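A back-of-the-envelope version of that arithmetic, assuming a hypothetical table dbo.Users and one million rows. It uses declared maximum lengths, so variable-length columns are counted at their maximum and (max) columns are skipped; treat the result as a rough figure only.

```sql
DECLARE @row_count bigint = 1000000;  -- assumed number of rows

SELECT SUM(CAST(c.max_length AS bigint))              AS declared_row_bytes,
       SUM(CAST(c.max_length AS bigint)) * @row_count AS estimated_total_bytes
FROM   sys.columns AS c
WHERE  c.object_id = OBJECT_ID(N'dbo.Users')
  AND  c.max_length > 0;  -- (max) types report -1 and are excluded here
```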
That said, the storage of tables is a difficult matter. Here are some of the factors you need to take into account:
The size of individual fields. This is made slightly more difficult because some types have varying sizes, so those are entirely data dependent.
The number of pages occupied by a table (or equivalently how full each data page is). Note that this can vary, depending on how full each table is.
The number of pages occupied by "overflow" data types, such as varchar(max).
Whether or not the data pages are compressed or encrypted.
The indexes for the table.
How full each index page is.
And, no doubt, I've left out a bunch of other relevant internal details (here is a place to start on page layouts).
In other words, there isn't a simple answer. Equivalent tables on two different systems could occupy very different amounts of space. This is true of the "same" table on the same system at different times.
The general answer when working with databases is that you need a lot more space than number of rows * row size -- I seem to recall using a factor of 3 at one point in time. In general, storage is pretty cheap, so this is not the limiting factor using a database.
We would need to see your full database schema, with tables and columns and all fields' data types. Without those pieces of information it's just a lucky guess. Here is a helpful cheat sheet of the sizes of each data type: https://www.connectionstrings.com/sql-server-2012-data-types-reference/
Then you just have to do the math and calculate the space needed for X, which is your number of records.

How is Oracle able to achieve 1000 columns?

I am trying to copy what Salesforce did in their database architecture. Basically, they have a single Oracle table with a thousand varchar(max) columns. They store all the customer data in this table. I am trying to accomplish the same thing with SQL Server. However, I am only able to get 308 varchar(max) fields in SQL Server. I would like to know how Oracle is able to reach a 1000-column limit. I'd like to do the same thing in SQL Server.
A VARCHAR(MAX) field can hold GBs of information... but the max row size is 8060 bytes, so how does that add up? Well, it doesn't store the 2GB in the row; it stores a 24-byte pointer instead. Those pointers are adding up to exceed your row size limit.
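The arithmetic can be sketched as follows. The per-row overhead terms (4-byte header, 2-byte column count, null bitmap, 2-byte variable-column count, 2-byte offset per variable column) are my assumption about the commonly described row layout, not something stated above, so treat the totals as approximate.

```sql
-- Approximate in-row size for a table made only of varchar(max) columns.
SELECT n.cols,
       n.cols * 24        AS pointer_bytes,       -- 24-byte pointer per column
       n.cols * 2         AS offset_array_bytes,  -- assumed 2-byte offset per column
       (n.cols + 7) / 8   AS null_bitmap_bytes,   -- assumed 1 bit per column
       4 + 2 + (n.cols + 7) / 8 + 2 + n.cols * 2 + n.cols * 24 AS approx_row_bytes
FROM   (VALUES (308), (309), (1000)) AS n(cols);
```

Under these assumptions 308 columns come to 8055 bytes, just under the 8060-byte limit, while 309 columns come to 8081 bytes, which would be consistent with the cut-off the question ran into.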
You could split the table out into multiple tables with fewer columns, but I don't think there is a way to override this limitation.
IMHO, a thousand columns seems more trouble than it's worth. Perhaps you could take a more normalized approach.
For example, I have an Object Def table which is linked to an Extended Properties table. The XP table is linked to the OD and has fields XP-ITEM, XP-VALUE, XP-LM-UTC, and XP-LM-Usr. This structure allows any object to have any number of extended properties ... standard and/or non-standard.
The image below may give you a better visualization.
Just a couple of notes:
1) This is not for high-volume transactional data, e.g. Daily Loan Balances
2) Each Item ID can be linked back to an object which has its own properties like Pick Lists, Excel Formats, etc.
3) One can see the entire history of edits (who, what, and when)
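A minimal sketch of the two-table structure described above. Table names, key columns and data types are my guesses for illustration, and the hyphens in the field names are replaced with underscores so they are valid identifiers.

```sql
-- Hypothetical object definition table.
CREATE TABLE dbo.ObjectDef (
    OD_ID   int IDENTITY(1,1) PRIMARY KEY,
    OD_Name varchar(100) NOT NULL
);

-- One row per extended property of an object, instead of one column per property.
CREATE TABLE dbo.ExtendedProperty (
    XP_ID     int IDENTITY(1,1) PRIMARY KEY,
    OD_ID     int          NOT NULL REFERENCES dbo.ObjectDef (OD_ID),
    XP_ITEM   varchar(100) NOT NULL,                       -- property name
    XP_VALUE  varchar(max) NULL,                           -- property value
    XP_LM_UTC datetime     NOT NULL DEFAULT (GETUTCDATE()), -- last modified (UTC)
    XP_LM_Usr varchar(100) NOT NULL                        -- last modified by
);

-- All extended properties for one object.
SELECT XP_ITEM, XP_VALUE
FROM   dbo.ExtendedProperty
WHERE  OD_ID = 1;
```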

SQL Server row length calculator

I am looking for an up to date tool to accurately calculate the total row size and page-density of any SQL table definition for SQL Server 2005+.
Please note that there are plenty of resources concerning calculating sizes of rows in existing tables, estimating techniques for sizing, etc... However, I am designing tables and have some options about column size which I am trying to balance with efficient data access - meaning that I can relocate less-frequently accessed long text into dedicated tables to allow the most frequent access of these new tables to operate at optimum speed.
Ideally there would be an online facility where a create statement can be cut and pasted, or a sproc I can run on a dev db.
The answer is a simple one until you start making proper table design decisions and balance that against joins, FK data and disk access.
I'd have a look and see how many data pages you are using, and remember that one reads an extent (8 data pages) from disk, not only the data page you are looking for. Then there is the option of data compression in your table, as well as sparse columns, out-of-row data storage and variable-length character columns.
It's not about how much data is in a column; it's really about how many data reads and how much CPU you need to get it. You can test this by executing a query and looking at the actual query plan.
As for space used, you can use a stored procedure called sp_spaceused. Here is a source you can use to see how one could use it in dbforms.
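For example, on a dev database you could run something like the following. The table name dbo.Users is a placeholder, and the second query is an addition of mine that reports average record size and page density.

```sql
-- Rows, reserved, data, index and unused space for one table.
EXEC sp_spaceused N'dbo.Users';

-- Average record size and how full the pages are, per index and allocation unit.
SELECT index_id,
       index_type_desc,
       alloc_unit_type_desc,
       page_count,
       record_count,
       avg_record_size_in_bytes,
       avg_page_space_used_in_percent
FROM   sys.dm_db_index_physical_stats(
           DB_ID(), OBJECT_ID(N'dbo.Users'), NULL, NULL, 'SAMPLED');
```

Loading a candidate table design with representative test data and comparing these figures between designs is one way to approximate the calculator being asked for.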
Hope it helps
Walter

SQL 2005 DB Partitioning for SharePoint

Background
I have a massive DB for a SharePoint site collection. It is 130GB and growing at 10GB per month. 100GB of the 130GB is in one site collection. 30GB is the version table. There is only one site collection - this is by design.
Question
Am I able to partition a database (SharePoint) using SQL 2005's data partitioning features (creating multiple data files)?
Is it possible to partition a database that is already created?
Has anyone partitioned a SharePoint DB? Will I encounter any issues?
You would have to create a partition set and rebuild the table on that partition set. SQL2005 can only partition on a single column, so you would have to have a column in the DB that
Behaves fairly predictably so you don't get a large skew in the amount of data in each partition
IIRC the column has to be a numeric or datetime value
In practice it's easiest if it's monotonically increasing - you can create a series of partitions (automatically or manually) and the system will fill them up as it gets to the range definitions.
A date (perhaps the date the document was entered) would be ideal. However, you may or may not have a useful column on the large table. M.S. tech support would be the best source of advice for this.
The partitioning should be transparent to the application (again, you need a column with appropriate behaviour to use as a partition key).
Unless you are lucky enough to have a partition key column that is also used as a search predicate in the most common queries, you may not get much query performance benefit from the partitioning. An example of a column that works well is a date column on a data warehouse. However, your SharePoint application may not make extensive use of this sort of query.
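To illustrate the mechanics on a generic table (not SharePoint's actual schema), a date-based partition set might look like this. All object names and boundary dates are hypothetical, and every partition is mapped to PRIMARY only to keep the sketch self-contained.

```sql
-- Range boundaries on a datetime column; RANGE RIGHT puts each boundary
-- value in the partition to its right.
CREATE PARTITION FUNCTION pf_doc_date (datetime)
AS RANGE RIGHT FOR VALUES ('20080101', '20080701', '20090101');

-- Map the four resulting partitions to filegroups (all to PRIMARY here).
CREATE PARTITION SCHEME ps_doc_date
AS PARTITION pf_doc_date ALL TO ([PRIMARY]);

-- A table created on the scheme is partitioned by its entered_on column.
CREATE TABLE dbo.DocumentArchive (
    doc_id     int            NOT NULL,
    entered_on datetime       NOT NULL,
    content    varbinary(max) NULL
) ON ps_doc_date (entered_on);
```

In a real deployment each partition would normally go to its own filegroup with its own data files; an existing table has to be rebuilt on the scheme, as described above.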
Mauro,
Is there no way you can segment the data on a Sharepoint level?
i.e. you may have multiple "sites" using a single (SQL) content database.
You could migrate site data to a new content database, which will allow you to reduce the data in that large content database and then shrink the data files.
It will also assist you in managing your obvious continued growth.
James.