Change data type in table due to Disk Space/Memory Error - sql

Attempts at changing a data type in Access have failed with the error:
"There isn't enough disk space or memory". The table contains over 385,325 records.
The solutions in the following links, among other Stack Overflow threads, have also failed:
Can't change data type on MS Access 2007
Microsoft Access can't change the datatype. There isn't enough disk space or memory
The intention is to change the data type of one column from "Text" to "Number". The aforementioned links do not accommodate that, either because of the table size or because of the data types involved.
Breaking out the table may not be an option due to the number of records.
Help on this would be appreciated.

I cannot say for sure about MS Access, but in MS SQL one can avoid a table rebuild (which requires lots of time and space) by appending a new, null-allowing column at the rightmost end of the table, populating it with normal UPDATE queries, and AFAIK even dropping the old column and renaming the new one. In the end it's only the position of that column that has changed.
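For MS SQL that looks roughly like the following (a minimal sketch with illustrative table and column names; TRY_CONVERT needs SQL Server 2012 or later, and on a large table you would typically batch the UPDATE and take a backup first):
ALTER TABLE dbo.MyTable ADD AmountNew int NULL;               -- nullable, so no table rebuild
UPDATE dbo.MyTable SET AmountNew = TRY_CONVERT(int, Amount);  -- convert the old text values
ALTER TABLE dbo.MyTable DROP COLUMN Amount;                   -- drop the old text column
EXEC sp_rename 'dbo.MyTable.AmountNew', 'Amount', 'COLUMN';   -- keep the original column name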
As for your 385,325 records: even if the table had 1,000 columns of 500 Unicode characters each, that would only amount to approximately 385,325 * 1000 * 500 * 2 bytes ≈ 385 GB of data. Nowadays that should not exceed what is available, so:
if it's disk space you're running out of, move the database to another computer, change it there, and move it back;
if the database seems to be corrupted (and the standard repair tools didn't help; work on a copy), it will most probably help to create a new table or database using table-creation queries (better: create the table manually with the desired types and append the data), as sketched below.
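In Access SQL the manual create-and-append route can be sketched like this (illustrative names; LONG is a Long Integer column, and CLng converts the text values, failing on anything non-numeric):
CREATE TABLE MyTableNew (ID LONG, Amount LONG);
INSERT INTO MyTableNew (ID, Amount)
SELECT ID, CLng(Amount) FROM MyTable;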

Related

Connection Timeout Error while reading the table having more than 100 columns in Mosaic Decisions

I am reading a table via a Snowflake reader node. With a smaller number of columns/attributes (around 50-80), the table is read fine on the Mosaic Decisions canvas. But when the number of attributes increases (approx. 385 columns), the Mosaic reader node fails. As a workaround I tried using a WHERE clause with 1=2; in that case it pulls the structure of the table. But when I try to read the records, even with a limit of only 10 records applied to the query, it throws a connection timeout error.
I faced a similar issue while reading a table with approx. 300 columns and managed it with the help of the input parameters available in Mosaic. In your case you will have to make the 1=2 condition used in the query resolve to 1=1 at run time.
The steps below can be followed to achieve this (a query sketch follows the steps):
Create a parameter (e.g. copy_variable) that holds the default value 2.
In the reader node, write the SQL with 1 = $(copy_variable). While validating, this is the same as the 1=2 condition, so it should validate fine.
Once it is validated and the schema is generated, update the default value of copy_variable to 1 so that at run time the condition becomes 1=1 and you still get all records.
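Put together, the reader-node SQL would look roughly like this (the table name is illustrative; $(copy_variable) is the parameter created in step 1):
SELECT *
FROM MY_SNOWFLAKE_TABLE
WHERE 1 = $(copy_variable)
During validation this resolves to WHERE 1 = 2 (schema only); after the default value is changed to 1, it runs as WHERE 1 = 1 and returns all records.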

nvarchar(max) - how to speed up getting only meaningful string in SQL

I have a table with an NVARCHAR(MAX) column. 90% of the time the string length is between 255 and 500 characters. Some values go well over 22,000 characters, which aren't required, as they contain XML or something the business won't ever use for reporting purposes. To cut a long story short, what is the best way to trim out all the excess bulk? I have tried the usual
left(column,500)
and
substring(column,1,500)
I have set the destination column to a length of 500.
However, loading the table from source to target still takes a while just because of that column alone. I am doing this in SSIS, in the source. I have also gone to the output column and set truncation to be ignored. Is there any way I can reduce the time taken to load this column? These methods seem to take as long as loading the full-length values. Any suggestion will be greatly appreciated.
NVARCHAR(MAX) (even when wrapped in a function like SUBSTRING or LEFT) will cost a lot of memory and will fill up your buffers quickly. Check DefaultBufferMaxSize and also the properties BLOBTempStoragePath and BufferTempStoragePath; setting them to optimal values might increase performance, but note that you have to configure them carefully, because they are a double-edged sword.
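One related thing worth trying (a sketch with illustrative names, not something covered by the links below): do the trim and an explicit cast to a fixed-width type in the source query itself, since the result of SUBSTRING or LEFT over a MAX column may still be typed as MAX. With the cast, the source component should expose the column as DT_WSTR(500) rather than as a LOB (DT_NTEXT):
SELECT CAST(LEFT(BigColumn, 500) AS nvarchar(500)) AS BigColumnTrimmed
FROM dbo.SourceTable;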
Also, if source and destination are on different servers, the network could be an issue as well, because all data has to travel from your SQL Server via the network to your SSIS server. You could try changing the network packet size.
More info is provided in these links:
Set BLOBTempStoragePath and BufferTempStoragePath to Fast Drives
Troubleshooting Package Performance
Performance Issue with NVarchar(MAX) in SSIS

Find out the amount of space each field takes in Google Big Query

I want to optimize the space of my BigQuery and Google Storage tables. Is there a way to easily find out the cumulative space that each field in a table takes? This is not straightforward in my case, since I have a complicated hierarchy with many repeated records.
You can do this in the Web UI by simply typing (and not running) the query below, substituting the field of your interest:
SELECT <column_name>
FROM YourTable
and looking at the validation message, which shows the respective size.
Important: you do not need to run it; just check the validation message for bytesProcessed, and that will be the size of the respective column.
Validation is free and invokes a so-called dry run.
If you need to do such "column profiling" for many tables or for a table with many columns, you can code this in your preferred language: use the Tables.get API to get the table schema, then loop through all fields, build the respective SELECT statement for each, dry-run it (within the loop, for each column), and read totalBytesProcessed, which, as you already know, is the size of the respective column.
I don't think this is exposed in any of the metadata.
However, you may be able to easily get good approximations based on your needs. The number of rows is provided, so for some of the data types, you can directly calculate the size:
https://cloud.google.com/bigquery/pricing
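For example, using the sizes from that pricing page, an INT64 column costs 8 bytes per row, so in a table with 10 million rows that column alone accounts for roughly 80 MB of storage.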
For types such as STRING, you could get the average length by querying e.g. the first 1000 values, and use this for your storage calculations.
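A minimal sketch of that sampling query (YourTable and string_field are placeholders; storage is counted as UTF-8 bytes plus a small per-value overhead, so treat the result as an approximation):
SELECT AVG(LENGTH(string_field)) AS avg_chars
FROM (SELECT string_field FROM YourTable LIMIT 1000) AS sample;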

When are text field lengths enforced in Salesforce? Can I get the actual field length?

I was importing data from Salesforce today when my BULK INSERT failed with too-long data: longer than the field length as reported by Salesforce itself. I discovered that this field, which Salesforce describes as a TEXT(40), has values up to 255 characters long. I can only guess that the field had a 255-character limit in the past, was changed to TEXT(40), and Salesforce has not yet applied the new limit.
When are field lengths enforced? Only when new data is inserted or modified? Are they enforced at any other point, such as a weekly schedule?
Second, is there any way to know the actual field length limit? As a database guy, not being able to rely on the metadata I've been given makes me cringe. As just one random example, if we were to restore this table from backup I assume that the long values would bomb, or possibly be truncated.
I'm using the SOAP API.
Field lengths are enforced on create/update. If you later reduce the length, existing records are not truncated. I imagine this is because Salesforce stores these as 255s regardless.
Pragmatically speaking, the "actual" field limit for any text field in salesforce should be considered 255. This is because it's possible that at some point in the past, records were inserted when the limit was as high as 255.
And you're right that if you were to dump that table and re-insert it, you very well could have rejected records due to values that exceed the field size as currently defined.
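Since the load already goes through BULK INSERT, one practical pre-check (a sketch in T-SQL, assuming the extract can first be landed in a wide staging table; table and column names are illustrative) is to count how many values actually exceed the length the metadata reports:
SELECT COUNT(*) AS too_long_rows, MAX(LEN(TheTextField)) AS longest_value
FROM dbo.SalesforceStaging
WHERE LEN(TheTextField) > 40;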

DATA_BUFFER_EXCEEDED error when calling RFC_READ_TABLE?

My Java/Groovy program receives table names and table fields from user input, queries those tables in SAP, and returns their contents.
The user input may concern the tables CDPOS and CDHDR. After reading the SAP documentation and googling, I found these are tables storing change document logs. But I did not find any remote function calls that can be used from Java to perform this kind of query.
Then I used the deprecated RFC function module RFC_READ_TABLE and tried to build customized queries depending only on this RFC. However, I found that if the number of desired fields I pass to this RFC is more than 2, I always get the DATA_BUFFER_EXCEEDED error, even if I limit the max rows.
I am not authorized to be an ABAP developer in the SAP system and cannot add any FM to existing systems, so I can only write code to accomplish this requirement in Java.
Am I doing something wrong? Could you give me some hints on that issue?
DATA_BUFFER_EXCEEDED only happens if the total width of the fields you want to read exceeds the width of the DATA parameter, which may vary depending on the SAP release - 512 characters for current systems. It has nothing to do with the number of rows, only with the width of a single row of the result.
So the question is: what are the contents of the FIELDS parameter? If it's empty, this means "read all fields." CDHDR is 192 characters in width, so I'd assume the problem is CDPOS, which is 774 characters wide. The main issue would be the fields VALUE_OLD and VALUE_NEW, both 245 characters - together those two alone take up 490 of the 512 available characters.
Even if you don't get developer access, you should prod someone to get read-only dictionary access to be able to examine the structures in detail.
Shameless plug: RCER contains a wrapper class for RFC_READ_TABLE that takes care of field handling and ensures that the total width of the selected fields is below the limit imposed by the function module.
Also be aware that these tables can be HUGE in production environments - think billions of entries. You can easily bring your database to a grinding halt by performing excessive read operations on these tables.
PS: RFC_READ_TABLE is not released for customer use as per SAP note 382318, and note 758278 recommends creating your own function module and provides a template with improved logic.
Use BBP_RFC_READ_TABLE instead
There is a way around the DATA_BUFFER_EXCEEDED error. Although this function is not released for customer use as per SAP OSS note 382318, you can get around this issue by changing the way you pass parameters to it. It's not a single field that is causing your error; the error is raised whenever a row of data exceeds 512 bytes. CDPOS will have this issue for sure!
The workaround, if you know how to call the function using JCo and pass table parameters, is to specify the exact fields you want returned. You can then keep your returned results under the 512-byte limit.
Using your example of table CDPOS, specify something like this and you should be good to go (be careful: CDPOS can get massive! You should also specify and pass a WHERE clause!):
FIELDS = 'OBJECTCLAS'....
FIELDS = 'OBJECTID'
In Java it can be expressed as:
listParams.setValue(this.getpObjectclas(), "OBJECTCLAS");
By limiting the fields you are returning you can avoid this error.