Leading zeros disappear when using bulk insert from file - sql

I am using bulk insert to insert data from a csv file into a SQL table. One of the columns in the csv file is an "ID" column, i.e. each cell in the column is an "ID number" that may have leading zeros. Example: 00117701, 00235499, etc.
The equivalent column in the SQL table is of varchar(255) type.
When I bulk insert the data into the table, the leading zeros in each element of the "ID" column disappear. In other words, 00117701 becomes 117701, etc.
Is this a column type problem? If not, what's the best way to overcome this problem?
Thanks!

Not sure what is causing it to strip off the leading zeroes, but I had to 'fix' some data in the past and did something like this:
UPDATE <table> SET <field> = RIGHT('00000000'+cast(<field> as varchar(8)),8)
You may need to adjust it a bit for your purposes, but maybe you get the idea from it?
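As a quick check of what that expression does, here is a minimal sketch that uses the two example IDs from the question as literals. It assumes every ID is exactly 8 characters wide, which the examples suggest but the question does not state.

```sql
-- Pad a value that lost its leading zeros back to 8 characters.
-- Assumption: all IDs are exactly 8 characters wide.
SELECT RIGHT('00000000' + CAST(117701 AS varchar(8)), 8) AS padded_id;  -- '00117701'
SELECT RIGHT('00000000' + CAST(235499 AS varchar(8)), 8) AS padded_id;  -- '00235499'
```

If your IDs vary in width, a fixed pad like this cannot restore them, because the original number of zeros is no longer known.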

Related

importing data with commas in numeric fields into redshift

I am importing data into redshift using the SQL COPY statement. The data has comma thousands separators in the numeric fields which the COPY statement rejects.
The COPY statement has a number of options to specify field separators, date and time formats and NULL values. However I do not see anything to specify number formatting.
Do I need to preprocess the data before loading, or is there a way to get Redshift to parse the numbers correctly?
Import the columns as TEXT data type into a temporary table.
Then insert from the temporary table into your target table. Have the SELECT statement for the INSERT replace the commas with empty strings and cast the values to the correct numeric type.
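The two steps above could look roughly like the sketch below. All table and column names (sales, sales_staging, amount, amount_text) and the DECIMAL(18,2) precision are hypothetical, and the COPY itself is left out because its source and options depend on your setup.

```sql
-- Hypothetical target table with a proper numeric column.
CREATE TABLE sales (
    amount DECIMAL(18,2)
);

-- 1. Stage the raw field as text so COPY does not try to parse it.
CREATE TEMP TABLE sales_staging (
    amount_text VARCHAR(32)
);

-- 2. Run your existing COPY statement against sales_staging here.

-- 3. Strip the thousands separators and cast while inserting.
INSERT INTO sales (amount)
SELECT CAST(REPLACE(amount_text, ',', '') AS DECIMAL(18,2))
FROM sales_staging;
```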

Data Type Selection for table

I have a tab-delimited flat file. One of the columns, called earlydate, has values like:
18-08-2016 08:12:21
Can anyone suggest the best data type for this column in the table, other than VARCHAR or NVARCHAR? I don't want to treat it as a string.

Alter column from varchar to decimal when nulls exist

How do I alter a sql varchar column to a decimal column when there are nulls in the data?
I thought:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6)
But I just get an error, I assume because of the nulls:
Error converting data type varchar to numeric. The statement has been terminated.
So I thought to remove the nulls I could just set them to zero:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6) NOT NULL DEFAULT 0
but I don't seem to have the correct syntax.
What's the best way to convert this column?
Edit:
People have suggested it's not the nulls that are causing me the problem, but non-numeric data. Is there an easy way to find the non-numeric data and either disregard it, or highlight it so I can correct it?
If it were just the presence of NULLs, I would just opt for doing this before the alter column:
update table1 set data = '0' where data is null
That would ensure all nulls are gone and you could successfully convert.
However, I wouldn't be too certain of your assumption. It seems to me that your new column is perfectly capable of handling NULL values since you haven't specified not null for it.
What I'd be looking for is values that aren't NULL but also aren't something you could turn into a real numeric value, such as what you get if you do:
insert into table1 (data) values ('paxdiablo is good-looking')
though some may argue that should be treated as 0, a false-y value :-)
The presence of non-NULL, non-numeric data seems far more likely to be causing your specific issue here.
As to how to solve that, you're going to need a where clause that can recognise whether a varchar column is a valid numeric value and, if not, change it to '0' or NULL, depending on your needs.
I'm not sure if SQL Server has regex support but, if so, that'd be the first avenue I'd investigate.
Alternatively, provided you understand the limitations (a), you could use isnumeric() with something like:
update table1 set data = NULL where isnumeric(data) = 0
This will force all non-numeric values to NULL before you try to convert the column type.
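If you would rather look at the offending rows before overwriting them, a minimal sketch along these lines may help. It assumes SQL Server 2012 or later, where TRY_CAST is available, and uses the table1/data names and the decimal(19,6) target from the question.

```sql
-- List values that are not NULL but cannot be converted to decimal(19,6).
-- Assumption: SQL Server 2012+ (TRY_CAST returns NULL instead of raising an error).
SELECT data
FROM table1
WHERE data IS NOT NULL
  AND TRY_CAST(data AS decimal(19,6)) IS NULL;
```

Because this tests the exact conversion you are about to perform, it avoids the isnumeric() false positives described in the footnote.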
And, please, for the love of whatever deities you believe in, back up your data before attempting any of these operations.
If none of those above solutions work, it may be worth adding a brand new column and populating bit by bit. In other words set it to NULL to start with, and then find a series of updates that will copy data to this new column.
Once you're happy that all data has been copied, you should then have a series of updates you can run in a single transaction if you want to do the conversion in one fell swoop. Drop the new column and then do the whole lot in a single operation:
create new column;
perform all updates to copy data;
drop old column;
rename new column to old name.
(a) From the linked page:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
Possible solution:
CREATE TABLE test
(
data VARCHAR(100)
)
GO
INSERT INTO test VALUES ('19.01');
INSERT INTO test VALUES ('23.41');
ALTER TABLE test ADD data_new decimal(19,6)
GO
UPDATE test SET data_new = CAST(data AS decimal(19,6));
ALTER TABLE test DROP COLUMN data
GO
EXEC sp_RENAME 'test.data_new' , 'data', 'COLUMN'
As people have said, that error doesn't come from nulls; it comes from varchar values that can't be converted to decimal. The most typical reason I've found for this (after checking that the column doesn't contain any invalid values, like non-digit characters or doubled commas) is that the varchar values use a comma as the decimal separator instead of a period.
For instance, if you run the following:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
SELECT @T, CAST(@T AS DEC(32,2))
You will get an error.
Instead:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
-- Let's change the comma to a period
SELECT @T = REPLACE(@T,',','.')
SELECT @T, CAST(@T AS DEC(32,2)) -- Now it works!
It should be easy enough to check whether your column has these cases and run the appropriate update before your ALTER COLUMN, if this is the cause.
You could also use a similar idea and run a regex search on the column for all values that don't match the digit / digit+'.'+digit criteria, but I suck with regex so someone else can help with that. :)
Also, the American convention uses commas as thousands separators, so the number 123100.5 would appear as '123,100.5'. In those cases you might want to replace the commas with empty strings and then try the conversion.
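For the thousands-separator case, a minimal sketch in the same style as the snippets above might be:

```sql
DECLARE @T VARCHAR(256) = '123,100.5';
-- Remove the thousands separators, then convert.
SELECT @T AS original,
       CAST(REPLACE(@T, ',', '') AS DEC(32,2)) AS converted;  -- 123100.50
```

Be careful to pick one convention per column: applying this to a value like '5,6', where the comma is a decimal separator, would silently turn it into 56.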

Modifying a column type with data, without deleting the data

I have a column which I believe has been declared wrongly. It contains data and I do not wish to lose the data.
I wish to change the definition from varchar(max) to varchar(an integer). I was under the impression I cannot just alter the column type?
Is the best method to create a temp column, "column2", transfer the data to this column, from the column with the problematic type, delete the problem column and then rename the temp column to the original problematic column?
If so, how do I copy the values from the problem column to the new column?
EDIT: For anyone with same problem, you can just use the ALTER statements.
As long as the data types are somewhat "related" - yes, you can absolutely do this.
You can change an INT to a BIGINT - the value range of the second type is larger, so you're not in danger of "losing" any data.
You can change a VARCHAR(50) to a VARCHAR(200) - again, types are compatible, size is getting bigger - no risk of truncating anything.
Basically, you just need
ALTER TABLE dbo.YourTable
ALTER COLUMN YourColumn VARCHAR(200) NULL
or whatever. As long as you don't have any string longer than those 200 characters, you'll be fine. Not sure what happens if you did have longer strings - either the conversion will fail with an error, or it will go ahead and tell you that some data might have been truncated. So I suggest you first try this on a copy of your data :-)
It gets a bit trickier if you need to change a VARCHAR to an INT or something like that - obviously, if you have column values that don't "fit" into the new type, the conversion will fail. But even using a separate "temporary" new column won't fix this - you need to deal with those "non-compatible" cases somehow (ignore them, leave NULL in there, set them to a default value - something).
Also, switching between VARCHAR and NVARCHAR can get tricky if you have e.g. non-Western European characters - you might lose certain entries upon conversion, since they can't be represented in the other format, or the "default" conversion from one type to the other doesn't work as expected.
Calculate the maximum length of the data stored in that column of the table:
Select max(len(fieldname)) from tablename
Now you can decrease the size of the column down to the result returned by that query:
ALTER TABLE dbo.YourTable
ALTER COLUMN YourColumn VARCHAR(200) NULL
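The check and the alter can also be combined, roughly as sketched below. It uses DATALENGTH rather than LEN as the more conservative test, because LEN ignores trailing spaces. For a varchar column DATALENGTH returns bytes, which equals the character count only when the collation is single-byte, so treat that as an assumption.

```sql
-- Only shrink the column if no existing value exceeds the new size.
-- Assumption: single-byte collation, so bytes = characters for varchar.
IF EXISTS (SELECT 1 FROM dbo.YourTable WHERE DATALENGTH(YourColumn) > 200)
    PRINT 'Some values are longer than 200 bytes; fix them before altering.';
ELSE
    ALTER TABLE dbo.YourTable
    ALTER COLUMN YourColumn VARCHAR(200) NULL;
```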
According to the PostgreSQL docs, you can simply alter the table:
ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
But here's the thing:
This will succeed only if each existing entry in the column can be converted to the new type by an implicit cast. If a more complex conversion is needed, you can add a USING clause that specifies how to compute the new values from the old.
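As an illustration of the USING clause, here is a sketch that assumes price is currently a text column holding values such as '1,234.50'. That starting type and format are assumptions made for the example, not something stated above.

```sql
-- Assumption: price is a text column with comma thousands separators.
ALTER TABLE products
    ALTER COLUMN price TYPE numeric(10,2)
    USING replace(price, ',', '')::numeric(10,2);
```

The expression after USING is evaluated for every existing row, so any value it cannot convert will make the whole ALTER fail.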
Add a temporary column2 with type varchar(NN), run update tbl set column2 = column, and check whether any error occurs. If everything is fine, alter your original column, copy the data back and remove column2.

SQL query to extract text from a column and store it to a different column in the same record

I need some help with a SQL query...
I have a SQL table that holds in a column details of a form that has been submitted. I need to get a part of the text that is stored in that column and put it into a different column on the same row. The bit of text that I need to copy is always in the same position in the column.
Any help would be appreciated guys... my mind has gone blank :">
UPDATE mytable
SET other_column = SUBSTRING(column, begin_position, length)
You may just want to use a computed column. This way if the source string changes, your computed column is still correct. If you need to seek to this substring then you might want a persisted computed column if your db supports it.
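In SQL Server, for example, a persisted computed column could be declared roughly as below. The names form_details and extracted_part, and the position 10 and length 5, are hypothetical placeholders.

```sql
-- Hypothetical names and positions; adjust to your schema.
-- PERSISTED stores the computed value so it can be indexed and searched.
ALTER TABLE mytable
ADD extracted_part AS SUBSTRING(form_details, 10, 5) PERSISTED;
```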
UPDATE table
SET Column2 = SUBSTRING(Column1, startPos, length)
What if the value you wanted to copy was in a different position in each record, but always followed the same text?
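One possible approach for that case, sketched in T-SQL, is to locate the marker text with CHARINDEX and start the substring right after it. The sample string, the marker 'Ref:' and the length 8 are all made up for illustration.

```sql
DECLARE @details varchar(200) = 'Name: Smith; Ref:AB123456; Status: open';
DECLARE @marker  varchar(20)  = 'Ref:';

-- Start right after the marker and take the next 8 characters.
SELECT SUBSTRING(@details,
                 CHARINDEX(@marker, @details) + LEN(@marker),
                 8) AS extracted;  -- 'AB123456'
```

In an UPDATE you would probably also add WHERE CHARINDEX('Ref:', Column1) > 0 so that rows without the marker are left alone. Note that LEN ignores trailing spaces, so a marker that ends in a space needs its length written out explicitly.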