Remove quotation chars from file while bulk inserting data to table - sql

This is the Users table:
CREATE TABLE Users
(
Id INT PRIMARY KEY IDENTITY(0,1),
Name NVARCHAR(20) NOT NULL,
Surname NVARCHAR(25) NOT NULL,
Email NVARCHAR(30),
Facebook NVARCHAR(30),
CHECK(Email IS NOT NULL OR Facebook IS NOT NULL)
);
This is the BULK INSERT statement:
BULK INSERT Users
FROM 'C:\Users\SAMIR\Downloads\Telegram Desktop\users.txt'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
--FIRSTROW = 0,
--UTF-8
CODEPAGE = '65001'
);
This is the data in the users.txt file:
`1, N'Alex', N'Mituchin', N'qwe@gmail.com', NULL`
When I load data from the file, the Name column ends up with values like N'Alex', but I want it to contain simply Alex. How can I fix this?

I recommend loading the data into a staging table where all values are strings.
Then you can use a simple query to get the final results. In this case, you can do:
-- the sample file has a space after each comma, hence ltrim
select (case when ltrim(name) like 'N''%'''
then substring(ltrim(name), 3, len(ltrim(name)) - 3) -- drop the leading N' and the trailing '
else name
end) as name
from staging
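As a rough sketch of the whole staging round trip for the Users table from the question: the staging table name UsersStaging and its column sizes are assumptions, and the approach only holds if no value contains a comma. The literal text NULL in the file is turned into a real NULL, and Id is left out because it is an IDENTITY column.

```sql
-- Hypothetical staging table: every column is plain text, so nothing is rejected on load.
CREATE TABLE UsersStaging
(
    Id       NVARCHAR(50),
    Name     NVARCHAR(50),
    Surname  NVARCHAR(50),
    Email    NVARCHAR(50),
    Facebook NVARCHAR(50)
);

BULK INSERT UsersStaging
FROM 'C:\Users\SAMIR\Downloads\Telegram Desktop\users.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    CODEPAGE = '65001'
);

-- Trim each value once, then strip the N'...' wrapper where it is present.
INSERT INTO Users (Name, Surname, Email, Facebook)
SELECT
    CASE WHEN t.Name LIKE 'N''%''' THEN SUBSTRING(t.Name, 3, LEN(t.Name) - 3) ELSE t.Name END,
    CASE WHEN t.Surname LIKE 'N''%''' THEN SUBSTRING(t.Surname, 3, LEN(t.Surname) - 3) ELSE t.Surname END,
    CASE WHEN t.Email = 'NULL' THEN NULL
         WHEN t.Email LIKE 'N''%''' THEN SUBSTRING(t.Email, 3, LEN(t.Email) - 3)
         ELSE t.Email END,
    CASE WHEN t.Facebook = 'NULL' THEN NULL
         WHEN t.Facebook LIKE 'N''%''' THEN SUBSTRING(t.Facebook, 3, LEN(t.Facebook) - 3)
         ELSE t.Facebook END
FROM UsersStaging AS s
CROSS APPLY (VALUES (LTRIM(RTRIM(s.Name)), LTRIM(RTRIM(s.Surname)),
                     LTRIM(RTRIM(s.Email)), LTRIM(RTRIM(s.Facebook))))
            AS t (Name, Surname, Email, Facebook);
```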

There's a better option for this. If the string delimiters and Unicode indicators are consistent (they're present on all rows), you should use a format file, where you can specify a delimiter for each column. This will allow you to set , N' as the delimiter between the first and second columns, ', N' as the delimiter between the second and third columns, and so on.

Related

SQL replace 2-3 letter value in string

I have a table of contacts and their key skills, where the skills are stored as a semicolon-separated list.
I can't figure out how to replace short key skills without harming longer skills that contain them, e.g. UI is contained in the word Building.
For skills longer than 4 letters I'm using the SQL script below to replace value1 (@current) with value2 (@replace), and it works fine:
DECLARE @current varchar(50) = 'UI'
DECLARE @replace varchar(50) = 'New Skill'
UPDATE database.dbo.contact
SET key_skill = CASE
WHEN key_skill LIKE '%'+@replace+'%'+@current THEN REPLACE(key_skill, ';'+@current, '')
WHEN key_skill LIKE '%'+@current+'%'+@replace THEN REPLACE(key_skill, @current+';', '')
WHEN (key_skill LIKE '%'+@replace+'%'+@current+'%') OR (key_skill LIKE '%'+@current+'%'+@replace+'%') THEN REPLACE(key_skill, @current+';', '')
WHEN key_skill LIKE '%'+@current+'%' THEN REPLACE(key_skill, @current, @replace)
ELSE key_skill END
FROM database.dbo.contact
WHERE (key_skill LIKE '%'+@current+'%')
If it is at all possible you should change your design as soon as possible. There is almost never a good reason to store lists as delimited strings in a database. Databases already have the perfect structure for storing lists, they are called tables. A second table that links contacts to skills will be really useful here. Something like this:
CREATE TABLE dbo.Contact
(
ContactID INT IDENTITY (1, 1) NOT NULL,
Name VARCHAR(255) NOT NULL,
CONSTRAINT PK_Contact__ContactID PRIMARY KEY (ContactID)
);
CREATE TABLE dbo.KeySkill
(
KeySkillID INT IDENTITY (1, 1) NOT NULL,
Name VARCHAR(50) NOT NULL,
CONSTRAINT PK_KeySkill__KeySkillID PRIMARY KEY (KeySkillID)
);
CREATE TABLE dbo.ContactKeySkill
(
ContactID INT NOT NULL,
KeySkillID INT NOT NULL,
CONSTRAINT PK_ContactKeySkill__ContactID_KeySkillID PRIMARY KEY (ContactID, KeySkillID),
CONSTRAINT FK_ContactKeySkill__ContactID FOREIGN KEY (ContactID) REFERENCES dbo.Contact (ContactID),
CONSTRAINT FK_ContactKeySkill__KeySkillID FOREIGN KEY (KeySkillID) REFERENCES dbo.KeySkill (KeySkillID)
);
With this structure in place everything else becomes significantly easier. You can recreate your existing format if needed as follows:
SELECT c.ContactID, c.Name, Skills = STRING_AGG(ks.Name, ';')
FROM dbo.Contact AS c
INNER JOIN dbo.ContactKeySkill AS cks
ON cks.ContactID = c.ContactID
INNER JOIN dbo.KeySkill AS ks
ON ks.KeySkillID = cks.KeySkillID
GROUP BY c.ContactID, c.Name;
You are also in complete control of ordering and filtering (with indexes), and data integrity (no duplicates, or typos etc).
Adding/removing skills becomes as simple as inserting/deleting rows rather than having to do any string manipulation.
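A minimal sketch of what adding and removing a skill could look like with the three tables above. The contact name 'John Doe' and skill 'PHP' are illustrative values, and the lookups assume names are unique (the scalar subqueries fail if a name matches more than one row).

```sql
-- Look up the surrogate keys (illustrative names, assumed unique).
DECLARE @ContactID  INT = (SELECT ContactID  FROM dbo.Contact  WHERE Name = 'John Doe');
DECLARE @KeySkillID INT = (SELECT KeySkillID FROM dbo.KeySkill WHERE Name = 'PHP');

-- Add the skill to the contact; skipped if either lookup failed or the link already exists.
INSERT INTO dbo.ContactKeySkill (ContactID, KeySkillID)
SELECT @ContactID, @KeySkillID
WHERE @ContactID IS NOT NULL
  AND @KeySkillID IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM dbo.ContactKeySkill
                  WHERE ContactID = @ContactID
                    AND KeySkillID = @KeySkillID);

-- Remove the skill from the contact again.
DELETE FROM dbo.ContactKeySkill
WHERE ContactID = @ContactID
  AND KeySkillID = @KeySkillID;
```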
And if you decided you wanted to rename a skill, e.g. "UI" to "User Interface", that is again really easy in a properly designed database:
UPDATE dbo.KeySkill
SET Name = 'User Interface'
WHERE Name = 'UI';
Because you have now separated all your data, you can be certain that when you update UI there are no side effects because that is the only value stored in that field.
If you are not in control of your design and can't make these changes, then the following should work for you:
STUFF(key_skill,
CHARINDEX(CONCAT(';', @current, ';'),CONCAT(';', key_skill, ';')),
LEN(@current),
@replace);
The premise is that if you add ; to the start and the end of both your key_skill string and your @current parameter, then it doesn't matter whether the term is at the start or the end of the string: you would be looking for ;UI; in ;UI;PHP;Building;, so the search term no longer matches in Building.
It is easier to use STUFF() here rather than REPLACE(), just so you don't have to actually build a string with semi-colons on the end, then remove them at the end. All you need is to use CHARINDEX to find out where the skill starts in the string (2nd argument in stuff), the length of the skill (3rd argument), and use this as the starting point to "stuff" your new string in (4th argument).
Demo
CREATE TABLE #T (Contact VARCHAR(255), key_skill VARCHAR(255));
INSERT #T(Contact, key_skill)
VALUES
('John Doe', 'AI;UI;ONC;BI;PHP'),
('Craig Smith', 'UI;PHP;Building'),
('Loren Paul', 'AI;UI');
DECLARE @current VARCHAR(50) = 'UI',
@replace VARCHAR(50) = 'New Skill';
UPDATE #T
SET key_skill = STUFF(key_skill,
CHARINDEX(CONCAT(';', @current, ';'),CONCAT(';', key_skill, ';')),
LEN(@current),
@replace)
WHERE CHARINDEX(CONCAT(';', @current, ';'),CONCAT(';', key_skill, ';')) > 0;
SELECT *
FROM #T;
ADDENDUM
Since you can't change your data structure a more robust method of doing this will be to deconstruct your delimited list (using STRING_SPLIT()), then make your changes, then reconstruct it again (using STRING_AGG()), e.g.
CREATE TABLE #T (Contact VARCHAR(255), key_skill VARCHAR(255));
INSERT #T(Contact, key_skill)
VALUES
('John Doe', 'AI;UI;ONC;BI;PHP'),
('Craig Smith', 'UI;PHP;Building'),
('Loren Paul', 'AI;UI');
DECLARE @current VARCHAR(50) = 'UI',
@replace VARCHAR(50) = 'New Skill';
UPDATE t
SET t.key_skill = s.NewList
FROM #T AS t
CROSS APPLY
( SELECT STRING_AGG(Value, ';')
FROM ( SELECT Value
FROM STRING_SPLIT(t.key_skill, ';') AS s
WHERE s.value <> @current
UNION
SELECT @replace
WHERE @replace <> ''
) AS s
) AS s (NewList);
Where no @current value is specified this will simply add a skill, and where no @replace is set, this will just remove the @current skill.
ADDENDUM 2
For SQL Server 2016 that doesn't support STRING_AGG() you can use XML extensions as an alternative:
DECLARE @current VARCHAR(50) = 'UI',
@replace VARCHAR(50) = 'New Skill';
UPDATE t
SET t.key_skill = STUFF(s.NewList.value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM #T AS t
CROSS APPLY
( SELECT CONCAT(';', Value)
FROM ( SELECT Value
FROM STRING_SPLIT(t.key_skill, ';') AS s
WHERE s.value <> @current
UNION
SELECT @replace
WHERE @replace <> ''
) AS s
FOR XML PATH(''), TYPE
) AS s (NewList);

BULK INSERT 0 Rows affected

I have been at this problem all morning and can't seem to figure it out. I have a simple txt file with the following entries:
1,van Rhijn
2,van Dam
3,van Rhijn van Dam
I am trying to import these fields using the following query:
CREATE TABLE #test
(
Id INT NOT NULL,
LastName VARCHAR(MAX) NOT NULL
)
BULK INSERT #test
FROM 'C:\test.txt'
WITH
(
MAXERRORS = 0,
FIRSTROW = 1,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n'
)
SELECT *
FROM #test
I have tried everything I found on the web: changing the delimiter, row terminator, encoding and extension. I keep getting the message "0 row(s) affected", and the last select obviously returns no rows.
EDIT: I use Microsoft SQL Server.
Please help.
Might be a silly question, but are you actually using a bare file name like 'test.txt', or a fully qualified path like 'c:\test.txt'? I am pretty sure you need to use the full path here.
CREATE TABLE #test
(
Id INT NOT NULL,
LastName VARCHAR(MAX) NOT NULL
)
BULK INSERT #test
FROM 'C:\test.txt'
WITH
(
MAXERRORS = 0,
FIRSTROW = 1,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
SELECT *
FROM #test
Or wherever your file resides on the network (note if your SQL server is on a different machine you will probably need to use a network path and/or shared folder combination).
edit: Try updating your ROWTERMINATOR to \n only

SQL Truncation Issue Converting VARCHAR to VARBINARY

I have a fairly simple insert from a csv file into a temp table into a table with an encrypted column.
CREATE TABLE table1
(number varchar(32) NOT NULL
, user_varchar1 varchar(65) NOT NULL
, account varchar(32) NOT NULL)
CREATE TABLE #temp1
(number varchar(32) NOT NULL
, user_varchar1 varchar(65) NOT NULL
, account varchar(32) NOT NULL)
OPEN SYMMETRIC KEY SKey
DECRYPTION BY CERTIFICATE CERTCERT
--Flat File Insert
BULK INSERT #temp1
FROM '\\Server\Data\filename.csv'
WITH (FIELDTERMINATOR = ','
, FIRSTROW =2
, ROWTERMINATOR = '\n'
);
INSERT INTO table1
(number, user_varchar1, account_encrypted)
SELECT user_varchar1, number
, ENCRYPTBYKEY(KEY_GUID('SKey'),(CONVERT(varbinary(MAX), account)))
FROM #temp1
--SELECT * FROM #esa_import_ach
DROP TABLE #temp1
SELECT * FROM table1
CLOSE MASTER KEY
CLOSE SYMMETRIC KEY SKey;
The error I receive is
Msg 8152, Level 16, State 11, Line 40
String or binary data would be truncated.
Now if I allow NULLS into table1, it fills with NULLS, obviously. If I omit the account_encrypted column altogether, the script works.
If I use
INSERT INTO table1 (number, user_varchar1, account)
VALUES ('175395', '87450018RS', ENCRYPTBYKEY(KEY_GUID('SKey'), (CONVERT(varbinary(MAX), account)))
there's no problem.
So, is there something wrong with the way I'm executing the BULK INSERT, is it my declaration of the data types, or is it the source file itself?
The source file looks like this (just one row):
emp_id, number, account
175395, 87450018RS,GRDI27562**CRLF**
Thanks and I'm hoping this makes sense.
The problem is that your account column is defined as varchar(32).
ENCRYPTBYKEY returns varbinary with a maximum size of 8,000 bytes, and the ciphertext is longer than the plaintext, so it won't fit in a varchar(32) column. Either expand the column, or cast the result to a smaller size that fits inside the column.
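A sketch of the "expand the column" option, assuming table1 can be recreated and the symmetric key SKey is already open. The varbinary(256) size is a guess rather than a documented requirement, so the last query checks how large the stored ciphertext actually is.

```sql
-- table1 with a dedicated varbinary column for the ciphertext (size is an assumption).
CREATE TABLE table1
(
    number            varchar(32)    NOT NULL,
    user_varchar1     varchar(65)    NOT NULL,
    account_encrypted varbinary(256) NOT NULL
);

-- Encrypt while copying from the temp table loaded by BULK INSERT.
INSERT INTO table1 (number, user_varchar1, account_encrypted)
SELECT number,
       user_varchar1,
       ENCRYPTBYKEY(KEY_GUID('SKey'), CONVERT(varbinary(MAX), account))
FROM #temp1;

-- Measure the ciphertext before settling on a final column size.
SELECT MAX(DATALENGTH(account_encrypted)) AS MaxCipherBytes
FROM table1;
```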

Bulk Insert (TSQL) from csv file with missing values

Problem:
I have a table
CREATE TABLE BestTableEver
(
Id INT,
knownValue INT,
unknownValue INT DEFAULT 0,
totalValue INT DEFAULT 0);
And I have this CSV File (Loki.csv)
Id, knownValue, unknownValue, totalValue
1, 11114
2, 11135
3, 11235
I want to do a bulk insert into the table, and since I do not know the values of unknownValue and totalValue yet, I want them to take the default value (as defined in the table creation).
My approach so far
create procedure populateLikeABoss
@i_filepath NVARCHAR(2048)
AS
BEGIN
DECLARE @thor nvarchar(MAX)
SET @thor=
'BULK INSERT BestTableEver
FROM ' + char(39) + @i_filepath + char(39) + '
WITH
(
FIELDTERMINATOR = '','',
ROWTERMINATOR = ''\n'',
FIRSTROW = 2,
KEEPNULLS
)'
exec(@thor)
END
and calling the procedure to do the magic
populateLikeABoss 'C:\Loki.csv'
Error
Bulk load data conversion error (type mismatch or invalid character
for the specified codepage) for row 2, column 2 (sizeOnMedia).
References
Keeping NULL value with bulk insert (Microsoft)
Similar question without the answer I need (StackOverflow)
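I think the CSV is not in the expected format. To keep NULLs, each record should have the form 1, 11114,, so that every row has all four fields. The other option is to remove the last two columns from the header. Note that KEEPNULLS stores NULL for empty fields; omit it if you want the column defaults to be applied instead.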
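If the file cannot be changed, another way to get the defaults is to go through a staging table, sketched here under the assumption that every data row holds exactly two fields. The name #LokiStaging is hypothetical. Because the final INSERT names only Id and knownValue, the table's DEFAULT 0 applies to the other two columns.

```sql
-- Staging table that matches the two fields actually present in each data row.
CREATE TABLE #LokiStaging
(
    Id         INT,
    knownValue INT
);

BULK INSERT #LokiStaging
FROM 'C:\Loki.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2   -- skip the header row
);

-- unknownValue and totalValue are omitted, so their defaults are used.
INSERT INTO BestTableEver (Id, knownValue)
SELECT Id, knownValue
FROM #LokiStaging;

DROP TABLE #LokiStaging;
```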

Varchar to Number in sql

I have written a query that fetches an amount, which is a number like '50,000' or '80,000'.
select Price_amount
from per_prices
Because these values contain ',', they are treated as varchar. The requirement is to output them as numbers, with the ','.
That is, how can '50,000' be treated as a number and not as varchar?
If a value has anything other than digits in it, it is not an integer; it is a string containing characters. In your case you have a string containing the characters 5, 0 and ,.
If this is what is stored in your database and this is what you want to display, then go ahead; you do not need to change it to an integer or anything else. But if you are doing calculations on these values before displaying them, then you do need to convert them to integers, do the calculation, and convert the result back to varchar to show the , between the thousands before you display/select them.
Example
DECLARE @TABLE TABLE (ID INT, VALUE VARCHAR(100))
INSERT INTO @TABLE VALUES
(1, '100,000'),(2, '200,000'),(3, '300,000'),(4, '400,000'),
(1, '100,000'),(2, '200,000'),(3, '300,000'),(4, '400,000')
SELECT ID, SUM(
CAST(
REPLACE(VALUE, ',','') --<-- Replace , with empty string
AS INT) --<-- Cast as INT
) AS Total --<-- Now SUM up Integer values
FROM @TABLE
GROUP BY ID
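To turn a calculated total back into text with thousands separators, one option on SQL Server 2012 or later is FORMAT with the 'N0' format string. This is a sketch rather than part of the original answer; the table variable @Prices and its two rows are made-up sample data based on the values in the question.

```sql
-- Sample data shaped like the per_prices column from the question.
DECLARE @Prices TABLE (Price_amount VARCHAR(100));
INSERT INTO @Prices VALUES ('50,000'), ('80,000');

SELECT FORMAT(
           SUM(CAST(REPLACE(Price_amount, ',', '') AS INT)), -- strip commas, sum as INT
           'N0',                                             -- thousands separators, no decimals
           'en-US'                                           -- culture that uses , as separator
       ) AS Total
FROM @Prices;
```

FORMAT returns nvarchar, so it belongs at the display step, after all arithmetic is done.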
You could combine the REPLACE and CAST functions:
SELECT CAST(REPLACE(Price_amount, ',', '') AS int) AS Price_Number FROM per_prices
For more information, see the documentation for REPLACE and CAST.