How do I properly determine data types when I have numbers in quotes or mixed data? - sql

I know this is a very simple question, but I can't get any further. I want to import data from a CSV file into PostgreSQL. I have created a table with columns named as they are in the file, and the first problem I ran into is that I don't know which data types to use. When I open the CSV file, I see something like this:
"COLUMN1";"COLUMN2";"COLUMN3";"COLUMN4"
"009910";NA;NA;"FALSE"
"953308";0;41;"TRUE"
"936540";NA;NA;"FALSE"
"902346";1;5;"TRUE"
"747665";NA;NA;"FALSE"
"074554";NA;NA;"FALSE"
"154572";NA;NA;"FALSE"
When I import this file via pgAdmin 4, it returns a data type error. I set column2 as integer, but its contents are mixed (numbers and NA). I also set column1 as integer, but the numbers are in quotes, so I wonder whether PostgreSQL sees them as strings. The same goes for column4. How should I properly determine the data type of each column?

During import it will cast the value to the column's type, if possible.
For example, if you do SELECT 'FALSE'::boolean it will cast and return false. SELECT '074554'::int works as well and returns 74554.
But the bare characters NA will give you problems. If those are intended to be null, try to do a find/replace on the file and just take them out, so that the first row of data has "009910";;;"FALSE" and see if that works.
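As an alternative to editing the file, PostgreSQL's COPY command can be told which unquoted string stands for null. The following is a minimal sketch, assuming a table named my_table (a hypothetical name) whose four columns are typed integer, integer, integer and boolean, and a placeholder file path:

```sql
-- my_table and '/path/to/data.csv' are placeholders, not names from the question.
-- NULL 'NA' makes every unquoted NA field load as NULL;
-- HEADER skips the first line with the column names.
COPY my_table (column1, column2, column3, column4)
FROM '/path/to/data.csv'
WITH (FORMAT csv, DELIMITER ';', HEADER true, NULL 'NA');
```

COPY reads the file on the database server; from psql, \copy with the same options reads a file on the client instead. pgAdmin's import dialog should expose an equivalent null-string setting, though the exact label depends on the version.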
You could also have all columns as text, quote the NA values, and import.
Then create a new table, and use INSERT INTO ... SELECT from the all-text table and manually cast or use CASE as needed to convert types.
For example, if you imported into an all-text table called raw_data, and have a properly typed table called imports:
INSERT INTO imports
SELECT
column1::int,
CASE WHEN column2 = 'NA' THEN null ELSE column2::int END,
CASE WHEN column3 = 'NA' THEN null ELSE column3::int END,
column4::boolean
FROM
raw_data
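For completeness, here is a sketch of the whole staging approach, including the two table definitions the statement above relies on. The column types of imports are an assumption based on the sample rows, and NULLIF is used as a shorter equivalent of the CASE expressions:

```sql
-- Staging table: every column is text, so the import cannot fail on a cast.
CREATE TABLE raw_data (
    column1 text,
    column2 text,
    column3 text,
    column4 text
);

-- Typed target table (types assumed from the sample data).
CREATE TABLE imports (
    column1 integer,
    column2 integer,
    column3 integer,
    column4 boolean
);

-- NULLIF(x, 'NA') returns NULL when x = 'NA' and x otherwise.
INSERT INTO imports (column1, column2, column3, column4)
SELECT
    column1::integer,
    NULLIF(column2, 'NA')::integer,
    NULLIF(column3, 'NA')::integer,
    column4::boolean
FROM raw_data;
```

Note that casting column1 to integer drops the leading zeros ('009910' becomes 9910). If those values are codes rather than quantities, keeping column1 as text may be the better choice.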

Related

Replace a range of values in an SQL table to a single value

I am trying to replace a range of values with a string. I know how to do it with the replace function but that, as far as I know, requires them to be done one at a time.
Is there a way to select a range of values, for example (1-200), and replace them with a singular string value say "BLANK"?
I have tried WHEN, THEN and SET but get a syntax error near WHEN or SET as I try these.
Base Code Idea
Select DATA
WHEN DATA >= 1 THEN 'BLANK'
WHEN DATA <200 THEN 'BLANK
END
FROM DATABANK
Thanks!
Is this what you want?
select data,
case when data not between 1 and 200 then data end as new_data
from databank
What this does is take the integer value of data, and replace any value that's in the 1-200 range with null values, while leaving other values unchanged. The result goes into column new_data.
The assumption here is that data is a number - so the alternative value has to be consistent with that datatype (string 'BLANK' isn't): I went for null, which is consistent with any datatype, and is the default value returned by a case expression when no branch matches.
If you wanted something else, say 0, you would do:
select data,
case when data between 1 and 200 then 0 else data end as new_data
from databank
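If the literal string 'BLANK' really is required, both branches of the case expression have to be text, so the number must be cast. A minimal sketch, assuming a database that accepts varchar as a cast target (the type name differs in some products, for instance char in MySQL):

```sql
-- Both branches return text, so 'BLANK' no longer conflicts with the column type.
-- BETWEEN is inclusive: 1 and 200 are both replaced.
select data,
       case when data between 1 and 200 then 'BLANK'
            else cast(data as varchar(20))
       end as new_data
from databank;
```

The trade-off is that new_data is now a text column, so it sorts and compares as text rather than as a number.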

Manipulating a record data

I am looking for a way to take data from one table and manipulate it and bring it to another table using an SQL query.
I have a Column called NumberStuff that has data like this in it:
INC000000315482
I need to cut off the INC portion of the number and convert it into an integer and store it into a Column in another table so that it ends up looking like this:
315482
Any help would be much appreciated!
Another approach is to use the REPLACE function, either in T-SQL or as a Derived Column expression in SSIS.
T-SQL
-- MyTable is a placeholder for your source table
SELECT REPLACE(T.MyColumn, 'INC', '') AS ReplacedINC FROM MyTable AS T
SSIS
REPLACE([MyColumn], "INC", "")
This removes the character-based prefix. Converting the result to a numeric type before storing it in the target table is then optional; you can also let the implicit conversion happen.
The simplest version of what you need:
select cast(right(column,6) as int) from table
Are you doing this in an SSIS expression? Is the number always the last 6 characters?
This is a little less dependent on your formatting: it works for any length, trims the first 3 characters, and the cast drops the leading zeros.
select cast(SUBSTRING('INC000000315482',4,LEN('INC000000315482') - 3) as int)
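To store the result in another table, either expression can feed an INSERT ... SELECT. A minimal T-SQL sketch, where SourceTable, TargetTable and TicketNumber are hypothetical names and NumberStuff is the column from the question:

```sql
-- Strip the 'INC' prefix, then let the cast to int drop the leading zeros.
-- SourceTable, TargetTable and TicketNumber are placeholder names.
INSERT INTO TargetTable (TicketNumber)
SELECT CAST(REPLACE(NumberStuff, 'INC', '') AS int)
FROM SourceTable;
```

The cast fails for any row whose remainder is not purely digits or does not fit in an int, so it may be worth checking the data first.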

SQL - Conditionally joining two columns in same table into one

I am working with a table that contains two versions of stored information. To simplify it, one column contains the old description of a file run while another column contains the updated standard for displaying ran files. It gets more complicated in that the older column can have multiple standards within itself. The table:
Old Column            New Column
Desc: LGX/101/rpt     null
null                  Home
Print: LGX/234/rpt    null
null                  Print
null                  Page
I need to combine the two columns into one, but I also need to delete the "Print: " and "Desc: " string from the beginning of the old column values. Any suggestions? Let me know if/when I'm forgetting something you need to know!
(I am writing in Cache SQL, but I'd just like a general approach to my problem, I can figure out the specifics past that.)
EDIT: the condition is that if substr(oldcol,1,6) = 'desc: ' then substr(oldcol,7)
else if substr(oldcol,1,7) = 'print: ' then substr(oldcol,8), etc., so as to take out the "desc: " and the "print: " and sanitize the data somewhat.
EDIT2: I want to make the table look like this:
Col
LGX/101/rpt
Home
LGX/234/rpt
Print
Page
It's difficult to understand exactly what you are looking for. Does the above represent before/after, or two columns that need combining/merging?
My guess is that COALESCE might be able to help you. It takes a bunch of parameters and returns the first non NULL.
It looks like you want to take the value from new if old is NULL, and from old if new is NULL. To do that, you can use a CASE expression in your SQL. I know CASE expressions are supported by MySQL; I'm not sure whether they'll help you here.
SELECT (CASE WHEN old_col IS NULL THEN new_col ELSE old_col END) as val FROM table_name
This will grab new_col if old_col is NULL, otherwise it will grab old_col.
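The COALESCE function mentioned earlier expresses the same logic more compactly. A minimal sketch using the same placeholder names:

```sql
-- COALESCE returns its first non-NULL argument,
-- so this matches the CASE expression above row for row.
SELECT COALESCE(old_col, new_col) AS val
FROM table_name;
```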
You can remove the "Print:" and "Desc:" prefixes by combining the CHARINDEX and SUBSTRING functions:
SELECT CASE WHEN CHARINDEX(':',COALESCE(OldCol,NewCol)) > 0 THEN
SUBSTRING(COALESCE(OldCol,NewCol),CHARINDEX(':',COALESCE(OldCol,NewCol))+1,8000)
ELSE
COALESCE(OldCol,NewCol)
END AS Newcolvalue
FROM [SchemaName].[TableName]
CHARINDEX gives the position of the character or string you are searching for.
So you get the position of ":" in the computed value (the COALESCE part) and pass it to SUBSTRING, adding 1 so that the result starts just after the ":". Now you have a string without "Desc:" and "Print:".
Hope this helps.
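Two details of the query above may matter in practice: the space that follows the colon is kept, and a value in NewCol that happens to contain a ":" would be cut as well. The variant below is a sketch in the same T-SQL dialect (the question is about Cache SQL, so function names may need adjusting) that trims the space and only strips prefixes from OldCol:

```sql
-- Strip everything up to the first ':' in OldCol and trim the space after it.
-- When OldCol is NULL the CASE yields NULL, so COALESCE falls back to NewCol.
SELECT COALESCE(
           CASE WHEN CHARINDEX(':', OldCol) > 0
                THEN LTRIM(SUBSTRING(OldCol, CHARINDEX(':', OldCol) + 1, 8000))
                ELSE OldCol
           END,
           NewCol) AS Col
FROM [SchemaName].[TableName];
```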

SQL Server comma delimiter for money datatype

I import Excel files via SSIS to SQL-Server. I have a temp table to get everything in nvarchar. For four columns I then cast the string to money type and put in my target table.
In my temp table, one of those four columns (let me call it X) has a comma as the delimiter, while the rest have a dot. Don't ask me why; I have everything in my SSIS package set up the same way.
In my Excel the delimiter is a comma as well.
So now in my target table I have everything in comma values but the X column now moves the comma two places to the right and looks like this:
537013,00 instead of 5370,13 which was the original cell value in the temp and excel column.
I was thinking this is a culture setup problem, but then it should either work or fail for all of these columns.
a) Why do I receive dot values in my temp table when my Excel displays comma?
b) how can I fix this? Can I replace the "," in the temp table with a dot?
UPDATE
I think I found the reason but not the solution:
In this X column in excel the first three cells are empty - the other three columns all start with 0. If I fill these three cells of X with 0s then I also get the dot in my temp table and the right value in my target table. But of course I have to use the Excel file as is.
Any ideas on that?
Try the code below. It first removes the digit grouping symbol and swaps the decimal symbol so that the string matches what SQL Server expects. It then checks whether the string can be interpreted as a number: if so, it converts it to the money data type; otherwise, it returns NULL.
DECLARE @MoneyString VARCHAR(20)
SET @MoneyString = '$ 1.000,00'
SET @MoneyString = REPLACE(REPLACE(@MoneyString, '.', ''), ',', '.')
SELECT CAST(CASE WHEN ISNUMERIC(@MoneyString) = 1
THEN @MoneyString
ELSE NULL END AS MONEY)
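The same pattern can be applied directly to the affected column of the temp table. A minimal sketch, where dbo.TempImport is a hypothetical name for the all-nvarchar temp table and X is the comma-decimal column from the question:

```sql
-- dbo.TempImport is a placeholder name; X is the column that arrives with a comma.
-- CROSS APPLY computes the normalized string once so it can be reused.
SELECT CAST(CASE WHEN ISNUMERIC(c.Normalized) = 1
                 THEN c.Normalized
                 ELSE NULL END AS money) AS X_money
FROM dbo.TempImport AS t
CROSS APPLY (SELECT REPLACE(REPLACE(t.X, '.', ''), ',', '.') AS Normalized) AS c;
```

This should only be used on the column that actually uses a comma as its decimal symbol: on a column that already uses a dot, the inner REPLACE would delete the decimal point and inflate the value by a factor of 100. ISNUMERIC is also a loose check, so some strings it accepts can still fail the cast to money.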
As for the reason why you get a comma instead of a dot, I have no clue. My first guess would be cultural settings, but you already checked that. What about googling, did you get some results?
First, the "separator" in SQL is the decimal point; it's only Excel that is using the comma. You can change the formatting in Excel: format the column as money and specify a decimal point as the separator. Then, in the SSIS import wizard, split out the transformation of the column so it imports to a money data type. It's a culture thing, but "delimiter" tends to be used in the context of signifying the end of one column and the start of the next (as in CSV).
HTH
Well, that's a longstanding problem with Excel. The driver samples only the first rows of a column (8 by default with the Jet/ACE provider, governed by the TypeGuessRows setting) to infer the data type. It can lead to endless issues. I think your solution has to be to process everything as a string in the way Yaroslav suggested, or to supply an Excel template whose columns have predefined, formatted data types and into which the values are inserted. It's a PITA.

Conditionally branching in SQL based on the type of a variable

I'm selecting a value out of a table that can either be an integer or a nvarchar. It's stored as nvarchar. I want to conditionally call a function that will convert this value if it is an integer (that is, if it can be converted into an integer), otherwise I want to select the nvarchar with no conversion.
This is hitting a SQL Server 2005 database.
select case
when T.Value (is integer) then SomeConversionFunction(T.Value)
else T.Value
end as SomeAlias
from SomeTable T
Note that it is the "(is integer)" part that I'm having trouble with. Thanks in advance.
UPDATE
Check the comment on Ian's answer. It explains the why and the what a little better. Thanks to everyone for their thoughts.
select case
when ISNUMERIC(T.Value) = 1 then T.Value
else SomeConversionFunction(T.Value)
end as SomeAlias
from SomeTable T
Also, have you considered using the sql_variant data type?
Each column of a result set can have only one type. A CASE expression takes the highest-precedence type among its branches (int here), so you will get an error as soon as a row holds a string that cannot be converted:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value 'word' to data type int.
Try this to see:
create table testing
(
strangevalue nvarchar(10)
)
insert into testing values (1)
insert into testing values ('word')
select * from testing
select
case
when ISNUMERIC(strangevalue)=1 THEN CONVERT(int,strangevalue)
ELSE strangevalue
END
FROM testing
The best bet is to return two columns:
select
case
when ISNUMERIC(strangevalue)=1 THEN CONVERT(int,strangevalue)
ELSE NULL
END AS StrangvalueINT
,case
when ISNUMERIC(strangevalue)=1 THEN NULL
ELSE strangevalue
END AS StrangvalueString
FROM testing
Or your application can test for numeric values and do your special processing.
You can't have a column that is sometimes an integer and sometimes a string. Return the string and check it using int.TryParse() in the client code.
Use ISNUMERIC. However, it also accepts +, - and decimals, so more work is needed.
Also, you can't have a column with both data types at once: you'll need 2 columns.
I'd suggest that you deal with this in your client, or use an ISNUMERIC replacement.
IsNumeric will get you part of the way there. You can then add some further code to check whether it is an integer
for example:
select top 10
case
when isnumeric(mycolumn) = 1 then
case
when convert(int, mycolumn) = mycolumn then
'integer'
else
'number but not an integer'
end
else
'not a number'
end
from mytable
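One caveat with the query above: ISNUMERIC also returns 1 for values such as '1.5' or '$', and convert(int, ...) raises a conversion error for those instead of reaching the else branch. A pattern test avoids the conversion entirely. The following is a sketch using the SomeTable and Value names from the question; the branch labels are only illustrative:

```sql
-- 'digits only' means the string is non-empty and contains no character
-- outside 0-9; signed values such as '-5' land in the second branch.
select
    case
        when T.Value not like '%[^0-9]%' and T.Value <> '' then 'digits only'
        when isnumeric(T.Value) = 1 then 'other numeric format'
        else 'not numeric'
    end as ValueKind
from SomeTable T
```

A digits-only string can still be too long to fit in an int, so a length check may be needed before converting.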
To clarify some other answers, your SQL statement can't return different data types in one column (it looks like the other answers are saying you can't store different data types in one column, but yours are all string representations).
Therefore, if you use ISNUMERIC or another function, the value will be cast as a string in the returned result set anyway if other strings are being selected.
If you are selecting only one value, then it could return a string or a number; however, your front-end code will need to be able to handle the different data types.
Just to add to some of the other comments about not being able to return different data types in the same column... Database columns should know what datatype they are holding. If they don't then that should be a BIG red flag that you have a design problem somewhere, which almost guarantees future headaches (like this one).