Remove a comma in a number in Redshift


My data contains numbers like 100,000.89 and so on. What function should I use in Redshift to remove the comma so the value becomes 100000.89? Should the function be applied when the table is created, since it works at the column level, or after creation, as a post-processing step on the table?

To remove commas from text columns, use replace():
select replace(col, ',', '')
from t
EDIT: If the column can contain NULLs, wrap the call in coalesce():
select coalesce(replace(col,',', ''), '')
from t
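
To illustrate the answer above, here is a minimal sketch that runs the same replace() call and then casts the cleaned text to a number. It uses Python's built-in sqlite3 module as a stand-in for Redshift, which is an assumption made only so the example can run anywhere; in Redshift you would more likely cast to a DECIMAL type. The table and column names follow the answer.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany("INSERT INTO t (col) VALUES (?)", [("100,000.89",), ("42",), (None,)])

rows = conn.execute(
    """
    SELECT col,
           replace(col, ',', '')               AS no_commas,  -- still text
           CAST(replace(col, ',', '') AS REAL) AS as_number   -- numeric value
    FROM t
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
```

A NULL input stays NULL through both replace() and CAST, which is why the coalesce() wrapper is only needed when the target column does not accept NULLs.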

I wrapped every column in my insert query in coalesce(), since all of them had NULL values somewhere, and it worked like a charm. The Redshift error for missing data in NOT NULL fields is misleading here, as mentioned at: https://forums.aws.amazon.com/thread.jspa?threadID=119640
I also changed my COPY command and added BLANKSASNULL. This worked! Thanks for all your help. Below is my command:
-- Note: coalesce() with a single argument returns that argument unchanged;
-- pass a fallback as a second argument to substitute a value for NULL.
insert into test.t_final
(select
  coalesce(project_number) as project_number,
  coalesce(contract_po) as contract_po,
  coalesce(tracking_date) as tracking_date,
  coalesce(replace(amount, ',', '')) as amount,
  coalesce(replace(tax, ',', '')) as tax,
  coalesce(replace(contract_value, ',', '')) as contract_value,
  coalesce(comments)
from test.t)
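
On the original question of when to apply the function: one option is to load the raw text into a staging table and clean it while inserting into the final table, as the command above does. The sketch below shows that pattern with Python's sqlite3 module standing in for Redshift. The table names t_staging and t_final_demo and the fallback values 'unknown' and '0' are hypothetical placeholders, not values from the thread.

```python
import sqlite3

# SQLite is used only so the sketch is runnable; it is not Redshift.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_staging (project_number TEXT, amount TEXT)")
conn.execute(
    "CREATE TABLE t_final_demo (project_number TEXT NOT NULL, amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO t_staging VALUES (?, ?)",
    [("P-1", "100,000.89"), ("P-2", None), (None, "7")],
)

# coalesce() needs a fallback argument to replace NULL; the fallbacks here
# ('unknown' and '0') are illustrative assumptions.
conn.execute(
    """
    INSERT INTO t_final_demo (project_number, amount)
    SELECT coalesce(project_number, 'unknown'),
           CAST(coalesce(replace(amount, ',', ''), '0') AS REAL)
    FROM t_staging
    """
)

final_rows = conn.execute(
    "SELECT project_number, amount FROM t_final_demo"
).fetchall()
print(final_rows)
```

Whether a fallback such as 0 is appropriate depends on the data; if NULL is meaningful, it may be better to allow NULLs in the final column instead.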

Related

Filter unwanted characters from a column in Teradata

I have a phone number column in my table whose values should contain only digits and no special characters. For one of the rows I got a value coming in as ":1212121212".
I need to filter out this record, and any other records that come in with special characters, in Teradata. Can anyone help with this?
I have tried the solution below, but it is not working:
where (REGEXP_SUBSTR(column_name, '[0-9]+')<>1 or column_name is null )

In MS SQL Server, you can use TRY_CAST to find entries that contain non-numeric characters (BIGINT is used here because a 10-digit number can exceed the INT range):
SELECT column_name
FROM yourtable
WHERE TRY_CAST(column_name AS BIGINT) IS NULL;
In Teradata, you can use TO_NUMBER:
SELECT column_name
FROM yourtable
WHERE TO_NUMBER(column_name) IS NULL;
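If you want to stay close to your attempt, you can use LIKE to find non-numeric entries. Note that the [^0-9] character class is SQL Server LIKE syntax; standard LIKE only supports the % and _ wildcards, so this form may not work in Teradata:
SELECT column_name
FROM yourtable
WHERE column_name LIKE '%[^0-9]%';
Note that this could get slow when your table has a large number of rows.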

Thanks Jonas. Since I need only numeric values and the length should be 10, I tried the below and it worked. This would ignore all the additional special characters.
(regexp_similar(Column,'[0-9]{10}')=1)
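
The rule in the comment above (exactly ten digits, nothing else) can be checked outside the database as well. The sketch below uses Python's re.fullmatch, which requires the whole value to match; as far as I understand, regexp_similar also matches against the entire string, but treat that as an assumption. The helper name is_valid_phone is hypothetical.

```python
import re

# Hypothetical helper: accept only values that are exactly ten ASCII digits.
TEN_DIGITS = re.compile(r"[0-9]{10}")


def is_valid_phone(value):
    # fullmatch() requires the whole string to match, so stray characters
    # such as a leading ':' cause a rejection; None (NULL) is rejected too.
    return value is not None and TEN_DIGITS.fullmatch(value) is not None


samples = ["1212121212", ":1212121212", "12121212", None]
valid = [s for s in samples if is_valid_phone(s)]
print(valid)
```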

SQL, update all values trimming the start

I have a column named type in my PostgreSQL database with a lot of values like Content::News or Content::Video.
I need to strip Content:: from every value in the column.
Can I use the following?
select ltrim(type, 'Content::')

You could use replace:
select replace(type, 'Content::', '')
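
The reason replace() is the safer choice is that ltrim() with a second argument treats that argument as a set of characters to strip, not as a prefix. The sketch below shows the difference using Python's sqlite3 module, whose ltrim() has the same character-set behaviour; the table name items and the value Content::entry are made up for illustration. It also shows the UPDATE form, since the question asks to change the stored values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (type TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("Content::News",), ("Content::Video",), ("Content::entry",)],
)

# ltrim() strips any leading characters found in the set {C,o,n,t,e,:},
# so it also eats the start of 'entry'; replace() removes only the literal text.
compare = conn.execute(
    "SELECT type, ltrim(type, 'Content::'), replace(type, 'Content::', '') "
    "FROM items ORDER BY rowid"
).fetchall()
for row in compare:
    print(row)

# Apply the change to the stored values, limited to rows that have the prefix.
conn.execute(
    "UPDATE items SET type = replace(type, 'Content::', '') "
    "WHERE type LIKE 'Content::%'"
)
updated = [r[0] for r in conn.execute("SELECT type FROM items ORDER BY rowid")]
print(updated)
```

Note that replace() removes every occurrence of the text, not just a leading one, so it fits when Content:: can only appear at the start of a value.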

SQL SELECT DISTINCT and GROUP BY both produce duplicates

I have a decent size table with 20+ columns and almost 3 million rows, and I want to select all the unique values from a single column and enter them into a newly created table. After research, I have attempted this using both the DISTINCT and GROUP BY approaches, but both are producing duplicate values. Furthermore, I've set the new column in the new table as a Primary Key, which I don't believe should allow duplicate values.
I'm definitely a beginner here, so perhaps there is something simple I'm doing wrong. Here's some sample code:
Using GROUP BY
INSERT INTO ResourceGroups(ResourceGroup)
SELECT ResourceGroup
FROM dbo.UsageData
WHERE ResourceGroup IS NOT NULL
GROUP BY ResourceGroup
Using DISTINCT
INSERT INTO ResourceGroups(ResourceGroup)
SELECT DISTINCT ResourceGroup
FROM dbo.UsageData
WHERE ResourceGroup IS NOT NULL
The results of both of these seem to be the same. Here's a sample of the first few rows:
ResourceGroup
aiiInnovationTime
Api-Default-Central-US
Api-Default-Central-US
applicationinsights
applicationinsights
azurefunctions-southeastasia
azurefunctions-southeastasia
The query resulted in 532 rows, and it clearly eliminated some duplicates after consolidating down from 3 million. However, there are obviously still duplicates here, and they were also successfully inserted into a primary key column, which shouldn't allow duplicates. Furthermore, there's a blank row despite my attempt to filter out NULLs (though maybe there's a space or something there?). Needless to say, I'm a bit confused about what I'm doing wrong, and would greatly appreciate any help that this community can provide!

Both of the queries you mentioned should give you unique results. The anomaly is probably caused by leading or trailing whitespace.
Depending on the database, you can modify the query accordingly, e.g.:
Oracle: use the TRIM function, which removes both leading and trailing whitespace.
SQL Server: versions before 2017 have no single function for this, so you have to combine LTRIM and RTRIM to remove the spaces (TRIM was added in SQL Server 2017).

Assuming there are spaces or line breaks in your data:
SELECT DISTINCT
REPLACE(REPLACE(REPLACE(REPLACE(ResourceGroup, CHAR(13) + CHAR(10), ' ... '),
CHAR(10) + CHAR(13), ' ... '), CHAR(13), ' '), CHAR(10), ' ... ')
FROM dbo.UsageData
WHERE LTRIM(RTRIM(ResourceGroup)) IS NOT NULL
LTRIM trims leading spaces and RTRIM trims trailing spaces. Try this out and see if it works!

As Chetan Ranpariya mentioned, check for leading and trailing spaces. The way you do it depends on the SQL engine. For instance, in MySQL you can use TRIM(): https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_trim.
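
To make the whitespace explanation concrete, here is a small sketch with Python's sqlite3 module. The table usage_data and column resource_group are hypothetical stand-ins for the ones in the question, and SQLite is used only because it is easy to run; engines differ in how they compare trailing spaces, so the raw count may not reproduce everywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_data (resource_group TEXT)")
conn.executemany(
    "INSERT INTO usage_data VALUES (?)",
    [
        ("applicationinsights",),
        ("applicationinsights ",),     # trailing space
        ("applicationinsights\r\n",),  # trailing line break
        ("   ",),                      # looks blank but is not NULL
        (None,),
    ],
)

# Values that look identical on screen are still distinct strings.
raw = conn.execute(
    "SELECT DISTINCT resource_group FROM usage_data "
    "WHERE resource_group IS NOT NULL"
).fetchall()

# Strip CR/LF, trim spaces, and drop values that end up empty.
cleaned_expr = "trim(replace(replace(resource_group, char(13), ''), char(10), ''))"
cleaned = conn.execute(
    f"SELECT DISTINCT {cleaned_expr} FROM usage_data WHERE {cleaned_expr} <> ''"
).fetchall()

print(len(raw), cleaned)
```

Comparing the length of the raw value with the length of the cleaned value is a quick way to find which rows carry invisible characters.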

SQL Replace comma in results without using replace

I feel like this should be simple enough to do, but have not found any solutions that didn't use replace so far. I have the following select statement I am running, and for some of the columns there are commas separating the values. I would like to replace these commas with semicolons, however I only want to do it in the select statement. I don't want it to alter the values in the tables at all. This is not a one off statement either, or I'd just replace all the commas with semicolons and then revert back.
SELECT a.Category_Id, a.Category_Name, ISNULL(b.Category_Alias, '') as Category_Alias,
ISNULL(b.SUPPORT_NAMES, '') as SUPPORT_NAMES
FROM Categories a
INNER JOIN CategoryInfo b on b.Category_Id=a.Category_Id
For the Category_Alias column, the records are actually stored like CS, Customer Support and I want that to show up as CS; Customer Support just for the select statement.

I believe you may be confused as to what the REPLACE function is doing. You can use REPLACE within your SELECT statement without altering the data in the database:
SELECT REPLACE(MyField, ',', ';') AS NewFieldName
FROM MyTable

I believe you don't want to change the value physically in the table, but are fine with replacing it in the SELECT output.
In that case you can use:
Select REPLACE(ColumnName,',',';')
From TableName

Most SQL database engines implement an inline replace function, usually named replace(), which can also be used in a SELECT statement.
Example from MySQL:
SELECT field, REPLACE(field,',',';') FROM my_table;
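
A quick way to convince yourself that REPLACE in a SELECT leaves the table alone is to read the column back afterwards. The sketch below does that with Python's sqlite3 module; the table category_info is a simplified, hypothetical version of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category_info (category_alias TEXT)")
conn.execute("INSERT INTO category_info VALUES ('CS, Customer Support')")

# REPLACE only transforms the value returned by this query.
shown = conn.execute(
    "SELECT replace(category_alias, ',', ';') FROM category_info"
).fetchone()[0]

# Reading the column again shows the stored value is untouched.
stored = conn.execute("SELECT category_alias FROM category_info").fetchone()[0]

print(shown)
print(stored)
```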

SQL Server: Best way to concatenate multiple columns?

I am trying to concatenate multiple columns in a query in SQL Server 11.00.3393.
I tried the new function CONCAT() but it's not working when I use more than two columns.
So I wonder if that's the best way to solve the problem:
SELECT CONCAT(CONCAT(CONCAT(COLUMN1,COLUMN2),COLUMN3),COLUMN4) FROM myTable
I can't use COLUMN1 + COLUMN2 because of NULL values.
EDIT
If I try SELECT CONCAT('1','2','3') AS RESULT I get an error
The CONCAT function requires 2 argument(s)

Through discourse it's clear that the problem lies in using VS2010 to write the query, as it uses the canonical CONCAT() function which is limited to 2 parameters. There's probably a way to change that, but I'm not aware of it.
An alternative:
SELECT '1'+'2'+'3'
This approach requires non-string values to be cast/converted to strings, as well as NULL handling via ISNULL() or COALESCE():
SELECT ISNULL(CAST(Col1 AS VARCHAR(50)),'')
+ COALESCE(CONVERT(VARCHAR(50),Col2),'')
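
The effect of the NULL handling can be seen in a small runnable sketch. It uses Python's sqlite3 module, where the concatenation operator is || rather than +, so it illustrates the idea rather than SQL Server syntax; the table my_table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "a", "b"), (2, None, "c")],
)

rows = conn.execute(
    """
    SELECT CAST(col1 AS TEXT) || col2 || col3 AS naive,      -- NULL if any part is NULL
           coalesce(CAST(col1 AS TEXT), '')
             || coalesce(col2, '')
             || coalesce(col3, '')            AS null_safe   -- NULL parts become ''
    FROM my_table
    ORDER BY col1
    """
).fetchall()
print(rows)
```

The first column shows why plain operator concatenation was ruled out in the question: a single NULL wipes out the whole result.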

-- Assumes HIRE_DATE is a DATE column
SELECT CONCAT(LOWER(LAST_NAME), UPPER(LAST_NAME),
INITCAP(LAST_NAME), HIRE_DATE) AS up_low_init_hdate
FROM EMPLOYEES
WHERE EXTRACT(YEAR FROM HIRE_DATE) = 1995

Try using below:
SELECT
(RTRIM(LTRIM(col_1))) + (RTRIM(LTRIM(col_2))) AS Col_newname,
col_1,
col_2
FROM
s_cols
WHERE
col_any_condition = ''
;

Using concatenation in Oracle SQL is very easy and interesting, but I don't know much about MS SQL Server.
Here is how it works in Oracle:
Syntax:
SQL> select First_name||Last_Name as Employee
from employees;
Result:
EMPLOYEE
EllenAbel
SundarAnde
MozheAtkinson
Here AS is the keyword that introduces the column alias.
You can also concatenate with NULL values,
e.g. column1||NULL
If any of your columns contains a NULL value, the result shows only the values of the columns that are not NULL.
You can also use literal character strings in concatenation,
e.g.
select column1||' is a '||column2
from tableName;
Result: column1 is a column2.
Literals placed in between must be enclosed in single quotation marks; numbers do not need quotes.
NOTE: This applies only to Oracle SQL.

For anyone dealing with Snowflake:
Try using CONCAT with multiple columns like so:
SELECT
CONCAT(col1, col2, col3) AS all_string_columns_together
, CONCAT(CAST(col4 AS VARCHAR(50)), col1) AS string_and_int_column
FROM table

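If the fields are nullable, then you'll have to handle those NULLs. Remember that NULL is contagious in many engines: 'foo' + NULL in SQL Server and concat('foo', null) in MySQL both result in NULL (SQL Server's own CONCAT() treats NULL as an empty string):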
SELECT CONCAT(ISNULL(column1, ''),ISNULL(column2,'')) etc...
Basically test each field for nullness, and replace with an empty string if so.
