Snowflake automatically rounding number during COPY INTO transformation

I am using an AWS S3 stage to load .csv data into my Snowflake database.
The .csv columns are as follows:
My COPY INTO command is this:
copy into MY_TABLE(tot_completions, tot_hov, parent_id)
from (select t.$1, to_decimal(REPLACE(t.$2, ',')), 1 from @my_stage t)
pattern='.*file_path.*' file_format = my_file_format ON_ERROR=CONTINUE;
The Tot. HOV column is being automatically rounded to 40 and 1, respectively. The target column's data type is decimal, and I also tried float; both should be able to store decimals.
My desired result is to store the decimal exactly as it is displayed in the .csv, without rounding. Any help would be greatly appreciated.

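You need to specify the precision and scale. Without them, TO_DECIMAL / TO_NUMBER defaults to a scale of 0, so the fractional part is rounded away: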
create or replace table number_conv(expr varchar);
insert into number_conv values ('12.3456'), ('98.76546');
select expr, to_number(expr), to_number(expr, 10, 1), to_number(expr, 10, 8) from number_conv;
+----------+-----------------+------------------------+------------------------+
| EXPR | TO_NUMBER(EXPR) | TO_NUMBER(EXPR, 10, 1) | TO_NUMBER(EXPR, 10, 8) |
|----------+-----------------+------------------------+------------------------|
| 12.3456 | 12 | 12.3 | 12.34560000 |
| 98.76546 | 99 | 98.8 | 98.76546000 |
+----------+-----------------+------------------------+------------------------+
and:
select column1,
to_decimal(column1, '99.9') as d0,
to_decimal(column1, '99.9', 9, 5) as d5,
to_decimal(column1, 'TM9', 9, 5) as td5
from values ('1.0'), ('-12.3'), ('0.0'), (' - 0.1 ');
+---------+-----+-----------+-----------+
| COLUMN1 | D0 | D5 | TD5 |
|---------+-----+-----------+-----------|
| 1.0 | 1 | 1.00000 | 1.00000 |
| -12.3 | -12 | -12.30000 | -12.30000 |
| 0.0 | 0 | 0.00000 | 0.00000 |
| - 0.1 | 0 | -0.10000 | -0.10000 |
+---------+-----+-----------+-----------+
See the Snowflake documentation for TO_DECIMAL, TO_NUMBER, TO_NUMERIC for more details.
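To make the effect of the scale argument concrete outside Snowflake, here is a minimal Python sketch. The helper `to_number` is hypothetical (it is not a Snowflake API) and assumes half-up rounding; it only mimics what the tables above show: scale 0 drops the fraction, while a larger scale keeps it.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_number(text: str, scale: int = 0) -> Decimal:
    """Hypothetical stand-in for a fixed-scale cast.

    Strips thousands separators (like REPLACE(col, ',') in the question),
    then rounds to `scale` decimal places.
    """
    cleaned = text.replace(",", "").strip()
    quantum = Decimal(1).scaleb(-scale)  # scale=2 -> Decimal('0.01')
    return Decimal(cleaned).quantize(quantum, rounding=ROUND_HALF_UP)


for raw in ("12.3456", "98.76546", "1,234.5"):
    # Default scale 0 loses the fraction; explicit scales keep it.
    print(raw, to_number(raw), to_number(raw, 1), to_number(raw, 8))
```

In the COPY INTO from the question, the equivalent change would be passing an explicit precision and scale to to_decimal that match the target column.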

Related

Total Percentage Not Adding up to 100 (postgresql)

Need your feedback please. I am struggling with the following code to make percentage equal to 100.
SELECT animaltype, size, SUM(total) AS total,
ROUND(( SUM(total) * 100 / SUM( SUM(total)) OVER ()),2) AS percentage
FROM animals
WHERE sponsored_animalid IS NULL
GROUP BY animaltype, size
ORDER BY animaltype, size DESC;
This is the output, which equates to 99.99% hence making the query incorrect.
I need the percentage column to be rounded to 2 decimal places, but the total needs to add up to 100. I don't know what the bug is.
As soon as I change it to ROUND(..., 3), the percentages add up to 100. But I need the figures rounded to strictly 2 decimal places.
Here is the output when I round up to 3 decimal places and total is 100:

Your SQL statement is syntactically incorrect, but I get what you mean.
This is not a perfect solution, but perhaps you can reduce the average rounding error by using a rounding function that rounds ties to the nearest even digit rather than away from zero:
CREATE FUNCTION myround(numeric, integer) RETURNS numeric
LANGUAGE sql IMMUTABLE AS
'SELECT CASE WHEN abs($1 * 10::numeric ^ $2 % 1) = 0.5
THEN round($1 * 10::numeric ^ $2 / 2, 0)
* 2 / 10::numeric ^ $2
ELSE round($1, $2)
END';
For a really perfect solution you would have to look at this question.
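As a rough illustration of why rounding ties to even reduces the accumulated error, the following Python sketch compares the two modes using the standard `decimal` module. The sample values are made up for the demonstration and are not taken from the question's data.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_places(value: str, places: int, mode: str) -> Decimal:
    """Round a decimal string to `places` digits using the given rounding mode."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(value).quantize(quantum, rounding=mode)


# Illustrative values that all sit exactly on a rounding tie at 2 places.
ties = ["0.125", "0.135", "0.145", "0.155"]

half_up = [round_places(v, 2, ROUND_HALF_UP) for v in ties]
half_even = [round_places(v, 2, ROUND_HALF_EVEN) for v in ties]

print("exact sum:    ", sum(Decimal(v) for v in ties))
print("half-up sum:  ", sum(half_up))    # every tie is pushed upward
print("half-even sum:", sum(half_even))  # ties alternate direction
```

Half-even only helps when values land exactly on a tie, so it reduces bias on average but does not guarantee a total of 100.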

A possible practical solution is to round the final aggregate to one decimal place fewer than the rounding used at the more granular level:
INSERT INTO animals (animaltype, size, total)
VALUES
('Bird', 'Small', 1615),('Bird', 'Medium', 3130),('Bird', 'Large', 8120),
('Cat', 'Small', 518015),('Cat', 'Medium', 250575),('Cat', 'Large', 439490),
('Dog', 'Small', 336680),('Dog', 'Medium', 942095),('Dog', 'Large', 978115);
SELECT
animaltype,
size,
total,
percentage_exact,
percentage_rd2,
sum(percentage_exact) over () as all_perc_exact_sum,
sum(percentage_rd2) over () as all_perc_rd2_sum,
round(sum(percentage_rd2) over () ,1) as all_perc_rd2_sum_rd1
FROM
(
SELECT
animaltype,
size,
total,
100.0 * total / (SUM(total) over ()) as percentage_exact,
round(100.0 * total / (SUM(total) over ()),2) as percentage_rd2
FROM animals
) s
| animaltype | size | total | percentage_exact | percentage_rd2 | all_perc_exact_sum | all_perc_rd2_sum | all_perc_rd2_sum_rd1 |
|------------|--------|--------|----------------------|----------------|--------------------|------------------|----------------------|
| Bird | Small | 1615 | 0.046436935622305255 | 0.05 | 100 | 99.99 | 100 |
| Bird | Medium | 3130 | 0.08999851919369378 | 0.09 | 100 | 99.99 | 100 |
| Bird | Large | 8120 | 0.23347858653443881 | 0.23 | 100 | 99.99 | 100 |
| Cat | Small | 518015 | 14.89475492655632 | 14.89 | 100 | 99.99 | 100 |
| Cat | Medium | 250575 | 7.204913401584607 | 7.2 | 100 | 99.99 | 100 |
| Cat | Large | 439490 | 12.636884728573955 | 12.64 | 100 | 99.99 | 100 |
| Dog | Small | 336680 | 9.68073528502646 | 9.68 | 100 | 99.99 | 100 |
| Dog | Medium | 942095 | 27.088547904084006 | 27.09 | 100 | 99.99 | 100 |
| Dog | Large | 978115 | 28.124249712824213 | 28.12 | 100 | 99.99 | 100 |
So here, first round to 2 decimal places:
round(100.0 * total / (SUM(total) over ()),2) as percentage_rd2
and then round the aggregate to 1 decimal place:
round(sum(percentage_rd2) over () ,1) as all_perc_rd2_sum_rd1
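If the individual 2-decimal percentages themselves must add up to exactly 100, one commonly used technique is the largest remainder method. It is not what either answer above implements; the Python sketch below is only an illustration under that assumption, using the totals from the INSERT statement above. The function name is hypothetical.

```python
from decimal import Decimal


def percentages_summing_to_100(totals, places=2):
    """Largest remainder method: truncate every share, then hand the
    leftover hundredths to the rows with the largest remainders."""
    scale = 10 ** places
    units = 100 * scale              # 10000 hundredths of a percent
    grand_total = sum(totals)        # assumed to be positive
    # Integer arithmetic keeps quotients and remainders exact.
    shares = [divmod(t * units, grand_total) for t in totals]
    floors = [q for q, _ in shares]
    leftover = units - sum(floors)
    by_remainder = sorted(range(len(totals)),
                          key=lambda i: shares[i][1], reverse=True)
    for i in by_remainder[:leftover]:
        floors[i] += 1
    return [Decimal(f) / scale for f in floors]


# Bird S/M/L, Cat S/M/L, Dog S/M/L, in the order of the INSERT above.
totals = [1615, 3130, 8120, 518015, 250575, 439490, 336680, 942095, 978115]
pcts = percentages_summing_to_100(totals)
print(pcts)
print(sum(pcts))
```

Compared with plain rounding, the only row that changes for this data is Cat/Medium, which becomes 7.21 instead of 7.2, and that extra hundredth closes the gap from 99.99 to 100.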

How to fetch records from DB which fulfill a certain criteria

I have the following problem and wanted to ask if this is the correct way to do it or if there is a better way of doing it:
Assume I have the following table/data in my DB:
|---|----|------|-------------|---------|---------|
|id |city|street|street_number|lastname |firstname|
|---|----|------|-------------|---------|---------|
| 1 | ar | K1 | 13 |Davenport| Hector |
| 2 | ar | L1 | 27 |Cannon | Teresa |
| 3 | ar | A1 | 135 |Brewer | Izaac |
| 4 | dc | A2 | 8 |Fowler | Milan |
| 5 | fr | C1 | 18 |Kaiser | Ibrar |
| 6 | fr | C1 | 28 |Weaver | Kiri |
| 7 | ny | O1 | 37 |Petersen | Derrick |
I now get some requests of the following structure: (city/street/street_number)
E.g.: {(ar,K1,13),(dc,A2,8),(ny,O1,37)}
I want to retrieve the last name of the person living there. Since the request amount is quite large I don't want to run over all the request one-by-one. My current implementation is to insert the data into a temporary table and join the values.
Is this the right approach or is there some better way of doing this?

You can construct a query using in with tuples:
select t.*
from t
where (city, street, street_number) in ( ('ar', 'K1', '13'), ('dc', 'A2', '8'), ('ny', 'O1', '37') );
However, if the data starts in the database, then a temporary table or subquery is better than bringing the results back to the application and constructing such a query.
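For comparison, the temporary-table approach described in the question can be sketched end to end with Python's built-in `sqlite3` module. The table names `people` and `requested` and the helper `find_lastnames` are assumptions made for this example; SQLite stands in for whatever database is actually used.

```python
import sqlite3

PEOPLE = [
    (1, "ar", "K1", 13, "Davenport", "Hector"),
    (2, "ar", "L1", 27, "Cannon", "Teresa"),
    (3, "ar", "A1", 135, "Brewer", "Izaac"),
    (4, "dc", "A2", 8, "Fowler", "Milan"),
    (5, "fr", "C1", 18, "Kaiser", "Ibrar"),
    (6, "fr", "C1", 28, "Weaver", "Kiri"),
    (7, "ny", "O1", 37, "Petersen", "Derrick"),
]


def find_lastnames(conn, requests):
    """Load the requested (city, street, street_number) triples into a
    temp table and resolve them all with a single join."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS requested "
                 "(city TEXT, street TEXT, street_number INTEGER)")
    conn.execute("DELETE FROM requested")
    conn.executemany("INSERT INTO requested VALUES (?, ?, ?)", requests)
    cur = conn.execute(
        """
        SELECT p.lastname
        FROM people AS p
        JOIN requested AS r
          ON p.city = r.city
         AND p.street = r.street
         AND p.street_number = r.street_number
        ORDER BY p.id
        """
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, city TEXT, "
             "street TEXT, street_number INTEGER, lastname TEXT, firstname TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?, ?, ?)", PEOPLE)

lastnames = find_lastnames(conn, [("ar", "K1", 13), ("dc", "A2", 8), ("ny", "O1", 37)])
print(lastnames)
```

Binding the values as parameters avoids building a long SQL string by hand, and the request list can grow without changing the query text.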

I think you can use a hierarchical query and string functions as follows (Oracle syntax):
WITH YOUR_INPUT_DATA AS
(SELECT '(ar,K1,13),(dc,A2,8),(ny,O1,37)' AS INPUT_STR FROM DUAL),
--
CTE AS
( SELECT REGEXP_SUBSTR(STR,'[^(),]+',1,1) AS STR1,
REGEXP_SUBSTR(STR,'[^(),]+',1,2) AS STR2,
REGEXP_SUBSTR(STR,'[^(),]+',1,3) AS STR3
FROM (SELECT SUBSTR(INPUT_STR,
INSTR(INPUT_STR,'(',1,LEVEL),
INSTR(INPUT_STR,')',1,LEVEL) - INSTR(INPUT_STR,'(',1,LEVEL) + 1) STR
FROM YOUR_INPUT_DATA
CONNECT BY LEVEL <= REGEXP_COUNT(INPUT_STR,'\),\(') + 1))
--
SELECT * FROM YOUR_TABLE WHERE (city,street,street_number)
IN (SELECT STR1,STR2,STR3 FROM CTE);
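If the request string is parsed in the application instead of in SQL, the same splitting can be sketched in a few lines of Python. The function name is hypothetical, the input format is assumed to be exactly the parenthesised triples shown in the question, and the street code is written as O1 to match the table.

```python
import re

# One parenthesised triple: three comma-separated fields without
# parentheses or commas inside them.
TRIPLE = re.compile(r"\(([^(),]+),([^(),]+),([^(),]+)\)")


def parse_requests(text):
    """Turn '(ar,K1,13),(dc,A2,8)' into [('ar', 'K1', 13), ('dc', 'A2', 8)]."""
    return [(city.strip(), street.strip(), int(number))
            for city, street, number in TRIPLE.findall(text)]


parsed = parse_requests("(ar,K1,13),(dc,A2,8),(ny,O1,37)")
print(parsed)
```

The resulting tuples can then be bound as query parameters rather than spliced into the SQL text.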

How To Check Numerical Format in SQL Server 2008

I am converting some existing Oracle queries to MSSQL Server (2008) and can't figure out how to replicate the following Regex check:
SELECT SomeField
FROM SomeTable
WHERE NOT REGEXP_LIKE(TO_CHAR(SomeField), '^[0-9]{2}[.][0-9]{7}$');
That finds all results where the format of the number starts with 2 positive digits, followed by a decimal point, and 7 decimal places of data: 12.3456789
I've tried using STR, CAST, CONVERT, but they all seem to truncate the decimal to 4 decimal places for some reason. The truncating has prevented me from getting reliable results using LEN and CHARINDEX. Manually adding size parameters to STR gets slightly closer, but I still don't know how to compare the original numerical representation to the converted value.
SELECT SomeField
, STR(SomeField, 10, 7)
, CAST(SomeField AS VARCHAR)
, LEN(SomeField )
, CHARINDEX(STR(SomeField ), '.')
FROM SomeTable
+------------------+------------+---------+-----+-----------+
| Orig | STR | Cast | LEN | CHARINDEX |
+------------------+------------+---------+-----+-----------+
| 31.44650944 | 31.4465094 | 31.4465 | 7 | 0 |
| 35.85609 | 35.8560900 | 35.8561 | 7 | 0 |
| 54.589623 | 54.5896230 | 54.5896 | 7 | 0 |
| 31.92653899 | 31.9265390 | 31.9265 | 7 | 0 |
| 31.4523333333333 | 31.4523333 | 31.4523 | 7 | 0 |
| 31.40208955 | 31.4020895 | 31.4021 | 7 | 0 |
| 51.3047869443893 | 51.3047869 | 51.3048 | 7 | 0 |
| 51 | 51.0000000 | 51 | 2 | 0 |
| 32.220633 | 32.2206330 | 32.2206 | 7 | 0 |
| 35.769247 | 35.7692470 | 35.7692 | 7 | 0 |
| 35.071022 | 35.0710220 | 35.071 | 6 | 0 |
+------------------+------------+---------+-----+-----------+

What you want to do does not make sense in SQL Server.
Oracle supports a number data type that has a variable precision:
if a precision is not specified, the column stores values as given.
There is no corresponding data type in SQL Server. You have either a floating-point number (float/real) or a fixed-precision number (decimal/numeric). However, both apply to ALL values in a column, not to individual values within a row.
The closest you could do is:
where somefield >= 0 and somefield < 100
Or if you wanted to insist that there is a decimal component:
where somefield >= 0 and somefield < 100 and floor(somefield) <> somefield
However, you might have valid integer values that this would filter out.
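The underlying point, that a digit pattern is a property of the text and not of the stored number, can be shown with a small Python sketch. It applies the regular expression from the question to strings; the sample strings and the helper name are illustrative only.

```python
import re

# Same pattern as the Oracle check: two digits, a dot, seven digits.
FORMAT = re.compile(r"[0-9]{2}[.][0-9]{7}")


def has_expected_format(text: str) -> bool:
    """True when the whole string is NN.NNNNNNN."""
    return FORMAT.fullmatch(text) is not None


samples = ["12.3456789", "35.85609", "35.8560900", "51"]
results = {s: has_expected_format(s) for s in samples}
print(results)

# The two spellings below are the same number once parsed, so a numeric
# column cannot tell which one was originally supplied.
same_value = float("35.85609") == float("35.8560900")
print(same_value)
```

A check like this is only meaningful if the original text is still available, for example in a staging column that stores the raw input.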

This answer gave me an option that works in conjunction with checking the decimal position first.
SELECT SomeField
FROM SomeTable
WHERE SomeField IS NOT NULL
AND CHARINDEX('.', SomeField ) = 3
AND LEN(CAST(CAST(REVERSE(CONVERT(VARCHAR(50), SomeField , 128)) AS FLOAT) AS BIGINT)) = 7
While I understand this is terrible by nearly all metrics, it satisfies the requirements.
The basis of checking formatting on this data type is inherently flawed, as pointed out by several posters; however, for this very isolated use case I wanted to document the workaround.

Alphanumeric output from ST_MakeLine

I'm trying to convert lat/lon to linestring. Basically, grouping the columns lat and lon, making a point, and creating a linestring.
Table:
+------------+----------+-----------+------------+---------+--------+
| link_id | seq_num | lat | lon | z_coord | zlevel |
+------------+----------+-----------+------------+---------+--------+
| "16777220" | "0" | "4129098" | "-7192948" | | 0 |
| "16777220" | "999999" | "4129134" | "-7192950" | | 0 |
| "16777222" | "0" | "4128989" | "-7193030" | | 0 |
| "16777222" | "1" | "4128975" | "-7193016" | | 0 |
| "16777222" | "2" | "4128940" | "-7193001" | | 0 |
| "16777222" | "3" | "4128917" | "-7192998" | | 0 |
| "16777222" | "4" | "4128911" | "-7193002" | | 0 |
+------------+----------+-----------+------------+---------+--------+
My code:
select link_id, ST_SetSRID(ST_MakeLine(ST_MakePoint((lon::double precision / 100000), (lat::double precision / 100000))),4326) as geometry
from public.rdf_link_geometry
group by link_id
limit 50
geometry output column example:
"0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440736891ED7CFB51C021020EA14AA54440"
^^ What is this? how did it get formatted in such a way? I expected a linestring, something like
geometry
7.123 50.123,7.321 50.321
7.321 50.321,7.321 50.321
The data type for link_id is bigint, and for geometry it says geometry.
SOLUTION:
select link_id, ST_AsText(ST_SetSRID(ST_MakeLine(ST_MakePoint(
(lon::double precision / 100000), (lat::double precision / 100000))),4326)) as geometry
from public.rdf_link_geometry
group by link_id
limit 50

The output is a geometry, which you can display as text using st_asText
select st_asText('0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440736891ED7CFB51C021020EA14AA54440');
st_astext
--------------------------------------------------
LINESTRING(-71.92948 41.29098,-71.9295 41.29134)
That being said, should you have more than 2 points, you could order them to create a meaningful line:
select st_makeline(geom ORDER BY seqID) from tbl;
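To see what the hexadecimal string actually contains, here is a minimal Python sketch that decodes it by hand with the standard `struct` module. It assumes the little-endian extended WKB layout with an SRID flag that this particular value uses, and it ignores Z/M dimensions; it is an illustration, not a replacement for ST_AsText.

```python
import struct

SRID_FLAG = 0x20000000   # set in the type code when an SRID follows
LINESTRING = 2


def decode_linestring(hex_ewkb: str):
    """Decode a little-endian 2D LineString from hex-encoded EWKB."""
    data = bytes.fromhex(hex_ewkb)
    if data[0] != 1:
        raise ValueError("this sketch only handles little-endian input")
    (type_code,) = struct.unpack_from("<I", data, 1)
    offset = 5
    srid = None
    if type_code & SRID_FLAG:
        (srid,) = struct.unpack_from("<I", data, offset)
        offset += 4
    if (type_code & 0xFF) != LINESTRING:
        raise ValueError("not a LineString")
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    # Each point is two 8-byte doubles: x (lon) then y (lat).
    points = [struct.unpack_from("<dd", data, offset + 16 * i)
              for i in range(count)]
    return srid, points


srid, points = decode_linestring(
    "0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440"
    "736891ED7CFB51C021020EA14AA54440"
)
print(srid, points)
```

The decoded SRID and coordinates correspond to the LINESTRING shown in the ST_AsText output above.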

How do I convert HEXADECIMAL to DECIMAL in SAS?

I have:
A string with hexadecimal values every 4 positions
00F701C101C900EC01E001D2
I need:
Split the string into groups of 4 characters and convert each group to a decimal number, like this:
247, 449, 457, 236, 480, 466
My column can have up to 1200 hexadecimal positions
Can you help me?
Tks!!!

This works:
data out;
hex = "00F701C101C900EC01E001D2";
do while(hex ne "");
valHex = substr(hex, 1, 4);
hex = substr(hex, 5);
valDec = input(valHex, hex4.);
output;
end;
run;
but you'll want to add more error checking etc for your real solution.
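For readers outside SAS, the same loop can be sketched in Python as a quick cross-check of the expected values. The function name is hypothetical, and the length check is one example of the extra error handling mentioned above.

```python
def hex_groups_to_decimal(text: str, width: int = 4):
    """Split `text` into fixed-width hex groups and convert each to an int."""
    if len(text) % width:
        raise ValueError(f"length {len(text)} is not a multiple of {width}")
    return [int(text[i:i + width], 16) for i in range(0, len(text), width)]


values = hex_groups_to_decimal("00F701C101C900EC01E001D2")
print(values)
```

`int(..., 16)` raises a ValueError on non-hexadecimal characters, which covers the other obvious failure case.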

Sorry, I was too fast. This is SQL Server syntax, probably not working for you, but you might get an idea...
Try it like this:
DECLARE @YourString VARCHAR(100)='00F701C101C900EC01E001D2';
WITH Separated AS
(
SELECT CAST(LEFT(@YourString,4) AS VARCHAR(MAX)) AS SourceString
,CAST(SUBSTRING(@YourString,5,10000) AS VARCHAR(MAX)) AS RestString
UNION ALL
SELECT LEFT(RestString,4)
,SUBSTRING(RestString,5,10000)
FROM Separated
WHERE LEN(RestString)>=4
)
SELECT *
,CAST(sys.fn_cdc_hexstrtobin(SourceString) AS VARBINARY(2))
,CAST(CAST(sys.fn_cdc_hexstrtobin(SourceString) AS VARBINARY(2)) AS INT)
FROM Separated
The result
+--------------+----------------------+--------------------+--------------------+
| SourceString | RestString | (No column name)   | (No column name)   |
+--------------+----------------------+--------------------+--------------------+
| 00F7 | 01C101C900EC01E001D2 | 0x00F7 | 247 |
+--------------+----------------------+--------------------+--------------------+
| 01C1 | 01C900EC01E001D2 | 0x01C1 | 449 |
+--------------+----------------------+--------------------+--------------------+
| 01C9 | 00EC01E001D2 | 0x01C9 | 457 |
+--------------+----------------------+--------------------+--------------------+
| 00EC | 01E001D2 | 0x00EC | 236 |
+--------------+----------------------+--------------------+--------------------+
| 01E0 | 01D2 | 0x01E0 | 480 |
+--------------+----------------------+--------------------+--------------------+
| 01D2 | | 0x01D2 | 466 |
+--------------+----------------------+--------------------+--------------------+

We Keep Coding
