SQL query that prevents Excel from converting long integer to scientific notation - sql

So it's been a long time since I've done anything fancy with SQL, so I'm going to do my best to explain. Please be nice, I'm trying my best here.
Basically, I'm pulling information from a database in Snowflake and putting it into a new XML file, and that data is input exactly as-written into a form email.
One of the values is an ID number that's 14 characters long (example: 12345678912345), which is stored in the database as an integer (or so I'm told), but Excel keeps automatically converting it into scientific notation. Since it's an ID number, it needs to look like an ID number, not scientific notation.
Right now, my query just selects & inputs the regular ol' value, and then we manually change it in the Excel sheet. Like literally just SELECT ID_Number from TheThing
One thing I thought might work is:
SELECT CAST(ID_Number as bigint) as ID_Number
... But it doesn't work. Most other solutions I've found don't seem to address my specific scenario of unwanted integer-to-string conversion & I'm distraught.
I'm just an intern and this might have a very obvious answer, but my fellow interns have given up on it and I need to find the answer for my own sanity. It's been a minute since I did anything fancy with SQL so please be nice to me and sorry if this is a dumb question.

In Snowflake, BIGINT and INT(EGER) are the same thing; what you want is VARCHAR. As Ross mentioned in his comment, this is likely just a formatting issue within Excel. In Excel, any value can be forced to text by putting a single quote ' at the beginning of the value, or by using the Text to Columns feature.
If you wanted to try to format it out of Snowflake as a string, casting it might not do the trick unless you include some kind of additional string character.
To get this type of formatting out of Snowflake, you can try:
SELECT '\'' || CAST(ID_Number AS VARCHAR) AS ID_Number FROM TheThing;

Related

Interpret numeric field as a string in SQL

I have a 64-bit integer field in my Postgres database, which is populated with 64 bit integer numbers. (Non) coincidentally, those numbers are actually 8-chars strings in ASCII format, little endian. For example, a number 5208208757389214273 is a numeric representation of a string "ABCDEFGH": it is 0x4847464544434241 in hex, where 0x41 is A, 0x42 is B, 0x43 is C and so forth.
I would like to convert those numbers purely for display purposes - i.e. find a way to leave them as numbers in the database, but be able to see them as strings when querying. Is there any way to do it in SQL? If not in SQL, is there anything I can do on the server side (install extensions, stored procedures, anything at all) which would allow this? This problem is trivially solvable with any script or programming language, but I do not know how to solve it with SQL.
P.S. And just one more time for some of the trigger-happy, duplicate-hammer-wielding bunch: this is not a question of translating a number like 5208208757389214273 to the string "5208208757389214273" (we have a lot of answers on how to do that, but it is not what I am looking for).
Use to_hex() to get a hexadecimal representation of the number. Then use decode() to turn it into a bytea. (Unfortunately I did not find any direct way from bigint to bytea.) Cast that to text and reverse() it, because of the endianness.
SELECT reverse(decode(to_hex(5208208757389214273), 'hex')::text);
ABCDEFGH
The bytea_output must be set to 'escape' for this to work properly -- use SET bytea_output = 'escape';.
(Tested on versions 9.4 and 9.6.)
An alternative way to achieve the same result without using SET is the following:
select reverse(encode(decode(to_hex(5208208757389214273),'hex'),'escape'))
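For cross-checking what the SQL above should return, the same steps (take the 8 bytes of the integer, read them least significant byte first, interpret them as ASCII) can be sketched outside the database. The function name int_to_ascii_le and its width parameter are made up for this illustration; it assumes the value is non-negative and every byte is valid ASCII.

```python
def int_to_ascii_le(value: int, width: int = 8) -> str:
    """Decode a non-negative integer as `width` ASCII bytes, least significant byte first."""
    # byteorder="little" puts the lowest byte (0x41 = 'A' in the example) first,
    # which is the same effect reverse() has in the SQL version.
    return value.to_bytes(width, byteorder="little").decode("ascii")


print(int_to_ascii_le(5208208757389214273))  # prints ABCDEFGH
```

This is only a reference for checking results; the number stays a number in the database either way.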

Exporting values from SQL Server to Excel

I live in Brazil, where the decimal separator is a comma. For a bunch of reasons, I use dots as decimal separators in SQL Server, which is different from Excel.
With that being said, I would like to know why the following query
select 1.0*5
is understood as text in Excel (if so), when copying and pasting, and dots are not converted to commas, while
select cast(1.0*5 as float)
is understood as float in Excel.
What is the type of result in the first query?
UPDATE
If the query were
select 1.1*5
the result of copying and pasting into an Excel cell would be 5.5. It is not possible to convert this to a numeric value in Excel.
The second query, on the other hand, would result in 5,5. I can use this value in Excel in an addition operation, for example.
If you're doing it directly IN Excel, it seems that your regional settings are not seeing that as an operation with a decimal, but rather text. If you change your regional settings to US, it would probably resolve it correctly.
The difference between the two is that you are literally telling the value to be cast differently than the default. So your regional setting is overridden.
Excel, as smart as it is, tends to make many assumptions that could be tied to any number of things. Sometimes you just have to deal with it.
In the end, your 2nd query is likely to produce better results.

SQL Parse NVARCHAR Field

I am loading data from Excel files into a database on SQL Server 2008. There is one column with the nvarchar data type. This field contains data such as
Text text text text text text text text text text.
(ABC-2010-4091, ABC-2011-0586, ABC-2011-0587, ABC-2011-0604)
Text text text text text text text text text text.
(ABC-2011-0562, ABC-2011-0570, ABC-2011-0575, ABC-2011-0588)
So it's text with many sentences of this kind.
For each row I need to get the data ABC-####-####; more precisely, I only need the last part. So e.g. for ABC-2010-4091 I need to obtain 4091. I will need to join this number to another table. I guess it would be enough to get the last parts of the format ABC-####-####, then I should be able to handle the request.
So for the example given above, the result should be 4091, 0586, 0587, 0604, 0562, 0570, 0575, 0588 in the row instead of the whole nvarchar field value.
Is this possible somehow? The text in the nvarchar field differs, but the text format (ABC-####-####) I want to work with is always the same. Only the number of characters in the last part may vary, so it's not always 4 digits but could be 5 or more.
What is the best approach to get this data? Should I parse it in SSIS or on the SQL Server side with a SQL query? And how?
I am aware this is a tough task. I appreciate any help or advice on how to deal with it. I have not tried anything yet as I do not know where to start. I have read articles about SQL parsing, but I want to ask for the best approach to this task.
Stackoverflow is about programming.
Sit down and start programming.
Ok, seriously. That is string parsing and the last part in brackets with multiple fields means no bulk import, it is not a standard CSV file.
Either you use SSIS in SQL Server and program the parsing there or.... you write a program for that.
String manipulation in SQL is the worst part of the language and I would avoid it.
So, yes, sit down and program a routine. Probably the fastest way.
If I understand correctly, "ABC-####-####" will be the value coming through in the column and the numeric part is variable in length.
If that is the case, maybe this will work.
Use a "Derived Column" transformation.
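Let's say the column holding "ABC-####-####" is called Column1 (note that in an SSIS expression a column is referenced by name or in square brackets; double quotes would make it a string literal):
SUBSTRING([Column1],FINDSTRING([Column1],"-",2)+1,LEN([Column1])-FINDSTRING([Column1],"-",2))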
If I am not mistaken, that should give you the last # values in a new column no matter how long that value is.
HTH
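If the parsing ends up in a program rather than in SSIS or T-SQL, a regular expression handles the variable-length last part directly. This is a minimal sketch, not the asker's eventual solution: the pattern and the names CODE_SUFFIX and extract_suffixes are assumptions for illustration, and it assumes every code has the literal prefix ABC- followed by a 4-digit year.

```python
import re

# Assumed pattern: "ABC-", four digits, "-", then one or more digits (captured).
CODE_SUFFIX = re.compile(r"ABC-\d{4}-(\d+)")


def extract_suffixes(text: str) -> list[str]:
    """Return the last numeric part of every ABC-####-#### code found in text."""
    # The parts are kept as strings so leading zeros (e.g. "0586") survive.
    return CODE_SUFFIX.findall(text)


sample = "Text text text. (ABC-2010-4091, ABC-2011-0586, ABC-2011-0587, ABC-2011-0604)"
print(extract_suffixes(sample))  # prints ['4091', '0586', '0587', '0604']
```

Each returned value can then be written out as its own row for the join against the other table.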
I have worked this problem out with the following guides:
Split Multi Value Column into Multiple Records &
Remove Multiple Spaces with Only One Space

RIGHT function, not returning whats expected?

Query:
SELECT StartDate, EndDate, RIGHT(Sector, 1 )
FROM Table1
ORDER BY Right(Sector, 1), StartDate
By looking at this, the query should order everything by sector, followed by the start date. This query worked for quite a while until yesterday, when it did not order the rows properly: for some reason, Sector 2 came before Sector 1.
The data type for Sector is of type int, not null. After inserting a TRIM function into Sector it seems to work fine afterwards.
New Query:
SELECT StartDate, EndDate, RIGHT(Sector, 1 )
FROM Table1
ORDER BY Right(TRIM(Sector), 1), StartDate
I found this really weird, since it's supposed to pick out only one character, so why are there leading spaces?
Is there an issue with using RIGHT function on a int before converting the type? Or is it something else?
Thanks for the help everyone!
-Edit- The RIGHT function should return either 1,2,3 or 4 however when ordering it, 2 comes before 1.
To clarify, the column Sector contains an int value, and we can determine its location from the last digit (which is what the previous coder did).
MS Access 2003 has a curious little feature (I can't speak for the other versions):
Make a simple query. Sort by Column A Ascending. Save the query.
Run the query. When you see the output, sort by Column A Descending using the toolbar option. Save & close.
Run the query again. Your new sort will have overridden the sort that you saved in the query.
I think you or someone else probably just opened the query out of curiosity, sorted by Sector Descending, and when prompted to save Design Changes, you chose Yes (even though technically you didn't make any). The easiest way I found to restore the original sort is to edit the query and save it.
You've got your data stored wrong if you need to sort on a subcharacter of a numeric field.
That said, in certain context, VBA functions reserve a space in string representations of numbers for the sign. A nonsensical example of this would be:
?Len("12345")
 5
Notice the space at the beginning (where the - would be if the number returned by Len() could be negative). I thought this was a result of coercing a number to a string value, but that's not it, and I couldn't replicate the problem. But that would likely be the source of the problem, and, of course, trimming off the leading space would take care of the issue.
But that's two function calls for each row, and then you're sorting by the result, which means no use of indexes, so it's going to be slow relative to an ORDER BY that can use indexes. So I'd conclude you have a schema error, in that you're giving meaning to a subpart of the data stored in the field.
It seems pretty obvious that you have a blank space at the end of the Sector field that the trim is removing.
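Since Sector is an int, one way to sidestep any number-to-string coercion (and the stray space that comes with it) is to take the last digit arithmetically instead of with RIGHT. None of the answers above proposes this; it is a sketch of the idea with made-up rows, and last_digit is a hypothetical helper.

```python
# Hypothetical (Sector, StartDate) rows; the last digit of Sector is the location.
rows = [(12, "2009-01-05"), (21, "2009-01-03"), (31, "2009-01-01"), (42, "2009-01-02")]


def last_digit(n: int) -> int:
    """Last decimal digit of an integer, with no string conversion involved."""
    return abs(n) % 10


# Order by last digit of Sector, then by StartDate, like the ORDER BY in the question.
ordered = sorted(rows, key=lambda row: (last_digit(row[0]), row[1]))
for sector, start_date in ordered:
    print(last_digit(sector), start_date)
```

In the query itself the equivalent would be ordering on Sector Mod 10, assuming Sector is never negative.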

SQL Server xml string parsing in varchar field

I have a varchar column in a table that is used to store xml data. Yeah I know there is an xml data type that I should be using, but I think this was set up before the xml data type was available so a varchar is what I have to use for now. :)
The data stored looks similar to the following:
<xml filename="100100_456_484351864768.zip"
event_dt="10/5/2009 11:42:52 AM">
<info user="TestUser" />
</xml>
I need to parse the filename to get the digits between the two underscores which in this case would be "456". The first part of the file name "shouldn't" change in length, but the middle number will. I need a solution that would work if the first part does change in length (you know it will change because "shouldn't change" always seems to mean it will change).
For now, I'm using XQuery to pull out the filename because I figured this is probably better than straight string manipulation. I cast the string to xml to do this, but I'm not an XQuery expert so of course I'm running into issues. I found a function for XQuery (substring-before), but was unable to get it to work (I'm not even sure that function will work with SQL Server). There might be an XQuery function to do this easily, but if there is I am unaware of it.
So, I get the filename from the table with a query similar to the following:
select CAST(parms as xml).query('data(/xml/@filename)') as p
from Table1
From this I'd assume that I'd be able to CAST this back to a string then do some instring or charindex function to figure out where the underscores are so that I can encapsulate all of that in a substring function to pick out the part I need. Without going too far into this I am pretty sure that I can eventually get it done this way, but I know that there has to be an easier way. This way would make a huge unreadable field in the SQL Statement which even if I moved it to a function would still be confusing to try to figure out what is going on.
I'm sure there is an easier way than this since it seems to be simple string manipulation. Perhaps someone can point me in the right direction. Thanks
You can use XQuery for this - just change your statement to:
SELECT
CAST(parms as xml).value('(/xml/@filename)[1]', 'varchar(260)') as p
FROM
dbo.Table1
That gives you a VARCHAR(260) long enough to hold any valid file name and path - now you have a string and can work on it with SUBSTRING etc.
Marc
The straightforward way to do this is with SUBSTRING and CHARINDEX. Assuming (wise or not) that the first part of the filename doesn't change length, but that you still want to use XQuery to locate the filename, here's a short repro that does what you want:
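-- Table variable standing in for the real table
declare @t table (
parms varchar(max)
);
insert into @t values ('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');
-- Pull the filename attribute out with XQuery, then cut the number out of it
with T(fName) as (
select cast(cast(parms as xml).query('data(/xml/@filename)') as varchar(100)) as p
from @t
)
select
-- Starts at position 8 (just after the first underscore) and runs to the next underscore
substring(fName,8,charindex('_',fName,8)-8) as myNum
from T;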
There are sneaky solutions that use other string functions like REPLACE and PARSENAME or REVERSE, but none is likely to be more efficient or readable. One possibility to consider is writing a CLR routine that brings regular expression handling into SQL.
By the way, if your xml is always this simple, there's no particular reason I can see to use XQuery at all. Here are two queries that will extract the number you want. The second is safer if you don't have control over extra white space in your xml string or over the possibility that the first part of the file name will change length:
select
substring(parms,23,charindex('_',parms,23)-23) as myNum
from @t;
select
substring(parms,charindex('_',parms)+1,charindex('_',parms,charindex('_',parms)+1)-charindex('_',parms)-1) as myNum
from @t;
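To check what these queries should return for a given row, the same extraction can be sketched outside the database: parse the XML, read the filename attribute, and take whatever sits between the first two underscores. The function name middle_number is made up for this example, and it assumes the stored string is well-formed XML with a filename attribute containing at least two underscores.

```python
import xml.etree.ElementTree as ET


def middle_number(parms: str) -> str:
    """Return the part of the filename attribute between the first two underscores."""
    filename = ET.fromstring(parms).get("filename")
    # Splitting on "_" does not depend on how long the first part is.
    return filename.split("_")[1]


sample = (
    '<xml filename="100100_456_484351864768.zip" '
    'event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>'
)
print(middle_number(sample))  # prints 456
```

Like the second query above, this keeps working if the first part of the file name changes length.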
Unfortunately, SQL Server is not a conformant XQuery implementation; rather, it implements a fairly limited subset of a draft version of the XQuery spec. Not only does it lack fn:substring-before, it also lacks fn:index-of (to do it yourself using fn:substring) and fn:string-to-codepoints. So, as far as I can tell, you're stuck with SQL here.