How do I remove the ".000000" part of the "2386.000000" field? I want to keep only the numerical part before the dot, in Databricks.
You can use cast
select cast(2386.000000 as int) i
There are many ways to convert a float to an int. The cast() function is just one.
Please see the following link for all supported Spark SQL functions.
https://spark.apache.org/docs/2.3.0/api/sql/index.html#round
In my solution, I use the round() function. As long as you get the correct answer, the path you take may differ.
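One caveat worth checking before choosing between the two: cast and round agree on 2386.000000 but can disagree once the fractional part is non-zero. The sketch below runs the comparison in SQLite through Python's sqlite3 module, which is an assumption made only to have something runnable; it is not Databricks, and Spark SQL should be tested separately. The value 2386.7 is a made-up example.

```python
import sqlite3

# Assumption: SQLite stands in for the real engine, only to illustrate
# truncation (CAST) versus rounding (ROUND). Databricks is not involved.
conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT CAST(2386.000000 AS INTEGER), "
    "       CAST(2386.7 AS INTEGER), "
    "       CAST(ROUND(2386.7) AS INTEGER)"
).fetchone()
conn.close()

cast_whole, cast_fraction, round_fraction = row
print(cast_whole, cast_fraction, round_fraction)  # 2386 2386 2387
```

If the goal is strictly "drop everything after the dot", the truncating cast matches that description; round only matches when the fraction is below one half.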
I am new to SQLite. When I use MySQL, it's reasonable to use count(*)/5. However, in SQLite, I try to calculate count(Name)/5 but the result shows zero.
I don't know why this won't work. Is there any way to calculate this?
Because COUNT returns an integer and 5 is an integer, SQLite performs integer division and discards the fractional part. You will have to either cast one operand or simply add a decimal and divide by 5.0. Then the result is no longer limited to integers.
Cheers!
If the operands are both integers, SQLite does integer division. So, just make one operand a float:
SELECT COUNT(name)/5. FROM demo
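A small reproduction of the behaviour described above, run through Python's sqlite3 module. The table name demo comes from the query above; the three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (name TEXT)")
# Hypothetical sample rows: three names, so COUNT(name) is 3.
conn.executemany("INSERT INTO demo VALUES (?)", [("a",), ("b",), ("c",)])

# Both operands are integers: integer division, fraction discarded.
int_div = conn.execute("SELECT COUNT(name) / 5 FROM demo").fetchone()[0]
# One operand is a float: floating-point division.
float_div = conn.execute("SELECT COUNT(name) / 5.0 FROM demo").fetchone()[0]
# Casting one operand has the same effect.
cast_div = conn.execute(
    "SELECT CAST(COUNT(name) AS REAL) / 5 FROM demo"
).fetchone()[0]
conn.close()

print(int_div, float_div, cast_div)  # 0 0.6 0.6
```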
I have a list of URLs. Here is an example: www.site.com/product/item1/?utm_source=google&utm_medium=cpc
How can I get all characters before the question mark using BigQuery? So I want to get www.site.com/product/item1/ from this string.
Thanks a lot!
I think the easiest way is to use the SPLIT function, as in the example below:
SPLIT(url, '?')[OFFSET(0)]
As an alternative, you can use REGEXP_EXTRACT, as in the example below:
REGEXP_EXTRACT(url, r'[^?]*')
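To see what the two expressions above are doing, here is a rough Python analogue. It is not BigQuery code: str.split and re.match are stand-ins chosen for illustration, and the URL is the one from the question.

```python
import re

url = "www.site.com/product/item1/?utm_source=google&utm_medium=cpc"

# Analogue of SPLIT(url, '?')[OFFSET(0)]: split on '?' and keep the first piece.
via_split = url.split("?")[0]

# Analogue of REGEXP_EXTRACT(url, r'[^?]*'): the longest run of
# non-'?' characters starting at the beginning of the string.
via_regex = re.match(r"[^?]*", url).group(0)

print(via_split)  # www.site.com/product/item1/
print(via_regex)  # www.site.com/product/item1/
```

Both approaches return the whole string unchanged when it contains no question mark.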
You can use the REGEXP_EXTRACT function. You will have to write the regular expression yourself, though.
Furthermore, you could use Dataflow to transform the data as another option.
I'm working on making a simple SQL UDF, to cast from char/varchar to time.
Since it is supposed to be generic, I wanted the user to specify the format of their input, so I could just use something like
cast(user_time as time(0) format user_format)
but it doesn't work.
I'd like to know if it is possible to use a format as a UDF parameter and, if it is, how it should be used in this cast. I could break it apart to read the format and make the time match, but I'd rather avoid that if there's a simpler way to do it.
Just to clarify, I'm using Teradata 14, and I have to use SQL, so a UDF in C is not really an option for me.
Thanks in advance
You can't pass a FORMAT as a parameter to a CAST.
You might be able to do this using TO_TIMESTAMP instead, but why do you need a UDF for that?
myUDF('12:34:45', 'hh:mi:ss')
is not much shorter than an old style Teradata CAST:
'12:34:45' (time(0), FORMAT 'hh:mi:ss')
I've got this value '0310D45'
I'm using isnumeric to check if values are numeric prior to casting to a bigint. Unfortunately this value is passing the isnumeric check. So my query is failing saying:
Msg 8114, Level 16, State 5, Line 3
Error converting data type varchar to bigint.
What is the simplest way to handle this? I was thinking of using charindex but I would have to check all 26 letters.
Is there a simple solution that I'm not seeing? I really don't want to create a user defined function.
Thanks
Take a look at this article named What is wrong with IsNumeric()? which contains the following abstract:
Abstract: T-SQL's ISNUMERIC() function has a problem. It can falsely interpret
non-numeric letters and symbols (such as D, E, and £), and even tabs
(CHAR(9)) as numeric.
Unfortunately it looks like IsNumeric is just plain weird and you will have to write a few lines of T-SQL to get around it. (By weird I mean that if the data evaluated can be converted into ANY numeric type at all, then it is treated as numeric.)
I recently faced this problem and was looking for a solution. I think I found two, and wanted to post them here so that it's easier for others to find.
The first solution is to use a wildcard pattern with the SQL Server function PATINDEX(), which returns 0 when the string contains no character outside 0-9:
IF PATINDEX('%[^0-9]%', @testString) = 0
The second solution is to concatenate the string 'e0' to your test string and still use the SQL Server function ISNUMERIC() on the concatenated string. ISNUMERIC accepts characters such as d and e because they are used in some numeric notations, but it allows only one such character. Appending 'e0' uses up that allowance, so a test string that already contains one of these characters no longer yields a false positive.
IF (ISNUMERIC (@testString + 'e0') = 1)
Hope this helps
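The idea behind the PATINDEX check above, "reject the value if it contains any character outside 0-9", can be sketched outside the database as well. The Python function below is only an analogue for illustration; is_digits_only is a hypothetical name, and the sample values are the one from the question plus a made-up clean one.

```python
import re


def is_digits_only(value: str) -> bool:
    """Return True when value contains no character outside 0-9.

    Mirrors the condition PATINDEX('%[^0-9]%', value) = 0.
    """
    return re.search(r"[^0-9]", value) is None


print(is_digits_only("0310D45"))  # False: 'D' is not a digit
print(is_digits_only("031045"))   # True
```

Like the PATINDEX version, this also rejects signs, decimal points and whitespace, and it accepts an empty string, so an extra length check may be needed before casting.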
Have a look at this SO question for several alternative suggestions to the SQL Server ISNUMERIC().
I believe Erland has this as a connect item on his wishlist as well - something he calls is_valid_convert().
What is the general guidance on when you should use CAST versus CONVERT? Are there any performance issues related to choosing one versus the other? Is one closer to ANSI-SQL?
CONVERT is SQL Server specific, CAST is ANSI.
CONVERT is more flexible in that you can format dates etc. Other than that, they are pretty much the same. If you don't care about the extended features, use CAST.
EDIT:
As noted by @beruic and @C-F in the comments below, there is a possible loss of precision when an implicit conversion is used (that is, one where you use neither CAST nor CONVERT). For further information, see CAST and CONVERT and in particular this graphic: SQL Server Data Type Conversion Chart. With this extra information, the original advice still remains the same. Use CAST where possible.
Convert has a style parameter for date to string conversions.
http://msdn.microsoft.com/en-us/library/ms187928.aspx
To expand on the above answer copied by Shakti, I have actually been able to measure a performance difference between the two functions.
I was testing performance of variations of the solution to this question and found that the standard deviation and maximum runtimes were larger when using CAST.
*Times in milliseconds, rounded to nearest 1/300th of a second as per the precision of the DateTime type
CAST is standard SQL, but CONVERT exists only in the T-SQL dialect. CONVERT has a small advantage in the case of datetime values.
With CAST, you indicate the expression and the target type; with CONVERT, there’s a third argument representing the style for the conversion, which is supported for some conversions, like between character strings and date and time values. For example, CONVERT(DATE, '1/2/2012', 101) converts the literal character string to DATE using style 101 representing the United States standard.
Something no one seems to have noted yet is readability. Having…
CONVERT(SomeType,
        SomeReallyLongExpression
        + ThatMayEvenSpan
        + MultipleLines
       )
…may be easier to understand than…
CAST(SomeReallyLongExpression
     + ThatMayEvenSpan
     + MultipleLines
     AS SomeType
    )
CAST follows the ANSI standard, so for portability it will work on other platforms. CONVERT is specific to SQL Server, but it is a very powerful function: you can specify different styles for dates.
You should also not use CAST to get the text of a hash. CAST(HASHBYTES('...') AS VARCHAR(32)) is not the same as CONVERT(VARCHAR(32), HASHBYTES('...'), 2). Without the last parameter, CONVERT would return the same result as CAST, which is not readable text. As far as I know, you cannot specify that last parameter in CAST.
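The difference comes down to raw hash bytes versus their hexadecimal text. The sketch below shows that distinction in Python rather than T-SQL. MD5 and the input b"example" are assumptions made for illustration; the VARCHAR(32) above merely suggests a 16-byte hash.

```python
import hashlib

# Assumed algorithm and input, chosen only to have a concrete 16-byte hash.
raw_bytes = hashlib.md5(b"example").digest()

# Raw bytes reinterpreted as characters: mostly unprintable, which is
# roughly what the CAST to VARCHAR produces.
as_characters = raw_bytes.decode("latin-1")

# Hexadecimal text of the same bytes: the readable form that the
# CONVERT call with the last parameter is used to obtain.
hex_text = raw_bytes.hex().upper()

print(len(raw_bytes), len(as_characters), len(hex_text))  # 16 16 32
print(hex_text)
```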