How to split a column value to multiple columns based on delimiter in Snowflake - sql

Need to split column value into multiple columns based on delimiter,
Also need to create columns dynamically based on no. of delimiters, delimiter could be comma or so. Thanks,

You might be better off working with an array in this case. You haven't specified whether each record will have the same number of delimiters, so dynamically creating columns for tables will take some scripting. If they are, you could potentially use a SPLIT, LATERAL FLATTEN, and PIVOT, but the pivot needs static column names, so you'll likely need a stored procedure to deal with that.
To answer your first question, you can use SPLIT and/or SPLIT_PART to split the column into values. The SPLIT function creates an array for you, while the SPLIT_PART function creates the array, but outputs a single value from the array.
https://docs.snowflake.com/en/sql-reference/functions/split.html
https://docs.snowflake.com/en/sql-reference/functions/split_part.html
https://docs.snowflake.com/en/sql-reference/functions/flatten.html
https://docs.snowflake.com/en/sql-reference/constructs/pivot.html

Related

Convert a comma separated string into differents rows in a single query in InfluxDB

I have a table in InfluxDB which, for optimization, contains a field that can either come only as it would be for example "FR204" or it can come in a string concatenated by commas as would be the case with "FR204, FR301".
The fact is that I would like to be able to do exactly what is reflected in this case: https://rstopup.com/convertir-una-cadena-separada-por-comas-en-cada-una-de-las-filas.html
Is it possible to do this in InfluxDB? Thanks.

Split multiple points in text format and switch coordinates in postgres column

I have a PostgreSQL column of type text that contains data like shown below
(32.85563, -117.25624)(32.855470000000004, -117.25648000000001)(32.85567, -117.25710000000001)(32.85544, -117.2556)
(37.75363, -121.44142000000001)(37.75292, -121.4414)
I want to convert this into another column of type text like shown below
(-117.25624, 32.85563)(-117.25648000000001,32.855470000000004 )(-117.25710000000001,32.85567 )(-117.2556,32.85544 )
(-121.44142000000001,37.75363 )(-121.4414,37.75292 )
As you can see, the values inside the parentheses have switched around. Also note that I have shown two records here to indicate that not all fields have same number of parenthesized figures.
What I've tried
I tried extracting the column to Java and performing my operations there. But due to sheer amount of records I have, I will run out of memory. I also cannot do this method in batched due to time constraints.
What I want
A SQL query or a sequence of SQL queries that will achieve the result that I have mentioned above.
I am using PostgreSQL9.4 with PGAdmin III as the client
this is a type of problem that should not be solved by sql, but you are lucky to use Postgres.
I suggest the following steps in defining your algorithm.
First part will be turning your strings into a structured data, second will transform structured data back to string in a format that you require.
From string to data
First, you need to turn your bracketed values into an array, which can be done with string_to_array function.
Now you can turn this array into rows with unnest function, which will return a row per bracketed value.
Finally you need to slit values in each row into two fields.
From data to string
You need to group results of the first query with results wrapped in string_agg function that will combine all numbers in rows into string.
You will need to experiment with brackets to achieve exactly what you want.
PS. I am not providing query here. Once you have some code that you tried, let me know.
Assuming you also have a PK or some unique column, and possibly other columns, you can do as follows:
SELECT id, (...), string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
FROM (
SELECT id, (...), unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
FROM my_table) sub
GROUP BY id; -- assuming id is PK or no other columns
PostgreSQL has the point type which you can use here. First you need to make sure you can properly divide the long string into individual points (insert ';' between the parentheses), then turn that into an array of individual points in text format, unnest the array into individual rows, and finally cast those rows to the point data type:
unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
You can then create a new point from the point you just created, but with the coordinates reversed, turn that into a string and aggregate into your desired output:
string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
But you might also move away from the text format and make an array of point values as that will be easier and faster to work with:
array_agg(point(pt[1], pt[0])) AS pt_reversed
As I put in the question, I tried extracting the column to Java and performing my operations there. But due to sheer amount of records I have, I will run out of memory. I also cannot do this method in batched due to time constraints.
I ran out of memory here as I was putting everything in a Hashmap of
< my_primary_key,the_newly_formatted_text >. As the text was very long sometimes and due to the sheer number of records that I had, it wasnt surprising that I got an OOM.
Solution that I used:
As suggested my many folks here, this solution was better solved with a code. I wrote a small script that formatted the text as per my liking and wrote the primary key and the newly formatted text to a file in tsv format. Then I imported the tsv in a new table and updated the original table from the new one.

Split comma-separated string in T-SQL using Coalesce()?

I "googled" and found a magical and elegant sql query to split a comma-separated input string into rows in a single column. Doing so allows a join instead of a "where in". It used a select into or insert into combined with a select, coalesce() and where to create rows, one for each value in the string.
There are plenty of examples using coalesce() to form a string but none (any more) to split it. I've also found this solution for the meantime:
http://www.sqlservercentral.com/articles/T-SQL/62867/
But I'm curious now, that I cannot "regoogle" for the gem I found before (about a year ago).
Has anyone seen how to split a string with coalesce()? If so, how does its performance compare to the various sql string splitters that have been studied and compiled?
COALESCE simply returns the first non-null value from a group of expressions. It would not perform any magical splitting of a delimited string.

In Oracle, how do you select multiple values from a related table and store them in a single column?

I'm selecting columns from one table and would like to select all values of a column from a related table when the two tables have a matching value, separate them by commas, and display them in a single column with my results from table one.
I'm fairly new to this and apologize ahead of time if I'm not wording it correctly.
It sounds like what you're trying to do is to take multiple rows and aggregate them into a single row by concatenating string values from one or more columns. Yes?
If that's the case, I can tell you that it's a more difficult problem than it seems if you want to do it using portable SQL - especially if you don't know ahead of time how many items you may get.
The Oracle-specific solution often used in such cases is to implement a custom aggregate function - STRAGG(). Here's a link to an article that describes exactly how to do so and has examples of it's usage.
If you're on Oracle 9i or later and are willing to live with using undocumented functions (that could change in the future), you can also look at the WM_CONCAT() function - which does much the same thing.
You want a row aggregation or concatenation function, choices are:
If you are using Oracle 11gR2, there is a built-in function to aggregate strings with a delimiter called LISTAGG(column, delimiter).
If you are using any earlier release of Oracle database, you can use WM_CONCAT(column) function, however you have no choice of delimiter and will have to use something like TRANSLATE(string, string_to_replace, replacement_string) function to change the delimiter afterwards if your data does not contain commas.
As mentioned by LBushkin, you can create a custom function in your schema to perform row aggregation for you. Here is PL/SQL code example for one: http://www.oracle-base.com/articles/misc/StringAggregationTechniques.php#user_defined_aggregate_function

Force numerical order on a SQL Server 2005 varchar column, containing letters and numbers?

I have a column containing the strings 'Operator (1)' and so on until 'Operator (600)' so far.
I want to get them numerically ordered and I've come up with
select colname from table order by
cast(replace(replace(colname,'Operator (',''),')','') as int)
which is very very ugly.
Better suggestions?
It's that, InStr()/SubString(), changing Operator(1) to Operator(001), storing the n in Operator(n) separately, or creating a computed column that hides the ugly string manipulation. What you have seems fine.
If you really have to leave the data in the format you have - and adding a numeric sort order column is the better solution - then consider wrapping the text manipulation up in a user defined function.
select colname from table order by dbo.udfSortOperator(colname)
It's less ugly and gives you some abstraction. There's an additional overhead of the function call but on a table containing low thousands of rows in a not-too-heavily hit database server it's not a major concern. Make notes in the function to optomise later as required.
My answer would be to change the problem. I would add an operatorNumber field to the table if that is possible. Change the update/insert routines to extract the number and store it. That way the string conversion hit is only once per record.
The ordering logic would require the string conversion every time the query is run.
Well, first define the meaning of that column. Is operator a name so you can justify using chars? Or is it a number?
If the field is a name then you will use chars, and then you would want to determine the fixed length. Pad all operator names with zeros on the left. Define naming rules for operators (I.E. No leters. Or the codes you would use in a series like "A001")
An index will sort the physical data in the server. And a properly define text naming will sort them on a query. You would want both.
If the operator is a number, then you got the data type for that column wrong and needs to be changed.
Indexed computed column
If you find yourself ordering on or otherwise querying operator column often, consider creating a computed column for its numeric value and adding an index for it. This will give you a computed/persistent column (which sounds like oxymoron, but isn't).