I have a table with an nvarchar(max) column that has all kinds of JSON text stored in it. I was hoping to use something like this to extract the JSON, but that only handles one JSON object at a time. How can I run this on every row and get one big table with all of the data?
I didn't look in detail at that article, but it seems to me that you could use CROSS APPLY or OUTER APPLY to do that with whatever parsing function you have got.
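A minimal sketch of the APPLY approach, assuming SQL Server 2016 or later (database compatibility level 130 or higher), where the built-in OPENJSON can stand in for the parsing function from the article. The table and column names (dbo.MyTable, Id, JsonText) are hypothetical:

```sql
-- Hypothetical table: dbo.MyTable (Id int, JsonText nvarchar(max))
SELECT t.Id,
       j.[key]   AS JsonKey,
       j.[value] AS JsonValue,
       j.[type]  AS JsonType   -- 0 null, 1 string, 2 number, 3 boolean, 4 array, 5 object
FROM dbo.MyTable AS t
-- The CASE passes NULL for rows that are not valid JSON, so OPENJSON
-- should return no rows for them instead of raising an error.
-- Switch to OUTER APPLY to keep those rows with NULL key/value columns.
CROSS APPLY OPENJSON(CASE WHEN ISJSON(t.JsonText) = 1 THEN t.JsonText END) AS j;
```

Without a WITH clause, OPENJSON returns generic key, value and type columns, so rows with differently shaped JSON can share one result set. If the documents have a known structure, a WITH clause on OPENJSON can return typed columns instead.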
A project I'm working on involves storing a string of data in a table column. The table will have other columns relevant to the records. We decided to store the string data column using JSON.
From the table, a view will parse the JSON column into separate columns. The view will also have columns derived from the other main table columns. The data from the view is then used to populate parts of a document through SSRS.
When loading data into the main table, I need to utilize separate tables for deriving the other column values and the JSON column. I decided to use common table expressions for this. At the end of the query, I bring together the derived columns from the different common table expressions, including the JSON column, and insert them into the main table.
I had it almost done until I realized that when I use FOR JSON to create the JSON column, it escapes special characters. I did some research and have been trying to use the JSON_QUERY function to get around this but it's not working. Here is a simplification of the problem:
WITH Table1
(
First_Name_JSON
)
As
(
SELECT 'Tim/' As First_Name
FOR JSON PATH
)
SELECT JSON_QUERY(Table1.First_Name_JSON) as first_name
FROM Table1
FOR JSON PATH
Here is the output:
[{"first_name":[{"First_Name":"Tim\/"}]}]
Why is it still escaping? The documentation shows that passing a column that was created by a FOR JSON should make the JSON_QUERY function return it without escaped characters.
I know that this works:
SELECT JSON_QUERY('{"Firt_Name": "Tim/"}') as first_name
FOR JSON PATH
Output:
[{"first_name":{"Firt_Name": "Tim/"}}]
However, I need to be able to pass a column that's holding JSON data already because it's pretty long logic with many columns. Using FOR JSON is ideal for making changes versus hard coding the JSON format around each column.
I must be missing something. Thanks for any help.
It's quite simple:
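{"Firt_Name": "Tim/"} is already valid JSON, so JSON_QUERY can return it as is. In your CTE, Tim/ is a plain string value, so the inner FOR JSON escapes it (/ becomes \/) when it builds the JSON. JSON_QUERY only stops the outer FOR JSON from escaping that fragment a second time.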
Quote from the docs:
Using JSON_QUERY with FOR JSON
JSON_QUERY returns a valid JSON fragment. As a result, FOR JSON doesn't escape special characters in the JSON_QUERY return value.
If you're returning results with FOR JSON, and you're including data that's already in JSON format (in a column or as the result of an expression), wrap the JSON data with JSON_QUERY without the path parameter.
Given your use case, is it not possible to pass through the JSON to where you need it and un-escape it there? OPENJSON and JSON_VALUE are capable of this.
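As a small illustration of un-escaping on the consuming side, here is a sketch that reads the value back out of the output shown in the question. It assumes SQL Server 2016 or later; the variable name @json is made up for the example:

```sql
-- The nested FOR JSON output from the question, including the escaped slash.
DECLARE @json nvarchar(max) = N'[{"first_name":[{"First_Name":"Tim\/"}]}]';

-- JSON_VALUE returns the scalar with JSON escape sequences decoded.
SELECT JSON_VALUE(@json, '$[0].first_name[0].First_Name') AS First_Name;

-- OPENJSON decodes the value as well when shredding the array into rows.
SELECT j.First_Name
FROM OPENJSON(@json, '$[0].first_name')
     WITH (First_Name nvarchar(100) '$.First_Name') AS j;
```

In JSON, \/ and / denote the same character, so a compliant parser reads the value as Tim/ either way.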
I am currently building a data warehouse in Snowflake for the business that I work for, and I have encountered some problems. I used to apply the JSON_VALUE function in T-SQL to extract certain key/value pairs from a JSON-formatted field in my original MSSQL DB.
All the other fields are in regular SQL format, but there is one field that I really need that is formatted as JSON, and I can't seem to extract the key/value pair that I need.
I'm new to SnowSQL and I can't seem to find a way to extract this within a regular query. Does anyone know a way around my problem?
ID | TYPE | Name (JSON format) | Amount
1 | 5 | {En: "lunch", fr: "diner"} | 10.00
I would like to extract this line (for example) and retrieve only the En: "lunch" part from my JSON-formatted field.
Thank you !
Almost any time you use JSON in Snowflake, it's advisable to use the VARIANT data type. You can use the parse_json function to convert a string into a variant with JSON.
select
parse_json('{En: "lunch", fr: "diner"}') as VARIANT_COLUMN,
VARIANT_COLUMN:En::string as ENGLISH_WORD;
In this sample, the first column converts your JSON into a variant named VARIANT_COLUMN. The second column uses the variant, extracting the "En" property and casting it to a string data type.
You can define columns as variant and store JSON natively. That's going to improve performance and allow parsing using dot notation in SQL.
For anyone else who also stumbles upon this question:
You can also use JSON_EXTRACT_PATH_TEXT. Here is an example, if you wanted to create a new column called meal.
select json_extract_path_text(Name,'En') as meal from ...
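Applied to a table rather than a literal, either approach might look like the sketch below. The table name MY_TABLE is hypothetical, and the column names are taken from the sample row in the question:

```sql
-- Variant approach: parse the string column, then navigate with the : path syntax.
SELECT ID,
       PARSE_JSON(NAME):En::string AS MEAL,
       AMOUNT
FROM MY_TABLE;

-- String approach: no explicit parsing step.
SELECT ID,
       JSON_EXTRACT_PATH_TEXT(NAME, 'En') AS MEAL
FROM MY_TABLE;
```

Path elements are case-sensitive, so En and EN are different keys. If some rows may hold invalid JSON, TRY_PARSE_JSON returns NULL for them instead of failing the query.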
Let me explain why I want to do this... I have built a Tableau dashboard that allows a user to browse/search all of the tables & columns in our warehouse by schema, object type (table,view,materialized view), etc. I want to add a column that pulls a sample of the data from each column in each table - this is also done, but with this problem...:
The resulting column is made up of data of different types (VARCHAR2, LONG, etc.). I can basically get every type of data to conform to a single data type except for LONG: it will not allow me to convert it to anything compatible with everything else (if that makes sense...). I simply need all data types to coexist in a single column. I've tried many different things and have been reading up on the subject for about a week now. It sounds like it just can't be done, but in my experience there is always a way... I figured I'd check with the gurus here before admitting defeat.
One of the things I've tried:
--Here, from two different tables, I'm pulling a single piece of data from a single column and attempting to merge into a single column called SAMPLE_DATA
--OTHER is LONG data type
--ORGN_NME is VARCHAR2 data type
select 'PLAN','OTHER', cast(substr(OTHER,1,2) as varchar2(4000)) as SAMPLE_DATA from sde.PLAN union all
select 'BUS_ORGN','ORGN_NME', cast(substr(ORGN_NME,1,2) as varchar2(4000)) as SAMPLE_DATA from sde.BUS_ORGN;
Resulting error:
Lookup Error
ORA-00932: inconsistent datatypes: expected CHAR got LONG
How can I achieve this?
Thanks in advance
Long datatypes are basically unusable by most applications. I made something similar where I wanted to search the contents of packages. The solution is to convert the LONG into CLOB using a pipelined function. Adrian Billington's source code can be found here:
https://github.com/oracle-developer/dla
You end up with a view that you can query. I did not see any performance hit even when looking at large packages so it should work for you.
I have totally rewritten my question because of inaccurate description of the problem!
We have to store a lot of different information about a specific region. For this we need a flexible data structure which does not limit the possibilities for the user.
So we've created a key-value table for this additional data, which is described through a meta table that contains the datatype of the value.
We already use this information for queries over our REST API. We then automatically wrap the requested field in a cast.
SQL Fiddle
We return this data together with information from other tables as a JSON object. We convert the corresponding rows from the data table with array_agg and json_object into a JSON object:
...
CASE
WHEN count(prop.name) = 0 THEN '{}'::json
ELSE json_object(array_agg(prop.name), array_agg(prop.value))
END AS data
...
This works very well. The problem is that if we store something like a floating point number in this field, we get back a string representation of that number:
e.g. 5.231 returns as "5.231"
Now we would like to CAST this number to the right datatype during our SELECT statement so the JSON result is correctly formatted. We have all the information we need, so we tried the following:
SELECT
json_object(array_agg(data.name),
-- here I cast the value into the right datatype!
-- results in an error
array_agg(CAST(value AS datatype))) AS data
FROM data
JOIN (
SELECT name, datatype
FROM meta)
AS info
ON info.name = data.name
The error message is the following:
ERROR: type "datatype" does not exist
LINE 3: array_agg(CAST(value AS datatype))) AS data
^
Query failed
PostgreSQL said: type "datatype" does not exist
So is it possible to dynamically cast using the text in the datatype column as a PostgreSQL type, to return a well-formatted JSON object?
First, that's a terrible abuse of SQL, and ought to be avoided in practically all scenarios. If you have a scenario where this is legitimate, you probably already know your RDBMS so intimately, that you're writing custom indexing plugins, and wouldn't even think of asking this question...
If you tell us what you're actually trying to do, there's about a 99.9% chance we can tell you a better way to do it.
Now with that disclaimer aside:
This is not possible without using dynamic SQL. In PostgreSQL, you can accomplish this inside a PL/pgSQL function or DO block with the EXECUTE statement, which you can read about in the manual. It basically boils down to building the query text as a string and running it.
Note, however, that even using this method, the result for every row fetched in the same query must have the same data type. In other words, you can't expect that row 1 will have a data type of VARCHAR, and row 2 will have INT. That is completely impossible.
The problem you have is that json_object creates an object out of a string array for the keys and another string array for the values. So if you feed JSON values into this function, it will always return an error.
So the first problem is that you have to use a JSON or JSONB column for the values. Or you can convert the values from string to JSON with to_json().
The second problem is that you need another function to create your JSON object, because you want to feed it strings for the keys and JSON values for the values. For this there is a function called json_object_agg.
Then your output should be like the one you expected! Here is the full query:
SELECT
json_object_agg(data.name, to_json(data.value)) AS data
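One caveat, in case value is stored as text: to_json on a text value produces a JSON string, so 5.231 would still come back quoted. A possible workaround that avoids dynamic SQL is to branch on the meta table's datatype and cast inside each branch, so that every branch yields json. This is only a sketch; the datatype labels ('numeric', 'boolean') are assumptions about what the meta table contains:

```sql
SELECT json_object_agg(
         d.name,
         CASE m.datatype
           WHEN 'numeric' THEN to_json(d.value::numeric)   -- emitted as a JSON number
           WHEN 'boolean' THEN to_json(d.value::boolean)   -- emitted as true / false
           ELSE to_json(d.value)                           -- everything else stays a JSON string
         END
       ) AS data
FROM data AS d
JOIN meta AS m ON m.name = d.name;
```

Every CASE branch has type json, which sidesteps the rule that an expression must have a single type across all rows. json_object_agg requires PostgreSQL 9.4 or later.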
FROM data
I have a PostgreSQL column of type text that contains data like shown below
(32.85563, -117.25624)(32.855470000000004, -117.25648000000001)(32.85567, -117.25710000000001)(32.85544, -117.2556)
(37.75363, -121.44142000000001)(37.75292, -121.4414)
I want to convert this into another column of type text like shown below
(-117.25624, 32.85563)(-117.25648000000001,32.855470000000004 )(-117.25710000000001,32.85567 )(-117.2556,32.85544 )
(-121.44142000000001,37.75363 )(-121.4414,37.75292 )
As you can see, the values inside the parentheses have switched around. Also note that I have shown two records here to indicate that not all fields have same number of parenthesized figures.
What I've tried
I tried extracting the column to Java and performing my operations there. But due to the sheer number of records I have, I run out of memory. I also cannot do this in batches due to time constraints.
What I want
A SQL query or a sequence of SQL queries that will achieve the result that I have mentioned above.
I am using PostgreSQL 9.4 with pgAdmin III as the client.
This is a type of problem that should not be solved with SQL, but you are lucky to be using Postgres.
I suggest the following steps in defining your algorithm.
The first part turns your strings into structured data; the second transforms the structured data back into a string in the format that you require.
From string to data
First, you need to turn your bracketed values into an array, which can be done with the string_to_array function.
Now you can turn this array into rows with the unnest function, which will return a row per bracketed value.
Finally, you need to split the values in each row into two fields.
From data to string
You need to group the results of the first query and wrap them in the string_agg function, which will combine the numbers from all rows back into one string.
You will need to experiment with the brackets to achieve exactly what you want.
PS. I am not providing a query here. Once you have some code that you have tried, let me know.
Assuming you also have a PK or some unique column, and possibly other columns, you can do as follows:
SELECT id, (...), string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
FROM (
SELECT id, (...), unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
FROM my_table) sub
GROUP BY id; -- assuming id is PK or no other columns
PostgreSQL has the point type which you can use here. First you need to make sure you can properly divide the long string into individual points (insert ';' between the parentheses), then turn that into an array of individual points in text format, unnest the array into individual rows, and finally cast those rows to the point data type:
unnest(string_to_array(replace(col, ')(', ');('), ';'))::point AS pt
You can then create a new point from the point you just created, but with the coordinates reversed, turn that into a string and aggregate into your desired output:
string_agg(point(pt[1], pt[0])::text, '') AS col_reversed
But you might also move away from the text format and make an array of point values as that will be easier and faster to work with:
array_agg(point(pt[1], pt[0])) AS pt_reversed
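One thing to watch: string_agg without an ORDER BY does not guarantee that the points come back in their original order. A sketch that keeps the position of each point, using WITH ORDINALITY (available from PostgreSQL 9.4) and the same placeholder names my_table, id and col as above:

```sql
SELECT t.id,
       string_agg(point(p.pt[1], p.pt[0])::text, '' ORDER BY p.ord) AS col_reversed
FROM my_table AS t
CROSS JOIN LATERAL (
    -- ord is the 1-based position of each point within the original string
    SELECT u.elem::point AS pt, u.ord
    FROM unnest(string_to_array(replace(t.col, ')(', ');('), ';'))
         WITH ORDINALITY AS u(elem, ord)
) AS p
GROUP BY t.id;
```

Note that point stores its coordinates as double precision, so the digits in the text output may differ slightly from the input, for example in the number of decimals printed.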
As I put in the question, I tried extracting the column to Java and performing my operations there. But due to the sheer number of records I have, I ran out of memory. I also could not do this in batches due to time constraints.
I ran out of memory because I was putting everything in a HashMap of <my_primary_key, the_newly_formatted_text>. As the text was sometimes very long, and given the sheer number of records that I had, it wasn't surprising that I got an OOM.
Solution that I used:
As suggested by many folks here, this problem was better solved with code. I wrote a small script that formatted the text to my liking and wrote the primary key and the newly formatted text to a file in TSV format. Then I imported the TSV into a new table and updated the original table from the new one.
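For reference, the import-and-update step could look roughly like this. The staging table, file path and column names are hypothetical, and the file is assumed to contain one id and one formatted string per line, separated by a tab:

```sql
-- Staging table for the script's output.
CREATE TABLE reversed_import (
    id           bigint PRIMARY KEY,
    col_reversed text
);

-- COPY's default text format is tab-delimited. It reads the file on the
-- database server; from psql, \copy does the same with a client-side file.
COPY reversed_import (id, col_reversed) FROM '/path/to/reversed.tsv';

-- Update the original table in one set-based statement.
UPDATE my_table AS t
SET col_reversed = r.col_reversed
FROM reversed_import AS r
WHERE r.id = t.id;
```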