Compare XML data to String - sql

I have a table that houses a bunch of data in an XML field. I can get to the data and display what I need in the select statement, but I also need to use that to compare to another table that houses a translation I am trying to do. Is there a way to compare the value being returned from the XML data to a string value that exists in another table?
The code in my select to return the XML data is:
prv.reported_attributes.value('(/row[@ATTRIBUTE="FIELD"][1])/@VALUE', 'varchar(5)')
I need to compare that text output to a column in another table, but the comparison keeps returning NULL, as if the values did not match. I have confirmed that matching values do exist.
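A common way to make this kind of comparison is to extract the XML value once and join on it. The following is only a sketch under assumptions, not a diagnosis: the table and column names (providers, translations, source_code, translated_text) are hypothetical, the XPath is written with the attribute syntax @ATTRIBUTE / @VALUE, and the wider varchar(50) plus the trim are there to rule out truncation and stray whitespace as reasons for the mismatch.

```sql
-- Hypothetical names: providers, translations, source_code, translated_text.
SELECT
    x.field_value,
    t.translated_text
FROM providers AS prv
CROSS APPLY (
    -- Extract the attribute once so it can be reused in the join.
    -- varchar(50) instead of varchar(5) avoids cutting off longer values.
    SELECT LTRIM(RTRIM(
        prv.reported_attributes.value(
            '(/row[@ATTRIBUTE="FIELD"]/@VALUE)[1]', 'varchar(50)')
    )) AS field_value
) AS x
LEFT JOIN translations AS t
    ON t.source_code = x.field_value;
```

Selecting x.field_value next to the translation makes it easy to see which extracted values fail to match. If the join still finds nothing, differences in collation or hidden characters in either column would be the next thing to check.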

Related

Handling Json data in snowflake

I have a table in which each row contains JSON file data, and it is loaded into my Snowflake table every week. I am extracting values from the JSON into another table. When the data is loaded in JSON format, there are multiple entries for the same ID, so when I extract values from the JSON into a table I get duplicate rows. How do I handle them so that I get only the distinct rows? My select query looks something like this:
select
json_data:data[0].attributes."Additional Invoice?":: string as "Additional Invoice?",
json_data:data[0].attributes."Additional PO?":: string as "Additional PO?",
json_data:data[0].attributes."Aggregate Contract Value":: number as "Aggreagate Contract Value" ,
json_data:data[0].attributes."Annualized Baseline Spend" :: number as "Annualized Baseline Spend",
json_data:data[0].id ::number as ID,
json_data:data[0].type::string as TYPE
from scout_projects order by ID
A screenshot of the scout project file is attached.
The attached screenshot is the output from the given query. As you can see, the ID column is the same in every row, but there are only 2 unique rows. I want my query to return only those 2 unique rows.
select distinct json_data:data[0].id :: number as ID from scout_projects
What approach should I take?
I tried using a subquery, but it gave me the error "single-row subquery returns more than one row", which makes sense. So I need another way to do this.
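Two possible approaches, sketched under assumptions rather than tested against the real data. The first assumes the duplicate rows are identical in every selected column, so DISTINCT over the whole select list is enough. The second uses Snowflake's QUALIFY clause to keep one row per ID even when other columns differ. The column list is shortened here for illustration.

```sql
-- Option 1: rows identical in every selected column collapse into one.
select distinct
    json_data:data[0].attributes."Additional Invoice?"::string as "Additional Invoice?",
    json_data:data[0].id::number as ID,
    json_data:data[0].type::string as TYPE
from scout_projects
order by ID;

-- Option 2: keep exactly one row per ID, even if other columns differ.
select
    json_data:data[0].id::number as ID,
    json_data:data[0].type::string as TYPE
from scout_projects
qualify row_number() over (
    partition by json_data:data[0].id::number
    -- Arbitrary ordering: which duplicate survives is not defined here.
    -- Ordering by a load timestamp (if the table has one) would keep the newest.
    order by json_data:data[0].id::number
) = 1
order by ID;
```

Option 1 is the simpler choice when the weekly loads re-insert the same content. Option 2 is the safer one when only the ID is guaranteed to repeat.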

Extract key value pair from json column in redshift

I have a table mytable that stores columns in the form of JSON strings, which contain multiple key-value pairs. Now, I want to extract only a particular value corresponding to one key.
The column that stores these strings is of varchar datatype, and is created as:
insert into mytable(empid, json_column) values (1,'{"FIRST_NAME":"TOM","LAST_NAME" :"JENKINS", "DATE_OF_JOINING" :"2021-06-10", "SALARY" :"1000" }').
As you can see, json_column is created by inserting only a string. Now, I want to do something like:
select json_column.FIRST_NAME from mytable
I just want to extract the value corresponding to key FIRST_NAME.
My actual table is far more complex than this example, and I cannot convert these JSON keys into separate columns, but the example illustrates the issue.
This needs to be done in Redshift. Any suggestions would be appreciated.
Redshift's json_extract_path_text function solves this problem easily, as follows:
select json_extract_path_text(json_column, 'FIRST_NAME') from mytable;
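As a small extension of the query above, here is a sketch that also guards against rows whose string is not valid JSON. It relies on the optional null_if_invalid argument of json_extract_path_text. The first_name alias is only illustrative.

```sql
-- The trailing true is the optional null_if_invalid argument:
-- rows whose json_column is not valid JSON yield NULL instead of raising an error.
select
    empid,
    json_extract_path_text(json_column, 'FIRST_NAME', true) as first_name
from mytable;
```

This can be useful on a varchar column like this one, where nothing prevents a malformed string from being inserted.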

Pivoting previously xml data via SQL query throws error

I have data in table as below where value column data is quite big, like unstructured text:
http://s3.pdfconvertonline.com/convert/p3r68-cdx67/78gbs-hvj2r.html
Character entities such as &amp and &nbsp are present in the data. This is just a sample of 2 small records. The actual data is much bigger, with about 300 IDs in the real data set, which is why I use pivot xml.
The Heading and Value columns were originally HTML data for each ID. That HTML has now been split into a heading and its corresponding value using xmltype parsing.
So the data is now split across the 2 columns.
I need to pivot this, i.e. the Heading column values, which are the same for every ID, should become column headers, and the respective values should appear below them as rows.
When I run the pivot query it throws error:
select *
from data
pivot xml (max(id) for heading in (select heading from data));
An error occurs in XML parsing.
Entity reference is not well formed.
XML Parser returned an error while trying to parse the document.
Check if document to be parsed is valid.
Could the error be because of these special characters?

To fetch the datatype of a column, present in the custom type

I have a table which has two varchar columns where the raw data is stored. This raw data contains value and the field name. Field name is like my_cursor.attribute1. This cursor is of the data type (my_custompackage.custom_data_type).
Now, I need to get the data type of the column present in the custom_data_type.
This is wrong, but to give an idea it's something like
my_custompackage.custom_data_type.attribute13
But so far, I couldn't achieve anything. I have tried taking the value into a separate variable, like:
select field_name into temp_variable from dual;
Then
select dump(temp_variable) into my_data_type
but it didn't work and I was getting the string value. So, could you please tell me how to proceed with this?

Store SQL query result (1 column) as Array

After running my query I get 1 column result as
5
6
98
101
Is there a way to store this result as an array so that I can use it later in queries like
WHERE NOT IN ('5','6','98','101')
I am aware of storing single variable results but is this possible?
I cannot use a @Table variable, as I will be rerunning the query again in the future and the variable goes out of scope.
There are multiple ways of storing that column data, such as temporary tables, a view, or a table-valued function, but in my opinion there is no need to store it anywhere. You can use that column directly in any query, as shown below, or perform a JOIN, which would be a much better option than NOT IN:
select * from
table2
where some_column not in (select column1 from this_table);
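The JOIN alternative mentioned above could look like the sketch below. It reuses the placeholder names from the query (table2, some_column, this_table, column1). One reason it is often preferred: NOT IN returns no rows at all if the subquery produces even a single NULL, while these two forms are not affected by that.

```sql
-- Anti-join with NOT EXISTS: not affected by NULLs in this_table.column1.
select t2.*
from table2 as t2
where not exists (
    select 1
    from this_table as t
    where t.column1 = t2.some_column
);

-- The same result written as a LEFT JOIN that keeps only unmatched rows.
select t2.*
from table2 as t2
left join this_table as t
    on t.column1 = t2.some_column
where t.column1 is null;
```

Both forms return the rows of table2 that have no match in this_table, so the list of values never has to be stored separately.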
While this method is not recommended, an array can be stored in a single column as CSV (comma-separated values). Simply create a VARCHAR column and store a string containing the values in a specific order, with each value separated by a comma. You can later fetch the string and parse it with a string parser, e.g. the .split() function in Python. Again, I do not recommend doing this. I would instead use multiple columns, one for each value, and access them that way.
Using separate columns would also make the values easy to use in a stored procedure.