How to automate a field mapping using a table in Snowflake

I have a one-column table in my Snowflake database that contains a JSON mapping structure like the following:
ColumnMappings : {"Field Mapping": "blank=Blank,E=East,N=North,"}
How can I write a query so that feeding the Field Mapping a value of E returns East, a value of N returns North, and so on, without hard-coding the values in the query the way a CASE statement would?

You really want your mapping in this JSON form:
{
"blank" : "Blank",
"E" : "East",
"N" : "North"
}
You can achieve that in Snowflake e.g. with a simple JS UDF:
create or replace table x(cm variant) as
select parse_json(*) from values('{"fm": "blank=Blank,E=East,N=North,"}');
create or replace function mysplit(s string)
returns variant
language javascript
as $$
res = S
.split(",")
.reduce(
(acc,val) => {
var vals = val.split("=");
acc[vals[0]] = vals[1];
return acc;
},
{});
return res;
$$;
select cm:fm, mysplit(cm:fm) from x;
-------------------------------+--------------------+
CM:FM | MYSPLIT(CM:FM) |
-------------------------------+--------------------+
"blank=Blank,E=East,N=North," | { |
| "E": "East", |
| "N": "North", |
| "blank": "Blank" |
| } |
-------------------------------+--------------------+
And then you can simply extract values by key with GET, e.g.
select cm:fm, get(mysplit(cm:fm), 'E') from x;
-------------------------------+--------------------------+
CM:FM | GET(MYSPLIT(CM:FM), 'E') |
-------------------------------+--------------------------+
"blank=Blank,E=East,N=North," | "East" |
-------------------------------+--------------------------+
For performance, you might want to make sure you call mysplit only once per value in your mapping table, or even pre-materialize it.
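To make the parsing step easier to follow outside Snowflake, here is a minimal Python sketch of the same idea as the mysplit UDF above. The function name parse_mapping is made up for illustration, and the sketch assumes that keys and values never contain "," or "=".

```python
def parse_mapping(raw: str) -> dict:
    """Turn 'k1=v1,k2=v2,' into {'k1': 'v1', 'k2': 'v2'}."""
    mapping = {}
    for pair in raw.split(","):
        if not pair:
            continue  # skip the empty piece left by the trailing comma
        key, _, value = pair.partition("=")
        mapping[key] = value
    return mapping


# Hypothetical usage: look a code up by key instead of hard-coding a CASE.
field_mapping = parse_mapping("blank=Blank,E=East,N=North,")
print(field_mapping.get("E"))
```

Building the dictionary once and reusing it corresponds to the advice above about calling mysplit only once per mapping value.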

Update column with the same value apart from an object removed in column in Sqlite

I want to remove an object from a json column in Sqlite and I can't make it work. The json column contains a nested object, has the following type:
{
a: number;
pair: {
field1: string;
field2: string;
}[]
}
I want to update the column "ArrayColumn" with the same values but remove the object that has field1 equal to "0" and field2 equal to "1" . Every row contains the "pair" array, but not all the "pair" arrays in ArrayColumn contain this value ({"field1":"0", "field2":"1"})
I have the following structure:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"0", "field2":"1"},{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"0", "field2":"1"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"},{"field1":"0", "field2":"1"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
After updating the rows, the values would be:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
I tried with JSON_TREE but can't make it work.
I was thinking that the first step would be to select all the rows that contain that value. I retrieved them in these two ways:
With LIKE operator searching for the stringified form:
select Id, json_extract(json(ArrayColumn), '$.pair') as pair from Table where pair like '%{"field1":"0","field2":"1"}%'
Using json_tree
select Id, value from Table, json_tree(Table.ArrayColumn, '$.pair' ) where json_extract(value, '$.field1' ) = '0' AND json_extract(value, '$.field2' ) = '1'
I tried using json_remove with this small example, but had no luck:
SELECT json_remove('[{"field1":"1","field2":"0"},{"field1":"A","field2":"B"}]', '${"field1":"1","field2":"0"}' )
Thank you
For this sample data the simplest way to do this is to treat the json column as a string and use string functions to remove the value that you want:
UPDATE tablename
SET ArrayColumn = REPLACE(REPLACE(REPLACE(ArrayColumn, ']', ',]'), '{"field1":"0", "field2":"1"},', ''), ',]', ']')
WHERE ArrayColumn LIKE '%{"field1":"0", "field2":"1"}%';
See the demo.
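To see why the three nested REPLACE calls work whether the object is first, in the middle, or last in the array, here is a small Python sketch that mirrors them with plain string operations. The helper name remove_pair is hypothetical. Like the SQL, it matches the exact text, so it assumes the stored JSON uses the same spacing as the sample data.

```python
import json

TARGET = '{"field1":"0", "field2":"1"}'


def remove_pair(array_column: str) -> str:
    # Step 1: put a comma before every "]" so each element ends with a comma.
    padded = array_column.replace("]", ",]")
    # Step 2: drop the target object together with its trailing comma.
    stripped = padded.replace(TARGET + ",", "")
    # Step 3: undo the padding from step 1.
    return stripped.replace(",]", "]")


# Row 4 of the sample data, where the target object is the last element.
row4 = ('{ "a":1, "pair":[{"field1":"F", "field2":"T"},'
        '{"field1":"C", "field2":"D"},{"field1":"0", "field2":"1"}] }')
cleaned = remove_pair(row4)
print(json.loads(cleaned)["pair"])
```

Rows that do not contain the object come back unchanged, because step 3 simply reverses step 1.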

Scala MatchError while joining a dataframe and a dataset

I have one dataframe and one dataset :
Dataframe 1 :
+---------------+-----+
|City_Name      |Level|
+---------------+-----+
|{City -> Paris}|86   |
+---------------+-----+
Dataset 2 :
+----------------------------------+-----------+
|Country_Details                   |Temperature|
+----------------------------------+-----------+
|{City -> Paris, Country -> France}|31         |
+----------------------------------+-----------+
I am trying to make a join of them by checking if the map in the column "City_Name" is included in the map of the Column "Country_Details".
I am using the following UDF to check the condition :
val mapEqual = udf((col1: Map[String, String], col2: Map[String, String]) => {
if (col2.nonEmpty){
col2.toSet subsetOf col1.toSet
} else {
true
}
})
And I am making the join this way :
dataset2.join(dataframe1 , mapEqual(dataset2("Country_Details"), dataframe1("City_Name"), "leftanti")
However, I get such error :
terminated with error scala.MatchError: UDF(Country_Details#528) AS City_Name#552 (of class org.apache.spark.sql.catalyst.expressions.Alias)
Has anyone previously got the same error ?
I am using Spark version 3.0.2 and SQLContext, with scala language.
There are two issues here. The first is that when you call your function, you pass one extra parameter, "leftanti": you meant to pass it to the join function, but the closing parenthesis is misplaced, so it goes to the udf instead.
The second is that the udf logic won't work as expected. I suggest you use this:
val mapContains = udf { (col1: Map[String, String], col2: Map[String, String]) =>
col2.keys.forall { key =>
col1.get(key).exists(_ eq col2(key))
}
}
Result:
scala> ds.join(df1 , mapContains(ds("Country_Details"), df1("City_Name")), "leftanti").show(false)
+----------------------------------+-----------+
|Country_Details |Temperature|
+----------------------------------+-----------+
|{City -> Paris, Country -> France}|31 |
+----------------------------------+-----------+
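For reference, the containment check that the join condition is meant to express can be sketched in plain Python as below. The function name map_contains is only illustrative. The sketch compares values, which corresponds to == in Scala; eq there tests reference identity rather than value equality.

```python
def map_contains(col1: dict, col2: dict) -> bool:
    """Return True when every key/value pair of col2 also appears in col1."""
    return all(key in col1 and col1[key] == value for key, value in col2.items())


country_details = {"City": "Paris", "Country": "France"}
city_name = {"City": "Paris"}

print(map_contains(country_details, city_name))
print(map_contains(city_name, country_details))
```

An empty col2 is treated as contained in any map, which matches the else branch of the original mapEqual udf.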

how to extract a value from a json format

I need to extract the email from an intricate 'dict' (I am new to sql)
I have seen several previous posts on the same topic (e.g. this one) however, none seem to work on my data
select au.details
from table_au au
result:
{
"id":3526,
"contacts":[
{
"contactType":"EMAIL",
"value":"name@email.be",
"private":false
},
{
"contactType":"PHONE",
"phoneType":"PHONE",
"value":"025/6251111",
"private":false
}
]
}
I need:
name@email.be
select d.value -> 0 -> 'value' as Email
from json_each('{"id":3526,"contacts":[{"contactType":"EMAIL","value":"name@email.be","private":false},{"contactType":"PHONE","phoneType":"PHONE","value":"025/6251111","private":false}]}') d
where d.key::text = 'contacts'
Output:
| | email |
-------------------
|1 |"name@email.be"|
You can run it here: https://rextester.com/VHWRQ89385
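The query above picks the first element of the contacts array by position. If the e-mail entry is not guaranteed to come first, selecting by contactType may be safer. Here is a minimal Python sketch of that lookup on the same document; the helper name first_contact_value is made up for illustration.

```python
import json

details = json.loads("""
{"id": 3526,
 "contacts": [
   {"contactType": "EMAIL", "value": "name@email.be", "private": false},
   {"contactType": "PHONE", "phoneType": "PHONE", "value": "025/6251111", "private": false}
 ]}
""")


def first_contact_value(doc: dict, contact_type: str):
    """Return the value of the first contact with the given type, or None."""
    for contact in doc.get("contacts", []):
        if contact.get("contactType") == contact_type:
            return contact.get("value")
    return None


print(first_contact_value(details, "EMAIL"))
```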

Substring search a numeric field with JPA/Hibernate

I have a JPA entity that has a numeric field. Something like:
@Basic(optional = false)
@Column(name = "FISCAL_YEAR", nullable = false)
private int fiscalYear;
I have a requirement to sub-string search this field. For example, I want a search for 17 to give me 2017 and 1917 and 1789. Forget for a minute what a crazy request this is and assume I have a real use case that makes sense. Changing the column to a varchar in the database is not an option.
In PL/SQL, I'd convert the field to a varchar and do a like '%17%'. How would I accomplish this with Hibernate/JPA without using a native query? I need to be able to use HQL or Criteria to do the same thing.
Achieving like on numeric values using criteria builders
Table
Employee | CREATE TABLE `Employee` (
`id` int(11) NOT NULL,
`first` varchar(255) DEFAULT NULL,
`last` varchar(255) DEFAULT NULL,
`occupation` varchar(255) DEFAULT NULL,
`year` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Entity
private Integer year;
public Integer getYear() {
return year;
}
public void setYear(Integer year) {
this.year = year;
}
Data in the table
+----+-------+------+------------+------+
| id | first | last | occupation | year |
+----+-------+------+------------+------+
| 2 | Ravi | Raj | Textile | 1718 |
| 3 | Ravi | Raj | Textile | 1818 |
| 4 | Ravi | Raj | Textile | 1917 |
| 5 | Ravi | Raj | Textile | NULL |
| 6 | Ravi | Raj | Textile | NULL |
| 7 | Ravi | Raj | Textile | NULL |
+----+-------+------+------------+------+
constructing query using criteria builder
public List<Employee> getEmployees() {
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Employee> q = cb.createQuery(Employee.class);
Root<Employee> emp = q.from(Employee.class);
Predicate year_like = cb.like(emp.<Integer>get("year").as(String.class), "%17%");
CriteriaQuery<Employee> fq = q.where(year_like);
List<Employee> resultList = (List<Employee>) entityManager.createQuery(fq).getResultList();
return resultList;
}
query generated(using show_sql: true)
Hibernate: select employee0_.id as id1_0_, employee0_.first as first2_0_, employee0_.last as last3_0_, employee0_.occupation as occupati4_0_, employee0_.year as year5_0_ from Employee employee0_ where cast(employee0_.year as char) like ?
Query Output
// i have printed only id and year in the console
id, year
2, 1718
4, 1917
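The predicate Hibernate generates here is just a cast followed by LIKE. As a quick way to check that behaviour in isolation, the sketch below runs the same kind of predicate against an in-memory SQLite table from Python. SQLite stands in for MySQL, and plain SQL stands in for the Criteria API, so treat it as an illustration of the SQL only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, year INTEGER)")
conn.executemany(
    "INSERT INTO Employee (id, year) VALUES (?, ?)",
    [(2, 1718), (3, 1818), (4, 1917), (5, None)],
)

# Cast the numeric column to text, then apply a substring pattern.
rows = conn.execute(
    "SELECT id, year FROM Employee WHERE CAST(year AS TEXT) LIKE ? ORDER BY id",
    ("%17%",),
).fetchall()
print(rows)
```

Rows whose year is NULL never match, because casting NULL yields NULL and the LIKE comparison is then not true.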
------------------------------------------------------------
Alternate way
LIKE worked directly on a numeric field in JPQL when tested with JPA, Hibernate and MySQL.
Note: this may not work with other JPA providers.
Query r = entityManager.createQuery("select c from Employee c where c.year like '%17%'");
query fired(using show_sql=true)
Hibernate: select employee0_.id as id1_0_, employee0_.first as first2_0_, employee0_.last as last3_0_, employee0_.occupation as occupati4_0_, employee0_.year as year5_0_ from Employee employee0_ where employee0_.year like '%17%'
Query Result
// i have printed only id and year in the console
id, year
2, 1718
4, 1917
You can declare your own Criterion type
public class CrazyLike implements Criterion {
private final String propertyName;
private final int intValue;
public CrazyLike(String propertyName, int intValue) {
this.propertyName = propertyName;
this.intValue = intValue;
}
@Override
public String toSqlString(Criteria criteria, CriteriaQuery criteriaQuery)
throws HibernateException {
final String[] columns = criteriaQuery.findColumns( propertyName, criteria );
if ( columns.length != 1 ) {
throw new HibernateException( "Crazy Like may only be used with single-column properties" );
}
final String column = columns[0];
return "cast(" + column + " as text) like '%" + intValue + "%'";
}
@Override
public TypedValue[] getTypedValues(Criteria criteria,
CriteriaQuery criteriaQuery) throws HibernateException {
return new TypedValue[] { };
}
}
And then use it like this:
Criteria criteria = session.createCriteria(Person.class);
List<Person> persons = criteria.add(new CrazyLike("year", 17)).list();
assuming that Person has an int property called year. This should produce SQL like this:
select
this_.id as id1_2_0_,
this_.birthdate as birthdat2_2_0_,
this_.firstname as firstnam3_2_0_,
this_.lastname as lastname4_2_0_,
this_.ssn as ssn5_2_0_,
this_.version as version6_2_0_,
this_.year as year7_2_0_
from
Person this_
where
cast(this_.year as text) like '%17%'
This was tested with Postgres. The cast() syntax may vary for your database engine. If it does, just use that syntax in the Criterion class that you implement.

How to flatten bigquery record with multiple repeated fields?

I'm trying to query against app-engine datastore backup data. In python, the entities are described as something like this:
class Bar(ndb.Model):
property1 = ndb.StringProperty()
property2 = ndb.StringProperty()
class Foo(ndb.Model):
bar = ndb.StructuredProperty(Bar, repeated=True)
baz = ndb.StringProperty()
Unfortunately when Foo gets backed up and loaded into bigquery, the table schema gets loaded as:
bar | RECORD | NULLABLE
bar.property1 | STRING | REPEATED
bar.property2 | STRING | REPEATED
baz | STRING | NULLABLE
What I would like to do is to get a table of all bar.property1 and associated bar.property2 where baz = 'baz'.
Is there a simple way to flatten Foo so that the bar records are "zipped" together? If that's not possible, is there another solution?
As hinted in a comment by @Mosha, it seems that BigQuery supports User Defined Functions (UDF). You can input it in the UDF Editor tab on the web UI. In this case, I used something like:
function flattenTogether(row, emit) {
if (row.bar && row.bar.property1) {
for (var i=0; i < row.bar.property1.length; i++) {
emit({property1: row.bar.property1[i],
property2: row.bar.property2[i]});
}
}
};
bigquery.defineFunction(
'flattenBar',
['bar.property1', 'bar.property2'],
[{'name': 'property1', 'type': 'string'},
{'name': 'property2', 'type': 'string'}],
flattenTogether);
And then the query looked like:
SELECT
property1,
property2,
FROM
flattenBar(
SELECT
bar.property1,
bar.property2,
FROM
[dataset.foo]
WHERE
baz = 'baz')
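The "zipping" that the UDF performs can be sketched in a few lines of Python, which may help when checking the logic locally. The function name flatten_together mirrors the JavaScript above. The row layout (a dict with a nested bar dict) is an assumption made for illustration, and the two repeated fields are assumed to have the same length.

```python
def flatten_together(row: dict) -> list:
    """Pair up bar.property1[i] with bar.property2[i] into one record each."""
    bar = row.get("bar") or {}
    property1 = bar.get("property1") or []
    property2 = bar.get("property2") or []
    # zip stops at the shorter list, so mismatched lengths are truncated.
    return [{"property1": p1, "property2": p2} for p1, p2 in zip(property1, property2)]


# Hypothetical row shaped like the loaded table schema.
row = {"bar": {"property1": ["a", "b"], "property2": ["x", "y"]}, "baz": "baz"}
print(flatten_together(row))
```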
Since baz is not repeated, you can simply filter on it in WHERE clause without any flattening:
SELECT bar.property1, bar.property2 FROM t WHERE baz = 'baz'