Is there a way to transpose json elements into different rows? - sql

I don't know quite how to phrase this, so I'll use an example:
Suppose I have a table called "market" that consists of just two columns and three rows, as follows:
So, what I want to know is whether there is a way to take all the purchased products and put them into different rows, for example:

You need to unnest the array into multiple rows, then you can extract the product name using the ->> operator:
select t.user_id, x.purchase ->> 'product' as product
from the_table t
cross join jsonb_array_elements(t.purchases) as x(purchase);
If your column is json rather than jsonb, you need to use json_array_elements() instead.
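As a minimal sketch of how the unnesting behaves in Postgres (the sample rows here are invented, since the original example table wasn't shown):

```sql
-- hypothetical setup: one row per user, purchases stored as a jsonb array
create table the_table (
    user_id   int,
    purchases jsonb
);

insert into the_table values
    (1, '[{"product": "Coffee"}, {"product": "Mug"}]'),
    (2, '[{"product": "Chocolate"}]');

-- one output row per array element
select t.user_id,
       x.purchase ->> 'product' as product
from the_table t
cross join jsonb_array_elements(t.purchases) as x(purchase);
```

Each element of the array becomes its own row, so user 1 comes back twice (Coffee, Mug) and user 2 once (Chocolate).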


Create an output column

I'm working at the customer level, and I have a column that contains the 3 types of products I'm selling. What I want to know is, for every customer, which products they have already purchased.
But what if that customer has purchased more than one item?
I want to create a column that tells me exactly what they bought: 'Coffee', 'mug' or 'chocolate'.
How can I represent that in the output? Again, all this info is stored in one column called 'product'.
Thank you
For Snowflake, you can use LISTAGG or ARRAY_AGG depending on whether you want to store the data as a string or as an array. In Snowflake, I would recommend that you store this data as an array, as it is easier to deal with when querying later.
https://docs.snowflake.com/en/sql-reference/functions/array_agg.html
https://docs.snowflake.com/en/sql-reference/functions/listagg.html
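A minimal sketch of both approaches in Snowflake (the table and column names are invented for illustration):

```sql
-- one array of products per customer
select customer_id,
       array_agg(product) as products
from tbl_sells
group by customer_id;

-- one comma-separated string per customer
select customer_id,
       listagg(product, ',') as products
from tbl_sells
group by customer_id;
```

The array form keeps the individual values addressable later (e.g. with FLATTEN), whereas the string form is ready for display.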
You want to use GROUP_CONCAT (MySQL) or STRING_AGG (T-SQL).
So you want to do something like this for MySQL (note that MySQL does not use square brackets to quote identifiers):
SELECT a.CustomerID, GROUP_CONCAT(a.Product) as Products
FROM tblSells a
GROUP BY a.CustomerID;
Or something like this for T-SQL:
SELECT a.[CustomerID], STRING_AGG(a.[Product],'.') as Products
FROM [tblSells] a
GROUP BY a.[CustomerID];

DAX - Calculate a many-to-many mapping?

I'm trying to build a calculated table containing the mapping between different datasets. The keys I'm using to do the lookup can be repeated, and I would like to generate the list of all possible combinations. In SQL, this would be a join, which would generate additional rows. I'm looking to do the same in DAX with a calculated table; however, LOOKUPVALUE can only return one row and will error if it finds more than one match:
A table of multiple values was supplied where a single value was expected
I feel like it could be possible with SUMMARIZECOLUMNS and a virtual relationship; however, when trying this, I also get an error:
=SUMMARIZECOLUMNS (
    Label[LabelText],
    User[Dim_CustomerUser_Skey],
    Computer[Dim_Computer_Skey],
    FILTER ( Computer, Label[Device] = Computer[Device name] ),
    FILTER ( User, Label[UserName] = User[UserName] )
)
but this also gives:
Calculated table 'CalculatedTable 1': A single value for column 'Device' in table 'Label' cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result
How do I produce a calculated table for a many-to-many mapping?
In SQL, there are joins. Luckily for us, DAX provides joins between tables as well.
But first of all, which function to use for what? Here it is:
Left Outer: GENERATEALL, NATURALLEFTOUTERJOIN
Right Outer: GENERATEALL, NATURALLEFTOUTERJOIN
Full Outer: CROSSJOIN, GENERATE, GENERATEALL
Inner: GENERATE, NATURALINNERJOIN
Left Anti: EXCEPT
Right Anti: EXCEPT
Visit: https://www.sqlbi.com/articles/from-sql-to-dax-joining-tables/
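For the many-to-many case specifically, one common pattern (a sketch, reusing the table and column names from the question) is GENERATE with a FILTER: unlike LOOKUPVALUE, it returns one row per matching combination instead of erroring on multiple matches:

```dax
Mapping =
GENERATE (
    Label,
    FILTER (
        User,
        User[UserName] = Label[UserName]
    )
)
```

GENERATE evaluates the FILTER once per row of Label, so each Label row is paired with every matching User row, which is exactly the row-multiplying behavior of a SQL join.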

SQL Server match partial text in the comma separated varchar column

I have a VARCHAR column category_text in a table that contains tags for stored notifications. There are three tags, Query, Complaint and Suggestion, and the column can hold one or more values separated by commas. I am applying a filter, and the filter can also have one or more comma-separated values.
Now what I want is to retrieve all the rows that contain at least one of the tags in the filter the user is applying. For instance, the user can select 'query,suggestion' as a filter, and the result would be all the rows that contain one of those tags, i.e. query or suggestion.
select
t.category_text
from
real_time_notifications t
where
charindex('query, suggestion, complaints', t.category_text) > 0
order by
t.id desc
Create a new table, like user_category (user_id linking to the user table, plus category), and create an index on both columns. It will speed up searching a lot and ease your future maintenance.
If you still want to keep the comma-separated column, create an inline table-valued function to split the string into records and then join against it to test.
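As a sketch of the split-and-join approach, assuming SQL Server 2016+ (which has the built-in STRING_SPLIT function) and the table and column names from the question:

```sql
-- hypothetical filter value supplied by the user
declare @filter varchar(100) = 'query,suggestion';

-- split both the stored tags and the filter, and keep rows
-- where at least one trimmed tag matches
select distinct t.id, t.category_text
from real_time_notifications t
cross apply string_split(t.category_text, ',') c
join string_split(@filter, ',') f
  on ltrim(rtrim(c.value)) = ltrim(rtrim(f.value))
order by t.id desc;
```

Trimming matters because the stored values may contain spaces after the commas; DISTINCT collapses rows that match on more than one tag.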

SQL query to achieve those rows which matches for two(or more) column value pair list

Suppose there is a table users_customers which has three columns: user_id, customer_id and id. This table records which user is assigned to which customer and vice versa.
Now, I have a list of pairs of user_id and customer_id. I know the SQL query to get the rows for a single pair of user_id and customer_id.
That is,
select * from users_customers where user_id in (uId) and customer_id in (cId);
But how do I get all pairs in one go, without executing the query again and again for each pair? I am using PostgreSQL 9.6, and I will use the equivalent of this query in Spring Data JPA.
I would appreciate any help.
I'm not sure what your actual problem is, but you can use in with tuples:
where (user_id, customer_id) in ( (u1, c1), (u2, c2), . . . )
You can also pass in multiple values as an array.
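A sketch of both variants in PostgreSQL (the id values here are invented):

```sql
-- row-value IN: one predicate matches the whole (user_id, customer_id) pair
select *
from users_customers
where (user_id, customer_id) in ((1, 10), (2, 20), (3, 30));

-- array variant: unnest two parallel arrays into pairs and join,
-- convenient when binding the pairs as query parameters
select uc.*
from users_customers uc
join unnest(array[1, 2, 3], array[10, 20, 30]) as p(user_id, customer_id)
  on uc.user_id = p.user_id
 and uc.customer_id = p.customer_id;
```

The array form is often easier to use from application code (e.g. Spring Data JPA) because the two arrays can be passed as two bind parameters regardless of how many pairs there are.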

Hive to Hive ETL

I have two large Hive tables, say TableA and TableB (which get loaded from different sources).
These two tables have almost identical table structure / columns with same partition column, a date stored as string.
I need to filter records from each table based on certain (identical) filter criteria.
These tables have some columns containing "codes", which need to be looked up to get its corresponding "values".
There are eight to ten such lookup tables, say, LookupA, LookupB, LookupC, etc.
Now, I need to:
do a union of those filtered records from TableA and TableB.
do a lookup into the lookup tables and replace those "codes" from the filtered records with their respective "values". If a "code" or "value" is unavailable in the filtered records or lookup table respectively, I need to substitute it with zero or an empty string
transform the dates in the filtered records from one format to another
I am a beginner in Hive. Please let me know how I can do it. Thanks.
Note: I can manage till union of the tables. Need some guidance on lookup and transformation.
To do a lookup, follow the steps below.
Create a custom user-defined function (UDF) that does the lookup work. That means writing a Java program that performs the lookup internally, packaging it as a jar, and adding it to Hive, something like:
ADD JAR /home/ubuntu/lookup.jar;
Then add the lookup file containing the key-value pairs:
ADD FILE /home/ubuntu/lookupA;
Then create a temporary lookup function:
CREATE TEMPORARY FUNCTION getLookupValueA AS 'com.LookupA';
Finally, call this lookup function in the SELECT query, which will populate the lookup value for the given lookup key.
The same thing can be achieved using a JOIN, but that will take a hit on performance. Taking the join approach, you can join the source and lookup tables by the lookup code, something like:
select a.key, b.lookupvalue
from source_table a
join lookuptable b
  on a.key = b.lookupKey;
Now for Date Transformation, you can use Date functions in Hive.
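Putting the pieces together, a join-based sketch might look like this. All table and column names below are invented for illustration, and I'm assuming the source dates are strings like '20200115' being converted to 'yyyy-MM-dd':

```sql
select
    -- substitute '0' / '' when the code has no match in the lookup table
    coalesce(la.valueA, '0')  as valueA,
    coalesce(lb.valueB, '')   as valueB,
    -- reformat the string date from yyyyMMdd to yyyy-MM-dd
    from_unixtime(unix_timestamp(u.dt, 'yyyyMMdd'), 'yyyy-MM-dd') as dt
from (
    select codeA, codeB, dt from TableA where dt >= '20200101'
    union all
    select codeA, codeB, dt from TableB where dt >= '20200101'
) u
left join LookupA la on u.codeA = la.codeA
left join LookupB lb on u.codeB = lb.codeB;
```

The LEFT JOINs keep filtered records whose codes are missing from a lookup table, and COALESCE handles the required zero/empty-string substitution.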
For the above problem, follow these steps:
Use a union to combine the two tables (the schemas must be the same).
For this scenario you can also try a Pig script.
The script would look like this (join TableA and TableB with the lookup tables and generate the appropriate columns):
a = JOIN TableA BY codesA LEFT OUTER, LookupA BY codesA;
b = JOIN a BY codesB LEFT OUTER, LookupB BY codesB;
Similarly for TableB.
Suppose some value of codesA does not have a value in the lookup table; then:
z = FOREACH b GENERATE codesA AS codesA, (valuesA is null ? '0' : valuesA) AS valuesA;
(this will replace all null values in valuesA with '0').
If you are using Pig 0.12 or later, you can use ToString(CurrentTime(), 'yyyy-MM-dd') for date formatting.
I hope it will solve your problem. Let me know in case of any concern.