KQL: mv-expand OR bag_unpack equivalent command to convert a list to multiple columns - azure-log-analytics

According to mv-expand documentation:
Expands multi-value array or property bag.
mv-expand is applied on a dynamic-typed column so that each value in the collection gets a separate row. All the other columns in an expanded row are duplicated.
Just as the mv-expand operator creates one row for each element in a list, is there an equivalent operator (or some other way) to turn each element in a list into an additional column?
I checked the documentation and found bag_unpack:
The bag_unpack plugin unpacks a single column of type dynamic by treating each property bag top-level slot as a column.
However, it doesn't seem to work on a list; it only works on the top-level properties of a JSON property bag.
For example, this bag_unpack query:
datatable(d:dynamic)
[
dynamic({"Name": "John", "Age":20}),
dynamic({"Name": "Dave", "Age":40}),
dynamic({"Name": "Smitha", "Age":30}),
]
| evaluate bag_unpack(d)
It produces the following:
Name Age
John 20
Dave 40
Smitha 30
Is there a command/way (see some_command_which_helps) I can achieve the following (convert a list to columns):
datatable(d:dynamic)
[
dynamic(["John", "Dave"])
]
| evaluate some_command_which_helps(d)
That translates to something like:
Col1 Col2
John Dave
Is there an equivalent where I can convert a list/array to multiple columns?
For reference: We can run the above queries online on Log Analytics in the demo section if needed (however, it may require login).

You could try something along the following lines.
(That said, from an efficiency standpoint, you may want to consider restructuring the data set to begin with, using a schema that matches how you actually plan to consume and query it.)
datatable(d:dynamic)
[
dynamic(["John", "Dave"]),
dynamic(["Janice", "Helen", "Amber"]),
dynamic(["Jane"]),
dynamic(["Jake", "Abraham", "Gunther", "Gabriel"]),
]
| extend r = rand()
| mv-expand with_itemindex = i d
| summarize b = make_bag(pack(strcat("Col", i + 1), d)) by r
| project-away r
| evaluate bag_unpack(b)
which will output:
|Col1 |Col2 |Col3 |Col4 |
|------|-------|-------|-------|
|John |Dave | | |
|Janice|Helen |Amber | |
|Jane | | | |
|Jake |Abraham|Gunther|Gabriel|
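
The expand-then-regroup idea behind the query above is not specific to KQL. Purely as an illustration, here is a minimal Python sketch of the same reshaping. The function name `lists_to_columns` is made up for this example, and filling missing cells with empty strings is an assumption chosen to mirror the output table.

```python
def lists_to_columns(rows):
    """Turn variable-length lists into dicts keyed Col1..ColN."""
    # The widest list decides how many columns the result has.
    width = max((len(row) for row in rows), default=0)
    columns = [f"Col{i + 1}" for i in range(width)]
    table = []
    for row in rows:
        # Like mv-expand with_itemindex plus make_bag(pack(...)):
        # pair each item with its index and collect one bag per source row.
        bag = {f"Col{i + 1}": value for i, value in enumerate(row)}
        # Like bag_unpack: emit every column, leaving gaps empty.
        table.append({name: bag.get(name, "") for name in columns})
    return table


rows = [
    ["John", "Dave"],
    ["Janice", "Helen", "Amber"],
    ["Jane"],
    ["Jake", "Abraham", "Gunther", "Gabriel"],
]
for record in lists_to_columns(rows):
    print(record)
```

As in the KQL version, the number of columns is only known after looking at all rows, which is one reason a fixed schema is usually cheaper to query.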

To extract key-value pairs from text and convert them to columns without hardcoding the key names in the query:
print message="2020-10-15T15:47:09 Metrics: duration=2280, function=WorkerFunction, count=0, operation=copy_into, invocationId=e562f012-a994-4fc9-b585-436f5b2489de, tid=lct_b62e6k59_prd_02, table=SALES_ORDER_SCHEDULE, status=success"
| extend Properties = extract_all(@"(?P<key>\w+)=(?P<value>[^, ]*),?", dynamic(["key","value"]), message)
| mv-apply Properties on (summarize make_bag(pack(tostring(Properties[0]), Properties[1])))
| evaluate bag_unpack(bag_)
| project-away message
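
To check what the regular expression captures before running it against real logs, the same pattern can be tried in Python, whose `re` module accepts the `(?P<name>...)` group syntax. This is only a sketch: `extract_pairs` is a hypothetical helper, and the sample message is a shortened copy of the one in the query.

```python
import re

# Same pattern as in the query: a word, "=", then everything up to a comma or space.
PAIR = re.compile(r"(?P<key>\w+)=(?P<value>[^, ]*),?")


def extract_pairs(message):
    """Return every key=value pair found in the message as a dict of strings."""
    return {m.group("key"): m.group("value") for m in PAIR.finditer(message)}


message = (
    "2020-10-15T15:47:09 Metrics: duration=2280, function=WorkerFunction, "
    "count=0, operation=copy_into, status=success"
)
for key, value in extract_pairs(message).items():
    print(key, "=", value)
```

Note that all values come back as strings and that a value containing a space or comma would be cut short, which applies to the KQL pattern as well.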

Related

Combine query to get all the matching search text in right order

I have the following table:
postgres=# \d so_rum;
Table "public.so_rum"
Column | Type | Collation | Nullable | Default
-----------+-------------------------+-----------+----------+---------
id | integer | | |
title | character varying(1000) | | |
posts | text | | |
body | tsvector | | |
parent_id | integer | | |
Indexes:
"so_rum_body_idx" rum (body)
I wanted to do phrase search query, so I came up with the below query, for example:
select id from so_rum
where body @@ phraseto_tsquery('english','Is it possible to toggle the visibility');
This returns only the documents that match the entire phrase. However, there are documents in which the distance between the lexemes is larger, and the above query doesn't return them. For example, 'it is something possible to do toggle between the. . . visibility' is not returned. I know I can get it returned by manually putting a distance operator such as <2> into to_tsquery.
But I want to understand how to do this in the SQL statement itself, so that I get the results with a distance of 1 first, then 2, and so on (maybe up to 6 or 7). Finally, I want to append the results with the actual count of the search words, as in the following query:
select count(id) from so_rum
where body @@ to_tsquery('english','string & string . . . ')
Is it possible to do in a single query with good performance?
I don't see a canned solution to this. It sounds like you need to use plainto_tsquery to get all the results with all the lexemes, and then implement your own custom ranking function to rank them by distance between the lexemes, and maybe filter out ones with the wrong order.
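
One possible shape for such a custom ranking step, sketched in Python on plain word lists rather than on tsvector lexemes (an assumption made to keep the example self-contained). `ordered_span` is a hypothetical helper: it returns the smallest distance between the first and last query word when all query words occur in order, and None when they do not, so the None results can be filtered out.

```python
def ordered_span(doc_tokens, query_tokens):
    """Smallest first-to-last distance of an in-order match, or None."""
    if not query_tokens:
        return None
    best = None
    for start, token in enumerate(doc_tokens):
        if token != query_tokens[0]:
            continue
        pos, matched = start, 1
        while matched < len(query_tokens):
            try:
                # Earliest later occurrence of the next query word.
                pos = doc_tokens.index(query_tokens[matched], pos + 1)
            except ValueError:
                break  # remaining words do not appear in order
            matched += 1
        else:
            span = pos - start
            if best is None or span < best:
                best = span
    return best


docs = {
    "exact": "possible to toggle the visibility".split(),
    "loose": "it is something possible to do toggle between the visibility".split(),
    "wrong_order": "visibility toggle possible".split(),
}
query = ["possible", "toggle", "visibility"]
spans = {name: ordered_span(tokens, query) for name, tokens in docs.items()}
ranked = sorted((n for n in spans if spans[n] is not None), key=spans.get)
print(ranked)
```

In practice the candidate rows would first be narrowed down with plainto_tsquery as described above, and only those rows would be passed to a ranking step like this one.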

How psql's \dt PATTERN works?

I am trying to list a trigger function. I know that its name contains the word new.
I tried to find the trigger by that word, but nothing is found:
tucha=> \dft new
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+------+------------------+---------------------+------
(0 rows)
tucha=> \dft xxx_child_fk_check_new
List of functions
Schema | Name | Result data type | Argument data types | Type
--------+------------------------+------------------+---------------------+------
public | xxx_child_fk_check_new | trigger | | func
(1 row)
I assumed that PATTERN is a regex, but that does not work.
What is PATTERN, and how can I find my trigger by new?
I think that you want:
\dft *new*
Or if you want to search in all schemas:
\dft *.*new*
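
The reason the bare word fails is that a psql pattern without wildcards has to match the whole name, much like a shell glob. As a rough analogy only, Python's `fnmatch` module behaves the same way for `*` and `?`. The helper `psql_like` and the second function name in the list are made up for this sketch, and schema qualification and quoting rules are not modelled.

```python
from fnmatch import fnmatchcase

# The first name comes from the question; the second is a made-up extra.
names = ["xxx_child_fk_check_new", "audit_old"]


def psql_like(pattern, candidates):
    """Approximate psql matching for patterns that use only * and ?."""
    # psql folds unquoted patterns to lowercase before matching.
    return [name for name in candidates if fnmatchcase(name, pattern.lower())]


print(psql_like("new", names))    # no wildcard: the whole name must be "new"
print(psql_like("*new*", names))  # wildcards on both sides: substring match
```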

Transpose variable number of rows into columns in OpenRefine

I have an xml file containing records from a library catalogue. I have imported it into OpenRefine but all the values are in one column. I want to transpose it so each field in the record has its own column. However, this is complicated by the fact that a) each field is optional so does not exist in all records and b) many fields are repeatable so can appear multiple times in each record. Here's a simplified example of what the data looks like:
| RecordID | Tag | Data |
| 1 | 040a | CaABCD |
| 1 | 245a | Go fish |
| 1 | 245a | A guide to fish |
| 1 | 246i | Fish series |
| 1 | 260a | Fishing friends |
| 2 | 040a | CaABDC |
| 2 | 245a | Happy trails |
| 2 | 246i | Hiking series |
| 2 | 260i | The happy hiker |
| 2 | 500a | Notes |
I have read the Q&A here Openrefine - Transpose rows into columns based on text but the problem with this solution is that if I concatenate all the values together I have no way to be sure what field they belong in anymore, as my data is much more complicated than the data in that question (my actual data has 25+ fields and many thousands of records).
I was able to get closer using Google Sheets and making a pivot table with a calculated field (as in PivotTable to show values, not sum of values - see the answer at the very bottom). However, I still don't know how to handle the repeating fields. In the pivot table the multiple values are there but only the first displays (double-clicking on an individual cell brings up a details table which lists all the values), so when I copy-paste the table I lose the additional values. I would like to concatenate them but I cannot see a way to do so within the pivot table.
Can you think of any other way I could do this, in OpenRefine or another tool? Thanks!
The classic way to fix this in OpenRefine is to use "Transpose -> Columnize by key value". But this feature is poorly documented and can cause headaches even for OpenRefine developers. In your case, repeated fields will be problematic, so here is a possible solution.
1° Go to the "tag" column, click on "Transpose -> Columnize by key value" and use the following configuration (don't forget the "Note column (optional)")
The result will look like this (my dataset is not exactly the same as yours, I modified a value to do some test)
2° In the new column "Record ID: 040 a", click on "edit column -> Move Column To Beginning".
3° If you want to merge the repeated fields, go to each column that contains them and click on "Edit Cells -> Join Multi Value cells" by choosing a separator, for example "|".
The end result will look like this.
To get rid of unnecessary columns: Click on Export -> Custom tabular export and deselect the columns whose name starts with RecordId.
OpenRefine also has a native MARC importer which might be something worth trying if you need to work with MARC data in the future. MARCEdit also has some specific OpenRefine support built in.
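
Since the question also allows for another tool, the same reshaping can be sketched in a few lines of Python. This is an illustration under assumptions, not part of the OpenRefine workflow: `columnize` is a hypothetical helper, the rows are taken from the simplified example, and repeated fields are joined with "|" as in step 3.

```python
from collections import defaultdict


def columnize(rows, sep="|"):
    """Pivot (record_id, tag, data) triples into one dict per record."""
    records = defaultdict(lambda: defaultdict(list))
    for record_id, tag, data in rows:
        # Repeated tags accumulate instead of overwriting each other.
        records[record_id][tag].append(data)
    tags = sorted({tag for _, tag, _ in rows})
    return [
        # Optional fields that a record lacks become empty strings.
        {"RecordID": rid, **{t: sep.join(fields.get(t, [])) for t in tags}}
        for rid, fields in records.items()
    ]


rows = [
    (1, "040a", "CaABCD"),
    (1, "245a", "Go fish"),
    (1, "245a", "A guide to fish"),
    (1, "246i", "Fish series"),
    (2, "040a", "CaABDC"),
    (2, "245a", "Happy trails"),
    (2, "500a", "Notes"),
]
for record in columnize(rows):
    print(record)
```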

Split string and Pivot Result - SQL Server 2012

I am using SQL Server 2012 and I have a table called XMLData that looks like this:
| Tag | Attribute |
|--------------|-----------------------------|
| tag1 | Cantidad=222¬ClaveProdServ=1|
| tag1 | Cantidad=333¬ClaveProdServ=2|
The column Tag has many repeated values, what is different is the column Attribute that has a string of attributes separated by "¬". I want to separate the list of attributes and then pivot the table so the tags are the column names.
The result I want is like this:
| tag1 | tag1 |
|-----------------|----------------|
| Cantidad=222 | Cantidad=333 |
| ClaveProdServ=1 | ClaveProdServ=2|
I have a custom-made function that splits the string, since SQL Server 2012 doesn't have a built-in function that does this. The function receives the string and the delimiter as parameters, like so:
select *
from [dbo].[Split]('lol1,lol2,lol3,lol4',',')
This function returns:
| item |
|--------|
| lol1 |
| lol2 |
| lol3 |
| lol4 |
I can't find a way to pass the values of the column Attribute as parameter of this function, something like this:
SELECT *
FROM Split(A.Attribute,'¬'),XMLData A
And then use the values of the column Tag as the column names for each set of attributes.
My magic crystal ball tells me that you have, for whatever reason, decided to do it this way, and that any comments about not storing CSV data would just annoy you.
However...
If this is just a syntax issue, try it like this:
SELECT t.Tag
,t.Attribute
,splitted.item
FROM YourTable AS t
CROSS APPLY dbo.Split(t.Attribute,'¬') AS splitted
Otherwise show some more relevant details. Please read How to ask a good SQL question and How to create a MCVE
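
For readers unfamiliar with CROSS APPLY, its effect here is to run the split once per row and pair every resulting item with the row it came from. A minimal Python sketch of that behaviour follows; `cross_apply_split` is a made-up name and the rows are the two sample rows from the question.

```python
def cross_apply_split(rows, delimiter="¬"):
    """Pair each (tag, attribute) row with every item of its split attribute."""
    return [
        (tag, attribute, item)
        for tag, attribute in rows
        for item in attribute.split(delimiter)
    ]


rows = [
    ("tag1", "Cantidad=222¬ClaveProdServ=1"),
    ("tag1", "Cantidad=333¬ClaveProdServ=2"),
]
for row in cross_apply_split(rows):
    print(row)
```

Each input row yields one output row per attribute, which is the long form that a later pivot step would start from.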

Postgres: how to View an hstore Column as a Table

Simple problem but my search fu is failing somehow so I don't even know if this is possible or not.
I have a table in Postgres, call it key_value_store, with an hstore field, call it document. Also I defined a type, call it key_value_type, which has as properties the fields that I want to extract from the hstore into a full blown table row:
CREATE TYPE key_value_type AS (property1 text, property2 text, property3 text)
So I'd like to output (eventually using a VIEW) a table with as many columns as the properties of key_value_type and I need to do this for many combinations of properties, therefore I don't want to create a table for each combination.
I tried with:
SELECT populate_record(null::key_value_type, document) FROM key_value_store
but instead of:
| PROPERTY 1 | PROPERTY 2 | PROPERTY 3 |
........................................
| value 1.1 | value 1.2 | value 1.3 |
........................................
| value 2.1 | value 2.2 | value 2.3 |
........................................
| and so on | and so on | and so on |
........................................
what I get is:
| populate_record |
........................................
| (value 1.1, value 1.2, value 1.3) |
........................................
| (value 2.1, value 2.2, value 2.3) |
........................................
| (and so on, and so on, and so on) |
........................................
How do I get to the desired result from here (or from anywhere else for that matter)?
I noticed that using a table name instead of key_value_type I actually get what I want, but as I said I'd avoid creating a table for each combination of properties I need.
populate_record() does exactly that: it populates a record.
Therefore, you get only a single column, not three columns.
If you want all columns from the record, you need to explicitly say so:
SELECT (populate_record(null::key_value_type, document)).*
FROM key_value_store
SQLFiddle example: http://sqlfiddle.com/#!15/b6794/1