I'm receiving data involving the status of various prefixed servers, something like the following:
twserv1: UP
twserv2: UP
poserv1: DOWN
poserv2: UNKNOWN
I want to display this data in a table separated by the prefix and server name like the following:
TW PO
serv1 UP DOWN
serv2 UP UNKNOWN
I've got this mostly working with the following command courtesy of this Q&A:
| inputlookup serverStatus
| table twserv* poserv*
| eval temp=1
| untable temp key value
| rex field=key "(?<x>tw|po)(?<y>\w+\d+)"
| chart values(value) over y by x
| rename tw as TW po as PO
This yields something like the following:
y TW PO
serv1 UP DOWN
serv2 UP UNKNOWN
Is there any way to remove just that first column title? I've tried setting it to whitespace, but to no avail.
That column "title" is really a field name (in this case, "y").
Field names cannot consist only of whitespace, because then they "don't exist".
Currently I have several files and I want to upload them to a DB, creating new columns with some metadata on them. An example of the files I have is the following:
MYBRAND-GOOD_20210202.tab
MYBRAND-BAD_20210202.tab
MYBRAND_20210202.tab
Each file has x, y, z columns, and I also want to create 3 new columns holding metadata derived from properties of the file. The result I would like is the following:
Table MYBRAND-GOOD
x | y | z | brand | FILE_DATE | SOURCE_DETAILS | Name
a. | b | c | GOOD | 20210202 | tab | MYBRAND-GOOD_20210202
Table MYBRAND-BAD
x | y | z | brand | FILE_DATE | SOURCE_DETAILS | Name
a. | b | c | BAD | 20210202 | tab | MYBRAND-BAD_20210202
Table MYBRAND
x | y | z | brand | FILE_DATE | SOURCE_DETAILS | Name
a. | b | c | MYBRAND | 20210202 | tab | MYBRAND_20210202
What I'm currently doing is the following :
SELECT x,y,z,
split(INPUT_FILE_NAME(),'- | _')[1] AS brand,
regexp_extract(INPUT_FILE_NAME(), '.*/modified_dttm=(.*)/.+', 1) AS FILE_DATE,
regexp_extract(regexp_replace(INPUT_FILE_NAME()\\,'%20'\\,'')\\, '.*/.*-([0-9]{4}-[0-9]{2}-[0-9]{2}).tab'\\, 1)) AS SOURCE_DETAILS
regexp_extract(INPUT_FILE_NAME(), '^([^\.]+)\.?', 0) AS NAME
However I'm facing several problems (since I'm not very proficient with regex):
brand fails if the file name doesn't have a '-' separator (as in 'MYBRAND')
I'm not sure whether FILE_DATE is doing what it's supposed to do
SOURCE_DETAILS is giving me empty results
NAME is OK, but I would like to exclude the '.'
I don't completely follow these regex rules, so I would appreciate any guidance or correction.
We can write one pattern for the whole string and vary the index argument of regexp_extract() for each desired element.
(MYBRAND(-([A-Za-z0-9]*))?_(\d{8,8}))\.(\w+)
Using that pattern each time, you can select which capture group to display:
Select x,y,z,
Regexp_extract(INPUT_FILE_NAME(),'(MYBRAND(-([A-Za-z0-9]*))?_(\d{8,8}))\.(\w+)', 3) AS Brand,
Regexp_extract(INPUT_FILE_NAME(),'(MYBRAND(-([A-Za-z0-9]*))?_(\d{8,8}))\.(\w+)', 4) AS FileDate,
Regexp_extract(INPUT_FILE_NAME(),'(MYBRAND(-([A-Za-z0-9]*))?_(\d{8,8}))\.(\w+)', 5) AS SourceDetails,
Regexp_extract(INPUT_FILE_NAME(),'(MYBRAND(-([A-Za-z0-9]*))?_(\d{8,8}))\.(\w+)', 1) AS Name
Each subpattern you want to capture goes in parentheses, so the pattern opens with a parenthesis to capture the name. It then matches the literal MYBRAND and opens a second group, followed by ?, because the hyphen and the brand after it are optional. A third group inside it captures the alphanumerics [A-Za-z0-9]* that make up the brand. The star lets that group be empty, so a file name with no brand part gives an empty or null result for group 3, depending on the engine. Next comes an underscore, followed by a fourth group that captures the eight digits of the date, \d{8,8}. The first parenthesis closes here to end the file name capture, then comes an escaped dot and a final group that captures the file type, (\w+).
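To check the group numbering outside the SQL engine, here is a minimal sketch in Python. It is only an illustration: Python's `re` module is not the engine behind `regexp_extract()`, the helper name `parse_file_name` is hypothetical, and the prefix is written as MYBRAND to match the sample file names.

```python
import re

# Same group layout as the pattern in the answer:
# 1 = name, 2 = optional "-BRAND" part, 3 = brand, 4 = date, 5 = file type.
PATTERN = re.compile(r"(MYBRAND(-([A-Za-z0-9]*))?_(\d{8}))\.(\w+)")


def parse_file_name(file_name):
    """Return (name, brand, file_date, source_details), or None if nothing matches."""
    match = PATTERN.search(file_name)
    if match is None:
        return None
    # In Python, group 3 is None when the optional "-BRAND" part is absent.
    return match.group(1), match.group(3), match.group(4), match.group(5)


for sample in (
    "MYBRAND-GOOD_20210202.tab",
    "MYBRAND-BAD_20210202.tab",
    "MYBRAND_20210202.tab",
):
    print(sample, "->", parse_file_name(sample))
```

For MYBRAND_20210202.tab the brand comes back empty, so if the brand column should fall back to MYBRAND in that case, that substitution has to be added separately in the query.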
My splunk result looks like this:
9/1/20
5:00:14.487 PM
2020-09-01 16:00:14.487, 'TOTALITEMS'="Number of items registered in the last 2 hours ", COUNT(*)="1339"
I am trying to table the number that appears at the end in quotes.
index=my_db sourcetype=no_of_items_registered source=P_No_of_items_registered_2hours | rex field=_raw "\"Number of items registered in the last 2 hours \", COUNT(\*)=\"(?P<itm_ct>\d+)\"$" | table itm_ct
This displays a blank table without any numbers. The number of rows in the table, however, matches the number of events.
Any help is much appreciated.
The regular expression doesn't match the sample data. Literal parentheses must be escaped in the regex. Try this:
index=my_db sourcetype=no_of_items_registered source=P_No_of_items_registered_2hours
| rex "COUNT\(\*\)=\"(?<itm_ct>\d+)\"" | table itm_ct
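The effect of escaping the parentheses can be reproduced outside Splunk. The sketch below uses Python's `re` module as a stand-in (an assumption: Splunk's rex uses its own PCRE-based engine, and the helper name `extract_count` is hypothetical). It applies both forms of the pattern to the sample event.

```python
import re

# Sample event text copied from the question.
EVENT = '''2020-09-01 16:00:14.487, 'TOTALITEMS'="Number of items registered in the last 2 hours ", COUNT(*)="1339"'''

# Unescaped: "(\*)" is a capture group around a literal "*", so this looks for "COUNT*=".
UNESCAPED = re.compile(r'COUNT(\*)="(?P<itm_ct>\d+)"')

# Escaped: "\(" and "\)" match the literal parentheses in "COUNT(*)=".
ESCAPED = re.compile(r'COUNT\(\*\)="(?P<itm_ct>\d+)"')


def extract_count(pattern, raw):
    """Return the captured count as an int, or None when the pattern does not match."""
    match = pattern.search(raw)
    return int(match.group("itm_ct")) if match else None


print("unescaped:", extract_count(UNESCAPED, EVENT))
print("escaped:", extract_count(ESCAPED, EVENT))
```

The unescaped form finds nothing, which is consistent with the blank table in the question: every event still produces a row, but the field is never extracted.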
I have the following table:
postgres=# \d so_rum;
Table "public.so_rum"
Column | Type | Collation | Nullable | Default
-----------+-------------------------+-----------+----------+---------
id | integer | | |
title | character varying(1000) | | |
posts | text | | |
body | tsvector | | |
parent_id | integer | | |
Indexes:
"so_rum_body_idx" rum (body)
I wanted to run a phrase search query, so I came up with the query below, for example:
select id from so_rum
where body @@ phraseto_tsquery('english','Is it possible to toggle the visibility');
This gives me results, but only ones that match the entire text. However, there are documents where the distance between lexemes is larger, and the above query doesn't return them. For example, 'it is something possible to do toggle between the. . . visibility' doesn't get returned. I know I can get it returned by writing the <2> distance operator (for example) into a to_tsquery call manually.
But I want to understand how to do this in my SQL statement itself, so that I get the results with a distance of 1 first, then 2, and so on (maybe up to 6-7). Finally, I want to append the results with the actual count of the search words, as in the following query:
select count(id) from so_rum
where body @@ to_tsquery('english','string & string . . . ')
Is it possible to do in a single query with good performance?
I don't see a canned solution to this. It sounds like you need to use plainto_tsquery to get all the results with all the lexemes, and then implement your own custom ranking function to rank them by distance between the lexemes, and maybe filter out ones with the wrong order.
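As a rough illustration of what such a custom ranking could compute, here is a sketch in Python. It is not a PostgreSQL or RUM feature: it assumes the candidate rows have already been fetched (for example with plainto_tsquery) and tokenized into plain lowercase words rather than real lexemes, and the function name `min_ordered_span` is hypothetical.

```python
def min_ordered_span(tokens, terms):
    """Smallest distance between the first and last query term when all terms
    occur in the given order; None if they never occur in that order."""
    if not terms:
        return None
    best = None
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos = start
        matched = True
        # Greedily take the earliest later occurrence of each following term.
        for term in terms[1:]:
            try:
                pos = tokens.index(term, pos + 1)
            except ValueError:
                matched = False
                break
        if matched and (best is None or pos - start < best):
            best = pos - start
    return best


terms = ["possible", "toggle", "visibility"]
docs = {
    "close": "is it possible to toggle the visibility".split(),
    "far": "it is something possible to do toggle between the visibility".split(),
    "wrong order": "visibility toggle is possible".split(),
}

# Rank documents by span; documents with the terms out of order are filtered out.
spans = {name: min_ordered_span(tokens, terms) for name, tokens in docs.items()}
ranked = sorted((name for name, span in spans.items() if span is not None), key=spans.get)
print(spans)
print(ranked)
```

Sorting by this span lists the tightest matches first, which approximates "distance 1, then 2, and so on" without issuing one query per distance.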
I have an xml file containing records from a library catalogue. I have imported it into OpenRefine but all the values are in one column. I want to transpose it so each field in the record has its own column. However, this is complicated by the fact that a) each field is optional so does not exist in all records and b) many fields are repeatable so can appear multiple times in each record. Here's a simplified example of what the data looks like:
| RecordID | Tag | Data |
| 1 | 040a | CaABCD |
| 1 | 245a | Go fish |
| 1 | 245a | A guide to fish |
| 1 | 246i | Fish series |
| 1 | 260a | Fishing friends |
| 2 | 040a | CaABDC |
| 2 | 245a | Happy trails |
| 2 | 246i | Hiking series |
| 2 | 260i | The happy hiker |
| 2 | 500a | Notes |
I have read the Q&A here Openrefine - Transpose rows into columns based on text but the problem with this solution is that if I concatenate all the values together I have no way to be sure what field they belong in anymore, as my data is much more complicated than the data in that question (my actual data has 25+ fields and many thousands of records).
I was able to get closer using Google Sheets and making a pivot table with a calculated field (as in PivotTable to show values, not sum of values - see the answer at the very bottom). However, I still don't know how to handle the repeating fields. In the pivot table the multiple values are there but only the first displays (double-clicking on an individual cell brings up a details table which lists all the values), so when I copy-paste the table I lose the additional values. I would like to concatenate them but I cannot see a way to do so within the pivot table.
Can you think of any other way I could do this, in OpenRefine or another tool? Thanks!
The classic way to fix this in OpenRefine is to use "Transpose -> Columnize by key value". But this feature is poorly documented and can cause headaches even for OpenRefine developers. In your case, repeated fields will be problematic, so here is a possible solution.
1° Go to the "Tag" column, click on "Transpose -> Columnize by key value" and use the following configuration (don't forget the "Note column (optional)")
The result will look like this (my dataset is not exactly the same as yours; I modified a value for testing)
2° In the new column "Record ID: 040 a", click on "edit column -> Move Column To Beginning".
3° If you want to merge the repeated fields, go to each column that contains them and click on "Edit Cells -> Join Multi Value cells" by choosing a separator, for example "|".
The end result will look like this.
To get rid of unnecessary columns: Click on Export -> Custom tabular export and deselect the columns whose name starts with RecordId.
OpenRefine also has a native MARC importer which might be something worth trying if you need to work with MARC data in the future. MARCEdit also has some specific OpenRefine support built in.
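For comparison, the same reshaping can be sketched outside OpenRefine in a few lines of Python. This is only an illustration of the target shape (one row per record, one column per tag, repeated fields joined with "|"), not what OpenRefine does internally; the function name `columnize` and the in-memory row list are assumptions.

```python
from collections import defaultdict

# Sample rows from the question: (RecordID, Tag, Data).
ROWS = [
    ("1", "040a", "CaABCD"),
    ("1", "245a", "Go fish"),
    ("1", "245a", "A guide to fish"),
    ("1", "246i", "Fish series"),
    ("1", "260a", "Fishing friends"),
    ("2", "040a", "CaABDC"),
    ("2", "245a", "Happy trails"),
    ("2", "246i", "Hiking series"),
    ("2", "260i", "The happy hiker"),
    ("2", "500a", "Notes"),
]


def columnize(rows, separator="|"):
    """Return (sorted tags, {record_id: {tag: joined values}})."""
    records = defaultdict(lambda: defaultdict(list))
    for record_id, tag, data in rows:
        records[record_id][tag].append(data)
    tags = sorted({tag for _, tag, _ in rows})
    table = {
        record_id: {tag: separator.join(fields.get(tag, [])) for tag in tags}
        for record_id, fields in records.items()
    }
    return tags, table


tags, table = columnize(ROWS)
print(["RecordID"] + tags)
for record_id, fields in table.items():
    print([record_id] + [fields[tag] for tag in tags])
```

Optional fields that a record lacks come out as empty strings, and repeated fields keep every value, so each value stays under the tag it came from.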
I am using SQL Server 2012 and I have a table called XMLData that looks like this:
| Tag | Attribute |
|--------------|-----------------------------|
| tag1 | Cantidad=222¬ClaveProdServ=1|
| tag1 | Cantidad=333¬ClaveProdServ=2|
The column Tag has many repeated values; what differs is the column Attribute, which holds a string of attributes separated by "¬". I want to separate the list of attributes and then pivot the table so the tags are the column names.
The result I want is like this:
| tag1 | tag1 |
|-----------------|----------------|
| Cantidad=222 | Cantidad=333 |
| ClaveProdServ=1 | ClaveProdServ=2|
I have a custom-made function that splits the string, since SQL Server 2012 doesn't have a built-in function that does this. The function receives a string and the delimiter as parameters, like so:
select *
from [dbo].[Split]('lol1,lol2,lol3,lol4',',')
this function will return this:
| item |
|--------|
| lol1 |
| lol2 |
| lol3 |
I can't find a way to pass the values of the column Attribute as parameter of this function, something like this:
SELECT *
FROM Split(A.Attribute,'¬'),XMLData A
And then put the values of the column Tag as the column names for each set of attributes.
My magic crystal ball tells me that you have, for whatever reason, decided to do it this way, and that any comments about not storing CSV data would just annoy you.
However...
If this is just a syntax issue, try it like this:
SELECT t.Tag
,t.Attribute
,splitted.item
FROM YourTable AS t
CROSS APPLY dbo.Split(t.Attribute,'¬') AS splitted
Otherwise show some more relevant details. Please read How to ask a good SQL question and How to create a MCVE
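To make the row-by-row behaviour of the CROSS APPLY query above concrete, here is a small Python sketch of the same idea: the split function is called once per input row, and each returned item becomes its own output row next to the original columns. The function name `cross_apply_split` is hypothetical, and the sketch assumes dbo.Split behaves like a plain delimiter split.

```python
# Sample rows from the question: (Tag, Attribute).
ROWS = [
    ("tag1", "Cantidad=222¬ClaveProdServ=1"),
    ("tag1", "Cantidad=333¬ClaveProdServ=2"),
]


def cross_apply_split(rows, delimiter="¬"):
    """Emit one (tag, attribute, item) row per split item of each input row."""
    return [
        (tag, attribute, item)
        for tag, attribute in rows
        for item in attribute.split(delimiter)
    ]


for row in cross_apply_split(ROWS):
    print(row)
```

Pivoting the tags into column names is a separate step that this sketch does not cover.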