I'm having an issue parsing out syslog data coming into Sentinel. I think it's a misunderstanding of the data types and what my options are when working with them.
I have some raw syslog coming into Sentinel. This data is being ingested with four columns: TimeStamp, SyslogMessage, Computer, and Facility. The 'SyslogMessage' column holds by far the most data, but I'm having trouble parsing it into something useful. I'd like to be able to take pieces out of the 'SyslogMessage' column and extend them as new columns, which would make the data far easier to manipulate than string operators like 'contains' allow.
For instance, in a separate situation I had some raw event data coming through as what I think is JSON. With that dataset, I was able to do something like extend c = RawEventData.AccountMoniker, which would give me a column 'c' containing only the AccountMoniker data.
The dataset I am currently working with looks similar in format to JSON, but it seems to have had a string prefixed to the beginning, which I think turned the rest of the field into a string as well.
I've been able to work in some regex and get 'SyslogMessage' down to just the bracketed material, but I'm still having issues when trying something like 'parse_json'. Right now, the only way I can search through this data is with 'has' or 'contains'. What are my options for getting the 'SyslogMessage' data into a type that I can more easily search through and project as columns?
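In case it's useful as a starting point, here is a minimal KQL sketch of one common approach (the table name, regex, and field names are illustrative assumptions, not taken from the actual data): extract the bracketed material with extract(), convert it with todynamic() (parse_json() behaves the same way), and then project fields using dot notation.

Syslog
| extend RawJson = extract(@"(\{.*\})", 1, SyslogMessage)  // keep only the bracketed material
| extend Parsed = todynamic(RawJson)                       // convert the string to a dynamic value
| extend AccountId = tostring(Parsed.AccountId)            // project a field as its own column
| where AccountId == "12345"

If todynamic() quietly returns nulls, the extracted text usually isn't valid JSON yet (stray prefixes, single quotes, key=value pairs), and further cleanup with the parse operator or regex is needed first.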
I've got a bigquery import from a firestore database where I want to query on a particular field from a document. This was populated via the firestore-bigquery extension and the document data is stored as a JSON string.
I'm trying to use a WHERE clause in my query that uses one of the fields from the JSON data. However this doesn't seem to work.
My query is as follows:
SELECT json_extract(data, '$.title') AS title, p
FROM `table`
LEFT JOIN UNNEST(json_extract_array(data, '$.tags')) AS p
WHERE json_extract(data, '$.title') = 'technology'
data is the JSON object and title is an attribute of all of the items. The above query runs but yields no results (there are definitely rows for the title in question, as they appear in the table preview).
I've tried using WHERE title = 'technology' as well, but this returns an error that title is an unrecognized field (hence the json_extract).
From my research this should work as a standard SQL JSON query, but it doesn't seem to work on BigQuery. Does anyone know of a way around this?
All I can think of is putting the results in another table, but I don't know if that's a workable solution: the data is updated via the extension on every update, so I would need to constantly refresh the second table as well.
Edit
I'm wondering if configuring a view would help with this? Ultimately, though, I would like to query this based on different parameters, and the docs here https://cloud.google.com/bigquery/docs/views suggest you can't reference query parameters in a view.
I've since managed to work this out, and will share the solution for anyone else with the same problem.
The solution was to use JSON_VALUE in the WHERE clause instead, e.g.:
WHERE JSON_VALUE(data, '$.title') = 'technology';
I'm still not sure if this is the best way to do this in terms of performance and cost so I will wait to see if anyone else leaves a better answer.
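For what it's worth, the likely reason the original query returned nothing: JSON_EXTRACT returns the matched value as a JSON-formatted string, quotes included, so it produces "technology" rather than technology, and the comparison with 'technology' never matches. JSON_VALUE (and the older JSON_EXTRACT_SCALAR) returns the unquoted scalar. A quick way to see the difference:

SELECT
  JSON_EXTRACT('{"title": "technology"}', '$.title') AS extracted,    -- "technology" (quotes kept)
  JSON_VALUE('{"title": "technology"}', '$.title')   AS scalar_value  -- technology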
I am trying to create a dashboard from data present in Hive. The catch is that the column I want to visualize is a nested JSON type. Will Tableau be able to parse and flatten the JSON column and list out all possible attributes? Thanks!
Unfortunately, Tableau will not automatically flatten the JSON structure of the field for you, but you can do so manually. You can use Regex within Tableau to extract the pertinent information from your JSON field.
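For example, a calculated field along these lines (the field name and JSON attribute are invented for illustration) pulls a single attribute out of a JSON string:

REGEXP_EXTRACT([JsonField], '"title"\s*:\s*"([^"]*)"')

REGEXP_EXTRACT returns the contents of the capture group, so this yields the value of the title attribute, or Null when it isn't present.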
I realize this may not be the answer you were looking for, but hopefully it gets you started down the right path.
(In case it helps, Tableau does have a JSON connector, in the event you are able to connect directly to your JSON as a data source instead of having it embedded in your Hive connection as a complex field type.)
I have some relational data in a SQL Server 2008 database split across 3 tables, which I would like to use to populate some classes that represent them.
The hierarchy is: Products -> Variants -> Options.
I have considered passing back three result sets and using LINQ to check for related/child records in the related tables. I've also considered passing back a single denormalised table containing all of the data from the three tables and reading through the rows, manually working out where each product/variant/option begins and ends. Having little to no prior experience with LINQ, I opted for the latter, which sort of worked but required many lines of code for something I had hoped would be pretty straightforward.
Is there an easier way of accomplishing this?
The end goal is to serialize the resulting classes to JSON, for use in a Web Service Application.
I've searched and searched on Google for an answer, but I guess I'm not searching for the right keywords.
After a bit of playing around, I've figured out a way of accomplishing this...
Firstly, create a stored procedure in SQL Server that will return the data as XML. It's relatively easy to generate an XML document containing hierarchical data.
CREATE PROCEDURE usp_test
AS
BEGIN
SELECT
1 AS ProductID
, 'Test' AS ProductDesc
, (
SELECT 1 AS VariantID
, 'Test' AS VariantDesc
FOR XML PATH('ProductVariant'), ROOT('ProductVariants'), TYPE
)
FOR XML PATH('Product'), ROOT('ArrayOfProduct'), TYPE
END
This gives you an XML document with a parent-child relationship:
<ArrayOfProduct>
<Product>
<ProductID>1</ProductID>
<ProductDesc>Test</ProductDesc>
<ProductVariants>
<ProductVariant>
<VariantID>1</VariantID>
<VariantDesc>Test</VariantDesc>
</ProductVariant>
</ProductVariants>
</Product>
</ArrayOfProduct>
Next, read the results into the VB.Net application using a SqlDataReader. Declare an empty object to hold the data and deserialize the XML into the object using an XmlSerializer.
At this point, the data that once was in SQL tables is now represented as classes in your VB.Net application.
From here, you can then serialize the object into JSON using JavaScriptSerializer.Serialize.
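A rough sketch of those steps in VB.Net (the connection string is a placeholder, and the Product/ProductVariant classes are simply assumed to mirror the XML above). ExecuteXmlReader is used here as a convenient way to consume a FOR XML result:

Imports System.Collections.Generic
Imports System.Data
Imports System.Data.SqlClient
Imports System.Web.Script.Serialization
Imports System.Xml.Serialization

Public Class ProductVariant
    Public Property VariantID As Integer
    Public Property VariantDesc As String
End Class

Public Class Product
    Public Property ProductID As Integer
    Public Property ProductDesc As String
    Public Property ProductVariants As List(Of ProductVariant)
End Class

Module Loader
    Sub Main()
        Using conn As New SqlConnection("<connection string>"),
              cmd As New SqlCommand("usp_test", conn) With {.CommandType = CommandType.StoredProcedure}
            conn.Open()
            Using reader = cmd.ExecuteXmlReader()
                ' the serializer expects the ArrayOfProduct root, which the ROOT() clause provides
                Dim serializer As New XmlSerializer(GetType(Product()))
                Dim products = CType(serializer.Deserialize(reader), Product())
                ' ...and out to JSON for the web service
                Console.WriteLine(New JavaScriptSerializer().Serialize(products))
            End Using
        End Using
    End Sub
End Module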
I am looking for a tool that can serialize and/or transform SQL Result Sets into XML. Getting dumbed down XML generation from SQL result sets is simple and trivial, but that's not what I need.
The solution has to be database neutral, and accepts only regular SQL query results (no db xml support used). A particular challenge of this tool is to provide nested XML matching any schema from row based results. Intermediate steps are too slow and wasteful - this needs to happen in one single step; no RS->object->XML, preferably no RS->XML->XSLT->XML. It must support streaming due to large result sets, big XML.
Anything out there for this?
With SQL Server you really should consider using the FOR XML construct in the query.
If you're using .Net, just use a DataAdapter to fill a DataSet. Once it's in a DataSet, just use its .WriteXml() method. That breaks your DB->object->XML rule, but it's really how things are done. You might be able to work something out with a DataReader, but I doubt it.
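A minimal sketch of that approach (VB.Net here, with a placeholder connection string and query):

Imports System.Data
Imports System.Data.SqlClient

Module DataSetToXml
    Sub Main()
        Dim adapter As New SqlDataAdapter("SELECT name, address FROM contacts", "<connection string>")
        Dim ds As New DataSet("result")   ' the DataSet name becomes the XML root element
        adapter.Fill(ds, "row")           ' the table name becomes the per-record element
        ds.WriteXml("result.xml", XmlWriteMode.IgnoreSchema)
    End Sub
End Module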
Not that I know of. I would just roll my own. It's not that hard to do, maybe something like this:
#!/usr/bin/env jruby
require 'java'
java_import java.sql.DriverManager
# TODO some magic to load the driver (the JDBC driver jar must be on the classpath)
conn = DriverManager.get_connection(ARGV[0], ARGV[1], ARGV[2])
res = conn.create_statement.execute_query(ARGV[3]) # queries run on a Statement, not the Connection
puts "<result>"
meta = res.meta_data
while res.next
  puts "<row>"
  for n in 1..meta.column_count
    column = meta.get_column_name(n)
    puts "<#{column}>#{res.get_string(n)}</#{column}>"
  end
  puts "</row>"
end
puts "</result>"
Disclaimer: I just made all of that up, I'm not even bothering to pretend that it works. :-)
In .NET you can fill a DataSet from any source, and it can then write that out to disk for you as XML, with or without the schema. I can't say what performance for large sets would be like. Simple :)
Another option, depending on how many schemas you need to output, and/or how dynamic this solution is supposed to be, would be to actually write the XML directly from the SQL statement, as in the following simple example...
SELECT
'<Record>' ||
'<name>' || name || '</name>' ||
'<address>' || address || '</address>' ||
'</Record>'
FROM
contacts
You would have to prepend and append the document element, but I think this example is easy enough to understand.
dbunit (www.dbunit.org) does go from SQL to XML and vice versa; you might be able to modify it to fit your needs.
Technically, converting a result set to an XML file is straightforward and doesn't need any tool, unless you have a requirement to convert the data structure to fit a specific export schema. In general, the result set becomes the top-level element of the XML file, and you then produce a number of record elements whose attributes are, effectively, the fields of a record.
When it comes to Java, for example, you just need an appropriate JDBC driver for interfacing with the DBMS of your choice, which addresses the database-independence requirement (drivers are usually provided by the DBMS vendor), plus a few lines of code to read a result set and print out an XML string per record, per field. Not a difficult task for an average Java developer, in my opinion.
Anyway, the more concrete purpose you state the more concrete answer you get.
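For what it's worth, a bare-bones Java version of that loop (essentially the JRuby script above in plain Java; the URL, credentials, and query come from the command line, and rows are written out as they are read):

import java.sql.*;

public class ResultSetToXml {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2]);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(args[3])) {
            ResultSetMetaData meta = rs.getMetaData();
            System.out.println("<result>");
            while (rs.next()) {
                System.out.println("  <row>");
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    String col = meta.getColumnName(i);
                    System.out.printf("    <%s>%s</%s>%n", col, escape(rs.getString(i)), col);
                }
                System.out.println("  </row>");
            }
            System.out.println("</result>");
        }
    }

    // minimal escaping so field values cannot break the XML
    private static String escape(String s) {
        return s == null ? "" : s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}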
In Java, you could just fill an object with the data (an entity bean, say) and then use XMLEncoder to get it to XML. From there you can use XSLT for further conversion, or XMLDecoder to bring it back to an object.
Greetz, GHad
PS: See http://ghads.wordpress.com/2008/09/16/java-to-xml-to-java/ for an example of the object-to-XML part... From DB to object, several more ways are possible: JDBC, Groovy DataSets, or GORM. Apache Commons BeanUtils may help to fill up JavaBeans via reflection-like methods.
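A self-contained sketch of that round trip (the Contact bean is invented for the example; XMLEncoder wants a public class with a public no-arg constructor and getters/setters):

import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.*;

public class Contact {
    private String name;
    private String address;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }

    public static void main(String[] args) throws IOException {
        Contact c = new Contact();
        c.setName("Alice");
        c.setAddress("1 Main St");

        // object -> XML
        try (XMLEncoder enc = new XMLEncoder(
                new BufferedOutputStream(new FileOutputStream("contact.xml")))) {
            enc.writeObject(c);
        }

        // XML -> object
        try (XMLDecoder dec = new XMLDecoder(
                new BufferedInputStream(new FileInputStream("contact.xml")))) {
            Contact back = (Contact) dec.readObject();
            System.out.println(back.getName());
        }
    }
}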
I created a solution to this problem by using the equivalent of a mail merge using the resultset as the source, and a template through which it was merged to produce the desired XML.
The template was standard XML, with a Header element, a Footer element, and a Body element. Using a CDATA block in the Body element allowed me to include a complete XML structure that acted as the template for each row. To include fields from the resultset in the template, I used markers that looked like this: <[FieldName]>. The template was then pre-parsed to isolate the markers, so that in operation the template requests each of the fields from the resultset as the Body is produced.
The Header and Footer elements are output only once at the beginning and end of the output set. The body could be any XML or text structure desired. In your case, it sounds like you might have several templates, one for each of your desired schemas.
All of the above was encapsulated in a Template class, such that after loading the Template, I merely called merge() on the template passing the resultset in as a parameter.
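A stripped-down sketch of that idea (all names invented; the real implementation pre-parsed the template rather than re-matching the markers on every row):

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Template {
    // markers in the Body look like <[FieldName]>
    private static final Pattern MARKER = Pattern.compile("<\\[(\\w+)\\]>");
    private final String header, body, footer;

    public Template(String header, String body, String footer) {
        this.header = header;
        this.body = body;
        this.footer = footer;
    }

    public String merge(ResultSet rs) throws SQLException {
        StringBuilder out = new StringBuilder(header);
        while (rs.next()) {
            Matcher m = MARKER.matcher(body);
            StringBuffer row = new StringBuffer();
            while (m.find()) {
                // swap each marker for the value of the resultset field it names
                m.appendReplacement(row, Matcher.quoteReplacement(rs.getString(m.group(1))));
            }
            m.appendTail(row);
            out.append(row);
        }
        return out.append(footer).toString();
    }
}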