Google BigQuery, WHERE clause based on JSON item - google-bigquery

I've got a BigQuery import from a Firestore database, and I want to query on a particular field from a document. The table was populated via the firestore-bigquery extension, and the document data is stored as a JSON string.
I'm trying to use a WHERE clause in my query that filters on one of the fields from the JSON data. However, this doesn't seem to work.
My query is as follows:
SELECT json_extract(data,'$.title') as title,p
FROM `table`
left join unnest(json_extract_array(data, '$.tags')) as p
where json_extract(data,'$.title') = 'technology'
data is the JSON object and title is an attribute of all of the items. The above query will run but yield 'no results' (There are definitely results there for the title in question as they appear in the table preview).
I've tried using WHERE title = 'technology' as well but this returns an error that title is an unrecognized field (hence the json_extract).
From my research this should work as a standard SQL JSON query, but it doesn't seem to work on BigQuery. Does anyone know of a way around this?
All I can think of is if I put the results in another table, but I don't know if that's a workable solution as the data is updated via the extension on an update, so I would need to constantly refresh my second table as well.
Edit
I'm wondering if configuring a view would help with this? Though ultimately I would like to query this based on different parameters and the docs here https://cloud.google.com/bigquery/docs/views suggest you can't reference query parameters in a view

I've since managed to work this out, and will share the solution for anyone else with the same problem.
The solution was to use JSON_VALUE in the WHERE clause instead, e.g.:
where JSON_VALUE(data,'$.title') = 'technology';
I'm still not sure if this is the best way to do this in terms of performance and cost so I will wait to see if anyone else leaves a better answer.
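A likely reason the JSON_VALUE version works: JSON_EXTRACT returns the value as JSON-formatted text, so a string comes back with its double quotes ("technology"), while JSON_VALUE returns the bare scalar. The Python sketch below only mimics that difference for top-level keys; json_extract and json_value here are hypothetical stand-ins, not the BigQuery functions themselves.

```python
import json
from typing import Optional


def json_extract(doc: str, key: str) -> Optional[str]:
    """Mimic JSON_EXTRACT for a top-level key: the value re-serialized as JSON text."""
    value = json.loads(doc).get(key)
    return None if value is None else json.dumps(value)


def json_value(doc: str, key: str) -> Optional[str]:
    """Mimic JSON_VALUE for a top-level key: scalars as plain strings, else None."""
    value = json.loads(doc).get(key)
    if value is None or isinstance(value, (dict, list)):
        return None
    return value if isinstance(value, str) else json.dumps(value)


data = '{"title": "technology", "tags": ["a", "b"]}'

print(json_extract(data, "title"))                  # "technology" (quotes included)
print(json_extract(data, "title") == "technology")  # False, so the WHERE matches nothing
print(json_value(data, "title") == "technology")    # True
```

Comparing against the quoted form ('"technology"') would also match with JSON_EXTRACT, but JSON_VALUE reads more naturally for scalar fields.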


Understanding GraphQL

I started experimenting with the GraphQL wp api.
I am querying the menus. According to the documentation, the query is very long.
I would expect that querying only
{
menus
}
would return all the data nested in menus, but it does not.
Why is this? How can I get all the nested data in an object, so I can see what's in there?
Thank you for your time
The rule is that every "leaf" field in a GraphQL query must be a scalar (or enum), such as Int, Boolean, or String. So if the menus field in the root Query type is a scalar, this is a valid query and will return something.
If not, you have to keep navigating into the Menu type and pick the fields that you want to include in the GraphQL query, such as:
{
menus {
id
createdDate
}
}
There is no wildcard that can represent all fields in the current GraphQL spec. You have to explicitly declare every field you want to select in the query. By looking at the GraphQL schema, you can find the available fields for each type. One tip is to rely on the GraphQL introspection system. In practice this means using a GraphQL client such as Altair, GraphiQL, or GraphQL Playground; most of them have an auto-suggest function that guides you in composing a query by suggesting which fields are available for a type.
P.S. The SQL analogy is that there is no select * from foo; you have to explicitly list the columns that you want in the select clause, such as select id,name,address from foo.
If you keep in mind that you're getting back a JSON object, you can think of your GraphQL query as defining the left-hand side of the response (this is intentional in how it was designed), i.e. just the keys. So unless there are null values, what you get back should exactly match the shape of the query.
If you want to see what can be queried, you need access to the schema itself. If it's a schema provided by someone else (looks like WordPress in this case), they should also have provided the means to explore and understand it.
That is the main feature of GraphQL: you specify exactly what data you need from a query. Because of that, you can't just query menus that way. You need to specify every nested field in menus that you need, and only then will it work :)
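To make the "no wildcard" point concrete, here is a minimal Python sketch that renders a nested dict of field names into a GraphQL selection set, plus a generic introspection query for listing what a schema offers. The helper selection_set is hypothetical, and the id and createdDate fields are just the ones used in the example above; the actual fields depend on the schema.

```python
def selection_set(fields: dict, indent: int = 0) -> str:
    """Render a nested dict as a GraphQL selection set.

    A value of None marks a scalar leaf; a dict marks an object type
    whose sub-fields must be listed explicitly.
    """
    pad = "  " * indent
    lines = []
    for name, sub in fields.items():
        if sub is None:
            lines.append(f"{pad}{name}")
        else:
            lines.append(f"{pad}{name} {{")
            lines.append(selection_set(sub, indent + 1))
            lines.append(f"{pad}}}")
    return "\n".join(lines)


# Every object field has to be expanded down to its scalar leaves.
query = "{\n" + selection_set({"menus": {"id": None, "createdDate": None}}, 1) + "\n}"
print(query)

# Standard introspection query: lists every type and its field names.
INTROSPECTION_QUERY = "{ __schema { types { name fields { name } } } }"
```

Sending INTROSPECTION_QUERY to the endpoint is roughly what tools like GraphiQL do behind the scenes to power their auto-suggest.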

SELECT query using LIKE property in Microsoft Access returns no results when it should

I'm sure I'm making some kind of rookie error here, but I have no idea what the problem is. I am trying to run a simple query on one table in a Microsoft Access database using the LIKE operator to find records that have a certain text string in a particular field. More specifically, the table, called Catreqs, has a few fields: bib_num, MARC_336, MARC_337, and MARC_338. The MARC_336 field has a text string in it, and I want a query that selects all the records for which that text string includes the characters "txt".
Here's my query:
SELECT [Catreqs].record_num, [Catreqs].MARC_336
FROM [Catreqs]
WHERE [Catreqs].MARC_336 Like '%txt%';
I should note that I created this query in MS Access design view and this is the query that was generated when I switched to SQL view. I am a little familiar with SQL and even less familiar with Access so this is actually my preferred way of dealing with it.
I've also tried using Like '*txt*' but that didn't return any results either. For reference, here is the entire text string these characters are in:
text txt rdacontent
Any suggestions or thoughts on why this fails and how I can fix it?
Thanks!
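In Access, the wildcard that matches any string of characters is *, not % (at least in the default ANSI-89 query mode; % is the wildcard in ANSI-92 mode, for example when the query runs through ADO).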
Check if [Catreqs] has rows where MARC_336 contains "txt".
This is the official documentation of Access:
https://support.office.com/en-us/article/Like-Operator-b2f7ef03-9085-4ffb-9829-eef18358e931?ui=it-IT&rs=en-001&ad=IT&omkt=en-001
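As a rough illustration of why '%txt%' finds nothing in the default mode, the Python sketch below approximates Access's ANSI-89 LIKE wildcards (* for any run of characters, ? for one character, # for one digit). The function access_like is a hypothetical helper, and bracketed character lists such as [a-z] are deliberately not handled.

```python
import re


def access_like(value: str, pattern: str) -> bool:
    """Approximate Access's default (ANSI-89) LIKE matching.

    * = any run of characters, ? = any single character, # = any single digit.
    Everything else, including %, is treated as a literal character.
    """
    translated = {"*": ".*", "?": ".", "#": r"\d"}
    regex = "".join(translated.get(ch, re.escape(ch)) for ch in pattern)
    # Access compares text case-insensitively by default.
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


value = "text txt rdacontent"
print(access_like(value, "*txt*"))  # True
print(access_like(value, "%txt%"))  # False: % is only a literal percent sign here
```

If '*txt*' still returns nothing in Access itself, the pattern is probably not the problem, and it is worth checking the data and which field the query is really filtering on.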

Updating SQL from object with groovy

When you read in a result set in Groovy, it comes in as a collection of maps.
It seems like you should be able to update values inside those maps and write them back out, but I can't find anything built into Groovy that lets me do so.
I'm considering writing a routine that writes back a modified map by iterating over the fields of one of the result objects, taking each key/value pair and using them to create the appropriate update statement. That could be annoying, so I was wondering if anyone else has done this or if it's already available in Groovy.
It seems like just a few lines of code so I'd rather not bring in hibernate for this. I'm just thinking a little "update" method that would allow:
def rows=sql.rows(query)
rows[0].name="newName"
update(sql, rows[0])
to update the first guy's name in the database. Anyone seen/created such a monster, or is something like this already built into Groovy Sql and I'm just missing it?
(I suppose you may have to point out to the update method which field is the key field, but that's doable...)
Using the rows method will actually read out all of the values into a List of GroovyRowResult so it's not really possible to update the data without creating an update method like the one you mention.
It's not really possible to do that in the generic case because your query can contain joins or a column reference that is an aggregate, etc.
However, if you're selecting from a single table, you can use the Sql.eachRow method with the ResultSet set to be updatable, and then use the underlying ResultSet interface to update rows as you iterate through them:
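import java.sql.ResultSet

// Ask the driver for an updatable, forward-only ResultSet
sql.resultSetConcurrency = ResultSet.CONCUR_UPDATABLE
sql.resultSetType = ResultSet.TYPE_FORWARD_ONLY
sql.eachRow(query) { row ->
    // Stage the new value, then write the current row back to the database
    row.updateString('name', 'newName')
    row.updateRow()
}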
Depending on the database/driver you use, you may not be able to create an updatable ResultSet.
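For the "little update method" the question describes, a minimal sketch might look like the following. It is written in Python with sqlite3 rather than Groovy Sql, and the update_row helper, the person table, and its columns are all assumptions made for illustration. The caller names the table and key column, since a row map alone cannot reveal them.

```python
import sqlite3


def update_row(conn, table, row, key):
    """Write every non-key entry of `row` back to `table`, matching on `key`."""
    columns = [c for c in row if c != key]
    if not columns:
        return 0
    # Identifiers cannot be bound as parameters, so accept only plain names.
    for ident in (table, key, *columns):
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")
    assignments = ", ".join(f"{c} = ?" for c in columns)
    sql = f"UPDATE {table} SET {assignments} WHERE {key} = ?"
    cur = conn.execute(sql, [row[c] for c in columns] + [row[key]])
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person (id, name) VALUES (1, 'oldName')")

# Read rows as plain dicts, modify one, and write it back.
rows = [dict(r) for r in conn.execute("SELECT id, name FROM person")]
rows[0]["name"] = "newName"
updated = update_row(conn, "person", rows[0], key="id")
print(updated, conn.execute("SELECT name FROM person WHERE id = 1").fetchone()["name"])
```

This only makes sense for rows that came from a single table; as noted above, rows produced by joins or aggregates have no obvious place to be written back to.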

Is there a way to parse a SQL query to pull out the column names and table names?

I have 150+ SQL queries in separate text files that I need to analyze (just the actual SQL code, not the data results) in order to identify all column names and table names used, preferably with the number of times each column and table appears. Writing a brand new SQL parsing program is trickier than it seems, with nested SELECT statements and the like.
There has to be a program, or code out there that does this (or something close to this), but I have not found it.
I actually ended up using a tool called SQL Pretty Printer. You can purchase a desktop version, but I just used the free online application. Just copy the query into the text box, set the Output to "List DB Object" and click the Format SQL button.
It worked great on around 150 different (and complex) SQL queries.
How about using the Execution Plan report in MS SQL Server? You can save this to an XML file, which can then be parsed.
You may want to look into something like JSqlParser, which uses JavaCC to parse the query string and return it as an object graph. I've never used it, so I can't vouch for its quality.
If your application needs to do it, and has access to a database that has the tables etc., you could run something like:
SELECT TOP 0 * FROM MY_TABLE
using ADO.NET. This would give you a DataTable instance for which you could query the columns and their attributes.
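The same "run it but fetch zero rows" idea can be sketched in Python, here with sqlite3 and LIMIT 0 standing in for ADO.NET and TOP 0. The my_table schema and the result_columns helper are hypothetical. Note that this reports the columns a query returns, not every column it references in WHERE or JOIN clauses, so it only covers part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for MY_TABLE.
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, address TEXT)")


def result_columns(conn, query):
    """Return the column names a query would produce, without fetching rows."""
    cur = conn.execute(f"SELECT * FROM ({query}) LIMIT 0")
    # cursor.description holds one 7-tuple per result column; item 0 is the name.
    return [col[0] for col in cur.description]


print(result_columns(conn, "SELECT id, name FROM my_table"))
print(result_columns(conn, "SELECT * FROM my_table"))
```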
Go with ANTLR. Write a grammar and follow the steps given on the ANTLR site; eventually you will get an AST (abstract syntax tree) for the given query. You can then traverse it and collect every table and column present in the query.
In DB2 you can append your query with something such as the following, but 1 is the minimum you can specify; it will throw an error if you try to specify 0:
FETCH FIRST 1 ROW ONLY

Need Pattern for dynamic search of multiple sql tables

I'm looking for a pattern for performing a dynamic search on multiple tables.
I have no control over the legacy (and poorly designed) database table structure.
Consider a scenario similar to a resume search, where a user may want to search against any of the data in the resume and get back a list of resumes that match their search criteria. Any field can be searched at any time, alone or in combination with one or more other fields.
The actual SQL query gets created dynamically depending on which fields are searched. Most solutions I've found involve complicated if blocks, but I can't help thinking there must be a more elegant solution, since this must be a solved problem by now.
Yeah, so I've started down the path of dynamically building the sql in code. Seems godawful. If I really try to support the requested ability to query any combination of any field in any table this is going to be one MASSIVE set of if statements. shiver
I believe I read that COALESCE only works if your data does not contain NULLs. Is that correct? If so, no go, since I have NULL values all over the place.
As far as I understand (and I'm also someone who has written against a horrible legacy database), there is no such thing as dynamic WHERE clauses. It has NOT been solved.
Personally, I prefer to generate my dynamic searches in code. It makes testing convenient. Note: when you create your SQL queries in code, don't concatenate in user input. Use your @variables!
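One way to keep code-generated searches from turning into a wall of if statements is to drive them from a table of searchable fields. The following is a minimal Python sketch under that assumption: SEARCHABLE and build_search are hypothetical names, the Users schema is borrowed from the example in this answer, and sqlite3 stands in for the real database. Values are always bound as parameters, never concatenated.

```python
import sqlite3

# Whitelist of searchable fields mapped to column names (hypothetical schema).
SEARCHABLE = {"name": "Name", "nickname": "Nickname"}


def build_search(criteria):
    """Return (sql, params) covering only the fields that were searched."""
    clauses, params = [], []
    for field, value in criteria.items():
        if value is None:
            continue  # this field was not searched
        clauses.append(f"{SEARCHABLE[field]} = ?")  # KeyError rejects unknown fields
        params.append(value)
    where = " AND ".join(clauses) or "1 = 1"
    return f"SELECT Name, Nickname FROM Users WHERE {where}", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Name TEXT, Nickname TEXT)")
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [("brian", "bri"), ("anna", None)])

sql, params = build_search({"name": "brian", "nickname": None})
print(sql, params)
print(conn.execute(sql, params).fetchall())
```

Adding a searchable field then means adding one entry to the mapping rather than another branch.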
The only alternative is to use the COALESCE operator. Let's say you have the following table:
Users
-----------
Name nvarchar(20)
Nickname nvarchar(10)
and you want to search optionally for name or nickname. The following query will do this:
SELECT Name, Nickname
FROM Users
WHERE
Name = COALESCE(@name, Name) AND
Nickname = COALESCE(@nick, Nickname)
If you don't want to search for something, just pass in a null. For example, passing in "brian" for @name and null for @nick results in the following query being evaluated:
SELECT Name, Nickname
FROM Users
WHERE
Name = 'brian' AND
Nickname = Nickname
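The COALESCE call turns a null parameter into an identity comparison (Nickname = Nickname), which is true for every row where the column is not NULL and so doesn't otherwise affect the WHERE clause. Be aware that rows where the column itself is NULL are filtered out, because NULL = NULL does not evaluate to true.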
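Since the question mentions NULL values "all over the place", it is worth seeing how the COALESCE pattern behaves on them. This is a small Python sketch, with sqlite3 standing in for the real database and made-up rows; it compares the COALESCE filter with an "IS NULL OR" variant that keeps rows whose column is NULL when that field is not being searched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Name TEXT, Nickname TEXT)")
conn.executemany("INSERT INTO Users VALUES (?, ?)",
                 [("brian", "bri"), ("brian", None), ("anna", "ann")])

COALESCE_QUERY = """
    SELECT Name, Nickname FROM Users
    WHERE Name = COALESCE(:name, Name) AND Nickname = COALESCE(:nick, Nickname)
    ORDER BY rowid
"""
NULL_SAFE_QUERY = """
    SELECT Name, Nickname FROM Users
    WHERE (:name IS NULL OR Name = :name) AND (:nick IS NULL OR Nickname = :nick)
    ORDER BY rowid
"""

params = {"name": "brian", "nick": None}  # search by name only
coalesce_rows = conn.execute(COALESCE_QUERY, params).fetchall()
null_safe_rows = conn.execute(NULL_SAFE_QUERY, params).fetchall()

print(coalesce_rows)   # the row with a NULL nickname is missing
print(null_safe_rows)  # both "brian" rows are returned
```

Which behavior is wanted depends on the search semantics, but with many NULLs in the data the second form is usually the less surprising one.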
Search and normalization can be at odds with each other. So probably the first thing to do is to build some kind of "view" that shows all the searchable fields as a single row, with a single key that gets you the resume. Then you can put something like Lucene in front of that to give you a full-text index of those rows. The way that works is: you ask it for "x" in this view and it returns the key. It's a great solution and comes recommended by Joel himself on the podcast, within the first 2 months IIRC.
What you need is something like SphinxSearch (for MySQL) or Apache Lucene.
As you said in your example, let's imagine a resume that is composed of several fields:
Name,
Address,
Education (this could be a table of its own) or
Work experience (this could grow into its own table where each row represents a previous job)
So searching for a word in all those fields with WHERE rapidly becomes a very long query with several JOINs.
Instead, you could change your frame of reference and think of the whole resume as what it is: a single document, and you just want to search that document.
This is what tools like Sphinx Search do. They create a full-text index of your "document"; you then query Sphinx and it tells you which record in the database matched.
Really good search results.
Don't worry about these tools not being part of your RDBMS. It will save you a lot of headaches to use the appropriate model ("documents") rather than the incorrect one ("tables") for this application.
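To show the "resume as a single document" idea in miniature, here is a toy inverted index in Python. It is only a sketch of the concept: it is not the Lucene or Sphinx API, the sample resumes are made up, and real engines add ranking, stemming, and on-disk indexes on top of this.

```python
import re
from collections import defaultdict

# Hypothetical resumes, already flattened from several tables into one record each.
resumes = {
    1: {"name": "Ann Lee", "address": "12 Oak St", "education": "BSc Physics",
        "experience": "Python developer; data analyst"},
    2: {"name": "Bob Ray", "address": "9 Elm Rd", "education": "MA History",
        "experience": "Archivist; Java developer"},
}


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document keys whose fields contain it."""
    index = defaultdict(set)
    for key, fields in docs.items():
        for word in tokenize(" ".join(fields.values())):
            index[word].add(key)
    return index


def search(index, query):
    """Return the keys of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


index = build_index(resumes)
print(search(index, "developer"))
print(search(index, "python developer"))
```

The search returns only keys; the application then loads the matching resumes from the database by key, which is the division of labor described above.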