I have a repository of SQL queries and I want to understand which queries use certain tables or fields.
For example, how can I find the queries that use the email field?
Example SQL query:
select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users
So, to state the problem more accurately: you are sorting through a list of SQL queries (as text), and you need to find the queries that use certain fields, using SQL and RegEx (regular expressions) in PostgreSQL. (Please tag the question so that StackOverflow indexes it correctly and, more importantly, so that readers have more context.)
PostgreSQL supports regular expressions out of the box, so we can skip exploring other ways to do this. (If you are reading this as a Microsoft SQL Server person, I strongly suggest reading this brilliant article on Microsoft's website on defining a table-valued UDF (user-defined function).)
The simplest approach I could think of is to first throw away the parts of the query text we don't want, and then filter what's left.
After throwing away what you don't need, you are left with a set of "tokens" that you can easily filter. I'm putting "token" in quotes because we are not really parsing the SQL language, but if we were, extracting tokens would be the first step. (:
Take this query for example:
With Queries (
Id
, QueryText
) As (
values (1, 'select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users')
)
Select QueryText
, found
From (
Select Id
, QueryText
, regexp_split_to_table (QueryText, '(--[^\n]*|\m(select|from|as|where)\M|[\s,]+)') As found
From Queries
) As Result
Where found != ''
And found = 'back_email'
I have simulated the "query repository" with a WITH clause to keep the example self-contained.
I have also selected a few words and characters to split QueryText on, such as select, where, etc. We don't need these in our 'found' set.
In the end, as you can see above, I simply treat found as what's left and filter it by the field name you are looking for (assuming that you know which field that is).
You could improve upon my RegEx, or change the method as you wish. But I think the general concept addresses what you need to achieve. One problem I can see with my solution right off the bat is that you can search for anything, not just the names of the selected fields, which raises the question: why use RegEx and not LIKE? But again, as I mentioned, you can improve the RegEx to address specific requirements you may have, and LIKE might limit you in that direction. (In other words, only you know what's good for you. I can't say that from here.)
You can play with the query online here: db-fiddle query and use https://regex101.com/ for testing your RegEx.
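To illustrate how the tokenizing idea could be tightened, here is a minimal sketch in Python rather than PostgreSQL. The helper names `tokens` and `uses_field` are made up for this example. It assumes that `--` only ever starts a line comment (a `--` inside a string literal would be mishandled), and it drops only the same small keyword list used above.

```python
import re

# Hypothetical helpers for illustration; not part of the original answer.
COMMENT = re.compile(r"--[^\n]*")                  # a line comment, up to end of line
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # identifier-like words
KEYWORDS = {"select", "from", "as", "where"}


def tokens(query_text):
    """Return identifier-like tokens with comments and a few keywords removed."""
    without_comments = COMMENT.sub("", query_text)
    words = IDENTIFIER.findall(without_comments.lower())
    return [word for word in words if word not in KEYWORDS]


def uses_field(query_text, field):
    """True if `field` appears as a whole token (so back_email does not match email)."""
    return field.lower() in tokens(query_text)


sample = """select
users.email as email_user
,back_email as wrong_email -- wrong field
from users"""

print(tokens(sample))
print(uses_field(sample, "email"))
```

Because the identifier pattern stops at the dot, `users.email` yields both `users` and `email`, while `back_email` and `email_user` stay whole. A column alias that happens to be named `email` would still count as a match, so this remains a heuristic and not a parser.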
Disclaimer: I'm not a PostgreSQL developer. There must be other, perhaps better, ways of doing this. (:
I'm still a bit of a noob, so pardon if this question is a bit obvious. I did search for an answer but either couldn't understand how the answers I found applied, or simply couldn't find an answer.
I have a massive database housed on a DB2 for i server which I'm accessing using SQL through SQLExplorer (based on Squirrel SQL). The tables are very poorly documented and the first order of business is figuring out how to find my way around.
I want to write a simple query that does this:
1) Allows me to search the entire database looking for tables that include a column called "Remarks" (which contains field descriptions).
2) I then want it to search that column for a keyword.
3) I want a table returned that includes the names of the tables that include that keyword (just the name, I can look up the table alphabetically later and look inside if I need to.)
I need this search to be super lightweight, and I'm hoping the concept I describe will achieve that. Anything that eats up a lot of resources will likely anger the sys admin for the server.
Just to show I have tried (and that I am a complete noob), here's what I've got so far.
SELECT *
FROM <dbname>
WHERE Remarks LIKE '<keyword>'
Feel free to mock, I told you I'm an idiot :-).
Any help? Perhaps at least a push in the right direction?
PS - I can't seem to find a search function in SQLExplorer, if someone knows if I can perhaps use a simple search or filter to accomplish this same goal...that would be great.
You can query the system catalog to identify the tables:
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
FROM QSYS2.SYSCOLUMNS WHERE UPPER(COLUMN_NAME) = 'REMARKS'
And then query each table individually:
SELECT * FROM TABLE_SCHEMA.TABLE_NAME WHERE Remarks LIKE '%<keyword>%'
See the LIKE predicate for details of the pattern expression.
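If the catalog query returns many tables, the per-table statements could be generated instead of typed by hand. The following is only a sketch in Python: `quote_ident` and `build_keyword_queries` are hypothetical names, the `?` is a standard parameter marker for the keyword pattern, and `FETCH FIRST 1 ROWS ONLY` is added on the assumption that you only care whether a table contains a match at all.

```python
def quote_ident(name):
    """Double-quote an identifier so the exact case from the catalog is preserved."""
    return '"' + name.replace('"', '""') + '"'


def build_keyword_queries(catalog_rows):
    """Build one statement per (schema, table, column) row from the catalog query.

    The keyword pattern is left as a ? parameter marker, to be supplied
    when the statement is executed (for example '%KEYWORD%').
    """
    queries = []
    for schema, table, column in catalog_rows:
        queries.append(
            f"SELECT 1 FROM {quote_ident(schema)}.{quote_ident(table)} "
            f"WHERE UPPER({quote_ident(column)}) LIKE ? "
            "FETCH FIRST 1 ROWS ONLY"
        )
    return queries


# MYLIB and ORDERS are made-up names standing in for real catalog output.
for statement in build_keyword_queries([("MYLIB", "ORDERS", "REMARKS")]):
    print(statement)
```

Stopping after the first matching row keeps each probe cheap, which matters if the system administrator is sensitive to load.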
I normally use something like this:
SELECT TABLE_SCHEMA, TABLE_NAME
,COLUMN_NAME,SYSTEM_COLUMN_NAME,COLUMN_HEADING
,DATA_TYPE, "LENGTH",NUMERIC_SCALE
FROM QSYS2.SYSCOLUMNS
WHERE UPPER(COLUMN_NAME) LIKE '%REMARK%'
@JamesA, I'm at V6R1; by default, normal users are not authorized to the object QADBIFLD in QSYS.
Generally, many if not most IBM i shops (especially those that use RPG) stick to schema and table names of 10 characters or fewer, and use 'system' column names of 10 characters or fewer, even if longer column names are also provided. Column text generally describes each field.
SELECT SYSTEM_TABLE_SCHEMA, SYSTEM_TABLE_NAME
,SYSTEM_COLUMN_NAME
,DATA_TYPE, "LENGTH",NUMERIC_SCALE
,CHAR(COLUMN_TEXT)
FROM QSYS2.SYSCOLUMNS
WHERE UPPER(COLUMN_NAME) LIKE '%REMARK%'
Before I ask my question, I will lay out what I'm trying to do.
I have this table such as below
Columns - PID, Choice1, Choice2,......Choice10
Rows - 1,X, O, X, O.........
I've been searching on the net for quite some time and need a little push in the right direction, if what I'm trying to do is possible. While getting the code would help me with the small project I'm doing, it doesn't really help me learn more about SQL.
Is it possible to do a search on the table and return only the columns that have a value of X where PID = some value??
My gut instinct says no, and I might have to restructure my database to accomplish what I'm doing. As I said, a pointer in the right direction where I can read up on what I'm trying to do would be great; getting the code for it really doesn't help me learn it for future reference.
It does sound like you should restructure your database, but you can use PIVOT and UNPIVOT to transpose and restructure the output table. Columns are normally fixed with a variable number of rows depending on the WHERE clause. Using PIVOT can swap columns for rows, giving you what you need.
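To make the column-to-row idea concrete, here is a small sketch using Python's built-in `sqlite3` module. SQLite has no UNPIVOT operator, so a `UNION ALL` of one SELECT per column is used as a stand-in for it. The table name `answers`, the function name, and the sample rows are made up, and only three of the ten Choice columns are shown.

```python
import sqlite3

# In-memory database with a cut-down version of the table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers (PID INTEGER, Choice1 TEXT, Choice2 TEXT, Choice3 TEXT)"
)
conn.execute("INSERT INTO answers VALUES (1, 'X', 'O', 'X')")
conn.execute("INSERT INTO answers VALUES (2, 'O', 'X', 'O')")


def choices_with_value(conn, pid, value):
    """Return the names of the Choice columns holding `value` for one PID."""
    sql = """
        SELECT choice FROM (
            -- Each branch turns one column into (PID, column name, value) rows.
            SELECT PID, 'Choice1' AS choice, Choice1 AS val FROM answers
            UNION ALL
            SELECT PID, 'Choice2', Choice2 FROM answers
            UNION ALL
            SELECT PID, 'Choice3', Choice3 FROM answers
        ) AS unpivoted
        WHERE PID = ? AND val = ?
        ORDER BY choice
    """
    return [row[0] for row in conn.execute(sql, (pid, value))]


print(choices_with_value(conn, 1, "X"))
```

Once the columns are rows, "which choices are X for this PID" becomes an ordinary WHERE clause. This is also the shape a restructured table (PID, choice, value) would have from the start.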
I've already answered my question but didn't see it on here, so here we go. Please feel free to link to the question if it has been asked exactly.
I simplified my question to the following code:
SELECT 'a' AS col1, 'b' AS col1
Will this give a duplicate column name error?
Will the last value always be returned or is there a chance col1 could be 'a'?
I'm not sure why you would ever want this, but I tried it in Oracle (10g) and it worked fine, returning both columns. I realize you've asked about SQL Server specifically, but I found it interesting that this worked at all.
Edit: It also works on MySQL.
It works in the final query:
However when you do it in a subselect and refer to the ambiguous column aliases in an outer query you get an error:
In SQL Server 2008 R2 it is valid as a standalone query. Under certain circumstances it will produce errors (incomplete list):
Inline views
Common Table Expressions
Stored Procedures when the output is used by reporting services and presumably similarly integrated tools
It's hard to imagine a case where you would want duplicate column names, and it's easy to think of ways in which writing queries with repeats now could turn sour in the future.
When I try to run this query in Access through the ODBC interface into a MySQL database, I get an "Expression too complex in query expression" error. Essentially, I'm trying to translate abbreviated language names into their full English counterparts. I was curious whether there is some way to "trick" Access into thinking the expression is smaller by using subqueries, or whether someone has a better idea of how to solve this problem. I thought about making a temporary table and doing a join on it, but that's not supported in Access SQL.
Just as an FYI, the query worked fine until I added the long IIf chain. I tested the query with a smaller IIf chain for three languages, and that wasn't an issue, so the problem definitely stems from the huge IIf chain (it's 26 deep). Also, I might be able to drop some of the options (like combining the different forms of Chinese or Portuguese).
As a test, I was able to get the SQL query to work after paring it down to 14 IIf() calls, but that's a far cry from the 26 languages I'd like to represent.
SELECT TOP 5 Count( * ) AS [Number of visits by language], IIf(login.lang="ar","Arabic",IIf(login.lang="bg","Bulgarian",IIf(login.lang="zh_CN","Chinese (Simplified Han)",IIf(login.lang="zh_TW","Chinese (Traditional Han)",IIf(login.lang="cs","Czech",IIf(login.lang="da","Danish",IIf(login.lang="de","German",IIf(login.lang="en_US","United States English",IIf(login.lang="en_GB","British English",IIf(login.lang="es","Spanish",IIf(login.lang="fr","French",IIf(login.lang="el","Greek",IIf(login.lang="it","Italian",IIf(login.lang="ko","Korean",IIf(login.lang="hu","Hungarian",IIf(login.lang="nl","Dutch",IIf(login.lang="pl","Polish",IIf(login.lang="pt_PT","European Portuguese",IIf(login.lang="pt_BR","Brazilian Portuguese",IIf(login.lang="ru","Russian",IIf(login.lang="sk","Slovak",IIf(login.lang="sl","Slovenian",IIf(login.lang="fi","Finnish",IIf(login.lang="sv","Swedish",IIf(login.lang="tr","Turkish","Unknown"))))))))))))))))))))))))) AS [Language]
FROM login, reservations, reservation_users, schedules
WHERE (reservations.start_date Between DATEDIFF('s','1970-01-01 00:00:00',[Starting Date in the Following Format YYYY/MM/DD]) And DATEDIFF('s','1970-01-01 00:00:00',[Ending Date in the Following Format YYYY/MM/DD])) And reservations.is_blackout=0 And reservation_users.memberid=login.memberid And reservation_users.resid=reservations.resid And reservation_users.invited=0 And reservations.scheduleid=schedules.scheduleid And scheduletitle=[Schedule Title]
GROUP BY login.lang
ORDER BY Count( * ) DESC;
@Michael Todd
I completely agree. The list of languages should have been a table in the database and the login.lang should have been a FK into that table. Unfortunately this isn't how the database was written, and it's not really mine to modify. The languages are placed into the login.lang field by the PHP running on top of the database.
I thought about making a temporary table and doing a join on it, but that's not supported in Access SQL.
Did you try making a table of languages within Access, and joining it to the MySQL tables?
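As a rough sketch of the lookup-table idea, the example below uses Python's `sqlite3` module in place of Access and MySQL, with a made-up `languages` table (`code`, `name`) and a handful of sample rows. It counts `login` rows directly and leaves out the reservation joins from the original query, so it only shows how the join replaces the nested IIf chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE login (memberid INTEGER, lang TEXT);
    -- Hypothetical lookup table: one row per language code.
    CREATE TABLE languages (code TEXT PRIMARY KEY, name TEXT);
    INSERT INTO languages VALUES
        ('ar', 'Arabic'), ('de', 'German'), ('pt_BR', 'Brazilian Portuguese');
    INSERT INTO login VALUES (1, 'de'), (2, 'de'), (3, 'pt_BR'), (4, 'xx');
""")

# LEFT JOIN keeps logins whose code is missing from the lookup table;
# COALESCE labels those rows 'Unknown', like the final branch of the IIf chain.
rows = conn.execute("""
    SELECT COALESCE(languages.name, 'Unknown') AS language,
           COUNT(*) AS visits
    FROM login
    LEFT JOIN languages ON languages.code = login.lang
    GROUP BY login.lang, languages.name
    ORDER BY visits DESC, language
""").fetchall()

for language, visits in rows:
    print(language, visits)
```

Adding a language then means inserting one row into the lookup table, and the query itself never grows.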
You may try the expression below. I split your expression into two parts, and a final IIf check then does the trick. You will get two additional fields, which you may ignore. I had the same situation and this worked well for me. PS: You may need to double-check the closing brackets in the expression below; I did it quickly.
Thanks,
Shibin
IIf(login.lang="ar","Arabic",IIf(login.lang="bg","Bulgarian",IIf(login.lang="zh_CN","Chinese (Simplified Han)",IIf(login.lang="zh_TW","Chinese (Traditional Han)",IIf(login.lang="cs","Czech",IIf(login.lang="da","Danish",IIf(login.lang="de","German",IIf(login.lang="en_US","United States English",IIf(login.lang="en_GB","British English",IIf(login.lang="es","Spanish",IIf(login.lang="fr","French",IIf(login.lang="el","Greek",IIf(login.lang="it","Italian",""))))))))))))) as l1,
IIf(login.lang="ko","Korean",IIf(login.lang="hu","Hungarian",IIf(login.lang="nl","Dutch",IIf(login.lang="pl","Polish",IIf(login.lang="pt_PT","European Portuguese",IIf(login.lang="pt_BR","Brazilian Portuguese",IIf(login.lang="ru","Russian",IIf(login.lang="sk","Slovak",IIf(login.lang="sl","Slovenian",IIf(login.lang="fi","Finnish",IIf(login.lang="sv","Swedish",IIf(login.lang="tr","Turkish","Unknown")))))))))))) as l2,
IIf(l1="",l2,l1) AS [Language]
If you can't use a lookup table, create a custom VB function, so that instead of 26 IIf statements, you have one function call.