SQL: Need to hide a specific word(s) when they are used in a column - - sql

My question right now is whether something can be done or not, as a result, no code has been included in this question. If it can be done what is the correct phrase that I can query and research this further.
I am working with a customer database where the request has been made that if a specific word is used in the comments field, that word is replaced or hidden when the query is used report and viewed using SSRS / Report Builder.
I had also wondered if even an expression can be written to hide or mask that word, and this would then be used in the tablix field that is used on the report.
Any suggestions are appreciated.
The database is Microsoft SQL 2016 with SSRS 2017 and Report Builder 2016.

If there is a specific word, then the answer is simple. You can just use replace(). In fact, you can add this into the table:
alter table t add safe_comments as (replace(comments, '<bad word>', 'XXXXXXX'));
You can extend this to a handful of hard-coded words by nesting replace() values.
I suspect, however, that your problem is that you have a fairly long list of words that you want to replace. If that is the case, such a simple solution is not going to work.
It is possible in SQL Server to remove a list of words, stored in a table, from a given comment. That requires a recursive CTE (or a user-defined function). This probably has acceptable performance for returning a single record or a handful of records. However, for scanning the entire table, it would probably be too slow.

Related

Efficiently access long text from a column in a SQL Server database

Newbie question: I'm using SQL Server on a remote server where I cannot use UDFs of any type. I am using a database where one column is a text column containing several paragraphs of text per entry. As I test my code, I would like to have a way to efficiently access these columns a few at a time.
Currently, the only way I can find to do this is using eg:
SELECT TOP 25 ReportText
FROM MyDb.Table1
...which, of course, gives the output in the Results window, as a single string of characters that I have to keep dragging the column width wider and wider and wider in order to try to see.
What am I missing? I don't need it to look pretty, I just need to be able to see it efficiently....
As #JohnSpecko pointed out in a comment, results to text is exactly what I needed. It is turned on by hitting (ctrl+t) before running your query. Thanks!

Access dynamic query - Better to build one conditional SQL query or multiple queries with VBA?

I have a Microsoft Access 2010 form with dropboxes and a checkbox which represent certain parameters. I need to run a query with conditions based on these parameters. It should also be a possibility for no criteria from the dropdown boxes and checkbox in order to pull all data.
I have two working ways of implementing this:
I build a query with IIf statements in the WHERE clause, nesting statements until I have accounted for every combination of criteria. I reference the criteria in the SQL logic by using Forms!frmMyFrm!checkbox1 for example or by using a function FormFieldValue(formName,fieldName) which returns the value of a control with the input of the form and control name (This is because of previous issues). I set this query to run with the press of the form's button.
I set a vba sub to run with the press of the button. I check the conditions and set the query SQL to a predetermined SQL string based on the control criteria (referenced in the same way as the previous method). This also involves many If...Else statements, but is a little easier to read than a giant query.
What is the preferred method? Which is more efficient?
I don't believe you would find one way is more efficient over the other, at least not noticeably. For the most part it is simply personal preference.
I generally use VBA and check the value of each dropdown/checkbox and build pieces of the SQL query then put together at the end. The issue that you may run into with this method though is that if you have a large number of dropdowns and checkboxes the code is easy to get "lost" in.
If time to run is very key though you could always use some of the tips How do you test running time of VBA code? to see which way is faster.
After a lot of experimentation, and a bit of new information indicating having a pre-built query is faster than having SQL compiled in VBA, the most efficient and clear solution in the context of Microsoft Access is to build and save a number of dependent queries beforehand.
Essentially, build a string of queries each with an IIf dependent on a different criterium. Then you only need to run the final query. The only case where you would have to incorporate a VBA If...Else is if you need to query something more complicated than SELECT...WHERE(IIf(...)).
This has a few advantages:
The SQL is already compiled in the saved query, speeding things up.
No more getting lost in code:
There is no giant, nearly-impossible-to-edit query with way too many IIfs.
The minimal VBA code is even easier to follow.
At least for me, who's not an expert in SQL, it's convenient that I can often use the MS Access visual query builder for each part.

How can I make MS Access Query Parameters Optional?

I have a query that I would like to filter in different ways at different times. The way I have done this right now by placing parameters in the criteria field of the relevant query fields, however there are many cases in which I do not want to filter on a given field but only on the other fields. Is there any way in which a wildcard of some sort can be passed to the criteria parameter so that I can bypass the filtering for that particular call of the query?
If you construct your query like so:
PARAMETERS ParamA Text ( 255 );
SELECT t.id, t.topic_id
FROM SomeTable t
WHERE t.id Like IIf(IsNull([ParamA]),"*",[ParamA])
All records will be selected if the parameter is not filled in.
Note the * wildcard with the LIKE keyword will only have the desired effect in ANSI-89 Query Mode.
Many people mistakenly assume the wildcard character in Access/Jet is always *. Not so. Jet has two wildcards: % in ANSI-92 Query Mode and * in ANSI-89 Query Mode.
ADO is always ANSI-92 and DAO is always ANSI-89 but the Access interface can be either.
When using the LIKE keyword in a database object (i.e. something that will be persisted in the mdb file), you should to think to yourself: what would happen if someone used this database using a Query Mode other than the one I usually use myself? Say you wanted to restrict a text field to numeric characters only and you'd written your Validation Rule like this:
NOT LIKE "*[!0-9]*"
If someone unwittingly (or otherwise) connected to your .mdb via ADO then the validation rule above would allow them to add data with non-numeric characters and your data integrity would be shot. Not good.
Better IMO to always code for both ANSI Query Modes. Perhaps this is best achieved by explicitly coding for both Modes e.g.
NOT LIKE "*[!0-9]*" AND NOT LIKE "%[!0-9]%"
But with more involved Jet SQL DML/DDL, this can become very hard to achieve concisely. That is why I recommend using the ALIKE keyword, which uses the ANSI-92 Query Mode wildcard character regardless of Query Mode e.g.
NOT ALIKE "%[!0-9]%"
Note ALIKE is undocumented (and I assume this is why my original post got marked down). I've tested this in Jet 3.51 (Access97), Jet 4.0 (Access2000 to 2003) and ACE (Access2007) and it works fine. I've previously posted this in the newsgroups and had the approval of Access MVPs. Normally I would steer clear of undocumented features myself but make an exception in this case because Jet has been deprecated for nearly a decade and the Access team who keep it alive don't seem interested in making deep changes to the engines (or bug fixes!), which has the effect of making the Jet engine a very stable product.
For more details on Jet's ANSI Query modes, see About ANSI SQL query mode.
Back to my previous exampe in your previous question. Your parameterized query is a string looking like that:
qr = "Select Tbl_Country.* From Tbl_Country WHERE id_Country = [fid_country]"
depending on the nature of fid_Country (number, text, guid, date, etc), you'll have to replace it with a joker value and specific delimitation characters:
qr = replace(qr,"[fid_country]","""*""")
In order to fully allow wild cards, your original query could also be:
qr = "Select Tbl_Country.* From Tbl_Country _
WHERE id_Country LIKE [fid_country]"
You can then get wild card values for fid_Country such as
qr = replace(qr,"[fid_country]","G*")
Once you're done with that, you can use the string to open a recordset
set rs = currentDb.openRecordset(qr)
I don't think you can. How are you running the query?
I'd say if you need a query that has that many open variables, put it in a vba module or class, and call it, letting it build the string every time.
I'm not sure this helps, because I suspect you want to do this with a saved query rather than in VBA; however, the easiest thing you can do is build up a query line by line in VBA, and then creating a recordset from it.
A quite hackish way would be to re-write the saved query on the fly and then access that; however, if you have multiple people using the same DB you might run into conflicts, and you'll confuse the next developer down the line.
You could also programatically pass default value to the query (as discussed in you r previous question)
Well, you can return non-null values by passing * as the parameter for fields you don't wish to use in the current filter. In Access 2003 (and possibly earlier and later versions), if you are using like [paramName] as your criterion for a numeric, Text, Date, or Boolean field, an asterisk will display all records (that match the other criteria you specify). If you want to return null values as well, then you can use like [paramName] or Is Null as the criterion so that it returns all records. (This works best if you are building the query in code. If you are using an existing query, and you don't want to return null values when you do have a value for filtering, this won't work.)
If you're filtering a Memo field, you'll have to try another approach.

Is there a way to parser a SQL query to pull out the column names and table names?

I have 150+ SQL queries in separate text files that I need to analyze (just the actual SQL code, not the data results) in order to identify all column names and table names used. Preferably with the number of times each column and table makes an appearance. Writing a brand new SQL parsing program is trickier than is seems, with nested SELECT statements and the like.
There has to be a program, or code out there that does this (or something close to this), but I have not found it.
I actually ended up using a tool called
SQL Pretty Printer. You can purchase a desktop version, but I just used the free online application. Just copy the query into the text box, set the Output to "List DB Object" and click the Format SQL button.
It work great using around 150 different (and complex) SQL queries.
How about using the Execution Plan report in MS SQLServer? You can save this to an xml file which can then be parsed.
You may want to looking to something like this:
JSqlParser
which uses JavaCC to parse and return the query string as an object graph. I've never used it, so I can't vouch for its quality.
If you're application needs to do it, and has access to a database that has the tables etc, you could run something like:
SELECT TOP 0 * FROM MY_TABLE
Using ADO.NET. This would give you a DataTable instance for which you could query the columns and their attributes.
Please go with antlr... Write a grammar n follow the steps..which is given in antlr site..eventually you will get AST(abstract syntax tree). For the given query... we can traverse through this and bring all table ,column which is present in the query..
In DB2 you can append your query with something such as the following, but 1 is the minimum you can specify; it will throw an error if you try to specify 0:
FETCH FIRST 1 ROW ONLY

Need Pattern for dynamic search of multiple sql tables

I'm looking for a pattern for performing a dynamic search on multiple tables.
I have no control over the legacy (and poorly designed) database table structure.
Consider a scenario similar to a resume search where a user may want to perform a search against any of the data in the resume and get back a list of resumes that match their search criteria. Any field can be searched at anytime and in combination with one or more other fields.
The actual sql query gets created dynamically depending on which fields are searched. Most solutions I've found involve complicated if blocks, but I can't help but think there must be a more elegant solution since this must be a solved problem by now.
Yeah, so I've started down the path of dynamically building the sql in code. Seems godawful. If I really try to support the requested ability to query any combination of any field in any table this is going to be one MASSIVE set of if statements. shiver
I believe I read that COALESCE only works if your data does not contain NULLs. Is that correct? If so, no go, since I have NULL values all over the place.
As far as I understand (and I'm also someone who has written against a horrible legacy database), there is no such thing as dynamic WHERE clauses. It has NOT been solved.
Personally, I prefer to generate my dynamic searches in code. Makes testing convenient. Note, when you create your sql queries in code, don't concatenate in user input. Use your #variables!
The only alternative is to use the COALESCE operator. Let's say you have the following table:
Users
-----------
Name nvarchar(20)
Nickname nvarchar(10)
and you want to search optionally for name or nickname. The following query will do this:
SELECT Name, Nickname
FROM Users
WHERE
Name = COALESCE(#name, Name) AND
Nickname = COALESCE(#nick, Nickname)
If you don't want to search for something, just pass in a null. For example, passing in "brian" for #name and null for #nick results in the following query being evaluated:
SELECT Name, Nickname
FROM Users
WHERE
Name = 'brian' AND
Nickname = Nickname
The coalesce operator turns the null into an identity evaluation, which is always true and doesn't affect the where clause.
Search and normalization can be at odds with each other. So probably first thing would be to get some kind of "view" that shows all the fields that can be searched as a single row with a single key getting you the resume. then you can throw something like Lucene in front of that to give you a full text index of those rows, the way that works is, you ask it for "x" in this view and it returns to you the key. Its a great solution and come recommended by joel himself on the podcast within the first 2 months IIRC.
What you need is something like SphinxSearch (for MySQL) or Apache Lucene.
As you said in your example lets imagine a Resume that will composed of several fields:
List item
Name,
Adreess,
Education (this could be a table on its own) or
Work experience (this could grow to its own table where each row represents a previous job)
So searching for a word in all those fields with WHERE rapidly becomes a very long query with several JOINS.
Instead you could change your framework of reference and think of the Whole resume as what it is a Single Document and you just want to search said document.
This is where tools like Sphinx Search do. They create a FULL TEXT index of your 'document' and then you can query sphinx and it will give you back where in the Database that record was found.
Really good search results.
Don't worry about this tools not being part of your RDBMS it will save you a lot of headaches to use the appropriate model "Documents" vs the incorrect one "TABLES" for this application.