I've got a part of a query that I wrote in SSRS that doesn't work like I need it to. The situation is that I'm trying to create a drop-down parameter with a list of values from a column - we'll call it Column5. The problem is that some of the rows in that column are blank; they're not null, they're just made up entirely of whitespace. Because of this, they're not showing up in the list of values for the parameter, which means those rows will be completely ignored.
What I'm trying to do is something like this:
SELECT [...stuff...] CASE WHEN (RTRIM(LTRIM(Column5)) = '') THEN 'None' ELSE Column5 END AS ColumnAlias
I've tried the above and a few variations on it, but nothing seems to work.
NB: I pasted the above query into SQL Server, and it worked just fine. Seems like SQL Server and SSRS deal with whitespace differently.
EDIT: Apparently part of the problem is that SQL won't filter based on a renamed column. I pasted my query into SQL Server, and included WHERE Column5 = 'None'. It didn't return any rows, even though there were a few thousand rows that clearly said 'None' in Column5. It seems like I might have to rethink my whole approach.
Try this:
SELECT [...stuff...] CASE WHEN LEFT(Column5, 1) = ' ' OR RIGHT(Column5, 1) = ' ' THEN 'None' ELSE Column5 END AS ColumnAlias
The reason for this is that string comparisons ignore trailing spaces (and, under the default case-insensitive collation, letter case as well). In other words, "HI", "hi", and "Hi " (with a space after it) are all considered equal.
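You can see this comparison rule with a one-line check in SSMS:

SELECT CASE WHEN 'Hi' = 'Hi ' THEN 'equal' ELSE 'different' END;
-- returns 'equal', because trailing spaces are ignored when comparing strings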
I know this is an older post, but just in case you're still having trouble...
I've seen that empty strings or strings of spaces seem to be handled and identified differently in different systems (Access vs SSRS vs T-SQL run in SSMS). I find that using the LEN() function works pretty well everywhere. I usually do something like the following:
Case
    When ISNULL(LEN(LTRIM(RTRIM([Column5]))), 0) = 0
        Then 'None'
    Else [Column5]
End
This way it's actively looking for and counting characters after you've removed all the spaces.
Once you've identified these fields, you can put your whole query (assuming it's not overly complex) into a CTE and work with the end product. This can sometimes be easier than trying to work with the data as it is being derived.
With Selected_Records as (
Select ...
...
)
Select Distinct Column5
From Selected_Records
Then you can add any other conditions, ordering, or aggregates on top of the derived data without hindering its derivation. This works pretty well until the query in the CTE gets very complicated or utilizes many parameters (these are anecdotal observations).
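For instance, a minimal sketch of the whole pattern (the source table name dbo.SourceTable is hypothetical; only Column5 comes from the question):

With Selected_Records as (
    Select Case
               When ISNULL(LEN(LTRIM(RTRIM([Column5]))), 0) = 0 Then 'None'
               Else [Column5]
           End as Column5
    From dbo.SourceTable -- hypothetical table name
)
Select Distinct Column5
From Selected_Records
Order By Column5;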
Hope this helps someone.
Related
Thank you for checking my question out!
I'm trying to write a query for a very specific problem we're having at my workplace and I can't seem to get my head around it.
Short version: I need to be able to target columns by their name, and more specifically by a part of their name that will be consistent throughout all the columns I need to combine or compare.
More details:
We have (for example) 5 different surveys. They have many questions each, but SOME of the questions measure the same metric, and we need to create a generic field that captures it. There's more background to the "why" of that, but it's pretty important for us at this point.
We were able to kind of solve this with either COALESCE() or CASE statements, but the challenge is that, as more surveys and survey versions are added, our vendor inevitably generates new columns for each survey and its questions.
Take this example, which is what we do currently and works well enough:
CASE
WHEN SURVEY_NAME = 'Service1' THEN SERV1_REC
WHEN SURVEY_NAME = 'Notice1' THEN FNOL1_REC
WHEN SURVEY_NAME = 'Status1' THEN STAT1_REC
WHEN SURVEY_NAME = 'Sales1' THEN SALE1_REC
WHEN SURVEY_NAME = 'Transfer1' THEN Null
ELSE Null
END REC
And also this alternative which works well:
COALESCE(SERV1_REC, FNOL1_REC, STAT1_REC, SALE1_REC) as REC
But as I mentioned, eventually we will have a "SALE2_REC" for example, and we'll need them BOTH on this same statement. I want to create something where having to come into the SQL and make changes isn't needed. Given that the columns will ALWAYS be named "something#_REC" for this specific metric, is there any way to achieve something like:
COALESCE(all columns named LIKE '%_REC') as REC
Bonus! Related, might be another way around this same problem:
Would there also be a way to achieve this?
SELECT (columns named LIKE '%_REC') FROM ...
Thank you very much in advance for all your time and attention.
-Kendall
Table and column information in Db2 are managed in the system catalog. The relevant views are SYSCAT.TABLES and SYSCAT.COLUMNS. You could write:
select colname, tabname
from syscat.columns
where colname like some_expression
and tabname = 'MYTABLE'
Note that the LIKE predicate supports expressions based on a variable or the result of a scalar function. So you could match it against some dynamic input.
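As a sketch of how the catalog could drive the COALESCE the question asks for (the table name MYSURVEYS is hypothetical; LISTAGG is available in Db2 9.7 and later):

SELECT 'COALESCE(' || LISTAGG(COLNAME, ', ') || ') AS REC'
FROM SYSCAT.COLUMNS
WHERE TABNAME = 'MYSURVEYS'
  AND COLNAME LIKE '%\_REC' ESCAPE '\';
-- the underscore is escaped because _ is itself a LIKE wildcard

The generated string can then be spliced into a dynamically prepared statement (for example, with EXECUTE IMMEDIATE inside an SQL PL procedure).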
Have you considered storing the more complicated properties in JSON or XML values? Db2 supports both and you can query those values with regular SQL statements.
I am new to SQL. I am trying to practice writing CASE expressions. Below is a query I have been working with.
SELECT bill,
'provider' as
case
when refer != '' THEN refer
WHEN render != '' THEN render
ELSE 'NULL'
END
FROM billing
These are the criteria for my query:
1) I need a new column in the select that is not part of the table. I have named it provider in the above query.
2) I need the new column's value to be the refer column's value if refer is not empty.
3) I need it to be equal to the render column's value if render is not empty.
4) I need it to be NULL if both are empty.
5) The output should look like
Bill Provider
123 Health
456 Org
789 NULL
The correct syntax is:
SELECT bill,
(CASE WHEN refer <> '' THEN refer
WHEN render <> '' THEN render
END) as provider
FROM billing;
Notes:
The column alias comes after the definition.
Although != works, <> is the traditional comparison operator for not equals.
Do not use single quotes for column aliases. Only use them for string and date constants.
You've already got a fine answer, but I figured I'd mention a few other features to investigate while you're learning about CASE. They may not apply to your current problem, but you'll likely find over time that FILTER and COALESCE are equally worth knowing about. FILTER often works as a simpler-to-read alternative to CASE. Check it out while you're studying CASE, and you'll have another option for future problems. Here's a short write-up you might like:
https://medium.com/little-programming-joys/the-filter-clause-in-postgres-9-4-3dd327d3c852
I use FILTER for manually constructed pivot tables, and it's much simpler to construct and review in that situation.
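As a quick sketch of the syntax (PostgreSQL 9.4+; the tickets table and its columns are hypothetical, just to show the pivot-style counting):

SELECT category,
       COUNT(*) FILTER (WHERE status = 'open')   AS open_count,
       COUNT(*) FILTER (WHERE status = 'closed') AS closed_count
FROM tickets  -- hypothetical table
GROUP BY category;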
COALESCE you may already know about. But, if not, it's super handy. Pass in a list of possible values, and get back the first one (reading left-to-right) that's not null. That can sometimes be what you need where you would otherwise have to write a CASE.
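In fact, combined with NULLIF, COALESCE can express the provider logic from the question directly (a sketch against the same billing table):

SELECT bill,
       COALESCE(NULLIF(refer, ''), NULLIF(render, '')) AS provider
FROM billing;
-- NULLIF turns empty strings into NULL; COALESCE picks the first non-NULL value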
https://www.postgresql.org/docs/current/functions-conditional.html
I have tried looking for answers online, but I am lacking the right nomenclature to find any answers matching my question.
The DB I am working with is an inconsistent mess. I am currently trying to import a number of maintenance codes which I have to link to a pre-existing Excel table. For this reason, the maintenance codes I import have to be very uniform.
The table is designed to work with 2-3 digit numbers (time lengths), followed by a time unit.
For example, SERV-01W and SERV-03M .
As these used to be added to the DB by hand, a large number of older maintenance codes are actually written with 1 digit numbers.
For example, SERV-1W and SERV-3M.
I would like to replace the old codes with the new codes. In other words, I want to add a leading 0 when only one digit is used in the code.
REPLACE(T.Code,'-[0-9][DWM]','-0[0-9][DWM]') unfortunately does not work, most likely because I am using wildcards in the result string.
What would be a good way of handling this issue?
Thank you in advance.
Assuming I understand your requirement this should get you what you are after:
WITH VTE AS(
SELECT *
FROM (VALUES('SERV-03M'),
('SERV-01W'),
('SERV-1Q'),
('SERV-4X')) V(Example))
SELECT Example,
ISNULL(STUFF(Example, NULLIF(PATINDEX('%-[0-9][A-z]%',Example),0)+1,0,'0'),Example) AS NewExample
FROM VTE;
Instead of trying to replace the pattern, I used PATINDEX to find the pattern and then inject the extra '0' character. If the pattern wasn't found, so 0 was returned by PATINDEX, I forced the expression to return NULL and then wrapped the entire thing with a further ISNULL, so that the original value was returned.
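If you want to fix the stored values rather than just select the corrected form, the same PATINDEX/STUFF trick works in an UPDATE (a sketch; the table name MaintenanceCodes is hypothetical, and [DWM] covers the day/week/month units from the question):

UPDATE T
SET Code = STUFF(Code, PATINDEX('%-[0-9][DWM]%', Code) + 1, 0, '0')
FROM MaintenanceCodes T
WHERE Code LIKE '%-[0-9][DWM]%';  -- only one-digit codes match this pattern

Two-digit codes such as SERV-03M don't match the pattern (the character after the first digit is another digit, not a unit letter), so they're left untouched.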
I find a simple CASE expression to be a simple way to express the logic:
SELECT (CASE WHEN code LIKE '%-[0-9][0-9]%'
THEN code
ELSE REPLACE(code, '-', '-0')
END)
That is, if the code has two digits, then do nothing. Otherwise, add a zero. The code should be quite clear on what it is doing.
This is not generalizable (it doesn't add two zeros for instance), but it does do exactly what you are asking for.
If I have a column called 'Categories' with say science,maths,english in the row comma-separated as shown, how would I match all rows with the category containing maths?
I've tried a simple LIKE, but it is not quite accurate, as there may be a category like 'poo_science', which a search for '%science%' would also match.
I've looked around StackOverflow and there are plenty of similar questions but all seem to want to return data as a comma separated list or something - not quite what I'm after.
I'd prefer not to use a stored procedure and cannot use full-text searching. I have a stored procedure I used which added another character ('$') around each value and then would search for '$value$'... is this too nasty? I'm after a little more simple method.
Disclaimer: The commenters are right... CSVs in a single field are a horrible design and should be redone.
With that said, here's how you can work around your problem:
Pad Categories with a leading and a trailing comma; that way you can include the delimiters in your wildcard search:
WHERE (',' + Categories + ',') LIKE '%,science,%'
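One caveat if you're on MySQL (the question doesn't say which DBMS): + is numeric addition there, not string concatenation, so the same idea is written with CONCAT:

WHERE CONCAT(',', Categories, ',') LIKE '%,science,%'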
Use MySQL's FIND_IN_SET(str, strlist) function.
SQL:
SELECT name FROM orders,company
WHERE orderID = 1
AND
FIND_IN_SET(companyID, attachedCompanyIDs)
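Applied to the question's data, it would look like this (the table name Subjects is hypothetical):

SELECT * FROM Subjects
WHERE FIND_IN_SET('maths', Categories);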
You can also check this comparison: FIND_IN_SET() vs IN()
I propose a four-part WHERE that can match any of the possible cases: the value alone, or the value at the start, middle, or end of the CSV:
WHERE Categories = 'science' /* CSV containing only the one value */
OR Categories LIKE 'science,%' /* value at start of CSV */
OR Categories LIKE '%,science,%' /* value somewhere in the middle */
OR Categories LIKE '%,science' /* value at the end of CSV */
This way all 'science' rows should be selected but none of the 'poo_science' rows.
I've made some assumptions about your data layout. Try this - using SQL Server 2K8+ this should work:
DECLARE @SearchString NVarChar(100) = 'maths';
SELECT 1 SomeId, 'science,maths,english' Categories
INTO #TestTable;
WITH R AS (
SELECT
X.SomeId,
C.value('@value', 'NVarChar(100)') SomeTagValue
FROM (SELECT SomeId,
CONVERT(XML, '<tag value = "' + REPLACE(Categories, ',', '" /><tag value = "') + '" />') XMLValue
FROM #TestTable) X CROSS APPLY X.XMLValue.nodes('//tag') T(C)
)
SELECT *
FROM R
WHERE SomeTagValue = @SearchString;
DROP TABLE #TestTable;
It's definitely not going to be super-efficient or very scalable, but then working against denormalized data tends to inherently have those issues.
This question is visible on google and has many views, so I want to share my approach to this problem. I had to deal with such a poor design as comma-separated values stored as strings too. I came across this issue while tweaking a CMS's plugin responsible for tags.
Yeah, tags related to a site article were stored like this: "tag1,tag2,...,tagN". So, getting the exact match wasn't as trivial as it might have initially appeared: using simple LIKE, with articles tagged "ball" I also got ones tagged "football" and "ballroom". Not critical, but rather annoying.
The FIND_IN_SET function seemed awesome at first, but then it turned out that it can't use an index and doesn't work properly if the first argument contains a comma character.
I had no desire to alter the plugin itself or deeper CMS core functionality which that plugin had been built upon.
Also it is worth noting that the needed tag (substring) can be the first element, the last element, or somewhere in the middle of the string, so a bare LIKE '%,science,%' without padding the column with delimiter commas doesn't cover all three cases.
Finally, I ended up with very simple solution. It worked for me like this:
... WHERE tags LIKE 'ball,%' OR tags LIKE '%,ball,%' OR tags LIKE '%,ball'
All three cases are covered; commas are used as delimiters. Hope it helps others who come across a similar pitfall.
PS. I am not a MySQL/DB expert at all, and I would love to read about potential drawbacks of this approach, especially on really huge tables (which wasn't my case, btw). I just shared the results of my small research and what I did to solve this problem with minimal effort.
Use the MySQL FIND_IN_SET() function.
Syntax
SELECT * FROM table_name WHERE FIND_IN_SET(value_to_search, comma_separated_string);
Example
SELECT * FROM table_name WHERE FIND_IN_SET(5, '1,2,3,4,5,6');
More information at the link below:
http://blog.sqlauthority.com/2014/03/21/mysql-search-for-values-within-a-comma-separated-values-find_in_set/
In the result of SELECT * FROM myTable WHERE some-condition; I'm interested in 9 of the 10 columns that exist. Is the only way out to specify the 9 columns explicitly?
Can't I somehow specify just the column I don't want to see?
The only way is to list all 9 columns.
Such as:
SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9 FROM myTable
No, you cannot. An example definition of the select list for Sybase can be found here; you can easily find others for other DBs.
The reason is that the standard methods of selection - "*" (aka all columns) and an explicit list of columns - are defined operations in relational algebra, whereas the exclusion of columns is not.
Also, as mentioned in Joe's comment, it is usually considered good practice to explicitly specify column list as opposed to "*" even when selecting all columns.
The reason for that is that having * in a joined query may cause the query to break if a table schema change introduces identically-named fields in both of the joined tables.
However, when selecting without a join from a very wide and often-changing table, the above rule may not apply: "*" makes for good change management (your query is one less place to fix and release when adding new columns), especially if you have flexible DB retrieval code that can dynamically deal with the column set from the table definition instead of one specified in the code (e.g., 100% of our extractors and loaders keep working whenever a new column is added to the DB).
If you had to (though I can't think of why), you could dynamically create the SELECT statement by querying the table's columns from the system catalog and excluding the one column name in the WHERE clause.
Not worth the performance hit, confusion, and maintenance issues that will come up.
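For completeness, a sketch of what that dynamic approach could look like on SQL Server 2017+ (STRING_AGG; the table and excluded column names are hypothetical):

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

SELECT @cols = STRING_AGG(QUOTENAME(name), ', ')
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.myTable')
  AND name <> 'UnwantedColumn';  -- the one column to leave out

SET @sql = N'SELECT ' + @cols + N' FROM dbo.myTable;';
EXEC sp_executesql @sql;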
You actually need to specify the columns explicitly (as said by Luke it is good practice), and here is the reason:
Let's say that you write some code / scripts around your SQL queries. You now have a whopping 50 different selects in various places in your code.
Suddenly you realize that for the new feature you are working on, you need another column (or, symmetrically, you are doing cleanup and realize a column is useless and wasting space, though removing one is harder).
Now you are in one of these two situations:
You explicitly stated the columns in each and every query: Adding a column is a backward compatible change, just code your new feature and be done with it.
You used the '*' operator for a few queries: you have to track them down and modify them all. Forget a single one and it will be your grave.
Oh, and did I mention that a query with a '*' selector takes more time to execute, since the DB actually has to query the catalog to expand the '*' into a column list?
Moral: only use the '*' selector when you are checking manually that your columns are fine (at which point you actually need to check everything); in code, just ban them or they'll be your doom.
No, you can't (at least not in any SQL dialect that I'm aware of).
It's good practice to explicitly specify your column names anyway, rather than using SELECT *.
In the end, you need to specify all 9 out of 10 columns separately - but there's tooling out there that makes this easier!
Check out Red-Gate's SQL Prompt which is an intellisense-add-on for SQL Server Management Studio and Visual Studio.
Amongst a lot of other things, it allows you to type
SELECT * FROM MyTable
and then go back, put the cursor after the " * ", and press TAB - it will then list out all the columns in that table and you can tweak that list (e.g. remove a few you don't need).
Absolutely invaluable - saves hours and hours of mindless typing! Well worth the price of a license, I'd say.
Highly recommended!
Marc