I am trying to subset my data with PROC SQL, and it is giving me an error when I use my variable TNM_CLIN_STAGE_GROUP. Example below:
create table subset as
select ncdb.*
from ncdb
ERROR: Expression using equals (=) has components that are of different data types.
When I run the same code, but take out the variable TNM_CLIN_STAGE_GROUP, the code works. Anyone know what the problem with that variable's name is?

That error indicates a difference in type. SAS has only two types, numeric and character, so the variable is probably character; verify the specific values, but in general the value you compare it to likely needs quotation marks (single or double, it doesn't matter in this case).
If it is not a hardcoded value, but a value of another variable, you can use PUT to convert to character or INPUT to convert to numeric, whichever is easier to convert based on the data.
SAS in a data step will happily convert this for you, but in SQL and SQL-like (WHERE statements) it does not automatically convert character to numeric and vice versa; you must provide the correct type.

Before doing equality, check what you are trying to compare.
Check the structure of your ncdb table, in particular the field type of TNM_CLIN_STAGE_GROUP.
That shows you the real type. If it is a varchar, you need to use single quotes, as @JChao suggests in his comment.
If it is another type, you need to adapt the comparison, or use a cast if you have no other choice.


Checking SQLite value type - numeric vs. textual

Is it possible to filter SQLite column values in SQL based on whether the value is numeric or textual? I have seen references to using CAST for this purpose. However, it appears to be useless as SELECT CAST('1a' AS NUMERIC) passes the check for a numeric type.
The typeof() SQL function is designed for type checking. However, its result depends on both the column type definition (according to the official docs) and the format used during insertion. For example, when a number is inserted as a text literal into a NUMERIC column, it is converted into a number if possible, and typeof() returns the appropriate numeric type, or text if the conversion did not occur. A TEXT column, on the other hand, stores all numeric literals as text. A BLOB column stores textual and numeric literals without interpretation. Therefore, a mixed-type column should probably be declared as BLOB or NUMERIC (depending on whether textual literals need to be converted to numbers where possible). With this behavior in mind, typeof() is well suited to type checking.
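The affinity rules described above can be checked with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database; the table name `demo` and the column names `n`, `t`, and `b` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one column per declared type discussed above.
conn.execute("CREATE TABLE demo (n NUMERIC, t TEXT, b BLOB)")

# Each row stores the same literal in all three columns.
conn.execute("INSERT INTO demo VALUES ('123', '123', '123')")  # numeric-looking text
conn.execute("INSERT INTO demo VALUES ('1a', '1a', '1a')")     # non-numeric text
conn.execute("INSERT INTO demo VALUES (42, 42, 42)")           # numeric literal

# typeof() reports the storage class actually used for each value.
storage_classes = conn.execute(
    "SELECT typeof(n), typeof(t), typeof(b) FROM demo ORDER BY rowid"
).fetchall()

# Filter the NUMERIC column down to values that were stored as numbers.
numeric_only = [
    row[0]
    for row in conn.execute(
        "SELECT n FROM demo WHERE typeof(n) IN ('integer', 'real') ORDER BY rowid"
    )
]

# CAST does not reject '1a'; it converts the leading numeric prefix.
cast_result = conn.execute("SELECT CAST('1a' AS NUMERIC)").fetchone()[0]

for row in storage_classes:
    print(row)
print(numeric_only, cast_result)
```

Filtering on typeof() in the WHERE clause keeps only the rows whose value was stored as a number, which is the check that CAST alone cannot provide.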
That's just an idea:
SELECT [FilterColumn] FROM [Table] WHERE [FilterColumn]='0' OR (ceiling(log([FilterColumn],10)) =LENGTH([FilterColumn]) AND CAST([FilterColumn] AS INTEGER)>0)
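This works for integers where the number of digits equals ceiling(log([FilterColumn],10)), assuming log([FilterColumn],10) is the base-10 logarithm. Because a non-numeric string such as a single letter casts to 0, the value '0' is tested separately with [FilterColumn]='0', and every other match must cast to an integer greater than 0. Note that exact powers of ten (1, 10, 100, ...) do not satisfy the digit-count test, since the base-10 logarithm of 10 is exactly 1 while LENGTH('10') is 2.
I suppose there are more elegant solutions.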

What is the purpose of using `timestamp(nullif('',''))`

I am in the process of moving a decade-old back-end from DB2 9.5 to Oracle 19c.
I frequently see bizarre timestamp(nullif('','')) constructs in SQL queries and view definitions, used instead of a plain null.
What is the point of doing so? Why would anyone in their right mind want to do so?
Disclaimer: my SQL skills are fairly mediocre. I might well miss something obvious.
It appears to create a NULL value with a TIMESTAMP data type.
The TIMESTAMP DB2 documentation states:
TIMESTAMP scalar function
The TIMESTAMP function returns a timestamp from a value or a pair of values.
TIMESTAMP(expression1, [expression2])
expression1 and expression2
The rules for the arguments depend on whether expression2 is specified and the data type of expression2.
If only one argument is specified it must be an expression that returns a value of one of the following built-in data types: a DATE, a TIMESTAMP, or a character string that is not a CLOB.
If you try to pass an untyped NULL to the TIMESTAMP function:
Then you get the error:
The invocation of routine "TIMESTAMP" is ambiguous. The argument in position "1" does not have a best fit.
To invoke the function, you need to pass one of the required DATE, TIMESTAMP or a non-CLOB string to the function which means that you need to coerce the NULL to have one of those types.
This could be:
Using NULLIF is more confusing but, if I have to try to make an excuse for using it, it is slightly less to type than casting a NULL to a string.
The equivalent in Oracle would be:
This also works in DB2 (and is even less to type).
It is not clear why - in any SQL dialect, no matter how old - one would use an argument like nullif('',''). Regardless of the result, that is a constant that can be calculated once and for all, and given as argument to timestamp(). Very likely, it should be null in any dialect and any version. So that should be the same as timestamp(null). The code you found suggests that whoever wrote it didn't know what they were doing.
One might need to write something like that - rather than a plain null - to get null of a specific data type. Even though "theoretical" SQL says null does not have a data type, you may need something like that, for example in a view, to define the data type of the column defined by an expression like that.
In Oracle you can use the cast() function, as MT0 demonstrated already - that is by far the most common and most elegant equivalent.
If you want something much closer in spirit to what you saw in that old code, to_timestamp(null) will have the same effect. No reason to write something more complicated for null given as argument, though - along the lines of that nullif() call.

Data Factory expression substring? Is there a function similar to right?

How could I extract 2019-04-02 out of the following string with Azure data flow expression?
The first part of the string, received as a ChildItem from a GetMetaData activity, is dynamic. In this case, ABC_DATASET is the dynamic part.
There are several ways to approach this problem, and they are really dependent on the format of the string value. Each of these approaches uses Derived Column to either create a new column or replace the existing column's value in the Data Flow.
Static format
If the format is always the same, meaning the length of the sections is always the same, then substring is simplest:
This will parse the string like so:
Useful reminder: substring and array indexes in Data Flow are 1-based.
Dynamic format
If the format of the base string is dynamic, things get a tad trickier. For this answer, I will assume that the basic format of {variabledata}-{timestamp}.parquet is consistent, so we can use the hyphen as a base delineator.
Derived Column has support for local variables, which is really useful when solving problems like this one. Let's start by creating a local variable to convert the string into an array based on the hyphen. This will lead to some other problems later since the string includes multiple hyphens thanks to the timestamp data, but we'll deal with that later. Inside the Derived Column Expression Builder, select "Locals":
On the right side, click "New" to create a local variable. We'll name it and define it using a split expression:
Press "OK" to save the local and go back to the Derived Column. Next, create another local variable for the yyyy portion of the date:
The cool part of this is I am now referencing the local variable array that I created in the previous step. I'll follow this pattern to create a local variable for MM too:
I'll do this one more time for the dd portion, but this time I have to do a bit more to get rid of all the extraneous data at the end of the string. Substring again turns out to be a good solution:
Now that I have the components I need isolated as variables, we just reconstruct them using string interpolation in the Derived Column:
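The split-and-reassemble steps above can be mirrored outside Data Flow to check the logic. This is a plain Python sketch, not Data Flow expression syntax; the sample file name and the function name `extract_date` are assumptions made for illustration, chosen to match the {variabledata}-{timestamp}.parquet format.

```python
# Hypothetical sample name in the {variabledata}-{timestamp}.parquet format.
file_name = "ABC_DATASET-2019-04-02T02:10:03.5249248Z.parquet"


def extract_date(name: str) -> str:
    """Rebuild yyyy-MM-dd from a name split on hyphens."""
    # Equivalent of the split local: the timestamp's own hyphens
    # produce extra array elements after the dynamic prefix.
    parts = name.split("-")

    # Python lists are 0-based, so these are elements 2, 3 and 4
    # in Data Flow's 1-based indexing.
    yyyy = parts[1]
    mm = parts[2]
    # The last piece still carries the time and extension,
    # so keep only its first two characters.
    dd = parts[3][:2]

    # Equivalent of the string interpolation step.
    return f"{yyyy}-{mm}-{dd}"


print(extract_date(file_name))
```

Like the approach above, this assumes the dynamic prefix contains no hyphen of its own; a prefix with hyphens would shift the positions of the date parts.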
Back in our data preview, we can see the results:
Where else to go from here
If these solutions don't address your problem, then you have to get creative. Here are some other functions that may help:

SQL Server - simple select and conversion between int and string

I have a simple select statement like this:
SELECT [dok__Dokument].[dok_Id],
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = 2753
AND [dok_PlatnikId] = 174
AND [dok_OdbiorcaId] = 174
AND [dok_PlatnikAdreshId] = 625
AND [dok_OdbiorcaAdreshId] = 624
Column dok_NrPelnyOryg is of type varchar(30), and not null.
The table contained both integer and string values in this column and this select statement was fired millions of times.
However recently this started crashing with message:
Conversion failed when converting the varchar value 'garbi czerwiec B' to data type int.
Little explanation: the table contains multiple "document" records and the mentioned column contains document original number (which comes from multiple different sources).
I know I can fix this by adding '' around the number, but I'm looking for an explanation of why this used to work and now crashes, even though nothing was changed.
It's possible that a plan change (due to changed statistics, recompile etc) led to this data being evaluated earlier (full scan for example), or that this particular data was not in the table previously (maybe before this started happening, there wasn't bad data in there). If it is supposed to be a number, then make it a numeric column. If it needs to allow strings as well, then stop treating it like a number. If you properly parameterize your statements and always pass a varchar you shouldn't need to worry about whether the value is enclosed in single quotes.
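The effect of evaluation order can be illustrated with a loose Python analogy. This is not how SQL Server executes a plan; the two sample rows and the function names are made up. It only shows that the same predicates succeed or fail depending on whether the conversion reaches the bad row.

```python
# Hypothetical rows: the second one holds a non-numeric document number.
rows = [
    {"dok_NrPelnyOryg": "2753", "dok_PlatnikId": 174},
    {"dok_NrPelnyOryg": "garbi czerwiec B", "dok_PlatnikId": 999},
]


def convert_first(data):
    # The conversion runs on every row, so the bad value raises ValueError.
    return [
        r for r in data
        if int(r["dok_NrPelnyOryg"]) == 2753 and r["dok_PlatnikId"] == 174
    ]


def filter_first(data):
    # The other predicate removes the bad row before it is ever converted.
    return [
        r for r in data
        if r["dok_PlatnikId"] == 174 and int(r["dok_NrPelnyOryg"]) == 2753
    ]


def compare_as_text(data):
    # Comparing text with text needs no conversion at all.
    return [
        r for r in data
        if r["dok_NrPelnyOryg"] == "2753" and r["dok_PlatnikId"] == 174
    ]


print(filter_first(rows))
print(compare_as_text(rows))
try:
    convert_first(rows)
except ValueError as exc:
    print("conversion failed:", exc)
```

Only the text comparison is independent of evaluation order, which is why passing the value as a varchar is the robust fix.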
All those equality comparison operations are subject to the Data Type Precedence rules of SQL Server:
When an operator combines two expressions of different data types, the rules for data type precedence specify that the data type with the lower precedence is converted to the data type with the higher precedence.
Since character types have lower precedence than int types, the query is basically the same as:
FROM [dok__Dokument]
WHERE cast([dok_NrPelnyOryg] as int) = 2753
This has two effects:
it makes all indexes on columns involved in the WHERE clause useless
it can cause conversion errors.
You're not the first to have this problem, in fact several CSS cases I faced had me eventually write an article about this: On SQL Server boolean operator short-circuit.
The correct solution to your problem is that if the field value is numeric, then the column type should be numeric. Since you say that the data come from a 3rd party application you cannot change, the best solution is to abandon the vendor of this application and pick one that knows what it is doing. Short of that, you need to search for character values on character columns:
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = '2753'
In managed ADO.NET parlance, this means you use a SqlCommand as follows:
SqlCommand cmd = new SqlCommand (@" SELECT ...
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = @nrPelnyOryg
... ");
cmd.Parameters.Add("@nrPelnyOryg", SqlDbType.VarChar).Value = "2754";
Just make sure you don't fall into the easy trap of passing in an NVARCHAR parameter (Unicode) for comparing with a VARCHAR column, since the same data type precedence rules quoted before will coerce the comparison to occur on the NVARCHAR type, thus rendering indexes, again, useless. The easiest way to fall into this trap is to use the dreaded AddWithValue and pass in a string value.
Your query stopped working because someone inserted a text string into the field you are querying with an INT. Up until that time it was possible to implicitly convert the data, but that's no longer the case.
I'd go check your data and, more importantly, the model. As Aaron said, do you need to allow strings in that field? If not, change the data type to prevent this from happening in the future.

Force numerical order on a SQL Server 2005 varchar column, containing letters and numbers?

I have a column containing the strings 'Operator (1)' and so on until 'Operator (600)' so far.
I want to get them numerically ordered and I've come up with
select colname from table order by
cast(replace(replace(colname,'Operator (',''),')','') as int)
which is very very ugly.
Better suggestions?
It's that, InStr()/SubString(), changing Operator(1) to Operator(001), storing the n in Operator(n) separately, or creating a computed column that hides the ugly string manipulation. What you have seems fine.
If you really have to leave the data in the format you have - and adding a numeric sort order column is the better solution - then consider wrapping the text manipulation up in a user defined function.
select colname from table order by dbo.udfSortOperator(colname)
It's less ugly and gives you some abstraction. There's an additional overhead from the function call, but on a table containing low thousands of rows in a not-too-heavily hit database server it's not a major concern. Make notes in the function to optimise later as required.
My answer would be to change the problem. I would add an operatorNumber field to the table if that is possible. Change the update/insert routines to extract the number and store it. That way the string conversion hit is only once per record.
The ordering logic would require the string conversion every time the query is run.
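The trade-off between converting at query time and storing the number once can be sketched as follows. The sketch uses Python's built-in sqlite3 module as a stand-in, not SQL Server 2005; the table name `operators` and the column name `operator_number` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the original text plus a separately stored number.
conn.execute("CREATE TABLE operators (colname TEXT, operator_number INTEGER)")

names = ["Operator (10)", "Operator (2)", "Operator (600)", "Operator (1)"]
for name in names:
    # Extract the number once, at insert time.
    number = int(name[len("Operator ("):-1])
    conn.execute("INSERT INTO operators VALUES (?, ?)", (name, number))

# Plain text ordering compares character by character.
text_order = [
    r[0] for r in conn.execute("SELECT colname FROM operators ORDER BY colname")
]

# The query from the question: string manipulation on every run.
cast_order = [
    r[0]
    for r in conn.execute(
        "SELECT colname FROM operators "
        "ORDER BY CAST(REPLACE(REPLACE(colname, 'Operator (', ''), ')', '') AS INTEGER)"
    )
]

# Ordering on the stored number needs no conversion at query time.
stored_order = [
    r[0]
    for r in conn.execute("SELECT colname FROM operators ORDER BY operator_number")
]

print(text_order)
print(cast_order)
print(stored_order)
```

Both numeric orderings agree; the stored column simply moves the string conversion from every query to each insert or update.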
Well, first define the meaning of that column. Is operator a name so you can justify using chars? Or is it a number?
If the field is a name, then you will use chars, and you would want to determine a fixed length. Pad all operator names with zeros on the left. Define naming rules for operators (e.g. no letters, or the codes you would use in a series like "A001").
An index will sort the physical data in the server, and a properly defined text naming scheme will sort the rows in a query. You would want both.
If the operator is a number, then you got the data type for that column wrong and it needs to be changed.
Indexed computed column
If you find yourself ordering on or otherwise querying the operator column often, consider creating a computed column for its numeric value and adding an index for it. This will give you a computed/persistent column (which sounds like an oxymoron, but isn't).