SELECT JOIN +Column - sql

I have the following query :
SELECT A.*, B.* FROM Employee AS A
LEFT JOIN EmployeeHistory AS B ON +B.EmployeeId = CASE
WHEN DeptId=1 THEN SUBSTRING(FunctionRef,2,3)
WHEN DeptId=2 THEN SUBSTRING(FunctionRef,2,2) END
I want to understand the + before B.EmployeeId, since SQL Server isn't throwing an error.

The SQL you originally posted is not valid T-SQL. After you define the alias A for Employee, you have an ON, but the first table in a FROM can't be followed by an ON; ON is for tables that you are joining to. (As an aside: "A" is for Employee? A is for Apple, E is for Employee. See Bad Habits to Kick: Using table aliases like (a, b, c) or (t1, t2, t3).)
If the first ON is meant to be a JOIN, it does work: db<>fiddle. But all the + is doing is acting as a unary plus or a concatenation operator (which depends on the data type of EmployeeId). SELECT +1, + ''; is perfectly valid (though odd) syntax. The + is basically doing nothing.
Disclaimer: The opening paragraph is based on the original SQL the OP posted, which they stated they had copied and pasted from a working environment.

The + is a unary plus sign. It is analogous to the unary minus (-), but it doesn't do anything. If the operand is a string, then it acts as a unary string concatenator, once again doing nothing.
I'm not sure what the purpose is.
You can check this out to get an idea:
select ++++1, + '', +'abc'

Maybe you forgot to include the second line JOIN table2 as B. If you included that line the query could become valid.
The +B.EmployeeId expression can mean 0+B.EmployeeId and thus the plus sign would not have any effect.
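As a rough check of the "does nothing" claim, here is a minimal sketch that runs the join with and without the leading plus. It uses Python's sqlite3 module as a stand-in for SQL Server, and the table contents are made up for illustration, so treat it as an analogy rather than proof of T-SQL behavior.

```python
import sqlite3

# In-memory SQLite database; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmployeeId INTEGER, Name TEXT);
CREATE TABLE EmployeeHistory (EmployeeId INTEGER, Note TEXT);
INSERT INTO Employee VALUES (17, 'Ann'), (42, 'Bob');
INSERT INTO EmployeeHistory VALUES (17, 'hired'), (99, 'orphan');
""")

# Unary plus applied to literals: the values come back unchanged.
literals = conn.execute("SELECT +1, +'abc', -1").fetchone()

# The same LEFT JOIN, once with the leading plus and once without it.
template = """
SELECT E.EmployeeId, H.Note
FROM Employee AS E
LEFT JOIN EmployeeHistory AS H ON {lhs} = E.EmployeeId
ORDER BY E.EmployeeId
"""
with_plus = conn.execute(template.format(lhs="+H.EmployeeId")).fetchall()
without_plus = conn.execute(template.format(lhs="H.EmployeeId")).fetchall()

print(literals)
print(with_plus == without_plus)
```

Both joins return the same rows here, which matches the explanation that the plus is only a sign.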

Related

Find duplicates in case-sensitive query in MS Access

I have a table containing Japanese text, in which I believe that there are some duplicate rows. I want to write a SELECT query that returns all duplicate rows. So I tried running the following query based on an answer from this site (I wasn't able to relocate the source):
SELECT [KeywordID], [Keyword]
FROM Keyword
WHERE [Keyword] IN (SELECT [Keyword]
FROM [Keyword] GROUP BY [Keyword] HAVING COUNT(*) > 1);
The problem is that Access' equality operator treats the two Japanese writing systems - hiragana and katakana - as the same thing, where they should be treated as distinct. Both writing systems have the same phonetic value, although the written characters used to represent the sound are different - e.g. あ (hiragana) and ア (katakana) both represent the sound 'a'.
When I run the above query, however, both of these characters will appear, as according to Access, they're the same character and therefore a duplicate. Essentially it's a case-insensitive search where I need a case-sensitive one.
I got around this issue when doing a simple SELECT to find a Keyword using StrComp to perform a binary comparison, because this method correctly treats hiragana and katakana as distinct. I don't know how I can adapt the query above to use StrComp, though, because it's not directly evaluating one string against another as in the linked question.
Basically what I'm asking is: how can I do a query that will return all duplicates in a table, case-sensitive?
You can use exists instead:
SELECT [KeywordID], [Keyword]
FROM Keyword as k
WHERE EXISTS (SELECT 1
FROM Keyword as k2
WHERE STRCOMP(k2.Keyword, k.KeyWord, 0) = 0 AND
k.KeywordID <> k2.KeywordID
);
Try with a self join:
SELECT k1.[KeywordID], k1.[Keyword], k2.[KeywordID], k2.[Keyword]
FROM Keyword AS k1 INNER JOIN Keyword AS k2
ON k1.[KeywordID] < k2.[KeywordID] AND STRCOMP(k1.[Keyword], k2.[Keyword], 0) = 0
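The difference between the two approaches can be sketched outside Access. The example below is only an analogy: it uses Python's sqlite3 module, a NOCASE collation plays the part of Access' loose equality, a BINARY comparison plays the part of StrComp(..., 0), and ASCII upper/lower case stands in for hiragana/katakana. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE on the column mimics a comparison that treats 'abc' and 'ABC' as equal.
conn.execute(
    "CREATE TABLE Keyword (KeywordID INTEGER PRIMARY KEY, "
    "Keyword TEXT COLLATE NOCASE)"
)
conn.executemany(
    "INSERT INTO Keyword (KeywordID, Keyword) VALUES (?, ?)",
    [(1, "abc"), (2, "ABC"), (3, "abc"), (4, "xyz")],
)

# GROUP BY / IN with the loose comparison: 'ABC' is reported as a duplicate too.
loose = [row[0] for row in conn.execute("""
    SELECT KeywordID FROM Keyword
    WHERE Keyword IN (SELECT Keyword FROM Keyword
                      GROUP BY Keyword HAVING COUNT(*) > 1)
    ORDER BY KeywordID
""")]

# EXISTS with an explicit binary comparison: only exact matches count.
strict = [row[0] for row in conn.execute("""
    SELECT k.KeywordID FROM Keyword AS k
    WHERE EXISTS (SELECT 1 FROM Keyword AS k2
                  WHERE k2.Keyword = k.Keyword COLLATE BINARY
                    AND k.KeywordID <> k2.KeywordID)
    ORDER BY k.KeywordID
""")]

print(loose)
print(strict)
```

The EXISTS form works because the comparison is written out explicitly, so it can be swapped for a binary one, which GROUP BY does not allow.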

Hive - SELECT inside WHEN clause of CASE function gives an error

I am trying to write a query in Hive with a Case statement in which the condition depends on one of the values in the current row (whether or not it is equal to its predecessor). I want to evaluate it on the fly, this way, therefore requiring a nested query, not by making it another column first and comparing 2 columns. (I was able to do the latter, but that's really second-best). Does anyone know how to make this work?
Thanks.
My query:
SELECT * ,
CASE
WHEN
(SELECT lag(field_with_duplicates,1) over (order by field_with_duplicates) FROM my_table b
WHERE b.id=a.id) = a.field_with_duplicates
THEN "Duplicate"
ELSE ""
END as Duplicate_Indicator
FROM my_table a
Error:
java.sql.SQLException: org.apache.spark.sql.AnalysisException: cannot recognize input near 'SELECT' 'lag' '(' in expression specification; line 4 pos 9
Notes:
The reason I needed the complicated 'lag' function is that the unique Id's in the table are not consecutive, but I don't think that's where it's at: I tested by substituting another simpler inner query and got the same error message.
Speaking of 'duplicates', I did search on this issue before posting, but the only SELECT's inside CASE's I found were in the THEN statement, and if that works the same, it suggests mine should work too.
You do not need the subquery inside CASE:
SELECT a.* ,
CASE
WHEN prev_field_with_duplicates = field_with_duplicates
THEN "Duplicate"
ELSE ""
END as Duplicate_Indicator
FROM (select a.*,
lag(field_with_duplicates,1) over (order by field_with_duplicates) as prev_field_with_duplicates
from my_table a
)a
Or you can use lag() directly inside the CASE, without any subquery at all (I'm not sure whether this works in all Hive versions):
CASE
WHEN lag(field_with_duplicates,1) over (order by field_with_duplicates) = field_with_duplicates
THEN "Duplicate"
ELSE ""
END as Duplicate_Indicator
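For anyone who wants to try the derived-table pattern without a Hive cluster, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The sample rows are invented, and the extra id in the ORDER BY is my own addition to make the row order deterministic; Hive itself may behave differently in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, field_with_duplicates TEXT)")
# Non-consecutive ids, as described in the question.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(10, "a"), (25, "a"), (31, "b"), (47, "c"), (52, "c")],
)

# Compute the previous value in a derived table, then compare it in the CASE.
rows = conn.execute("""
    SELECT t.id, t.field_with_duplicates,
           CASE WHEN t.prev_value = t.field_with_duplicates
                THEN 'Duplicate' ELSE '' END AS Duplicate_Indicator
    FROM (SELECT a.*,
                 lag(field_with_duplicates, 1)
                     OVER (ORDER BY field_with_duplicates, id) AS prev_value
          FROM my_table a) t
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

The first row of each run of equal values is left blank because its predecessor differs (or is NULL for the very first row), so only the repeats are flagged.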
Thanks to @MatBailie for the answer in his comment. Don't I feel silly...
Resolved

stream analytics query get error column name doesn't exist, but it does?

When I run my query in Management Studio it works fine, but in a Stream Analytics job it throws an error: Query compilation error: Invalid column name: 'afkorting'. Column with such name does not exist.
I downloaded the input tables to check if something went wrong with uploading, but that file does have that column name (and I double checked for capital letters, miswriting etc), so how can I fix this?
This is my query:
; WITH Check AS
(
SELECT afkorting, *
FROM Reizen RE
LEFT JOIN Gegevens AP
ON RE.ID = AP.code
)
SELECT *
FROM Check CH
JOIN Model VM
ON CH.afkorting = VM.Station
WHERE VM.h_station = VM.v_station
AND DATEPART(hour, CH.MsgReportDate) = VM.start_uur
AND (DATEPART(minute, CH.MsgReportDate) BETWEEN VM.start_minuut AND VM.eind_minuut)
AND DATEPART(weekday, CH.MsgReportDate) = VM.weekdag
Hope someone can help me!
*PROBLEM SOLVED: you need to list all column names explicitly, so not SELECT * but SELECT column1, column2, and qualify them with the table aliases, in my case AP.column1, RE.column2, etc.*
To summarize the comments above: I did some testing of the Stream Analytics query language elements WITH, SELECT and JOIN. Here are my results for this issue.
Without a JOIN, using * alongside column names in the WITH scope runs correctly on ASA.
With a JOIN, you have to list every column you want, without *. The reason seems to be to avoid ambiguity from column name conflicts.
You need to list all column names, so not SELECT * but SELECT column1, column2, and use the given prefixes of the table, for example in my case: AP.column1, RE.column2, etc.
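To illustrate the kind of name conflict that an explicit, prefixed column list avoids, here is a minimal sketch using Python's sqlite3 module. It does not reproduce Stream Analytics behavior; it only shows that SELECT * over a join can hand back two columns with the same name. The shared column naam and the sample rows are hypothetical, since the real table definitions are not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas: both tables happen to have a column called naam.
conn.executescript("""
CREATE TABLE Reizen (ID INTEGER, naam TEXT);
CREATE TABLE Gegevens (code INTEGER, afkorting TEXT, naam TEXT);
INSERT INTO Reizen VALUES (1, 'trip-1');
INSERT INTO Gegevens VALUES (1, 'UT', 'Utrecht');
""")

# SELECT * over the join: the result set carries the name 'naam' twice.
star = conn.execute(
    "SELECT * FROM Reizen RE LEFT JOIN Gegevens AP ON RE.ID = AP.code"
)
star_names = [col[0] for col in star.description]

# Explicit, prefixed and aliased columns: every output name is unique.
explicit = conn.execute("""
    SELECT RE.ID AS ID, RE.naam AS reis_naam,
           AP.afkorting AS afkorting, AP.naam AS station_naam
    FROM Reizen RE LEFT JOIN Gegevens AP ON RE.ID = AP.code
""")
explicit_names = [col[0] for col in explicit.description]

print(star_names)
print(explicit_names)
```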

What does =+ mean in an Oracle query

Normally in the C++ programming language, the plus means addition, as in the example below:
int x;
x += 1;
However, in a PL/SQL query, I am confused by a similar-looking usage that does not seem to mean addition. In that case, what is the meaning of =+ ?
Select c.* From alf_numeric a, run_of_id b, tail_of_st c
WHERE category_id IN(33,36) AND a.flow_id =+ b.flow_id
Any idea?
This:
...
FROM alf_numeric a, run_of_id b
WHERE a.flow_id = b.flow_id (+)
would mean:
...
FROM alf_numeric a
LEFT JOIN run_of_id b
ON a.flow_id = b.flow_id
My guess is that:
a.flow_id =+b.flow_id
is parsed as the (simple):
a.flow_id = (+b.flow_id)
and so is the same as:
a.flow_id = b.flow_id
It looks to me that the '+' part of '=+' is a no-op. Try running the following statements:
CREATE TABLE test1 (v1 NUMBER);
INSERT INTO test1(v1) VALUES (-1);
INSERT INTO test1(v1) VALUES (1);
CREATE TABLE test2(v2 NUMBER);
INSERT INTO test2(v2) VALUES (-1);
INSERT INTO test2(v2) VALUES (1);
SELECT *
FROM test1 t1
INNER JOIN test2 t2
ON (t1.v1 = t2.v2)
WHERE t1.v1 =+ t2.v2;
which returns
V1 V2
-1 -1
1 1
Thus, it appears the '+' operator isn't doing anything; it's just returning whatever is there. As a test of this, run the following statement:
SELECT V1, +V1 AS PLUS_V1, ABS(V1) AS ABS_V1, -V1 AS NEG_V1 FROM TEST1;
and you'll find it returns
V1 PLUS_V1 ABS_V1 NEG_V1
-1 -1 1 1
1 1 1 -1
which seems to confirm that a unary '+' is effectively a no-op.
Share and enjoy.
In your SELECT statement, the clause
a.flow_id =+b.flow_id
is mainly a comparison. It tests whether the value of a.flow_id is equal to the value of b.flow_id. So the + operator in this case is an arithmetic operator working on a single operand. It turns the sign of the value to positive.
Update:
It seems I was slightly wrong. The operator doesn't change the sign. It has basically no effect.
It's probably a typo for the old left join syntax in Sybase, which would be =* instead of =+. If that's true, you can rewrite the query in a clearer way using joins, like:
select c.*
From alf_numeric a
left join
run_of_id b
on a.flow_id = b.flow_id
cross join
tail_of_st c
WHERE category_id IN(33,36)
Which would basically return the entire table tail_of_st for each entry in alf_numeric, with a filter on category_id (not sure what table that's in.) A mysterious query!
In your C++ example, the + designates the positive sign, it has nothing to do with addition. Just as you can write x = -1, you can also write x = +1 (which is equal to x = 1, since + as sign can be omitted - and is, in most cases, since it does in fact have no effect whatsoever). But both these cases are an assignment in C++, not an addition - no actual calculation is involved; you're probably thinking of x += 1 (the order is important!), which would increase x by 1.
In your SQL query, I think the + is supposed to have a special meaning - it should probably indicate an outer join. Although if I read that document correctly, it should actually be a.flow_id = b.flow_id (+); as it is here, I doubt that the query parser will recognize it as an outer join, but will instead just interpret it as a positive sign, just as in your C++ example.
I believe that's a join syntax thing. The standard way is to say something like tableA join tableB on <whatever> but some DBs, such as Sybase and Oracle support alternate syntax. In Sybase, it's =* or *=. Postgres probably does the same. From the format, I'd guess a right outer join, but it's hard to say. I looked in the PG docs, but didn't immediately see it.
BTW, in C you'd have x += 1 not x = +1.
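The += versus =+ distinction is easy to try out. The sketch below uses Python rather than C or C++, on the assumption that the operators behave the same way for this particular case: += is an augmented assignment, while =+ is just an assignment followed by a unary plus.

```python
x = 5
x += 1       # augmented assignment: adds 1 to x

y = 5
y =+ 1       # parsed as y = (+1): plain assignment of positive one

z = -3
plus_z = +z  # unary plus does not flip the sign of a negative value

print(x, y, plus_z)
```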

What applications are there for NULLIF()?

I just had a trivial but genuine use for NULLIF(), for the first time in my career in SQL. Is it a widely used tool I've just ignored, or a nearly-forgotten quirk of SQL? It's present in all major database implementations.
If anyone needs a refresher, NULLIF(A, B) returns the first value, unless it's equal to the second in which case it returns NULL. It is equivalent to this CASE statement:
CASE WHEN A <> B OR B IS NULL THEN A END
or, in C-style syntax:
A == B || A == null ? null : A
So far the only non-trivial example I've found is to exclude a specific value from an aggregate function:
SELECT COUNT(NULLIF(Comment, 'Downvoted'))
This has the limitation of only allowing one to skip a single value; a CASE, while more verbose, would let you use an expression.
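A quick sketch of that trade-off, run through Python's sqlite3 module: the table name Votes, the second excluded value 'Spam', and the sample rows are all made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Votes (Comment TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO Votes VALUES (?)",
    [("Great",), ("Downvoted",), (None,), ("Downvoted",), ("Nice",), ("Spam",)],
)

counts = conn.execute("""
    SELECT COUNT(*),                                -- every row
           COUNT(Comment),                          -- non-NULL comments
           COUNT(NULLIF(Comment, 'Downvoted')),     -- skips one specific value
           COUNT(CASE WHEN Comment NOT IN ('Downvoted', 'Spam')
                      THEN Comment END)             -- CASE can skip several
    FROM Votes
""").fetchone()

print(counts)
```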
For the record, the use I found was to suppress the value of a "most recent change" column if it was equal to the first change:
SELECT Record, FirstChange, NULLIF(LatestChange, FirstChange) AS LatestChange
This was useful only in that it reduced visual clutter for human consumers.
I rather think that
NULLIF(A, B)
is syntactic sugar for
CASE WHEN A = B THEN NULL ELSE A END
But you are correct: it is mere syntactic sugar to aid the human reader.
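The equivalence can be spot-checked, including the NULL cases, with a small sketch. It uses Python's sqlite3 module, so it only shows that SQLite agrees; other engines are assumed, not verified, to behave the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Pairs (A, B) covering equal, different, and NULL on either or both sides.
pairs = [(1, 1), (1, 2), (None, 1), (1, None), (None, None)]

nullif_results = []
case_results = []
for a, b in pairs:
    via_nullif, via_case = conn.execute(
        "SELECT NULLIF(?, ?), CASE WHEN ? = ? THEN NULL ELSE ? END",
        (a, b, a, b, a),
    ).fetchone()
    nullif_results.append(via_nullif)
    case_results.append(via_case)

print(nullif_results)
print(case_results)
```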
I often use it where I need to avoid the Division by Zero exception:
SELECT
COALESCE(Expression1 / NULLIF(Expression2, 0), 0) AS Result
FROM …
Three years later, I found a material use for NULLIF: using NULLIF(Field, '') translates empty strings into NULL, for equivalence with Oracle's peculiar idea about what "NULL" represents.
NULLIF is handy when you're working with legacy data that contains a mixture of null values and empty strings.
Example:
SELECT COALESCE(NULLIF(firstColumn, ''), secondColumn) FROM table WHERE this = that
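Here is a minimal sketch of that fallback pattern using Python's sqlite3 module. The table name Contacts and its rows are invented; only the COALESCE(NULLIF(...)) expression comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Contacts (firstColumn TEXT, secondColumn TEXT)")
conn.executemany(
    "INSERT INTO Contacts VALUES (?, ?)",
    [("alice", "x"), ("", "fallback1"), (None, "fallback2")],
)

# NULLIF turns '' into NULL, so COALESCE treats empty and NULL alike.
values = [row[0] for row in conn.execute("""
    SELECT COALESCE(NULLIF(firstColumn, ''), secondColumn)
    FROM Contacts
    ORDER BY rowid
""")]

print(values)
```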
Aggregates such as COUNT skip NULLs, so a group with no matching rows comes back as 0 rather than NULL. I could see NULLIF being handy when you want to undo that behavior. In fact, this came up in a recent answer I provided. If I had remembered NULLIF, I probably would have written the following:
SELECT student,
NULLIF(coursecount,0) as courseCount
FROM (SELECT cs.student,
COUNT(os.course) coursecount
FROM #CURRENTSCHOOL cs
LEFT JOIN #OTHERSCHOOLS os
ON cs.student = os.student
AND cs.school <> os.school
GROUP BY cs.student) t