Empty Char in Where Clause? - sql

I have the following table:
CREATE TABLE SOAUDIT
(SOU_USER CHAR(8 BYTE),
SOU_ORDREF CHAR(8 BYTE),
SOU_TYPE CHAR(1 BYTE),
SOU_DESC CHAR(50 BYTE))
There is a unique index defined on the first three columns (but no primary key, which is something we have no control over).
And in the table there are some records:
| SOU_USER | SOU_ORDREF | SOU_TYPE | SOU_DESC |
|----------|------------|----------|------------------|
| proust | | S | recherche |
| joyce | 12345678 | S | pelurious |
| orwell | 19841984 | T | doubleplusungood |
| camus | 34598798 | P | peiner |
On closer inspection it appears that the value in SOU_ORDREF for user 'proust' is an empty char string of 8 characters.
Now, what I need to be able to do is query this table based on its unique values, which I will receive from a SQL Server database (just to complicate matters nicely). In the case of SOU_ORDREF the search value will be a blank field:
SELECT *
FROM SOAUDIT
WHERE (SOU_USER, TRIM(SOU_ORDREF), SOU_TYPE)
IN (('proust', null, 'S'))
This doesn't return the record I am looking for.
When I rewrite the query as follows:
SELECT *
FROM SOAUDIT
WHERE (SOU_USER, SOU_TYPE)
IN (('proust', 'S'))
AND TRIM(sou_ordref) is null
Then I do get the desired record.
However, I want to be able to pass in more than one record into the WHERE clause so the second version doesn't really help.

Oracle -- by default -- treats empty strings and NULL as the same thing.
This can cause awkward behavior, because comparisons to NULL almost never return true. So a simple expression such as where sou_ordref = '' never returns true, because it is equivalent to where sou_ordref = NULL.
Here is one workaround:
SELECT *
FROM SOAUDIT
WHERE (SOU_USER, COALESCE(TRIM(SOU_ORDREF), ' '), SOU_TYPE) IN
( ('proust', ' ', 'S') )
Note that this replaces the empty string (NULL) with a space. It then compares the results to a space.
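Because the keys come from another database and several may be needed at once, the IN list can be generated rather than typed by hand. The following is a minimal sketch, not a definitive implementation: build_soaudit_query is a hypothetical helper, the :1-style positional placeholders are an assumption about the driver, and padding SOU_USER to 8 characters assumes bind values are compared with the CHAR column without blank padding.

```python
def build_soaudit_query(keys):
    """Return (sql, params) matching several (user, ordref, type) keys at once.

    Hypothetical helper: placeholder style and padding are assumptions.
    """
    placeholders = []
    params = []
    for user, ordref, sou_type in keys:
        # Blank or missing order refs become a single space, mirroring
        # COALESCE(TRIM(SOU_ORDREF), ' ') on the Oracle side.
        ordref = (ordref or "").strip() or " "
        n = len(params)
        placeholders.append(f"(:{n + 1}, :{n + 2}, :{n + 3})")
        # Assumption: pad the user to the CHAR(8) column width.
        params.extend([user.ljust(8), ordref, sou_type])
    sql = (
        "SELECT * FROM SOAUDIT "
        "WHERE (SOU_USER, COALESCE(TRIM(SOU_ORDREF), ' '), SOU_TYPE) IN ("
        + ", ".join(placeholders)
        + ")"
    )
    return sql, params


sql, params = build_soaudit_query([("proust", "", "S"), ("joyce", "12345678", "S")])
print(sql)
print(params)
```

The returned pair would then be handed to the driver's execute call; drop the ljust padding if your driver already binds the value as CHAR.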

Try it this way:
SELECT *
FROM SOAUDIT
WHERE SOU_USER = 'proust'
AND SOU_TYPE = 'S'
AND TRIM(sou_ordref) IS NULL
In Oracle, TRIM of an all-blank CHAR value yields NULL, so the test has to be IS NULL; a comparison with '' never evaluates to true.


In Spark SQL, how do I get a person's name to show up on every row for their ID? One row is from when they logged in and one from when they did not; the ID shows on both rows

I have a table with an ID for each person. The ID is unique to that person. They could show up multiple times in multiple categories, but in the source their name shows up as null if they have not logged in. How can I make it so that all rows with a given ID show the person's name?
This is a small example, but the real table has tons of rows, so I can't just pick one specific name to replace the nulls.
| ID | First Name| Last Name| Login| Date |
|--------|---------- |----------|------|-----------|
|1245 |Matt | Carter | Yes | 12-03-2022|
|2344 |Emily | Seuss | Yes | 12-01-2022|
|1245 |NULL | NULL | No | 11-04-2022|
|4266 |Drew | Bob | Yes | 10-03-2022|
I had filtered the df for non null names and tried to join it back to the original df and get the name columns that don't have null.
df=spark.createDataFrame(data,["ID","FirstName","LastName","Login","Date"])
nameDf=df\
.filter("FirstName is not null or LastName is not null")\
.selectExpr("id as nameDf_ID","FirstName as nameDf_FirstName","LastName as nameDf_LastName")\
.distinct()
df.join(nameDf,df.ID==nameDf.nameDf_ID)\
.selectExpr("ID","nameDf_FirstName as FirstName","nameDf_LastName as LastName","Login","Date")\
.show()
+----+---------+--------+-----+----------+
| ID|FirstName|LastName|Login| Date|
+----+---------+--------+-----+----------+
|1245| Matt| Carter| Yes|12-03-2022|
|1245| Matt| Carter| No|11-04-2022|
|2344| Emily| Seuss| Yes|12-01-2022|
|4266| Drew| Bob| Yes|10-03-2022|
+----+---------+--------+-----+----------+
You need to create a dataframe that can serve as a lookup for the names against a specific ID. You can do that by filtering only non-NULL values of names like the previous answer.
// Needs: import spark.implicits._ and import org.apache.spark.sql.functions.coalesce
val names = df.filter($"FirstName".isNotNull || $"LastName".isNotNull)
  .select("ID", "FirstName", "LastName").distinct()
You can then use the names dataframe to fill out the NULL names in the original dataframe.
df.as("original").join(names.as("names"), Seq("ID"))
  .select($"ID",
    coalesce($"original.FirstName", $"names.FirstName").as("FirstName"),
    coalesce($"original.LastName", $"names.LastName").as("LastName"),
    $"original.Login",
    $"original.Date"
  )
coalesce results in selecting the first value and defaulting to the second value if the first one is null.
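The lookup-and-coalesce idea is not specific to Spark. As a rough illustration only, here is the same logic in plain SQL run through Python's built-in sqlite3 module, with the sample rows from the question; the table name people and the aliases o and n are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (ID INTEGER, FirstName TEXT, LastName TEXT, Login TEXT, Date TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        (1245, "Matt", "Carter", "Yes", "12-03-2022"),
        (2344, "Emily", "Seuss", "Yes", "12-01-2022"),
        (1245, None, None, "No", "11-04-2022"),
        (4266, "Drew", "Bob", "Yes", "10-03-2022"),
    ],
)

# n is the lookup of known names per ID; COALESCE keeps the original
# value when present and falls back to the lookup when it is NULL.
rows = conn.execute(
    """
    SELECT o.ID,
           COALESCE(o.FirstName, n.FirstName) AS FirstName,
           COALESCE(o.LastName, n.LastName) AS LastName,
           o.Login,
           o.Date
    FROM people AS o
    LEFT JOIN (
        SELECT DISTINCT ID, FirstName, LastName
        FROM people
        WHERE FirstName IS NOT NULL OR LastName IS NOT NULL
    ) AS n ON n.ID = o.ID
    ORDER BY o.ID, o.Date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

A LEFT JOIN is used here so that an ID with no known name at all still appears, with NULL names, instead of disappearing from the result.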

BigQuery select multiple tables with different column names

Consider the following BigQuery tables schemas in my dataset my_dataset:
Table_0001: NAME (string); NUMBER (string)
Table_0002: NAME(string); NUMBER (string)
Table_0003: NAME(string); NUMBER (string)
...
Table_0865: NAME (string); CODE (string)
Table_0866: NAME(string); CODE (string)
...
I now want to union all tables using :
select * from `my_dataset.*`
However, this will not yield the CODE column of the second set of tables. From my understanding, the schema of the first table in the dataset will be adopted instead.
So the result will be something like:
| NAME | NUMBER |
__________________
| John | 123456 |
| Mary | 123478 |
| ... | ...... |
| Abdul | null |
| Ariel | null |
I tried to tap into the INFORMATION_SCHEMA so as to select the two sets of tables separately and then union them:
with t_code as (
select
table_name
from my_dataset.INFORMATION_SCHEMA.COLUMNS
where column_name = 'CODE'
)
select t.NAME, t.CODE as NUMBER from `my_dataset.*` as t
where _TABLE_SUFFIX in (select table_name from t_code)
However, still the script will look to the first table of my_dataset for its schema and will return: Error Running Query: Name CODE not found inside t.
So now I'm at a loss: how can I union all my tables without having to union them one by one, i.e. how do I select CODE as NUMBER in the second set of tables?
Note: Although it seems the question was asked over here, the accepted answer did not seem to actually respond to the question (as far as I'm concerned).
One trick you can use is to first gather all the codes by running
create table `my_another_dataset.codes` as
select * from `my_dataset.*` where not code is null
Then do a trivial "fake" update on any one table that has the NUMBER column. This will make the schema with the NUMBER column the default, so now you can gather all the numbers:
create table `my_another_dataset.numbers` as
select * from `my_dataset.*` where not number is null
Finally, you can do a simple union:
select * from `my_another_dataset.numbers` union all
select * from `my_another_dataset.codes`
Note: see also my comment below your question
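A different technique, offered only as a sketch: generate the UNION ALL text from a table-to-column mapping (which could be read from INFORMATION_SCHEMA.COLUMNS) and run the generated query. This does union the tables one by one, but nobody has to type it. build_union_query and its columns_by_table argument are hypothetical names, not part of any BigQuery client.

```python
def build_union_query(dataset, columns_by_table):
    """Build one UNION ALL query over all tables.

    columns_by_table maps each table name to the name of its second
    column ('NUMBER' or 'CODE'); that column is exposed as NUMBER.
    """
    parts = []
    for table in sorted(columns_by_table):
        column = columns_by_table[table]
        parts.append(f"SELECT NAME, {column} AS NUMBER FROM `{dataset}.{table}`")
    return "\nUNION ALL\n".join(parts)


query = build_union_query(
    "my_dataset",
    {"Table_0865": "CODE", "Table_0001": "NUMBER"},
)
print(query)
```

Table and column names are interpolated directly into the text, so the mapping should come from a trusted source such as the dataset's own metadata.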
SELECT
borrow.id AS `borrowId`,
IF(borrow.created_date IS NULL, '', borrow.created_date) AS `borrowCreatedDate`,
IF(borrow.return_date IS NULL, '', borrow.return_date) AS `borrowReturnDate`,
IF(borrow.return_date IS NULL, '0', '1') AS `borrowIsReturn`,
IF(person.card_identity IS NULL, '', person.card_identity) AS `personCardIdentity`,
IF(person.fullname IS NULL, '', person.fullname) AS `personFullname`,
IF(person.phone_number IS NULL, '', person.phone_number) AS `personPhoneNumber`,
IF(book.book_name IS NULL, '', book.book_name) AS `bookName`,
IF(book.year IS NULL, '', book.year) AS `bookYear`
FROM tbl_tbl_borrow AS borrow
LEFT JOIN tbl_person AS person
ON person.card_identity = borrow.person_card_identity
LEFT JOIN tbl_book AS book
ON book.unique_id = borrow.book_unique_id
ORDER BY
borrow.return_date ASC, person.fullname ASC;

What is the maximum value for STRING ordering in SQL (SQLite)?

I have a SQLite database and I want to order my results by ascending order of a String column (name). I want the null-valued rows to be last in ascending order.
Moreover, I am doing some filtering on the same column (WHERE name>"previously obtained value"), which filters out the NULL-valued rows, which I do not want. Plus, the version of SQLite I'm using (I don't have control over this) does not support NULLS LAST. Therefore, to keep it simple I want to use IFNULL(name,"Something") in my ORDER BY and my comparison.
I want this "Something" to be as large as possible, so that my null-valued rows are always last. I have texts in Japanese and Korean, so I can't just use "ZZZ".
Therefore, I see two possible solutions. First, use the "maximum" character used by SQLite in the default ordering of strings, do you know what this value is or how to obtain it? Second, as the cells can contain any type in SQLite, is there a value of any other type that will always be considered larger than any string?
Example:
+----+-----------------+---------------+
| id | name | othercol |
+----+-----------------+---------------+
| 1 | English name | hello |
| 2 | NULL | hi |
| 3 | NULL | hi hello |
| 4 | 暴鬼 | hola |
| 5 | NULL | bonjour hello |
| 6 | 아바키 | hello bye |
+----+-----------------+---------------+
Current request:
SELECT * FROM mytable WHERE othercol LIKE "%hello%" AND (name,id)>("English name",1) ORDER BY name,id
Result (by ids): 6
Problems: NULL names are filtered out because of the comparison, and when I have no comparison they are shown first.
What I think would solve these problems:
SELECT * FROM mytable WHERE othercol LIKE "%hello%" AND (IFNULL(name,"Something"),id)>("English name",1) ORDER BY IFNULL(name,"Something"),id
But I need "Something" to be larger than any string I might encounter.
Expected result: 6, 3, 5
I think a simpler way is to use nulls last (supported from SQLite 3.30.0 onward):
order by column nulls last
This works with both ascending and descending sorts. And it has the advantage that it can make use of an index on the column, which coalesce() would probably prevent.
Change your WHERE clause to:
WHERE SOMECOL > "previously obtained value" OR SOMECOL IS NULL
so the NULLs are not filtered out (since you want them).
You can sort the NULLs last, like this:
ORDER BY SOMECOL IS NULL, SOMECOL
The expression:
SOMECOL IS NULL
evaluates to 1 (True) or 0 (False), so the values that are not NULL will be sorted first.
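To see the two pieces working together, here is a small sketch using Python's built-in sqlite3 module and the sample rows from the question. It assumes the filter is meant as LIKE '%hello%' and spells out the (name, id) comparison without row values so it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, othercol TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "English name", "hello"),
        (2, None, "hi"),
        (3, None, "hi hello"),
        (4, "暴鬼", "hola"),
        (5, None, "bonjour hello"),
        (6, "아바키", "hello bye"),
    ],
)

# Last row already seen by the caller.
last_name, last_id = "English name", 1

ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM mytable
        WHERE othercol LIKE '%hello%'
          AND (name > ? OR (name = ? AND id > ?) OR name IS NULL)
        ORDER BY name IS NULL, name, id
        """,
        (last_name, last_name, last_id),
    )
]
print(ids)  # [6, 3, 5]
```

This condition only covers the case where the last row seen had a non-NULL name; once paging reaches the NULL rows, the filter would have to become name IS NULL AND id > last_id.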
Edit
If you want a string that is greater than any name in the table, then you can get it by:
select max(name) || ' ' from mytable
so in your code use:
ifnull(name, (select max(name) || ' ' from mytable))
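Finally found a solution. For anyone looking for a character that sorts after practically any other (when I'm posting this, the unicode table might get expanded), here's your guy:
CAST(x'f48083bf' AS TEXT).
These bytes are the UTF-8 encoding of U+1000FF, a private-use code point. Under the default BINARY collation in a UTF-8 database, only code points above it, up to U+10FFFF (x'f48fbfbf'), would sort later.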
Example in my case:
SELECT * FROM mytable WHERE othercol LIKE "%hello%" AND (IFNULL(name,CAST(x'f48083bf' AS TEXT)),id)>("English name",1) ORDER BY IFNULL(name,CAST(x'f48083bf' AS TEXT)),id
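A quick way to check what that sentinel actually is and how it sorts, again with Python and its built-in sqlite3 module. This assumes the default BINARY collation and a UTF-8 database, which is what an in-memory connection gives you.

```python
import sqlite3

# Decode the sentinel bytes to see which code point they encode.
sentinel = bytes.fromhex("f48083bf").decode("utf-8")
print(hex(ord(sentinel)))  # 0x1000ff

conn = sqlite3.connect(":memory:")
after_korean, below_max = conn.execute(
    "SELECT CAST(x'f48083bf' AS TEXT) > ?, CAST(x'f48083bf' AS TEXT) < ?",
    ("아바키", "\U0010FFFF"),
).fetchone()

# 1 means the comparison is true.
print(after_korean)  # 1: the sentinel sorts after the Korean name
print(below_max)     # 1: U+10FFFF would still sort after the sentinel
```

So the sentinel sorts after ordinary text, but it is not the absolute maximum: code points from U+100100 to U+10FFFF would still sort after it.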

Not able to compare columns in SQL Server

I am using the following SQL code to compare two nvarchar columns. But the code is showing incorrect results:
SELECT
DP.NAME, DP.sid, SU.sid,
CASE
WHEN DP.sid = SU.sid
THEN 'TRUE'
ELSE 'FALSE'
END AS DESDIREDRESULT
FROM
#SQLLOGINS DP
INNER JOIN
SYS.sysusers SU ON DP.name COLLATE DATABASE_DEFAULT = SU.name COLLATE DATABASE_DEFAULT
In the code, I am doing an inner join of a temp table, #SQLLOGINS, with sys.sysusers. This temp table holds the NAME and SID from sys.sql_logins.
The issue: where both SIDs are the same, the output should be 'TRUE' (screenshot attached), but it returns 'FALSE'.
I am not sure where I am going wrong in comparing the SID columns.
You are mixing the types. Try this out:
DECLARE @mockupTable TABLE(ID INT IDENTITY,SomeString VARCHAR(100),SomeBinary VARBINARY(100));
INSERT INTO @mockupTable VALUES('0x1234',0x1234);
INSERT INTO @mockupTable VALUES(0x6565,0x6565); --implicit cast!
INSERT INTO @mockupTable VALUES('ee', CAST('ee' AS VARBINARY(100))); --explicit cast!
SELECT *, CASE WHEN SomeString=SomeBinary THEN 'TRUE' ELSE 'FALSE' END FROM @mockupTable;
The result
+----+------------+------------+--------------------+
| ID | SomeString | SomeBinary | |
+----+------------+------------+--------------------+
| 1 | 0x1234 | 0x1234 | FALSE |
+----+------------+------------+--------------------+
| 2 | ee | 0x6565 | TRUE |
+----+------------+------------+--------------------+
| 3 | ee | 0x6565 | TRUE |
+----+------------+------------+--------------------+
What happens here?
The first row looks the same but isn't, while rows 2 and 3 are obviously different yet are reported as equal.
The reason: the binary value 0x1234 and the string '0x1234' are not the same, although they look as if they were.
Just try SELECT CAST('0x1234' AS VARBINARY(100)). The result is 0x307831323334, which is - obviously! - not the same as 0x1234. It is actually a list of hexadecimal character codes: 30 (0), 78 (x), 31 (1), 32 (2), 33 (3), 34 (4).
But in rows 2 and 3 you can see that the binary value of a string can be compared with a real binary. Doing this, you can see that the string ee consists of two lowercase letters e, each with the ASCII code 0x65 (101 decimal). So 0x6565 translates to ee.
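The same byte-level view can be reproduced outside SQL Server. A small Python sketch, purely for illustration (the variable names are made up):

```python
# The six characters '0x1234' stored as text, one byte per character.
as_text = "0x1234".encode("ascii")
print(as_text.hex())  # 307831323334

# The real two-byte binary value 0x1234.
real_binary = bytes.fromhex("1234")
print(as_text == real_binary)  # False

# The binary value 0x6565 read as ASCII text.
ee_binary = bytes.fromhex("6565")
print(ee_binary.decode("ascii"))  # ee
```

The text '0x1234' is six bytes long and the binary 0x1234 only two, which is why the comparison in row 1 fails even though the two values print identically.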
There are typically 3 causes of things like this:
Hidden characters (line feeds etc)
Incompatible data types or code pages
Trailing spaces
I suggest you cast/convert both attributes and throw a trim in for good measure.

SQL - Multiple select filter: Combine filter conditions to get proper results

I'm working on a filter where the user can choose different conditions for the end output. Right now I'm doing the construction of the SQL query, but whenever more conditions are selected, it doesn't work.
Example of the advalues table.
+----+-----------+---------------+------------+
| id | listingId | value | identifier |
+----+-----------+---------------+------------+
| 1 | 1a | Alaskan Husky | race |
+----+-----------+---------------+------------+
| 2 | 1a | Højt | activity |
+----+-----------+---------------+------------+
| 3 | 1c | Akita | race |
+----+-----------+---------------+------------+
| 4 | 1c | Mellem | activity |
+----+-----------+---------------+------------+
As you can see, there's a different row for each advalue.
The outcome I expect
Let's say the user has checked/ticked the checkbox for the race where it says "Alaskan Husky", then it should return the listingId for the match (once). If the user has selected both "Alaskan Husky" and activity level to "Low" then it should return nothing, if the activity level is either "Mellem" or "Højt" (medium, high), then it should return the listingId for where the race is "Alaskan Husky" only, not "Akita". I hope you understand what I'm trying to accomplish.
I tried something like this, which returns nothing.
SELECT * FROM advalues WHERE (identifier="activity" AND value IN("Mellem","Højt")) AND (identifier="race" AND value IN("Alaskan Husky"))
By the way, I want to select distinct listingId as well, so it only returns unique listingId's.
I will continue to search around for solutions, which I've been doing for the past few hours, but wanted to post here too, since I haven't been able to find anything that helped me yet. Thanks!
You can split the restrictions on identifier across two aliases of the table, one per identifier type. Then you join on listingId to obtain the listingIds which have both types of identifier.
SELECT DISTINCT ad.listingId
FROM advalues ad
JOIN advalues ad2
ON ad.listingId = ad2.listingId
WHERE ( ad.identifier = 'activity' AND ad.value IN( 'Mellem', 'Højt' ) )
AND ( ad2.identifier = 'race' AND ad2.value IN( 'Alaskan Husky' ) )
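To check that the self-join behaves as the question expects, here is a small sketch with Python's built-in sqlite3 module and the sample rows. matching_listings is a hypothetical helper, and the sketch assumes both filters have at least one value selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE advalues (id INTEGER, listingId TEXT, value TEXT, identifier TEXT)"
)
conn.executemany(
    "INSERT INTO advalues VALUES (?, ?, ?, ?)",
    [
        (1, "1a", "Alaskan Husky", "race"),
        (2, "1a", "Højt", "activity"),
        (3, "1c", "Akita", "race"),
        (4, "1c", "Mellem", "activity"),
    ],
)


def matching_listings(races, activities):
    """Return the distinct listingIds that satisfy both filters."""
    race_marks = ", ".join("?" * len(races))
    activity_marks = ", ".join("?" * len(activities))
    sql = f"""
        SELECT DISTINCT ad.listingId
        FROM advalues ad
        JOIN advalues ad2 ON ad.listingId = ad2.listingId
        WHERE ad.identifier = 'activity' AND ad.value IN ({activity_marks})
          AND ad2.identifier = 'race' AND ad2.value IN ({race_marks})
        ORDER BY ad.listingId
    """
    return [row[0] for row in conn.execute(sql, [*activities, *races])]


print(matching_listings(["Alaskan Husky"], ["Mellem", "Højt"]))  # ['1a']
print(matching_listings(["Alaskan Husky"], ["Low"]))             # []
```

Each alias filters on one identifier, so a listing is returned only when it has a matching row for both; with more filter types, one extra join per type would be needed.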
The question isn't exactly clear, but I think you want this:
WHERE (identifier="activity" AND value IN("Mellem","Højt")) OR (identifier="race" AND value IN("Alaskan Husky"))
If I got you right you are trying to fetch data with different "filters".
Your Query
SELECT listingId FROM advalues
WHERE identifier="activity"
AND value IN("Mellem","Højt")
AND identifier="race"
AND value IN("Alaskan Husky")
Will always return 0 results as you are asking for identifier = "activity" AND identifier = "race"
I think you wanted to do something like this instead:
SELECT listingId FROM advalues
WHERE
(identifier="activity" AND value IN("Mellem","Højt"))
OR
(identifier="race" AND value IN("Alaskan Husky"))