Using an alias defined in `FROM` part, instead of a column result sub‑query: is it possible? - sql

Question
For readability mainly: I know I could achieve the same thing with a TEMPORARY TABLE, but I would like to avoid that (personal preference to keep the whole thing in a single query).
The question is asked in the context of standard SQL.
Abstract case
Say I have something like this:
SELECT a, (a IN (SELECT … )) as b
FROM t
Is there any way to have something like this instead:
SELECT a, (a IN u) as b
FROM t, (SELECT … ) as u
If I do this, the database engine (which is actually SQLite, for the anecdote) complains that the table u is unknown. I thought it would be visible, as it would be possible to use u as a column prefix.
I know I can do this:
CREATE TEMPORARY TABLE IF NOT EXISTS u AS SELECT … ;
SELECT a, (a IN u) as b
FROM t
However, as I said above, I would like to avoid that, as I want a monolithic query (personal preference).
That's mainly for readability when the sub‑query is a bit large, and it does not need to be very large to hurt readability.

the database engine (which is actually SQLite, for the anecdote)
In SQLite you could use Common Table Expressions:
WITH u(col) AS
(
SELECT col FROM b
)
SELECT a, (a IN u) AS b
FROM t;
SqlFiddleDemo
Output:
╔═══╦═══╗
║ a ║ b ║
╠═══╬═══╣
║ 1 ║ 0 ║
║ 2 ║ 1 ║
║ 3 ║ 1 ║
║ 4 ║ 0 ║
╚═══╩═══╝
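Note that a IN u with a bare table or CTE name is a SQLite extension; the WITH clause itself is standard SQL, but portable code would spell the membership test as a subquery. A sketch of that form, reusing the names above:
WITH u(col) AS
(
  SELECT col FROM b
)
SELECT a, (a IN (SELECT col FROM u)) AS b
FROM t;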

Related

Return rows where array column has match for every pattern in given array

I have the following table:
╔════╦════════════════════════════════════════╗
║ id ║ value                                  ║
╠════╬════════════════════════════════════════╣
║ 1  ║ ['friend', 'apple', 'cat']             ║
║ 2  ║ ['cat', 'friend', 'dog']               ║
║ 3  ║ ['pasta', 'best-friend', 'lizard']     ║
║ 4  ║ ['wildcat', 'potato', 'alices-friend'] ║
╚════╩════════════════════════════════════════╝
My goal is to return all rows where value contains a given array. For example:
['friend', 'cat'] should return rows 1 and 2.
['%friend%', '%cat%'] should return rows 1, 2 and 4.
Currently I'm using this command:
SELECT DISTINCT(id), value
FROM table
WHERE value @> (ARRAY['friend', 'cat']::VARCHAR[]);
But it's not working for example 2 listed above, i.e. with (array['%friend%', '%cat%']::varchar[]).
As it works for example 1, I think the problem is with the % symbols, but I don't know how to handle this, since I don't need to explicitly match the values.
DBFiddle
You want a match in the array column value for every LIKE pattern in the given array of matches.
This query is tricky for two main reasons:
There is no array operator to compare a whole array to an array of LIKE patterns. (No "array contains" operator with pattern-matching.) The array column must be unnested.
It's not enough to simply count matches after unnesting, as one pattern can match multiple times, masking the absence of matches for another.
Rephrase the task like this:
"Return all rows where none of the input patterns fails to find a match."
This query implements it, as efficiently as possible:
SELECT t.id, t.value
FROM   tbl t
WHERE  NOT EXISTS (
   SELECT FROM unnest('{%friend%, %cat%}'::text[]) AS p(pattern)
   WHERE  NOT EXISTS (
      SELECT FROM unnest(t.value) AS a(elem)
      WHERE  a.elem LIKE p.pattern
      )
   );
db<>fiddle here
Unfortunately, no index support possible. You'd have to normalize your relational design to allow that - with a many-to-one table replacing the array value.
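For illustration, a sketch of what that normalized design could look like (table, column, and index names here are made up), with a trigram index to get LIKE support:
-- One row per array element instead of an array column
CREATE TABLE tbl_value (
  id   int  NOT NULL,   -- references tbl.id
  elem text NOT NULL
);

-- A pg_trgm GIN index can support LIKE / ILIKE patterns on elem
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX tbl_value_elem_trgm_idx ON tbl_value USING gin (elem gin_trgm_ops);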
Asides
Either way, to optimize performance, fork two distinct cases: search with and without special LIKE characters. Just check for the existence of characters with special meaning, i.e. one of \%_. Related:
Escape function for regular expression or LIKE patterns
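A sketch of that check (just an illustration; the regex matches patterns containing %, _ or the escape character \):
-- true for patterns that contain a LIKE wildcard or escape character
SELECT pattern,
       pattern ~ '[%_\\]' AS has_special_chars
FROM   unnest('{friend, %cat%}'::text[]) AS p(pattern);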
Your simple query can deal with plain equality - after sanitizing it:
SELECT id, value
FROM tbl
WHERE value @> '{friend, cat}';
DISTINCT(id), value was just a misleading, equivalent syntax variant of DISTINCT id, value. Are you confusing this with DISTINCT ON? See:
Select first row in each GROUP BY group?
And, assuming id is the PK, then DISTINCT is just an expensive no-op in the given query. Remove it.
Finally, use text[] rather than varchar[]. There are corner cases where text[] is superior, text being the "preferred" string type. See:
Any downsides of using data type "text" for storing strings?
PostgreSQL ignores index, runs seq scan

Cleaning data by adding/removing characters to a string when it meets certain conditions T-SQL

I'm looking for help with cleaning a column in my data set so that I can join to another table.
The first data set is my complete data and includes what we call "reference_numbers" which relate to a specific case. Here is a dummy sample:
╔═══════════════════╦═════════════╦═════════════╗
║ reference_number  ║ case_opened ║ case_closed ║
╠═══════════════════╬═════════════╬═════════════╣
║ 01353568-00000001 ║ 11/01/2021  ║ 03/02/2022  ║
║ 09736473-00000009 ║ 21/04/2005  ║ 19/07/2021  ║
║ 05839576-00000012 ║ 13/09/2014  ║ 19/12/2017  ║
║ 09364857-00000006 ║ 13/09/2014  ║ 19/12/2017  ║
╚═══════════════════╩═════════════╩═════════════╝
As you can see, the "reference_number" is 8 digits then hyphen (-) and then another 8 digits. This is how a reference number should look.
My second data set is full of the same "reference_numbers". However, there are inconsistencies in the character length, as they are often written differently by individuals:
╔════════════════════╦══════════════╗
║ reference_number   ║ Case_workers ║
╠════════════════════╬══════════════╣
║ 1353568-00000001   ║ 5            ║
║ 09736473-9         ║ 10           ║
║ 5839576-12         ║ 7            ║
║ 09364857-000000006 ║ 4            ║
╚════════════════════╩══════════════╝
The first reference_number in the second data set is missing the first "0"
The second reference_number in the second data set is missing seven "0" after the hyphen
The third reference_number in the second data set is missing both the first "0" and six "0" after the hyphen
The fourth reference_number in the second data set has too many digits after the hyphen (there is supposed to be seven 0's)
I want to be able to join the first data set onto the second data set using the reference_number. However, I need to clean them first. Is this possible and is there any efficient way of doing this?
Thanks
If the rules are so specific, you could try to use a combination of STRING_SPLIT and STRING_AGG:
SELECT
    t.reference_number,
    STRING_AGG(RIGHT('00000000' + s.value, 8), '-') AS new_reference_number
FROM dbo.SecondTable t
CROSS APPLY STRING_SPLIT(t.reference_number, '-') s
GROUP BY t.reference_number;
Using the sample data you posted, the results are:
╔════════════════════╦══════════════════════╗
║ reference_number   ║ new_reference_number ║
╠════════════════════╬══════════════════════╣
║ 09364857-000000006 ║ 09364857-00000006    ║
║ 09736473-9         ║ 09736473-00000009    ║
║ 1353568-00000001   ║ 01353568-00000001    ║
║ 5839576-12         ║ 05839576-00000012    ║
╚════════════════════╩══════════════════════╝
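If the goal is the join itself, here is a sketch building on the query above (dbo.FirstTable is an assumed name for the first data set):
SELECT f.reference_number, f.case_opened, f.case_closed, c.Case_workers
FROM dbo.FirstTable f
JOIN (
    SELECT t.reference_number,
           t.Case_workers,
           STRING_AGG(RIGHT('00000000' + s.value, 8), '-') AS new_reference_number
    FROM dbo.SecondTable t
    CROSS APPLY STRING_SPLIT(t.reference_number, '-') s
    GROUP BY t.reference_number, t.Case_workers
) c ON c.new_reference_number = f.reference_number;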
Alternatively, without STRING_SPLIT, each half can be padded with plain string functions:
select reference_number,
       CONCAT(
           RIGHT('00000000' + left(reference_number, charindex('-', reference_number) - 1), 8),
           '-',
           RIGHT('00000000' + right(reference_number, len(reference_number) - charindex('-', reference_number)), 8)
       ) as NewReferenceNumber
from YourSecondTableName
╔════════════════════╦══════════════════════╗
║ Reference_Number   ║ New_Reference_Number ║
╠════════════════════╬══════════════════════╣
║ 1353568-00000001   ║ 01353568-00000001    ║
║ 09736473-9         ║ 09736473-00000009    ║
║ 5839576-12         ║ 05839576-00000012    ║
║ 09364857-000000006 ║ 09364857-00000006    ║
╚════════════════════╩══════════════════════╝

Multiple IN Conditions on a DELETE FROM query throwing a #245 Conversion Type Error

I have a table setup like the following:
Parameters
╔════╦═════════╦═══════╗
║ ID ║ Name    ║ Value ║
╠════╬═════════╬═══════╣
║ 7  ║ first   ║ 0     ║
║ 7  ║ second  ║ -1    ║
║ 7  ║ third   ║ -1    ║
╚════╩═════════╩═══════╝
It contains more rows, but I only want to delete the ones listed above. I wrote the query below to perform this action, but when I add a 3rd value to the IN condition for Name I get:
ErrorNumber 245 - "Conversion failed when converting the varchar value to data type int."
DELETE FROM Parameters
WHERE
ID = 7 AND
Name IN ('first', 'second', 'third') AND
Value IN (0, -1)
If I delete any of the 3 names making the IN condition 1 or 2 names it runs fine, but I need the third row to be deleted in the same query. What can I do to accomplish this?
Clearly, either ID or Value is a string column. SQL Server has to decide what type to use for the comparison; when a string is compared to a number, the string is converted to a number. If another row holds a value that is not a valid number, you get a type conversion error.
You should use the proper types for comparison. If I had to guess, the string column is Value:
DELETE FROM Parameters
WHERE
ID = 7 AND
Name IN ('first', 'second', 'third') AND
Value IN ('0', '-1');
Note: you can put single quotes around numeric constants such as 7 and it will still work, but I discourage it, because misusing types is a bad coding habit.
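To track down which rows trigger the error in the first place, here is a sketch using TRY_CONVERT (SQL Server 2012+), assuming Value is indeed the string column as guessed above; it returns NULL instead of raising error 245:
-- list the rows whose Value cannot be converted to int
SELECT *
FROM Parameters
WHERE TRY_CONVERT(int, Value) IS NULL
  AND Value IS NOT NULL;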

How to UNPIVOT a table in SQL Server?

I'm trying to unpivot my data but getting some weird results. How can I accomplish this?
Below is my code and screenshot of the results. (SQL Fiddle)
select distinct recId, caseNumber, servtype, mins
from
(
select
recid
,caseNumber
,[preEnrollment_type]
,[preEnrollment_minutes]
,[screening_type]
,[screeningEnA_minutes]
,[ifsp_type]
,[ifsp_minutes]
from
CaseManagementProgressNote
where
[formComplete]=1
and [reviewed]<>1
and [dataentry]<>1
and [caseManagementEntry]=1
and [serviceCoordinator] <> 'webmaster@company.net'
and [contactDateTime] >= '1/1/2015'
and [childID] is not null
) as cp
unpivot
(
servType for servTypes in ([preEnrollment_type],[screening_type],[ifsp_type])
) as up1
unpivot
(
mins for minutess in ([preEnrollment_minutes],[screeningEnA_minutes],[ifsp_minutes])
) as up2
order by
recId
Top part is the strange unpivoted data and the bottom part is the actual table.
As you can see in the unpivoted data, the [column]_type repeats twice and has incorrect corresponding values.
I need
╔═══════╦════════════╦══════════╦══════╗
║ recId ║ caseNumber ║ servType ║ mins ║
╠═══════╬════════════╬══════════╬══════╣
║ 1439  ║ 964699     ║ -NA-     ║ null ║
║ 1439  ║ 964699     ║ SC       ║ 45   ║
║ 1439  ║ 964699     ║ TCM FF   ║ 20   ║
╚═══════╩════════════╩══════════╩══════╝
Also take into account that I still have more columns to select.
This is the reference I was using: mssqltips
SQL Fiddle of the example above.
You seem to have the impression that your two UNPIVOT operations are somehow linked. They're not, other than that the second UNPIVOT is performed on the result of the first.
If you look at the results of your first UNPIVOT:
select *
from
(
select
recid
,caseNumber
,[preEnrollment_type]
,[preEnrollment_minutes]
,[screening_type]
,[screeningEnA_minutes]
,[ifsp_type]
,[ifsp_minutes]
from
CaseManagementProgressNote
) as cp
unpivot
(
servType for servTypes in ([preEnrollment_type],[screening_type],[ifsp_type])
) as up1
You will see
╔════════╦════════════╦═══════════════════════╦══════════════════════╦══════════════╦══════════╦════════════════════╗
║ recid  ║ caseNumber ║ preEnrollment_minutes ║ screeningEnA_minutes ║ ifsp_minutes ║ servType ║ servTypes          ║
╠════════╬════════════╬═══════════════════════╬══════════════════════╬══════════════╬══════════╬════════════════════╣
║ 143039 ║ 964699     ║ (null)                ║ 45                   ║ 20           ║ -NA-     ║ preEnrollment_type ║
║ 143039 ║ 964699     ║ (null)                ║ 45                   ║ 20           ║ SC       ║ screening_type     ║
║ 143039 ║ 964699     ║ (null)                ║ 45                   ║ 20           ║ TCM FF   ║ ifsp_type          ║
╚════════╩════════════╩═══════════════════════╩══════════════════════╩══════════════╩══════════╩════════════════════╝
It should be clear from this what the second UNPIVOT operation does and why it gives you the results it does. To get your desired result from here, you don't need another unpivot. UNPIVOT transforms columns to rows, and that's not what you're looking for at this point: you already have the rows you want. What you want is to collapse the three minutes columns into a single column, picking the right one depending on servTypes. There are ways to do that, for instance by adding an expression to your SELECT list, like so:
CASE servTypes
     WHEN 'preEnrollment_type' THEN preEnrollment_minutes
     WHEN 'screening_type'     THEN screeningEnA_minutes
     WHEN 'ifsp_type'          THEN ifsp_minutes
END
Or use @ander2ed's approach and drop the UNPIVOT entirely, if you don't mind that it doesn't filter out the NULLs.
The article you link to covers this problem too:
The only complication here is matching the output phone to the corresponding phone type - for this we need to do some string interrogation to ensure that Phone1 matches to PhoneType1, Phone2 matches to PhoneType2, etc.
It solves it by doing the second UNPIVOT, and then filtering the results. You can make it work by linking servTypes and minutess. In your particular sample data, the first character of them is sufficient for identification, and is the same in the two columns, so you could add where left(servTypes, 1) = left(minutess, 1) to your query.
This seems pointlessly complicated to me and I wouldn't recommend it, but it is the difference between the article's query and yours, and it is the reason your query doesn't work when the article's does.
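For reference, a sketch of that filtered double UNPIVOT, trimmed down to just the columns involved (the WHERE clause linking the two UNPIVOTs is the only real change from your query):
select recid, caseNumber, servType, mins
from
(
    select recid, caseNumber,
           [preEnrollment_type], [preEnrollment_minutes],
           [screening_type], [screeningEnA_minutes],
           [ifsp_type], [ifsp_minutes]
    from CaseManagementProgressNote
) as cp
unpivot
(
    servType for servTypes in ([preEnrollment_type],[screening_type],[ifsp_type])
) as up1
unpivot
(
    mins for minutess in ([preEnrollment_minutes],[screeningEnA_minutes],[ifsp_minutes])
) as up2
where left(servTypes, 1) = left(minutess, 1)   -- keep only the matching type/minutes pairs
order by recid;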
You can use Cross Apply in place of unpivot here. I find the syntax much easier to understand, and you will be able to retain null values. Try something like:
select recid, casenumber, servtype, mins
from CaseManagementProgressNote a
cross apply (VALUES (preEnrollment_type, preEnrollment_minutes),
                    (screening_type, screeningEnA_minutes),
                    (ifsp_type, ifsp_minutes)) unpvt (servtype, mins)

Squeryl update from select?

Given the following widgets table
╔════╦═════════╦═════╗
║ id ║ prev_id ║ foo ║
╠════╬═════════╬═════╣
║ 1  ║         ║ bar ║
║ 2  ║ 1       ║     ║
╚════╩═════════╩═════╝
And the following sql query
UPDATE widgets
SET
widgets.foo =
(
SELECT widgets.foo
FROM widgets
WHERE widgets.id = 1
)
WHERE
widgets.id = 2
How do I do the above update in squeryl?
I tried
update(widgets) (
w=>
where(w.id === 2)
set(w.foo := from(widgets)(prevW => where(prevW.id === 1) select foo))
)
But that gave me the following compile error:
error: No implicit view available from org.squeryl.Query[Option[String]] => org.squeryl.dsl.ast.TypedExpressionNode[Option[org.squeryl.PrimitiveTypeMode.StringType]].
It looks like Squeryl added support for subqueries with https://github.com/max-l/Squeryl/commit/e75ddecf4a0855771dd569b4c4df4e23fde2133e
However, it needs an aggregate query that is guaranteed to return one result; this is done with the compute clause.
I came up with a workaround using compute(min(foo)).
So my solution ends up looking like
update(widgets) (
w=>
where(w.id === 2)
set(w.foo := from(widgets)(prevW => where(prevW.id === 1) compute(min(foo))))
)
Maybe there should be a single aggregate function that throws an exception if the query returns more than one result.
Related conversations:
https://groups.google.com/d/msg/squeryl/F9JZsPlsjVU/j1FsD7ZDJ08J
https://groups.google.com/d/msg/squeryl/6ZRMk4I9vZU/CGSiE8q3MpAJ