[Oracle SQL] Selecting results based on whether they are found in another query - sql

I have a query for a search page based on user data; that is, users should only be able to search for other users whose results are similar to their own. Since each result is in its own row in its respective table, I'm using Oracle's listagg() function to produce a list of user records.
Is there any way to search between two sets of records for similarities?
For example, something like:
select <data> from (select listagg(data, ', ')...where userid='<whatever>')
where <data> in (select listagg(data, ', ')...where userid='<whatever>')
Obviously this is pseudocode, but assume the '...' represents valid syntax that has been omitted for brevity's sake. Also for brevity's sake, I've only included one example; there are several fields that I would be filtering on, but I'm assuming they should all function more or less like this. Whenever I try something similar, I find that the listagg function returns a single string in the format 'x, y, z', instead of the separate values 'x', 'y', 'z'. This causes the query (using IN) to return no results, since no row has the value 'x, y, z'.

This is a better approach:
select yourFields
from yourTables
where userId in (
select userId
from wherever
where whatever makes them similar
)
Establishing similarity is probably the hard part. Make sure you know the rules for that before you try to code them.
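The difference between the two approaches can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the results table, its columns, and the sample users are hypothetical, SQLite's group_concat() stands in for Oracle's listagg(), and "similar" is taken to mean "shares at least one data value".

```python
import sqlite3

# Hypothetical schema and data: one row per (user, data value).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (userid TEXT, data TEXT);
    INSERT INTO results VALUES
        ('alice', 'x'), ('alice', 'y'),
        ('bob',   'y'), ('bob',   'z'),
        ('carol', 'q');
""")


def similar_users(conn, userid):
    """Users sharing at least one data value with `userid` (row-wise IN)."""
    rows = conn.execute(
        """
        SELECT DISTINCT userid
        FROM results
        WHERE userid <> ?
          AND data IN (SELECT data FROM results WHERE userid = ?)
        ORDER BY userid
        """,
        (userid, userid),
    ).fetchall()
    return [r[0] for r in rows]


def in_aggregated_list(conn, value, userid):
    """Same test against one aggregated string, which compares whole strings."""
    row = conn.execute(
        "SELECT ? IN (SELECT group_concat(data, ', ') "
        "FROM results WHERE userid = ?)",
        (value, userid),
    ).fetchone()
    return bool(row[0])


print(similar_users(conn, "alice"))
print(in_aggregated_list(conn, "x", "alice"))
```

The subquery keeps one value per row, so IN compares value against value. The aggregated version compares 'x' against the whole concatenated string, which never matches.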

I think the OP is asking for...
SELECT a_concatenation_fxn_like_listagg(T1.FieldsOfInterest) as result
FROM theTables T1,
(select FieldsOfInterest
from theTables
where userid = 'user2' -- different user from T1
) T2
WHERE
T1.userid = 'user1'
AND (
T1.FieldOfInterest1 like '%'||T2.FieldOfInterest2||'%'
OR
T1.FieldOfInterest2 like '%'||T2.FieldOfInterest2||'%' -- for string similarity
OR
ABS(T1.FieldOfInterest3 - T2.FieldOfInterest3) < tolerance -- for numeric similarity
)

Related

Convert strings into table columns in BigQuery

I would like to convert this table
to something like this
The long string can be dynamic, so it's important to me that the solution is not fixed to these specific values.
Please help; I'm using BigQuery.
You could start by using SPLIT, i.e. SPLIT(value[, delimiter]), to convert your long string into an array of separate key-value pairs.
This will be sensitive to you having commas as part of your values.
SPLIT(session_experiments, ',')
Then you could either FLATTEN that array or access each element, and then use some REGEXs to separate the key and the value.
If you share more context on your restrictions and intended result I could try and put together a query for you that does exactly what you want.
What you want is not possible; however, there is a better practice for BigQuery.
You can use arrays of structs to store that information in a table.
Let's say you have a table like the one below.
You can use this sample query to understand how it works.
with rawdata AS
(
SELECT 1 as id, 'test1-val1,test2-val2,test3-val3' as experiments union all
SELECT 1 as id, 'test1-val1,test3-val3,test5-val5' as experiments
)
select
id,
(select array_agg(struct(split(param, '-')[offset(0)] as experiment, split(param, '-')[offset(1)] as value)) from unnest(split(experiments)) as param ) as experiments
from rawdata
The output will look like this:
Once you have that output, it's more convenient to manipulate the data.
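The split-then-split logic of the query can be sketched outside BigQuery in plain Python. This is a minimal illustration only; the function name parse_experiments is hypothetical, and it assumes, like the query, that values contain no commas and that the first '-' separates experiment from value.

```python
def parse_experiments(raw):
    """Turn 'k1-v1,k2-v2' into a list of {'experiment', 'value'} records."""
    pairs = []
    for param in raw.split(","):
        # Split on the first '-' only, so a value may itself contain '-'.
        experiment, value = param.split("-", 1)
        pairs.append({"experiment": experiment, "value": value})
    return pairs


print(parse_experiments("test1-val1,test2-val2,test3-val3"))
```

Because nothing is hard-coded, a string with different or additional experiments produces a correspondingly different list, which is the same property the array of structs gives you.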

SQL Like condition fails to run

I've been tasked to develop a query that behaves essentially like the following one:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*textToSearch*'
The textToSearch is a string which contains information about the condition in which a given device is tested (Voltage, Current, Frequency, etc) in the following format as an example:
[V:127][PF:1][F:50][I:65]
The objective is to recover a list of any and all tests performed at a voltage of 127 volts, so the SQL developed would look like the following:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V:127*'
This works as intended, but there is a problem: because some data was entered improperly, there are cases in which the TestConditions string looks like the following examples:
[V.127][PF:1][F:50][I:65]
[V.230][PF:1][F:50][I:65]
As you can see, my previous SQL transaction does not work as it does not meet the conditions.
If I try to do the following transaction with the objective of ignoring improper data format:
SELECT * FROM tblTestData WHERE *.TestConditions LIKE '*V*127*'
The transaction is not successful and returns an error.
What am I doing wrong for this transaction not to work? Am I approaching this problem the wrong way?
I also see a potential problem with this transaction. Suppose there were a group of test conditions like the following:
[V.127][PF:1][F:50][I:127]
[V.230][PF:1][F:50][I:127]
Would it return the values of both points given that both meet the condition of the transaction stated above?
In conclusion, my questions are:
What is wrong with the LIKE '*V*127*' condition for it not to work?
What are the implications of working with this condition? Can it return more information than desired if I am not careful?
I hope it is clear what I am asking; if it isn't, please point out what is unclear and I will try to clarify it.
One choice is to look for any character between the "V" and the "127":
WHERE TestConditions LIKE '%V_127%'
Note that % is the wildcard for a string of any length and _ is the wildcard for a single character.
You can also use regular expressions:
WHERE regexp_like(TestConditions, 'V[.:]127')
Note that regular expressions match anywhere in the string, so wildcards at the beginning and end are not needed.
You could check for both cases (although this will decrease performance):
SELECT *
FROM tblTestData
WHERE (TestConditions LIKE '%V:127%' OR TestConditions LIKE '%V.127%')
It is better to clean the data in your database if only old records have this problem.
Using regular expressions is recommended by Oracle for this kind of condition. You could build a regular expression for your case:
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE REGEXP_LIKE(text_to_search,'\[V[.:]127\]','i')
Or you could use the good old LIKE operator. In this case, you need to know that:
% matches zero or more characters
_ matches exactly one character
So you should use an underscore to match the : or the .
WITH your_table AS (
SELECT '[V.127][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V.230][PF:1][F:50][I:65]' text_to_search FROM dual
UNION
SELECT '[V:127][PF:1][F:50][I:65]' text_to_search FROM dual
)
SELECT *
FROM your_table
WHERE text_to_search LIKE '%V_127%';
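The second question (can a looser pattern return more than intended?) can be checked with a small sketch using Python's built-in sqlite3 module. SQLite is an assumption here, chosen only because its LIKE uses the same % and _ wildcards; the fourth row is made-up sample data with I:127.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTestData (TestConditions TEXT)")
conn.executemany(
    "INSERT INTO tblTestData VALUES (?)",
    [
        ("[V:127][PF:1][F:50][I:65]",),
        ("[V.127][PF:1][F:50][I:65]",),
        ("[V.230][PF:1][F:50][I:65]",),
        ("[V.230][PF:1][F:50][I:127]",),  # 230 V, but a current of 127
    ],
)


def matching(pattern):
    """Return the TestConditions values matching a LIKE pattern."""
    rows = conn.execute(
        "SELECT TestConditions FROM tblTestData "
        "WHERE TestConditions LIKE ? ORDER BY rowid",
        (pattern,),
    ).fetchall()
    return [r[0] for r in rows]


strict = matching("%V_127%")  # exactly one character between V and 127
loose = matching("%V%127%")   # any number of characters between V and 127

print(strict)
print(loose)
```

With the single-character wildcard only the two 127 V rows come back. The looser pattern also returns the 230 V row, because its '127' belongs to the current field, so it can indeed return more than desired.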

SQL statement with local (inline) array

In many languages, one can use inline lists of values, with some form of code similar to this:
for x in [1,7,8,12,14,56,123]:
    print x # Or whatever else you fancy doing
Working with SQL for the last year or so, I've found out that even though using such an array in WHERE is not a problem...
select *
from foo
where someColumn in (1,7,8,12,14,56,123) and someThingElse...
...I have not found an equivalent form to GET data from an inline array:
-- This is not working
select *
from (1,7,8,12,14,56,123)
where somethingElse ...
Searching for solutions, I have only found people suggesting a union soup:
select *
from (SELECT 1 UNION SELECT 7 UNION SELECT 8 UNION ...)
where somethingElse ...
...which is, arguably, ugly and verbose.
I can quickly generate the UNION soup from the list with a couple of keystrokes in my editor (VIM) and then paste it back to my DB prompt - but I am wondering whether I am missing some other method to accomplish this.
Also, if there's no standard way to do it, I would still be interested in DB-engine-specific solutions (Oracle, PostgreSQL, etc)
Thanks in advance for any pointers.
Row/table value constructors can sometimes be used as a shorthand, for example in MSSQL:
select * from (values (1),(7),(8),(12)) as T (f)
The syntax is more complex by necessity than for a simple array-like list passed to in () because it must be able to describe a multi-dimensional set of data:
select * from (values (1, 'a'),(7, 'b'),(8, 'c'),(12, 'd')) as T (f, n)
Of course, when you find the requirement to list literal values, it's often a good idea to stick them in a table and query for them.
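As a quick way to try the idea without a server, here is a sketch using Python's built-in sqlite3 module. It assumes a reasonably recent SQLite, which accepts a VALUES list inside a common table expression; the names t and f are arbitrary, and the f > 10 filter is just a stand-in for "where somethingElse".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CTE gives the inline list a table name (t) and a column name (f).
rows = conn.execute(
    """
    WITH t(f) AS (VALUES (1), (7), (8), (12), (14), (56), (123))
    SELECT f FROM t
    WHERE f > 10
    ORDER BY f
    """
).fetchall()

big_values = [r[0] for r in rows]
print(big_values)
```

The exact syntax for naming the columns differs between engines, so treat this as one dialect's spelling of the value constructor rather than a portable form.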

How can I SELECT DISTINCT on the last, non-numerical part of a mixed alphanumeric field?

I have a data set that looks something like this:
A6177PE
A85506
A51SAIO
A7918F
A810004
A11483ON
A5579B
A89903
A104F
A9982
A8574
A8700F
And I need to find all the endings that are non-numeric. In this example, that means PE, AIO, F, ON, B and F.
In pseudocode, I'm imagining I need something like
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL,[SOME_CLEVER_LOGIC]) AS X FROM TABLE);
Any ideas? Can I solve this without learning regexp?
EDIT: To clarify, my data set is a lot larger than this example. Also, I'm only interested in the part of the string AFTER the numeric part. If the string is "A6177PE" I want "PE".
Disclaimer: I don't know Oracle SQL. But, I think something like this should work:
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL,REGEXP_INSTR(COL, '[[:alpha:]]+$')) AS X FROM TABLE);
REGEXP_INSTR(COL, '[[:alpha:]]+$') should return the position of the first of the alphabetic characters at the end of the field.
For readability, I'd recommend the REGEXP_SUBSTR function (if there are no performance issues, of course, as this is definitely slower than the accepted solution).
It is similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself:
SELECT DISTINCT REGEXP_SUBSTR(MY_COLUMN, '[a-zA-Z]+$') FROM MY_TABLE;
([[:alpha:]] is also supported, as @Audun wrote.)
Also useful: Oracle Regexp Support (beginning page)
For example, without regular expressions:
SELECT SUBSTR(col,INSTR(TRANSLATE(col,'A0123456789','A..........'),'.',-1)+1)
FROM table;
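To sanity-check what the "trailing letters" pattern extracts from the sample data, here is a small sketch with Python's re module. It is not Oracle SQL; the helper name trailing_letters is hypothetical, and the pattern [A-Za-z]+$ mirrors the one used in the answers above.

```python
import re

codes = [
    "A6177PE", "A85506", "A51SAIO", "A7918F", "A810004", "A11483ON",
    "A5579B", "A89903", "A104F", "A9982", "A8574", "A8700F",
]


def trailing_letters(code):
    """Return the run of letters at the end of `code`, or None if it ends in a digit."""
    match = re.search(r"[A-Za-z]+$", code)
    return match.group(0) if match else None


# A set comprehension plays the role of SELECT DISTINCT; endings without letters are skipped.
distinct_suffixes = sorted({s for s in map(trailing_letters, codes) if s})
print(distinct_suffixes)
```

Note that for A51SAIO this pattern yields SAIO, since every letter after the last digit is part of the ending.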

Is it possible to use LIKE and IN for a WHERE statement?

I have a list of place names and would like to match them to records in a SQL database. The problem is that the properties have reference numbers after their names, e.g. 'Ballymena P-4sdf5g'.
Is it possible to use IN and LIKE to match records?
WHERE dbo.[Places].[Name] IN LIKE('Ballymena%','Banger%')
No, but you can use OR instead:
WHERE (dbo.[Places].[Name] LIKE 'Ballymena%' OR
dbo.[Places].[Name] LIKE 'Banger%')
It's a common misconception that for the construct
b IN (x, y, z)
that (x, y, z) represents a set. It does not.
Rather, it is merely syntactic sugar for
(b = x OR b = y OR b = z)
SQL has but one data structure: the table. If you want to query search text values as a set then put them into a table. Then you can JOIN your search text table to your Places table using LIKE in the JOIN condition e.g.
WITH Places (Name)
AS
(
SELECT Name
FROM (
VALUES ('Ballymeade Country Club'),
('Ballymena Candles'),
('Bangers & Mash Cafe'),
('Bangebis')
) AS Places (Name)
),
SearchText (search_text)
AS
(
SELECT search_text
FROM (
VALUES ('Ballymena'),
('Banger')
) AS SearchText (search_text)
)
SELECT *
FROM Places AS P1
LEFT OUTER JOIN SearchText AS S1
ON P1.Name LIKE S1.search_text + '%';
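The same join can be tried locally with Python's built-in sqlite3 module. This sketch makes a few assumptions: it uses real tables instead of CTEs, SQLite's || operator instead of + for concatenation, and an inner join so that only matching places are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Places (Name TEXT);
    CREATE TABLE SearchText (search_text TEXT);
    INSERT INTO Places VALUES
        ('Ballymeade Country Club'),
        ('Ballymena Candles'),
        ('Bangers & Mash Cafe'),
        ('Bangebis');
    INSERT INTO SearchText VALUES ('Ballymena'), ('Banger');
""")

# Each place is paired with every search term it starts with.
matches = conn.execute(
    """
    SELECT P.Name, S.search_text
    FROM Places AS P
    JOIN SearchText AS S
      ON P.Name LIKE S.search_text || '%'
    ORDER BY P.Name
    """
).fetchall()

print(matches)
```

Adding another place name to search for then becomes an INSERT into SearchText rather than another OR branch in the query.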
A simple solution would be to use a regular expression. I'm not sure how it's done in your SQL dialect, but probably something similar to this:
WHERE dbo.[Places].[Name] SIMILAR TO '(Banger|Ballymena)%';
or
WHERE REGEXP_LIKE(dbo.[Places].[Name], '^(Banger|Ballymena)');
At least one of them should work.
You could use OR:
WHERE
dbo.[Places].[Name] LIKE 'Ballymena%'
OR dbo.[Places].[Name] LIKE 'Banger%'
Or split the string at the space, if Places.Name is always in the same format:
WHERE SUBSTRING(dbo.[Places].[Name], 1, CHARINDEX(' ', dbo.[Places].[Name]) - 1)
IN ('Ballymena', 'Banger')
This might decrease performance: the database may be able to use an index with LIKE (your chances are even better when the wildcard is at the end), but most probably not when using SUBSTRING.