Use value of one column as identifier in another table in SNOWFLAKE - sql

I have two tables, one of which contains the rules for the other:
create table t1(id int, query string)
create table t2(id int, place string)
insert into t1 values (1,'id < 10')
insert into t1 values (2,'id == 10')
And the values in t2 are
insert into t2 values (11,'Nevada')
insert into t2 values (20,'Texas')
insert into t2 values (10,'Arizona')
insert into t2 values (2,'Abegal')
I need to select rows from the second table using the condition stored in the first table's query column,
something like
select * from t2 where {query}
or
with x(query)
as
(select c2 from test)
select * from test where query;
but neither works.

There are a couple of problems with storing criteria in a table like this:
First, as has already been noted, you'll likely have to resort to dynamic SQL, which can get messy, and limits how you can use it.
Second, it's going to be problematic (to say the least) to validate and parse your criteria. What if someone writes a rule of [id] *= 10, or [this_field_doesn't_exist] = blah?
If you're just storing potential values for your [id] column, one solution would be to have your t1 (storing your queries) include a min value and max value, like this:
CREATE TABLE t1
(
[id] INT IDENTITY(1,1) PRIMARY KEY,
min_value INT NULL,
max_value INT NULL
)
Note that both the min and max values can be null. Your provided criteria would then be expressed as this:
INSERT INTO t1
([id], min_value, max_value)
VALUES
(1, NULL, 9), -- id < 10: the bounds are compared inclusively, so the upper limit is 9
(2, 10, 10)
Note that I've explicitly listed the attributes we're inserting into, as you should also be doing (to prevent issues when attributes are added or modified down the line).
A null value on min_value means no lower limit; a null max_value means no upper limit.
To then get the rows from t2 that satisfy a rule in t1, simply do an INNER JOIN:
SELECT t2.*
FROM t2
INNER JOIN t1 ON
(t2.id <= t1.max_value OR t1.max_value IS NULL)
AND
(t2.id >= t1.min_value OR t1.min_value IS NULL)
Note that this returns one row for every rule that a t2 row satisfies, so a row matching several rules appears several times. If you need more complex logic (for example, show records that meet Rules 1, 2 and 3, or meet Rule 4), you'll likely have to resort to dynamic SQL (or at the very least some ugly JOINs).
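As a rough illustration of the range-rule join above, here is a minimal Python sketch that runs it against an in-memory SQLite database. SQLite is an assumption made only to keep the example self-contained (the question is about Snowflake), and the helper name matching_rows is hypothetical. Rule 1 (id < 10) is stored with max_value 9 because the join compares inclusively.

```python
import sqlite3

def matching_rows(conn):
    """Return (rule_id, t2_id, place) for every rule a t2 row satisfies."""
    return conn.execute(
        """
        SELECT t1.id, t2.id, t2.place
        FROM t2
        INNER JOIN t1
          ON (t2.id <= t1.max_value OR t1.max_value IS NULL)
         AND (t2.id >= t1.min_value OR t1.min_value IS NULL)
        ORDER BY t1.id, t2.id
        """
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, min_value INTEGER, max_value INTEGER);
    CREATE TABLE t2 (id INTEGER, place TEXT);
    -- Rule 1: id < 10 (no lower limit); rule 2: id = 10
    INSERT INTO t1 (id, min_value, max_value) VALUES (1, NULL, 9), (2, 10, 10);
    INSERT INTO t2 (id, place) VALUES
        (11, 'Nevada'), (20, 'Texas'), (10, 'Arizona'), (2, 'Abegal');
    """
)

rows = matching_rows(conn)
print(rows)
```

Selecting t1.id alongside the t2 columns makes it visible which rule each row matched.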
As stated in a comment, however, you want to have more complex rules, which might mean you have to end up using dynamic SQL. However, you still have the problem of validating and parsing your rule. How do you handle cases where a user enters an invalid rule?
A better solution might be to store your rules in a format that can easily be parsed and validated. For example, come up with an XML schema that defines a valid rule/criterion. Then, your Rules table would have a rule XML attribute, tied to that schema, so users could only enter valid rules. You could then either shred that XML document, or create the SQL client-side to come up with your query.
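One way to picture the "easily parsed and validated" idea is sketched below in Python. It swaps the XML schema for a simpler (column, operator, value) triple, which is my own assumption rather than what the answer prescribes; the whitelists and the build_query helper are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical whitelists: only these columns and operators may appear in a rule.
ALLOWED_COLUMNS = {"id", "place"}
ALLOWED_OPERATORS = {"<", "<=", "=", ">=", ">", "<>"}

def build_query(column, operator, value):
    """Turn a validated (column, operator, value) rule into SQL plus parameters."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column!r}")
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    # The value is bound as a parameter, so it is never spliced into the SQL text.
    sql = f"SELECT id, place FROM t2 WHERE {column} {operator} ? ORDER BY id"
    return sql, (value,)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (id INTEGER, place TEXT)")
conn.executemany(
    "INSERT INTO t2 (id, place) VALUES (?, ?)",
    [(11, "Nevada"), (20, "Texas"), (10, "Arizona"), (2, "Abegal")],
)

sql, params = build_query("id", "<", 10)
below_ten = conn.execute(sql, params).fetchall()
print(below_ten)
```

With this shape, an invalid rule such as ("id", "*=", 10) is rejected with a ValueError before any SQL is built.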

I found the answer myself, and I am posting it below.
I used Python to do the job (as Snowflake does not support dynamic queries).
I believe the same approach can be used for other databases (tedious but doable).
Setting up the configuration and the connection:
import json
import snowflake.connector

CONFIG_PATH = "/root/config/snowflake.json"
with open(CONFIG_PATH) as f:
    config = json.load(f)

# Snowflake connection settings
snf_user = config['snowflake']['user']
snf_pwd = config['snowflake']['pwd']
snf_account = config['snowflake']['account']
snf_region = config['snowflake']['region']
snf_role = config['snowflake']['role']
ctx = snowflake.connector.connect(
    user=snf_user,
    password=snf_pwd,
    account=snf_account,
    region=snf_region,
    role=snf_role
)
I used two cursors so that the query run inside the loop does not interfere with the cursor the loop is iterating over:
cs = ctx.cursor()
cs1 = ctx.cursor()
query = "select c2 from test"
cs.execute(query)
for (rule,) in cs:
    # Each row holds one WHERE condition; strip any single quotes stored with it
    y = "select * from test1 where {0}".format(rule.replace("'", ""))
    cs1.execute(y)
    for y1 in cs1:
        print('{0}'.format(y1))
And boom done

Related

Determine the values of a long list which are not existing in a postgres table

Provided there is a long list of values, which happen to be values of attributes of records in a postgres-database.
I would like to create a query which finds out which of these values can not be found in the database.
I have no right to execute DDL-Statements and I would like to avoid procedural code.
Example:
the table might be
CREATE TABLE Test (
ID Integer,
attr varchar(30)
)
The list might be something like (but longer, about 240000 values)
ATTR
TestValue0
TestValue1
TestValue2
TestValue3
Using sed I can create and execute a statement
select count(*) from Test where attr in ('TestValue0',
'TestValue1','TestValue2','TestValue3')
This statement shows me that not all of these values can be found in Test.
How can I formulate a query which tells me which of these unique values cannot be found in the Postgres database?
For what you want to do, you can use left join, not in or not exists. But the key is that you need a derived table with the values you care about:
select v.attr
from (values ('TestValue0'), ('TestValue1'), ('TestValue2'), ('TestValue3')
) v(attr)
where not exists (select 1 from test t where t.attr = v.attr);
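The derived-table idea can be sketched from client code as follows. This is a minimal Python example against in-memory SQLite (an assumption for self-containment; the question is about Postgres), and missing_values is a hypothetical helper. For a list of about 240000 values the placeholders would probably need to be sent in batches, since drivers limit the number of bound parameters.

```python
import sqlite3

def missing_values(conn, candidates):
    """Return the candidate values that do not occur in Test.attr."""
    # One "(?)" row per candidate builds the derived table v(attr).
    placeholders = ", ".join("(?)" for _ in candidates)
    sql = f"""
        WITH v(attr) AS (VALUES {placeholders})
        SELECT v.attr
        FROM v
        WHERE NOT EXISTS (SELECT 1 FROM Test t WHERE t.attr = v.attr)
        ORDER BY v.attr
    """
    return [row[0] for row in conn.execute(sql, list(candidates))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (ID INTEGER, attr VARCHAR(30))")
conn.executemany(
    "INSERT INTO Test (ID, attr) VALUES (?, ?)",
    [(1, "TestValue0"), (2, "TestValue2")],
)

missing = missing_values(
    conn, ["TestValue0", "TestValue1", "TestValue2", "TestValue3"]
)
print(missing)
```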

'In' clause in SQL server with multiple columns

I have a component that retrieves data from database based on the keys provided.
However, I want my Java application to get the data for all keys in a single database hit to speed things up.
I can use 'in' clause when I have only one key.
While working on more than one key I can use below query in oracle
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to writing
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'CORE'
together
However, using the 'in' clause as above gives the error below in SQL Server:
ERROR: An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let me know if there is any way to achieve the same in SQL Server.
This syntax doesn't exist in SQL Server. Use a combination of And and Or.
SELECT *
FROM <table_name>
WHERE
(value_type = 'I' and CODE1 = 'COMM')
OR (value_type = 'I' and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in oracle with multiple fields.)
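Since the key pairs come from application code, the AND/OR pattern can be generated rather than typed by hand. The following is a minimal Python sketch (the question mentions Java; Python is used here only for brevity). build_pair_filter and the codes table are hypothetical, and SQLite stands in for SQL Server.

```python
import sqlite3

def build_pair_filter(pairs):
    """Build an OR-of-ANDs predicate with one placeholder pair per key tuple."""
    if not pairs:
        raise ValueError("at least one (value_type, CODE1) pair is required")
    clause = " OR ".join("(value_type = ? AND CODE1 = ?)" for _ in pairs)
    # Flatten [(a, b), (c, d)] into [a, b, c, d] to line up with the placeholders.
    params = [value for pair in pairs for value in pair]
    return clause, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT)")
conn.executemany(
    "INSERT INTO codes (value_type, CODE1) VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE"), ("I", "MISC"), ("X", "COMM")],
)

clause, params = build_pair_filter([("I", "COMM"), ("I", "CORE")])
found = conn.execute(
    f"SELECT value_type, CODE1 FROM codes WHERE {clause} ORDER BY CODE1",
    params,
).fetchall()
print(found)
```

All keys are fetched in a single round trip, and the values are bound as parameters rather than concatenated into the statement.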
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.value_type = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table value constructor and perform a join against it. You can only specify up to 1000 rows in a table value constructor, hence the 1000-tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique; otherwise you'll get duplicate rows. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this to use EXISTS instead of JOIN. That may have different performance characteristics, and it avoids producing duplicate results if your values aren't unique. It may be worth trying both EXISTS and JOIN on your use case to see which is the better fit. Here's how EXISTS would look:
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table value constructor may be your best choice. Use EXISTS or JOIN, depending on which gives you better performance on your database.
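The duplicate-row difference between JOIN and EXISTS can be seen in a small Python sketch. It uses in-memory SQLite, where the value list is written as a CTE rather than SQL Server's derived-table VALUES syntax; that substitution, the codes table and the key_list name are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT)")
conn.executemany(
    "INSERT INTO codes (value_type, CODE1) VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE"), ("X", "COMM")],
)

# The key list deliberately contains ('I', 'COMM') twice.
KEY_LIST = (
    "WITH key_list(a, b) AS "
    "(VALUES ('I', 'COMM'), ('I', 'COMM'), ('I', 'CORE')) "
)

# JOIN: every matching key produces an output row, so duplicates multiply.
joined = conn.execute(
    KEY_LIST
    + "SELECT c.value_type, c.CODE1 FROM codes c "
    "JOIN key_list k ON k.a = c.value_type AND k.b = c.CODE1 "
    "ORDER BY c.CODE1"
).fetchall()

# EXISTS: each row of codes is kept at most once, however many keys match.
filtered = conn.execute(
    KEY_LIST
    + "SELECT c.value_type, c.CODE1 FROM codes c "
    "WHERE EXISTS (SELECT 1 FROM key_list k "
    "WHERE k.a = c.value_type AND k.b = c.CODE1) "
    "ORDER BY c.CODE1"
).fetchall()

print(joined)
print(filtered)
```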
Normally you cannot do this, but you can use the following technique:
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put them in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT.*
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.value_type AND T.Code = DT.CODE1
Thus, you avoid storing data in your code (persistent table version) and can easily modify the filters (without changing the code).
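From application code, the temporary-table variant might look roughly like the Python sketch below. It assumes in-memory SQLite (so CREATE TEMP TABLE replaces the #t syntax), and the table names codes and key_filter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT)")
conn.executemany(
    "INSERT INTO codes (value_type, CODE1) VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE"), ("I", "MISC"), ("X", "COMM")],
)

# The filter values live in a table instead of being hardcoded in the query text.
conn.execute("CREATE TEMP TABLE key_filter (ValueType TEXT, Code TEXT)")
conn.executemany(
    "INSERT INTO key_filter (ValueType, Code) VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE")],
)

matched = conn.execute(
    """
    SELECT DT.value_type, DT.CODE1
    FROM codes DT
    JOIN key_filter T ON T.ValueType = DT.value_type AND T.Code = DT.CODE1
    ORDER BY DT.CODE1
    """
).fetchall()
print(matched)
```

Changing the filter then means changing the rows in key_filter; the SELECT itself stays the same.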
I think you can try this, combining AND and OR in the same condition:
SELECT
*
FROM
<table_name>
WHERE
value_type = 'I'
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar, but slightly different, problem in MS SQL. Maybe this will help somebody in the future; in my case I found this solution (not the full code, just an example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of course, the Coupon and Campaign columns in the table are indexed for fast searching.
Compute it in MS SQL:
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');

Inner join on two columns where one column has a single trailing character

Hi I'm new to SQL and I have 2 tables that I am trying to do an inner-join with.
------------------------
First table:
------------------------
ID-Number CustomerName
------------------------
Second table
------------------------
ID-Number CustomerDevice
(ID with a single trailing character)
Questions
What would be the best-performing way to execute the inner join on both tables' ID-Number?
Is there a method to remove the trailing character within the inner join command?
You don't have much choice. Here is how you can express the logic:
select . . .
from t1 join
t2
on t2.id like t1.id + '_';
Unfortunately, this may not make use of indexes. (Also note that + for string concatenation is SQL Server-specific).
You might be able to rewrite the query as:
on t1.id = left(t2.id, len(t2.id) - 1)
This should be able to use an index on t1(id).
The best approach is to fix the data, so your ids are the same type, same length, and have a properly declared foreign key relationship. Another alternative available in SQL Server is an index on a computed column:
alter table t2 add realId as (left(id, len(id) - 1));
create index idx_t2_realId on t2(realId);
Then write the join logic using realId.
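The "strip the last character, then compare" join can be tried out with a short Python sketch. It uses in-memory SQLite, so substr/length replace SQL Server's LEFT/LEN, and the sample rows are made up purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id TEXT PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE t2 (id TEXT, CustomerDevice TEXT);
    -- Made-up sample data: t2.id is a t1.id plus one trailing character.
    INSERT INTO t1 (id, CustomerName) VALUES ('1001', 'Alice'), ('1002', 'Bob');
    INSERT INTO t2 (id, CustomerDevice) VALUES
        ('1001A', 'phone'), ('1002B', 'tablet'), ('1003C', 'laptop');
    """
)

pairs = conn.execute(
    """
    SELECT t1.id, t1.CustomerName, t2.CustomerDevice
    FROM t1
    JOIN t2 ON t1.id = substr(t2.id, 1, length(t2.id) - 1)
    ORDER BY t1.id
    """
).fetchall()
print(pairs)
```

The device with id '1003C' has no matching customer, so the inner join drops it.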
Would this work?
SELECT
t1.[ID-Number],
CustomerName,
CustomerDevice
FROM t1
INNER JOIN t2 on t1.[ID-Number]=LEFT(t2.[ID-Number],LEN(t2.[ID-Number])-1)
EDIT: Forgot the 1
Given that the table Customer has this column
ID_number int not null;
And the table Device has this column
ID_number varchar(15);
And we know that Device.ID_number, if it is not NULL, is always equal to some Customer.ID_number with a letter appended, then (SQL Server):
SELECT *
FROM Customer c
JOIN Device d
ON c.ID_number = CAST(SUBSTRING(d.ID_number, 1, LEN(d.ID_number) - 1) AS int)
More robust solutions that allow for more possibilities in the data require more defensive coding. You may want to define a scalar function to process Device.ID_number.

Oracle: Accessing parent attribute in subquery

How can I access 'parent' attributes in subqueries.
E.g. if I have the following Minimal Working Example snippet, I expect as output
"1,2:3"
however it fails with
ORA-904, T1.F1 invalid Identifier.
Now I know I can rewrite this complete query to get this working; however, my reasons for asking are:
Why can't I access the 'outer' attribute?
How can I access it with less modification?
I want to add a column without modifying the outer query too much.
MWE:
create table T1(F1 INTEGER);
create table T2(F2 INTEGER,F3 INTEGER);
insert into T1(F1) VALUES(1);
insert into T2(F2,F3) VALUES(1,2);
insert into T2(F2,F3) VALUES(1,2);
insert into T2(F2,F3) VALUES(1,3);
select T1.F1,
(SELECT LISTAGG(A,':') WITHIN GROUP (ORDER BY A) from (select distinct(F3) as A from T2 where F2 = T1.F1)) as B
from T1;
1) Why can't I access the 'outer' attribute?
Oracle only allows a subquery to access the tables of its direct parent query. Here, you're trying to access the main query from a subquery of a subquery.
2) How can I access it with less modification
Your innermost subquery could be removed, and you could apply a regex to remove duplicates as follows:
select
T1.F1,
(
SELECT REGEXP_REPLACE(
LISTAGG(F3,':') WITHIN GROUP (ORDER BY F3),
'([^:]+):(\1(:|$))+',
'\1\3'
)
from T2
where F2 = T1.F1
) as B
from T1;
This regex matches a token (token = the data before a : or before the end of the line) followed by one or more repeats of that same token, and replaces the whole match with the first token, plus the : if the match does not end the line.
3) I want to add a column without modifying the outer query too much
This way your outer query hasn't changed, so you can manage it the way you want.
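To see what the de-duplication pattern from point 2 does to a LISTAGG result, here is a small Python sketch applying the same pattern with the re module. Python's regex engine is not Oracle's, so treat this as an approximation of the behaviour; collapse_duplicates is a hypothetical helper name.

```python
import re

# Same pattern as in the query: a token, then one or more repeats of that token.
DUPLICATE_RUN = re.compile(r"([^:]+):(\1(:|$))+")

def collapse_duplicates(listagg_output):
    """Collapse adjacent repeated tokens in a ':'-separated, sorted list."""
    # \1 is the first token; \3 is the trailing ':' (empty at end of string).
    return DUPLICATE_RUN.sub(r"\1\3", listagg_output)

collapsed = collapse_duplicates("2:2:3")
print(collapsed)
```

The pattern relies on duplicates being adjacent, which the ORDER BY guarantees. In this Python sketch it can also mis-fire when a token is a suffix of the token before it: "12:2" collapses to "12". That order can occur when the values are sorted as strings, so the approach looks safest for numerically sorted integers such as F3.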

IF-Statement in SQLite: update or insert?

I can't run this query with SQLite:
if 0<(select COUNT(*) from Repetition where (Word='behnam' and Topic='mine'))
begin
update Repetition set Counts=1+ (select Counts from Repetition where (Word='behnam' and Topic='mine'))
end
else
begin
insert Repetition(Word,Topic,Counts)values('behnam','mine',1)
end
It says "Syntax error near IF".
How can I solve the problem?
SQLite does not have an IF statement (see the list of supported queries)
Instead, check out Eric B.'s suggestion on another thread. You're effectively looking at doing an UPSERT (UPdate if the record exists, INSERT if not). Eric B. has a good example of how to do this in SQLite syntax using SQLite's "INSERT OR REPLACE" functionality. Basically, you'd do something like:
INSERT OR REPLACE INTO Repetition (Word, Topic, Counts)
VALUES ( 'behnam', 'mine',
coalesce((select Counts + 1 from Repetition
where Word = 'behnam' AND Topic = 'mine')
,1)
)
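A runnable version of this upsert, as a minimal Python sketch with the standard sqlite3 module, is shown below. Note the assumption it makes: INSERT OR REPLACE only replaces a row when the insert would violate a uniqueness constraint, so the sketch declares UNIQUE (Word, Topic) on the table; the question does not show the table definition.

```python
import sqlite3

UPSERT = """
INSERT OR REPLACE INTO Repetition (Word, Topic, Counts)
VALUES (:word, :topic,
        coalesce((SELECT Counts + 1 FROM Repetition
                  WHERE Word = :word AND Topic = :topic), 1))
"""

conn = sqlite3.connect(":memory:")
# Assumed schema: the UNIQUE constraint is what lets OR REPLACE find the old row.
conn.execute(
    "CREATE TABLE Repetition ("
    "Word TEXT, Topic TEXT, Counts INTEGER, UNIQUE (Word, Topic))"
)

# The first call inserts Counts = 1; each later call replaces the row with Counts + 1.
for _ in range(3):
    conn.execute(UPSERT, {"word": "behnam", "topic": "mine"})

rows = conn.execute("SELECT Word, Topic, Counts FROM Repetition").fetchall()
print(rows)
```

Without the uniqueness constraint, every call would simply add another row.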
Another approach is to INSERT ... SELECT ... WHERE ... EXISTS [or not] (SELECT ...);
I do this sort of thing all the time, and I use jklemmack's suggestion as well. And I do it for other purposes too, such as doing JOINs in UPDATEs (which SQLite3 does not support).
For example:
CREATE TABLE t(id INTEGER PRIMARY KEY, c1 TEXT NOT NULL UNIQUE, c2 TEXT);
CREATE TABLE r(c1 TEXT NOT NULL UNIQUE, c2 TEXT);
INSERT OR REPLACE INTO t (id, c1, c2)
SELECT t.id, coalesce(r.c1, t.c1), coalesce(r.c2, t.c2)
FROM r LEFT OUTER JOIN t ON r.c1 = t.c1
WHERE r.c2 = :param;
The WHERE there has the condition that you'd have in your IF. The JOIN in the SELECT provides the JOIN that SQLite3 doesn't support in UPDATE. The INSERT OR REPLACE and the use of t.id (which can be NULL if the row doesn't exist in t) together provide the THEN and ELSE bodies.
You can apply this over and over. If you'd have three statements (that cannot somehow be merged into one) in the THEN part of the IF you'd need to have three statements with the IF condition in their WHEREs.
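The INSERT OR REPLACE ... SELECT pattern can be exercised with the following Python sketch, using the same t and r tables. The sample rows and the bound value 'new' are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t(id INTEGER PRIMARY KEY, c1 TEXT NOT NULL UNIQUE, c2 TEXT);
    CREATE TABLE r(c1 TEXT NOT NULL UNIQUE, c2 TEXT);
    -- Made-up data: 'a' already exists in t, 'b' does not, 'c' fails the WHERE.
    INSERT INTO t (id, c1, c2) VALUES (1, 'a', 'old');
    INSERT INTO r (c1, c2) VALUES ('a', 'new'), ('b', 'new'), ('c', 'skip');
    """
)

conn.execute(
    """
    INSERT OR REPLACE INTO t (id, c1, c2)
    SELECT t.id, coalesce(r.c1, t.c1), coalesce(r.c2, t.c2)
    FROM r LEFT OUTER JOIN t ON r.c1 = t.c1
    WHERE r.c2 = :param
    """,
    {"param": "new"},
)

# 'a' keeps id 1 and is updated; 'b' gets a fresh id because t.id was NULL.
result = conn.execute("SELECT id, c1, c2 FROM t ORDER BY id").fetchall()
print(result)
```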
This is called an UPSERT (i.e. UPdate or inSERT). It has its forms in almost every type of database. Look at this question for the SQLite version: SQLite - UPSERT *not* INSERT or REPLACE
One way that I've found relies on the WHERE clause evaluating a subquery as true or false:
SELECT * FROM SOME_TABLE
WHERE
(
SELECT SINGLE_COLUMN_NAME FROM SOME_OTHER_TABLE
WHERE
SOME_COLUMN = 'some value' and
SOME_OTHER_COLUMN = 'some other value'
)
This actually means: run the outer query only if the subquery yields a true (non-zero) value. To test whether the subquery returns any row at all, wrap it in EXISTS (...).
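The difference between testing the subquery's value and testing for the presence of a row shows up in this small Python sketch against in-memory SQLite. The table contents and the flag column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE some_table (name TEXT);
    CREATE TABLE some_other_table (flag INTEGER, some_column TEXT);
    INSERT INTO some_table (name) VALUES ('row1'), ('row2');
    -- A matching row exists, but the value it holds is 0 (false).
    INSERT INTO some_other_table (flag, some_column) VALUES (0, 'some value');
    """
)

# Scalar subquery: the WHERE clause sees the value 0, so nothing is returned.
by_value = conn.execute(
    """
    SELECT name FROM some_table
    WHERE (SELECT flag FROM some_other_table WHERE some_column = 'some value')
    """
).fetchall()

# EXISTS: only the presence of a row matters, so every outer row is returned.
by_existence = conn.execute(
    """
    SELECT name FROM some_table
    WHERE EXISTS (SELECT flag FROM some_other_table
                  WHERE some_column = 'some value')
    ORDER BY name
    """
).fetchall()

print(by_value)
print(by_existence)
```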