Determine which values of a long list do not exist in a Postgres table - sql

Suppose there is a long list of values that should occur as attribute values of records in a Postgres database.
I would like to write a query that finds out which of these values cannot be found in the database.
I am not allowed to execute DDL statements, and I would like to avoid procedural code.
Example:
the table might be
CREATE TABLE Test (
ID Integer,
attr varchar(30)
)
The list might be something like (but longer, about 240000 values)
ATTR
TestValue0
TestValue1
TestValue2
TestValue3
Using sed I can create and execute a statement
select count(*) from Test where attr in ('TestValue0',
'TestValue1','TestValue2','TestValue3')
This statement shows me that not all of these values can be found in Test.
How can I formulate a query that tells me which of these unique values cannot be found in the Postgres database?

For what you want to do, you can use LEFT JOIN, NOT IN, or NOT EXISTS. The key is that you need a derived table with the values you care about:
select v.attr
from (values ('TestValue0'), ('TestValue1'), ('TestValue2'), ('TestValue3')
) v(attr)
where not exists (select 1 from test t where t.attr = v.attr);
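To try the derived-table idea without a Postgres server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. SQLite does not accept the `v(attr)` alias on a VALUES subquery, so the sketch names the column through a CTE instead; the helper name `missing_values` and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def missing_values(conn, candidates):
    """Return the candidates that have no matching row in Test.attr."""
    # One "(?)" row per candidate builds the derived table of values.
    rows = ", ".join("(?)" for _ in candidates)
    sql = (
        f"WITH v(attr) AS (VALUES {rows}) "
        "SELECT v.attr FROM v "
        "WHERE NOT EXISTS (SELECT 1 FROM Test t WHERE t.attr = v.attr) "
        "ORDER BY v.attr"
    )
    return [r[0] for r in conn.execute(sql, list(candidates))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (ID INTEGER, attr VARCHAR(30))")
# Hypothetical sample data: only two of the four list values exist.
conn.executemany(
    "INSERT INTO Test VALUES (?, ?)",
    [(1, "TestValue0"), (2, "TestValue2")],
)

missing = missing_values(
    conn, ["TestValue0", "TestValue1", "TestValue2", "TestValue3"]
)
print(missing)
```

With a list of about 240000 values, the number of bound parameters may exceed what a driver allows, so the list would probably have to be sent in batches or inlined as literals, as the sed approach in the question does.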

Related

PostgreSQL subqueries as values

I am trying to use a PostgreSQL INSERT query with a subquery as a parameter value. The goal is to first find the user_id that corresponds to a given auth_token in the user_info table, and then create a new entry in a different table with that user_id.
My query looks something like this
INSERT INTO user_movies(user_id, date, time, movie, rating)
VALUES ((SELECT user_id FROM user_info where auth_token = $1),$2,$3,$4,$5)
RETURNING *
I know that a query such as this will work with a single value
INSERT INTO user_movies(user_id)
SELECT user_id FROM user_info where auth_token = $1
RETURNING *
But how do I allow for multiple input values? Is this even possible in PostgreSQL?
I am using Node.js to run this query, hence the $ placeholders.
To expand on my comment (it is probably a solution, IIUC): Easiest in this case would be to make the inner query return all the values. So, assuming columns from the inner query have the right names, you could just
INSERT INTO user_movies(user_id, date, time, movie, rating)
SELECT user_id,$2,$3,$4,$5 FROM user_info where auth_token = $1
RETURNING *
Note that this form has no VALUES clause; it uses a query instead.
Edited 20220424: a_horse_with_no_name removed the useless brackets around SELECT ... that appeared in my original version; thanks!
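A small sketch of the INSERT ... SELECT form, run against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL with Node.js. The sample token, date, and movie are made up, plain `?` placeholders replace `$1`..`$5`, and RETURNING is left out because older SQLite builds do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_info (user_id INTEGER PRIMARY KEY, auth_token TEXT);
CREATE TABLE user_movies (user_id INTEGER, date TEXT, time TEXT,
                          movie TEXT, rating INTEGER);
INSERT INTO user_info VALUES (7, 'tok-abc');  -- hypothetical user
""")

# The constants ride along in the SELECT list; only user_id is looked up.
conn.execute(
    "INSERT INTO user_movies (user_id, date, time, movie, rating) "
    "SELECT user_id, ?, ?, ?, ? FROM user_info WHERE auth_token = ?",
    ("2022-04-24", "20:00", "Alien", 5, "tok-abc"),
)

inserted = conn.execute("SELECT * FROM user_movies").fetchall()
print(inserted)
```

If the token matches no user, the SELECT yields no rows and nothing is inserted. The VALUES form with a scalar subquery behaves differently in that case: it attempts to insert a row whose user_id is NULL.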
You could try using a WHERE ... IN clause:
INSERT INTO user_movies(user_id)
SELECT user_id
FROM user_info
WHERE auth_token IN ($1,$2,$3,$4,$5)
RETURNING *

How can I parse a string value in one table to join with values in another table

To create a report, I need to join two tables that have no obvious join key. I did find a way they could potentially be joined, but it's complicated.
There is table A, which contains a column called select_criteria. Here are some examples of the values it contains:
SELECT DISTINCT SUM(TRANSCRIPTDETAIL.CREDIT_BILLING) FROM SOMETABLE WHERE (( STUDENTFINANCIAL.TUITION_EXEMPTION = 'EMPFT' ) OR ( STUDENTFINANCIAL.TUITION_EXEMPTION = 'EMPPT' )))
SELECT DISTINCT SUM(TRANSCRIPTDETAIL.CREDIT_BILLING) FROM SOMETABLE WHERE ( STUDENTFINANCIAL.TUITION_EXEMPTION = 'PART50' )
In table B, I have a column called tuition_exemption, which contains values like:
EMPFT
EMPPT
PART50
At the tail end of the whole value within the column in table A, there are the tuition exemption codes that match the values in table B.
Is there a way using MSSQL where I can parse out the codes from the long statement in select_criteria, so that they perfectly match the codes from table B? This is my thought on a way to join up table A and table B like I need to do. The other complication is that there is a 1:many connection between select_criteria and a tuition_exemption value, but a 1:1 connection between a tuition_exemption value and a select_criteria value.
So in the end, the join between the two tables should print, in one example, the same select_criteria value twice (I am referencing the first value in my list above from table A), but in those two rows, the two different tuition_exemption values (EMPFT and EMPPT). Or in the case of table A example 2, it would be printed once and match up to PART50 once.
I am stuck here. I have a statement that successfully grabs the select_criteria values I want:
SELECT select_criteria
FROM tableA
WHERE (
select_criteria LIKE '%EMPFT%' OR
select_criteria LIKE '%EMPPT%' OR
select_criteria LIKE '%PART50%'
)
But what I need to do is this: when it grabs the select_criteria values I want, I then want to print the code it matches in a new column in this table. Those codes are values in table B, like 'EMPFT', 'EMPPT' and 'PART50'. That is why I was thinking of parsing the codes out of select_criteria and printing them into the new column in table A. That way table A and table B have a value to match on and I can run my report. I just don't know how to do it in SQL. I kind of know how in Perl, but was hoping to do all of this in SSMS 2012.
Thanks for any help!
byobob
You can use any expression that returns a boolean as a join condition. Since LIKE returns a boolean, you should be able to just do this:
select *
from tableA
join tableB
on tableA.select_criteria like '%' + tableB.codecolumn + '%'
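Here is a minimal sketch of the LIKE join, run in SQLite through Python's sqlite3 module instead of SQL Server, so string concatenation uses `||` rather than `+`. The shortened criteria strings, the `id` column, and the extra code 'EMP' are assumptions added for illustration. The pattern also wraps the code in single quotes, a small variation on the join above, so that a short code does not match inside a longer one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, select_criteria TEXT)")
conn.execute("CREATE TABLE tableB (tuition_exemption TEXT)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?)",
    [
        (1, "WHERE (( TUITION_EXEMPTION = 'EMPFT' ) OR ( TUITION_EXEMPTION = 'EMPPT' ))"),
        (2, "WHERE ( TUITION_EXEMPTION = 'PART50' )"),
    ],
)
# 'EMP' is a hypothetical code that is a prefix of two real ones.
conn.executemany(
    "INSERT INTO tableB VALUES (?)",
    [("EMPFT",), ("EMPPT",), ("PART50",), ("EMP",)],
)

# '%''' is the SQL literal %' and '''%' is the SQL literal '%,
# so the pattern becomes %'CODE'% and only whole quoted codes match.
matches = conn.execute(
    "SELECT a.id, b.tuition_exemption "
    "FROM tableA a JOIN tableB b "
    "ON a.select_criteria LIKE '%''' || b.tuition_exemption || '''%' "
    "ORDER BY a.id, b.tuition_exemption"
).fetchall()
print(matches)
```

The first criteria row appears twice, once per matching code, which is the 1:many shape described in the question.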

HiveQL: Using query results as variables

In Hive, I'd like to dynamically extract information from a table, save it in a variable, and use it later. Consider the following example, where I retrieve the maximum of column var and want to use it as a condition in the subsequent query.
set maximo=select max(var) from table;
select
*
from
table
where
var=${hiveconf:maximo}
It does not work, although
set maximo=select max(var) from table;
${hiveconf:maximo}
shows me the intended result.
Doing:
select '${hiveconf:maximo}'
gives
"select max(var) from table"
though.
Best
Hive substitutes variables as is and does not execute them. Use a shell wrapper script to get the result into a variable and pass it to your Hive script.
maximo=$(hive -e "set hive.cli.print.header=false; select max(var) from table;")
hive -hiveconf "maximo"="$maximo" -f your_hive_script.hql
And after this inside your script you can use select '${hiveconf:maximo}'
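The same two-step wrapper can be sketched in Python with the standard subprocess module, for cases where the wrapper is not a shell script. The function name `run_script_with_max` and its `run` parameter are assumptions for illustration; the Hive options (`-e`, `-hiveconf`, `-f`) are the ones used in the shell version above.

```python
import subprocess


def run_script_with_max(script_path, run=subprocess.run):
    """Fetch max(var) with one Hive call, then pass it on via -hiveconf."""
    query = "set hive.cli.print.header=false; select max(var) from table;"
    # Step 1: run the query and capture its single-value output.
    first = run(["hive", "-e", query],
                capture_output=True, text=True, check=True)
    maximo = first.stdout.strip()
    # Step 2: hand the value (not the query text) to the script, where
    # ${hiveconf:maximo} is substituted as plain text.
    return run(["hive", "-hiveconf", f"maximo={maximo}", "-f", script_path],
               capture_output=True, text=True, check=True)
```

Calling `run_script_with_max("your_hive_script.hql")` needs the `hive` executable on the PATH. The `run` argument exists only so the command lines can be checked without a Hive installation.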
@Hein du Plessis
Whilst it's not possible to do exactly what you want from Hue -- a constant source of frustration for me -- if you are restricted to Hue, and can't use a shell wrapper as suggested above, there are workarounds depending on the scenario.
When I once wanted to set a variable by selecting the max of a column in a table to use in a query, I got round it like this:
I first put the result into a table comprising two columns, with the (arbitrary word) 'MAX_KEY' in one column and the result of the max calculation in the other, like this:
drop table if exists tam_seg.tbl_stg_temp_max_id;
create table tam_seg.tbl_stg_temp_max_id as
select
'MAX_KEY' as max_key
, max(pvw_id) as max_id
from
tam_seg.tbl_dim_cc_phone_vs_web;
I then added the word 'MAX_KEY' to a sub-query then joined in the above table so I could use the result in the main query:
select
-- *** here is the joined in value from the table being used ***
cast(mxi.max_id + qry.temp_id as string) as pvw_id
, qry.cc_phone_vs_web
from
(
select
snp.cc_phone_vs_web
, row_number() over(order by snp.cc_phone_vs_web) as temp_id
-- *** here is the key being added to the sub-query ***
, 'MAX_KEY' as max_key
from
(
select distinct cc_phone_vs_web from tam_seg.tbl_stg_base_snapshots
) as snp
left outer join
tam_seg.tbl_dim_cc_phone_vs_web as pvw
on snp.cc_phone_vs_web = pvw.cc_phone_vs_web
where
pvw.cc_phone_vs_web is null
) as qry
-- *** here is the table with the select result in being joined in ***
left outer join
tam_seg.tbl_stg_temp_max_id as mxi
on qry.max_key = mxi.max_key
;
Not sure if this is your scenario but maybe it can be adapted. I'm 99% sure you can't just put a select statement directly into a variable in Hue though.
If I am doing something in just Hue, I would probably use the temporary table and join method. But if I were using a shell wrapper anyway, I would definitely do it there.
I hope this helps.
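The key-table workaround can be tried on a small scale. The sketch below uses SQLite through Python's sqlite3 module rather than Hive, with simplified, made-up table and column names (`dim`, `staged`, `tmp_max`), and it assumes an SQLite version with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim (pvw_id INTEGER, name TEXT);
INSERT INTO dim VALUES (1, 'phone'), (2, 'web');
CREATE TABLE staged (name TEXT);
INSERT INTO staged VALUES ('phone'), ('chat'), ('app');
-- Step 1: park the scalar in a one-row table next to a constant key.
CREATE TABLE tmp_max AS
SELECT 'MAX_KEY' AS max_key, max(pvw_id) AS max_id FROM dim;
""")

# Step 2: add the same constant key to the subquery and join on it,
# so every row can see max_id without a variable.
new_rows = conn.execute("""
SELECT mxi.max_id + qry.temp_id AS pvw_id, qry.name
FROM (
    SELECT s.name,
           row_number() OVER (ORDER BY s.name) AS temp_id,
           'MAX_KEY' AS max_key
    FROM (SELECT DISTINCT name FROM staged) AS s
    LEFT JOIN dim AS d ON s.name = d.name
    WHERE d.name IS NULL
) AS qry
LEFT JOIN tmp_max AS mxi ON qry.max_key = mxi.max_key
ORDER BY pvw_id
""").fetchall()
print(new_rows)
```

The two names missing from `dim` receive ids that continue after the stored maximum.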

'In' clause in SQL server with multiple columns

I have a component that retrieves data from database based on the keys provided.
However, I want my Java application to get the data for all keys in a single database hit to speed things up.
I can use the IN clause when I have only one key.
When working with more than one key, I can use the query below in Oracle:
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to running
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'CORE'
together.
However, using the IN clause as above gives the error below in SQL Server:
ERROR:An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let me know if there is any way to achieve the same in SQL Server.
This syntax doesn't exist in SQL Server. Use a combination of AND and OR.
SELECT *
FROM <table_name>
WHERE
(value_type = 'I' and CODE1 = 'COMM')
OR (value_type = 'I' and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in Oracle with multiple fields.)
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.value_type = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
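Since the keys come from application code, the AND/OR pattern can be generated with bound parameters instead of being typed by hand. This is a minimal sketch in Python with sqlite3 standing in for the Java application and SQL Server; the table name `codes`, the `payload` column, and the helper `fetch_by_pairs` are hypothetical.

```python
import sqlite3


def fetch_by_pairs(conn, pairs):
    """Fetch rows whose (value_type, CODE1) equals any of the given pairs."""
    if not pairs:
        return []
    # One "(value_type = ? AND CODE1 = ?)" group per pair, joined with OR.
    predicate = " OR ".join("(value_type = ? AND CODE1 = ?)" for _ in pairs)
    params = [value for pair in pairs for value in pair]
    sql = (
        "SELECT value_type, CODE1, payload FROM codes "
        f"WHERE {predicate} ORDER BY CODE1"
    )
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO codes VALUES (?, ?, ?)",
    [("I", "COMM", "a"), ("I", "CORE", "b"),
     ("X", "COMM", "c"), ("I", "MISC", "d")],
)

found = fetch_by_pairs(conn, [("I", "COMM"), ("I", "CORE")])
print(found)
```

Only the placeholders are generated from the list length; the values themselves are always bound, so the key data never becomes part of the SQL text.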
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table value constructor and perform a join against it. You can only specify up to 1000 rows in a table value constructor, hence the 1000-tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique, otherwise you'll get duplicate values. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this using EXISTS instead of JOIN. That may have different performance characteristics, and it avoids producing duplicate results if your values aren't unique. It may be worth trying both EXISTS and JOIN on your use case to see which is a better fit. Here's how EXISTS would look:
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table value constructor may be your best choice. Use EXISTS or JOIN, depending on which gives you better performance on your database.
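The difference between JOIN and EXISTS when the value list contains a repeated tuple can be seen in a small experiment. The sketch below runs on SQLite through Python's sqlite3 module, not SQL Server, so the value list is named through a CTE instead of the `AS MyTable(a, b)` alias; the `codes` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT)")
conn.executemany(
    "INSERT INTO codes VALUES (?, ?)",
    [("I", "COMM"), ("I", "CORE"), ("X", "COMM")],
)

# The list deliberately contains ('I', 'COMM') twice.
pairs = [("I", "COMM"), ("I", "CORE"), ("I", "COMM")]
values = ", ".join("(?, ?)" for _ in pairs)
params = [value for pair in pairs for value in pair]
cte = f"WITH MyTable(a, b) AS (VALUES {values}) "

join_rows = conn.execute(
    cte + "SELECT c.CODE1 FROM codes c "
    "JOIN MyTable ON a = c.value_type AND b = c.CODE1 "
    "ORDER BY c.CODE1",
    params,
).fetchall()

exists_rows = conn.execute(
    cte + "SELECT c.CODE1 FROM codes c "
    "WHERE EXISTS (SELECT 1 FROM MyTable "
    "WHERE a = c.value_type AND b = c.CODE1) "
    "ORDER BY c.CODE1",
    params,
).fetchall()

print(join_rows)    # the COMM row comes back once per matching list entry
print(exists_rows)  # each table row comes back at most once
```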
Normally you cannot do this, but you can use the following technique.
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put them in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT.*
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.value_type AND T.Code = DT.CODE1
Thus, you avoid storing data in your code (persistent table version) and can easily modify the filters without changing the code.
I think you can try this: combine AND and OR at the same time.
SELECT
*
FROM
<table_name>
WHERE
value_type = 'I'
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar problem in MS SQL, but a little different. Maybe it will help somebody in the future. In my case I found this solution (not the full code, just an example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of course, I have an index on Coupon and Campaign in the table for fast search.
Compute it in MS SQL:
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');
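One thing to check before relying on the concatenation trick is whether the separator can occur inside the data, because two different tuples can then produce the same string. The sketch below shows this with SQLite through Python's sqlite3 module (using `||` instead of `+`); the rows are contrived for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (value_type TEXT, CODE1 TEXT)")
# Two different tuples that concatenate to the same string 'A|B|C'.
conn.executemany(
    "INSERT INTO codes VALUES (?, ?)",
    [("A", "B|C"), ("A|B", "C")],
)

# Concatenated comparison: looking for ('A', 'B|C') also finds ('A|B', 'C').
concat_rows = conn.execute(
    "SELECT value_type, CODE1 FROM codes "
    "WHERE value_type || '|' || CODE1 IN (?) "
    "ORDER BY value_type",
    ("A" + "|" + "B|C",),
).fetchall()

# Column-by-column comparison finds only the intended row.
exact_rows = conn.execute(
    "SELECT value_type, CODE1 FROM codes "
    "WHERE value_type = ? AND CODE1 = ?",
    ("A", "B|C"),
).fetchall()

print(concat_rows)
print(exact_rows)
```

Comparing a computed expression can also keep the database from using an ordinary index on the individual columns, which is worth measuring on a large table.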

IF-Statement in SQLite: update or insert?

I can't run this query with SQLite:
if 0<(select COUNT(*) from Repetition where (Word='behnam' and Topic='mine'))
begin
update Repetition set Counts=1+ (select Counts from Repetition where (Word='behnam' and Topic='mine'))
end
else
begin
insert Repetition(Word,Topic,Counts)values('behnam','mine',1)
end
It says "Syntax error near IF".
How can I solve the problem?
SQLite does not have an IF statement (see the list of supported queries)
Instead, check out Eric B's suggestion on another thread. You're effectively looking at doing an UPSERT (UPdate if the record exists, INSERT if not). Eric B. has a good example of how to do this in SQLite syntax using the "INSERT OR REPLACE" functionality in SQLite. Basically, you'd do something like:
INSERT OR REPLACE INTO Repetition (Word, Topic, Counts)
VALUES ( 'behnam', 'mine',
coalesce((select Counts + 1 from Repetition
where Word = 'behnam' AND Topic = 'mine')
,1)
)
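A runnable sketch of this counter upsert with Python's sqlite3 module follows. It assumes a UNIQUE constraint on (Word, Topic), which the question does not show: without a uniqueness constraint, INSERT OR REPLACE has nothing to conflict with and simply adds a new row each time. The helper name `bump` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: (Word, Topic) is declared unique so REPLACE can find the old row.
conn.execute(
    "CREATE TABLE Repetition ("
    "Word TEXT, Topic TEXT, Counts INTEGER, UNIQUE (Word, Topic))"
)


def bump(conn, word, topic):
    """Insert the pair with Counts = 1, or replace it with Counts + 1."""
    conn.execute(
        "INSERT OR REPLACE INTO Repetition (Word, Topic, Counts) "
        "VALUES (?, ?, COALESCE("
        "(SELECT Counts + 1 FROM Repetition WHERE Word = ? AND Topic = ?), 1))",
        (word, topic, word, topic),
    )


for _ in range(3):
    bump(conn, "behnam", "mine")

rows = conn.execute("SELECT Word, Topic, Counts FROM Repetition").fetchall()
print(rows)
```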
Another approach is to INSERT ... SELECT ... WHERE ... EXISTS [or not] (SELECT ...);
I do this sort of thing all the time, and I use jklemmack's suggestion as well. And I do it for other purposes too, such as doing JOINs in UPDATEs (which SQLite3 does not support).
For example:
CREATE TABLE t(id INTEGER PRIMARY KEY, c1 TEXT NOT NULL UNIQUE, c2 TEXT);
CREATE TABLE r(c1 TEXT NOT NULL UNIQUE, c2 TEXT);
INSERT OR REPLACE INTO t (id, c1, c2)
SELECT t.id, coalesce(r.c1, t.c1), coalesce(r.c2, t.c2)
FROM r LEFT OUTER JOIN t ON r.c1 = t.c1
WHERE r.c2 = @param;
The WHERE there has the condition that you'd have in your IF. The JOIN in the SELECT provides the JOIN that SQLite3 doesn't support in UPDATE. The INSERT OR REPLACE and the use of t.id (which can be NULL if the row doesn't exist in t) together provide the THEN and ELSE bodies.
You can apply this over and over. If you'd have three statements (that cannot somehow be merged into one) in the THEN part of the IF you'd need to have three statements with the IF condition in their WHEREs.
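The tables from this answer can be exercised with Python's sqlite3 module to see both branches at work. The sample rows are made up, and the parameter is bound with a plain `?` placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t(id INTEGER PRIMARY KEY, c1 TEXT NOT NULL UNIQUE, c2 TEXT);
CREATE TABLE r(c1 TEXT NOT NULL UNIQUE, c2 TEXT);
INSERT INTO t VALUES (1, 'a', 'old');
INSERT INTO r VALUES ('a', 'new'), ('b', 'new'), ('c', 'other');
""")

# Rows of r that pass the WHERE either replace their match in t
# (t.id is known) or are inserted as new rows (t.id is NULL).
conn.execute(
    "INSERT OR REPLACE INTO t (id, c1, c2) "
    "SELECT t.id, coalesce(r.c1, t.c1), coalesce(r.c2, t.c2) "
    "FROM r LEFT OUTER JOIN t ON r.c1 = t.c1 "
    "WHERE r.c2 = ?",
    ("new",),
)

result = conn.execute("SELECT id, c1, c2 FROM t ORDER BY id").fetchall()
print(result)
```

Row 'a' is updated in place, row 'b' is inserted with a fresh id, and row 'c' is skipped because it fails the WHERE condition.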
This is called an UPSERT (i.e. UPdate or inSERT). It has its forms in almost every type of database. Look at this question for the SQLite version: SQLite - UPSERT *not* INSERT or REPLACE
One way that I've found is based on using a subquery as the true/false condition of a WHERE clause:
SELECT * FROM SOME_TABLE
WHERE
(
SELECT SINGLE_COLUMN_NAME FROM SOME_OTHER_TABLE
WHERE
SOME_COLUMN = 'some value' and
SOME_OTHER_COLUMN = 'some other value'
)
In effect, the outer query only returns rows if the inner query returns a result.
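A caveat worth testing: in SQLite, a bare subquery in WHERE is judged by the truth value of what it returns, not by whether it returns a row, so a text value such as 'yes' counts as false. Wrapping the subquery in EXISTS checks for the presence of a row regardless of its value. The sketch below compares the two forms with Python's sqlite3 module; table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE some_table (name TEXT);
INSERT INTO some_table VALUES ('row1'), ('row2');
CREATE TABLE some_other_table (single_column TEXT, some_column TEXT);
INSERT INTO some_other_table VALUES ('yes', 'some value');
""")

# Bare subquery: the returned text 'yes' is converted to the number 0,
# which is false, so no rows come back even though a match exists.
bare = conn.execute(
    "SELECT name FROM some_table WHERE "
    "(SELECT single_column FROM some_other_table "
    "WHERE some_column = 'some value')"
).fetchall()

# EXISTS only asks whether the subquery produced a row.
guarded = conn.execute(
    "SELECT name FROM some_table WHERE EXISTS "
    "(SELECT 1 FROM some_other_table WHERE some_column = 'some value') "
    "ORDER BY name"
).fetchall()

print(bare)
print(guarded)
```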