SQL - Select IN from array, and NOT IN in the same query

I have a userform built in VBA where my coworkers can enter multiple values that builds an array and places it into an IN statement, which works great. Problem is I need to also be able to display what values do not exist within the tables.
Example table
id | value
1 | value1
2 | value2
4 | value4
Then a query that could be generated would be
SELECT [id],[value] FROM [tablea] WHERE [id] IN (1,2,3,4)
Expected or desirable outcome would be as follows
id | value
1 | value1
2 | value2
3 | null
4 | value4
I've tried doing it like so;
SELECT [id],[value] FROM [tablea] WHERE [id] IN (1,2,3,4) AND [id] NOT IN (1,2,3,4)
Since both lists are the same, this returns 0 rows, of course.
I know I can do this with a union, defining the NOT IN condition in the second half of the union, but I'd like to do this without a union. Any other thoughts?
This is on Microsoft SQL 2005
I unfortunately only have access to SELECT, since I'm performing queries either via VBA or Tableau. So I cannot create a derived table or have anything to reference other than the select statement.

You need a left join of some sort. One way would be to construct your query as follows (note that the values constructor needs SQL Server 2008 or later):
select v.id, t.value
from (values (1), (2), (3), (4)
) v(id) left join
tablea t
on v.id = t.id;

Thanks to Joel Coehoorn for the tip towards using a CTE
I was able to accomplish this like so;
WITH numbers AS (
SELECT 1 AS num UNION ALL
SELECT 2 AS num UNION ALL
SELECT 3 AS num UNION ALL
SELECT 4 AS num )
SELECT
COALESCE(id,num) as col1,
id as col2
FROM tablea
RIGHT JOIN numbers ON tablea.id = numbers.num
This would return
col1 | col2
1 | 1
2 | 2
3 | NULL
4 | 4
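Since the list comes from a userform, the CTE has to be generated from the entered values. Here is a minimal sketch of that step, written in Python with sqlite3 rather than VBA against SQL Server, so treat it as an illustration only. The helper name select_with_gaps is hypothetical, and the sample rows are the ones from the question.

```python
import sqlite3

def select_with_gaps(conn, ids):
    """Return (id, value) for every requested id; value is None when the id is absent."""
    if not ids:
        return []
    # One "SELECT ?" per requested id, glued with UNION ALL, becomes the numbers CTE.
    numbers = " UNION ALL ".join("SELECT ?" for _ in ids)
    sql = (
        f"WITH numbers(num) AS ({numbers}) "
        "SELECT numbers.num, tablea.value "
        "FROM numbers LEFT JOIN tablea ON tablea.id = numbers.num "
        "ORDER BY numbers.num"
    )
    # The ids are passed as bound parameters, never pasted into the SQL text.
    return conn.execute(sql, list(ids)).fetchall()

# Sample data from the question (id 3 is deliberately missing).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO tablea VALUES (?, ?)",
    [(1, "value1"), (2, "value2"), (4, "value4")],
)

rows = select_with_gaps(conn, [1, 2, 3, 4])
print(rows)
```

The LEFT JOIN from numbers plays the same role as the RIGHT JOIN to numbers above: every entered id is kept, matched or not.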

Related

How to do the equivalent of a double for loop append for SQL

I'm setting up this table in SQL (Presto; I'm also new to SQL) that has 2 columns, col1 and col2. These two columns are generated from two other existing tables, table1 and table2. To keep things simple, let's say col1 from table1 has 3 values and col2 from table2 has 2 values. I want the table I create to look like this (let's call it table3; I'll use col1.1 to denote the first value in that column, and so on):
col1 | col 2
--------------------
col1.1 | col2.1
col1.1 | col2.2
col1.2 | col2.1
col1.2 | col2.2
col1.3 | col2.1
col1.3 | col2.2
I know how to do this in Python using Pandas like I did here (dummy example):
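import pandas

a = [1, 2, 3]
b = ['sam', 'john']
combined_lst = []
# Pair every value of a with every value of b
for i in a:
    for j in b:
        combined_lst.append({'col1': i, 'col2': j})
# pandas.json_normalize replaces the deprecated pandas.io.json.json_normalize
table = pandas.json_normalize(combined_lst)
print(table)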
Table output:
   col1  col2
0     1   sam
1     1  john
2     2   sam
3     2  john
4     3   sam
5     3  john
Basically it should be in the format of the table above. I've looked into using UNION ALL iteratively, but I'm not sure I'm on the right track.
I think you want a cross join:
select row_number() over (order by t1.col1, t2.col2) as id, t1.col1, t2.col2
from table1 t1 cross join
table2 t2;
The row_number() is there in case the first column of your output (the index) is supposed to be part of the data.
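To connect this back to the double for loop: the nested loops are a Cartesian product, which is what CROSS JOIN computes. A small sketch, assuming SQLite (via Python's sqlite3) as a stand-in for Presto and using the dummy values from the question, that checks both routes give the same pairs:

```python
import sqlite3
from itertools import product

a = [1, 2, 3]
b = ["sam", "john"]

# The double for loop from the question, written as a Cartesian product.
loop_pairs = list(product(a, b))

# The same data as two single-column tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER)")
conn.execute("CREATE TABLE table2 (col2 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in a])
conn.executemany("INSERT INTO table2 VALUES (?)", [(j,) for j in b])

# CROSS JOIN pairs every row of table1 with every row of table2.
sql_pairs = conn.execute(
    "SELECT t1.col1, t2.col2 "
    "FROM table1 AS t1 CROSS JOIN table2 AS t2 "
    "ORDER BY t1.col1, t2.col2"
).fetchall()

print(sql_pairs)
print(sorted(loop_pairs) == sql_pairs)
```

The row count is the product of the two input row counts (3 times 2 here), so a cross join on large tables grows quickly.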

Undo a LISTAGG in redshift

I have a table that probably resulted from a listagg, similar to this:
# select * from s;
s
-----------
a,c,b,d,a
b,e,c,d,f
(2 rows)
How can I change it into this set of rows:
a
c
b
d
a
b
e
c
d
f
In redshift, you can join against a table of numbers, and use that as the split index:
--with recursive Numbers as (
-- select 1 as i
-- union all
-- select i + 1 as i from Numbers where i <= 5
--)
with Numbers(i) as (
select 1 union
select 2 union
select 3 union
select 4 union
select 5
)
select split_part(s,',', i) from Numbers, s ORDER by s,i;
EDIT: redshift doesn't seem to support recursive subqueries, only postgres. :(
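If it helps to see the numbers-table idea in isolation, here is a rough sketch using Python's sqlite3. SQLite has no split_part, so the sketch registers a Python function under that name that imitates the Redshift behaviour (1-based index, empty string when the index is past the last element); that emulation and the numbers table are assumptions of the sketch, not part of Redshift.

```python
import sqlite3

def split_part(text, delimiter, index):
    """Imitate split_part: 1-based element of text, '' when index is out of range."""
    parts = text.split(delimiter)
    return parts[index - 1] if 1 <= index <= len(parts) else ""

conn = sqlite3.connect(":memory:")
conn.create_function("split_part", 3, split_part)

conn.execute("CREATE TABLE s (s TEXT)")
conn.executemany("INSERT INTO s VALUES (?)", [("a,c,b,d,a",), ("b,e,c,d,f",)])

# The numbers table is the split index; 6 entries is one more than needed here.
conn.execute("CREATE TABLE numbers (i INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(n,) for n in range(1, 7)])

# Each row of s is paired with each index; out-of-range indexes yield '' and are dropped.
rows = conn.execute(
    "SELECT split_part(s.s, ',', numbers.i) AS part "
    "FROM s CROSS JOIN numbers "
    "WHERE split_part(s.s, ',', numbers.i) <> '' "
    "ORDER BY s.s, numbers.i"
).fetchall()

parts = [row[0] for row in rows]
print(parts)
```

The numbers table only needs to be at least as long as the longest list; the filter on the empty string removes the surplus indexes. Note that the same filter also drops genuinely empty elements such as the middle one in 'a,,b'.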
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table s(
col varchar2(20) );
insert into s values('a,c,b,d,a');
insert into s values('b,e,c,d,f');
Query 1:
SELECT REGEXP_SUBSTR(t1.col, '([^,])+', 1, t2.COLUMN_VALUE )
FROM s t1 CROSS JOIN
TABLE
(
CAST
(
MULTISET
(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(t1.col, '([^,])+')
)
AS SYS.odciNumberList
)
) t2
Results:
| REGEXP_SUBSTR(T1.COL,'([^,])+',1,T2.COLUMN_VALUE) |
|---------------------------------------------------|
| a |
| c |
| b |
| d |
| a |
| b |
| e |
| c |
| d |
| f |
Since this is tagged Redshift and no answer so far gives a complete way of undoing a LISTAGG in Redshift properly, here is code that covers its use cases:
CREATE TEMPORARY TABLE s (
s varchar(255)
);
INSERT INTO s VALUES('a,c,b,d,a');
INSERT INTO s VALUES('b,e,c,d,f');
SELECT
TRIM(split_part(s.s,',',R::smallint)) AS s
FROM s
LEFT JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY 1) AS R
FROM any_large_table
LIMIT 1000
) extend_number
ON (SELECT MAX(regexp_count(s.s,',')+1) FROM s) >= extend_number.R
AND NULLIF(TRIM(split_part(s.s,',',extend_number.R::smallint)),'') IS NOT NULL;
DROP TABLE s;
Here “any_large_table” is any table you already have in Redshift with enough records for your purposes, depending on how many elements the list in each record can contain (in the case above, I allow for up to one thousand). Unfortunately, as far as I know the generate_series function does not work properly in Redshift, so this is the only way.
Another piece of advice: check whether you can get the values before they are aggregated with LISTAGG, whenever possible. As you can see, the code above is quite complex, and you save a lot of maintenance time if you keep things simple (that is, whenever the opportunity is available).

Show elements of where clause that are not present in table

I search a table based on an ID column in my where clause. I have a list of IDs that may or may not be present in this table. A simple query will give me the IDs which exist in that table (if any). Is there a way to also return the IDs that were not found?
Table --
ID
1GH
2BN
3ER
SELECT *
FROM Table
WHERE ID IN (big list 9FG, 1GH, 3UI etc)
--If ID's in above list are not in table, then show those ids.
Desired output -
9FG, 3UI were not found in the table
If I understand correctly what you need you can do it this way
SELECT q.id,
CASE WHEN t.id IS NULL THEN 'no' ELSE 'yes' END id_exists
FROM
(
SELECT '9FG' id UNION ALL
SELECT '1GH' UNION ALL
SELECT '3UI'
) q LEFT JOIN table1 t
ON q.id = t.id
Output:
| ID | ID_EXISTS |
|-----|-----------|
| 9FG | no |
| 1GH | yes |
| 3UI | no |
or if you just need a list of non-existent ids
SELECT q.id
FROM
(
SELECT '9FG' id UNION ALL
SELECT '1GH' UNION ALL
SELECT '3UI'
) q LEFT JOIN table1 t
ON q.id = t.id
WHERE t.id IS NULL
Output:
| ID |
|-----|
| 9FG |
| 3UI |
The trick is to use an OUTER JOIN instead of WHERE condition to filter data from your table and be able to see the mismatches.
Here is SQLFiddle demo
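For a big list you would normally generate the derived table rather than type it. A hedged sketch of the second query (only the ids that are missing), using Python's sqlite3 as a stand-in database; the helper name missing_ids is made up for this example and the table contents are the ones from the question.

```python
import sqlite3

def missing_ids(conn, ids):
    """Return the ids from the list that have no matching row in table1."""
    if not ids:
        return []
    # Derived table q: one "SELECT ? AS id" per searched id.
    wanted = " UNION ALL ".join("SELECT ? AS id" for _ in ids)
    sql = (
        f"SELECT q.id FROM ({wanted}) AS q "
        "LEFT JOIN table1 AS t ON t.id = q.id "
        "WHERE t.id IS NULL"  # keep only the rows the outer join could not match
    )
    return [row[0] for row in conn.execute(sql, list(ids))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [("1GH",), ("2BN",), ("3ER",)])

not_found = missing_ids(conn, ["9FG", "1GH", "3UI"])
print(", ".join(sorted(not_found)) + " were not found in the table")
```

The ids travel as bound parameters, so quoting of values like '9FG' is handled by the driver. Databases cap the number of parameters per statement, so a very long list may need to be sent in chunks.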
To search you can use
SELECT *
From Mytable
where id in (
select id from (values (1), (2), (3)) as SearchedIds(Id) )
and the opposite to find unmatched:
SELECT id from (values (1), (2), (3)) as SearchedIds(Id)
WHERE id not in (SELECT id From MyTable)
The syntax
VALUES (...) AS SearchedIds(Id)
is supported in SQL Server 2008; for SQL Server 2005 you have to use
( SELECT 1 AS Id UNION ALL SELECT 2 UNION ALL ...etc ) AS SearchedIds
Note: you can rewrite those queries with JOINS (INNER and LEFT)
Maybe something like:
SELECT id FROM my_table WHERE id NOT IN (val1, val2, val3)

T-sql Inner Join erratic behavior

I'm trying a simple self join, but its output is somewhat erratic.
My table (input) looks like this:
ID | Value
1 | val1
1 | val2
1 | val3
2 | val4
2 | val5
2 | val6
2 | val7
What I am trying to achieve is the following:
ID 1 | Value 1 | ID 2 | Value 2
1 | val1 | 2 | val4
1 | val2 | 2 | val5
1 | val3 | 2 | val6
Null | Null | 2 | val7
My attempt at achieving this output has been the following:
SELECT DISTINCT
column1.ID,
column1.value,
column2.ID,
column2.value
FROM table column1
INNER JOIN table column2 ON column1.ID = 1 AND column2.ID = 2
This chunk of code returns the incorrect number of rows; the total number of rows I should get is 4, with a few null values in the last one. I don't get any null values, but I do get some numbers and I don't know how they are getting there. Additionally, if I choose to display more fields from my table, the number of rows returned grows larger. I don't understand this behavior. Can somebody please help me fix it (and possibly tell me what I am doing wrong)?
You cannot do this effectively. There is no relationship between the rows on the left side and the right side of the join. A join needs a relationship; your ON condition only filters each side and doesn't specify a relationship between the two, so every row with ID 1 is paired with every row with ID 2 (3 times 4 = 12 rows for your sample data) and you see lots more rows than you intend.
If you are trying to use SQL to format your data for display, DON'T. Get your data, then format it in your client application.
Separate your two data sets into different CTEs or subqueries and use the ROW_NUMBER() function to assign row numbers in order of value. Then join the two on row number, using a FULL instead of an INNER join so you get null values on whichever side has fewer rows.
WITH CTE_1 AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY [Value]) AS RN
FROM dbo.table1
WHERE ID = 1
)
, CTE_2 AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY [Value]) AS RN
FROM dbo.table1
WHERE ID = 2
)
SELECT
c1.ID AS ID1
, c1.VALUE AS Value1
, c2.ID AS ID2
, c2.VALUE AS Value2
FROM CTE_1 c1
FULL JOIN CTE_2 c2 ON c1.RN = c2.RN
SQLFiddle is not working at the moment, so I can't set up a demo, but here is the sample table I used:
CREATE TABLE Table1
([ID] int, [Value] varchar(4))
;
INSERT INTO Table1
([ID], [Value])
VALUES
(1, 'val1'),
(1, 'val2'),
(1, 'val3'),
(2, 'val4'),
(2, 'val5'),
(2, 'val6'),
(2, 'val7')
;
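If the pairing ends up being done in the client application instead, as the other answer suggests, the same idea (number the rows on each side, then full-join on the row number) can be sketched in plain Python. This is only an illustration on the sample rows above; side_by_side is a hypothetical helper name.

```python
from itertools import zip_longest

# Sample rows from the table above, as (ID, Value) tuples.
rows = [
    (1, "val1"), (1, "val2"), (1, "val3"),
    (2, "val4"), (2, "val5"), (2, "val6"), (2, "val7"),
]

def side_by_side(rows, left_id, right_id):
    """Pair the rows of two IDs by position, padding the shorter side with None."""
    # Sorting each side by value plays the role of ROW_NUMBER() OVER (ORDER BY [Value]).
    left = sorted(r for r in rows if r[0] == left_id)
    right = sorted(r for r in rows if r[0] == right_id)
    # zip_longest matches rows by position and pads the shorter side,
    # which is what the FULL JOIN on RN does.
    return [l + r for l, r in zip_longest(left, right, fillvalue=(None, None))]

paired = side_by_side(rows, 1, 2)
for row in paired:
    print(row)
```

The result has as many rows as the longer of the two sides, four for this sample data.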

SQL select values of each selected columns on separate rows

I have a table with hundreds of rows and tens of W columns:
Column1 | Column2_W | Column3_W | ColumnX_W
123 | A | B | x
223 | A | NULL | NULL
How can I select it so that the output would be:
Column1 | W
123 | A
123 | B
123 | x
223 | A
EDIT: I am well aware that I am working with a terrible DB "design". Unfortunately I can't change it. This question is actually part of a larger problem I was handed today. I will try the given ideas tomorrow.
SELECT Column1, Column2_W
FROM table
UNION ALL
SELECT Column1, Column3_W
FROM table
UNION ALL
SELECT Column1, Column4_W
FROM table
....
ORDER BY Column1
Better option: redesign your database! This looks like a spreadsheet, not a relational database.
Take a look at this article: UNPIVOT: Normalizing data on the fly
Unfortunately, you're going to be stuck hand typing the column names with any of the solutions you use. Here's a sample using UNPIVOT in SQL Server...
SELECT Column1, W
FROM YourTable
-- SourceColumn receives the name of the column each value came from
UNPIVOT (W FOR SourceColumn IN (Column2_W, Column3_W /* and so on */)) AS Unpvt
If you don't have any UNPIVOT options...
SELECT
Column1,
CASE ColumnID WHEN 2 THEN Column2_W
WHEN 3 THEN Column3_W
...
WHEN X THEN ColumnX_W
END AS W
FROM
yourTable
CROSS JOIN
( SELECT 2 AS ColumnID
UNION ALL SELECT 3
...
UNION ALL SELECT X
)
AS ColumnIDs
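Typing the CASE branches by hand gets tedious with tens of columns, so the query text can be generated from a list of column names. A rough sketch with Python's sqlite3 follows; the table name yourTable comes from the answer, while the three-column layout (Column2_W, Column3_W, Column4_W) is an assumption standing in for the real column list. It also filters out the NULLs so the result matches the output asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(Column1 INTEGER, Column2_W TEXT, Column3_W TEXT, Column4_W TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [(123, "A", "B", "x"), (223, "A", None, None)],
)

# Hard-coded, trusted identifiers (never user input), since they are pasted into the SQL.
w_columns = ["Column2_W", "Column3_W", "Column4_W"]

# One ColumnID per W column, and one CASE branch that picks that column.
ids = " UNION ALL ".join(f"SELECT {n} AS ColumnID" for n in range(len(w_columns)))
cases = " ".join(f"WHEN {n} THEN {name}" for n, name in enumerate(w_columns))

sql = (
    "SELECT Column1, W FROM ("
    f"SELECT Column1, ColumnID, CASE ColumnID {cases} END AS W "
    f"FROM yourTable CROSS JOIN ({ids}) AS ColumnIDs"
    ") AS unpivoted "
    "WHERE W IS NOT NULL "
    "ORDER BY Column1, ColumnID"
)

rows = conn.execute(sql).fetchall()
print(rows)
```

The cross join repeats every source row once per W column, and the CASE expression then picks the one column that belongs to that copy.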