How to write redshift aws query to search for a value in comma delimited values - sql

table1
user_id
country_code
1
'IN,AU,AC'
2
'MX,IN'
table2
user_id
valid_country
1
'IN'
1
'AU'
2
'MX'
3
'YT'
4
'RU'
As you can see, some entries in the country_code column are multiple codes separated by commas. I would like to print user_id in table1 and their corresponding country_code only if they are valid. To check for validity here I need to use table2 which has user_id and valid_country.
The desired output is:
user_id
country_code
1
'IN'
1
'AU'
2
'MX'
Query looks like
select tb1.user_id, country_code from table1 tb1, table2 tb2 where
tb1.user_id=tb2.user_id and <Here I need to check if tb2.country_code
is there in tb1.country_code (codes separated by commas)>
Are there any simple solution that I could check valid_country in the comma separated values.

The simple way isn't always the best. There are a number of corner cases that can arise here (like are all country codes 2 letters). That said a LIKE clause would be simple:
select tb1.user_id, valid_country as country_code
from table1 tb1, table2 tb2
where tb1.user_id=tb2.user_id
and tb1.country_code like '%'||tb2.valid_country||'%'
Or if we are to put this in modern SQL syntax:
select tb1.user_id, valid_country as country_code
from table1 tb1 join table2 tb2
on tb1.user_id=tb2.user_id
and tb1.country_code like '%'||tb2.valid_country||'%'

Try this:
a) Verticalise tb1 by CROSS JOINing it with a series of consecutive integers (which I supply in a Common Table Expression), and applying the SPLIT_PART() function to break the comma delimited list into single element.
b) INNER JOIN the verticalised result with the valid user_id/country code combinations table on an equi-join on both columns.
WITH
-- your table 1, don't use in end query ...
tb1(user_id,country_code) AS (
SELECT 1,'IN,AU,AC'
UNION ALL SELECT 2,'MX,IN'
)
,
-- your table 2, don't use in end query ...
tb2(user_id,valid_country) AS (
SELECT 1,'IN'
UNION ALL SELECT 1,'AU'
UNION ALL SELECT 2,'MX'
UNION ALL SELECT 3,'YT'
UNION ALL SELECT 4,'RU'
)
-- real query starts here, replace following comma with "WITH" ...
,
i(i) AS ( -- need a series of integers ...
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
)
,
vertical AS (
SELECT
tb1.user_id
, i
, SPLIT_PART(country_code,',',i) AS valid_country
FROM tb1 CROSS JOIN i
WHERE SPLIT_PART(country_code,',',i) <> ''
)
SELECT
vertical.user_id
, vertical.valid_country
FROM vertical
JOIN tb2 USING(user_id,valid_country)
ORDER BY vertical.user_id,vertical.i
;
-- out user_id | valid_country
-- out ---------+---------------
-- out 1 | IN
-- out 1 | AU
-- out 2 | MX

Related

Count for a list of items with zero for does not exist

If I have a table t1 with:
my_col
------
foo
foo
bar
And I have a list with foo and hello
How can I get:
my_col | count
-------|-------
foo | 2
hello | 0
If I just do
SELECT my_col, COUNT(*)
FROM t1
WHERE my_col in ('foo', 'hello')
GROUP BY my_col
I get
my_col | count
-------|------
foo | 2
without any value for hello.
I'm specifically wanting this to be in reference to a list of items because this will be called in a program where the list is a variable.
Ideally you should maintain a separate table with all the possible column values which you want to appear in your report. In the absence of that, we can try using a CTE here:
WITH cte AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'bar' UNION ALL
SELECT 'hello'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM cte a
LEFT JOIN t1 b
ON a.my_col = b.my_col
WHERE
a.my_col IN ('foo', 'hello')
GROUP BY
a.my_col;
Demo
Here's yet another way, using values:
select
t2.my_col, count (t1.my_col)
from
(values ('foo'), ('hello')) as t2 (my_col)
left join t1 on t1.my_col = t2.my_col
group by
t2.my_col
Note that count (t1.my_col) returns a 0 for "hello" since nulls are not counted. count (*) by contast would have returned 1 for "hello" because it was counting the row.
You can turn your list into a set of rows and use a LEFT JOIN, like :
SELECT x.val, COUNT(t.my_col)
FROM
(SELECT 'foo' val UNION SELECT 'hello') x
LEFT JOIN t ON t.my_col = x.val
GROUP BY x.val
Postgres solution:
One way is to place the 'list' into an ARRAY, and then convert the ARRAY into a column using unnest. Then perform a left join on that column with the other table and perform a count.
WITH t1 AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'foo' UNION ALL
SELECT 'bar'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM unnest(ARRAY['foo', 'hello']) a (my_col)
LEFT JOIN t1 b
ON a.my_col = b.my_col
GROUP BY
a.my_col;
The issue I had with the other answers is that (while they they helped me get to the solution) they did not provide a solution where the items of interest were in a single list (which isn't an actual sql term, so the fault is on me).
However, my real use case is to perform a native query using java and hibernate, and unfortunately the above does not work because the typing cannot be determined. Instead I converted my list into a single string and used string_to_array in place of the ARRAY function.
So the solution that worked best for my use case is below (but at this point, the other answers would be just as correct since I'm now having to do manual string manipulation, but I'm leaving this here for the sake of posterity)
WITH t1 AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'foo' UNION ALL
SELECT 'bar'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM unnest(string_to_array('foo, hello', ',')) a (my_col)
LEFT JOIN t1 b
ON a.my_col = b.my_col
GROUP BY
a.my_col;

SQL - Return a default value when my search returns no results along with search criteria

I am searching with a query
--Code Format
SELECT COLA,COLB,COLC from MYTABLE where SWITCH IN (1,2,3);
If MYTABLE does not contain rows with SWITCH 1,2 or 3 I need default values returned along with the SWITCH value. How do I do it?
Below is my table format
COLA | COLB | COLC | SWITCH
------------------------------
A B C 1
a b c 2
i want a query when I search with
select * from MYTABLE where switch in (1,2,3)
That gets results like this --
COLA | COLB | COLC | SWITCH
------------------------------
A B C 1
a b c 2
NA NA NA 3
--Check to see if any row exists matching your conditions
IF NOT EXISTS (SELECT COLA,COLB,COLC from MYTABLE where SWITCH IN (1,2,3))
BEGIN
--Select your default values
END
ELSE
BEGIN
--Found rows, return them
SELECT COLA,COLB,COLC from MYTABLE where SWITCH IN (1,2,3)
END
if not exists( SELECT 1 from MYTABLE where SWITCH IN (1,2,3))
select default_value
How about:
SELECT COLA,COLB,COLC from MYTABLE where SWITCH IN (1,2,3)
union select 5555, 6666, 7777 where not exists (
SELECT COLA,COLB,COLC from MYTABLE where SWITCH IN (1,2,3)
);
5555, 6666, 7777 being the default row in case there aren't any rows matching your criteria.
Here is one way to tackle this. You need a table of the SWITCH values you want to look at. Then a simple left join makes this super easy.
select ColA
, ColB
, ColC
v.Switch
from
(
values
(1)
, (2)
, (3)
)v (Switch)
left join YourTable yt on yt.Switch = v.Switch
You can Use a Split Function And Left Join As Shown Below:
Select ISNULL(ColA,'NA') As ColA,ISNULL(ColB,'NA') As ColB,ISNULL(ColC,'NA') As ColC,ISNULL(Switch,a.splitdata)
from [dbo].[fnSplitString]('1,2,3',',') a
LEFT JOIN #MYTABLE t on a.splitdata=t.Switch
[dbo].[fnSplitString] is a Split Function with 2 arguments - Delimeter Separated String and Delimeter and Output a Table.
EDIT:
Given the new explanation, I changed the answer completely. I think I got your question now:
SELECT * FROM MYTABLE AS mt
RIGHT JOIN (SELECT 1 AS s UNION SELECT 2 AS s UNION SELECT 3 AS s) AS st
ON st.s = mt.SWITCH
You could change the SELECT 1 AS s UNION SELECT 2 AS s UNION SELECT 3 AS spart to a subquery that results in all possible values SWITCH could assume. E.g.:
SELECT DISTINCT SWITCH FROM another_table_with_all_switches
If all want is the value of switch that is not in MYTABLE, not the whole table with null values, you could try:
SELECT * FROM
(SELECT 1 AS s UNION SELECT 2 AS s UNION SELECT 3) AS st
WHERE st.s NOT IN (SELECT DISTINCT SWITCH FROM MYTABLE)

SQL: Return a count of 0 with count(*)

I am using WinSQL to run a query on a table to count the number of occurrences of literal strings. When trying to do a count on a specific set of strings, I still want to see if some values return a count of 0. For example:
select letter, count(*)
from table
where letter in ('A', 'B', 'C')
group by letter
Let's say we know that 'A' occurs 3 times, 'B' occurs 0 times, and 'C' occurs 5 times. I expect to have a table returned as such:
letter count
A 3
B 0
C 5
However, the table never returns a row with a 0 count, which results like so:
letter count
A 3
C 5
I've looked around and saw some articles mentioning the use of joins, but I've had no luck in correctly returning a table that looks like the first example.
You can create an in-line table containing all letters that you look for, then LEFT JOIN your table to it:
select t1.col, count(t2.letter)
from (
select 'A' AS col union all select 'B' union all select 'C'
) as t1
left join table as t2 on t1.col = t2.letter
group by t1.col
on many platforms you can now use the values statement instead of union all to create your "in line" table - like this
select t.letter, count(mytable.letter)
from ( values ('A'),('B'),('C') ) as t(letter)
left join mytable on t.letter = mytable.letter
group by t.letter
I'm not that familiar with WinSQL, but it's not pretty if you don't have the values that you want in the left most column in a table somewhere. If you did, you could use a left join and a conditional. Without it, you can do something like this:
SELECT all_letters.letter, IFNULL(letter_count.letter_count, 0)
FROM
(
SELECT 'A' AS letter
UNION
SELECT 'B' AS letter
UNION
SELECT 'C' AS letter
) all_letters
LEFT JOIN
(SELECT letter, count(*) AS letter_count
FROM table
WHERE letter IN ('A', 'B', 'C')
GROUP BY letter) letter_count
ON all_letters.letter = letter_count.letter

How to check existence of data in a table from a where clause in sql server 2008?

Suppose I have a table with columns user_id, name and the table contains data like this:
user_id name
------- -----
sou souhardya
cha chanchal
swa swapan
ari arindam
ran ranadeep
If I want to know these users (sou, cha, ana, agn, swa) exists in this table or not then I want output like this:
user_id it exists or not
------- -----------------
sou y
cha y
ana n
agn n
swa y
As ana and aga do not exist in the table it must show "n" (like the above output).
Assuming your existing checklist is not on the database, you will have to assemble a query containing those. There are many ways of doing it. Using CTEs, it would look like this:
with cte as
(
select 'sou' user_id
union all
select 'cha'
union all
select 'ana'
union all
select 'agn'
union all
select 'swa'
)
select
cte.user_id,
case when yt.user_id is null then 'n' else 'y' end
from cte
left join YourTable yt on cte.user_id = yt.user_id
This also assumes user_id is unique.
Here is the SQLFiddle with the proof of concept: http://sqlfiddle.com/#!3/e023a0/4
Assuming you're just testing this manually:
DECLARE #Users TABLE
(
[user_id] VARCHAR(50)
)
INSERT INTO #Users
SELECT 'sou'
UNION SELECT 'cha'
UNION SELECT 'ana'
UNION SELECT 'agn'
UNION SELECT 'swa'
SELECT a.[user_id]
, [name]
, CASE
WHEN b.[user_id] IS NULL THEN 'N'
ELSE 'Y'
END AS [exists_or_not]
FROM [your_table] a
LEFT JOIN #Users b
ON a.[user_id] = b.[user_id]
You didn't provide quite enough information to provide a working example, but this should get you close:
select tbl1.user_id, case tbl2.user_id is null then 'n' else 'y' end
from tbl1 left outer join tbl2 on tbl1.user_id = tbl2.user_id
;with usersToCheck as
(
select 'sou' as userid
union select 'cha'
union select 'ana'
union select 'agn'
union select 'swa'
)
select utc.userid,
(case when exists ( select * from usersTable as ut where ut.user_id = utc.userid) then 'y' else 'n' end)
from usersToCheck as utc

SELECT DISTINCT for data groups

I have following table:
ID Data
1 A
2 A
2 B
3 A
3 B
4 C
5 D
6 A
6 B
etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:
DataID Data
101 A
102 A
102 B
103 C
104 D
So DataID 102 would resemble data (A,B), DataID 103 would resemble data (C), etc. In order to be able to rewrite my original table in this form:
ID DataID
1 101
2 102
3 102
4 103
5 104
6 102
How can I do that?
PS. Code to generate the first table:
CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, 'A'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'
In my opinion You have to create a custom aggregate that concatenates data (in case of strings CLR approach is recommended for perf reasons).
Then I would group by ID and select distinct from the grouping, adding a row_number()function or add a dense_rank() your choice. Anyway it should look like this
with groupings as (
select concat(data) groups
from Table1
group by ID
)
select groups, rownumber() over () from groupings
The following query using CASE will give you the result shown below.
From there on, getting the distinct datagroups and proceeding further should not really be a problem.
SELECT
id,
MAX(CASE data WHEN 'A' THEN data ELSE '' END) +
MAX(CASE data WHEN 'B' THEN data ELSE '' END) +
MAX(CASE data WHEN 'C' THEN data ELSE '' END) +
MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM t1
GROUP BY id
ID DataGroups
1 A
2 AB
3 AB
4 C
5 D
6 AB
However, this kind of logic will only work in case you the "Data" values are both fixed and known before hand.
In your case, you do say that is the case. However, considering that you also say that they are 1000 of them, this will be frankly, a ridiculous looking query for sure :-)
LuckyLuke's suggestion above would, frankly, be the more generic way and probably saner way to go about implementing the solution though in your case.
From your sample data (having added the missing 2,'A' tuple, the following gives the renumbered (and uniqueified) data:
with NonDups as (
select t1.id
from #t1 t1 left join #t1 t2
on t1.id > t2.id and t1.data = t2.data
group by t1.id
having COUNT(t1.data) > COUNT(t2.data)
), DataAddedBack as (
select ID,data
from #t1 where id in (select id from NonDups)
), Renumbered as (
select DENSE_RANK() OVER (ORDER BY id) as ID,Data from DataAddedBack
)
select * from Renumbered
Giving:
1 A
2 A
2 B
3 C
4 D
I think then, it's a matter of relational division to match up rows from this output with the rows in the original table.
Just to share my own dirty solution that I'm using for the moment:
SELECT DISTINCT t1.id, D.data
FROM #t1 t1
CROSS APPLY (
SELECT CAST(Data AS VARCHAR) + ','
FROM #t1 t2
WHERE t2.id = t1.id
ORDER BY Data ASC
FOR XML PATH('') )
D ( Data )
And then going analog to LuckyLuke's solution.