How to combine two queries where one of them results in an array and the second is the element place in the array? - sql

I have the following two queries:
Query #1
(SELECT ARRAY (SELECT (journeys.id)
FROM JOURNEYS
JOIN RESPONSES ON scenarios[1] = responses.id) AS arry);
This one returns an array.
Query #2:
SELECT (journeys_index.j_index)
FROM journeys_index
WHERE environment = 'env1'
AND for_channel = 'ch1'
AND first_name = 'name1';
This second query returns the element index in the former array.
How do I combine the two to get only the element value?

I recreated a simpler example with a table containing an array column (the result of your first query)
create table my_array_test (id int, tst_array varchar[]);
insert into my_array_test values (1,'{cat, mouse, frog}');
insert into my_array_test values (2,'{horse, crocodile, rabbit}');
And another table containing the element position for each row I want to extract.
create table my_array_pos_test (id int, pos int);
insert into my_array_pos_test values (1,1);
insert into my_array_pos_test values (2,3);
For example, from the row in my_array_test with id=1 I want to extract the 1st item (pos=1), and from the row with id=2 I want to extract the 3rd item (pos=3).
defaultdb=> select * from my_array_pos_test;
id | pos
----+-----
1 | 1
2 | 3
(2 rows)
Now the resulting statement is
select *,
tst_array[my_array_pos_test.pos]
from
my_array_test join
my_array_pos_test on my_array_test.id = my_array_pos_test.id
with the expected result
id | tst_array | id | pos | tst_array
----+--------------------------+----+-----+-----------
1 | {cat,mouse,frog} | 1 | 1 | cat
2 | {horse,crocodile,rabbit} | 2 | 3 | rabbit
(2 rows)
Now, in your case I would probably do something similar to the below, assuming your 1st select statement returns one row only.
with array_sel as
(SELECT ARRAY (SELECT (journeys.id)
FROM JOURNEYS
JOIN RESPONSES ON scenarios[1] = responses.id) AS arry)
SELECT arry[journeys_index.j_index]
FROM journeys_index cross join array_sel
WHERE environment = 'env1'
AND for_channel = 'ch1'
AND first_name = 'name1';
I can't fully validate the above SQL statement since I can't replicate your tables, but it should give you a hint on where to start.

Related

Need to join 2 tables using substring of different lengths on one column with length stated in second table

I need to join 2 tables of which I need to substring a column in Table 1. What is unknown is the length of the substring in order to join to Table 2. The first few numbers are the joining key with differing lengths. Table 2 does state the length and will be the indicator on which entries need to be substringed with a specific length. The second table has fixed length of 9 so will also need to be substringed (which will be easy to do). Table 1 is my problem. The length column in Table 2 tells you how much of the ShortRef to use as well as how much to substring RefNr in Table1 which then becomes the join. However, I am not sure how to do this in SSMS or whether it is possible.
Since table 2 informs how much to substring, I currently don't see a solution and I don't know if like will work or how to do this using like.
Example:
TABLE 1
|RefNr |
|----------------|
|1234567890101234|
|9876543210090876|
|1234569000100223|
TABLE 2
|ShortRef | Length | Name |
|---------|--------|------|
|123456789|8 |Alice |
|123456909|8 |Cindy |
|987654999|6 |Ben |
RESULTS SHOULD BE:
|RefNr | Substr Table1&2 based on Length in Table2 | Name |
|----------------|-------------------------------------------|------|
|1234567890101234| 12345678 |Alice |
|9876543210090876| 987654 |Ben |
|1234569000100223| 12345690 |Cindy |
I'm not sure this is exactly what you want, but it produces the correct output for the example tables:
SELECT RefNr
,tb2."Substring Table1 based on length in Table2"
,tb2.Name
FROM Table1
INNER JOIN
(SELECT SUBSTRING(ShortRef, 0, Length+1) as "Substring Table1 based on length in Table2",
Name,
Length
FROM Table2) as tb2
ON SUBSTRING(RefNr, 0, tb2.Length + 1) = tb2."Substring Table1 based on length in Table2"
It looks like you want to join the tables together with string operations:
select t1.*, left(t2.ShortRef, t2.Length), t2.Name
from table1 t1 join
table2 t2
on left(t2.ShortRef, t2.Length) = left(t1.RefNr, t2.Length);
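The prefix join can be checked against the question's sample data without a SQL Server instance. The following is a minimal sketch using Python's built-in sqlite3 module, assuming both columns are stored as text; SQLite has no LEFT() function, so substr(x, 1, n) stands in for it, and the column alias Prefix is chosen here for illustration only.

```python
import sqlite3

# In-memory database loaded with the example data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (RefNr TEXT);
    CREATE TABLE Table2 (ShortRef TEXT, Length INTEGER, Name TEXT);
    INSERT INTO Table1 VALUES
        ('1234567890101234'), ('9876543210090876'), ('1234569000100223');
    INSERT INTO Table2 VALUES
        ('123456789', 8, 'Alice'), ('123456909', 8, 'Cindy'), ('987654999', 6, 'Ben');
""")

# substr(x, 1, n) is used as SQLite's equivalent of LEFT(x, n).
# Both sides of the join are cut to the length stored in Table2.
prefix_rows = conn.execute("""
    SELECT t1.RefNr,
           substr(t2.ShortRef, 1, t2.Length) AS Prefix,
           t2.Name
    FROM Table1 AS t1
    JOIN Table2 AS t2
      ON substr(t1.RefNr, 1, t2.Length) = substr(t2.ShortRef, 1, t2.Length)
    ORDER BY t2.Name
""").fetchall()

for row in prefix_rows:
    print(row)
```

Each RefNr pairs with exactly one name for this sample. With real data, a RefNr could match more than one row of Table2 if the prefixes overlap, so it may be worth checking for duplicates.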

redshift regex get multiple matches and expand rows

I'm working on the URL extraction on AWS Redshift. The URL column looks like this:
url item origin
http://B123//ajdsb apple US
http://BYHG//B123 banana UK
http://B325//BF89//BY85 candy CA
The result I want to get is to get the series that starts with B and also expand rows if there are multiple series in a URL.
extracted item origin
B123 apple US
BYHG banana UK
B123 banana UK
B325 candy CA
BF89 candy CA
BY85 candy CA
My current code is:
select REGEXP_SUBSTR(url, '(B[0-9A-Z]{3})') as extracted, item, origin
from data
The regex part works well but I have problems with extracting multiple values and expand them to new rows. I tried to use REGEXP_MATCHES(url, '(B[0-9A-Z]{3})', 'g') but function regexp_matches does not exist on Redshift...
The solution I use is fairly ugly but achieves the desired results. It involves using REGEXP_COUNT to determine the maximum number of matches in a row then joining the resulting table of numbers to a query using REGEXP_SUBSTR.
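-- Get a table with the distinct match counts found in the data
-- e.g. if the rows have 1, 2 and 3 matches this query will return 1, 2, 3
-- (every value from 1 up to the maximum has to occur in some row for all matches to be extracted)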
WITH n_table AS (
SELECT
DISTINCT REGEXP_COUNT(url, '(B[0-9A-Z]{3})') AS n
FROM data
)
-- Join the previous table to the data table and use n in the REGEXP_SUBSTR call to get the nth match
SELECT
REGEXP_SUBSTR(url, '(B[0-9A-Z]{3})', 1, n) AS extracted,
item,
origin
FROM data,
n_table
-- Only keep non-null matches
WHERE n > 0
AND REGEXP_COUNT(url, '(B[0-9A-Z]{3})') >= n
IronFarm's answer inspired me, though I wanted to find a solution that didn't require a cross join. Here's what I came up with:
with
-- raw data
src as (
select
1 as id,
'abc def ghi' as stuff
union all
select
2 as id,
'qwe rty' as stuff
),
-- for each id, get a series of indexes for
-- each match in the string
match_idxs as (
select
id,
generate_series(1, regexp_count(stuff, '[a-z]{3}')) as idx
from
src
)
select
src.id,
match_idxs.idx,
regexp_substr(src.stuff, '[a-z]{3}', 1, match_idxs.idx) as stuff_match
from
src
join match_idxs using (id)
order by
id, idx
;
This yields:
id | idx | stuff_match
----+-----+-------------
1 | 1 | abc
1 | 2 | def
1 | 3 | ghi
2 | 1 | qwe
2 | 2 | rty
(5 rows)
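To cross-check the expected output for the URL data outside the database, the same expansion can be sketched in Python. This is only an illustration of the intended result, not a Redshift solution: it assumes that Python's re module and Redshift's regex functions agree on this simple pattern, and the names data and expanded are made up for the example.

```python
import re

# Sample rows from the question: (url, item, origin).
data = [
    ("http://B123//ajdsb", "apple", "US"),
    ("http://BYHG//B123", "banana", "UK"),
    ("http://B325//BF89//BY85", "candy", "CA"),
]

# Same pattern as in the SQL: a 'B' followed by three digits or capitals.
pattern = re.compile(r"B[0-9A-Z]{3}")

# One output row per match, analogous to calling
# REGEXP_SUBSTR(url, pattern, 1, n) for n = 1 .. REGEXP_COUNT(url, pattern).
expanded = [
    (match, item, origin)
    for url, item, origin in data
    for match in pattern.findall(url)
]

for row in expanded:
    print(row)
```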

SQL query for two values of one row based off same table column

I have two columns of one row of a report that I would like to be based off the same one column in a SQL table.
For example, in the report it should be something like:
ID | Reason | SubReason
1 | Did not like | Appearance
In the SQL table it is something like:
ID | ReturnReason
1 | Did not like
1 | XX*SR*Appearance
1 | XX - TestData
1 | XX - TestData2
The SubReason column is being newly added and the current SQL query is something like:
SELECT ID, ReturnReason AS 'Reason'
FROM table
WHERE LEFT(ReturnReason,2) NOT IN ('XX')
And now I'd like to add a column in the SELECT statement for SubReason, which should be the value if *SR* is in the value. This however won't work because it also has 'XX' in the value, which is omitted by the current WHERE clause.
SELECT t.ID, t.ReturnReason AS 'Reason',
SUBSTRING(t1.ReturnReason,7,10000) as 'SubReason'
FROM t
LEFT JOIN t as t1 on t.id=t1.id and t1.ReturnReason LIKE 'XX*SR*%'
WHERE t.ReturnReason NOT LIKE 'XX%'
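The self-join can be tried on the question's sample rows with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name return_reasons is hypothetical, and SQLite's substr(x, 7) replaces SUBSTRING(x, 7, 10000) to take everything from the 7th character onward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'return_reasons' is a hypothetical table name for the example.
    CREATE TABLE return_reasons (ID INTEGER, ReturnReason TEXT);
    INSERT INTO return_reasons VALUES
        (1, 'Did not like'),
        (1, 'XX*SR*Appearance'),
        (1, 'XX - TestData'),
        (1, 'XX - TestData2');
""")

# The outer rows are the non-XX reasons; the LEFT JOIN attaches the
# 'XX*SR*...' row with the same ID and strips its 6-character prefix.
report_rows = conn.execute("""
    SELECT t.ID,
           t.ReturnReason AS Reason,
           substr(t1.ReturnReason, 7) AS SubReason
    FROM return_reasons AS t
    LEFT JOIN return_reasons AS t1
      ON t.ID = t1.ID
     AND t1.ReturnReason LIKE 'XX*SR*%'
    WHERE t.ReturnReason NOT LIKE 'XX%'
""").fetchall()

print(report_rows)
```

Because it is a LEFT JOIN, an ID without any `XX*SR*` row still appears in the report, with a NULL SubReason.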

how to select one tuple in rows based on variable field value

I'm quite new to SQL and I'd like to write a SELECT statement that retrieves only the first row of each set, based on a column value. I'll try to make it clearer with a table example.
Here is my table data :
chip_id | sample_id
-------------------
1 | 45
1 | 55
1 | 5986
2 | 453
2 | 12
3 | 4567
3 | 9
I'd like to have a SELECT statement that fetches the first row for each chip_id (1, 2, 3)
Like this :
chip_id | sample_id
-------------------
1 | 45 or 55 or whatever
2 | 12 or 453 ...
3 | 9 or ...
How can I do this?
Thanks
I'd probably:
set a variable = 0
order your table by chip_id
read the table row by row
if table[row] > variable, store table[row] in a result array and increment variable
loop until done
return your result array
Though depending on your DB, query, and versions, you'll probably get unpredictable/unreliable results.
You can get one value using row_number():
select chip_id, sample_id
from (select chip_id, sample_id,
row_number() over (partition by chip_id order by rand()) as seqnum
from your_table
) t
where seqnum = 1
This returns a random value. In SQL, tables are inherently unordered, so there is no concept of "first". You need an auto incrementing id or creation date or some way of defining "first" to get the "first".
If you have such a column, then replace rand() with the column.
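As a sketch of the deterministic variant, the example below orders by sample_id instead of rand(), so "first" means the smallest sample_id per chip. It runs the query through Python's built-in sqlite3 module, assuming a bundled SQLite of version 3.25 or later (needed for window functions); the table name your_table is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'your_table' is a placeholder name; data taken from the question.
    CREATE TABLE your_table (chip_id INTEGER, sample_id INTEGER);
    INSERT INTO your_table VALUES
        (1, 45), (1, 55), (1, 5986),
        (2, 453), (2, 12),
        (3, 4567), (3, 9);
""")

# Number the rows within each chip_id, then keep only row number 1.
first_rows = conn.execute("""
    SELECT chip_id, sample_id
    FROM (
        SELECT chip_id, sample_id,
               row_number() OVER (PARTITION BY chip_id ORDER BY sample_id) AS seqnum
        FROM your_table
    ) AS t
    WHERE seqnum = 1
    ORDER BY chip_id
""").fetchall()

print(first_rows)
```

Ordering by an insertion timestamp or an auto-incrementing id, if the table has one, would give the first inserted row instead.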
Provided I understood your desired output, if you are using PostgreSQL 9, you can use this:
SELECT chip_id ,
string_agg(sample_id::text, ' or ')
FROM your_table
GROUP BY chip_id
You need to group your data with a GROUP BY query.
When you group, generally you want the max, the min, or some other values to represent your group. You can do sums, count, all kind of group operations.
For your example, you don't seem to want a specific group operation, so the query could be as simple as this one :
SELECT chip_id, MAX(sample_id)
FROM table
GROUP BY chip_id
This way you are retrieving the maximum sample_id for each of the chip_id.

How do I use a correlated sub query for a new column in my view?

I am trying to write a view that has 3 columns: Planet, Moon, and Largest.
The view is meant to show planets, their moons, and a Yes or No column indicating whether or not it is the largest moon for the planet.
Only one Basetable is used, and the columns I am referencing are moonPlanetOrbit (only not null if bodyType is = to 'Moon'), bodyName (name of the moon), and largest ('yes' or 'no').
Here is my attempt so far:
CREATE VIEW Moons (Planet, Moon, Largest)
select moonPlanetOrbited, bodyName, ('Yes' if bodyName = (SELECT top 1 moonMeanRadius from Body where moonPlanetOrbited = bodyName order by moonMeanRadius) as Largest)
I can provide any more information if needed.
Thanks,
Cody
SQL works best with sets of data. My advice is to get the set of largest moons using a SELECT statement and the MAX() function, and then join the result set with the whole table. Then test whether the moon is equal to the largest in order to print 'yes' or 'no'.
Here's an example using MySQL. I created a table Moons containing the columns moonPlanetOrbited, bodyName, moonMeanRadius. The following SQL selects the largest moonMeanRadius for a given moonPlanetOrbited:
SELECT moonPlanetOrbited, MAX(moonMeanRadius) as maxMoonRadius
FROM Moons
GROUP BY moonPlanetOrbited
Now that we have a list of maxMoonRadius, join the result set with the entire table and test if the moonMeanRadius is equal to the maxMoonRadius:
SELECT m1.moonPlanetOrbited, m1.bodyName,
if(m1.moonMeanRadius = m2.maxMoonRadius, 'Yes', 'No') as Largest
FROM Moons m1
JOIN (
SELECT moonPlanetOrbited, MAX(moonMeanRadius) as maxMoonRadius
FROM Moons
GROUP BY moonPlanetOrbited
) m2
ON m1.moonPlanetOrbited = m2.moonPlanetOrbited;
The IF syntax is from MySQL 5.5:
http://dev.mysql.com/doc/refman/5.5/en/control-flow-functions.html#function_if
Tested using the following SQL :
CREATE TABLE Moons(
moonPlanetOrbited VARCHAR(255),
bodyName VARCHAR(255),
moonMeanRadius FLOAT
);
INSERT INTO Moons VALUES ('a', 'b', 1.01);
INSERT INTO Moons VALUES ('a', 'c', 1.02);
INSERT INTO Moons VALUES ('a', 'd', 1.03);
INSERT INTO Moons VALUES ('a', 'e', 1.04);
+-------------------+----------+---------+
| moonPlanetOrbited | bodyName | Largest |
+-------------------+----------+---------+
| a | b | No |
| a | c | No |
| a | d | No |
| a | e | Yes |
+-------------------+----------+---------+
4 rows in set (0.00 sec)
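The join-against-MAX() approach is not tied to MySQL. A portable sketch using Python's built-in sqlite3 module is shown below, with the standard CASE expression in place of MySQL's IF() and the same four test rows; it takes bodyName from the full table (m1), since the grouped subquery only carries the planet and its maximum radius.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Moons (
        moonPlanetOrbited TEXT,
        bodyName TEXT,
        moonMeanRadius REAL
    );
    INSERT INTO Moons VALUES
        ('a', 'b', 1.01), ('a', 'c', 1.02), ('a', 'd', 1.03), ('a', 'e', 1.04);
""")

# m2 holds one row per planet with its largest radius; each moon in m1
# is then compared against the maximum for its own planet.
largest_rows = conn.execute("""
    SELECT m1.moonPlanetOrbited,
           m1.bodyName,
           CASE WHEN m1.moonMeanRadius = m2.maxMoonRadius
                THEN 'Yes' ELSE 'No' END AS Largest
    FROM Moons AS m1
    JOIN (
        SELECT moonPlanetOrbited, MAX(moonMeanRadius) AS maxMoonRadius
        FROM Moons
        GROUP BY moonPlanetOrbited
    ) AS m2
      ON m1.moonPlanetOrbited = m2.moonPlanetOrbited
    ORDER BY m1.bodyName
""").fetchall()

for row in largest_rows:
    print(row)
```

If two moons of a planet share the same maximum radius, both are flagged 'Yes' with this approach.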
Here is my MS-SQL Syntax stab at it:
SELECT
B.moonPlanetOrbited
, B.bodyName
, CASE
WHEN B.bodyName =
(SELECT TOP 1
iB.bodyName
FROM
Body AS iB
WHERE
iB.moonPlanetOrbited = B.moonPlanetOrbited
ORDER BY
iB.moonMeanRadius DESC
)
THEN 'Yes'
ELSE 'No'
END AS [Largest]
FROM
Body AS B
If the table uses IDs as a primary key it may be better to compare the IDs instead of the names.
Here is an attempt (untested) that resembles your approach as closely as possible, since your idea wasn't that far off:
Select
M.moonPlanetOrbited,
M.bodyName,
CASE
WHEN M.bodyName =
(SELECT top 1 bodyName from Body
where moonPlanetOrbited = M.moonPlanetOrbited
order by moonMeanRadius DESC)
THEN 'Y'
ELSE 'N'
END AS Largest
FROM Body AS M
You just needed a table prefix to actually do the correlating to the root table, and also to make sure that you were comparing apples to apples in your CASE statement.
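The correlated-subquery idea can be exercised with Python's built-in sqlite3 module. This is a sketch under assumptions: the rows below are made-up test data (planets P1 and P2 with moons m1 to m3), the Body table is reduced to the columns mentioned in the question, and SQLite's LIMIT 1 stands in for SQL Server's TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical test data: two planets, three moons.
    CREATE TABLE Body (
        bodyName TEXT,
        bodyType TEXT,
        moonPlanetOrbited TEXT,
        moonMeanRadius REAL
    );
    INSERT INTO Body VALUES
        ('P1', 'Planet', NULL, NULL),
        ('m1', 'Moon', 'P1', 10.0),
        ('m2', 'Moon', 'P1', 25.0),
        ('P2', 'Planet', NULL, NULL),
        ('m3', 'Moon', 'P2', 5.0);
""")

# For every moon M, the subquery finds the largest moon orbiting the
# same planet; the alias M is what correlates it with the outer row.
moon_rows = conn.execute("""
    SELECT M.moonPlanetOrbited,
           M.bodyName,
           CASE WHEN M.bodyName = (
                    SELECT iB.bodyName
                    FROM Body AS iB
                    WHERE iB.moonPlanetOrbited = M.moonPlanetOrbited
                    ORDER BY iB.moonMeanRadius DESC
                    LIMIT 1)
                THEN 'Yes' ELSE 'No' END AS Largest
    FROM Body AS M
    WHERE M.bodyType = 'Moon'
    ORDER BY M.moonPlanetOrbited, M.bodyName
""").fetchall()

for row in moon_rows:
    print(row)
```

The WHERE M.bodyType = 'Moon' filter keeps planets out of the result; without it, rows with a NULL moonPlanetOrbited would show up as 'No'.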