PostgreSQL, splitting to rows and recognizing the first part - sql

I have a table in PostgreSQL:
CREATE TABLE t1 (
id SERIAL,
values TEXT);
INSERT INTO t1 (values)
VALUES ('T815,T847'), ('F00,B4R,B4Z'), ('AS5,XX3'), ('G00');
like:
id|values
--------------
1 |T815,T847
2 |F00,B4R,B4Z
3 |AS5,XX3
4 |G00
I need to split the values in the values column onto separate rows and mark the first value of each list, like this:
id|first|value
--------------
1 | yes | T815
1 | | T847
2 | yes | F00
2 | | B4R
2 | | B4Z
3 | yes | AS5
3 | | XX3
4 | yes | G00
I can produce id and value with:
SELECT t1.id,
regexp_split_to_table(t1.values, E',')
FROM t1;
but how would I recognize and tag the first values?

There may be a better way, but you can use a brute force method:
SELECT t1.id, value,
(case when values like value || '%' then 'yes' end) as first
FROM (SELECT t1.id, t1.values,
regexp_split_to_table(t1.values, E',') as value
FROM t1
) t1;
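
The LIKE test above tags a row by its text, not by its position, so a later element that happens to be a prefix of the whole list (a repeated first value, for example) would be tagged as well. As a minimal sketch of the position-based logic, assuming the rows have been fetched client-side as (id, values) tuples and using a hypothetical helper named split_and_tag:

```python
def split_and_tag(rows):
    """Yield (id, first, value) for every comma-separated element.

    `first` is "yes" only for the element at position 0 of its list,
    and None otherwise. The helper name is illustrative, not from the post.
    """
    for row_id, values in rows:
        for position, value in enumerate(values.split(",")):
            yield row_id, ("yes" if position == 0 else None), value


# Sample data copied from the question.
rows = [(1, "T815,T847"), (2, "F00,B4R,B4Z"), (3, "AS5,XX3"), (4, "G00")]
tagged = list(split_and_tag(rows))

for row in tagged:
    print(row)
```

In PostgreSQL 9.4 and later the same position should be available server-side through WITH ORDINALITY on a set-returning function in the FROM clause, which avoids the text comparison altogether.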

Related

Select IDs from multiple rows where column values satisfy one condition but not another

Hello I have the following problem.
I have a table like the one in this sql fiddle
This table defines a relationship and it contains IDs from two other tables
example values
| FirstID | SecondID |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 2 | 4 |
| 2 | 5 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
I want to select all the FirstIDs that satisfy the following criteria.
Their corresponding SecondIDs are in the range 1-3 AND NOT in the range 4-5
For example in this case we would want FirstIDs 1 and 3.
I have tried the following queries
SELECT FirstID from table
WHERE SecondID IN (1,2,3) AND SecondID NOT IN (4,5)
SELECT FirstID,SecondID
FROM(
SELECT FirstID, SecondID
FROM table
WHERE SecondID in (1,2,3,4,5) )
WHERE SecondID NOT IN (4,5)
but I don't get the correct results I am aiming for.
What is the correct query to get the data I want?
SELECT FirstID
FROM test
WHERE SecondId in (1,2,3) --Included values
AND FirstID NOT IN (SELECT FirstID FROM test
WHERE SecondId IN (4,5)) --Excluded values
How about min() and max():
select firstid
from t
group by firstid
having min(secondId) between 1 and 3 and
max(secondid) between 1 and 3;
Assuming 1 is the minimum, then this can be simplified to:
having max(secondid) <= 3;
For arbitrary ranges, you can use sum(case):
having sum(case when secondId between 1 and 3 then 1 else 0 end) > 0 and
sum(case when secondId between 4 and 5 then 1 else 0 end) = 0;
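
To check the sum(case) variant against the sample data from the question, here is a small sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The table name t follows the answer; using SQLite rather than the asker's database is an assumption made only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (FirstID INTEGER, SecondID INTEGER)")

# Sample rows copied from the question.
pairs = [(1, 1), (1, 2), (1, 3),
         (2, 1), (2, 2), (2, 3), (2, 4), (2, 5),
         (3, 1), (3, 2), (3, 3)]
conn.executemany("INSERT INTO t VALUES (?, ?)", pairs)

query = """
SELECT FirstID
FROM t
GROUP BY FirstID
HAVING SUM(CASE WHEN SecondID BETWEEN 1 AND 3 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN SecondID BETWEEN 4 AND 5 THEN 1 ELSE 0 END) = 0
ORDER BY FirstID
"""
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)  # [1, 3]
```

FirstID 2 is dropped because its second sum counts the rows with SecondID 4 and 5.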
I think Gonzalo Lorieto probably has the best answer to this question already, but depending on the size of your data, SELECT statements in a WHERE clause can get really slow, and the query below might be significantly faster (although it's not clear the speedup is worth the reduced readability):
SELECT inrange.FirstId FROM
t inrange
LEFT OUTER JOIN
(SELECT FirstID FROM t
WHERE SecondID IN (4,5)) outrange
ON inrange.firstID = outrange.firstId
WHERE SecondID IN (1,2,3)
AND outrange.firstId IS NULL
GROUP BY inrange.FirstId
You will want to use a NOT EXISTS clause to exclude the FirstIDs that have an invalid SecondID. Here is an example:
SELECT FirstID from test Has123
WHERE SecondID IN (1,2,3)
AND NOT EXISTS (
SELECT 1 FROM test Not45
WHERE Has123.FirstID = Not45.FirstID
AND Not45.SecondID IN (4,5)
)
GROUP BY FirstID

Postgres check if any of array values is in a list

How can I check if any of an array values in a single row is in my list?
Here is my table. Let's call it ABC
id | page_id | values
------------------+-------------+-------------------------------------------------------------------------
1376092679147519 | xyz | {6004036173148,6003373173651,6003050657850}
1375487155874738 | xyz | {6003301698460,6003232518610}
1497527026945449 | xyz | {6003654559478,6003197656807}
1375388575884596 | xyz | {6003512053894,6003450241842,6003051414416}
1319144441504401 | xyz | {6004001256506,6003514818642,6003400993421}
My aim is to select those rows, where one of the values appears in the given list ('6004036173148', '6003197656807').
SELECT id, page_id, values from ABC WHERE -SOME CLAUSE- IN ('6004036173148', '6003197656807');
id | page_id | values
------------------+-------------+-------------------------------------------------------------------------
1376092679147519 | xyz | {6004036173148,6003373173651,6003050657850}
1497527026945449 | xyz | {6003654559478,6003197656807}
Here is the structure of my PostgreSQL table
Table "public.ABC"
Column | Type | Modifiers
--------------------+--------------------------+------------------------
id | character varying | not null
page_id | character varying | not null
values | character varying[] |
Indexes:
"ABC_pkey" PRIMARY KEY, btree (id)
If I understood correctly, you need this:
select * from t
where "values" && '{6004036173148,6003197656807}'::character varying[]
EDIT
If you need to extract the specific matching values, you can use the unnest function.
Note that if the same array contains more than one value from the searched list, the row will be repeated. Look at the output for id=2 in this example to see what I mean:
with t(id, values) as(
select 1, '{6004036173147,6003373173651,6003050657840}'::character varying[]
union all
select 2, '{6004036173148,6003373173652,6003050657850}'::character varying[]
union all
select 3, '{6004036173149,6003373173653,6003050657860}'::character varying[]
)
select tt.* from
(select t.*, unnest(values) unn from t) tt
inner join (select unnest('{6003373173651,6004036173148,6003373173652}'::character varying[]) v ) lst
on tt.unn = lst.v
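
To make the difference between the two queries concrete, here is a rough Python model of what they compute, using the sample arrays from the example above. It is only an illustration of the semantics (&& as "the two arrays share at least one element", unnest plus join as "one output row per matching element"), not of how PostgreSQL evaluates them.

```python
# Sample arrays and search list copied from the example above.
rows = [
    (1, ["6004036173147", "6003373173651", "6003050657840"]),
    (2, ["6004036173148", "6003373173652", "6003050657850"]),
    (3, ["6004036173149", "6003373173653", "6003050657860"]),
]
searched = {"6003373173651", "6004036173148", "6003373173652"}

# Overlap (like &&): each row appears at most once.
overlapping_ids = [row_id for row_id, values in rows
                   if searched.intersection(values)]

# unnest + join: one entry per matching element, so id 2 appears twice.
matches = [(row_id, value) for row_id, values in rows
           for value in values if value in searched]

print(overlapping_ids)
print(matches)
```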

SQL Select a group when attributes match at least a list of values

Given a table with a (non-distinct) identifier and a value:
| ID | Value |
|----|-------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 2 | C |
| 3 | A |
| 3 | B |
How can you select the grouped identifiers, which have values for a given list? (e.g. ('B', 'C'))
This list might also be the result of another query (like SELECT Value from Table1 WHERE ID = '2' to find all IDs which have a superset of values, compared to ID=2 (only ID=1 in this example))
Result
| ID |
|----|
| 1 |
| 2 |
1 and 2 are part of the result, as they have both B and C in their Value-column. 3 is not included, as it is missing C
Thanks to the answer to this question: SQL Select only rows where exact multiple relationships exist, I created a query which works for a fixed list. However, I need to be able to use the results of another query without changing the query, and my version also requires the Access-specific IIF function:
SELECT ID FROM Table1
GROUP BY ID
HAVING SUM(Value NOT IN ('A', 'B')) = 0
AND SUM(IIF(Value='A', 1, 0)) = 1
AND SUM(IIF(Value='B', 1, 0)) = 1
In case it matters: The SQL is run on a Excel-table via VBA and ADODB.
In the WHERE clause, filter on the list of values you would like to see, group by id, and in the HAVING clause keep only those ids which have 3 matching rows (one per value in the list).
select id from table1
where value in ('A', 'B', 'C') --you can use a result of another query here
group by id
having count(*)=3
If you can have the same id - value pair more than once, then you need to slightly alter the having clause: having count(distinct value)=3
If you want to make it completely dynamic based on a subquery, then:
select id, min(valcount) as minvalcount from table1
cross join (select count(*) as valcount from table1 where id=2) as t1
where value in (select value from table1 where id=2) --you can use a result of another query here
group by id
having count(*)=minvalcount
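
As a sanity check of the subquery-driven idea, here is a small sketch that runs a variant of it against the sample table in an in-memory SQLite database via Python's sqlite3 module. Two assumptions: SQLite stands in for the Excel/ADODB setup from the question, and the required count comes from a scalar subquery in HAVING with count(distinct ...) instead of the cross join above, so treat it as an illustration rather than a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, value TEXT)")

# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (1, "D"),
     (2, "A"), (2, "B"), (2, "C"),
     (3, "A"), (3, "B")],
)

# Keep only rows whose value belongs to id 2, then require that a group
# covers as many distinct values as id 2 has.
query = """
SELECT id
FROM table1
WHERE value IN (SELECT value FROM table1 WHERE id = 2)
GROUP BY id
HAVING COUNT(DISTINCT value) =
       (SELECT COUNT(DISTINCT value) FROM table1 WHERE id = 2)
ORDER BY id
"""
superset_ids = [row[0] for row in conn.execute(query)]
print(superset_ids)  # [1, 2]
```

Id 2 matches itself; add a condition such as id <> 2 to the outer query if only the other ids are wanted.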

Why IN operator return distinct selection when passing duplicate value (value1 , value1 ....)

Using SQL Server 2008
Why does the IN operator return distinct values when selecting duplicate values?
Table #temp
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
2 | Second 1 | second 2 | second 3
When I execute this query
SELECT * FROM #temp WHERE x IN (1,1)
it will return
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
How can I make it so it returns this instead:
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
1 | first 1 | first 2 | first 3
What is the alternative of IN in this case?
If you want to return duplicates, then you need to phrase the query as a join. IN simply tests a condition on each row. Whether the condition is met once or twice doesn't matter: the row either stays in or gets filtered out.
with xes as (
select 1 as x union all
select 1 as x
)
SELECT *
FROM #temp t join
xes
on t.x = xes.x;
EDIT:
If you have a subquery, then it is even simpler:
select *
from #temp t join
(<subquery>) s
on t.x = s.x
This would be a "normal" use of a join.
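
The difference is easy to reproduce. Below is a minimal sketch using an in-memory SQLite database through Python's sqlite3 module; SQLite stands in for SQL Server here, and the table temp_rows with its label column is a simplified, hypothetical version of #temp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_rows (x INTEGER, label TEXT)")
conn.executemany("INSERT INTO temp_rows VALUES (?, ?)",
                 [(1, "first"), (2, "second")])

# IN is a per-row filter: listing 1 twice does not duplicate the row.
in_rows = conn.execute(
    "SELECT x, label FROM temp_rows WHERE x IN (1, 1)"
).fetchall()

# A join produces one output row per matching pair, so the row repeats.
join_rows = conn.execute("""
    SELECT t.x, t.label
    FROM temp_rows AS t
    JOIN (SELECT 1 AS x UNION ALL SELECT 1 AS x) AS xes
      ON t.x = xes.x
""").fetchall()

print(in_rows)    # [(1, 'first')]
print(join_rows)  # [(1, 'first'), (1, 'first')]
```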

How do I get LIKE and COUNT to return the number of rows less than a value not in the row?

For example:
SELECT COUNT(ID) FROM My_Table
WHERE ID <
(SELECT ID FROM My_Table
WHERE ID LIKE '%4'
ORDER BY ID LIMIT 1)
My_Table:
X ID Y
------------------------
| | A1 | |
------------------------
| | B2 | |
------------------------
| | C3 | |
------------------------ -----Page 1
| | D3 | |
------------------------
| | E3 | |
------------------------
| | F5 | |
------------------------ -----Page 2
| | G5 | |
------------------------
| | F6 | |
------------------------
| | G7 | | -----Page 3
There is no data ending in 4 but there still are 5 rows that end in something less than "%4".
However, in this case, where there is no match, SQLite only returns 0
I get it is not there but how do I change this behavior to still return number of rows before it, as if it was there?
Any suggestions?
Thank You.
SELECT COUNT(ID) FROM My_Table
WHERE ID < (SELECT ID FROM My_Table
-- cast the numeric part so it is compared as a number, not as text
WHERE CAST(substr(ID, 2) AS INTEGER) >= 4
ORDER BY ID LIMIT 1)
Assuming there is always one letter before the number part of the id field, you may want to try the following:
SELECT COUNT(*) FROM my_table WHERE CAST(substr(id, 2) as int) <= 4;
Test case:
CREATE TABLE my_table (id char(2));
INSERT INTO my_table VALUES ('A1');
INSERT INTO my_table VALUES ('B2');
INSERT INTO my_table VALUES ('C3');
INSERT INTO my_table VALUES ('D3');
INSERT INTO my_table VALUES ('E3');
INSERT INTO my_table VALUES ('F5');
INSERT INTO my_table VALUES ('G5');
INSERT INTO my_table VALUES ('F6');
INSERT INTO my_table VALUES ('G7');
Result:
5
UPDATE: Further to the comment below, you may want to consider using the ltrim() function:
The ltrim(X,Y) function returns a string formed by removing any and all characters that appear in Y from the left side of X.
Example:
SELECT COUNT(*)
FROM my_table
WHERE CAST(ltrim(id, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ') as int) <= 4;
Test case (adding to the above):
INSERT INTO my_table VALUES ('ABC1');
INSERT INTO my_table VALUES ('ZWY2');
New Result:
7
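
For anyone who wants to see both behaviours side by side, here is a short sketch that runs the original LIKE-based query and the CAST-based one against the test data, using Python's built-in sqlite3 module with an in-memory database. It assumes, as above, that every id is a single letter followed by a number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id TEXT)")
ids = ["A1", "B2", "C3", "D3", "E3", "F5", "G5", "F6", "G7"]
conn.executemany("INSERT INTO my_table VALUES (?)", [(i,) for i in ids])

# Original approach: no id ends in 4, so the subquery yields NULL,
# the comparison is never true, and the count is 0.
like_count = conn.execute("""
    SELECT COUNT(id) FROM my_table
    WHERE id < (SELECT id FROM my_table
                WHERE id LIKE '%4'
                ORDER BY id LIMIT 1)
""").fetchone()[0]

# CAST approach: compare the numeric part directly against 4.
cast_count = conn.execute("""
    SELECT COUNT(*) FROM my_table
    WHERE CAST(substr(id, 2) AS INTEGER) <= 4
""").fetchone()[0]

print(like_count, cast_count)  # 0 5
```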
In MySQL that would be:
SELECT COUNT(ID)
FROM My_Table
WHERE ID <
(
SELECT id
FROM (
SELECT ID
FROM My_Table
WHERE ID LIKE '%4'
ORDER BY
ID
LIMIT 1
) q
UNION ALL
SELECT MAX(id)
FROM My_Table
LIMIT 1
)