PostgreSQL differences in value 1 and 11 - sql

I'm querying my DB, which has only one table:
id | value
----------
1 | 1|2|4
2 | 11|23
3 | 1|4|3|11
4 | 2|4|11
5 | 5|6|11
6 | 12|15|16
7 | 3|1|4
8 | 5|2|1
The query was: SELECT * FROM table_name WHERE value LIKE '%1%'
I want to select only the rows that contain the value 1, but I also get the rows that contain 11.
How can I tell these apart in SQL?

If you have to stick with this broken design, it's probably better to use Postgres' ability to parse a string into an array.
This is more robust than using a LIKE condition:
select *
from the_table
where string_to_array(value,'|') @> array['1']
or, maybe a bit easier to read:
select *
from the_table
where '1' = any (string_to_array(value,'|'))
Using the contains operator @> you can also search for more than one value at a time:
select *
from the_table
where string_to_array(value,'|') @> array['1','2']
This will return all rows where value contains both 1 and 2.
SQLFiddle example: http://sqlfiddle.com/#!15/8793d/2
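To see why the array approach behaves differently from LIKE, here is a minimal Python sketch of the same logic, run against the rows from the question. It is only a model of the two conditions, not Postgres itself; the helper names ids_like and ids_containing are made up for illustration.

```python
# Rows from the question: id -> pipe-delimited value string.
rows = {
    1: "1|2|4",
    2: "11|23",
    3: "1|4|3|11",
    4: "2|4|11",
    5: "5|6|11",
    6: "12|15|16",
    7: "3|1|4",
    8: "5|2|1",
}

def ids_like(needle):
    """Substring test, the analogue of value LIKE '%needle%'."""
    return [i for i, v in rows.items() if needle in v]

def ids_containing(*wanted):
    """Element test, the analogue of string_to_array(value, '|') @> array[...]."""
    return [i for i, v in rows.items() if set(wanted) <= set(v.split("|"))]

print(ids_like("1"))             # [1, 2, 3, 4, 5, 6, 7, 8]
print(ids_containing("1"))       # [1, 3, 7, 8]
print(ids_containing("1", "2"))  # [1, 8]
```

The substring test matches every row because 11, 12, 15 and 16 all contain the character 1, while the element test compares whole values only.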

I strongly recommend that you normalize your schema so that every column stores only atomic values.
Without that, you are forced to use some nasty tricks, for example with arrays:
select * from t
where '1' = any (string_to_array(value, '|'))
or, with pattern matching:
select * from t
where '1' similar to value
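The pattern-matching variant works because the stored value is used as the pattern: in SIMILAR TO, | means alternation and the whole string has to match. A rough Python sketch of that idea, assuming the stored values contain only digits and | (the helper similar_to is hypothetical and only approximates SIMILAR TO for such simple patterns):

```python
import re

def similar_to(text, pattern):
    # SIMILAR TO has to match the whole string, so re.fullmatch is the
    # closest analogue. Assumes the pattern uses only plain characters and '|'.
    return re.fullmatch(pattern, text) is not None

print(similar_to("1", "1|2|4"))  # True: '1' is one of the alternatives
print(similar_to("1", "11|23"))  # False: neither '11' nor '23' equals '1'
```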


Big Query String Manipulation using SubQuery

I would appreciate a push in the right direction with how this might be achieved using GCP Big Query, please.
I have a column in my table of type string. Inside this string there is a repeating sequence of characters, and I need to extract and process each occurrence. To illustrate, let's say the column name is 'instruments'. A possible value for instruments could be:
'band=false;inst=basoon,inst=cello;inst=guitar;cases=false,permits=false'
In which case I need to extract 'basoon', 'cello' and 'guitar'.
I'm more or less a SQL newbie, sorry. So far I have:
SELECT
bandId,
REGEXP_EXTRACT(instruments, r'inst=.*?\;') AS INSTS
FROM `inventory.band.mytable`;
This extracts the instruments substring ('inst=basoon,inst=cello;inst=guitar;') and gives me an output column 'INSTS' but now I think I need to split the values in that column on the comma and do some further processing. This is where I'm stuck as I cannot see how to structure additional queries or processing blocks.
How can I reference INSTS in order to do subsequent processing? The documentation suggests I should be building subqueries using WITH, but I can't seem to get anything going. Could some kind soul give me a push in the right direction, please?
BigQuery has a function SPLIT(), but it is not a drop-in replacement for SPLIT_PART() in other databases: it takes a string and a delimiter and returns an array of all the parts. The query below uses SPLIT_PART()-style notation, SPLIT(string, delimiter, i), for the i-th part; in BigQuery you would write that as SPLIT(string, delimiter)[SAFE_OFFSET(i - 1)].
Assuming that you don't alternate between the comma and the semicolon for separating your «key»=«value» pairs, and only use the semicolon, you first split your instruments string into its parts and keep the ones that start with inst=.
To do that, you use an in-line table of consecutive integers to CROSS JOIN with, so that you can take the i-th part of instruments, split on ';', for an increasing integer value of i. You will get strings in the format inst=%, of which you want the part after the equal sign. You get that part by applying another split, this time with the equal sign as the delimiter, and taking the second part:
WITH indata(bandid,instruments) AS (
-- some input, don't use in real query ...
-- I assume that you don't alternate between comma and semicolon for the delimiter, and stick to semicolon
SELECT
1,'band=false;inst=basoon;inst=cello;inst=guitar;cases=false;permits=false'
UNION ALL
SELECT
2,'band=true;inst=drum;inst=cello;inst=bass;inst=flute;cases=false;permits=true'
UNION ALL
SELECT
3,'band=false;inst=12string;inst=banjo;inst=triangle;inst=tuba;cases=false;permits=true'
)
-- real query starts here, replace following comma with "WITH" ...
,
-- need a series of consecutive integers ...
i(i) AS (
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
)
SELECT
bandid
, i
, SPLIT(SPLIT(instruments,';',i),'=',2) AS instrument
FROM indata CROSS JOIN i
WHERE SPLIT(instruments,';',i) like 'inst=%'
ORDER BY 1
-- out bandid | i | instrument
-- out --------+---+------------
-- out 1 | 2 | basoon
-- out 1 | 3 | cello
-- out 1 | 4 | guitar
-- out 2 | 2 | drum
-- out 2 | 3 | cello
-- out 2 | 4 | bass
-- out 2 | 5 | flute
-- out 3 | 2 | 12string
-- out 3 | 3 | banjo
-- out 3 | 4 | triangle
-- out 3 | 5 | tuba
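The two-level split above can be sketched outside SQL as well. This is a small Python model of the same idea (split on the semicolon, then on the equal sign, keep the inst pairs), assuming the semicolon-only input from the query; the function name instruments is made up for illustration.

```python
def instruments(s):
    """Return the value of every inst=... pair in a semicolon-delimited string."""
    result = []
    for pair in s.split(";"):
        # partition splits at the first '=' only: 'inst=cello' -> ('inst', '=', 'cello')
        key, _, value = pair.partition("=")
        if key == "inst":
            result.append(value)
    return result

sample = "band=false;inst=basoon;inst=cello;inst=guitar;cases=false;permits=false"
print(instruments(sample))  # ['basoon', 'cello', 'guitar']
```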
Consider the few options below (just to demonstrate different techniques here)
Option 1
select bandId,
( select string_agg(split(kv, '=')[offset(1)])
from unnest(split(instruments, ';')) kv
where split(kv, '=')[offset(0)] = 'inst'
) as insts
from `inventory.band.mytable`
Option 2 (for obvious reason this one would be my choice)
select bandId,
array_to_string(regexp_extract_all(instruments, r'inst=([^;$]+)'), ',') instrs
from `inventory.band.mytable`
If applied to sample data in your question - output in both cases is

BigQuery - nested json - select where nested item equals

I have the following table in a BigQuery database:
Row | f0_
1 | {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 | {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
3 | {"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}
4 | {"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}
5 | {"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}
6 | {"configuration":[{"param1":"value1"}]}
f0_ column is a pure string.
Is there a way to write a select query where the "param2" value is equal to the array [3.0, 45], meaning it would only return rows 1 and 2? Ideally this would be done without directly indexing the first element in the "configuration" array as the order might not be guaranteed.
Below is for BigQuery Standard SQL
#standardSQL
SELECT line
FROM `project.dataset.table`
WHERE REGEXP_EXTRACT(JSON_EXTRACT(line, '$.configuration'), r'{"param2":(.*?)}') = '[3.0,45]'
You can test and play with the above using the sample data from your question, as in the example below
#standardSQL
WITH `project.dataset.table` AS (
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}' line UNION ALL
SELECT '{"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"}]}'
)
SELECT line
FROM `project.dataset.table`
WHERE REGEXP_EXTRACT(JSON_EXTRACT(line, '$.configuration'), r'{"param2":(.*?)}') = '[3.0,45]'
with result
Row line
1 {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
On the requirement to avoid directly indexing the "configuration" array because the order might not be guaranteed: this solution does not depend on the position of "param2" in the configuration array.
You can use some of BQ's neat JSON functions as described here.
Based on that, you can locate param2 and check if its value matches what you're looking for. If you aren't sure of the configuration order, you can iterate through the array to find param2, but it's not particularly efficient. I recommend you try to find a way where param2 is always the second field in the array. I was able to get the correct results like so:
SELECT json_text AS correct_configurations
FROM UNNEST([
'{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}',
'{"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}',
'{"configuration":[{"param1":"value1"}]}'
])
AS json_text
WHERE JSON_EXTRACT(json_text, '$.configuration[1].param2') LIKE "[3.0,45]";
Gives a result of:
Row | correct_configurations
1 | {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 | {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
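For comparison, the "iterate through the array to find param2" idea mentioned above looks like this when the rows are parsed as JSON outside the database. This is only a Python sketch of the order-independent check, not a BigQuery query; the helper has_param2 is a made-up name.

```python
import json

rows = [
    '{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}',
    '{"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}',
    '{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}',
    '{"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}',
    '{"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}',
    '{"configuration":[{"param1":"value1"}]}',
]

def has_param2(text, wanted):
    """True if any element of "configuration" has param2 equal to wanted."""
    config = json.loads(text).get("configuration", [])
    # Look at every element, so the position of param2 does not matter.
    return any(item.get("param2") == wanted for item in config)

matches = [r for r in rows if has_param2(r, [3.0, 45])]
print(len(matches))  # 2
```

Because the values are compared as parsed lists rather than as text, whitespace differences such as [3.0, 45] versus [3.0,45] do not matter here, which is not true of the string comparisons in the queries above.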

SQL - Min difference between two integer fields

How can I get the minimum difference between two integer fields (value_0 - value)?
value_0 >= value always
value_0 | value
-------------------
15 | 10
12 | 10
15 | 11
11 | 11
Try this:
SELECT MIN(value_0-value) as MinDiff
FROM TableName
WHERE value_0>=value
With the sample data you have given,
Output is 0. (11-11)
See demo in SQL Fiddle.
Read more about MIN() here.
Here is one way:
select min(value_0 - value)
from table t;
This is pretty basic SQL. If you want to see other values on the same row as the minimum, use order by and choose one row:
select (value_0 - value)
from table t
order by (value_0 - value)
limit 1;
The limit 1 works in some databases for getting one row. Others use top 1 in the select clause. Or fetch first 1 rows only. Or even something else.
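The difference between the two queries (just the minimum versus the whole row that has the minimum) can be seen in a tiny Python sketch over the sample data. The tuple layout (value_0, value) is an assumption made for the example.

```python
# Sample data from the question as (value_0, value) pairs.
rows = [(15, 10), (12, 10), (15, 11), (11, 11)]

# Analogue of SELECT MIN(value_0 - value): only the smallest difference.
min_diff = min(v0 - v for v0, v in rows)

# Analogue of ORDER BY (value_0 - value) LIMIT 1: the row with that difference.
min_row = min(rows, key=lambda r: r[0] - r[1])

print(min_diff)  # 0
print(min_row)   # (11, 11)
```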

how to select one tuple in rows based on variable field value

I'm quite new to SQL and I'd like to write a SELECT statement that retrieves only the first row of each set, based on a column value. I'll try to make it clearer with a table example.
Here is my table data :
chip_id | sample_id
-------------------
1 | 45
1 | 55
1 | 5986
2 | 453
2 | 12
3 | 4567
3 | 9
I'd like to have a SELECT statement that fetches the first line for each chip_id (1, 2, 3)
Like this :
chip_id | sample_id
-------------------
1 | 45 or 55 or whatever
2 | 12 or 453 ...
3 | 9 or ...
How can I do this?
Thanks
I'd probably:
set a variable = 0
order your table by chip_id
read the table in row by row
if table[row] > variable, store table[row] in a result array and increment variable
loop till done
return your result array
Though depending on your DB, query and versions, you'll probably get unpredictable/unreliable results.
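The steps above can be sketched in Python as follows. This is a loose interpretation: instead of incrementing a counter it remembers the last chip_id seen, so it also works when the ids are not consecutive. The function name first_per_chip is made up for illustration.

```python
rows = [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)]

def first_per_chip(rows):
    """Keep the first (chip_id, sample_id) pair seen for each chip_id."""
    result = []
    last_chip = None
    # sorted() is stable, so rows with the same chip_id keep their input order.
    for chip_id, sample_id in sorted(rows, key=lambda r: r[0]):
        if chip_id != last_chip:
            result.append((chip_id, sample_id))
            last_chip = chip_id
    return result

print(first_per_chip(rows))  # [(1, 45), (2, 453), (3, 4567)]
```

Here "first" means first in the Python list. A database table has no such order, which is why the result of the same approach inside a database is unreliable.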
You can get one value using row_number():
select chip_id, sample_id
from (select chip_id, sample_id,
row_number() over (partition by chip_id order by rand()) as seqnum
from your_table
) t
where seqnum = 1
This returns a random value. In SQL, tables are inherently unordered, so there is no concept of "first". You need an auto incrementing id or creation date or some way of defining "first" to get the "first".
If you have such a column, then replace rand() with the column.
Provided I understood your output, if you are using PostgreSQL 9, you can use this:
SELECT chip_id ,
string_agg(sample_id::text, ' or ')
FROM your_table
GROUP BY chip_id
You need to group your data with a GROUP BY query.
When you group, generally you want the max, the min, or some other values to represent your group. You can do sums, count, all kind of group operations.
For your example, you don't seem to want a specific group operation, so the query could be as simple as this one :
SELECT chip_id, MAX(sample_id)
FROM table
GROUP BY chip_id
This way you are retrieving the maximum sample_id for each of the chip_id.
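What GROUP BY does with these rows can be modelled in a few lines of Python: collect the sample_id values per chip_id, then reduce each group with an aggregate. This is a sketch on the question's sample data, with max() standing in for MAX() and " or ".join() for string_agg().

```python
from collections import defaultdict

rows = [(1, 45), (1, 55), (1, 5986), (2, 453), (2, 12), (3, 4567), (3, 9)]

# Build one list of sample_id values per chip_id (the "groups").
groups = defaultdict(list)
for chip_id, sample_id in rows:
    groups[chip_id].append(sample_id)

# Analogue of MAX(sample_id) ... GROUP BY chip_id.
max_per_chip = {chip: max(samples) for chip, samples in groups.items()}

# Analogue of string_agg(sample_id::text, ' or ') ... GROUP BY chip_id.
joined_per_chip = {chip: " or ".join(map(str, samples)) for chip, samples in groups.items()}

print(max_per_chip)     # {1: 5986, 2: 453, 3: 4567}
print(joined_per_chip)  # {1: '45 or 55 or 5986', 2: '453 or 12', 3: '4567 or 9'}
```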

Postgres: regex and nested queries something like Unix pipes

The command should give 1 as output if the pattern "*#he.com" occurs in the row, excluding the headings:
user_id | username | email | passhash_md5 | logged_in | has_been_sent_a_moderator_message | was_last_checked_by_moderator_at_time | a_moderator
---------+----------+-----------+----------------------------------+-----------+-----------------------------------+---------------------------------------+-------------
9 | he | he#he.com | 6f96cfdfe5ccc627cadf24b41725caa4 | 0 | 1 | 2009-08-23 19:16:46.316272 |
In short, I want to connect many SELECT-commands with Regex, rather like Unix pipes. The output above is from a SELECT-command. A new SELECT-command with matching the pattern should give me 1.
Did you mean
SELECT regexp_matches( (SELECT whatevername FROM users WHERE username='masi'), 'masi');
you obviously can not feed the record (*) to regexp_matches, but I assume this is not what your problem is, since you mention the issue of nesting SQL queries in the subject.
Maybe you meant something like
SELECT regexp_matches( wn, 'masi' ) FROM (SELECT whatevername AS wn FROM users WHERE username LIKE '%masi%') AS sq;
for the case when your subquery yields multiple results.
It looks like you could use a regular expression query to match on the email address:
select * from table where email ~ '.*#he.com';
To return 1 from this query if there is a match:
select distinct 1 from table where email ~ '.*#he.com';
This will return a single row containing a column with 1 if there is a match, otherwise no rows at all. There are many other possible ways to construct such a query.
Let's say that your original query is:
select * from users where is_active = true;
And say that you really want to match in any column (which is a bad idea for a lot of reasons), and you just want to check whether "*#he.com" matches any row. By the way, that is not a correct regexp: the correct form would be .*#he.com, but since there are no anchors (^ or $) you can just write #he.com.
select 1 from (
select * from users where is_active = true
) as x
where textin(record_out( x )) ~ '#he.com'
limit 1;
of course you can also select all columns:
select * from (
select * from users where is_active = true
) as x
where textin(record_out( x )) ~ '#he.com'
limit 1;
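The idea of turning the whole record into text and running an unanchored regex over it can be sketched in Python like this. It is only a model: the row text is built by joining the columns with commas, which approximates but does not reproduce what record_out produces, and the second row and the function name any_row_matches are made up for the example.

```python
import re

rows = [
    (9, "he", "he#he.com"),
    (10, "masi", "masi#example.org"),  # hypothetical extra row
]

def any_row_matches(rows, pattern):
    """Return 1 if the pattern is found anywhere in the text of any row, else None."""
    regex = re.compile(pattern)
    for row in rows:
        text = "(" + ",".join(str(col) for col in row) + ")"
        # search() is unanchored, like the ~ operator without ^ or $.
        if regex.search(text):
            return 1
    return None

print(any_row_matches(rows, "#he.com"))       # 1
print(any_row_matches(rows, ".*#he.com"))     # 1, the leading .* changes nothing
print(any_row_matches(rows, "#nowhere.org"))  # None
```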