Search for the occurrence of a list of values - SQL

I'm trying to find an efficient way to identify whether a specific sequence of values occurs in a list.
For example, let's assume the following list of records in a table:
Id Value
1 A
2 B
3 A
4 C
5 A
6 B
7 C
8 C
9 A
I'm trying to find a way to check how many times the sequence {A, B} or {A, B, C} occurs, for example.
I know I can do this with cursors, but I was checking whether there's any other option that would be preferable in terms of performance.
The result I'd expect would be something like this:
{A, B}: 2 times.
{A, B, C}: 1 time.
I'm using SQL Server.

Probably the simplest way is to use the ANSI standard functions lag() and/or lead():
select count(*)
from (select t.*,
             lead(value) over (order by id) as next_value,
             lead(value, 2) over (order by id) as next_value2
      from t
     ) t
where value = 'A' and next_value = 'B' and next_value2 = 'C';
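To count both sequences in one pass, a variation along these lines should work (a sketch against the table and column names assumed from the question; on the sample data it should return ab_count = 2 and abc_count = 1, matching the expected output):
select sum(case when value = 'A' and next_value = 'B'
                then 1 else 0 end) as ab_count,
       sum(case when value = 'A' and next_value = 'B' and next_value2 = 'C'
                then 1 else 0 end) as abc_count
from (select t.*,
             lead(value) over (order by id) as next_value,
             lead(value, 2) over (order by id) as next_value2
      from t
     ) t;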

Related

For each value in column A, find the number of values in column B that are greater than it

Let's say I have a table with 2 columns - A & B.
Using plain SQL (no scripts/cursors etc.), how do I (window function?) calculate, for EACH value in column A, the number of values in column B that are bigger/smaller than it?
Thank you.
You would use conditional aggregation:
select a,
sum(case when b < a then 1 else 0 end)
from t
group by a;
Window functions don't seem appropriate to this question.
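If the intent is instead to compare each value of a against every value in column b across the whole table (not only the b values on rows sharing that a), a correlated-subquery sketch could look like this, reusing the table name t from above; swap < for > for the "bigger than" case:
select distinct a,
       (select count(*) from t t2 where t2.b < t.a) as num_smaller
from t;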

Postgres union of queries in loop

I have a table with two columns. Let's call them
array_column and text_column
I'm trying to write a query to find out, for K ranging from 1 to 10, in how many rows the value in text_column appears in the first K elements of array_column.
I'm expecting results like:
k | count
________________
1 | 70
2 | 85
3 | 90
...
I did manage to get these results by simply repeating the query 10 times and uniting the results, which looks like this:
SELECT 1 AS k, count(*) FROM table WHERE array_column[1:1] @> ARRAY[text_column]
UNION ALL
SELECT 2 AS k, count(*) FROM table WHERE array_column[1:2] @> ARRAY[text_column]
UNION ALL
SELECT 3 AS k, count(*) FROM table WHERE array_column[1:3] @> ARRAY[text_column]
...
But that doesn't look like the correct way to do it. What if I wanted a very large range for K?
So my question is, is it possible to perform queries in a loop, and unite the results from each query? Or, if this is not the correct approach to the problem, how would you do it?
Thanks in advance!
You could use array_positions(), which returns an array of all positions where the argument was found in the array, e.g.:
select t.*,
array_positions(array_column, text_column)
from the_table t;
This returns a different result but is a lot more efficient as you don't need to increase the overall size of the result. To only consider the first ten array elements, just pass a slice to the function:
select t.*,
array_positions(array_column[1:10], text_column)
from the_table t;
To limit the result to only rows that actually contain the value you can use:
select t.*,
array_positions(array_column[1:10], text_column)
from the_table t
where text_column = any(array_column[1:10]);
To get your desired result, you could use unnest() to turn that into rows:
select k, count(*)
from the_table t, unnest(array_positions(array_column[1:10], text_column)) as k
where text_column = any(array_column[1:10])
group by k
order by k;
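If you do need the cumulative counts per k from the original query, one sketch (assuming Postgres 9.5+ for array_position(), which returns the position of the first match) is to pair it with generate_series():
select k,
       count(*) filter (where array_position(array_column, text_column) <= k) as count
from the_table
cross join generate_series(1, 10) as k
group by k
order by k;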
You can use the generate_series function to generate a table with the expected number of rows with the expected values and then join to it within the query, like so:
SELECT t.k AS k, count(text_column) AS count
FROM table
-- right join plus counting a column from the data table (rather than count(*))
-- ensures that you get a value of 0 if there are no records meeting the criteria
right join (select generate_series(1,10) as k) t
  on array_column[1:t.k] @> ARRAY[text_column]
group by t.k
This is probably the closest thing to using a loop to go through the results without using something like PL/pgSQL to do an actual loop in a user-defined function.
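For completeness, if you really did want an explicit loop, a minimal PL/pgSQL sketch could look like the following (the function name counts_per_k is mine, and the_table stands in for your table):
create or replace function counts_per_k(max_k int default 10)
returns table(k int, cnt bigint)
language plpgsql
as $$
begin
  -- run one count per value of k and append it to the result set
  for i in 1..max_k loop
    return query
      select i, count(*)
      from the_table
      where array_column[1:i] @> array[text_column];
  end loop;
end;
$$;
-- usage: select * from counts_per_k(10);
The set-based answers above will normally perform better; this is mainly useful if the per-k query becomes too complex to express in one statement.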

How to Quickly Flatten a SQL Table

I'm using Presto. If I have a table like:
ID  CATEGORY  VALUE
1   a         ...
1   b
1   c
2   a
2   b
3   b
3   d
3   e
3   f
How would you convert to the below without writing a case statement for each combination?
ID A B C D E F
1
2
3
I've never used Presto and the documentation seems pretty thin, but based on this article it looks like you could do
SELECT
id,
kv['A'] AS A,
kv['B'] AS B,
kv['C'] AS C,
kv['D'] AS D,
kv['E'] AS E,
kv['F'] AS F
FROM (
SELECT id, map_agg(category, value) kv
FROM vtable
GROUP BY id
) t
Although I'd recommend doing this in the display layer if possible since you have to specify the columns. Most reporting tools and UI grids support some sort of dynamic pivoting that will create columns based on the source data.
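One caveat: if some ids are missing a category, the map subscript kv['X'] can raise a "key not present" error in Presto, while element_at() returns NULL for absent keys, so a safer variant of the same query would be:
SELECT
  id,
  element_at(kv, 'A') AS A,
  element_at(kv, 'B') AS B,
  element_at(kv, 'C') AS C,
  element_at(kv, 'D') AS D,
  element_at(kv, 'E') AS E,
  element_at(kv, 'F') AS F
FROM (
  SELECT id, map_agg(category, value) kv
  FROM vtable
  GROUP BY id
) t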
My 2 cents:
If you know "possible" values:
SELECT
m['web'] AS web,
m['shopping'] AS shopping,
m['news'] AS news,
m['music'] AS music,
m['images'] AS images,
m['videos'] AS videos,
m[''] AS empty
FROM (
SELECT histogram(data_tab) AS m
FROM datahub
WHERE
year = 2017
AND month = 5
AND day = 7
AND name = 'search'
) searches
No PIVOT function (yet)!

How to construct SQL query for this...?

I would like to do a math operation, but one row at a time.
For example:
A    B    C     D
---------------------------
100  -50  =50   20160101
100  0    =150  20160102
100  -50  =100  20160103
So basically, column C would always be the sum of all past A + B values, but not future ones. Does anyone have an idea how to achieve this in SQL?
I can do this in application code, but I would like to do it in SQL and just show the result in the table.
P.S. My English is not the best, so feel free to ask if I was not clear enough.
This is called a cumulative or running sum. The normal method uses ANSI standard window functions:
select a, b,
sum(a + b) over (order by d) as c,
d
from t;
If your version of SQL doesn't support window functions, then you can use a correlated subquery (performance would generally be much worse):
select a, b,
(select sum(a + b) from t t2 where t2.d <= t.d) as c,
d
from t;
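If the ordering column d can contain duplicates, it may be worth spelling out the window frame, since the default RANGE frame gives rows with the same d value the same running total; a sketch with an explicit row-by-row frame:
select a, b,
       sum(a + b) over (order by d
                        rows between unbounded preceding and current row) as c,
       d
from t;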

Oracle - Use value in first cell to determine value of second cell

I have an interesting requirement - I need to use the value of the first cell in a row to determine the value of the fourth cell in a row. Normally this would be handled at the application level or within a function, but I'm stuck doing it in a normal select query.
Here are the details.
1) I have a simple query (select A, B, C from D) that correctly returns the following:
1 | 2 | 3
2) I have a function that leverages the values returned in the first query and returns a value
select function_x('1') from dual
accurately returns 'Z'
I want to concatenate all of them so I get the following:
1 | 2 | 3 | Z
I tried something like this query but it doesn't work:
select A, B, C, (select function_x(A) from dual)
from D
It works when I hard code a value into the function, but doesn't work when I try to leverage the first returned value.
Are there any solutions available without me creating a function?
select A, B, C, function_x(A) from D
I figured it out; I had to use a subquery:
select A, B, C, function_x(A) from (select A, B, C from D)