Is it possible to SELECT multiple constants into multiple resultset rows in SQL?

I know I can "SELECT 5 AS foo" and get the resultset:
foo
5
(1 row)
...is there any way to "SELECT 5,6,7 AS foo" and get the resultset:
foo
5
6
7
(3 rows)
...I'm well aware that this is not typical DB usage, and any conceivable usage of this is probably better off going w/ a more ordinary technique. More of a technical question.
Note: I know I could use a big gross list of UNIONs -- I'm trying to find something else.

This is easy with a numbers table. Here is an example using SQL Server's master..spt_values:
select number as foo
from master..spt_values
where type = 'p'
and number between 5 and 7
Or, if you want to use IN:
select number as foo
from master..spt_values
where type = 'p'
and number in(5,6,7)

select foo
from (select 1 as n1, 2 as n2, 3 as n3) bar
unpivot (foo for x in (n1, n2, n3)) baz;

It's possible using these and other techniques (as anyone who has interviewed for a database developer's position will tell you). But it's usually easier (and the tools are more appropriate) to do this in another abstraction layer; i.e. your DAL, or beyond, where you view the data as a list of some kind. Although rdbms products provide facilitators, it's a distortion of the relational conceptual model.

Just for fun, wouldn't dream of using it for real:
WITH numbers AS
(
SELECT ROW_NUMBER() OVER (ORDER BY name) AS RowNumber
FROM sys.all_objects
)
SELECT RowNumber
FROM numbers
WHERE RowNumber BETWEEN 5 AND 7;
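
A row-value constructor is one more option that avoids both a numbers table and a chain of UNIONs. The sketch below runs it on SQLite through Python's sqlite3 module, purely so that it is self-contained; the CTE name t is an arbitrary choice. SQL Server 2008 and later and PostgreSQL accept the closely related form SELECT foo FROM (VALUES (5), (6), (7)) AS t(foo); other products may differ.

```python
import sqlite3

# In-memory database; the query needs no tables at all.
conn = sqlite3.connect(":memory:")

# A VALUES list yields one row per parenthesised tuple.
# The CTE header t(foo) names the single column.
rows = conn.execute(
    """
    WITH t(foo) AS (VALUES (5), (6), (7))
    SELECT foo FROM t ORDER BY foo
    """
).fetchall()

print(rows)  # [(5,), (6,), (7,)]
```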


Oracle - Coordinate extraction from vertices (first, last and all vertices)

For a given feature (line or area) and for all of its members I need to extract the coordinates of (1) all the vertices, (2) the first vertex and (3) the last vertex (3 separate queries to create 3 different sets of results)
I'm using Oracle spatial.
I’ve tested this sql code for the table ARAMAL (it's a 3d line entity; Primary key column: IPID; geometry column: GEOMETRY) and it works well.
(1) - List all the vertices
SELECT A.IPID, t.X, t.Y, t.Z, t.id FROM ARAMAL A,
TABLE(SDO_UTIL.GETVERTICES(A.GEOMETRY)) t ORDER BY A.IPID, t.id;
Result (example for IPID=1479723):
IPID X Y Z id
1479723 -99340.38408 -102364.3603 10 1
1479723 -99341.21035 -102366.2701 11 2
1479723 -99342.03375 -102368.1783 12 3
1479723 -99342.86238 -102370.0875 13 4
... ... .... ... ...
(2) - List the first vertex
SELECT A.IPID, t.X, t.Y, t.Z, t.id FROM ARAMAL A,
TABLE(SDO_UTIL.GETVERTICES(A.GEOMETRY)) t where t.id=1 ORDER BY A.IPID;
Result (example for IPID=1479723)
IPID X Y Z id
1479723 -99340.38408 -102364.3603 10 1
(3) How can I obtain the last vertex purely with sql (no additional functions)?
(Expected) Result (example for IPID=1479723)
IPID X Y Z id
1479723 -99342.86238 -102370.0875 13 4
I guess this process could run faster if I use specific functions - I would also like to be able to use them.
I’ve come across a great site (Simon Greener) with some functions that I guess could do the trick
http://spatialdbadvisor.com/oracle_spatial_tips_tricks/322/st_vertexn-extracting-a-specific-point-from-any-geometry
The functions are:
ST_StartPoint
CREATE OR REPLACE
FUNCTION ST_StartPoint(p_geom IN mdsys.sdo_geometry)
RETURN mdsys.sdo_geometry
IS
BEGIN
RETURN ST_PointN(p_geom,1);
END ST_StartPoint;
/
ST_EndPoint
CREATE OR REPLACE
FUNCTION ST_EndPoint(p_geom IN mdsys.sdo_geometry)
RETURN mdsys.sdo_geometry
IS
BEGIN
RETURN ST_PointN(p_geom,-1);
END ST_EndPoint;
/
I’m a newbie to this world and I don’t really get the syntax of these functions…
For the table ARAMAL that I’ve used before how should I use/apply them to get the results (and in the format) I need?
IPID X Y Z id
1479723 -99340.38408 -102364.3603 10 1
....
Thanks in advance,
Best regards,
Pedro
It doesn't matter that you are working with spatial data. You are interested in the row that has the maximum id for a given ipid, so you can write it like this:
select *
from (
select a.ipid, t.x, t.y, t.z, t.id,
max(t.id) over (partition by a.ipid) mx_id
from aramal a, table(sdo_util.getvertices(a.geometry)) t)
where id = mx_id;
There are several ways to get the last row: you can use row_number() or a subquery, as in many of the top-n questions on this site.
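
As a rough illustration of the row_number() variant, here is a sketch that runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module. The vertices table is a hypothetical stand-in for the rows that TABLE(SDO_UTIL.GETVERTICES(...)) returns in Oracle, and the second feature (ipid 42) is made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the unnested output of SDO_UTIL.GETVERTICES.
conn.execute(
    "CREATE TABLE vertices (ipid INTEGER, x REAL, y REAL, z REAL, id INTEGER)"
)
conn.executemany(
    "INSERT INTO vertices VALUES (?, ?, ?, ?, ?)",
    [
        (1479723, -99340.38408, -102364.3603, 10, 1),
        (1479723, -99341.21035, -102366.2701, 11, 2),
        (1479723, -99342.03375, -102368.1783, 12, 3),
        (1479723, -99342.86238, -102370.0875, 13, 4),
        (42, 1.0, 2.0, 3.0, 1),  # made-up second feature
        (42, 4.0, 5.0, 6.0, 2),
    ],
)

# Number each feature's vertices from the end; rn = 1 is the last vertex.
last_vertices = conn.execute(
    """
    SELECT ipid, x, y, z, id
    FROM (
        SELECT v.*,
               ROW_NUMBER() OVER (PARTITION BY ipid ORDER BY id DESC) AS rn
        FROM vertices AS v
    )
    WHERE rn = 1
    ORDER BY ipid
    """
).fetchall()

for row in last_vertices:
    print(row)
```

In Oracle the inner FROM clause would be the aramal a, table(sdo_util.getvertices(a.geometry)) t join from the answer above; the window function part stays the same.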

select top N for each category w/o sorting if there are less than N rows

Given the following table, the question is to find for example the top N C2 from each C1.
C1 C2
1 1
1 2
1 3
1 4
1 ...
2 1
2 2
2 3
2 4
2 ...
....
So if N = 3, the results are
C1 C2
1 1
1 2
1 3
2 1
2 2
2 3
....
The proposed solutions use the window function and partition by
Select top 10 records for each category
https://www.the-art-of-web.com/sql/partition-over/
For example,
SELECT rs.Field1,rs.Field2
FROM (
SELECT Field1,Field2, Rank()
over (Partition BY Section
ORDER BY RankCriteria DESC ) AS Rank
FROM table
) rs WHERE Rank <= 3
I guess it sorts each partition and then picks the top N.
However, if a category has fewer than N elements, we can get its top N without sorting, because the top N must include every element in that category.
The above query uses Rank(). My question applies to other window functions too, such as row_number() or dense_rank().
Is there a way to skip the sorting in that case?
Also I am not sure if the underlying engine can optimize the case: whether the inner partition/order considers the outer where constraints before sorting.
Using partition+order+where is a way to get the top-N elements from each category. It works perfectly if each category has more than N elements, but incurs an additional sorting cost otherwise. My question is whether there is another approach that works well in both cases. Ideally it does the following:
for each category {
if # of elements <= N:
take all elements without sorting
continue
sort and get the top N
}
For example, the following works, but is there a better SQL?
WITH table_with_count AS (
SELECT Field1, Field2, Section, RankCriteria, count(*) over (PARTITION BY Section) as c
FROM table
),
rs AS (
SELECT Field1,Field2, Rank()
over (Partition BY Section
ORDER BY RankCriteria DESC ) AS Rank
FROM table_with_count
where c > 10
)
(SELECT Field1,Field2 FROM rs WHERE Rank <= 10)
union all
(SELECT Field1,Field2 FROM table_with_count WHERE c <= 10)
No, and there really shouldn't be. Overall, what you describe here is an XY problem.
You seem to:
Worry about sorting, while in fact sorting (with optional secondary sort) is the most efficient way of shuffling / repartitioning data, as it doesn't lead to proliferation of file descriptors. In practice Spark strictly prefers sort over alternatives (hashing) for exactly that reason.
Worry about "unnecessary" sorting of small groups, when in fact the problem is intrinsic inefficiency of window functions, which require full shuffle of all data, therefore exhibit the same behavior pattern as infamous groupByKey.
There are more efficient patterns (MLPairRDDFunctions.topByKey being the most prominent example), but these haven't been ported to the Dataset API and would require a custom Aggregator. It is also possible to approximate selection (for example through quantile approximation), but this increases the number of passes over the data and in many cases won't provide any performance gains.
This is too long for a comment.
There is no such optimization. Basically, all the data is sorted when using windowing clauses. I suppose that a database engine could actually use a hash algorithm for the partition by and a sort algorithm for the order by, but I don't think that is a common approach.
In any case, the operation is over the entire set, and it should be optimized for this purpose. Trying not to order a subset would add lots of overhead -- for instance, running the sort multiple times for each subset and counting the number of rows in each subset.
Also note that the comparison to "3" occurs (logically) after the window function. I don't think window functions are typically optimized for such post-filtering (although once again, it is a possible optimization).
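
For reference, the plain partition + order + filter pattern already returns the correct rows in both cases, so splitting the query changes only the cost, not the result. The sketch below checks that on SQLite (3.25 or later) through Python's sqlite3 module; the items table, its column names and its data are made up, and it says nothing about how any particular engine plans the sort.

```python
import sqlite3

N = 3  # rows to keep per section

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (section TEXT, field1 TEXT, rank_criteria INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("A", "a1", 10), ("A", "a2", 50), ("A", "a3", 30),
        ("A", "a4", 20), ("A", "a5", 40),   # section A: more than N rows
        ("B", "b1", 7), ("B", "b2", 9),     # section B: fewer than N rows
    ],
)

# The window function is evaluated first; rn <= N is applied afterwards.
top_n = conn.execute(
    """
    SELECT section, field1
    FROM (
        SELECT section, field1,
               ROW_NUMBER() OVER (PARTITION BY section
                                  ORDER BY rank_criteria DESC) AS rn
        FROM items
    )
    WHERE rn <= ?
    ORDER BY section, rn
    """,
    (N,),
).fetchall()

print(top_n)
# [('A', 'a2'), ('A', 'a5'), ('A', 'a3'), ('B', 'b2'), ('B', 'b1')]
```

Section B comes back whole because every one of its rows gets a row number no greater than N.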

Possible explanation on WITH RECURSIVE Query Postgres

I have been reading about WITH queries in Postgres, and this is the one that puzzles me:
WITH RECURSIVE t(n) AS (
VALUES (1)
UNION ALL
SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;
I'm not able to understand how the evaluation of the query works.
t(n) looks like a function with a parameter. How is the value of n passed?
Any insight into how the recursive statement is broken down in SQL would be appreciated.
This is called a common table expression and is a way of expressing a recursive query in SQL:
t(n) defines the name of the CTE as t, with a single column named n. It's similar to an alias for a derived table:
select ...
from (
...
) as t(n);
The recursion starts with the value 1 (that's the values (1) part) and then recursively adds one to it until 100 is reached: the last row that passes n < 100 is 99, which produces 100. So it generates the numbers from 1 to 100. The final query then sums up all those numbers.
n is a column name, not a "variable" and the "assignment" happens in the same way as any data retrieval.
WITH RECURSIVE t(n) AS (
VALUES (1) --<< this is the recursion "root"
UNION ALL
SELECT n+1 FROM t WHERE n < 100 --<< this is the "recursive part"
)
SELECT sum(n) FROM t;
If you "unroll" the recursion (which in fact is an iteration) then you'd wind up with something like this:
select x.n + 1
from (
select x.n + 1
from (
select x.n + 1
from (
select x.n + 1
from (
values (1)
) as x(n)
) as x(n)
) as x(n)
) as x(n)
More details in the manual:
https://www.postgresql.org/docs/current/static/queries-with.html
If you are looking for how it is evaluated, the recursion occurs in two phases.
The root is executed once.
The recursive part is executed until no rows are returned. The documentation is a little vague on that point.
Now, normally in databases, we think of "function" in a different way than we think of them when we do imperative programming. In database terms, the best way to think of a function is "a correspondence where for every domain value you have exactly one corresponding value." So one of the immediate challenges is to stop thinking in terms of programming functions. Even user-defined functions are best thought about in this other way since it avoids a lot of potential nastiness regarding the intersection of running the query and the query planner... So it may look like a function but that is not correct.
Instead the WITH clause uses a different, almost inverse notation. Here you have the set name t, followed (optionally in this case) by the tuple structure (n). So this is not a function with a parameter, but a relation with a structure.
So how this breaks down:
SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM (SELECT 1 AS n) AS t WHERE n < 100
UNION ALL
SELECT n + 1 FROM (SELECT n + 1 AS n FROM (SELECT 1 AS n) AS t WHERE n < 100) AS t WHERE n < 100
Of course that is a simplification because internally we keep track of the cte state and keep joining against the last iteration, so in practice these get folded back to near linear complexity (while the above diagram would suggest much worse performance than that).
So in reality you get something more like:
SELECT 1 AS n
UNION ALL
SELECT 1 + 1 AS n WHERE 1 < 100
UNION ALL
SELECT 2 + 1 AS n WHERE 2 < 100
...
In essence the previous values carry over.
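
To make the two phases concrete, the sketch below runs the same query on SQLite through Python's sqlite3 module (SQLite accepts this WITH RECURSIVE syntax as well) and then repeats the evaluation as an explicit loop. The names working and result are illustrative only; this mirrors the evaluation described above and is not how PostgreSQL is actually implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
total, count, largest = conn.execute(
    """
    WITH RECURSIVE t(n) AS (
        VALUES (1)                        -- root, executed once
        UNION ALL
        SELECT n + 1 FROM t WHERE n < 100 -- recursive part
    )
    SELECT sum(n), count(*), max(n) FROM t
    """
).fetchone()

# The same evaluation as a loop over a "working table".
result = []
working = [1]                 # rows produced by the root
while working:                # stop once an iteration yields no rows
    result.extend(working)
    # The recursive part only sees the rows from the previous iteration.
    working = [n + 1 for n in working if n < 100]

print(total, count, largest)  # 5050 100 100
print(sum(result))            # 5050
```

The filter n < 100 is applied to the previous row, so 99 still produces 100 and the sequence ends there.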

How can I get sets of 2 entries where each set starts with a different letter and no letter is repeated from a sqlite database?

Like this:
apple
aardvark
banana
bet
cow
car
...
zipper
zoo
Assuming the database has more than just two different entries that start with any of the letters. I was thinking of doing something with TOP and wildcards, but I don't really know enough about SQL to pull this off. What can I do?
You can do this with the substr function and a correlated subquery:
SELECT *
FROM YourTable a
WHERE wordField IN (SELECT wordField
FROM YourTable AS b
WHERE substr(a.wordField ,1,1) = substr(b.wordField ,1,1)
ORDER BY wordField
LIMIT 2)
You can use the ORDER BY to adjust which 2 records are returned, for example ORDER BY RANDOM(), if that's supported.
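
Here is a small check of that query on an in-memory SQLite database, driven from Python's sqlite3 module. The word list is made-up sample data, and the table and column names follow the placeholders used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (wordField TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [(w,) for w in ["apple", "aardvark", "avocado",
                    "banana", "bet", "bear", "cow", "car"]],
)

# For each outer row, the subquery lists the first two words (alphabetically)
# that share its initial letter; the row is kept if it is one of them.
words = [row[0] for row in conn.execute(
    """
    SELECT a.wordField
    FROM YourTable AS a
    WHERE a.wordField IN (SELECT b.wordField
                          FROM YourTable AS b
                          WHERE substr(a.wordField, 1, 1) = substr(b.wordField, 1, 1)
                          ORDER BY b.wordField
                          LIMIT 2)
    ORDER BY a.wordField
    """
)]

print(words)  # ['aardvark', 'apple', 'banana', 'bear', 'car', 'cow']
```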

SQL and number combination search

I have table with 10 number fields (let's say F1, F2... F10).
Now I have 4 numbers (N1, N2, N3, N4).
I have to find if those 4 numbers appear anywhere in the above table. For example, if F2=N4 and F1=N2 and Fx=N3 and Fy=N1 (any order, any combination).
I was wondering whether there is a quick way to do it in SQL, or whether the only way is to write a looooong combination of selects (I am not sure I would even be able to finish that in this lifetime).
Below is a sample query (N1 to N4 stand for the four numbers you are searching for):
select * from Temp
where N1 in (F1,F2,F3,F4,F5,F6,F7,F8,F9,F10)
and N2 in (F1,F2,F3,F4,F5,F6,F7,F8,F9,F10)
and N3 in (F1,F2,F3,F4,F5,F6,F7,F8,F9,F10)
and N4 in (F1,F2,F3,F4,F5,F6,F7,F8,F9,F10)
One way to do it (if your database supports it) would be to unpivot the data so that each of the 10 column values gets its own row.
So
ID F1 F2 F3 .. Fn
1 1 2 3 10
Becomes
ID F
1 1
1 2
1 3
..
1 10
You can now query for the existence of a given value of F against a single column, which simplifies things somewhat.
SQL Server supports this functionality through its PIVOT and UNPIVOT operators.
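
Where no UNPIVOT operator is available, the same reshaping can be approximated with a UNION ALL over the columns. The sketch below does this on SQLite through Python's sqlite3 module and builds the repetitive SQL in a loop; the sample rows and the four searched numbers are made up, and the COUNT(DISTINCT ...) test assumes the four numbers are all different.

```python
import sqlite3

cols = [f"F{i}" for i in range(1, 11)]  # F1 .. F10

conn = sqlite3.connect(":memory:")
col_defs = ", ".join(f"{c} INTEGER" for c in cols)
conn.execute(f"CREATE TABLE Temp (ID INTEGER PRIMARY KEY, {col_defs})")
conn.executemany(
    f"INSERT INTO Temp VALUES ({', '.join('?' * 11)})",
    [
        (1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
        (2, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
        (3, 9, 30, 4, 31, 2, 32, 7, 33, 34, 35),
    ],
)

# Unpivot: one (ID, F) row per original column value.
unpivot = " UNION ALL ".join(f"SELECT ID, {c} AS F FROM Temp" for c in cols)

wanted = (2, 4, 7, 9)  # N1 .. N4, assumed distinct

# A row matches when all four wanted values occur among its F values.
ids = [row[0] for row in conn.execute(
    f"""
    SELECT ID
    FROM ({unpivot})
    WHERE F IN (?, ?, ?, ?)
    GROUP BY ID
    HAVING COUNT(DISTINCT F) = 4
    ORDER BY ID
    """,
    wanted,
)]

print(ids)  # [1, 3]
```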