Create view with multiple rows for one source row - sql

I have a table which has an offset and a qty column.
I now want to create a view from that which has an entry for each precise position.
Table:
offset | qty | more_data
-------+---------+-------------
1 | 3 | 'qwer'
2 | 2 | 'asdf'
View:
position | more_data
---------+------------
1 | 'qwer'
2 | 'qwer'
3 | 'qwer'
2 | 'asdf'
3 | 'asdf'
Is that even possible?
I would need to do that for Oracle (8! - 11), MS SQL (2005-) and PostgreSQL (8-)

Based on your input/output:
with t(offset, qty) as (
select 1, 3 from dual
)
select offset + level - 1 position
from t
connect by rownum <= qty
POSITION
--------
1
2
3

For Postgres:
select offst, generate_series(offst, offst + qty - 1) as position
from the_table
order by offst, position;
SQLFiddle: http://sqlfiddle.com/#!10/e70d9/4
I don't have anything as ancient as 8.0, 8.1 or 8.2 around but it should work on those pre-historic versions as well.
Note that offset is a reserved word in Postgres. You should find a different name for that column.

In Oracle, to answer the specific question (i.e. a table with just the one row):
select rn posn from (
select offset-1+rownum rn from the_table
connect by level between offset and qty
);
In reality, your table will have multiple rows, so you will need to restrict the inner query to 1 object row, otherwise I think you will get huge, incorrect output. If you can provide more details about the table/data a more complete answer could be given.
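For the multi-row case, one portable idea is a recursive CTE that walks from each row's offset up to offset + qty - 1. The following is only a sketch under assumptions: it runs against an in-memory SQLite database from Python rather than Oracle, MS SQL or PostgreSQL, the table name the_table and the column name offst are borrowed from the answers above, and the helper expand_positions is hypothetical. Recursive CTEs are not available on the oldest versions named in the question, so this would not cover Oracle 8 or PostgreSQL 8.0-8.3.

```python
import sqlite3


def expand_positions(rows):
    """Return one (position, more_data) tuple per position covered by each source row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (offst INTEGER, qty INTEGER, more_data TEXT)")
    conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
    sql = """
    WITH RECURSIVE expanded(offst, position, last_position, more_data) AS (
        -- anchor: the first position of every source row
        SELECT offst, offst, offst + qty - 1, more_data
        FROM the_table
        WHERE qty > 0
        UNION ALL
        -- step: advance one position until the last one is reached
        SELECT offst, position + 1, last_position, more_data
        FROM expanded
        WHERE position < last_position
    )
    SELECT position, more_data
    FROM expanded
    ORDER BY offst, position
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


positions = expand_positions([(1, 3, "qwer"), (2, 2, "asdf")])
for position, more_data in positions:
    print(position, more_data)
```

Because each source row carries its own last_position through the recursion, rows do not multiply against each other the way an unrestricted CONNECT BY can.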

Related

Average of successive pairs of rows

I have a table like so:
id | value
---+------
1 | 10
2 | 5
3 | 11
4 | 8
5 | 9
6 | 7
The data in this table is really pairs of values, which I need to take the average of, which should result in:
pair_id | pair_avg
--------+---------
1 | 7.5
2 | 9.5
3 | 8
I have got some other information (a pair of flags) which could also help to pair them, though they still have to be in id order. I cannot really change how the data comes to me.
As I'm more used to arrays than SQL, all I can think is that I need to loop through the table and sum the pairs. But this doesn't strike me as very SQL-ish.
Update
In making this minimal example, I have apparently over simplified.
As the table I am working with is the result of several selects, the IDs will not be quite so clean, apologies for not specifying this.
The table looks a lot more like:
id | value
----------
1 | 10
4 | 5
6 | 11
7 | 8
10 | 9
15 | 7
The results will be used to create a second table, I don't care about the index on this new table, it can provide its own, therefore giving the result already indicated above.
If your data is as clean as the question makes it seem: no NULL values, no gaps, pairs have consecutive positive numbers, starting with 1, and assuming id is type integer, it can be as simple as:
SELECT (id+1)/2 AS pair_id, avg(value) AS pair_avg
FROM tbl
GROUP BY 1
ORDER BY 1;
Integer division truncates the result and thus takes care of grouping pairs automatically this way.
If your id numbers are not as regular but at least strictly monotonically increasing like your update suggests (still no NULL or missing values), you can use a surrogate ID generated with row_number() instead:
SELECT id/2 AS pair_id, avg(value) AS pair_avg
FROM (SELECT row_number() OVER (ORDER BY id) + 1 AS id, value FROM tbl) t
GROUP BY 1
ORDER BY 1;
db<>fiddle here
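The pairing arithmetic can be checked outside the database. This is a minimal Python sketch, not the query itself: it assumes the rows are sorted by id, numbers them from 1 the way row_number() would, and uses integer division of (row number + 1) by 2 as the pair key. The function name pair_averages is hypothetical.

```python
from collections import defaultdict


def pair_averages(rows):
    """Average the values of consecutive pairs of (id, value) rows, taken in id order."""
    groups = defaultdict(list)
    # enumerate(..., start=1) plays the role of row_number() OVER (ORDER BY id)
    for row_num, (_id, value) in enumerate(sorted(rows), start=1):
        pair_id = (row_num + 1) // 2  # integer division: rows 1,2 -> 1; rows 3,4 -> 2; ...
        groups[pair_id].append(value)
    return [(pair_id, sum(vals) / len(vals)) for pair_id, vals in sorted(groups.items())]


# The irregular ids from the question's update
rows = [(1, 10), (4, 5), (6, 11), (7, 8), (10, 9), (15, 7)]
averages = pair_averages(rows)
print(averages)
```

With an odd number of rows the last group holds a single value, so its "average" is just that value.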
I think you can just use group by with arithmetic:
select row_number() over (order by min(id)), min(id), max(id), avg(value)
from t
group by floor( (id - 1) / 2 );
I'm not sure why you would want to renumber the ids after aggregation. The original ids seem more useful.
You can use the ceil function on the id column divided by 2, as in the following select statement:
with t(id,value) as
(
select 1 , 10 union all
select 2 , 5 union all
select 3 , 11 union all
select 4 , 8 union all
select 5 , 9 union all
select 6 , 7
)
select ceil(id/2::numeric) as "ID", avg(t.value) as "pair_avg"
from t
group by "ID"
order by "ID";
id | pair_avg
-------------
1 | 7.5
2 | 9.5
3 | 8

Deterministic sort order for window functions

I've a status table and I want to fetch the latest details.
Slno | ID | Status | date
1 | 1 | Pass | 15-06-2015 11:11:00 - this is inserted first
2 | 1 | Fail | 15-06-2015 11:11:00 - this is inserted second
3 | 2 | Fail | 15-06-2015 12:11:11 - this is inserted first
4 | 2 | Pass | 15-06-2015 12:11:11 - this is inserted second
I use a window function with partition by ID order by date desc to fetch the first value.
Expected Output :
2 | 1 | Fail | 15-06-2015 11:11:00 - this is inserted second
4 | 2 | Pass | 15-06-2015 12:11:11 - this is inserted second
Actual Output :
1 | 1 | Pass | 15-06-2015 11:11:00 - this is inserted first
3 | 2 | Fail | 15-06-2015 12:11:11 - this is inserted first
According to [http://docs.aws.amazon.com/redshift/latest/dg/r_Examples_order_by_WF.html], adding a second ORDER BY column to the window function may solve the problem. But I don't have any other column to differentiate the rows!
Is there another approach to solve the issue?
EDIT: I've added slno here for clarity. I don't have slno as such in the table!
My SQL:
with range as (
select id from status where date between 01-06-2015 and 30-06-2015
), latest as (
select status, id, row_number() OVER (PARTITION BY id ORDER BY date DESC) row_num
)
select * from latest where row_num = 1
If you don't have slno in your table, then you don't have any reliable information which row was inserted first. There is no natural order in a table, the physical order of rows can change any time (with any update, or with VACUUM, etc.)
You could use an unreliable trick: order by the internal ctid.
select *
from (
select id, status
, row_number() OVER (PARTITION BY id
ORDER BY date DESC, ctid DESC) AS row_num -- latest date first, then the row stored last
from status -- that's your table name??
where date >= '2015-06-01' -- assuming column is actually a date
and date < '2015-07-01'
) sub
where row_num = 1;
In the absence of any other information about which row came first (which is a design error to begin with, fix it!), you might try to save what you can using the internal tuple ID ctid:
In-order sequence generation
Rows will be in physical order when inserted initially, but that can change any time with any write operation to the table or VACUUM or other events.
This is a measure of last resort and it will break.
Your presented query was invalid on several counts: missing column name in 1st CTE, missing table name in 2nd CTE, ...
You don't need a CTE for this.
Simpler with DISTINCT ON (considerations for ctid apply the same):
SELECT DISTINCT ON (id)
id, status
FROM status
WHERE date >= '2015-06-01'
AND date < '2015-07-01'
ORDER BY id, date DESC, ctid DESC;
Select first row in each GROUP BY group?
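The durable fix hinted at above is to store an insertion sequence and use it as the tie-breaker. The sketch below assumes an in-memory SQLite database driven from Python (window functions need SQLite 3.25 or later) instead of PostgreSQL or Redshift, and it assumes a real slno column can be added to the table as an auto-incrementing key. Dates are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# slno is an assumed auto-incrementing column that records insertion order
conn.execute("""
    CREATE TABLE status (
        slno   INTEGER PRIMARY KEY AUTOINCREMENT,
        id     INTEGER,
        status TEXT,
        date   TEXT
    )
""")
conn.executemany(
    "INSERT INTO status (id, status, date) VALUES (?, ?, ?)",
    [
        (1, "Pass", "2015-06-15 11:11:00"),  # inserted first
        (1, "Fail", "2015-06-15 11:11:00"),  # inserted second
        (2, "Fail", "2015-06-15 12:11:11"),  # inserted first
        (2, "Pass", "2015-06-15 12:11:11"),  # inserted second
    ],
)

# Latest date wins; among equal dates the highest slno (last inserted) wins
latest = conn.execute("""
    SELECT slno, id, status, date
    FROM (
        SELECT slno, id, status, date,
               row_number() OVER (PARTITION BY id
                                  ORDER BY date DESC, slno DESC) AS row_num
        FROM status
        WHERE date >= '2015-06-01' AND date < '2015-07-01'
    ) sub
    WHERE row_num = 1
    ORDER BY id
""").fetchall()
conn.close()

for row in latest:
    print(row)
```

Unlike ctid, a stored sequence value survives updates and VACUUM, so the result stays deterministic.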

Return only first row with particular value in a column

I realize that this has probably been asked a billion times, and I could swear I've done this in the past, but tonight I've got brain block or something and can't figure it out...
I have a database table ("t1") where I need to be able to retrieve only the first row where a particular value appears in a particular column.
Here's a sample of the data:
id | qID | Name
---------------------
1 | 1 | Bob
2 | 3 | Fred
3 | 1 | George
4 | 1 | Jack
What I want as a result is:
id | qID | Name
---------------------
1 | 1 | Bob
2 | 3 | Fred
The only column I actually need to get out of the query is the first one, but that's not where the duplicates need to be eliminated, and I thought it might be confusing not to show the entire row.
I've tried using this:
select id, qID, ROW_NUMBER() over(partition by qID order by qID) as zxy
from t1 where zxy = 1
But it gives me this error:
Msg 207, Level 16, State 1, Line 14
Invalid column name 'zxy'.
If I remove the where part of the query, the rest of it works fine. I've tried different variable names, using single or double quotes around 'zxy' but it seems to make no difference. And try as I might, I can't find the part of the SQL Server documentation where it discusses assigning a variable name to an expression, as in the "as zxy" part of the above query... if anybody has a link for that, that would be quite useful.
Needless to say, I've tried other variable names besides "zxy" but that makes no difference.
Help!
The WHERE clause is evaluated before SELECT in the logical processing order, so the computed column zxy is not yet available in WHERE. To achieve your goal, put your original query in a subquery or CTE.
select id, qid
from
(
-- order by id so that the lowest id in each qID group gets row number 1
select id, qID, ROW_NUMBER() over(partition by qID order by id) as zxy
from t1
) q
where zxy = 1
Output:
| id | qid |
|----|-----|
| 1 | 1 |
| 2 | 3 |
Here is SQLFiddle demo
Logical Processing Order of the SELECT statement
1 FROM
2 ON
3 JOIN
4 WHERE
5 GROUP BY
6 WITH CUBE or WITH ROLLUP
7 HAVING
8 SELECT
9 DISTINCT
10 ORDER BY
11 TOP
The WHERE clause is executed before the SELECT clause, so you cannot reference zxy in the WHERE clause.
with cte as
(
select id, qID, ROW_NUMBER() over(partition by qID order by id) as zxy
from t1
)
select * from cte where zxy = 1
Here is my blog; it might help you: http://sqlearth.blogspot.in/2015/05/how-sql-select-statement-logically-works.html
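Since the question says only the id column is really needed, a plain aggregate may be enough: the first row per qID is the one with the smallest id. This is a different technique from the ROW_NUMBER() answers above, shown here as a sketch against an in-memory SQLite database from Python rather than SQL Server; it assumes "first" means "lowest id".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, qID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 1, "Bob"), (2, 3, "Fred"), (3, 1, "George"), (4, 1, "Jack")],
)

# One output row per qID, carrying the smallest id in that group
first_ids = [
    row[0]
    for row in conn.execute("SELECT MIN(id) FROM t1 GROUP BY qID ORDER BY 1")
]
conn.close()

print(first_ids)
```

If the other columns of the first row are needed as well, the subquery or CTE form with ROW_NUMBER() is the more direct route.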

SQL group by and count fixed column values

I'm facing a problem in a data import script in SQL (MySQL) where I need to GROUP rows by type to COUNT how many rows there are of each type. So far, it isn't really a problem, because I know that I can do:
SELECT
data.type,
COUNT(data.type)
FROM data
GROUP BY data.type;
So, by doing it, I have the result:
-------------- ---------------------
| type | COUNT(data.type) |
|--------------|---------------------|
| 0 | 1 |
| 1 | 46 |
| 2 | 35 |
| 3 | 423 |
| 4 | 64 |
| 5 | 36 |
| 9 | 1 |
-------------- ---------------------
I know that in the type column the values will always be in the range from 0 to 9, like the above result. So, I would like to list not only the existing values in the table content but the missing type values too, with their COUNT value set to 0.
Based on the above query result, the expected result would be:
-------------- ---------------------
| type | COUNT(data.type) |
|--------------|---------------------|
| 0 | 1 |
| 1 | 46 |
| 2 | 35 |
| 3 | 423 |
| 4 | 64 |
| 5 | 36 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 1 |
-------------- ---------------------
I could use a trick: INSERT one row of each type before running GROUP/COUNT-1 on the table content, flagging another column on INSERT so that I can DELETE these rows afterwards. So, the steps of my import script would change to:
TRUNCATE table; (I can't safely import new content if there is old data in the table)
INSERT "control" rows;
LOAD DATA INFILE INTO TABLE;
GROUP/COUNT-1 the table content;
DELETE "control" rows; (So I can still work with the table content)
Do any other jobs;
But, I was looking for a cleaner way to reach the expected result. If possible, a single query, without a bunch of JOINs.
I would appreciate any suggestion or advice. Thank you very much!
EDIT
Thank you for the answers about creating a table to store all the types and JOINing it. It really solves the problem. My approach solves it too, but it also does so by storing the types, as yours did.
So, I have "another" question, just a clarification, based on the received answers and my desired scope... is it possible to reach the expected result with some MySQL command that will not CREATE a new table and/or INSERT these types?
I don't actually see any problem in solving my question by storing the types... I would just like to find a simplified command... something like a 'best practice'... some kind of filter... as if I could run:
GROUP BY data.type(0,1,2,3,4,5,6,7,8,9)
and it could return these filtered values.
I am really interested to learn such a command, if it really exists/is possible.
And again, thank you very much!
Let's assume that you have a types table with all the valid types:
SELECT t.type,
COUNT(data.type)
FROM types t left join data on data.type = t.type
GROUP BY t.type
order by t.type
You should include the explicit order by and not depend on the group by to produce results in a particular order.
The easiest way is to create a table of all type values and then join on that table when getting the count:
select t.type,
count(d.type)
from types t
left join data d
on t.type = d.type
group by t.type
See SQL Fiddle with demo
Or you can use the following:
select t.type,
count(d.type)
from
(
select 0 type
union all
select 1
union all
select 2
union all
select 3
union all
select 4
union all
select 5
union all
select 6
union all
select 7
union all
select 8
union all
select 9
) t
left join data d
on t.type = d.type
group by t.type
See SQL Fiddle with Demo
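The hand-written UNION ALL list can also be generated. The sketch below assumes a database with recursive CTEs: it runs on an in-memory SQLite database from Python, and MySQL accepts WITH RECURSIVE only from version 8.0 on, so it would not apply to older servers. The CTE name all_types and the helper count_by_type are hypothetical; no table is created for the types and nothing is inserted for them.

```python
import sqlite3


def count_by_type(types):
    """Count rows per type for types 0..9, reporting 0 for types that never occur."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (type INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?)", [(t,) for t in types])
    sql = """
    WITH RECURSIVE all_types(type) AS (
        SELECT 0
        UNION ALL
        SELECT type + 1 FROM all_types WHERE type < 9
    )
    SELECT t.type, COUNT(d.type)   -- COUNT(d.type) ignores the NULLs of unmatched types
    FROM all_types t
    LEFT JOIN data d ON d.type = t.type
    GROUP BY t.type
    ORDER BY t.type
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


counts = count_by_type([0, 1, 1, 3, 9])
for type_value, n in counts:
    print(type_value, n)
```

The LEFT JOIN must start from the generated list, not from data; otherwise the missing types drop out again.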
One option would be having a static numbers table with the values 0-9. Not sure if this is the most elegant approach, and if you were using SQL Server, I could think of another approach.
Try something like this:
SELECT
numbers.number,
COUNT(data.type)
FROM numbers
left join data
on numbers.number = data.type
GROUP BY numbers.number;
And the SQL Fiddle.
Okay... I think I found it! Thank you all!!! I'm accepting my own answer.
I agree with the @GordonLinoff comment that the best practice is to store the type values and describe them, so you can keep a concise/understandable database and queries.
But, as far as I've learned, if you have some data which might be an irrelevant information, it is preferable to treat it in some other way than storing it.
So, I developed this query:
SELECT
SUM(IF(data.type = 0, 1, 0)) AS `0`,
SUM(IF(data.type = 1, 1, 0)) AS `1`,
SUM(IF(data.type = 2, 1, 0)) AS `2`,
SUM(IF(data.type = 3, 1, 0)) AS `3`,
SUM(IF(data.type = 4, 1, 0)) AS `4`,
SUM(IF(data.type = 5, 1, 0)) AS `5`,
SUM(IF(data.type = 6, 1, 0)) AS `6`,
SUM(IF(data.type = 7, 1, 0)) AS `7`,
SUM(IF(data.type = 8, 1, 0)) AS `8`,
SUM(IF(data.type = 9, 1, 0)) AS `9`
FROM data;
It is not the fastest, most optimized, or prettiest query, but for the amount of data I'll handle (fewer than 100.000 rows per import) it does the GROUP/COUNT job "manually", running in 0.13 sec on an ordinary developer machine.
It differs from my expected result just in the way rows and columns are selected - instead of 10 rows with 2 columns I've got 1 row with 10 columns, labeled with the matching type. Also, as we have a standardization to the type value (and we'll not change it for sure) which gives it a name and description, I'm now able to use the type name as the column label, instead of joining to a table with the types info to select a third column in the result (which really, is not that important as it's an importation script based on some standards).
Thank you all so much for the help!

Insert multiple values and Insert value in parallel

I have a question about SQL in parallel queries. For example, suppose that I have this query:
INSERT INTO tblExample (num) VALUES (1), (2)
And this query:
INSERT INTO tblExample (num) VALUES (3)
The final table should look like this:
num
---
1
2
3
But I wonder whether those two queries could run in parallel so that the final table looks like this:
num
---
1
3
2
Does anyone know the answer?
Thanks in advance!
There is no inherent row order in SQL. You can sort your query results by adding an ORDER BY clause:
SELECT * FROM tblExample ORDER BY num
Or you could add a timestamp column to the table and order by that.
How your table "looks" depends on how you asked for it (in the SELECT statement). Without an ORDER BY clause, the order of your table is undefined:
ORDER BY is the only way to sort the rows in the result set. Without this clause, the relational database system may return the rows in any order. If an ordering is required, the ORDER BY must be provided in the SELECT statement sent by the application.
For example:
SELECT num FROM tblExample ORDER BY num ASC
1
2
3
SELECT num FROM tblExample ORDER BY num DESC
3
2
1
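If you want to order your rows manually, you can add a new column and sort on it (named sort_order here, because ORDER is a reserved word and would need quoting as a column name):
+-----+------------+
| num | sort_order |
+-----+------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 2 |
+-----+------------+
SELECT num FROM tblExample ORDER BY sort_order ASC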
1
3
2
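To make the point concrete, here is a small sketch using an in-memory SQLite database from Python. The auto-incrementing seq column is an assumption added for illustration; it records insertion order so that both orderings can be requested explicitly. The single-row insert is deliberately run first to show that the two orderings differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical column that records the order in which rows arrived
conn.execute(
    "CREATE TABLE tblExample (seq INTEGER PRIMARY KEY AUTOINCREMENT, num INTEGER)"
)
conn.execute("INSERT INTO tblExample (num) VALUES (3)")
conn.execute("INSERT INTO tblExample (num) VALUES (1), (2)")

# The order of the result is whatever the ORDER BY clause asks for
by_value = [r[0] for r in conn.execute("SELECT num FROM tblExample ORDER BY num")]
by_insertion = [r[0] for r in conn.execute("SELECT num FROM tblExample ORDER BY seq")]
conn.close()

print(by_value)
print(by_insertion)
```

Whether two statements ran one after the other or concurrently, the stored rows carry no order of their own; only the column named in ORDER BY decides what the reader sees.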