If I have the following tables and I perform R1/R2 in relational algebra, would the result be a table with A values 1 and 3? I am a bit confused: I know 3 would be in the result because it is paired with both 5 and 1, but 1 has additional B values besides the matching ones, so would it also be included, and why?
R1 R2
+---+---+ +---+
| A | B | | B |
|---|---| |---|
| 1 | 1 | | 5 |
| 1 | 2 | | 1 |
| 1 | 3 | +---+
| 1 | 4 |
| 2 | 3 |
| 2 | 4 |
| 3 | 5 |
| 3 | 1 |
| 1 | 5 |
| 5 | 7 |
| 5 | 8 |
+---+---+
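In relational algebra, DIVIDE is defined as:
R1(Y,X) DIVIDE R2(X) = R1[Y] MINUS ((R1[Y] TIMES R2) MINUS R1)[Y]
Remember that R1[Y] is another way of writing "PROJECT R1 over Y".
Division only requires that an A value be paired with every B value in R2; extra B values in R1 do not disqualify it. A = 1 is paired with both 5 and 1 (among others), and so is A = 3, so the result is {1,3}.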
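To make the formula concrete, here is a minimal sketch that evaluates it step by step with Python sets on the tables from the question. The function name `divide` and the intermediate variable names are only illustrative; each line is annotated with the part of the formula it stands for.

```python
# Relational division via the identity quoted above, using Python sets.
R1 = {(1, 1), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4),
      (3, 5), (3, 1), (1, 5), (5, 7), (5, 8)}  # (A, B) pairs
R2 = {5, 1}                                    # B values


def divide(r1, r2):
    """Return the A values that r1 pairs with every B value in r2."""
    candidates = {a for a, _ in r1}                      # R1[Y]
    required = {(a, b) for a in candidates for b in r2}  # R1[Y] TIMES R2
    missing = required - r1                              # (...) MINUS R1
    disqualified = {a for a, _ in missing}               # (...)[Y]
    return candidates - disqualified                     # R1[Y] MINUS (...)


print(sorted(divide(R1, R2)))  # [1, 3]
```

A = 2 and A = 5 drop out because at least one of the required pairs (A, 5) and (A, 1) is missing for them; the extra rows for A = 1 never enter the calculation.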
Related
I have the table shown below, with only one column. What I want to achieve is to separate all rows that have no gap in x, for example the numbers 1-3, 5-6 and 8-9 (because the gaps are 4 and 7).
+---+
| x |
+---+
| 1 |
| 2 |
| 3 |
| 5 |
| 6 |
| 8 |
| 9 |
+---+
I would like to make it look like this: a table with two columns (a and b), indicating the ranges where there are no gaps in the previous column x. For every gap a new record is inserted. How would I go about it in PostgreSQL?
+---+---+
| a | b |
+---+---+
| 1 | 3 |
| 5 | 6 |
| 8 | 9 |
+---+---+
You can compare the sequence with gaps to a sequence without gaps:
select min(x), max(x)
from
 (
   select x,
      x - row_number() over (order by x) as dummy
   from tab
 ) as dt
group by dummy
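+---+------------+----------------+
| x | row_number | x - row_number |
+---+------------+----------------+
| 1 |          1 |              0 |  -- same value for consecutive values without gaps
| 2 |          2 |              0 |
| 3 |          3 |              0 |
| 5 |          4 |              1 |
| 6 |          5 |              1 |
| 8 |          6 |              2 |
| 9 |          7 |              2 |
+---+------------+----------------+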
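The same idea can be sketched outside the database. The snippet below is an illustration only, assuming the x values are distinct integers (as in the sample); `islands` is a made-up helper name, and `enumerate(..., start=1)` stands in for `row_number() over (order by x)`.

```python
from itertools import groupby

xs = [1, 2, 3, 5, 6, 8, 9]


def islands(values):
    """Return (min, max) for each run of consecutive integers."""
    ordered = sorted(values)
    # Pair each value with its 1-based position, like row_number().
    numbered = enumerate(ordered, start=1)
    ranges = []
    # x - position is constant inside a gap-free run, so group on it.
    for _, group in groupby(numbered, key=lambda pair: pair[1] - pair[0]):
        run = [x for _, x in group]
        ranges.append((run[0], run[-1]))
    return ranges


print(islands(xs))  # [(1, 3), (5, 6), (8, 9)]
```

Each gap shifts the difference between the value and its position by the size of the gap, which is why the difference works as a group key.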
I want to store some number sequences in my database. So I store them like this:
+-----+---------+-----+
| idx | seq_id | x |
+-----+---------+-----+
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 2 |
| 4 | 1 | 3 |
| 5 | 1 | 5 |
| 6 | 1 | 7 |
| 1 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 4 |
| 4 | 2 | 8 |
| 5 | 2 | 16 |
| ... |
+-----+---------+-----+
but when I look at it, it feels like I'm storing more overhead with idx and seq_id than meaningful information.
In some sense I am, but I wouldn't find it strange if the database engine optimized away most of the repetition here. Is this the case for SQLite, MySQL, PostgreSQL...?
And what can I do, perhaps in terms of table definition, to help the database optimize this storage pattern?
I have two tables, A and B, and a join table M. For each A.Id, I want to get the top 2 B.Id values, sorted on the Value column in table M, producing the results below. This is running on an Azure SQL database.
Table A Table M Table B
+-----+ +-----+-----+-------+ +-----+
| Id | | AId | BId | Value | | Id |
+-----+ +-----+-----+-------+ +-----+
| 1 | | 1 | 3 | 4 | | 1 |
| 2 | | 1 | 2 | 3 | | 2 |
| 3 | | 3 | 2 | 3 | | 3 |
| 4 | | 3 | 5 | 6 | | 4 |
+-----+ | 3 | 3 | 4 | | 5 |
| 4 | 1 | 2 | +-----+
| 4 | 2 | 1 |
| 4 | 4 | 3 |
+-----+-----+-------+
Result
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1 | 3 | 4 |
| 1 | 2 | 3 |
| 3 | 5 | 6 |
| 3 | 3 | 4 |
| 4 | 1 | 2 |
| 4 | 4 | 3 |
+-----+-----+-------+
I know that I can select all the M.AId rows where they equal 1, sort them, and limit to 2, but I need to do this for every row in Table A. I've made an attempt to use GROUP BY, but I wasn't sure how to sort and limit within each group. I've also searched for resources on this issue but couldn't find any.
(I also wasn't sure how to word the title for this issue)
You can just use ROW_NUMBER:
SELECT
    AId, BId, Value
FROM (
    SELECT *,
        Rn = ROW_NUMBER() OVER(PARTITION BY AId ORDER BY Value DESC)
    FROM M
) t
WHERE Rn <= 2
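To see what the window function does to the sample data, here is a small sketch in plain Python that mimics the partition, the descending sort, and the `Rn <= 2` filter. It is an illustration, not a replacement for the query; `top_n_per_group` is a hypothetical helper name and the rows are copied from Table M in the question.

```python
from collections import defaultdict

# (AId, BId, Value) rows from Table M in the question.
M = [(1, 3, 4), (1, 2, 3), (3, 2, 3), (3, 5, 6),
     (3, 3, 4), (4, 1, 2), (4, 2, 1), (4, 4, 3)]


def top_n_per_group(rows, n=2):
    """Keep the n rows with the highest Value for each AId."""
    by_aid = defaultdict(list)           # PARTITION BY AId
    for row in rows:
        by_aid[row[0]].append(row)
    result = []
    for aid in sorted(by_aid):
        ranked = sorted(by_aid[aid], key=lambda r: r[2], reverse=True)  # ORDER BY Value DESC
        result.extend(ranked[:n])        # WHERE Rn <= n
    return result


for row in top_n_per_group(M):
    print(row)
```

A.Id 2 has no rows in M, so it does not appear in the output, just as in the expected result. Within AId 4 the rows come out highest Value first, so they are the same two rows as in the question but in a different order.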
I have this table that is already sorted but I want it to only display the maximum values... so instead of this table:
+------+-------+
| id | value |
+------+-------+
| 1 | 3 |
| 5 | 3 |
| 4 | 3 |
| 9 | 2 |
| 8 | 2 |
| 3 | 2 |
| 2 | 1 |
| 6 | 1 |
| 7 | 1 |
+------+-------+
I want this:
+------+-------+
| id | value |
+------+-------+
| 1 | 3 |
| 5 | 3 |
| 4 | 3 |
+------+-------+
I'm using SQLite. Thanks for any help.
You can do this using a subquery. Here is one way:
select t.*
from t
where t.value = (select max(value) from t);
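A quick way to try this is an in-memory SQLite database driven from Python's standard `sqlite3` module. This is only a sketch: the table name `t` follows the query above, the data is the sample from the question, and an `order by t.id` is added here (it is not part of the answer's query) so that the printed order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, value integer)")
conn.executemany(
    "insert into t (id, value) values (?, ?)",
    [(1, 3), (5, 3), (4, 3), (9, 2), (8, 2), (3, 2), (2, 1), (6, 1), (7, 1)],
)

# Same filter as above; the ORDER BY is an addition for stable output.
rows = conn.execute(
    """
    select t.*
    from t
    where t.value = (select max(value) from t)
    order by t.id
    """
).fetchall()

print(rows)  # [(1, 3), (4, 3), (5, 3)]
conn.close()
```

The scalar subquery is evaluated against the whole table, so every row that ties for the maximum is returned, not just one.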
I have a table with X values and Y values, both INT. What I want to do is group on the X value with the condition that it contains a distinct combination of Y values. I also want to see the total number of each combination.
I tried using SUM ( POWER (2, Y)), but that generates numbers that are too big as Y can get up to about 300 in some cases.
+--------------+--------------+
| X | Y |
+--------------+--------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 4 |
| 1 | 6 |
| 2 | 1 |
| 2 | 2 |
| 2 | 4 |
| 2 | 6 |
| 3 | 2 |
| 3 | 3 |
| 3 | 5 |
| 4 | 2 |
| 4 | 3 |
| 4 | 5 |
| 5 | 2 |
| 5 | 3 |
| 5 | 6 |
+--------------+--------------+
I want the result to look something like:
+--------------+--------------+
| X | COUNT |
+--------------+--------------+
| 1 | 2 |
| 3 | 2 |
| 5 | 1 |
+--------------+--------------+
Based on your description (but not on your sample data), the following query should do:
select X, count(distinct Y)
from TBL
group by X
Thanks for trying to help. I realize that it might have been hard to understand what I was trying to do.
Anyway, I ended up solving it with the checksum_agg aggregate function.
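For comparison, the grouping the question asks for can be sketched in plain Python by using the exact set of Y values as the group key. This is not the checksum approach mentioned above; it swaps in a `frozenset` key, and `count_combinations` is a hypothetical helper name. Reporting the smallest X of each group is an assumption read off the expected output in the question.

```python
from collections import defaultdict

# (X, Y) rows from the question.
rows = [(1, 1), (1, 2), (1, 4), (1, 6),
        (2, 1), (2, 2), (2, 4), (2, 6),
        (3, 2), (3, 3), (3, 5),
        (4, 2), (4, 3), (4, 5),
        (5, 2), (5, 3), (5, 6)]


def count_combinations(pairs):
    """Group X values that share the same set of Y values."""
    ys_by_x = defaultdict(set)
    for x, y in pairs:
        ys_by_x[x].add(y)
    xs_by_combo = defaultdict(list)
    for x, ys in ys_by_x.items():
        xs_by_combo[frozenset(ys)].append(x)  # the set of Y values is the key
    # Report the smallest X of each group with the group size.
    return sorted((min(xs), len(xs)) for xs in xs_by_combo.values())


print(count_combinations(rows))  # [(1, 2), (3, 2), (5, 1)]
```

Because the key is the full set of Y values rather than a number derived from it, two different combinations can never land in the same group, and the size of Y does not matter the way it does with SUM(POWER(2, Y)).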