MySQL enumeration: @rownum, odd and even records

I asked a question about creating temporary/ virtual ids for query results,
mysql & php: temporary/ virtual ids for query results?
I nearly got what I wanted with this link,
http://craftycodeblog.com/2010/09/13/rownum-simulation-with-mysql/
I have managed to enumerate each row,
SELECT
u.pg_id AS ID,
u.pg_url AS URL,
u.pg_title AS Title,
u.pg_content_1 AS Content,
@rownum:=@rownum+1 AS rownum
FROM (
SELECT pg_id, pg_url,pg_title,pg_content_1
FROM root_pages
WHERE root_pages.parent_id = '7'
AND root_pages.pg_id != '7'
AND root_pages.pg_cat_id = '2'
AND root_pages.pg_hide != '1'
ORDER BY pg_created DESC
) u,
(SELECT @rownum:=0) r
result,
ID URL Title Content rownum
53 a x x 1
52 b x x 2
43 c x x 3
41 d x x 4
but how can I work on it a bit further - I want to display the odd or even records only like the ones below - is it possible?
odd records,
ID URL Title Content rownum
53 a x x 1
43 c x x 3
even records,
ID URL Title Content rownum
52 b x x 2
41 d x x 4
thank you.
p.s. I don't quite understand the sql query actually even though I almost got the answer, for instance, what do the 'u' and 't' mean?

what do the 'u' and 't' mean?
They are table aliases, so you don't have to write out the full table name every time you refer to it.
To get only the odd numbered records, use:
SELECT x.*
FROM (SELECT u.pg_id AS ID,
u.pg_url AS URL,
u.pg_title AS Title,
u.pg_content_1 AS Content,
@rownum := @rownum + 1 AS rownum
FROM root_pages u
JOIN (SELECT @rownum := 0) r
WHERE u.parent_id = '7'
AND u.pg_id != '7'
AND u.pg_cat_id = '2'
AND u.pg_hide != '1'
ORDER BY u.pg_created DESC) x
WHERE x.rownum % 2 != 0
To get the even numbered records, use:
SELECT x.*
FROM (SELECT u.pg_id AS ID,
u.pg_url AS URL,
u.pg_title AS Title,
u.pg_content_1 AS Content,
@rownum := @rownum + 1 AS rownum
FROM root_pages u
JOIN (SELECT @rownum := 0) r
WHERE u.parent_id = '7'
AND u.pg_id != '7'
AND u.pg_cat_id = '2'
AND u.pg_hide != '1'
ORDER BY u.pg_created DESC) x
WHERE x.rownum % 2 = 0
Explanation
The % is the modulus operator in MySQL syntax; it returns the remainder of the division. For example, 1 % 2 is 1, while 2 % 2 is 0. This is then used in the WHERE clause to filter the rows displayed.
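As a quick check of the remainder logic outside the database, here is a minimal Python sketch. The row list and the `split_by_parity` helper are made up for illustration and simply mirror the sample result from the question; for positive row numbers Python's `%` gives the same remainders as MySQL's.

```python
# Rows as (ID, rownum) pairs, mirroring the sample result in the question.
rows = [(53, 1), (52, 2), (43, 3), (41, 4)]


def split_by_parity(rows):
    """Return (odd_rows, even_rows) based on the remainder of rownum / 2."""
    odd = [r for r in rows if r[1] % 2 != 0]   # remainder 1 -> odd row number
    even = [r for r in rows if r[1] % 2 == 0]  # remainder 0 -> even row number
    return odd, even


odd_rows, even_rows = split_by_parity(rows)
print(odd_rows)
print(even_rows)
```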

Related

Fill in missing dates in date range from a table

table A
no date count
1 20160401 1
1 20160403 4
2 20160407 3
result
no date count
1 20160401 1
1 20160402 0
1 20160403 4
1 20160404 0
.
.
.
2 20160405 0
2 20160406 0
2 20160407 3
.
.
.
I'm using Oracle and I want to write a query that returns rows for every date within a range based on table A.
Is there some function in Oracle that can help me?
You can use a sequence.
First create a sequence:
CREATE SEQUENCE seq_name START WITH 20160401 MAXVALUE n;
where n is the maximum value you want to display.
Then use this SQL:
SELECT seq_name.NEXTVAL, CASE WHEN seq_name.NEXTVAL = date THEN count ELSE 0 END FROM tableA;
Note: it's better not to use date and count as column names.
Try this:
with
A as (
select 1 no, to_date('20160401', 'yyyymmdd') dat, 1 cnt from dual union all
select 1 no, to_date('20160403', 'yyyymmdd') dat, 4 cnt from dual union all
select 2 no, to_date('20160407', 'yyyymmdd') dat, 3 cnt from dual),
B as (select min(dat) mindat, max(dat) maxdat from A t),
C as (select level + mindat - 1 dat from B connect by level + mindat - 1 <= maxdat),
D as (select distinct no from A),
E as (select * from D,C)
select E.no, E.dat, nvl(cnt, 0) cnt
from E
full outer join A on A.no = E.no and A.dat = E.dat
order by 1, 2, 3
This isn't an Oracle-specific answer; you'll need to translate it to Oracle yourself.
Create an intervals table, containing all integers from 0 to 999. Something like this:
CREATE TABLE intervals (days int);
INSERT INTO intervals (days) VALUES (0), (1);
DECLARE @rc int;
SELECT @rc = 2;
WHILE (SELECT Count(*) FROM intervals) < 1000 BEGIN
INSERT INTO intervals (days) SELECT days + @rc FROM intervals WHERE days + @rc < 1000;
SELECT @rc = @rc * 2
END;
Then all the dates in the range can be identified by adding intervals.days to the first date you've got, where the first date + intervals.days is <= the end date, and the resultant date is new. Do this by cross joining intervals to your own table. Something like this (T-SQL style; again, you'll need to translate):
SELECT DateAdd(day, i.days, a.first_date)
FROM (select min(date) as first_date from table_A) a, intervals i
WHERE DateAdd(day, i.days, a.first_date) < (select max(date) from table_A)
AND NOT EXISTS (select 1 from table_A aa where aa.date = DateAdd(day, i.days, a.first_date))
Hope this gives you a starting point
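To make the idea concrete outside any particular database, here is a minimal Python sketch of the same approach: build every day offset from the first date to the last date (the role of the intervals table) and pair each offset with each `no`, defaulting the count to 0. The `table_a` list and the `fill_missing_dates` helper are hypothetical names for illustration, and the sketch assumes every `no` should span the overall date range, as in the expected result shown in the question.

```python
from datetime import date, timedelta

# Sample rows from table A as (no, date, count).
table_a = [
    (1, date(2016, 4, 1), 1),
    (1, date(2016, 4, 3), 4),
    (2, date(2016, 4, 7), 3),
]


def fill_missing_dates(rows):
    """Return one row per (no, date) over the whole date range; count is 0 when absent."""
    known = {(no, d): cnt for no, d, cnt in rows}
    first = min(d for _, d, _ in rows)
    last = max(d for _, d, _ in rows)
    # Equivalent of the intervals table: day offsets 0..N added to the first date.
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    ids = sorted({no for no, _, _ in rows})
    # Cross join of ids and days, with a lookup standing in for the outer join.
    return [(no, d, known.get((no, d), 0)) for no in ids for d in days]


filled = fill_missing_dates(table_a)
for row in filled:
    print(row)
```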

Find overlapping range in PL/SQL

Sample data below
id start end
a 1 3
a 5 6
a 8 9
b 2 4
b 6 7
b 9 10
c 2 4
c 6 7
c 9 10
I'm trying to come up with a query that will return every range (start and end inclusive) where a, b, and c all overlap, and that can be extended to more ids. So the expected data will look like the following
start end
2 3
6 6
9 9
The only way I can picture this is with a custom aggregate function that tracks the current valid intervals and then computes the new intervals during the iterate phase. However, I can't see this approach being practical when working with large datasets. So if some bright mind out there has a query or some built-in function that I'm not aware of, I would greatly appreciate the help.
You can do this using a self-join with greatest() and least(). Assuming no internal overlaps for "a" and "b":
select greatest(ta.start, tb.start) as start,
least(ta.end, tb.end) as end
from t ta join
t tb
on ta.start <= tb.end and ta.end >= tb.start and
ta.id = 'a' and tb.id = 'b';
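The join condition and the greatest/least pair can be checked with a small Python sketch. The `ranges` list copies the sample data for "a" and "b", and `pairwise_overlaps` is a hypothetical helper, not part of the answer; like the query, it only handles two ids at a time.

```python
# Sample ranges from the question as (id, start, end), both ends inclusive.
ranges = [
    ("a", 1, 3), ("a", 5, 6), ("a", 8, 9),
    ("b", 2, 4), ("b", 6, 7), ("b", 9, 10),
]


def pairwise_overlaps(ranges, left_id, right_id):
    """Return the overlapping parts of every left/right range pair."""
    overlaps = []
    for l_id, l_start, l_end in ranges:
        if l_id != left_id:
            continue
        for r_id, r_start, r_end in ranges:
            # Same test as the ON clause: the two ranges intersect.
            if r_id == right_id and l_start <= r_end and l_end >= r_start:
                # greatest(start, start) and least(end, end)
                overlaps.append((max(l_start, r_start), min(l_end, r_end)))
    return sorted(overlaps)


overlaps_ab = pairwise_overlaps(ranges, "a", "b")
print(overlaps_ab)
```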
This is a lot uglier and more complex than Gordon's solution, but I think it gives the expected answer better and should extend to work with more ids:
WITH NUMS(N) AS ( --GENERATE NUMBERS N FROM THE SMALLEST START VALUE TO THE LARGEST END VALUE
SELECT MIN("START") N FROM T
UNION ALL
SELECT N+1 FROM NUMS WHERE N < (SELECT MAX("END") FROM T)
),
SEQS(N,START_RANK,END_RANK) AS (
SELECT N,
CASE WHEN IS_START=1 THEN ROW_NUMBER() OVER (PARTITION BY IS_START ORDER BY N) ELSE 0 END START_RANK, --ASSIGN A RANK TO EACH RANGE START
CASE WHEN IS_END=1 THEN ROW_NUMBER() OVER (PARTITION BY IS_END ORDER BY N) ELSE 0 END END_RANK --ASSIGN A RANK TO EACH RANGE END
FROM (
SELECT N,
CASE WHEN NVL(LAG(N) OVER (ORDER BY N),N) + 1 <> N THEN 1 ELSE 0 END IS_START, --MARK N AS A RANGE START
CASE WHEN NVL(LEAD(N) OVER (ORDER BY N),N) -1 <> N THEN 1 ELSE 0 END IS_END /* MARK N AS A RANGE END */
FROM (
SELECT DISTINCT N FROM ( --GET THE SET OF NUMBERS N THAT ARE INCLUDED IN ALL ID RANGES
SELECT NUMS.*,T.*,COUNT(*) OVER (PARTITION BY N) N_CNT,COUNT(DISTINCT "ID") OVER () ID_CNT
FROM NUMS
JOIN T ON (NUMS.N >= T."START" AND NUMS.N <= T."END")
) WHERE N_CNT=ID_CNT
)
) WHERE IS_START + IS_END > 0
)
SELECT STARTS.N "START",ENDS.N "END" FROM SEQS STARTS
JOIN SEQS ENDS ON (STARTS.START_RANK=ENDS.END_RANK AND STARTS.N <= ENDS.N) ORDER BY "START"; --MATCH CORRESPONDING RANGE START/END VALUES
First we generate all the numbers between the smallest start value and the largest end value.
Then we find the numbers that are included in all the provided "id" ranges by joining our generated numbers to the ranges, and selecting each number "n" that appears once for each "id".
Then we determine whether each of these values "n" starts or ends a range. To determine that, for each N we say:
If the previous value of N does not exist or is not 1 less than current N, current N starts a range. If the next value of N does not exist or is not 1 greater than current N, current N ends a range.
Next, we assign a "rank" to each start and end value so we can match them up.
Finally, we self-join where the ranks match (and where the start <= the end) to get our result.
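The steps above can be sketched in Python to see the intermediate values. This is an illustration under assumptions, not a translation of the query: `common_ranges` is a made-up name, and it tests coverage with a set of ids rather than comparing row counts, so it does not rely on ranges within one id being non-overlapping.

```python
# Sample ranges from the question as (id, start, end), both ends inclusive.
ranges = [
    ("a", 1, 3), ("a", 5, 6), ("a", 8, 9),
    ("b", 2, 4), ("b", 6, 7), ("b", 9, 10),
    ("c", 2, 4), ("c", 6, 7), ("c", 9, 10),
]


def common_ranges(ranges):
    """Return the ranges of numbers covered by every id."""
    ids = {rid for rid, _, _ in ranges}
    lo = min(start for _, start, _ in ranges)
    hi = max(end for _, _, end in ranges)
    # Generate the numbers and keep those that fall inside a range of every id.
    covered = [
        n for n in range(lo, hi + 1)
        if {rid for rid, start, end in ranges if start <= n <= end} == ids
    ]
    covered_set = set(covered)
    # A number starts a range if n - 1 is missing, and ends one if n + 1 is missing.
    starts = [n for n in covered if n - 1 not in covered_set]
    ends = [n for n in covered if n + 1 not in covered_set]
    # The k-th start belongs with the k-th end (the "rank" match in the query).
    return list(zip(starts, ends))


result = common_ranges(ranges)
print(result)
```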
EDIT: After some searching, I came across this question which shows a better way to find the start/ends and refactored the query to:
WITH NUMS(N) AS ( --GENERATE NUMBERS N FROM THE SMALLEST START VALUE TO THE LARGEST END VALUE
SELECT MIN("START") N FROM T
UNION ALL
SELECT N+1 FROM NUMS WHERE N < (SELECT MAX("END") FROM T)
)
SELECT MIN(N) "START",MAX(N) "END" FROM (
SELECT N,ROW_NUMBER() OVER (ORDER BY N)-N GRP_ID
FROM (
SELECT DISTINCT N FROM ( --GET THE SET OF NUMBERS N THAT ARE INCLUDED IN ALL ID RANGES
SELECT NUMS.*,T.*,COUNT(*) OVER (PARTITION BY N) N_CNT,COUNT(DISTINCT "ID") OVER () ID_CNT
FROM NUMS
JOIN T ON (NUMS.N >= T."START" AND NUMS.N <= T."END")
) WHERE N_CNT=ID_CNT
)
)
GROUP BY GRP_ID ORDER BY "START";
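The refactored query relies on the fact that, within a run of consecutive numbers, the row number minus the value stays constant, so that difference can serve as a group key. A minimal Python sketch of that trick follows; `group_consecutive` is a hypothetical helper name, and the input list is the set of common numbers from the sample data.

```python
from itertools import groupby


def group_consecutive(numbers):
    """Collapse numbers into (start, end) runs using position - value as the key."""
    ordered = sorted(set(numbers))
    # Same idea as ROW_NUMBER() OVER (ORDER BY N) - N: constant inside each run.
    keyed = [(pos - n, n) for pos, n in enumerate(ordered, start=1)]
    runs = []
    for _, members in groupby(keyed, key=lambda pair: pair[0]):
        values = [n for _, n in members]
        runs.append((min(values), max(values)))
    return runs


islands = group_consecutive([2, 3, 6, 9])
print(islands)
```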

Select Random Numbers from a list

This is my query.
SELECT TOP 2 NUM
FROM QT_PIVOT
WHERE NUM BETWEEN 1 AND 45
ORDER BY NEWID()
I'm selecting 2 random numbers from a list, but I don't want these numbers to be consecutive.
Sometimes the result is
NUM
----
2
3
And I don't want this
Thanks , and sorry for my English u.u
Basically the same as the 2nd approach Gordon uses, except it does not use the lag function and therefore will work on SQL Server 2008.
WITH Data AS(
SELECT *, RowNum = ROW_NUMBER() OVER (ORDER BY NEWID())
FROM sys.objects AS O
),
r AS(
SELECT TOP 1 *, SkipRow = 0
FROM Data
WHERE Data.RowNum = 1
UNION ALL
SELECT d.*, SkipRow = CASE WHEN d.object_id BETWEEN r.object_id -2 AND r.object_id + 2 THEN 1 ELSE 0 END
FROM r
JOIN Data AS D
ON r.RowNum + 1 = D.RowNum
)
SELECT TOP 2 * FROM R
WHERE R.SkipRow = 0
One approach is to select the first number, and then select an appropriate second number:
WITH r AS (
SELECT TOP 1 num
FROM QT_PIVOT
WHERE NUM BETWEEN 1 AND 45
ORDER BY NEWId()
)
select num
from r
union all
select top 1 q.num
from qt_pivot q join
r
on q.num not in (r.num, r.num - 1, r.num + 1)
where q.num between 1 and 45
order by newid();
Another approach (if you had SQL Server 2012+) would use lag() to remove any possibilities that do not meet the conditions:
WITH r AS (
SELECT num, row_number() over (order by newid()) as seqnum
FROM QT_PIVOT
WHERE NUM BETWEEN 1 AND 45
)
SELECT r.num
FROM (SELECT r.*, LAG(num) OVER (ORDER BY seqnum) as prevnum
FROM r
) r
WHERE prevnum is null or
prevnum not in (num - 1, num + 1);
EDIT:
The first approach doesn't work, because SQL Server always re-evaluates CTEs, and there is not even a hint to fix this problem. Here is an alternative approach, that will ensure that values are not consecutive:
WITH r as (
SELECT (1 + checksum(newid()) * 45) as r1,
(2 + checksum(newid()) * 43) as r2
)
SELECT q.num
FROM QT_PIVOT q
WHERE q.num = r.r1 or
q.num = 1 + (r.r1 + r.r2) % 45;
This calculates two random numbers. The first is a random position. The second is an allowable offset (hence the "2" and "43") to guarantee that the numbers are not adjacent.
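The position-plus-offset idea can be sketched in Python. This is not the arithmetic from the query above: it is an independent illustration that treats the list 1..45 as circular and picks an offset between 2 and 43, which is slightly stricter than needed because it also keeps 1 and 45 apart. The names `LIST_SIZE`, `second_from`, and `pick_two` are made up for the example.

```python
import random

LIST_SIZE = 45  # numbers 1..45, as in the question


def second_from(first, offset):
    """Map a first pick and an offset in 2..43 to a second number, wrapping around."""
    return 1 + (first - 1 + offset) % LIST_SIZE


def pick_two(rng=random):
    """Pick two numbers in 1..45 that are neither equal nor adjacent."""
    first = rng.randint(1, LIST_SIZE)       # random position
    offset = rng.randint(2, LIST_SIZE - 2)  # allowable offset: 2..43
    return first, second_from(first, offset)


print(pick_two())
```

Because the offset is never 0, 1, 44, or 45, the second number can never equal the first or sit next to it.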

MYSQL query to get 'n' rows nearby given row

I have a MySQL table named 'videos', where one of the columns is 'cat' (INT) and 'id' is the PRIMARY KEY.
So, if 'x' is the row id and 'n' is the category id, I need to get the 15 nearby rows.
Case 1: There are many rows in the category before and after 'x'. Just get 7 rows before and 7 rows after 'x':
SELECT * FROM videos WHERE cat=n AND id<x ORDER BY id DESC LIMIT 0,7
SELECT * FROM videos WHERE cat=n AND id>x LIMIT 0,7
Case 2: If 'x' is near the beginning/end of the table, print all the rows (suppose 'y' rows) before/after 'x', and then print 15-y rows after/before 'x'.
Case 1 is not a problem, but I am stuck on Case 2. Is there a generic method to get 'p' rows near a row 'x'?
This query will always position N (exact id match) at the centre of the data, unless there are no more rows (in either direction), in which case rows will be added from the prior/next sections as required, while still preserving data from prior/next (as much as available).
set @n := 28;
SELECT * FROM
(
SELECT * FROM
(
(SELECT v.*, 0 as prox FROM videos v WHERE cat=1 AND id = @n)
union all
(SELECT v.*, @rn1:=@rn1+1 FROM (select @rn1:=0) x, videos v WHERE cat=1 AND id < @n ORDER BY id DESC LIMIT 15)
union all
(SELECT v.*, @rn2:=@rn2+1 FROM (select @rn2:=0) y, videos v WHERE cat=1 AND id > @n ORDER BY id LIMIT 15)
) z
ORDER BY prox
LIMIT 15
) w
order by id
For example, if you had 30 ids for cat=1, and you were looking at item #28, it will show items 16 through 30, #28 is the 3rd row from the bottom.
Some explanation:
SELECT v.*, 0 as prox FROM videos v WHERE cat=1 AND id = @n
v.* means to select all columns in the table/alias v. In this case, v is the alias for the table videos.
0 as prox means to create a column named prox, and it will contain just the value 0
The next query:
SELECT v.*, @rn1:=@rn1+1 FROM (select @rn1:=0) x, videos v WHERE cat=1 AND id < @n ORDER BY id DESC LIMIT 15
v.* - as above
@rn1:=@rn1+1 uses a variable to return a sequence number for each record in this subquery. It starts with 1 and for each record, following the ORDER BY id DESC, it will be numbered 2, then 3 etc.
(select @rn1:=0) x This creates a subquery aliased as x; all it does is initialize the variable @rn1 to 0, so that the first row is numbered 1.
The end result is that the variable and 0 as prox rank each row based on how close it is to the value @n. The clause order by prox limit 15 takes the 15 that are closest to N.
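The proximity ranking can be mimicked in a few lines of Python to check the example of 30 ids and item 28. This is a sketch under assumptions: `nearby` is a hypothetical helper, ids are plain integers, and ties at the cut-off are broken arbitrarily, just as the query does not define which of two equally close rows wins.

```python
def nearby(ids, target, limit=15):
    """Return up to `limit` ids closest in rank to `target`, sorted by id."""
    before = sorted((i for i in ids if i < target), reverse=True)[:limit]
    after = sorted(i for i in ids if i > target)[:limit]
    # The target itself gets proximity 0; neighbours are numbered 1, 2, 3, ...
    ranked = [(0, target)] if target in ids else []
    ranked += [(rank, i) for rank, i in enumerate(before, start=1)]
    ranked += [(rank, i) for rank, i in enumerate(after, start=1)]
    ranked.sort(key=lambda pair: pair[0])  # ORDER BY prox
    # LIMIT 15, then the outer ORDER BY id
    return sorted(i for _, i in ranked[:limit])


print(nearby(range(1, 31), 28))
```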

T-sql problem with running sum

I am trying to write a T-SQL script that will find "open" records in a table.
The structure of the data is as follows:
Id (int PK) Ts (datetime) Art_id (int) Amount (float)
1 '2009-01-01' 1 1
2 '2009-01-05' 1 -1
3 '2009-01-10' 1 1
4 '2009-01-11' 1 -1
5 '2009-01-13' 1 1
6 '2009-01-14' 1 1
7 '2009-01-15' 2 1
8 '2009-01-17' 2 -1
9 '2009-01-18' 2 1
For each article, I need to show only the records that come after the last point at which the running sum of Amount (ordered by date) was zero. So I am trying to extract records 5 and 6 for Art_id=1 and record 9 for Art_id=2. I am using MSSQL 2005, and my table has around 30K records with 6000 distinct values of Art_id.
In this solution I simply want to find all the rows where there isn't a subsequent row for that Art_id where the running sum was 0. I am assuming we can use the ID as a better tiebreaker than TS, since two rows can come in with the same timestamp but they will get sequential identity values.
;WITH base AS
(
SELECT
ID, Art_id, TS, Amount,
RunningSum = Amount + COALESCE
(
(
SELECT SUM(Amount)
FROM dbo.[table name]
WHERE Art_id = f.Art_id
AND ID < f.ID
)
, 0
)
FROM dbo.[table name] AS f
)
SELECT ID, Art_id, TS, Amount
FROM base AS b1
WHERE NOT EXISTS
(
SELECT 1
FROM base AS b2
WHERE Art_id = b1.Art_id
AND ID >= b1.ID
AND RunningSum = 0
)
ORDER BY ID;
Complete working query:
SELECT
*
FROM TABLE_NAME E
JOIN
(SELECT
C.ART_ID,
MAX(TS) MAX_TS
FROM
(SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A) C
WHERE C.ROW_SUM = 0
GROUP BY C.ART_ID) D
ON
(D.ART_ID = E.ART_ID) AND
(E.TS >= D.MAX_TS)
First we calculate, for every row, the sum of all earlier rows for the same article:
SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A
Then we look for the last row of each article where that sum is 0:
SELECT
C.ART_ID,
MAX(TS) MAX_TS
FROM
(SELECT
ART_ID,
TS,
COALESCE((SELECT SUM(AMOUNT) FROM TABLE_NAME B WHERE (B.Art_id = A.Art_id) AND (B.Ts < A.Ts)),0) ROW_SUM
FROM TABLE_NAME A) C
WHERE C.ROW_SUM = 0
GROUP BY C.ART_ID
You can find all rows where the running sum is zero with:
select cur.id, cur.art_id
from #articles cur
left join #articles prev
on prev.art_id = cur.art_id
and prev.id <= cur.id
group by cur.id, cur.art_id
having sum(prev.amount) = 0
Then you can query all rows that come after the rows with a zero running sum:
select a.*
from #articles a
left join (
select cur.id, cur.art_id, running = sum(prev.amount)
from #articles cur
left join #articles prev
on prev.art_id = cur.art_id
and prev.ts <= cur.ts
group by cur.id, cur.art_id
having sum(prev.amount) = 0
) later_zero_running on
a.art_id = later_zero_running.art_id
and a.id <= later_zero_running.id
where later_zero_running.id is null
The LEFT JOIN in combination with the WHERE says: there cannot be a row at or after this row where the running sum is zero.
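The rule "keep only rows with no zero running sum at or after them" can be verified against the sample data with a short Python sketch. The `rows` list and `open_records` helper are made up for illustration; rows are ordered by id, and the amounts are integers here, so the comparison with zero is exact (with float amounts a tolerance may be needed).

```python
# Sample data from the question as (id, art_id, amount); Ts is omitted
# because the rows are already in id order.
rows = [
    (1, 1, 1), (2, 1, -1), (3, 1, 1), (4, 1, -1), (5, 1, 1), (6, 1, 1),
    (7, 2, 1), (8, 2, -1), (9, 2, 1),
]


def open_records(rows):
    """Return ids of rows that come after the last zero running sum per article."""
    running = {}      # art_id -> running sum so far
    open_by_art = {}  # art_id -> ids seen since the last zero
    for row_id, art_id, amount in sorted(rows):
        running[art_id] = running.get(art_id, 0) + amount
        if running[art_id] == 0:
            # Everything up to and including this row is closed.
            open_by_art[art_id] = []
        else:
            open_by_art.setdefault(art_id, []).append(row_id)
    return sorted(i for ids in open_by_art.values() for i in ids)


open_ids = open_records(rows)
print(open_ids)
```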