repeating / duplicating query entries based on a table value - sql

Related to / copied from this PostgreSQL topic: so-link
Let's say I have a table with two rows
id | value |
----+-------+
1 | 2 |
2 | 3 |
I want to write a query that will duplicate (repeat) each row based on
the value. I want this result (5 rows total):
id | value |
----+-------+
1 | 2 |
1 | 2 |
2 | 3 |
2 | 3 |
2 | 3 |
How is this possible in SQL Anywhere (Sybase SQL)?

The easiest way to do this is to have a numbers table . . . one that generates integers. Perhaps you have one handy. There are other ways. For instance, using a recursive CTE:
with numbers as (
select 1 as n
union all
select n + 1
from numbers
where n < 100
)
select t.*
from yourtable t join
numbers n
on n.n <= value;
Not all versions of Sybase necessarily support recursive CTEs. There are other ways to generate such a table, or you might already have one handy.
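To see the numbers-table join in action without a Sybase install, here is a minimal sketch run against an in-memory SQLite database through Python's `sqlite3` module. SQLite is an assumption made only for the demo (it spells the CTE `WITH RECURSIVE`); the table name `yourtable` and the cap of 100 come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourtable (id INTEGER, value INTEGER);
    INSERT INTO yourtable VALUES (1, 2), (2, 3);
""")

# numbers(n) yields 1..100; joining on n <= value keeps exactly
# `value` copies of each source row.
repeated_rows = conn.execute("""
    WITH RECURSIVE numbers(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM numbers WHERE n < 100
    )
    SELECT t.id, t.value
    FROM yourtable t
    JOIN numbers ON numbers.n <= t.value
    ORDER BY t.id
""").fetchall()

for row in repeated_rows:
    print(row)
```

The cap must be at least as large as the biggest `value` in the table, otherwise rows are silently under-repeated.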

Related

Split a quantity into multiple rows with limit on quantity per row

I have a table of ids and quantities that looks like this:
dbo.Quantity
id | qty
-------
1 | 3
2 | 6
I would like to split the quantity column into multiple lines and number them, but with a set limit (which can be arbitrary) on the maximum quantity allowed for each row.
So for a limit of 2, the expected output should be:
dbo.DesiredResult
id | qty | bucket
---------------
1 | 2 | 1
1 | 1 | 2
2 | 1 | 2
2 | 2 | 3
2 | 2 | 4
2 | 1 | 5
In other words,
Running SELECT id, SUM(qty) as qty FROM dbo.DesiredResult GROUP BY id should return the original table (dbo.Quantity).
Running
SELECT MIN(id) as id, SUM(qty) as qty, bucket FROM dbo.DesiredResult GROUP BY bucket
should give you this table.
id | qty | bucket
------------------
1 | 2 | 1
1 | 2 | 2
2 | 2 | 3
2 | 2 | 4
2 | 1 | 5
I feel I can do this imperatively with cursors, looping over each row and keeping a counter that increments and resets as the "max" for each bucket is filled. But this is very "anti-SQL"; I feel there is a better way around this.
One approach is a recursive CTE, which emulates a cursor going through the rows sequentially.
Another approach that comes to mind is to represent your data as intervals and intersections of intervals.
Represent this:
id | qty
-------
1 | 3
2 | 6
as intervals [0;3), [3;9) with ids being their labels
0123456789
|--|-----|
1 2 - id
It is easy to generate this set of intervals using running total SUM() OVER().
Represent your buckets also as intervals [0;2), [2;4), [4;6), etc. with their own labels
0123456789
|-|-|-|-|-|
1 2 3 4 5 - bucket
It is easy to generate this set of intervals using a table of numbers.
Intersect these two sets of intervals preserving information about their labels.
Working with sets should be possible in a set-based SQL query, rather than a sequential cursor or recursion.
It is a bit too much for me to write down the actual query right now. But it is quite possible that ideas similar to those discussed in Packing Intervals by Itzik Ben-Gan may be useful here.
Actually, once you have your quantities represented as intervals you can generate required number of rows/buckets on the fly from the table of numbers using CROSS APPLY.
Imagine we transformed your Quantity table into Intervals:
Start | End | ID
0 | 3 | 1
3 | 9 | 2
And we also have a table of numbers - a table Numbers with column Number with values from 0 to, say, 100K.
For each Start and End of the interval we can calculate the corresponding bucket number by dividing the value by the bucket size and rounding down or up.
Something along these lines:
SELECT
Intervals.ID
,A.qty
,A.Bucket
FROM
Intervals
CROSS APPLY
(
SELECT
Numbers.Number + 1 AS Bucket
,@BucketSize AS qty
-- it is equal to @BucketSize if the bucket is completely within the Start and End boundaries
-- it should be adjusted for the first and last buckets of the interval
FROM Numbers
WHERE
Numbers.Number >= Start / @BucketSize
AND Numbers.Number < End / @BucketSize + 1
) AS A
;
You'll need to check the formulas for off-by-one (±1) errors and adjust them.
You'll also need some CASE WHEN logic to calculate the correct qty for the buckets that happen to be on the lower and upper boundary of the interval.
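The interval idea above can be sketched end to end. This is not the answerer's final query: it is a minimal version run in SQLite through Python's `sqlite3`, so `CROSS APPLY` is replaced by a plain join on overlapping intervals, and SQLite's two-argument `MIN`/`MAX` stand in for the CASE WHEN boundary logic. The names `start_pos`, `end_pos`, and the `:size` parameter are made up for the example, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Quantity (id INTEGER, qty INTEGER);
    INSERT INTO Quantity VALUES (1, 3), (2, 6);
""")

split_sql = """
    WITH RECURSIVE
    intervals AS (
        -- running total turns each qty into a half-open interval [start, end)
        SELECT id,
               SUM(qty) OVER (ORDER BY id) - qty AS start_pos,
               SUM(qty) OVER (ORDER BY id)       AS end_pos
        FROM Quantity
    ),
    numbers(n) AS (
        -- bucket n covers [n * size, (n + 1) * size)
        SELECT 0
        UNION ALL
        SELECT n + 1 FROM numbers WHERE n < 100
    )
    SELECT i.id,
           -- length of the overlap between the quantity interval and the bucket
           MIN(i.end_pos, (n.n + 1) * :size) - MAX(i.start_pos, n.n * :size) AS qty,
           n.n + 1 AS bucket
    FROM intervals i
    JOIN numbers n
      ON n.n * :size < i.end_pos
     AND (n.n + 1) * :size > i.start_pos
    ORDER BY bucket, i.id
"""

bucket_rows = conn.execute(split_sql, {"size": 2}).fetchall()
for row in bucket_rows:
    print(row)
```

Writing the join as a strict overlap test (`bucket start < interval end` and `bucket end > interval start`) sidesteps the ±1 question, because a bucket that only touches an interval at its boundary is never matched.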
Use a recursive CTE:
with cte as (
select id, 1 as n, qty
from t
union all
select id, n + 1, qty
from cte
where n < qty
)
select id, n
from cte;
Here is a db<>fiddle.

How do I Transform / Pivot in Access SQL but without aggregating?

Firstly, thank you to anyone that can help, I hope this is a simple question for those in the know.
I have Data which is of the form:
LeaseID | ChargeID
1 | 1
1 | 2
2 | 3
3 | 4
3 | 5
3 | 6
i.e. LeaseID 1 has 2 ChargeIDs
How can I query this in Access SQL so that the data will be reflected as
LeaseID | ChargeID | ChargeID | ChargeID
1 | 1 | 2
2 | 3
3 | 4 | 5 | 6
I know I am limited to 255 columns but this is not a problem as there will never be 255 but the number of columns should increase with the maximum number of ChargeIDs on a given lease.
I believe it is something to do with Transform / Pivot but have been unable to get it working. I keep getting the "too many crosstabs" error.
Thanks,
Consider a two-step process involving a staging table:
Make-Table Query (using correlated subquery with slow performance on very large tables)
SELECT t.LeaseID, t.ChargeID, 'ChargeID' & (SELECT count(*) FROM myTable sub
WHERE sub.LeaseID = t.LeaseID
AND sub.ChargeID <= t.ChargeID) As Rank
INTO myStagingTable
FROM myTable t;
Cross-Tab Query
TRANSFORM MAX(s.ChargeID) As MaxChargeID
SELECT s.LeaseID
FROM myStagingTable s
GROUP BY s.LeaseID
PIVOT s.[Rank]
-- LeaseID ChargeID1 ChargeID2 ChargeID3
-- 1 1 2
-- 2 3
-- 3 4 5 6
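For readers without Access, the same two steps (rank each ChargeID within its lease, then spread the ranks across columns) can be tried in SQLite through Python's `sqlite3`. SQLite has no TRANSFORM / PIVOT, so this sketch swaps in conditional aggregation with a fixed set of three columns; unlike the cross-tab, the column count here does not grow automatically. The alias `rnk` is just a name chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (LeaseID INTEGER, ChargeID INTEGER);
    INSERT INTO myTable VALUES (1, 1), (1, 2), (2, 3), (3, 4), (3, 5), (3, 6);
""")

pivot_rows = conn.execute("""
    WITH ranked AS (
        -- same correlated-subquery rank as the make-table query
        SELECT t.LeaseID, t.ChargeID,
               (SELECT COUNT(*) FROM myTable sub
                WHERE sub.LeaseID = t.LeaseID
                  AND sub.ChargeID <= t.ChargeID) AS rnk
        FROM myTable t
    )
    SELECT LeaseID,
           MAX(CASE WHEN rnk = 1 THEN ChargeID END) AS ChargeID1,
           MAX(CASE WHEN rnk = 2 THEN ChargeID END) AS ChargeID2,
           MAX(CASE WHEN rnk = 3 THEN ChargeID END) AS ChargeID3
    FROM ranked
    GROUP BY LeaseID
    ORDER BY LeaseID
""").fetchall()

for row in pivot_rows:
    print(row)
```

MAX is only there because a GROUP BY needs an aggregate; each (LeaseID, rank) pair holds exactly one ChargeID, so nothing is really being aggregated away.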

Finding the difference between two sets of data from the same table

My data looks like:
run | line | checksum | group
-----------------------------
1 | 3 | 123 | 1
1 | 7 | 123 | 1
1 | 4 | 123 | 2
1 | 5 | 124 | 2
2 | 3 | 123 | 1
2 | 7 | 123 | 1
2 | 4 | 124 | 2
2 | 4 | 124 | 2
and I need a query that returns me the new entries in run 2
run | line | checksum | group
-----------------------------
2 | 4 | 124 | 2
2 | 4 | 124 | 2
I tried several things, but I never got to a satisfying answer.
In this case I'm using H2, but of course I'm interested in a general explanation that would help me to wrap my head around the concept.
EDIT:
OK, it's my first post here so please forgive if I didn't state the question precisely enough.
Basically, given two run values (r1, r2, with r2 > r1), I want to determine which rows having run = r2 have a different line, checksum or group from any row where run = r1.
select * from yourtable
where run = 2 and checksum = (select max(checksum)
from yourtable)
Assuming your last run has a higher run value than the others, the SQL below will help:
select * from table1 t1
where t1.run in
(select max(t2.run) from table1 t2)
Update:
Above SQL may not give you the right rows because your requirement is not so clear. But the overall idea is to fetch the rows based on the latest run parameters.
SELECT line, checksum, group
FROM TableX
WHERE run = 2
EXCEPT
SELECT line, checksum, group
FROM TableX
WHERE run = 1
or (with slightly different result):
SELECT *
FROM TableX x
WHERE run = 2
AND NOT EXISTS
( SELECT *
FROM TableX x2
WHERE run = 1
AND x2.line = x.line
AND x2.checksum = x.checksum
AND x2.group = x.group
)
A slightly different approach:
select min(run) run, line, checksum, group
from mytable
where run in (1,2)
group by line, checksum, group
having count(*)=1 and min(run)=2
Incidentally, I assume that the "group" column in your table isn't actually called group - this is a reserved word in SQL and would need to be enclosed in double quotes (or backticks or square brackets, depending on which RDBMS you are using).
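The three set-difference queries above do not return the same thing on the sample data, which is easy to check. Below is a minimal sketch using SQLite through Python's `sqlite3` (the question uses H2, so SQLite is an assumption for the demo); the column is kept as `"group"` in double quotes, as the last answer suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableX (run INTEGER, line INTEGER, checksum INTEGER, "group" INTEGER);
    INSERT INTO TableX VALUES
        (1, 3, 123, 1), (1, 7, 123, 1), (1, 4, 123, 2), (1, 5, 124, 2),
        (2, 3, 123, 1), (2, 7, 123, 1), (2, 4, 124, 2), (2, 4, 124, 2);
""")

# EXCEPT is a set operation, so duplicate rows in run 2 collapse to one.
except_rows = conn.execute("""
    SELECT line, checksum, "group" FROM TableX WHERE run = 2
    EXCEPT
    SELECT line, checksum, "group" FROM TableX WHERE run = 1
""").fetchall()

# NOT EXISTS filters row by row, so both duplicates survive.
not_exists_rows = conn.execute("""
    SELECT * FROM TableX x
    WHERE run = 2
      AND NOT EXISTS (
          SELECT * FROM TableX x2
          WHERE x2.run = 1
            AND x2.line = x.line
            AND x2.checksum = x.checksum
            AND x2."group" = x."group")
""").fetchall()

# HAVING COUNT(*) = 1 rejects the new row here because it occurs twice in run 2.
group_by_rows = conn.execute("""
    SELECT MIN(run) AS run, line, checksum, "group"
    FROM TableX
    WHERE run IN (1, 2)
    GROUP BY line, checksum, "group"
    HAVING COUNT(*) = 1 AND MIN(run) = 2
""").fetchall()

print(except_rows)
print(not_exists_rows)
print(group_by_rows)
```

On this data only the NOT EXISTS form reproduces the two-row result asked for in the question; the GROUP BY form assumes that a (line, checksum, group) combination appears at most once per run.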

Finding the row with most common attribute using SQL

I have the following table in my database:
user_id | p1 | p2 | p3
1 | x | y | z
2 | x | x | x
3 | y | y | z
I need to find the row(s) whose own columns share the most values with one another.
i.e., the first row has no repeated value, the second contains three equal values and the third one contains two equal values.
Then, the output in this case should be
user_id | p1 | p2 | p3
2 | x | x | x
Any ideas?
(It would be nice if the solution did not require a vendor-specific feature, but anything will help).
For a non-vendor-specific solution you could do
SELECT *
FROM YourTable
ORDER BY
CASE WHEN p1=p2 THEN 1 ELSE 0 END +
CASE WHEN p1=p3 THEN 1 ELSE 0 END +
CASE WHEN p2=p3 THEN 1 ELSE 0 END DESC
And then LIMIT, TOP, ROW_NUMBER or whatever, dependent upon RDBMS, to just get the top row.
But if you have a specific RDBMS in mind there may be other ways that are more maintainable for larger number of columns (e.g. for SQL Server 2008)
SELECT TOP 1 *
FROM YourTable
ORDER BY
(SELECT COUNT (DISTINCT p) FROM (VALUES(p1),(p2),(p3)) T(p))
Also how do you want ties handled?
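Here is a minimal sketch of the pairwise-CASE scoring on the sample rows, run in SQLite through Python's `sqlite3`. To address the question about ties, it keeps every row whose score equals the maximum instead of using LIMIT / TOP; that tie policy and the column alias `matches` are assumptions of the example, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (user_id INTEGER, p1 TEXT, p2 TEXT, p3 TEXT);
    INSERT INTO YourTable VALUES
        (1, 'x', 'y', 'z'),
        (2, 'x', 'x', 'x'),
        (3, 'y', 'y', 'z');
""")

scored_sql = """
    WITH scored AS (
        -- one point for every pair of equal columns in the row
        SELECT user_id, p1, p2, p3,
               CASE WHEN p1 = p2 THEN 1 ELSE 0 END +
               CASE WHEN p1 = p3 THEN 1 ELSE 0 END +
               CASE WHEN p2 = p3 THEN 1 ELSE 0 END AS matches
        FROM YourTable
    )
"""

all_scores = conn.execute(
    scored_sql + "SELECT user_id, matches FROM scored ORDER BY user_id"
).fetchall()

# Keep every row that ties for the highest score.
best_rows = conn.execute(
    scored_sql + """
    SELECT user_id, p1, p2, p3
    FROM scored
    WHERE matches = (SELECT MAX(matches) FROM scored)
    ORDER BY user_id
    """
).fetchall()

print(all_scores)
print(best_rows)
```

Note that the score counts equal pairs, so three equal columns score 3 and two equal columns score 1; that is enough for ranking with three columns, but the number of CASE terms grows quadratically as columns are added.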

How to sort sql result using a pre defined series of rows

I have a table like this one:
--------------------------------
id | name
--------------------------------
1 | aa
2 | aa
3 | aa
4 | aa
5 | bb
6 | bb
... one million more ...
and I would like to obtain an arbitrary number of rows in a predefined sequence, with the other rows ordered by their name. E.g. in another table I have a short sequence with 3 ids:
sequ_no | id | pos
-----------------------
1 | 3 | 0
1 | 1 | 1
1 | 2 | 2
2 | 65535 | 0
2 | 45 | 1
... one million more ...
Sequence 1 defines the following series of ids: [3, 1, 2]. How do I obtain the three rows of the first table in this order, and the rest of the rows ordered by their name asc?
How would this be done in PostgreSQL and in MySQL? What would a solution look like in HQL (Hibernate Query Language)?
One idea I have is to first query and sort the rows which are defined in the sequence and then concatenate the other rows which are not in the sequence. But this involves two queries; can it be done with one?
Update: The final result for the sample sequence [ 3, 1, 2](as defined above) should look like this:
id | name
----------------------------------
3 | aa
1 | aa
2 | aa
4 | aa
5 | bb
6 | bb
... one million more ...
I need this query to create pagination through a product table where part of the sequence of products is a defined sequence and the rest of the products will be ordered by a clause I don't know yet.
I'm not sure I understand the exact requirement, but won't this work:
SELECT ids.id, ids.name
FROM ids_table ids LEFT OUTER JOIN sequences_table seq
ON ids.id = seq.id
ORDER BY seq.sequ_no, seq.pos, ids.name, ids.id
One way: assign a position (e.g. 0) to each id that doesn't have a position yet, UNION the result with the second table, join the result with the first table, and ORDER BY sequ_no, pos, name.
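A single-query variant of the LEFT JOIN idea can be sketched as follows, run in SQLite through Python's `sqlite3`. It restricts the join to one sequence and sorts unmatched rows after the sequenced ones with an explicit CASE, because NULL ordering differs between PostgreSQL and MySQL. The table names `ids_table` and `sequences_table` come from the answer above; the `:sequ_no` parameter is an assumption of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ids_table (id INTEGER, name TEXT);
    INSERT INTO ids_table VALUES
        (1, 'aa'), (2, 'aa'), (3, 'aa'), (4, 'aa'), (5, 'bb'), (6, 'bb');

    CREATE TABLE sequences_table (sequ_no INTEGER, id INTEGER, pos INTEGER);
    INSERT INTO sequences_table VALUES
        (1, 3, 0), (1, 1, 1), (1, 2, 2), (2, 65535, 0), (2, 45, 1);
""")

ordered_rows = conn.execute("""
    SELECT ids.id, ids.name
    FROM ids_table ids
    LEFT OUTER JOIN sequences_table seq
      ON seq.id = ids.id
     AND seq.sequ_no = :sequ_no      -- only the chosen sequence takes part
    ORDER BY
        CASE WHEN seq.pos IS NULL THEN 1 ELSE 0 END,  -- sequenced rows first
        seq.pos,                                      -- in their defined order
        ids.name,                                     -- then everything else by name
        ids.id
""", {"sequ_no": 1}).fetchall()

for row in ordered_rows:
    print(row)
```

Putting the `sequ_no` filter in the ON clause rather than in WHERE matters: a WHERE filter would discard the rows that are not part of the sequence, turning the outer join back into an inner join.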