Enrich table with data from other table - sql

Currently I have 2 tables: one with client data (CLIENT_TABLE) and one with gift card information (GIFTCARD_TABLE).
The GIFTCARD_TABLE consists of 100 rows and it has 2 columns: Card_Number and Pin_Code.
Now I need to enrich the CLIENT_TABLE (35 rows) with the 2 columns from the GIFTCARD_TABLE, so every client needs one card_number with its corresponding pin_code and it doesn't matter which one (just don't use the same card number & pin_code twice).
Since these tables don't have any keys which I can use, I don't know how I can do this.
Any suggestions how I can tackle this?
Kind regards

If you want to assign the cards truly at random, you need:
select *
from
 ( -- random row_numbers
   select dt.*,
      row_number() over (order by rnd) as rn
   from
    ( -- 35 random clients
      select t.*, random(1,1000000000) as rnd
      from CLIENT_TABLE as t
      sample randomized allocation 35
    ) as dt
 ) as client
join
 ( -- random row_numbers
   select dt.*,
      row_number() over (order by rnd) as rn
   from
    (
      select t.*, random(1,1000000000) as rnd
      from GIFTCARD_TABLE as t
    ) as dt
 ) as card
on client.rn = card.rn
RANDOM can't be used directly in ROW_NUMBER, which is why the random value is computed in an inner derived table first.
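As a rough, dialect-neutral illustration of the same idea, the sketch below pairs random row numbers in SQLite through Python's sqlite3 module. It assumes a SQLite build with window functions (3.25 or later), where random() may appear directly in the window's ORDER BY; the client_name column and the sample rows are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: 35 clients and 100 gift cards.
conn.execute("CREATE TABLE client_table (client_name TEXT)")
conn.execute("CREATE TABLE giftcard_table (card_number INTEGER, pin_code TEXT)")
conn.executemany("INSERT INTO client_table VALUES (?)",
                 [(f"client{i}",) for i in range(1, 36)])
conn.executemany("INSERT INTO giftcard_table VALUES (?, ?)",
                 [(1000 + i, f"{i:04d}") for i in range(1, 101)])

# Number both tables in random order, then join on the row number.
# Each row number occurs once per side, so no card is handed out twice.
pairs = conn.execute("""
    SELECT c.client_name, g.card_number, g.pin_code
    FROM (SELECT client_name,
                 ROW_NUMBER() OVER (ORDER BY random()) AS rn
          FROM client_table) AS c
    JOIN (SELECT card_number, pin_code,
                 ROW_NUMBER() OVER (ORDER BY random()) AS rn
          FROM giftcard_table) AS g
      ON c.rn = g.rn
""").fetchall()

for row in pairs[:3]:
    print(row)
```

The inner join keeps only row numbers present on both sides, so the 65 surplus cards simply drop out.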

Schematically (I may be incorrect in Teradata syntax):
UPDATE
    -- this table copy will be updated
    CLIENT_TABLE c1
    -- this subquery enumerates clients; the join attaches a row number to each row of the 1st copy
    JOIN ( SELECT id, ROW_NUMBER() OVER (ORDER BY id) rn
           FROM CLIENT_TABLE ) c2 ON c1.id = c2.id
    -- this subquery enumerates cards; the join assigns a card to each client one-to-one by the `rn` number
    JOIN ( SELECT *, ROW_NUMBER() OVER (ORDER BY Card_Number) rn
           FROM GIFTCARD_TABLE ) g ON c2.rn = g.rn
SET c1.Card_Number = g.Card_Number,
    c1.Pin_Code = g.Pin_Code;
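Since UPDATE ... JOIN syntax differs between engines, here is a hedged sketch of the same write-back in SQLite via Python's sqlite3 module: the pairing is computed with a SELECT and applied with a plain parameterized UPDATE. The id column (as in the schematic query) and the sample data are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the client table already has empty card columns.
conn.execute("""CREATE TABLE client_table (
                    id INTEGER PRIMARY KEY,
                    card_number INTEGER,
                    pin_code TEXT)""")
conn.execute("CREATE TABLE giftcard_table (card_number INTEGER, pin_code TEXT)")
conn.executemany("INSERT INTO client_table (id) VALUES (?)",
                 [(i,) for i in range(1, 36)])
conn.executemany("INSERT INTO giftcard_table VALUES (?, ?)",
                 [(1000 + i, f"{i:04d}") for i in range(1, 101)])

# Enumerate clients and cards, then pair them one-to-one by row number.
assignments = conn.execute("""
    SELECT g.card_number, g.pin_code, c.id
    FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM client_table) AS c
    JOIN (SELECT card_number, pin_code,
                 ROW_NUMBER() OVER (ORDER BY card_number) AS rn
          FROM giftcard_table) AS g
      ON c.rn = g.rn
""").fetchall()

# Write each pair back to its client row.
conn.executemany(
    "UPDATE client_table SET card_number = ?, pin_code = ? WHERE id = ?",
    assignments)
conn.commit()

updated = conn.execute(
    "SELECT id, card_number, pin_code FROM client_table ORDER BY id").fetchall()
print(updated[:3])
```

Because both sides are ordered deterministically here, the lowest client id receives the lowest card number; swapping in a random ordering would randomize the assignment.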

Related

Joining two tables at random

I have two tables, one with main data, and another shorter table with additional data.
I would like to join the rows from the shorter table to some of the rows of the main table, at random. For example:
main table:
id  data
--------------
1   apple
2   banana
3   cherry
4   date
5   elderberry
6   fig
secondary table:
id  data
--------------
1   accordion
2   banjo
Desired Result:
main   secondary
----------------
… ?    accordion
… ?    banjo
I can think of one way to do it, using a lot of pre-processing with CTEs:
WITH
cte1 AS (SELECT data FROM main ORDER BY random() LIMIT 2),
cte2 AS (SELECT row_number() OVER() AS row, data FROM cte1),
cte3 AS (SELECT row_number() OVER () AS row, data FROM secondary)
SELECT *
FROM cte2 JOIN cte3 ON cte2.row=cte3.row;
It works, but is there a more straightforward way of joining two tables at random?
I have attached a fiddle: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=21af08976112c7ac7c18329fa3699b8c&hide=2
A CTE is basically just a re-usable template for a subquery.
So this can be golfed down to two subqueries:
SELECT m.rn, m.data main_data, s.data secondary_data
FROM (SELECT data, ROW_NUMBER() OVER (ORDER BY random()) rn FROM main) m
JOIN (SELECT data, ROW_NUMBER() OVER (ORDER BY random()) rn FROM secondary) s USING (rn)
I could rewrite it to this:
SELECT *
FROM (SELECT row_number() OVER (ORDER BY random()) as id,
data
FROM main
ORDER BY RANDOM()) m1
JOIN secondary s on s.id = m1.id
Update: LIMIT is not needed, after looking at @LukStorm's version.
I assumed that you know which table is shorter, so only one column of generated ids is needed.

How do I select 100 records from one table for each unique record from another

I have one table of addresses, another table of coupons. I want to select 10 coupons per address. How would I go about doing that? I know this is very basic, but I've been out of SQL for some time now and trying to get reacquainted with it the best I can...
Table 1
Name Address
-------------------
Store 1 Address 1
Store 2 Address 2
Table 2
Coupons
--------
coupon1
coupon2
...
coupon19
coupon20
You can use window functions:
select t1.*, t2.coupons
from (
    select t1.*, row_number() over(order by id) rn
    from table1 t1
) t1
inner join (
    select t2.*, row_number() over(order by id) rn
    from table2 t2
) t2 on (t2.rn - 1) / 10 + 1 = t1.rn  -- integer division: coupons 1-10 go to address 1, 11-20 to address 2, ...
The idea is to enumerate rows of each table with row_number(), then join the results with a condition on the row numbers. The above query gives you 10 coupons per address.
To get a stable result, you need a column (or a set of columns) in each table that uniquely identifies each row: I assumed id in both tables.
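The bucket arithmetic is easy to get off by one, so here is a small sketch that checks it in SQLite through Python's sqlite3 module. It assumes an id column in both tables (as the answer does) and uses made-up sample rows; coupon row numbers are mapped to a 1-based address number with integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: 2 stores and 20 coupons, each with an id.
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, coupons TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(1, "Store 1", "Address 1"), (2, "Store 2", "Address 2")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(i, f"coupon{i}") for i in range(1, 21)])

# (rn - 1) / 10 is integer division: coupon rows 1-10 give 0, rows 11-20 give 1.
# Adding 1 lines the bucket up with the 1-based address row number.
rows = conn.execute("""
    SELECT t1.name, t2.coupons
    FROM (SELECT name, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM table1) AS t1
    JOIN (SELECT coupons, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM table2) AS t2
      ON (t2.rn - 1) / 10 + 1 = t1.rn
    ORDER BY t1.rn, t2.rn
""").fetchall()

# Count coupons per store.
per_store = {}
for name, _ in rows:
    per_store[name] = per_store.get(name, 0) + 1
print(per_store)
```

In an engine where `/` on integers yields a fraction, the division would need an explicit floor or cast for the buckets to line up.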
Do you want 10 coupons per store? 100 coupons per store? Your title and your post disagree. Or maybe you'd like to distribute all available coupons evenly across all the stores? Part of this query only builds sample data to demonstrate the output; the part to focus on is the use of NTILE(10) to break the coupons into ten groups. A ROW_NUMBER applied within each group then yields a join_id that is shared by ten coupons (one from each group) and can be joined on:
WITH random_data AS
(
SELECT ROW_NUMBER() OVER (ORDER BY id) AS nums
FROM sysobjects
), store_info AS
(
SELECT ROW_NUMBER() OVER (ORDER BY nums) AS join_id,
'Store' + CONVERT(VARCHAR(10),nums) AS StoreName,
'Address' + CONVERT(VARCHAR(10),nums) AS StoreAddress
FROM random_data
), more_random_data AS
(
SELECT ROW_NUMBER() OVER (ORDER BY t2.nums) AS nums
FROM random_data t1
CROSS JOIN random_data t2
), coupons AS
(
SELECT NTILE(10) OVER (ORDER BY nums) AS group_id,
'Coupon' + CONVERT(VARCHAR(10),nums) AS Coupon,
nums
FROM more_random_data
), coupons_with_join_id AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY nums) AS join_id,
Coupon
FROM coupons
)
SELECT StoreName, StoreAddress, Coupon
FROM store_info AS si
JOIN coupons_with_join_id AS cwji
ON si.join_id = cwji.join_id
ORDER BY si.join_id, Coupon
The inherent issue here is that the 2 tables have no relation to each other. So your options are either to force a pseudo relation, like the other answers show, or create a relation between the two tables, like adding a store_name column to the coupon table.
This distributes all coupons (almost) evenly across all addresses:
with addr as
( -- prepare addresses by adding a sequence
select Name, Address,
-- 1-n
row_number() over (order by name) as rn
from table1
)
,coup as
( -- prepare coupons by adding same "sequence"
select coupons,
-- 1-n, same number of coupons (+/-1) for each address
ntile((select count(*) from table1))
over (order by coupons) as num
from table2
)
select *
from addr
join coup
on addr.rn = coup.num
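To see what "(almost) evenly" means in practice, the following sketch runs the same NTILE idea in SQLite via Python's sqlite3 module on made-up data (3 stores, 20 coupons). As a choice made for this sketch, the address count is read in a separate step and inlined as an integer literal rather than passed to NTILE as a scalar subquery.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: 3 addresses and 20 coupons.
conn.execute("CREATE TABLE table1 (name TEXT, address TEXT)")
conn.execute("CREATE TABLE table2 (coupons TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(f"Store {i}", f"Address {i}") for i in range(1, 4)])
conn.executemany("INSERT INTO table2 VALUES (?)",
                 [(f"coupon{i:02d}",) for i in range(1, 21)])

# Number of tiles = number of addresses (an integer, safe to inline).
n_addresses = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]

rows = conn.execute(f"""
    WITH addr AS (
        SELECT name, address,
               ROW_NUMBER() OVER (ORDER BY name) AS rn
        FROM table1
    ),
    coup AS (
        SELECT coupons,
               NTILE({int(n_addresses)}) OVER (ORDER BY coupons) AS num
        FROM table2
    )
    SELECT addr.name, coup.coupons
    FROM addr
    JOIN coup ON addr.rn = coup.num
""").fetchall()

# Count how many coupons each address received.
per_address = Counter(name for name, _ in rows)
print(dict(per_address))
```

With 20 coupons and 3 addresses the tiles cannot be equal, so the group sizes differ by at most one.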

Remove duplicates based on a condition

For the data set given below, I want to remove the row that has the later timestamp.
**37C1Z2990E5E0 (TRXID) should be UNIQUE** in the below dataSet
JKLAMMSDF123 20141112 20141117 5000.0 P 1.22 RT101018 *2014-11-12 10:10:26* 37C1Z2990E5E0 101018
JKLAMMSDF123 20141110 20141114 5000.0 P 1.22 RT161002 *2014-11-12 10:11:33* 37C1Z2990E5E0 161002
-- More rows
Try this:
;WITH DATA AS
(
SELECT TRXID, MAX(YourTimestampColumn) AS TS
FROM YourTable
GROUP BY TRXID
HAVING COUNT(*) > 1
)
DELETE T
FROM YourTable AS T
INNER JOIN DATA AS D
ON T.TRXID = D.TRXID
AND T.YourTimestampColumn = D.TS;
Select the min of the timestamp column and group by all of the other columns.
SELECT MIN(TIMESTAMP), C1, C2, C3...
FROM YOUR_TABLE
GROUP BY C1, C2, C3...
I would do this using a window function plus a CTE.
To check the result after removing duplicates, use this:
;WITH DATA
AS (SELECT *,
Row_number()OVER(partition BY TRXID ORDER BY YourTimestampColumn) rn
FROM YourTable)
select *
FROM data
WHERE rn = 1
To delete the duplicates use this.
;WITH DATA
AS (SELECT *,
Row_number()OVER(partition BY TRXID ORDER BY YourTimestampColumn) rn
FROM YourTable)
DELETE FROM data
WHERE rn > 1
This will work even if you have more than one duplicate for the same TRXID.
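Deleting through a CTE as above is T-SQL behaviour and is not portable. As a hedged sketch of the same ROW_NUMBER technique elsewhere, the example below uses SQLite via Python's sqlite3 module and deletes by rowid instead. The table layout, the ts and ref column names, and the third row are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, trimmed-down version of the data set in the question.
conn.execute("CREATE TABLE your_table (trxid TEXT, ts TEXT, ref TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", [
    ("37C1Z2990E5E0", "2014-11-12 10:10:26", "RT101018"),
    ("37C1Z2990E5E0", "2014-11-12 10:11:33", "RT161002"),
    ("HYPOTHETICAL01", "2014-11-13 09:00:00", "RT000001"),
])

# Rank rows per trxid by timestamp and delete everything but the earliest.
conn.execute("""
    DELETE FROM your_table
    WHERE rowid IN (
        SELECT rid
        FROM (SELECT rowid AS rid,
                     ROW_NUMBER() OVER (PARTITION BY trxid ORDER BY ts) AS rn
              FROM your_table)
        WHERE rn > 1
    )
""")
conn.commit()

remaining = conn.execute(
    "SELECT trxid, ts, ref FROM your_table ORDER BY trxid").fetchall()
print(remaining)
```

The timestamps are stored as ISO-style text here, which sorts chronologically; a real timestamp column would work the same way.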

How to join two tables in SQL Server without duplication

Hi I have two tables A and B
Table A:
Order Pick up
100 Toronto
100 Mississauga
100 Scarborough
Table B
Order Drop off
100 Oakvile
100 Hamilton
100 Milton
Please let me know how I can get this output (i.e. I just want to place the fields from B on the right-hand side of A):
Order pickup Dropoff
100 Toronto oakvile
100 Mississauga Hamilton
100 Scarborough Milton
How can I write a query for this? I tried to join on a.rownum = b.rownum, but no luck.
Since the OP has not mentioned any RDBMS, I am taking the liberty of assuming SQL Server 2008. If the OP wants, the following query can easily be converted to any other RDBMS.
select A.[Order],
ROW_NUMBER() OVER(ORDER BY A.[Pick up]) rn1,
A.[Pick up]
into A1
FROM A
;
select B.[Order],
ROW_NUMBER() OVER(ORDER BY B.[Drop off]) rn2,
B.[Drop off]
into B1
FROM B
;
Select A1.[Order],
A1.[Pick up],
B1.[Drop off]
FROM A1
INNER JOIN B1 on A1.rn1=B1.rn2
From the use of rownum, I'm presuming that you are using Oracle. You can attempt the following:
select a.Order as "order", a.Pickup, b.DropOff
from (select a.*, rownum as seqnum
from a
) a join
(select b.*, rownum as seqnum
from b
) b
on a.order = b.order and a.seqnum = b.seqnum;
(This assumes that all orders match up exactly.)
I must emphasize that although this might seem to work (and it should work on small examples), it will not work in general. And, it will not work on data that has deleted records. And, it probably won't work on parallel systems. If you have a small amount of data, I'd suggest dumping it in Excel and doing the work there -- that way, you can see if the pairs make sense.
Also, if you do have a column that specifies the ordering, then basically the same structure will work:
select coalesce(a.Order, b.Order) as "order", a.Pickup, b.DropOff
from (select a.*,
row_number() over (partition by "order" order by <ordering field>) as seqnum
from a
) a join
(select b.*,
row_number() over (partition by "order" order by <ordering field>) as seqnum
from b
) b
on a.order = b.order and a.seqnum = b.seqnum;
I'd use a CTE along with the ROW_NUMBER windowing function.
WITH keyed_A AS (
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS id
,[Order]
,[Pick Up]
FROM A
), keyed_B AS (
SELECT
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS id
,[Order]
,[Drop Off]
FROM B
)
SELECT
a.[Pick Up]
,b.[Drop Off]
FROM keyed_A AS a
INNER JOIN keyed_B AS b
ON a.id = b.id
;
The CTE can be thought of as a virtual table with an id that spans the two tables. The OVER clause with the window function ROW_NUMBER creates that id in the CTE. Since we are relying on the physical storage order of the records (not a good idea; please add keys to the tables), we can ORDER BY (SELECT NULL), which means: just use the order in which the rows are read.

Select random values from each group, SQL

I have a project through which I'm creating a game powered by a database.
The database has data entered like this:
(ID, Name) || (1, PhotoID),(1,PhotoID),(1,PhotoID),(2,PhotoID),(2,PhotoID) and so on. There are thousands of entries.
This is my current SQL statement:
$sql = "SELECT TOP 8 * FROM Image WHERE Hidden = '0' ORDER BY NEWID()";
But this can also produce results with matching IDs, where I need to have each result have a unique ID (that is I need one result from each group).
How can I change my query to grab one result from each group?
Thanks!
Since ORDER BY NEWID() will result in a table scan anyway, you might use row_number() to isolate the first row in each group:
; with randomizer as (
select id,
name,
row_number() over (partition by id
order by newid()) rn
from Image
where hidden = 0
)
select top 8
id,
name
from randomizer
where rn = 1
-- Added by mellamokb's suggestion to allow groups to be randomized
order by newid()
Sql Fiddle playground thanks to mellamokb.
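For readers without SQL Server at hand, the same pattern can be tried in SQLite through Python's sqlite3 module. This is only a sketch under assumptions: random() and LIMIT stand in for NEWID() and TOP, and the 20 groups of 5 photos are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: 20 ids (groups), 5 photos each, none hidden.
conn.execute("CREATE TABLE image (id INTEGER, name TEXT, hidden INTEGER)")
conn.executemany("INSERT INTO image VALUES (?, ?, 0)",
                 [(g, f"photo_{g}_{k}")
                  for g in range(1, 21) for k in range(5)])

# Step 1: shuffle the rows inside each id group and number them.
# Step 2: keep row 1 of each group, shuffle the groups, take 8.
picked = conn.execute("""
    WITH randomizer AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY random()) AS rn
        FROM image
        WHERE hidden = 0
    )
    SELECT id, name
    FROM randomizer
    WHERE rn = 1
    ORDER BY random()
    LIMIT 8
""").fetchall()

for row in picked:
    print(row)
```

Filtering on rn = 1 before the LIMIT is what guarantees that no id appears twice in the result.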
Looks like this may work, but I can't vouch for performance:
SELECT TOP 8 ID,
(select top 1 name from image i2
where i2.id = i1.id order by newid())
FROM Image i1
WHERE hidden = '0'
group by ID
ORDER BY NEWID();
Demo: http://www.sqlfiddle.com/#!3/657ad/6
If you have an index on the ID column and want to take advantage of the index and avoid a full table scan, do your randomization on the key values first:
WITH IDs AS
(
SELECT DISTINCT ID
FROM Image
WHERE Hidden = '0'
),
SequencedIDs AS
(
SELECT ID, ROW_NUMBER() OVER (ORDER BY NEWID()) AS Seq
FROM IDs
),
ImageGroups AS
(
SELECT i.*, ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY NEWID()) Seq
FROM SequencedIDs s
INNER JOIN Image i
ON i.ID = s.ID
WHERE s.Seq <= 8  -- keep 8 groups, matching the TOP 8 in the question
AND i.Hidden = '0'
)
SELECT *
FROM ImageGroups
WHERE Seq = 1
This should drastically reduce the cost over the table scan approach, although I don't have a schema big enough that I can test with - so try running some statistics in SSMS and make sure ID is actually indexed for this to be effective.
select * from (select * from photos order by rand()) as _SUB group by _SUB.id;
select top 8 ID, Name from (select ID, Name, row_number() over
(partition by ID order by newid()) as ranker from Image where Hidden = 0 ) Z where ranker = 1
order by newID()