I have a table like this:
customer_no product_code
1345 001
1345 002
1345 003
I want a new table that shows every pairing of a customer's products:
customer_no product_code_1 product_code_2
1345 001 002
1345 001 003
1345 002 001
1345 002 003
1345 003 001
1345 003 002
This will give you the desired output:
-- self-join: pair every product with every other product of the same customer
create table yourNewTableName as (
    select t1.customer_no,
           t1.product_code as product_code_1,
           t2.product_code as product_code_2
    from yourOldTableName t1
    inner join yourOldTableName t2
        on t1.customer_no = t2.customer_no
    where t1.product_code != t2.product_code
);
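To try the self-join without touching a real database, here is a minimal sketch using Python's built-in sqlite3 module. The table name purchases and the helper product_pairs are made up for illustration; only the join itself comes from the answer above.

```python
import sqlite3

def product_pairs(rows):
    """Return every ordered pair of distinct products per customer."""
    con = sqlite3.connect(":memory:")
    # "purchases" is a hypothetical table name used only for this demo
    con.execute("create table purchases (customer_no text, product_code text)")
    con.executemany("insert into purchases values (?, ?)", rows)
    sql = """
        select t1.customer_no, t1.product_code, t2.product_code
        from purchases t1
        inner join purchases t2
            on t1.customer_no = t2.customer_no
        where t1.product_code != t2.product_code
        order by 1, 2, 3
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result

pairs = product_pairs([("1345", "001"), ("1345", "002"), ("1345", "003")])
for row in pairs:
    print(*row)
```

With three products for one customer this yields six rows, one for each ordered pair.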
So I have a table as follows:
ID create_date
001 01/01/2021
002 02/04/2021
003 07/22/2021
004 01/29/2021
005 03/01/2021
ID is unique for the table.
I have another table (below) where these IDs appear multiple times alongside another variable, titled code_id.
ID code_id date data
001 A 01/01/2021 xxx
002 W 02/08/2021 xxx
002 B 03/06/2021 xxx
001 A 01/19/2021 xxx
002 C 05/01/2021 xxx
004 D 12/01/2021 xxx
001 K 01/02/2021 xxx
001 J 01/15/2021 xxx
005 A 03/01/2021 xxx
005 A 03/01/2021 xxx
005 B 03/05/2021 xxx
005 B 03/30/2021 xxx
005 C 03/30/2021 xxx
005 D 04/01/2021 xxx
What I want to do is create a new table (preferably via CTE, but open to join options) that shows the distinct count of code_id within both 5 and 30 days of table1.create_date.
In other words, how many different code_ids appear for each ID within x days of create_date, where x is 5 and 30 respectively.
Here is the resulting table I seek:
ID distinct_code_id_5_day distinct_code_id_30_day distinct_code_id_total
001 2 3 3
002 1 2 3
003 0 0 0
004 0 0 1
005 2 3 4
In the case of ID = 001, we count all code_ids that appeared from 01/01/2021 to 01/05/2021, inclusive, for distinct_code_id_5_day, and from 01/01/2021 to 01/30/2021, inclusive, for distinct_code_id_30_day.
You should be able to solve this with a join and a couple of iff() calls with date math:
with ids as (
select split(value, ' ') x, x[0] id, x[1]::date create_date
from table(split_to_table('001 01/01/2021
002 02/04/2021
003 07/22/2021
004 01/29/2021
005 03/01/2021', '\n'))
), data as(
select split(value, ' ') x, x[0] id, x[1] code_id, x[2]::date date, x[3] data
from table(split_to_table('001 A 01/01/2021 xxx
002 W 02/08/2021 xxx
002 B 03/06/2021 xxx
001 A 01/19/2021 xxx
002 C 05/01/2021 xxx
004 D 12/01/2021 xxx
001 K 01/02/2021 xxx
001 J 01/15/2021 xxx
005 A 03/01/2021 xxx
005 A 03/01/2021 xxx
005 B 03/05/2021 xxx
005 B 03/30/2021 xxx
005 C 03/30/2021 xxx
005 D 04/01/2021 xxx', '\n')))
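select id
, count(distinct code5) distinct_code_id_5_day
, count(distinct code30) distinct_code_id_30_day
, count(distinct code_id) distinct_code_id_total
from (
select a.id, iff(a.create_date + 5 >= b.date, b.code_id, null) code5
, iff(a.create_date + 30 >= b.date, b.code_id, null) code30
, b.code_id
from ids a
left outer join data b
-- the condition goes in ON, not WHERE, so IDs with no rows in data (such as 003) are kept
on a.id=b.id
)
group by 1
order by 1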
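The same conditional-count idea can be checked locally with a rough sketch in Python's sqlite3. It uses CASE in place of Snowflake's iff() and SQLite's date() modifier in place of date + 5. The table names ids and events, the column event_date, and the ISO date format are assumptions made for the demo.

```python
import sqlite3

IDS = [("001", "2021-01-01"), ("002", "2021-02-04"), ("003", "2021-07-22"),
       ("004", "2021-01-29"), ("005", "2021-03-01")]
EVENTS = [("001", "A", "2021-01-01"), ("002", "W", "2021-02-08"),
          ("002", "B", "2021-03-06"), ("001", "A", "2021-01-19"),
          ("002", "C", "2021-05-01"), ("004", "D", "2021-12-01"),
          ("001", "K", "2021-01-02"), ("001", "J", "2021-01-15"),
          ("005", "A", "2021-03-01"), ("005", "A", "2021-03-01"),
          ("005", "B", "2021-03-05"), ("005", "B", "2021-03-30"),
          ("005", "C", "2021-03-30"), ("005", "D", "2021-04-01")]

con = sqlite3.connect(":memory:")
con.execute("create table ids (id text, create_date text)")
con.execute("create table events (id text, code_id text, event_date text)")
con.executemany("insert into ids values (?, ?)", IDS)
con.executemany("insert into events values (?, ?, ?)", EVENTS)

# CASE yields NULL outside the window, and count(distinct ...) ignores NULLs
counts = con.execute("""
    select a.id,
           count(distinct case when b.event_date <= date(a.create_date, '+5 days')
                               then b.code_id end) as distinct_code_id_5_day,
           count(distinct case when b.event_date <= date(a.create_date, '+30 days')
                               then b.code_id end) as distinct_code_id_30_day,
           count(distinct b.code_id) as distinct_code_id_total
    from ids a
    left join events b on a.id = b.id
    group by a.id
    order by a.id
""").fetchall()
con.close()

for row in counts:
    print(*row)
```

Because the join is a LEFT JOIN with the condition in ON, ID 003 still shows up with zeros in every column.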
So I have a list as follows:
Table 1
ID TIMESTAMP GROUP
001 2021-04-01 12:51:12.063 A
001 2021-04-04 12:51:12.063 G
001 2021-04-14 10:47:03.022 B
002 2021-01-13 09:46:23.012 C
003 2021-09-10 03:32:53.043 D
004 2021-04-13 01:12:54.056 D
004 2021-04-13 11:12:26.054 A
004 2021-04-13 21:53:36.023 D
005 2021-04-01 13:53:13.023 F
005 2021-04-11 13:53:13.023 J
003 2022-04-13 20:32:11.011 G
006 2021-08-13 20:32:11.011 G
And I also have a list of events:
TABLE 2
EVENT ID TIMESTAMP
eventA 001 2021-04-02 12:51:12.063
eventB 001 2021-04-13 12:51:12.063
eventA 002 2021-04-01 12:51:12.063
eventA 002 2021-04-13 12:51:12.063
eventA 002 2021-04-14 12:51:12.063
eventA 003 2021-10-17 12:51:12.063
eventB 005 2021-04-10 12:51:12.063
eventB 005 2021-04-21 12:51:12.063
eventA 006 2021-05-01 20:32:11.011
My goal is, for every event in Table 2, to join the most recent entry from Table 1 with the same ID. If Table 1 has entries for that ID but none of them precede the event, the group should be NULL in the join.
So in short, for every row in Table 2, we need to find the most recent group for that ID based on timestamp.
Final Result
EVENT ID TIMESTAMP group
eventA 001 2021-04-02 12:51:12.063 A
eventB 001 2021-04-13 12:51:12.063 G
eventA 002 2021-04-01 12:51:12.063 NULL
eventA 002 2021-04-13 12:51:12.063 C
eventA 002 2021-04-14 12:51:12.063 C
eventA 003 2021-10-17 12:51:12.063 D
eventB 005 2021-04-10 12:51:12.063 F
eventB 005 2021-04-21 12:51:12.063 J
eventA 006 2021-05-01 20:32:11.011 NULL
If you do a LEFT JOIN on prior (or equal) timestamps and then prune the overmatches down to just the most recent row with QUALIFY, this can be done with:
SELECT t2.event,
    t2.id,
    t2.timestamp,
    t1."GROUP"  -- GROUP is a reserved word, so it has to be quoted
FROM table2 AS t2
LEFT JOIN table1 AS t1
    ON t2.id = t1.id AND t2.timestamp >= t1.timestamp
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY t2.id, t2.timestamp
    ORDER BY t1.timestamp DESC NULLS LAST
) = 1
ORDER BY 1,2,3;
This will work as long as Table 2 has no duplicate (ID, TIMESTAMP) values.
Window functions with QUALIFY ROW_NUMBER() work to get the latest row, as Simeon shows. I've found that for this type of join (often called an as-of join) on very large tables, the approach of joining, finding the max timestamp, and then rejoining usually completes faster than using a window function:
select J."EVENT", J.ID, J."TIMESTAMP", T1."GROUP" from
(select * from T2,
-- latest T1 timestamp strictly before the event; use <= to also match equal timestamps
lateral (select max(T1."TIMESTAMP") TS from T1 where T1.ID = T2.ID and T1."TIMESTAMP" < T2."TIMESTAMP")) J
-- rejoin on ID as well as timestamp so a row from another ID with the same timestamp cannot match
left join T1 on J.ID = T1.ID and J.TS = T1."TIMESTAMP"
;
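For anyone who wants to experiment with the "find the max timestamp, then rejoin" idea outside Snowflake, here is a small sketch in Python's sqlite3. It swaps the lateral join for a correlated subquery, which is my substitution and not what either answer uses. The table names group_log and events and the column names ts and grp are invented for the demo, and only part of the sample data is loaded.

```python
import sqlite3

GROUP_LOG = [
    ("001", "2021-04-01 12:51:12.063", "A"),
    ("001", "2021-04-04 12:51:12.063", "G"),
    ("001", "2021-04-14 10:47:03.022", "B"),
    ("005", "2021-04-01 13:53:13.023", "F"),
    ("005", "2021-04-11 13:53:13.023", "J"),
    ("006", "2021-08-13 20:32:11.011", "G"),
]
EVENTS = [
    ("eventA", "001", "2021-04-02 12:51:12.063"),
    ("eventB", "001", "2021-04-13 12:51:12.063"),
    ("eventB", "005", "2021-04-10 12:51:12.063"),
    ("eventB", "005", "2021-04-21 12:51:12.063"),
    ("eventA", "006", "2021-05-01 20:32:11.011"),
]

con = sqlite3.connect(":memory:")
con.execute("create table group_log (id text, ts text, grp text)")
con.execute("create table events (event text, id text, ts text)")
con.executemany("insert into group_log values (?, ?, ?)", GROUP_LOG)
con.executemany("insert into events values (?, ?, ?)", EVENTS)

# For each event, find the latest group_log timestamp at or before it,
# then join back to fetch the group. No earlier row means a NULL group.
as_of = con.execute("""
    select e.event, e.id, e.ts, g.grp
    from events e
    left join group_log g
        on g.id = e.id
       and g.ts = (select max(g2.ts)
                   from group_log g2
                   where g2.id = e.id and g2.ts <= e.ts)
    order by e.id, e.ts
""").fetchall()
con.close()

for row in as_of:
    print(*row)
```

ISO-formatted timestamps compare correctly as plain strings, which is why the sketch can store them as text.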
I have a table that has player_id, team_id
I want to find all players who played on the same 3 or more teams.
The expected output would be :
player1, player2, number_of_teams
So far I have something like:
SELECT player_id as player1, player_id as player2, count(team_id) as number_of_teams
FROM player_history
WHERE ....
Sample Data:
player_id | team_id
--------------------
001 | 23
001 | 15
001 | 21
002 | 23
002 | 21
002 | 15
002 | 34
003 | 23
003 | 15
003 | 34
003 | 21
004 | 12
004 | 11
004 | 23
should return:
player1 | player2 | number_of_teams
-----------------------------------
001 | 002 | 3
001 | 003 | 3
002 | 003 | 4
Join the table with itself on the same team but different players, then group the result by player pair and count the teams.
Since I assume there are more than two players on each team, and that you're looking for players who were on the same team in the same year (implied but not really specified in your question), I took the liberty of adding a year condition to the join.
You can, of course, remove it.
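SELECT
    p1,
    p2,
    COUNT(team_id) AS total
FROM
(
    -- one row per team and pair of players who were on it in the same year
    SELECT
        h1.team_id,
        h1.player_id AS p1,
        h2.player_id AS p2
    FROM
        player_history h1
        -- < instead of != so each pair is listed once, as in the expected output
        INNER JOIN player_history h2 ON h1.team_id = h2.team_id AND h1.player_id < h2.player_id AND h1.year = h2.year
    GROUP BY
        h1.team_id,
        h1.player_id,
        h2.player_id
) sameteam
GROUP BY
    p1,
    p2
HAVING
    COUNT(team_id) >= 3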
Notice that your example result doesn't fit the example data. Player 004 should not be on the list.
hope it helps
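As a quick check against the sample data from the question, here is a sketch of the self-join in Python's sqlite3. It leaves out the year condition, since the sample table has no year column, and it counts distinct teams per pair directly instead of using a subquery. Both are simplifications on my part.

```python
import sqlite3

HISTORY = [
    ("001", 23), ("001", 15), ("001", 21),
    ("002", 23), ("002", 21), ("002", 15), ("002", 34),
    ("003", 23), ("003", 15), ("003", 34), ("003", 21),
    ("004", 12), ("004", 11), ("004", 23),
]

con = sqlite3.connect(":memory:")
con.execute("create table player_history (player_id text, team_id integer)")
con.executemany("insert into player_history values (?, ?)", HISTORY)

# h1.player_id < h2.player_id keeps each pair once and skips self-pairs
shared = con.execute("""
    select h1.player_id as player1,
           h2.player_id as player2,
           count(distinct h1.team_id) as number_of_teams
    from player_history h1
    join player_history h2
        on h1.team_id = h2.team_id
       and h1.player_id < h2.player_id
    group by h1.player_id, h2.player_id
    having count(distinct h1.team_id) >= 3
    order by 1, 2
""").fetchall()
con.close()

for row in shared:
    print(*row)
```

Player 004 shares only team 23 with the others, so no pair involving 004 reaches the threshold of 3.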
I want to insert into my table a new ID, which makes it possible to cluster lines into one group. My data contain authors who published together, author 1 (auid1) published with author 2 (auid2). I would like to find out if there are groups of authors in my data who published together and build a network. So every group_id will mark one network.
There is an additional condition: authors belong to the same group if every author published with everyone else in his group. That means, one auid can be in more than one group.
Here is an example of my data:
auid_1 auid_2
--------------------
001 002
008 002
010 007
001 008
007 005
005 010
008 003
007 012
004 005
006 005
004 006
004 009
The result should look like this:
auid_1 auid_2 group_id
---------------------------------
001 002 1
008 002 1
010 007 2
001 008 1
007 005 2
005 010 2
008 003 3
007 012 4
004 005 5
006 005 5
004 006 5
004 009 6
Additional information:
I use Oracle 11g, Enterprise Edition.
We have pairs of IDs, examples:
ID1 ID2
--------
1 2
3 2
1 3
4 5
...
We want to allocate a group ID to all pairs that are related to each other. In my example, IDs 1, 2 and 3 (each ID is paired with the others) form one cluster. The next cluster would be 4, 5, ....
We need a SQL query that does this clustering for us. I think we need recursion? We don't know the number of IDs per cluster.
Is it understandable now?
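For the simpler reading in the second example, where any chain of pairs ends up in one cluster, a union-find pass is one possible way to hand out group IDs. The sketch below is in Python and not SQL, and the function name assign_groups is invented. It treats clusters as connected components, so it does not enforce the stricter condition from the first example that every author in a group must have published with every other member.

```python
def assign_groups(pairs):
    """Give each pair a group id shared by all pairs in the same connected component."""
    parent = {}

    def find(x):
        # Walk up to the representative of x, shortening the path on the way
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two sides of every pair into one set
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the components in order of first appearance
    group_ids = {}
    result = []
    for a, b in pairs:
        root = find(a)
        gid = group_ids.setdefault(root, len(group_ids) + 1)
        result.append((a, b, gid))
    return result

grouped = assign_groups([(1, 2), (3, 2), (1, 3), (4, 5)])
for row in grouped:
    print(*row)
```

Here the pairs over IDs 1, 2 and 3 all receive group 1 and the pair (4, 5) receives group 2.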
I have a table "price_hist" in Amazon Redshift (PostgreSQL-based) that holds product and price data for 10 countries, loaded twice a day. I want only the latest data for each day for each product.
For example, below is the table:
Country Product Price(string) Created_On
US 001 $2,300 2015/02/16 00:46:20
US 001 $2,300 2015/02/16 13:27:12
DK 006 kr1,700 2015/02/16 00:46:20
DK 006 kr1,700 2015/02/16 13:27:12
US 002 $5,300 2015/02/15 00:46:20
US 002 $5,300 2015/02/15 13:27:12
US 001 $2,200 2015/02/15 00:46:20
US 001 $2,200 2015/02/15 13:27:12
DK 007 kr28 2015/02/15 00:46:20
DK 007 kr28 2015/02/15 13:27:12
US 001 $2,100 2015/02/14 00:46:20
US 002 $5,200 2015/02/14 13:27:12
DK 007 kr9,100 2015/02/14 00:46:20
DK 007 kr9,100 2015/02/14 13:27:12
Now I want a query that always shows the data for today and yesterday, with the price difference and a flag indicating whether the product was available yesterday.
Required Output :
Country Product P_today p_yesterday p_change flag created_on
US 001 2300 2200 100 Both 2015/02/16 13:27:12
US 002 0 5300 -5300 Removed 2015/02/15 13:27:12
DK 006 1700 0 1700 Added 2015/02/16 13:27:12
DK 007 0 9100 -9100 Removed 2015/02/15 13:27:12
where column P_Change - Show price changes between today's and yesterday's products.
flag - Create a column to reflect new products added in Today's data and the ones which got removed.
You can do it with something like this:
-- assumes price has already been converted to a number (the question stores it as a string)
select country,product,P_today,P_yesterday, (P_today - P_yesterday) as P_change ,
CASE
WHEN P_today > 0 and P_yesterday > 0 then 'both'
WHEN P_today = 0 and P_yesterday > 0 then 'removed'
WHEN P_today > 0 and P_yesterday = 0 then 'added'
END
from
(select
coalesce(q1.country,q2.country) as country,coalesce(q1.product, q2.product) as product ,coalesce(q1.price, 0) as P_today, coalesce(q2.price,0) as P_yesterday
from
-- q1: latest rows for today, q2: latest rows for yesterday
(select * from price_hist where created_on in (select max(created_on) from price_hist where date_trunc('day', created_on) = '2015-02-16 00:00:00+00' group by product,country)) as q1
full outer join
(select * from price_hist where created_on in (select max(created_on) from price_hist where date_trunc('day', created_on) = '2015-02-15 00:00:00+00' group by product,country)) as q2
on q1.country = q2.country and q1.product = q2.product ) as t
I tested it and it gave me something similar to what you are looking for, see below:
country | product | p_today | p_yesterday | p_change | case
---------+---------+---------+-------------+----------+---------
US | 001 | 2300 | 2300 | 0 | both
US | 002 | 0 | 2300 | -2300 | removed
DK | 006 | 700 | 0 | 700 | added
DK | 007 | 0 | 2300 | -2300 | removed
Hope that helps.
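For comparison, here is a rough sketch of the same today-versus-yesterday report in Python's sqlite3. It uses conditional aggregation in place of the full outer join above, which is a different technique that I swapped in because older SQLite builds lack FULL OUTER JOIN. It also assumes prices are already integers and timestamps are in ISO format, neither of which is true of the raw table in the question, and it loads only the rows for the two days being compared.

```python
import sqlite3

ROWS = [
    ("US", "001", 2300, "2015-02-16 00:46:20"),
    ("US", "001", 2300, "2015-02-16 13:27:12"),
    ("DK", "006", 1700, "2015-02-16 00:46:20"),
    ("DK", "006", 1700, "2015-02-16 13:27:12"),
    ("US", "002", 5300, "2015-02-15 00:46:20"),
    ("US", "002", 5300, "2015-02-15 13:27:12"),
    ("US", "001", 2200, "2015-02-15 00:46:20"),
    ("US", "001", 2200, "2015-02-15 13:27:12"),
    ("DK", "007", 28, "2015-02-15 00:46:20"),
    ("DK", "007", 28, "2015-02-15 13:27:12"),
]

con = sqlite3.connect(":memory:")
con.execute("create table price_hist (country text, product text, price integer, created_on text)")
con.executemany("insert into price_hist values (?, ?, ?, ?)", ROWS)

report = con.execute("""
    with latest as (
        -- keep only the last load of each day per country and product
        select country, product, price, created_on
        from price_hist p
        where date(created_on) in (:today, :yesterday)
          and created_on = (select max(q.created_on)
                            from price_hist q
                            where q.country = p.country
                              and q.product = p.product
                              and date(q.created_on) = date(p.created_on))
    ),
    pivoted as (
        -- one row per product, with 0 standing in for a missing day
        select country, product,
               max(case when date(created_on) = :today then price else 0 end) as p_today,
               max(case when date(created_on) = :yesterday then price else 0 end) as p_yesterday
        from latest
        group by country, product
    )
    select country, product, p_today, p_yesterday,
           p_today - p_yesterday as p_change,
           case when p_today > 0 and p_yesterday > 0 then 'both'
                when p_today = 0 and p_yesterday > 0 then 'removed'
                when p_today > 0 and p_yesterday = 0 then 'added'
           end as flag
    from pivoted
    order by country, product
""", {"today": "2015-02-16", "yesterday": "2015-02-15"}).fetchall()
con.close()

for row in report:
    print(*row)
```

Passing the two dates as named parameters keeps the query reusable for any pair of days.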