I have a table which looks like this:
ID  money_earned  days_since_start
--  ------------  ----------------
1   1000          1
1   2000          2
1   3000          4
1   2000          5
2   1000          1
2   100           3
I want the missing days_since_start values (days on which no money was earned) to be filled in, so that every day appears for each ID, with money_earned set to NULL to indicate there were no earnings. It should look like this:
ID  money_earned  days_since_start
--  ------------  ----------------
1   1000          1
1   2000          2
1   NULL          3
1   3000          4
1   2000          5
2   1000          1
2   NULL          2
2   100           3
I have tried searching for something like this, but I don't even know which function does it...
Thank you!
You can first generate a table of all ids and days with this query:
SELECT id, generate_series(1, max(days_since_start)) AS days_since_start
FROM days
GROUP BY id
(If you need every day from 1 to 100, you can replace the max(...) expression with the number 100.)
Then join it with your table. The final query might look like this:
WITH genDays AS (
    SELECT id, generate_series(1, max(days_since_start)) AS days_since_start
    FROM days
    GROUP BY id
)
SELECT coalesce(genDays.id, d3.id) AS id,
       d3.money_earned,
       coalesce(d3.days_since_start, genDays.days_since_start) AS days_since_start
FROM days d3
FULL JOIN genDays
    ON genDays.id = d3.id
   AND genDays.days_since_start = d3.days_since_start
The output will be what you need.
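To try the gap-filling idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above: SQLite does not ship generate_series in every build, so a recursive CTE (named gen_days here, my own choice) stands in for it, and a LEFT JOIN from the generated days replaces the FULL JOIN, which is enough because the generated days already cover every day up to each id's maximum.

```python
import sqlite3

# Sample data from the question: (id, money_earned, days_since_start)
rows = [
    (1, 1000, 1), (1, 2000, 2), (1, 3000, 4), (1, 2000, 5),
    (2, 1000, 1), (2, 100, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE days (id INTEGER, money_earned INTEGER, days_since_start INTEGER)"
)
conn.executemany("INSERT INTO days VALUES (?, ?, ?)", rows)

# The recursive CTE emits days 1..max(days_since_start) for each id.
query = """
WITH RECURSIVE gen_days(id, days_since_start, last_day) AS (
    SELECT id, 1, MAX(days_since_start) FROM days GROUP BY id
    UNION ALL
    SELECT id, days_since_start + 1, last_day
    FROM gen_days
    WHERE days_since_start < last_day
)
SELECT g.id, d.money_earned, g.days_since_start
FROM gen_days g
LEFT JOIN days d
       ON d.id = g.id
      AND d.days_since_start = g.days_since_start
ORDER BY g.id, g.days_since_start
"""
filled = conn.execute(query).fetchall()

for row in filled:
    print(row)  # missing days show up with money_earned = None
```

Days that have no row in the original table come back with money_earned as None, which is how sqlite3 represents NULL.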
If you need to fill the NULLs with the last non-NULL value per id, you can modify the query like this:
WITH genDays AS (
    SELECT id, generate_series(1, 100) AS days_since_start
    FROM days
    GROUP BY id
)
SELECT coalesce(genDays.id, d3.id) AS id,
       coalesce(d3.money_earned,
                (SELECT d4.money_earned
                 FROM days d4
                 WHERE d4.id = genDays.id
                   AND d4.days_since_start < genDays.days_since_start
                 ORDER BY d4.days_since_start DESC
                 LIMIT 1)) AS money_earned,
       coalesce(d3.days_since_start, genDays.days_since_start) AS days_since_start
FROM days d3
FULL JOIN genDays
    ON genDays.id = d3.id
   AND genDays.days_since_start = d3.days_since_start
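The "carry the last known value forward" step can be checked the same way. The sketch below again assumes SQLite through Python's sqlite3 rather than PostgreSQL, uses a recursive CTE in place of generate_series, and uses a hypothetical upper bound of 6 days (LAST_DAY) instead of 100 to keep the output short. The correlated subquery is the same idea as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE days (id INTEGER, money_earned INTEGER, days_since_start INTEGER)"
)
conn.executemany(
    "INSERT INTO days VALUES (?, ?, ?)",
    [(1, 1000, 1), (1, 2000, 2), (1, 3000, 4), (1, 2000, 5),
     (2, 1000, 1), (2, 100, 3)],
)

LAST_DAY = 6  # hypothetical fixed upper bound, standing in for the 100 above

query = """
WITH RECURSIVE gen_days(id, days_since_start) AS (
    SELECT DISTINCT id, 1 FROM days
    UNION ALL
    SELECT id, days_since_start + 1
    FROM gen_days
    WHERE days_since_start < :last_day
)
SELECT g.id,
       COALESCE(d.money_earned,
                -- most recent earlier value for the same id, if any
                (SELECT d4.money_earned
                 FROM days d4
                 WHERE d4.id = g.id
                   AND d4.days_since_start < g.days_since_start
                 ORDER BY d4.days_since_start DESC
                 LIMIT 1)) AS money_earned,
       g.days_since_start
FROM gen_days g
LEFT JOIN days d
       ON d.id = g.id
      AND d.days_since_start = g.days_since_start
ORDER BY g.id, g.days_since_start
"""
carried = conn.execute(query, {"last_day": LAST_DAY}).fetchall()

for row in carried:
    print(row)
```

With this data, day 3 for id 1 takes the value from day 2, and every day after the last real row of an id repeats that row's value.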
I have a table historical_data
ID  Date        column_a  column_b
--  ----------  --------  --------
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
2   2011-01-12  q         q1
2   2011-02-01  d         d1
3   2011-11-01  s         s1
I need to retrieve the whole history of an ID whenever any one row for that ID meets a date condition.
date>='2011-11-01' should get me
ID  Date        column_a  column_b
--  ----------  --------  --------
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
3   2011-11-01  s         s1
I am aware you can get this by using a CTE or a subquery like
with selected_id as (
select id from historical_data where date>='2011-11-01'
)
select hd.* from historical_data hd
inner join selected_id si on hd.id = si.id
or
select * from historical_data
where id in (select id from historical_data where date>='2011-11-01')
In both these methods I have to query/scan the table `historical_data` twice.
I have indexes on both id and date so it's not a problem right now, but as the table grows this may cause issues.
The table above is a sample table, the table I have is about to touch 1TB in size with upwards of 600M rows.
Is there any way to achieve this by only querying the table once? (I am using Snowflake)
Using QUALIFY:
SELECT *
FROM historical_data
QUALIFY MAX(date) OVER(PARTITION BY id) >= '2011-11-01'::DATE;
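The reason this works: an id has at least one row with date >= the cutoff exactly when its latest date is >= the cutoff, so comparing the per-id MAX is enough. Below is a small sketch of that logic using Python's sqlite3 module. SQLite has no QUALIFY, so the window function sits in a subquery and is filtered from outside (this needs SQLite 3.25 or newer for window functions). It only illustrates the result, not how Snowflake will plan or scan the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE historical_data (id INTEGER, date TEXT, column_a TEXT, column_b TEXT)"
)
conn.executemany(
    "INSERT INTO historical_data VALUES (?, ?, ?, ?)",
    [
        (1, "2011-10-01", "a", "a1"),
        (1, "2011-11-01", "w", "w1"),
        (1, "2011-09-01", "a", "a1"),
        (2, "2011-01-12", "q", "q1"),
        (2, "2011-02-01", "d", "d1"),
        (3, "2011-11-01", "s", "s1"),
    ],
)

# ISO-formatted date strings compare correctly as text.
query = """
SELECT id, date, column_a, column_b
FROM (
    SELECT *, MAX(date) OVER (PARTITION BY id) AS max_date
    FROM historical_data
)
WHERE max_date >= '2011-11-01'
ORDER BY id, date
"""
history = conn.execute(query).fetchall()

for row in history:
    print(row)
```

All three rows for id 1 and the single row for id 3 are returned, while id 2 is dropped because its latest date is before the cutoff.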
Look at this sql request:
select distinct erp.users.id
from erp.users
inner join prod.referral_order_delivered
on erp.users.id= prod.referral_order_delivered.user_id::uuid
inner join erp.orders
on erp.orders."userId"::uuid= erp.users.id
where
"paidAt"::date >= '2016-06-07'
and "paidAt"::date <= '2017-07-07'
Let’s say I get a result like this one:
id
2
1
4
5
Now I want to count how many times each of these ids appears as a value of the userId column in the erp.orders table.
For example, if I have erp.orders.userId which is:
userId
2
2
1
4
4
5
5
5
I want the request that is gonna return this:
id number_of_id
2 2
1 1
4 2
5 3
Any ideas?
You need to use the count() function and a group by clause. It'll look something like:
select
erp.users.id
, count(1)
from
erp.users
inner join prod.referral_order_delivered
on erp.users.id = prod.referral_order_delivered.user_id::uuid
inner join erp.orders
on erp.orders."userId"::uuid = erp.users.id
where
"paidAt"::date >= '2016-06-07'
and "paidAt"::date <= '2017-07-07'
group by
erp.users.id
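As a quick check of the count/group by idea against the sample values from the question, here is a stripped-down sketch using Python's sqlite3 module. It is a simplification, not the full query: the table names users and orders are stand-ins without the schema prefixes, and the join to prod.referral_order_delivered and the "paidAt" filter are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER);
CREATE TABLE orders (userId INTEGER);
INSERT INTO users VALUES (2), (1), (4), (5);
INSERT INTO orders VALUES (2), (2), (1), (4), (4), (5), (5), (5);
""")

# One output row per user id, with the number of matching order rows.
query = """
SELECT u.id, COUNT(*) AS number_of_id
FROM users u
JOIN orders o ON o.userId = u.id
GROUP BY u.id
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

One thing to watch in the full query: if prod.referral_order_delivered can hold more than one row per user, that join multiplies the order rows and the count grows with it.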
I have 3 tables, let's say Root, Detail and Revision.
I need to select the distinct codes from Root with the highest revision date, taking into account that the revision lines may not exist and/or may have repeated values in the date column.
Root: idRoot, Code
Detail: idDetail, price, idRoot
Revision: idRevision, date, idDetail
So, I've started with the join query:
select code, price, date from Root r
inner join Detail d on d.idRoot = r.idRoot
left join Revision rv on d.idDetail = rv.idDetail;
Having table results like this:
CODE  PRICE  DATE      idRevision
----  -----  --------  ----------
C1    100    2/1/2016  1
C1    120    2/1/2016  3
C1    150    null      2
C1    200    1/1/2016  4
C2    300    null      null
C3    400    3/1/2016  6
But what I really need is the next result:
CODE  PRICE  DATE      idRevision
----  -----  --------  ----------
C1    120    2/1/2016  3
C2    300    null      null
C3    400    3/1/2016  6
I've seen several answers for similar cases, but never with null and repeated values:
Oracle: Taking the record with the max date
Fetch the row which has the Max value for a column
Oracle Select Max Date on Multiple records
Any kind of help would be really appreciated
You can use row_number():
select code, price, date
from (select code, price, date,
             row_number() over (partition by code order by date desc nulls last, idRevision desc) as seqnum
      from Root r inner join
           Detail d
           on d.idRoot = r.idRoot left join
           Revision rv
           on d.idDetail = rv.idDetail
     ) rdr
where seqnum = 1;
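To see how the ordering handles both the NULL dates and the tied dates, here is a small sketch using Python's sqlite3 module rather than Oracle. A few assumptions: the already-joined rows from the question are loaded into a single hypothetical table called joined, the date column is renamed rev_date, and the dates are rewritten as ISO strings (the question's format is ambiguous, but the relative order is the same either way). "rev_date IS NULL" is used as a portable stand-in for NULLS LAST, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined (code TEXT, price INTEGER, rev_date TEXT, idRevision INTEGER)"
)
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?, ?)",
    [
        ("C1", 100, "2016-01-02", 1),
        ("C1", 120, "2016-01-02", 3),
        ("C1", 150, None, 2),
        ("C1", 200, "2016-01-01", 4),
        ("C2", 300, None, None),
        ("C3", 400, "2016-01-03", 6),
    ],
)

# Per code: non-NULL dates first, newest date first, ties broken by highest idRevision.
query = """
SELECT code, price, rev_date, idRevision
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY code
               ORDER BY rev_date IS NULL, rev_date DESC, idRevision DESC
           ) AS seqnum
    FROM joined
)
WHERE seqnum = 1
ORDER BY code
"""
latest = conn.execute(query).fetchall()

for row in latest:
    print(row)
```

C2 still appears because a partition that contains only NULL dates keeps its single row as seqnum 1.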
I'm working on a problem which looks something like this:
I have a table with many columns, but the main ones are DepartmentId and EmployeeIds
Employee Ids Department Ids
------------------------------
A 1
B 1
C 1
D 1
AA 2
BB 2
CC 2
A1 3
B1 3
C1 3
D1 3
I want to write a SQL query such that I take out 2 sample EmployeeIds for each DepartmentID.
like
Employee Id Dept Ids
B 1
C 1
AA 2
CC 2
D1 3
A1 3
Currently I am writing the query,
select
EmployeeId, DeptIds, count(*)
from
table_name
group by 1,2
sample 2
but it gives me only two rows in total.
Any help?
If the number of departments is known and small, you could do a stratified sampling:
select *
from table_name
sample
when DeptIds = 1 then 2
when DeptIds = 2 then 2
when DeptIds = 3 then 2
end
Otherwise a combination of RANDOM and ROW_NUMBER:
select *
from
(
select EmployeeId, DeptIds, random(1,10000000) as rand
from table_name
) as dt
qualify
row_number()
over (partition by DeptIds
order by rand) <= 2
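The RANDOM plus ROW_NUMBER approach can be tried outside Teradata as well. The sketch below assumes SQLite (3.25 or newer) through Python's sqlite3 module. SQLite has neither QUALIFY nor the SAMPLE clause, so the row number is computed in a subquery and filtered from outside, and SQLite's RANDOM() takes no arguments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (EmployeeId TEXT, DeptIds INTEGER)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        ("A", 1), ("B", 1), ("C", 1), ("D", 1),
        ("AA", 2), ("BB", 2), ("CC", 2),
        ("A1", 3), ("B1", 3), ("C1", 3), ("D1", 3),
    ],
)

# Shuffle the rows inside each department, then keep the first two.
query = """
SELECT EmployeeId, DeptIds
FROM (
    SELECT EmployeeId, DeptIds,
           ROW_NUMBER() OVER (PARTITION BY DeptIds ORDER BY RANDOM()) AS rn
    FROM table_name
)
WHERE rn <= 2
ORDER BY DeptIds
"""
sample = conn.execute(query).fetchall()

for employee_id, dept_id in sample:
    print(employee_id, dept_id)
```

Each run returns two employees per department, and which two they are changes from run to run.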
Fee Table
FeeId(PK)  managerId  amount  Type
1          50         100     1
1          50         10000   39
1          50         50000   2
1          50         50000   3
1          50         50000   4
Manager Table
FeeId(FK)  Split  managerId
1          70     68
Desired Results:
FeeId managerId amount Type
1 50 30 1
1 68 70 1
1 50 3000 39
1 68 7000 39
1 50 15000 2
1 68 35000 2
1 50 15000 3
1 68 35000 3
1 50 15000 4
1 68 35000 4
This sample covers just one FeeId; there are many more FeeIds in my data, and a plain cross join would not take that into account. I basically want to cross join each manager with the fees that share its feeId.
The amount column is then recalculated with a 70/30 split for managerId 68 and 50 respectively.
How do I do a cross join on each subset: WHERE f.feeId = m.feeId to get the desired results?
Example of a cross join with incorrect results, since the manager table will have more than one fee:
SELECT
f.feeId,
(cast(m.split as decimal) / 100) * f.amount as amount
FROM
dbo.fee f
CROSS JOIN dbo.manager m
As I understand this problem, you are trying to allocate the amount in fee between the two managers. The following query does this by cross joining an additional table, which is used to choose the data for each row.
select f.feeid,
       (case when n.n = 1 then f.managerid
             when n.n = 2 then m.managerid
        end) as managerid,
       (case when n.n = 1 then f.amount * (100 - m.split)/100
             when n.n = 2 then f.amount * m.split/100
        end) as amount, f.type
from fee f join
     manager m
     on m.feeid = f.feeid cross join
     (select 1 as n union all select 2) as n;
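Here is a runnable sketch of the split, using Python's sqlite3 module instead of SQL Server. It matches fee rows to manager rows on feeId, so each fee is only split with its own manager row, and then cross joins a two-row helper (aliased nums here) to produce one output row per manager. The second fee (feeId 2, split 25, managers 51 and 69) is made-up data, added only to show that fees do not mix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fee (feeId INTEGER, managerId INTEGER, amount INTEGER, type INTEGER);
CREATE TABLE manager (feeId INTEGER, split INTEGER, managerId INTEGER);
INSERT INTO fee VALUES
    (1, 50, 100, 1), (1, 50, 10000, 39), (1, 50, 50000, 2),
    (1, 50, 50000, 3), (1, 50, 50000, 4);
INSERT INTO manager VALUES (1, 70, 68);
-- hypothetical second fee, not part of the question's data
INSERT INTO fee VALUES (2, 51, 200, 1);
INSERT INTO manager VALUES (2, 25, 69);
""")

# n = 1 keeps the fee's own manager with the remainder; n = 2 is the manager from the split row.
query = """
SELECT f.feeId,
       CASE nums.n WHEN 1 THEN f.managerId ELSE m.managerId END AS managerId,
       CASE nums.n WHEN 1 THEN f.amount * (100 - m.split) / 100
                   ELSE f.amount * m.split / 100 END AS amount,
       f.type
FROM fee f
JOIN manager m ON m.feeId = f.feeId
CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2) AS nums
ORDER BY f.feeId, f.type, nums.n
"""
split_rows = conn.execute(query).fetchall()

for row in split_rows:
    print(row)
```

The amounts use integer arithmetic, which is exact for this sample; amounts that do not divide evenly would need a decimal type.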
As a comment, this seems like a very unusual data structure.
It seems like this should work:
SELECT f.feeId, f.managerID, (cast(m.split as decimal) / 100) * f.amount as amount, f.type
FROM fee f
JOIN manager m
ON f.FeeID = m.FeeID
AND f.managerID = m.managerID