I have a table which has a large number of entries and from which I need only the first in each group.
The table stores daily fund prices (1000+ funds) over the last 30 years. For each fund that exists on a specific date, I need the last price on or before that date (so only one row per fund).
In its simplified form, the table has columns Date, FundCode and Price.
The following input
Date FundCode Price
2016/01/05 X123 1.234
2016/01/04 X123 1.233
2016/01/03 X123 1.222
2016/01/05 A456 1.876
2016/01/04 A456 1.822
2016/01/03 A456 1.776
2016/01/03 M234 3.234
...when queried for 2016/01/04, should produce
Date FundCode Price
2016/01/04 X123 1.233
2016/01/04 A456 1.822
2016/01/03 M234 3.234
I have a solution which uses a correlated subquery in the WHERE clause, but no amount of messing with indexes will make it run in a reasonable amount of time.
I'm sure there's a straightforward solution to this but I just can't see it.
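Something like
SELECT fundCode, price, MAX(date) AS date FROM your_table WHERE date <= 'date_you_need' GROUP BY fundCode
A query like this works in SQLite, which takes the non-aggregated columns from the row holding MAX(date); most other databases reject the bare price column. Which database do you use?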
A single nested query gives me the max date for each fund, then inner join to this on fundCode/Date, thus...
SELECT
H.Date,
H.FundCode,
H.Price
FROM
PriceHistory H
INNER JOIN
/* this nested query gives the max date for each fund */
(SELECT
H2.FundCode,
max(H2.Date) AS MaxDate
FROM
PriceHistory H2
WHERE
H2.Date<=#DateToSearchFor
GROUP BY
H2.FundCode) AS RowSelector
ON H.FundCode=RowSelector.FundCode AND H.Date=RowSelector.MaxDate
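For anyone who wants to try the join above, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The ISO date strings, the index name idx_fund_date and the helper last_prices are my own assumptions for illustration; a composite index on (FundCode, Date) is usually what a grouped MAX like this wants, but I have not measured it on the real table.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
# Dates are stored as ISO strings so that text comparison matches date order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PriceHistory (Date TEXT, FundCode TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO PriceHistory VALUES (?, ?, ?)",
    [
        ("2016-01-05", "X123", 1.234),
        ("2016-01-04", "X123", 1.233),
        ("2016-01-03", "X123", 1.222),
        ("2016-01-05", "A456", 1.876),
        ("2016-01-04", "A456", 1.822),
        ("2016-01-03", "A456", 1.776),
        ("2016-01-03", "M234", 3.234),
    ],
)
# Hypothetical index name; (FundCode, Date) supports the grouped MAX and the join.
conn.execute("CREATE INDEX idx_fund_date ON PriceHistory (FundCode, Date)")

QUERY = """
SELECT H.Date, H.FundCode, H.Price
FROM PriceHistory AS H
INNER JOIN (
    -- latest date on or before the search date, per fund
    SELECT FundCode, MAX(Date) AS MaxDate
    FROM PriceHistory
    WHERE Date <= ?
    GROUP BY FundCode
) AS RowSelector
  ON H.FundCode = RowSelector.FundCode AND H.Date = RowSelector.MaxDate
ORDER BY H.FundCode
"""


def last_prices(as_of):
    """Return one (Date, FundCode, Price) row per fund as of the given date."""
    return conn.execute(QUERY, (as_of,)).fetchall()


rows = last_prices("2016-01-04")
for row in rows:
    print(row)
```

Funds with no price on or before the search date simply drop out of the result, which matches the "each fund existing on that date" requirement.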
I have a sqlite3 database maintained on an AWS exchange that is regularly updated by a Python script. One of the things it tracks is when any team generates a new post for a given topic. The entries look something like this:
id   client           team      date        industry      city
895  acme industries  blueteam  2022-06-30  construction  springfield
I'm trying to create a table that shows me how many entries for construction occur each day. Right now, the entries with data populate, but they exclude dates with no entries. For example, if I search for just
SELECT date, count(id) as num_records
from mytable
WHERE industry = 'construction'
group by date
order by date asc
I'll get results that look like this:
date        num_records
2022-04-01  3
2022-04-04  1
How can I make SQLite produce output like this:
date        num_records
2022-04-01  3
2022-04-02  0
2022-04-03  0
2022-04-04  1
I'm trying to generate some graphs from this data and need to be able to include all dates for the target timeframe.
EDIT/UPDATE:
The table does not already include every date; it only includes dates relevant to an entry. If no team posts work on a day, the date column will jump from day 1 (e.g. 2022-04-01) to day 3 (2022-04-03).
Assuming your "mytable" table already contains every date you need (on at least one row, for any industry), you can first select the distinct dates, then LEFT JOIN your own query to them, and map the resulting NULL values of the "num_records" field to 0 using the COALESCE function. If some dates are missing from the table entirely, this query cannot produce them.
WITH cte AS (
SELECT date,
COUNT(id) AS num_records
FROM mytable
WHERE industry = 'construction'
GROUP BY date
)
SELECT dates.date,
COALESCE(cte.num_records, 0) AS num_records
FROM (SELECT DISTINCT date FROM mytable) dates
LEFT JOIN cte
ON dates.date = cte.date
ORDER BY dates.date
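Since the update to the question says the table does not contain every date, one option is to generate the calendar in the query instead. The sketch below is only an illustration under my own assumptions: the CTE name calendar, its column day, the :start/:end parameters and the sample rows are invented for the example, and it is run through Python's sqlite3 module so it can be tried as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT, industry TEXT)"
)
# Made-up sample rows: three construction posts on 04-01, one on 04-04,
# and an unrelated retail post on 04-02 that must not be counted.
conn.executemany(
    "INSERT INTO mytable (date, industry) VALUES (?, ?)",
    [
        ("2022-04-01", "construction"),
        ("2022-04-01", "construction"),
        ("2022-04-01", "construction"),
        ("2022-04-02", "retail"),
        ("2022-04-04", "construction"),
    ],
)

QUERY = """
WITH RECURSIVE calendar(day) AS (
    SELECT :start                       -- first day of the target timeframe
    UNION ALL
    SELECT date(day, '+1 day')          -- step one day at a time
    FROM calendar
    WHERE day < :end                    -- stop once the last day is produced
)
SELECT calendar.day AS date,
       COUNT(m.id) AS num_records       -- COUNT skips NULLs, so empty days give 0
FROM calendar
LEFT JOIN mytable AS m
       ON m.date = calendar.day
      AND m.industry = 'construction'   -- filter in ON, not WHERE, to keep empty days
GROUP BY calendar.day
ORDER BY calendar.day
"""

rows = conn.execute(QUERY, {"start": "2022-04-01", "end": "2022-04-04"}).fetchall()
for row in rows:
    print(row)
```

The industry filter sits in the ON clause on purpose: putting it in WHERE would discard the calendar rows that have no matching entry.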
Thanks in advance.
I have Customer records that look like this:
Customer_Number  Create_Date
34343            01/22/2001
54554            03/03/2020
85296            01/01/2001
...
I have about a thousand of these records (customer number is unique) and the bossman wants to see how the number of customers has grown over time.
The output I need:
Customer_Count  Monthly_Bucket
7               01/01/2021
9               02/01/2021
13              03/01/2021
20              04/01/2021
The customer count is cumulative and the Monthly Bucket will just feed the graphing package to make a nice bar chart answering the question "how many customers do we have in total in a particular month and how is it growing over time".
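Try the following SELECT with a correlated sub-query. It counts every customer created before the end of each monthly bucket, so the count is cumulative across years:
SELECT Customer_Count=
(
SELECT COUNT(*)
FROM [Customer_Sales] s
WHERE s.[Create_Date] < DATEADD(MONTH, 1, b.Monthly_Bucket)
), b.Monthly_Bucket
FROM
(
SELECT DISTINCT DATEFROMPARTS(YEAR([Create_Date]), MONTH([Create_Date]), 1) AS Monthly_Bucket
FROM [Customer_Sales]
WHERE YEAR([Create_Date]) = ????
) b
ORDER BY b.Monthly_Bucket
Where [Customer_Sales] is the sales table and ???? is your year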
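If window functions are available, a running total is another way to get the cumulative count. What follows is a rough sketch, not a drop-in for the query above: it assumes a table I have called Customer, ISO-formatted Create_Date values, made-up sample rows, and SQLite 3.25 or later driven from Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (Customer_Number INTEGER PRIMARY KEY, Create_Date TEXT)"
)
# Made-up sample customers; note that nobody was created in March.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [
        (1, "2021-01-05"),
        (2, "2021-01-20"),
        (3, "2021-02-11"),
        (4, "2021-04-02"),
        (5, "2021-04-28"),
        (6, "2021-04-30"),
    ],
)

QUERY = """
WITH monthly AS (
    -- new customers per month, bucketed to the first day of the month
    SELECT date(Create_Date, 'start of month') AS Monthly_Bucket,
           COUNT(*) AS new_customers
    FROM Customer
    GROUP BY date(Create_Date, 'start of month')
)
SELECT SUM(new_customers) OVER (ORDER BY Monthly_Bucket) AS Customer_Count,
       Monthly_Bucket
FROM monthly
ORDER BY Monthly_Bucket
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

A month in which no customer was created (March in this sample) gets no row at all, so a chart that needs every month would still have to join against a list of months.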
I am only a beginner in SQL and I am encountering the following problem:
I have a table with a list of SKU orders where each row displays the SKU, DELIVERY DATE, and ORDER QUANTITY. I want to rearrange the table so that each row contains not only the delivery date for that quantity, but also the next delivery date after it.
The table currently looks like this:
SKU/ DELIVERY_DATE/ QUANTITY_ORDERED
1.SKUx 14/3/2020 200
2.SKUx 19/3/2020 400
3.SKUx 27/3/2020 550
What I want to achieve is this:
SKU/ DELIVERY_DATE/ **NEXT_DELIVERY_DATE**/ QUANTITY_ORDERED
1.SKUx 14/3/2020 **19/3/2020** 200
2.SKUx 19/3/2020 **27/3/2020** 400
3.SKUx 27/3/2020 **NULL** 550
Keep in mind, as shown above, that the number of days between two deliveries varies (5 days between 14/3 and 19/3, 8 days between 19/3 and 27/3), so I cannot just add a fixed offset to the delivery date, e.g.
SELECT SKU, DELIVERY_DATE,
DELIVERY_DATE + 5 AS NEXT_DELIVERY_DATE,
QUANTITY_ORDERED
FROM TABLE1
Any help is much appreciated!
Use lead():
select t1.*,
lead(delivery_date) over (partition by sku order by delivery_date) as next_delivery_date
from table1 t1
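To see what lead() returns on the sample rows, here is a small sketch run through Python's sqlite3 module. It assumes SQLite 3.25 or later (where window functions were added) and stores the delivery dates as ISO strings so they sort correctly; both of those are my assumptions, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (sku TEXT, delivery_date TEXT, quantity_ordered INTEGER)"
)
# Sample rows from the question, with dates rewritten as YYYY-MM-DD.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("SKUx", "2020-03-14", 200),
        ("SKUx", "2020-03-19", 400),
        ("SKUx", "2020-03-27", 550),
    ],
)

QUERY = """
SELECT sku,
       delivery_date,
       -- next delivery date within the same SKU; NULL for the last one
       LEAD(delivery_date) OVER (PARTITION BY sku ORDER BY delivery_date)
           AS next_delivery_date,
       quantity_ordered
FROM table1
ORDER BY sku, delivery_date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The last delivery of each SKU has no following row in its partition, so next_delivery_date comes back as NULL (None in Python), as in the desired output.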
I have tried and failed to solve this puzzle, and now I'm looking for some help if anyone is out there. Explanation:
The first table holds the original price of an item.
Table1 (Sales_Data):
+---------+----------+-------+------------+
| Item_Id | Store_Id | Price | Sales_Date |
+---------+----------+-------+------------+
The second table holds two kinds of prices:
one for when a store has a different price on the item permanently; this row has a 0 value in To_Date because the price should last forever (let's call this the forever price)
one for when a store has a different price on an item for just a period (02.03.2014-10.03.2014); let's call this the discount
Both prices are stored in Price; the dates are the only difference.
Table2 (Discount_Data):
+---------+----------+-------+------------+
| Item_Id | Store_Id | Price | Sales_Date |
+---------+----------+-------+------------+
Now to the big question:
The forever price should always override the original price
The discount price should always override the original or forever price for exactly its period
Item_Id and Store_Id have to match.
How can I go about solving this? Can anyone point me in the right direction?
I hate to make an assumption, but it appears your second table should include two more columns, From_Date and To_Date. I will alias Discount_Data forever price as ddf and Discount_Data period price as ddp
select sd.Item_Id,
sd.Store_Id,
sd.Price as OriginalPrice,
coalesce(ddp.Price,ddf.Price,sd.Price) as UpdatedPrice
from Sales_Data sd
left join Discount_Data ddf
on sd.Item_Id=ddf.Item_Id
and sd.Store_Id=ddf.Store_Id
and sd.Sales_Date >= ddf.From_Date
and ddf.To_Date=0
left join Discount_Data ddp
on sd.Item_Id=ddp.Item_Id
and sd.Store_Id=ddp.Store_Id
and sd.Sales_Date between ddp.From_Date and ddp.To_Date
Hope that helps.
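Here is a small sketch of how that precedence plays out, run with Python's sqlite3 module. Everything about the data is an assumption on my part: the From_Date and To_Date columns, ISO date strings, the text value '0' marking a forever price, and the sample rows. I also added an explicit To_Date <> '0' test on the period join so a forever row can never be picked up as a discount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales_Data (Item_Id INTEGER, Store_Id INTEGER, Price REAL, Sales_Date TEXT);
-- Assumed layout: To_Date = '0' marks a forever price.
CREATE TABLE Discount_Data (Item_Id INTEGER, Store_Id INTEGER, Price REAL,
                            From_Date TEXT, To_Date TEXT);
INSERT INTO Sales_Data VALUES
    (1, 10, 100.0, '2014-03-01'),   -- only the forever price applies
    (1, 10, 100.0, '2014-03-05'),   -- inside the discount period
    (2, 10,  50.0, '2014-03-05');   -- no override at all
INSERT INTO Discount_Data VALUES
    (1, 10, 90.0, '2014-01-01', '0'),            -- forever price
    (1, 10, 80.0, '2014-03-02', '2014-03-10');   -- period discount
""")

QUERY = """
SELECT sd.Item_Id,
       sd.Sales_Date,
       sd.Price AS OriginalPrice,
       -- discount first, then forever price, then the original price
       COALESCE(ddp.Price, ddf.Price, sd.Price) AS UpdatedPrice
FROM Sales_Data sd
LEFT JOIN Discount_Data ddf
       ON sd.Item_Id = ddf.Item_Id
      AND sd.Store_Id = ddf.Store_Id
      AND sd.Sales_Date >= ddf.From_Date
      AND ddf.To_Date = '0'
LEFT JOIN Discount_Data ddp
       ON sd.Item_Id = ddp.Item_Id
      AND sd.Store_Id = ddp.Store_Id
      AND ddp.To_Date <> '0'
      AND sd.Sales_Date BETWEEN ddp.From_Date AND ddp.To_Date
ORDER BY sd.Item_Id, sd.Sales_Date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If a store can have several overlapping forever prices or discounts for the same item, the joins would return more than one row per sale, so this only holds when at most one of each kind applies on a given date.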
I have two tables with the following (simplified) structures:
table "Factors" which holds data about purchased goods' factors and has these columns:
FactorSerial, PurchaseDate, PurchasedGood
table "Prices" which holds goods prices on different dates
Serial, GoodCode, EvaluationDate, Price
A price is valid until a new row with the same Code but different date is added and thus updates its value
Now, I want to create a table which adds the price to the table 1 according to purchase date.
So if we have:
PurchaseDate PurchasedGood
-----------------------------
05/20/2011 A111
and:
GoodCode EvaluationDate Price
--------------------------------
A111 02/01/2011 100
...
A111 04/01/2011 110
...
A111 06/01/2011 120
the result would be
PurchaseDate PurchasedGood Price
-----------------------------------
05/20/2011 A111 110
Preferred method is creating the view Prices1 as
Serial GoodCode StartDate EndDate Price
and then joining Factors with this view by
PurchasedDate between StartDate AND EndDate
Can anybody show me how to create the view Prices1 (or how to obtain the final result with any other method)? Thanks in advance!
P.S. sorry for my bad English!
I want to create a table which adds the price to the table 1 according to purchase date.
Here is a query that returns such data. The syntax is pretty standard SQL, I believe, but this was tested on SQL Server (looks like you may be using PostgreSQL with your "serial" naming).
select a.FactorSerial, a.PurchasedGood, a.PurchaseDate
, (select max(Price) from Prices where GoodCode = a.PurchasedGood and EvaluationDate = a.EvaluationDate) as Price
from (
select f.FactorSerial, f.PurchasedGood, f.PurchaseDate, max(p.EvaluationDate) as EvaluationDate
from Factors as f
join Prices as p on f.PurchasedGood = p.GoodCode
where f.PurchaseDate >= p.EvaluationDate
group by f.FactorSerial, f.PurchasedGood, f.PurchaseDate
) as a
This query assumes that there are no Purchases before a Price existed.
Also, considering:
Preferred method is creating the view Prices1 as
Serial GoodCode StartDate EndDate Price
and then joining Factors with this view by
PurchasedDate between StartDate AND EndDate
BETWEEN is inclusive. With the method you've described, you would get duplicates if a PurchaseDate falls on the EndDate of one row and the StartDate of the next.
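One way to build a view like Prices1 without the boundary problem is to take each row's EndDate from the next EvaluationDate for the same good and join on a half-open range (StartDate inclusive, EndDate exclusive). The sketch below is only an illustration: it assumes window-function support (SQLite 3.25 or later here, via Python's sqlite3 module), ISO date strings, and a second sample purchase that I added to exercise the boundary date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Factors (FactorSerial INTEGER PRIMARY KEY, PurchaseDate TEXT, PurchasedGood TEXT);
CREATE TABLE Prices (Serial INTEGER PRIMARY KEY, GoodCode TEXT, EvaluationDate TEXT, Price REAL);
INSERT INTO Factors VALUES
    (1, '2011-05-20', 'A111'),
    (2, '2011-04-01', 'A111');   -- added row: falls exactly on a price change
INSERT INTO Prices VALUES
    (1, 'A111', '2011-02-01', 100),
    (2, 'A111', '2011-04-01', 110),
    (3, 'A111', '2011-06-01', 120);

-- EndDate is the next evaluation date for the same good (NULL for the latest price).
CREATE VIEW Prices1 AS
SELECT Serial,
       GoodCode,
       EvaluationDate AS StartDate,
       LEAD(EvaluationDate) OVER (PARTITION BY GoodCode ORDER BY EvaluationDate) AS EndDate,
       Price
FROM Prices;
""")

QUERY = """
SELECT f.FactorSerial, f.PurchaseDate, f.PurchasedGood, p.Price
FROM Factors f
JOIN Prices1 p
  ON p.GoodCode = f.PurchasedGood
 AND f.PurchaseDate >= p.StartDate                         -- start is inclusive
 AND (p.EndDate IS NULL OR f.PurchaseDate < p.EndDate)     -- end is exclusive
ORDER BY f.FactorSerial
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The purchase on 2011-04-01 matches only the price that starts that day, not the one that ends there, so each purchase gets exactly one price. Purchases made before the first price exists still drop out of the inner join.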