MySQL: ORDER BY stat / age? - sql

I have an int field in my product table called product_stat which is incremented every time a product is looked at. I also have a date field called product_date_added.
To figure out the average visits per day a product gets, you calculate how many days the product has existed from the current date and the date the product was added, then divide product_stat by that number of days.
What I want to do is select a number of products and order them by visits per day, descending.
How can I do that?
Thanks!!!

Something like this should do the trick, using DATEDIFF to get the difference between two dates, and then dividing the product_stat column by that difference.
SELECT
p.*,
p.product_stat/DATEDIFF(CURDATE(),p.product_date_added) as visits_per_day
FROM products p
ORDER BY visits_per_day DESC
Although note that DATEDIFF only came around as of MySQL 4.1.1. If you're using an earlier version you should do "TO_DAYS(CURDATE()) - TO_DAYS(p.product_date_added)" instead.

You will want something like this:
SELECT product_name,
product_stat / datediff(now(), product_date_added) as 'VisitPerDay'
FROM product
ORDER by VisitPerDay DESC
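A minimal sketch of the same calculation, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. SQLite has no DATEDIFF, so a julianday() difference stands in for it here; the rows and the fixed "today" date are made up for illustration. The sketch also guards against a product added today: its day count is 0, and in MySQL dividing by zero yields NULL, which would sort that product last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "product_name TEXT, product_stat INTEGER, product_date_added TEXT)"
)
# Hypothetical sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("old_popular", 1000, "2024-01-01"),  # 100 days old
        ("new_popular", 60, "2024-04-08"),    # 2 days old
        ("added_today", 5, "2024-04-10"),     # 0 days old
    ],
)

# Fixed date so the example is reproducible; MySQL would use CURDATE().
TODAY = "2024-04-10"

# The julianday() difference plays the role of DATEDIFF().
# MAX(days, 1) treats a product added today as one day old,
# which avoids dividing by zero.
rows = conn.execute(
    """
    SELECT product_name,
           1.0 * product_stat
               / MAX(CAST(julianday(?) - julianday(product_date_added) AS INTEGER), 1)
               AS visits_per_day
    FROM products
    ORDER BY visits_per_day DESC
    """,
    (TODAY,),
).fetchall()

for name, visits_per_day in rows:
    print(name, visits_per_day)
```

In MySQL the equivalent guard would be GREATEST(DATEDIFF(CURDATE(), product_date_added), 1) in place of the bare DATEDIFF call.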

Related

Finding the initial sampled time window after using SAMPLE BY again

I can't seem to find a perhaps easy solution to what I'm trying to accomplish here, using SQL and, more importantly, QuestDB. I also find it hard to put my exact question into words so bear with me.
Input
My real input is different of course but a similar dataset or case is the gas_prices table on the demo page of QuestDB. On https://demo.questdb.io, you can directly write and run queries against some sample database, so it should be easy enough to follow.
The main task I want to accomplish is to find out which month was responsible for each year's highest gallon price.
Output
Using the following query, I can get the average gallon price per month just fine.
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
timestamp                    avg_per_month
2000-06-05T00:00:00.000000Z  1.6724
2000-07-05T00:00:00.000000Z  1.69275
2000-08-05T00:00:00.000000Z  1.635
...                          ...
Then I take all these monthly averages, group them by year, and return the maximum gallon price per year by wrapping the above query in a subquery, like so:
SELECT timestamp, max(avg_per_month) as max_per_year FROM (
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
) SAMPLE BY 12M
timestamp                    max_per_year
2000-01-05T00:00:00.000000Z  1.69275
2001-01-05T00:00:00.000000Z  1.767399999999
2002-01-05T00:00:00.000000Z  1.52075
...                          ...
Wanted output
I want to know which month was responsible for the maximum price of a year.
Looking at the output of the above query, we see that the maximum gallon price for the year 2000 was 1.69275. Which month of the year 2000 had this amount as its average price? I'd like to display this month in an additional column.
For the first row, July 2000 is shown in the additional column for year 2000 because it is responsible for the highest average price in 2000. For the second row, it was May 2001 as that month had the highest average price of 2001.
timestamp                    max_per_year    which_month_is_responsible
2000-01-05T00:00:00.000000Z  1.69275         2000-07-05T00:00:00.000000Z
2001-01-05T00:00:00.000000Z  1.767399999999  2001-05-05T00:00:00.000000Z
...                          ...
What did I try?
I tried by adding a subquery to the SELECT to have a "duplicate" of some sort for the timestamp column but that's apparently never valid in QuestDB (?), so probably the solution is by adding even more subqueries in the FROM? Or a UNION?
Who can help me out with this? The data is there in the database and it can be calculated. It's just a matter of getting it out.
I think 'wanted output' can be achieved with window functions.
Please have a look at:
CREATE TABLE electricity (ts TIMESTAMP, consumption DOUBLE) TIMESTAMP(ts);
INSERT INTO electricity
SELECT (x*1000000)::timestamp, rnd_double()
FROM long_sequence(10000000);
SELECT day, ts, max_per_day
FROM
(
SELECT timestamp_floor('d', ts) as day,
ts,
avg_in_15_min as max_per_day,
row_number() OVER (PARTITION BY timestamp_floor('d', ts) ORDER BY avg_in_15_min desc) as rn_per_day
FROM
(
SELECT ts, avg(consumption) as avg_in_15_min
FROM electricity
SAMPLE BY 15m
)
) WHERE rn_per_day = 1
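The same row_number() pattern can be mapped back onto the month-per-year question. The sketch below is only an illustration under assumptions: it uses SQLite (3.25 or later, for window functions) from Python instead of QuestDB, replaces SAMPLE BY with a GROUP BY on strftime(), and runs on a handful of made-up prices rather than the demo gas_prices data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gas_prices (timestamp TEXT, galon_price REAL)")
# Hypothetical sample prices, not the real demo data.
conn.executemany(
    "INSERT INTO gas_prices VALUES (?, ?)",
    [
        ("2000-06-05", 1.60), ("2000-06-19", 1.70),
        ("2000-07-03", 1.80), ("2000-07-17", 1.90),
        ("2001-04-02", 1.50), ("2001-05-07", 2.00),
    ],
)

# Innermost query: average per month (stand-in for SAMPLE BY 1M).
# Middle query: number the months of each year, highest average first.
# Outer query: keep only the top month of each year.
rows = conn.execute(
    """
    SELECT year, max_per_year, month AS which_month_is_responsible
    FROM (
        SELECT substr(month, 1, 4) AS year,
               month,
               avg_per_month AS max_per_year,
               row_number() OVER (
                   PARTITION BY substr(month, 1, 4)
                   ORDER BY avg_per_month DESC
               ) AS rn_per_year
        FROM (
            SELECT strftime('%Y-%m', timestamp) AS month,
                   avg(galon_price) AS avg_per_month
            FROM gas_prices
            GROUP BY month
        )
    )
    WHERE rn_per_year = 1
    ORDER BY year
    """
).fetchall()

for year, max_per_year, month in rows:
    print(year, max_per_year, month)
```

Because the row that carries the yearly maximum is selected whole, the month comes along with it and no separate max() step is needed.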

SQL calculating running total as you go down the rows but also taking other fields into account

I'm hoping you guys can help with this problem.
I have a set of data which I have displayed via Excel.
I'm trying to work out the rolling new cap allowance but need to deduct the previous weeks' bookings. I don't want to use a cursor, so can anyone help?
I'm going to group by the product id so it will need to start afresh for every product.
In the image, Columns A to D are fixed and I am trying to calculate the data in column E ('New Cap'). The 'New Cap' is the expected results.
Column F gives a detailed formula of what I'm trying to do.
Not sure what I've done for the post to be marked down.
Thanks
Update:
The formula looks like this.
You want the sum of the cap through this row minus the sum of booked through the previous row. This is easy to do with window functions:
select t.*,
(sum(cap - booked) over (partition by productid order by weekbeg) + booked
) as new_cap
from t;
You can get the new running total using lag and sum over window functions - calculate the cap-booked first, then use sum over() for the running total:
select weekbeg, ProductId, Cap, Booked,
Sum(n) over(partition by productid order by weekbeg) New_Cap
from (
select *, cap - Lag(booked,1,0) over(partition by productid order by weekbeg) as n
from t
)t
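As a rough check that the two formulations agree, here is a small sketch using SQLite (3.25 or later) from Python. The original spreadsheet is not available here, so the rows are invented, and the reading of "New Cap" as "cap summed through this row minus booked summed through the previous row" is taken from the answers above rather than from the image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (weekbeg TEXT, productid INTEGER, cap INTEGER, booked INTEGER)"
)
# Hypothetical rows for two products.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", 1, 10, 4),
        ("2024-01-08", 1, 10, 7),
        ("2024-01-15", 1, 10, 2),
        ("2024-01-01", 2, 5, 5),
        ("2024-01-08", 2, 5, 0),
    ],
)

# Formulation 1: running sum of (cap - booked), then add back the current
# row's booked so that only previous weeks' bookings are deducted.
running_sum = conn.execute(
    """
    SELECT sum(cap - booked) OVER (PARTITION BY productid ORDER BY weekbeg)
           + booked AS new_cap
    FROM t
    ORDER BY productid, weekbeg
    """
).fetchall()

# Formulation 2: subtract the previous row's booked with lag(), then
# take a running sum of that difference.
lag_then_sum = conn.execute(
    """
    SELECT sum(n) OVER (PARTITION BY productid ORDER BY weekbeg) AS new_cap
    FROM (
        SELECT *,
               cap - lag(booked, 1, 0)
                     OVER (PARTITION BY productid ORDER BY weekbeg) AS n
        FROM t
    )
    ORDER BY productid, weekbeg
    """
).fetchall()

new_cap_1 = [r[0] for r in running_sum]
new_cap_2 = [r[0] for r in lag_then_sum]
print(new_cap_1)
print(new_cap_2)
```

Partitioning by productid is what makes the total start afresh for every product.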

Sum dates with different timestamps and picking the min date?

Beginner here. I want to have only one row for each delivery date but it is important to keep the hours and the minutes. I have the following table in Oracle (left):
As you can see there are days that a certain SKU (e.g SKU A) was delivered twice in the same day. The table on the right is the desired result. Essentially, I want to have the quantities that arrived on the 28th summed up and in the Supplier_delivery column I want to have the earliest delivery timestamp.
I need to keep the hours and the minutes; otherwise I know I could achieve this by writing something like: SELECT SKU, TRUNC(TO_DATE(SUPPLIER_DELIVERY), 'DDD'), SUM(QTY) FROM TABLE GROUP BY SKU , TRUNC(TO_DATE(SUPPLIER_DELIVERY), 'DDD')
Any ideas?
You can use MIN():
SELECT SKU, MIN(SUPPLIER_DELIVERY), SUM(QTY)
FROM TABLE
GROUP BY SKU, TRUNC(SUPPLIER_DELIVERY);
This assumes that SUPPLIER_DELIVERY is a date and does not need to be converted to one. But it would work with TO_DATE() in the GROUP BY as well.
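A small sketch of the idea, using SQLite from Python rather than Oracle: the grouping key is the day only, while MIN() is applied to the full timestamp so the hours and minutes of the earliest delivery survive. The table name deliveries and its rows are hypothetical, and SQLite's date() stands in for Oracle's TRUNC().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deliveries (sku TEXT, supplier_delivery TEXT, qty INTEGER)"
)
# Hypothetical rows: SKU A is delivered twice on the 28th.
conn.executemany(
    "INSERT INTO deliveries VALUES (?, ?, ?)",
    [
        ("A", "2021-05-28 09:15", 10),
        ("A", "2021-05-28 14:40", 5),
        ("A", "2021-05-29 08:00", 7),
        ("B", "2021-05-28 11:30", 3),
    ],
)

# Group on the day, but aggregate the untruncated timestamp with MIN().
rows = conn.execute(
    """
    SELECT sku,
           MIN(supplier_delivery) AS first_delivery,
           SUM(qty) AS total_qty
    FROM deliveries
    GROUP BY sku, date(supplier_delivery)
    ORDER BY sku, first_delivery
    """
).fetchall()

for row in rows:
    print(row)
```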

Querying SQLITE DB for Data from One Column Based On Another Column

I hope the title of this post makes sense.
The db in question has two columns that are related to my issue: a date column that follows the format xx/xx/xxxx and a price column. What I want to do is get a sum of the prices in the price column based on the month and year in which they occurred, but that information is in the date column. Doing so will allow me to determine the total for a given month of a given year. The problem is I have no idea how to construct a query that would do what I need. I have done some reading on the web, but I'm not really sure how to go about this. Can anyone provide some advice/tips?
Thanks for your time!
Mike
I was able to find a solution using a LIKE clause:
SELECT sum(price) FROM purchases WHERE date LIKE '11%1234%'
The "11" could be any 2-digit month and the "1234" is any 4 digit year. The % sign acts as a wildcard. This query, for example, returns the sum of any prices that were from month 11 of year 1234 in the db.
Thanks for your input!
You cannot use the built-in date functions on these date values because you have stored them formatted for display instead of in one of the supported date formats.
If the month and day fields always have two digits, you can use substr:
SELECT substr(MyDate, 7, 4) AS Year,
substr(MyDate, 1, 2) AS Month,
sum(Price)
FROM Purchases
GROUP BY Year,
Month
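So, the goal is to get an aggregate grouped by month? If the dates were stored in a format that SQLite's date functions understand (such as YYYY-MM-DD), this would do it, although grouping on '%m' alone combines the same month from different years: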
select strftime('%m', mydate), sum(price)
from mytable
group by strftime('%m', mydate)
Look into group by
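To make the difference between the approaches concrete, here is a short sketch with Python's sqlite3 module. It assumes the xx/xx/xxxx values are MM/DD/YYYY with two-digit month and day, and the rows are invented. It groups with substr() as suggested above and also shows that strftime() returns NULL for a value in this format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (MyDate TEXT, Price REAL)")
# Hypothetical rows, assuming MM/DD/YYYY.
conn.executemany(
    "INSERT INTO Purchases VALUES (?, ?)",
    [
        ("11/03/2020", 10.0),
        ("11/20/2020", 5.0),
        ("12/01/2020", 2.5),
        ("11/15/2021", 4.0),
    ],
)

# substr() is 1-based: characters 7-10 are the year, 1-2 the month.
totals = conn.execute(
    """
    SELECT substr(MyDate, 7, 4) AS Year,
           substr(MyDate, 1, 2) AS Month,
           sum(Price) AS Total
    FROM Purchases
    GROUP BY Year, Month
    ORDER BY Year, Month
    """
).fetchall()

# strftime() cannot parse this format, so it yields NULL (None in Python).
unparsed = conn.execute("SELECT strftime('%m', '11/03/2020')").fetchone()[0]

for row in totals:
    print(row)
print(unparsed)
```

Storing the dates as YYYY-MM-DD in the first place would let strftime('%Y-%m', MyDate) do the grouping directly.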

DB2 - Ranking data by timeframe

I am trying to write a report (DB2 9.5 on Solaris) to do the following:
I have a set of data, let's say it's an order table. I want to run a report which will give me, for each month, the number of orders per customer and their "rank" that month. The rank would be based on the number of orders. I was playing around with the RANK() OVER clauses, but I can't seem to get it to give me a rank per month (or other "group by"). If there are 100 customers and 12 months of data, I would expect 1200 rows in the report, 100 per month, each with a rank between 1 and 100. Let me know if more detail would be helpful. Thanks in advance.
The solution is to use the PARTITION BY clause inside the OVER clause, so that the rank restarts for each month.
For example, see page 5 here: http://cmsaville.ca/documents/MiscDocs/TopNQueries.pdf
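A minimal sketch of RANK() with PARTITION BY, run on SQLite (3.25 or later) from Python rather than DB2. The orders table, its columns, and its rows are hypothetical; the point is only that partitioning by month makes the rank start again at 1 in every month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT)")
# Hypothetical orders for two customers over two months.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("alice", "2024-01-03"), ("alice", "2024-01-10"), ("alice", "2024-01-20"),
        ("bob", "2024-01-05"),
        ("alice", "2024-02-01"),
        ("bob", "2024-02-02"), ("bob", "2024-02-09"),
    ],
)

# Inner query: orders per customer per month.
# Outer query: rank customers within each month by order count.
rows = conn.execute(
    """
    SELECT month, customer, order_count,
           RANK() OVER (PARTITION BY month ORDER BY order_count DESC)
               AS rank_in_month
    FROM (
        SELECT strftime('%Y-%m', order_date) AS month,
               customer,
               COUNT(*) AS order_count
        FROM orders
        GROUP BY month, customer
    )
    ORDER BY month, rank_in_month
    """
).fetchall()

for row in rows:
    print(row)
```

Note that a customer with no orders in a month produces no row here; getting a full customer-by-month grid would need an outer join against the list of all customers and months.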