How to summarize dates in a table with different timestamps - sql

When I ask this question, I get multiple hits on divcode because there are different timestamps in pickdate. Is there a way to get the question to summarize the date so I get 1 value per division?
/*
Söker Vikt och volym Snitt
*/
select TO_CHAR(ROUND(O08T1.pickdate),'YYYY-MM-DD') as date_1,
O08T1.divcode,
sum(O08T1.calcwght) as Vikt, sum(O08T1.calcvol) as Volym,
count(*) as AntalOrder,
avg(O08T1.calcwght) as avg_vikt,
avg(O08T1.calcvol) as avg_Vol
from O08T1
where O08T1.pickdate >= #('Från datum',#DATE)
group by O08T1.pickdate, O08T1.divcode
order by Pickdate DESC

If you want the values grouped by day then don't use GROUP BY pickdate as that will group by each instant and have values down to an accuracy of a second. Instead, GROUP BY TRUNC(pickdate) which will truncate values to the start of the day and will aggregate all values on the same day together (and then, also, use TRUNC in the first line and in the ORDER BY):
select TO_CHAR(TRUNC(pickdate),'YYYY-MM-DD') as date_1,
divcode,
sum(calcwght) as Vikt,
sum(calcvol) as Volym,
count(*) as AntalOrder,
avg(calcwght) as avg_vikt,
avg(calcvol) as avg_Vol
from O08T1
where pickdate >= #('Från datum',#DATE)
group by TRUNC(pickdate),
divcode
order by TRUNC(pickdate) DESC

Related

How to conditional SQL select

My table consists of user_id, revenue, publish_month columns.
Right now I use group_by user_id and sum(revenue) to get revenue for all individual users.
Is there a single SQL query I can use to query for user revenue across a time period conditionally? If for a specific user, there is a row for this month, I want to query for this month, last month and the month before. If there is not yet a row for this month, I want to query for last month and the two months before.
Any advice with which approach to take would be helpful. If I should be using cases, if-elses with exists or if this is do-able with a single SQL query?
UPDATE---since I did a bad job of describing the question, I've come to include some example data and expected results
Where current month is not present for user 33
Where current month is present
Assuming publish_month is a DATE datatype, this should get the most recent three months of data per user...
SELECT
user_id, SUM(revenue) as s_revenue
FROM
(
SELECT
user_id, revenue, publish_month,
MAX(publish_month) OVER (PARTITION BY user_id) AS user_latest_publish_month
FROM
yourtableyoudidnotname
)
summarised
WHERE
publish_month >= DATEADD(month, -2, user_latest_publish_month)
GROUP BY
user_id
If you want to limit that to the most recent 3 months out of the last 4 calendar months, just add AND publish_month >= DATEADD(month, -3, DATE_TRUNC(month, GETDATE()))
The ambiguity here is why it is important to include a Minimal Reproducible Example
With input data and require results, we could test our code against your requirements
If you're using strings for the publish_month, you shouldn't be, and should fix that with utmost urgency.
You can use a windowing function to "number" the months. In this way the most recent one will have a value of 1, the prior 2, and the one before 3. Then you can only select the items with a number of 3 or less.
Here is how:
SELECT user_id, revienue, publish_month,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY publish_month DESC) as RN
FROM yourtableyoudidnotname
now you just select the items with RN less than 3 and do your sum
SELECT user_id, SUM(revenue) as s_revenue
FROM (
SELECT user_id, revenue, publish_month,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY publish_month DESC) as RN
FROM yourtableyoudidnotname
) X
WHERE RN <= 3
GROUP BY user_id
You could also do this without a sub query if you use the windowing function for SUM and a range, but I think this is easier to understand.
From the comment -- there could be an issue if you have months from more than one year. To solve this make the biggest number in the order by always the most recent. so instead of
ORDER BY publish_month DESC
you would have
ORDER BY (100*publish_year)+publish_month DESC
This means more recent years will always have a higher number so january of 2023 will be 202301 while december of 2022 will be 202212. Since january is a bigger number it will get a row number of 1 and december will get a row number of 2.

How do I select a data every second with PostgreSQL?

I've got a SQL query that selects every data between two dates and now I would like to add the time scale factor so that instead of returning all the data it returns one data every second, minute or hour.
Do you know how I can achieve it ?
My query :
"SELECT received_on, $1 FROM $2 WHERE $3 <= received_on AND received_on <= $4", [data_selected, table_name, date_1, date_2]
The table input:
As you can see there are several data the same second, I would like to select only one per second
If you want to select data every second, you may use ROW_NUMBER() function partitioned by 'received_on' as the following:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY received_on ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
If you want to select data every minute or hour, you may use the extract function to get the number of seconds in 'received_on' and divide it by 60 to get the minutes or divide it by 3600 to get the hours.
epoch: For date and timestamp values, the number of seconds since 1970-01-01 00:00:00-00 (can be negative); for interval values, the total number of seconds in the interval
Group by minutes:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / 60) ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
Group by hours:
WITH DateGroups AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY floor(extract(epoch from (received_on)) / (60*60)) ORDER BY adc_v) AS rn
FROM table_name
)
SELECT received_on, adc_v, adc_i, acc_axe_x, acc_axe_y, acc_axe_z
FROM DateGroups
WHERE rn=1
ORDER BY received_on
See a demo.
When there are several rows per second, and you only want one result row per second, you can decide to pick one of the rows for each second. This can be a randomly chosen row or you pick the row with the greatest or least value in a column as shown in Ahmed's answer.
It would be more typical, though, to aggregate your data per second. The columns show figures and you are interested in those figures. Your sample data shows two times the value 2509 and three times the value 2510 for the adc_v column at 2022-07-29, 15:52. Consider what you would like to see. Maybe you don't want this value go below some boundary, so you show the minimum value MIN(adc_v) to see how low it went in the second. Or you want to see the value that occured most often in the second MODE(adc_v). Or you'd like to see the average value AVG(adc_v). Make this decision for every value, so as to get the informarion most vital to you.
select
received_on,
min(adc_v),
avg(adc_i),
...
from mytable
group by received_on
order by received_on;
If you want this for another interval, say an hour instead of the month, truncate your received_on column accordingly. E.g.:
select
date_trunc('hour', received_on) as received_hour,
min(adc_v),
avg(adc_i),
...
from mytable
group by date_trunc('hour', received_on)
order by date_trunc('hour', received_on);

Is there a way to count how many strings in a specific column are seen for the 1st time?

**Is there a way to count how many strings in a specific column are seen for
Since the value in the column 2 gets repeated sometimes due to the fact that some clients make several transactions in different times (the client can make a transaction in the 1st month then later in the next year).
Is there a way for me to count how many IDs are completely new per month through a group by (never seen before)?
Please let me know if you need more context.
Thanks!
A simple way is two levels of aggregation. The inner level gets the first date for each customer. The outer summarizes by year and month:
select year(min_date), month(min_date), count(*) as num_firsts
from (select customerid, min(date) as min_date
from t
group by customerid
) c
group by year(min_date), month(min_date)
order by year(min_date), month(min_date);
Note that date/time functions depends on the database you are using, so the syntax for getting the year/month from the date may differ in your database.
You can do the following which will assign a rank to each of the transactions which are unique for that particular customer_id (rank 1 therefore will mean that it is the first order for that customer_id)
The above is included in an inline view and the inline view is then queried to give you the month and the count of the customer id for that month ONLY if their rank = 1.
I have tested on Oracle and works as expected.
SELECT DISTINCT
EXTRACT(MONTH FROM date_of_transaction) AS month,
COUNT(customer_id)
FROM
(
SELECT
date_of_transaction,
customer_id,
RANK() OVER(PARTITION BY customer_id
ORDER BY
date_of_transaction ASC
) AS rank
FROM
table_1
)
WHERE
rank = 1
GROUP BY
EXTRACT(MONTH FROM date_of_transaction)
ORDER BY
EXTRACT(MONTH FROM date_of_transaction) ASC;
Firstly you should generate associate every ID with year and month which are completely new then count, while grouping by year and month:
SELECT count(*) as new_customers, extract(year from t1.date) as year,
extract(month from t1.date) as month FROM table t1
WHERE not exists (SELECT 1 FROM table t2 WHERE t1.id==t2.id AND t2.date<t1.date)
GROUP BY year, month;
Your results will contain, new customer count, year and month

Fetch records week wise for last one year in MS SQL

I have table with users data.Columns are Username,Datetime.
I need to fetch count of users weekly wise for last one year data from current date.
I wrote query to fetch last one year's record daywise but I couldn't find a way to retrieve data in weekly wise.
Below is the query to fetch data in day wise.
select count(Username)
from tablename
where (Datetime > DATEADD(year,-1,GETDATE()) and Datetime< GETDATE()-1)
group by Datetime
order by Datetime asc
Can anyone help me to fetch the weekwise count for the same in sql server?
TRY THIS: you have to group by week so you can done in the following way:
SELECT COUNT(Username)
from tablename
where (Datetime > DATEADD(year,-1,GETDATE()) and DATETIME < GETDATE()-1)
group by DATEPART(wk,[DATETIME])
order by [Datetime] ASC
Note: Please try to avoid reserved words as column name and you can enclosed existing one in [].
Use DATEPART IN GROUP BY clause.In where condition first filter one year record and then group by based on DATEPART function to get week wise count
SELECT COUNT(Username) , DATEPART(WEEK,Datetime)
FROM tablename
WHERE DATEDIFF(YEAR,Datetime,GETDATE()) <= 0 AND
DATEDIFF(YEAR,Datetime,DATEADD(YEAR,-1,GETDATE())) >= 0
GROUP BY DATEPART(WEEK,Datetime)
You can just change the group by and order by clause as below.
group by datepart(week,Datetime)
order by datepart(week,Datetime) asc
First, you need to specify extra column, pointing which week of the year it is.
You can create it like this (I assume that you have data from more years and let's call our column Date):
CASE WHEN Date BETWEEN '2017-01-01 00:00:00.000' AND '2018-01-01 00:00:00.000'
THEN floor(cast(datediff(dd, '2017-01-01 00:00:00.000',Date) as real) / 7)
END AS [Week]
Secondly, you have to run the query:
SELECT COUNT(Username) FROM tablename
GROUP BY [Week]
SELECT *, WEEK(created_on) week
FROM Table_Name
WHERE created_on > DATEADD(year,-1,GETDATE())
ORDER BY week

Last day of the month with a twist in SQLPLUS

I would appreciate a little expert help please.
in an SQL SELECT statement I am trying to get the last day with data per month for the last year.
Example, I am easily able to get the last day of each month and join that to my data table, but the problem is, if the last day of the month does not have data, then there is no returned data. What I need is for the SELECT to return the last day with data for the month.
This is probably easy to do, but to be honest, my brain fart is starting to hurt.
I've attached the select below that works for returning the data for only the last day of the month for the last 12 months.
Thanks in advance for your help!
SELECT fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,fd.column_name
FROM super_table fd,
(SELECT TRUNC(daterange,'MM')-1 first_of_month
FROM (
select TRUNC(sysdate-365,'MM') + level as DateRange
from dual
connect by level<=365)
GROUP BY TRUNC(daterange,'MM')) fom
WHERE fd.cust_id = :CUST_ID
AND fd.coll_date > SYSDATE-400
AND TRUNC(fd.coll_date) = fom.first_of_month
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
You probably need to group your data so that each month's data is in the group, and then within the group select the maximum date present. The sub-query might be:
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY YEAR(coll_date) * 100 + MONTH(coll_date);
This presumes that the functions YEAR() and MONTH() exist to extract the year and month from a date as an integer value. Clearly, this doesn't constrain the range of dates - you can do that, too. If you don't have the functions in Oracle, then you do some sort of manipulation to get the equivalent result.
Using information from Rhose (thanks):
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY TO_CHAR(coll_date, 'YYYYMM');
This achieves the same net result, putting all dates from the same calendar month into a group and then determining the maximum value present within that group.
Here's another approach, if ANSI row_number() is supported:
with RevDayRanked(itemDate,rn) as (
select
cast(coll_date as date),
row_number() over (
partition by datediff(month,coll_date,'2000-01-01') -- rewrite datediff as needed for your platform
order by coll_date desc
)
from super_table
)
select itemDate
from RevDayRanked
where rn = 1;
Rows numbered 1 will be nondeterministically chosen among rows on the last active date of the month, so you don't need distinct. If you want information out of the table for all rows on these dates, use rank() over days instead of row_number() over coll_date values, so a value of 1 appears for any row on the last active date of the month, and select the additional columns you need:
with RevDayRanked(cust_id, server_name, coll_date, rk) as (
select
cust_id, server_name, coll_date,
rank() over (
partition by datediff(month,coll_date,'2000-01-01')
order by cast(coll_date as date) desc
)
from super_table
)
select cust_id, server_name, coll_date
from RevDayRanked
where rk = 1;
If row_number() and rank() aren't supported, another approach is this (for the second query above). Select all rows from your table for which there's no row in the table from a later day in the same month.
select
cust_id, server_name, coll_date
from super_table as ST1
where not exists (
select *
from super_table as ST2
where datediff(month,ST1.coll_date,ST2.coll_date) = 0
and cast(ST2.coll_date as date) > cast(ST1.coll_date as date)
)
If you have to do this kind of thing a lot, see if you can create an index over computed columns that hold cast(coll_date as date) and a month indicator like datediff(month,'2001-01-01',coll_date). That'll make more of the predicates SARGs.
Putting the above pieces together, would something like this work for you?
SELECT fd.cust_id,
fd.server_name,
fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,
fd.column_name
FROM super_table fd,
WHERE fd.cust_id = :CUST_ID
AND TRUNC(fd.coll_date) IN (
SELECT MAX(TRUNC(coll_date))
FROM super_table
WHERE coll_date > SYSDATE - 400
AND cust_id = :CUST_ID
GROUP BY TO_CHAR(coll_date,'YYYYMM')
)
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)