SQL: Select only users who are new in 2021 - sql

If we have a table as follows:
User_ID | Order_date | Order_ID
1 | 2020-02-02 | 23
2 | 2021-03-03 | 45
1 | 2021-02-02 | 13
3 | 2019-05-23 | 34
3 | 2021-01-31 | 56
How to select only the user whose first order is in the year 2021 (in this case, only User 2)?

You can use aggregation:
select user_id
from t
group by user_id
having min(order_date) >= '2021-01-01'
   and min(order_date) < '2022-01-01';
This checks that the user's earliest order date falls within 2021: on or after January 1, 2021 and before January 1, 2022.
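As a quick check of the aggregation approach, here is a minimal sketch that runs it against the sample rows using Python's built-in sqlite3 module. The choice of SQLite is an assumption (the question names no engine), and the table name t follows the answer. Dates are stored as ISO-formatted text, which compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id INTEGER, order_date TEXT, order_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-02-02", 23),
        (2, "2021-03-03", 45),
        (1, "2021-02-02", 13),
        (3, "2019-05-23", 34),
        (3, "2021-01-31", 56),
    ],
)

# Keep users whose earliest order falls inside 2021 (lower and upper bound).
query = """
    SELECT user_id
    FROM t
    GROUP BY user_id
    HAVING MIN(order_date) >= '2021-01-01'
       AND MIN(order_date) <  '2022-01-01'
    ORDER BY user_id
"""
new_users = [row[0] for row in conn.execute(query)]
print(new_users)  # [2]
```

Users 1 and 3 are dropped because their earliest orders are from 2020 and 2019, even though both also ordered in 2021.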

Related

How to calculate total worktime per week [SQL]

I have a table of EMPLOYEES that contains information about the DATE and the WORKTIME for that day. For example:
ID | DATE | WORKTIME |
----------------------------------------
1 | 1-Sep-2014 | 4 |
2 | 2-Sep-2014 | 6 |
1 | 3-Sep-2014 | 5.5 |
1 | 4-Sep-2014 | 7 |
2 | 4-Sep-2014 | 4 |
1 | 9-Sep-2014 | 8 |
and so on.
Question: How can I create a query that calculates the amount of time worked per week (HOURS_PERWEEK)? I understand that I need to sum WORKTIME while grouping by both ID and week, but so far my attempts and googling didn't yield any results. Any ideas on this? Thank you in advance!
edit:
I got this solution:
select id, sum(worktime), trunc("DATE", 'IW') week
from employees
group by id, trunc("DATE", 'IW');
But I still need to connect that output back to the original table by populating a newly created column such as WEEKLY_TIME. Any hints on that?
You can find the start of the ISO week, which will always be a Monday, using TRUNC("DATE", 'IW').
So if, in the query, you GROUP BY the id and the start of the week, TRUNC("DATE", 'IW'), then you can SELECT the id and use SUM to aggregate the WORKTIME column for each id and week.
Since this appears to be a homework question and you haven't attempted a query, I'll leave it at this to point you in the correct direction and you can complete the query.
Update
Now I need to create another column (lets call it WEEKLY_TIME) and populate it with values from the current output, so that Sep 1,3,4 (for ID=1) would all contain value 16.5, specifying that on that day (that is within the certain week) that person worked 16.5 in total. And for ID=2 it would then be a value of 10 for both Sep 2 and 4.
For this, if I understand correctly, you appear to not want to use aggregation functions and want to use the analytic version of the function:
select id,
       "DATE",
       trunc("DATE", 'IW') week,
       worktime,
       sum(worktime) OVER (PARTITION BY id, trunc("DATE", 'IW'))
         AS weekly_time
from employees;
Which, for the sample data:
CREATE TABLE employees (ID, "DATE", WORKTIME) AS
SELECT 1, DATE '2014-09-01', 4 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-02', 6 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-03', 5.5 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-04', 7 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-04', 4 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-09', 8 FROM DUAL;
Outputs:
ID | DATE | WEEK | WORKTIME | WEEKLY_TIME
1 | 2014-09-01 00:00:00 | 2014-09-01 00:00:00 | 4 | 16.5
1 | 2014-09-03 00:00:00 | 2014-09-01 00:00:00 | 5.5 | 16.5
1 | 2014-09-04 00:00:00 | 2014-09-01 00:00:00 | 7 | 16.5
1 | 2014-09-09 00:00:00 | 2014-09-08 00:00:00 | 8 | 8
2 | 2014-09-04 00:00:00 | 2014-09-01 00:00:00 | 4 | 10
2 | 2014-09-02 00:00:00 | 2014-09-01 00:00:00 | 6 | 10
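The analytic SUM can be tried outside Oracle as well. The following is a rough sketch using Python's sqlite3 module, which is an assumption on two counts: it needs a SQLite build with window-function support (3.25 or later), and SQLite has no TRUNC(..., 'IW'), so the Monday of the ISO week is derived with date(d, 'weekday 0', '-6 days') (move forward to Sunday, then back six days).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE employees (id INTEGER, "DATE" TEXT, worktime REAL)')
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "2014-09-01", 4), (2, "2014-09-02", 6), (1, "2014-09-03", 5.5),
        (1, "2014-09-04", 7), (2, "2014-09-04", 4), (1, "2014-09-09", 8),
    ],
)

# Every row keeps its own worktime and also carries the total for its
# (id, ISO week) partition; no GROUP BY is involved.
query = """
    SELECT id,
           "DATE",
           worktime,
           date("DATE", 'weekday 0', '-6 days') AS week,
           SUM(worktime) OVER (
               PARTITION BY id, date("DATE", 'weekday 0', '-6 days')
           ) AS weekly_time
    FROM employees
    ORDER BY id, "DATE"
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the window function does not collapse rows, the result can be used directly to populate a per-day WEEKLY_TIME value.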
Edit: I submitted this answer without noticing the "Oracle" tag, so it uses SQL Server syntax. For Oracle, the question is answered here: Oracle SQL - Sum and group data by week
Select employee_Id,
       DATEPART(week, workday) as [Week],
       sum(worktime) as [Weekly Hours]
from WORK
group by employee_id, DATEPART(week, workday)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=238b229156a383fa3c466b6c3c2dee1e

Find most visited Hotel by month in PostgreSQL

I have a table of customers who each stayed in a hotel for one or more months. I need to find the 3 most visited hotels for each month. If a customer stayed in a hotel across three months, the stay counts once for each of those three months. To be more precise, below is the hotel table I have:
id | usr_id | srch_ci | srch_co | hotel_id
1 | 13 | 2021-10-01 | 2021-11-22 | 200
2 | 12 | 2021-10-11 | 2021-10-22 | 300
3 | 11 | 2021-10-28 | 2021-11-05 | 200
4 | 10 | 2021-10-28 | 2021-12-03 | 100
Result should look like below:
mnth | hotel_id | rnk | visits
2021-10 | 200 | 1 | 2
2021-10 | 100 | 2 | 1
2021-10 | 300 | 2 | 1
2021-11 | 200 | 1 | 2
2021-11 | 100 | 2 | 1
2021-12 | 100 | 1 | 1
As we can see above, usr_id = 10 stayed in hotel_id = 100 across 3 different months, so the stay counts as 1 visit for that hotel in each of those months. In 2021-12 only usr_id = 10 stayed anywhere, which is why hotel_id = 100 is ranked 1st for that month.
I solved the problem using the generate_series function in Postgres, which was what I was looking for. This link helped me: Splitting single row into multiple rows based on date
SELECT hotel_id, mnth, visits,
       -- RANK() gives tied hotels the same rank, matching the expected result
       RANK() OVER (PARTITION BY mnth ORDER BY visits DESC) AS rnk
FROM (
    SELECT hotel_id, to_char(live_mnth, 'YYYY-MM') AS mnth, count(*) AS visits
    FROM (
        SELECT id, usr_id, hotel_id, date_in, date_out,
               generate_series(date_in, date_out, '1 MONTH')::DATE AS live_mnth
        FROM (
            SELECT *,
                   TO_CHAR(srch_ci, 'yyyy-mm-01')::date AS date_in,
                   TO_CHAR(srch_co, 'yyyy-mm-01')::date AS date_out
            FROM hotels
        ) s
    ) s
    GROUP BY hotel_id, to_char(live_mnth, 'YYYY-MM')
) t
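To make the month expansion easier to follow, here is a small plain-Python sketch of the same idea: turn each stay into one entry per calendar month it touches, count entries per (month, hotel), then rank within each month. It is an illustration, not the Postgres query itself; the helper name months_covered is hypothetical, and ties are given the same rank because the expected result in the question shows two hotels sharing rank 2.

```python
from collections import Counter
from datetime import date

# (id, usr_id, srch_ci, srch_co, hotel_id) rows from the question
stays = [
    (1, 13, date(2021, 10, 1), date(2021, 11, 22), 200),
    (2, 12, date(2021, 10, 11), date(2021, 10, 22), 300),
    (3, 11, date(2021, 10, 28), date(2021, 11, 5), 200),
    (4, 10, date(2021, 10, 28), date(2021, 12, 3), 100),
]

def months_covered(check_in, check_out):
    """Yield 'YYYY-MM' for every calendar month a stay touches."""
    year, month = check_in.year, check_in.month
    while (year, month) <= (check_out.year, check_out.month):
        yield f"{year:04d}-{month:02d}"
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

# One visit per stay per month, keyed by (month, hotel).
visits = Counter()
for _id, _usr_id, check_in, check_out, hotel_id in stays:
    for mnth in months_covered(check_in, check_out):
        visits[(mnth, hotel_id)] += 1

ranked = []
for mnth in sorted({m for m, _ in visits}):
    in_month = [(h, v) for (m, h), v in visits.items() if m == mnth]
    in_month.sort(key=lambda row: (-row[1], row[0]))
    for hotel_id, count in in_month:
        # Tie-aware rank: 1 + number of hotels with strictly more visits.
        rnk = 1 + sum(1 for _, other in in_month if other > count)
        ranked.append((mnth, hotel_id, rnk, count))

for row in ranked:
    print(row)
```

Keeping only rows with rnk <= 3 would give the "3 most visited hotels" per month that the question asks for.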

Count distinct id between months previous year and same months current year Bigquery

I have a dataset in bigquery which contains order_date: DATE and customer_id.
order_date | CustomerID
2020-01-01 | 111
2020-02-01 | 112
2020-03-01 | 111
2021-01-01 | 113
2021-02-01 | 115
2021-03-01 | 119
How can I count distinct customer_id between the months of the previous year and the same months of the current year?
For example, from 2020-01-01 to 2021-01-01, then from 2020-02-01 to 2021-01-01, and so on until the current date and should be grouped by the latest date. The output looks like
order_date| count distinct CustomerID
2021-01-01| 5191
2021-02-01| 4859
2021-03-01| 3567
..........| ....
and the next periods shouldn't include the previous.
Thanks in advance.
If you want just a count for each month you can expand the data and aggregate:
select mon, count(distinct customerid)
from t cross join
unnest(generate_date_array(t.order_date, date_add(t.order_date, interval 11 month), interval 1 month)) mon
group by mon
order by mon;
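The expand-and-aggregate idea can be sketched in plain Python to see what the query counts: each order is copied into its own month and the following 11 months, and distinct customers are then counted per month. This is only an illustration under stated assumptions. The add_months helper is hypothetical, it clamps the day to the month's length, and the sample dates are all on the first of the month, so it makes no claim about how BigQuery steps from other days.

```python
import calendar
from collections import defaultdict
from datetime import date

# (order_date, CustomerID) rows from the question
orders = [
    (date(2020, 1, 1), 111),
    (date(2020, 2, 1), 112),
    (date(2020, 3, 1), 111),
    (date(2021, 1, 1), 113),
    (date(2021, 2, 1), 115),
    (date(2021, 3, 1), 119),
]

def add_months(d, n):
    """Shift a date by n months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + n
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# Each order contributes its customer to 12 consecutive months.
customers_by_month = defaultdict(set)
for order_date, customer_id in orders:
    for offset in range(12):
        customers_by_month[add_months(order_date, offset)].add(customer_id)

counts = {mon: len(ids) for mon, ids in sorted(customers_by_month.items())}
print(counts[date(2021, 1, 1)])  # 3 (customers 111, 112 and 113)
```

For 2021-01-01 the order from 2020-01-01 has already dropped out of the 12-month window, which is why only three customers are counted.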

IF Else or Case Function for SQL select problem

Hi, I would like to write a select expression using CASE or IF/ELSE. The logic seems simple, but I can't get it to work. I am joining two tables: the first is the customer record table with a date filter called min_del_date, and the second is the model scoring table with BIN and update_date parameters.
There are two pieces of logic I want to apply:
1. Pick the model score from the month before min_del_date.
2. If the model score for the month before delivery is greater than 50 (BIN > 50), pick the model score for the same month as min_del_date instead.
My code for the 1st logic is below:
with cust as (
select
distinct cust_no, max(del_date) as del_date, min(del_date) as min_del_date, (EXTRACT(YEAR FROM min(del_date)) -1900)*12 + EXTRACT(MONTH FROM min(del_date)) AS upd_seq
from customer.cust_history
group by 1
)
,model as (
select party_id, update_date, upd_seq, bin, var_data8, var_data2
from
(
select
party_id, update_date, bin, var_data8, var_data2,
(EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) AS upd_seq,
dense_Rank() over (partition by (EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) order by update_date desc) as rank1
from
(
select party_id,update_date, bin, var_data8, var_data2
from model.rpm_model
group by party_id,update_date, bin, var_data8, var_data2
) model
)model_final
where rank1 = 1
)
-- Add model scores
-- 1st logic Picking the model score that was the month before delivery date
select *
from
(
select cust.cust_no, cust.del_date, cust.min_del_date, model.upd_seq, model.bin
from cust
left join model
on cust.cust_no = model.party_id
and cust.upd_seq = model.upd_seq + 1
)a
Now I am struggling to add the 2nd logic to the same query. Any assistance would be appreciated.
cust table
cust_no | min_del_date | upd_seq
123 | 2021-01-11 | 1453
234 | 2020-06-29 | 1446
456 | 2020-07-20 | 1447
model table
party_id | update_date | upd_seq | BIN
123 | 2020-11-30 | 1451 | 22
123 | 2020-12-25 | 1452 | 54
123 | 2020-01-11 | 1453 | 14
234 | 2020-05-23 | 1445 | 76
234 | 2020-06-18 | 1446 | 48
234 | 2020-07-23 | 1447 | 12
456 | 2020-06-18 | 1446 | 23
456 | 2020-07-23 | 1447 | 39
456 | 2020-08-21 | 1448 | 21
desired results
cust_no | min_del_date | model.upd_seq | update_date | BIN
123 | 2021-01-11 | 1453 | 2020-01-11 | 14
234 | 2020-06-29 | 1446 | 2020-06-18 | 48
456 | 2020-07-20 | 1446 | 2020-06-18 | 23
Update
I managed to find the solution myself; thanks to everyone who looked at this question. The solution is below:
select a.cust_no, a.del_date, a.min_del_date, b.update_date, b.upd_seq, b.bin
from
(
    select cust.cust_no, cust.del_date, cust.min_del_date,
           CASE WHEN model.BIN <= 50 THEN model.upd_seq
                WHEN model.BIN > 50 THEN model.upd_seq + 1
                ELSE NULL END as upd_seq
    from cust
    inner join model
        on cust.cust_no = model.party_id
       and cust.upd_seq = model.upd_seq + 1
) a
inner join model b
    on a.cust_no = b.party_id
   and a.upd_seq = b.upd_seq
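The two-step join can be checked against the sample tables with a short sketch using Python's sqlite3 module. SQLite is an assumption here (the question does not name the database), and the simplified cust and model tables below contain only the columns shown in the sample data, so del_date is left out. The alias prev for the previous-month model row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cust  (cust_no INTEGER, min_del_date TEXT, upd_seq INTEGER);
    CREATE TABLE model (party_id INTEGER, update_date TEXT, upd_seq INTEGER, bin INTEGER);
    INSERT INTO cust VALUES
        (123, '2021-01-11', 1453), (234, '2020-06-29', 1446), (456, '2020-07-20', 1447);
    INSERT INTO model VALUES
        (123, '2020-11-30', 1451, 22), (123, '2020-12-25', 1452, 54), (123, '2020-01-11', 1453, 14),
        (234, '2020-05-23', 1445, 76), (234, '2020-06-18', 1446, 48), (234, '2020-07-23', 1447, 12),
        (456, '2020-06-18', 1446, 23), (456, '2020-07-23', 1447, 39), (456, '2020-08-21', 1448, 21);
""")

# Step 1 (subquery a): look at the previous month's score and decide which
# month to report: the previous month if bin <= 50, otherwise the same month.
# Step 2 (join b): fetch the model row for the month chosen in step 1.
query = """
    SELECT a.cust_no, a.min_del_date, b.upd_seq, b.update_date, b.bin
    FROM (
        SELECT cust.cust_no, cust.min_del_date,
               CASE WHEN prev.bin <= 50 THEN prev.upd_seq
                    WHEN prev.bin >  50 THEN prev.upd_seq + 1
               END AS upd_seq
        FROM cust
        JOIN model AS prev
          ON cust.cust_no = prev.party_id
         AND cust.upd_seq = prev.upd_seq + 1
    ) AS a
    JOIN model AS b
      ON a.cust_no = b.party_id
     AND a.upd_seq = b.upd_seq
    ORDER BY a.cust_no
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

With the sample data this reproduces the three rows of the desired results. Note that both joins are inner joins, so a customer with no model row for the previous month, or none for the chosen month, drops out of the result.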

SQL query to get records inserted over last 7 days grouped by day

I have two very similar tables that store purchases and downloads. They both look similar to this, with an id and a date:
id date
1 2020-06-15 18:25:27.415548+01
2 2020-06-15 11:03:30.157502+01
3 2020-06-15 17:09:15.592209+01
4 2020-06-14 18:29:18.332623+01
5 2020-06-13 18:09:31.990473+01
... many more rows ...
I would like to be able to execute a Postgres query that returns the count of all the purchases and downloads inserted over the last 7 days, grouped by day. An ideal response would look like this:
date purchase_count download_count
2020-06-13 37 64
2020-06-14 44 56
2020-06-15 34 63
2020-06-16 41 72
2020-06-17 30 40
2020-06-18 42 55
2020-06-19 9 22
One method uses aggregation with full join:
select dte, coalesce(d.downloads, 0) as downloads, coalesce(p.purchases, 0) as purchases
from (select date_trunc('day', date) as dte, count(*) as downloads
from downloads
group by dte
) d full join
(select date_trunc('day', date) as dte, count(*) as purchases
from purchases
group by dte
) p
using (dte)
order by dte;
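For experimenting with the per-day counts, here is a hedged sketch using Python's sqlite3 module. It deliberately swaps in a different technique: instead of a full join it stacks both tables with UNION ALL and uses conditional sums, because older SQLite builds lack FULL JOIN. It also adds a 7-day window, which the question asks for. The rows, the fixed today value, and the column aliases are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INTEGER, "date" TEXT);
    CREATE TABLE downloads (id INTEGER, "date" TEXT);
    INSERT INTO purchases VALUES
        (1, '2020-06-19 09:00:00'), (2, '2020-06-18 11:03:30'),
        (3, '2020-06-18 17:09:15'), (4, '2020-06-01 18:29:18');
    INSERT INTO downloads VALUES
        (1, '2020-06-19 10:00:00'), (2, '2020-06-17 12:00:00');
""")

today = "2020-06-19"  # fixed so the example is reproducible

# Tag each row with its source table, keep the last 7 calendar days
# (today and the 6 days before it), then count each kind per day.
query = """
    SELECT dte,
           SUM(kind = 'purchase') AS purchase_count,
           SUM(kind = 'download') AS download_count
    FROM (
        SELECT date("date") AS dte, 'purchase' AS kind FROM purchases
        UNION ALL
        SELECT date("date") AS dte, 'download' AS kind FROM downloads
    ) AS events
    WHERE dte > date(:today, '-7 days') AND dte <= :today
    GROUP BY dte
    ORDER BY dte
"""
daily = conn.execute(query, {"today": today}).fetchall()
for row in daily:
    print(row)
```

Days with neither a purchase nor a download do not appear at all in this sketch; producing zero rows for them would require joining against a generated list of dates.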