hive/sql:count each user_id gets how many uid

There is a table like:
+-----------+---------+------------+
| uid | user_id | month |
+-----------+---------+------------+
| d23fsdfsa | 101 | 2017-01-02 |
| 43gdasc | 102 | 2017-05-06 |
| b65hrfd | 101 | 2017-08-11 |
| 1wseda | 103 | 2017-09-13 |
| vdfhryd | 101 | 2017-08-06 |
| b6thd3d | 105 | 2017-05-03 |
| ve32h65 | 102 | 2017-01-02 |
| 43gdasc | 102 | 2017-09-06 |
+-----------+---------+------------+
How can I count the rows for each user_id so that rows falling in the same month are counted only once?
The final table should look like the one below (user_id 101 has two uids in the same month, so that month counts only once):
+---------+-----------+
| user_id | count_num |
+---------+-----------+
| 101 | 2 |
| 102 | 3 |
| 103 | 1 |
| 105 | 1 |
+---------+-----------+

If I understand correctly, you want the number of distinct months for each user. If so:
select user_id, count(distinct trunc(month, 'MONTH')) as count_num
from t
group by user_id;
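
As a quick sanity check, the same idea can be run against the sample data. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Hive, so the month is extracted with SQLite's strftime('%Y-%m', ...) where the answer uses trunc. The table name t comes from the answer; the rest of the setup is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (uid TEXT, user_id INTEGER, month TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("d23fsdfsa", 101, "2017-01-02"),
        ("43gdasc", 102, "2017-05-06"),
        ("b65hrfd", 101, "2017-08-11"),
        ("1wseda", 103, "2017-09-13"),
        ("vdfhryd", 101, "2017-08-06"),
        ("b6thd3d", 105, "2017-05-03"),
        ("ve32h65", 102, "2017-01-02"),
        ("43gdasc", 102, "2017-09-06"),
    ],
)

# Count distinct year-month values per user (SQLite stand-in for trunc(month, 'MONTH')).
rows = conn.execute(
    """
    SELECT user_id, COUNT(DISTINCT strftime('%Y-%m', month)) AS count_num
    FROM t
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)  # [(101, 2), (102, 3), (103, 1), (105, 1)]
```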

Related

PostgresSql:Comparing two tables and obtaining its result and compare it with third table

TABLE 2 : trip_delivery_sales_lines
+-------+---------------------+------------+----------+------------+-------------+--------+--+
| Sl no | Order_date | Partner_id | Route_id | Product_id | Product qty | amount | |
+-------+---------------------+------------+----------+------------+-------------+--------+--+
| 1 | 2020-08-01 04:25:35 | 34567 | 152 | 432 | 2 | 100 | |
| 2 | 2021-09-11 02:25:35 | 34572 | 130 | 312 | 4 | 150 | |
| 3 | 2020-05-10 04:25:35 | 34567 | 152 | 432 | 3 | 123 | |
| 4 | 2021-02-16 01:10:35 | 34572 | 130 | 432 | 5 | 123 | |
| 5 | 2020-02-19 01:10:35 | 34567 | 152 | 432 | 2 | 600 | |
| 6 | 2021-03-20 01:10:35 | 34569 | 152 | 123 | 1 | 123 | |
| 7 | 2021-04-23 01:10:35 | 34570 | 152 | 432 | 4 | 200 | |
| 8 | 2021-07-08 01:10:35 | 34567 | 152 | 432 | 3 | 32 | |
| 9 | 2019-06-28 01:10:35 | 34570 | 152 | 432 | 2 | 100 | |
| 10 | 2018-11-14 01:10:35 | 34570 | 152 | 432 | 5 | 20 | |
| | | | | | | | |
+-------+---------------------+------------+----------+------------+-------------+--------+--+
From Table 2, we need to find the partners with route_id = 152 and, for each of them, the sum of product_qty over their last 2 sales (selected by order_date in descending order). The result is shown in Table 3.
34567: serial numbers [1, 8]
34570: serial numbers [7, 9]
34569: serial number [6]
TABLE 3 : RESULT OBTAINED FROM TABLE 1,2
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 5 |
| 34569 | 1 |
| 34570 | 6 |
| | |
+------------+-------+
From Table 4 we want the leaf count for each of the partner_ids above.
TABLE 4 :coupon_leaf
+------------+-------+
| Partner_id | Leaf |
+------------+-------+
| 34567 | XYZ1 |
| 34569 | XYZ2 |
| 34569 | DDHC |
| 34567 | DVDV |
| 34570 | DVFDV |
| 34576 | FVFV |
| 34567 | FVV |
| | |
+------------+-------+
From that we get:
34567: 3
34569: 2
34570: 1
TABLE 5: result obtained from TABLE 4
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 3 |
| 34569 | 2 |
| 34570 | 1 |
| | |
+------------+-------+
Now we want to compare Tables 3 and 5:
If partner_id count [Table 3] > partner_id count [Table 5]
Print partner_id
I want a single query that performs all of these operations.
The distinct partner_ids can be found from Table 1 with:
SELECT DISTINCT partner_id
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
GROUP BY ts.partner_id

This answers the original version of the problem.
You seem to want to compare totals after aggregating tables 2 and 3. I don't know what table1 is for. It doesn't seem to do anything.
So:
select *
from (select tdsl.partner_id, sum(tdsl.product_qty) as sum_quantity
      from (select tdsl.*,
                   -- number each partner's sales from newest to oldest
                   row_number() over (partition by tdsl.partner_id order by tdsl.order_date desc) as seqnum
            from trip_delivery_sales_lines tdsl
            where tdsl.route_id = 152
           ) tdsl
      where seqnum <= 2  -- keep only the last 2 sales
      group by tdsl.partner_id
     ) tdsl left join
     (select cl.partner_id, count(*) as leaf_cnt
      from coupon_leaf cl
      group by cl.partner_id
     ) cl
     on cl.partner_id = tdsl.partner_id
where leaf_cnt is null or sum_quantity > leaf_cnt
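
To check the approach against the sample tables, here is a minimal sketch using Python's built-in sqlite3 module in place of PostgreSQL (window functions need SQLite 3.25 or later). It assumes the quantity column is called product_qty, that route_id is an integer, and that "last 2 sales" means the two most recent order dates on route 152; the sub-select aliases are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trip_delivery_sales_lines (
        sl_no INTEGER, order_date TEXT, partner_id INTEGER,
        route_id INTEGER, product_id INTEGER, product_qty INTEGER, amount INTEGER);
    INSERT INTO trip_delivery_sales_lines VALUES
        (1, '2020-08-01 04:25:35', 34567, 152, 432, 2, 100),
        (2, '2021-09-11 02:25:35', 34572, 130, 312, 4, 150),
        (3, '2020-05-10 04:25:35', 34567, 152, 432, 3, 123),
        (4, '2021-02-16 01:10:35', 34572, 130, 432, 5, 123),
        (5, '2020-02-19 01:10:35', 34567, 152, 432, 2, 600),
        (6, '2021-03-20 01:10:35', 34569, 152, 123, 1, 123),
        (7, '2021-04-23 01:10:35', 34570, 152, 432, 4, 200),
        (8, '2021-07-08 01:10:35', 34567, 152, 432, 3, 32),
        (9, '2019-06-28 01:10:35', 34570, 152, 432, 2, 100),
        (10, '2018-11-14 01:10:35', 34570, 152, 432, 5, 20);
    CREATE TABLE coupon_leaf (partner_id INTEGER, leaf TEXT);
    INSERT INTO coupon_leaf VALUES
        (34567, 'XYZ1'), (34569, 'XYZ2'), (34569, 'DDHC'), (34567, 'DVDV'),
        (34570, 'DVFDV'), (34576, 'FVFV'), (34567, 'FVV');
    """
)

rows = conn.execute(
    """
    SELECT q.partner_id, q.sum_quantity, cl.leaf_cnt
    FROM (SELECT s.partner_id, SUM(s.product_qty) AS sum_quantity
          FROM (SELECT t.*,
                       ROW_NUMBER() OVER (PARTITION BY t.partner_id
                                          ORDER BY t.order_date DESC) AS seqnum
                FROM trip_delivery_sales_lines t
                WHERE t.route_id = 152) s
          WHERE s.seqnum <= 2          -- the two most recent sales per partner
          GROUP BY s.partner_id) q
    LEFT JOIN (SELECT partner_id, COUNT(*) AS leaf_cnt
               FROM coupon_leaf
               GROUP BY partner_id) cl
           ON cl.partner_id = q.partner_id
    WHERE cl.leaf_cnt IS NULL OR q.sum_quantity > cl.leaf_cnt
    ORDER BY q.partner_id
    """
).fetchall()
print(rows)  # [(34567, 5, 3), (34570, 6, 1)]
```

Partner 34569 is filtered out because its quantity sum (1) is not greater than its leaf count (2).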

Query to reorganize dates

I need to do a transformation of a Postgres database table and I don't know where to start.
This is the table:
| Customer Code | Activity | Start Date |
|:---------------:|:--------:|:----------:|
| 100 | A | 01/05/2017 |
| 100 | A | 19/07/2017 |
| 100 | B | 18/09/2017 |
| 100 | C | 07/12/2017 |
| 101 | A | 11/02/2018 |
| 101 | B | 02/04/2018 |
| 101 | B | 14/06/2018 |
| 100 | A | 13/07/2018 |
| 100 | B | 14/08/2018 |
Customers can perform activities A, B and C, always in that order.
To carry out activity B he/she has to carry out activity A. To carry out C, he/she has to carry out activity A, then to B.
An activity or cycle can be performed more than once by the same customer.
I need to reorganize the table in this way, placing the beginning and end of each step:
| Customer Code | Activity | Start Date | End Date |
|:---------------:|:--------:|:----------:|:----------:|
| 100 | A | 01/05/2017 | 18/09/2017 |
| 100 | B | 18/09/2017 | 07/12/2017 |
| 100 | C | 07/12/2017 | 13/07/2018 |
| 101 | A | 11/02/2018 | 02/04/2018 |
| 101 | B | 02/04/2018 | |
| 100 | A | 13/07/2018 | 14/08/2018 |
| 100 | B | 14/08/2018 | |

Here is one approach to this gaps-and-islands problem:
select
customer_code,
activity,
start_date,
case when (activity, lead(activity) over(partition by customer_code order by start_date))
in (('A', 'B'), ('B', 'C'), ('C', 'A'))
then lead(start_date) over(partition by customer_code order by start_date)
end end_date
from (
select
t.*,
lead(activity) over(partition by customer_code order by start_date) lead_activity
from mytable t
) t
where activity is distinct from lead_activity
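The subquery first drops every row whose next row (for the same customer_code) has the same activity, so only the last row of each run of repeated activities survives. The outer query then uses conditional logic to bring in the start_date of the next row when the activities are in sequence. Because the last row of a run is kept, the start dates for repeated activities (for example 2017-07-19 for customer 100) are the latest ones, not the earliest ones shown in the requested output.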
Demo on DB Fiddle:
customer_code | activity | start_date | end_date
------------: | :------- | :--------- | :---------
100 | A | 2017-07-19 | 2017-09-18
100 | B | 2017-09-18 | 2017-12-07
100 | C | 2017-12-07 | 2018-07-13
100 | A | 2018-07-13 | 2018-08-14
100 | B | 2018-08-14 | null
101 | A | 2018-02-11 | 2018-06-14
101 | B | 2018-06-14 | null
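
If the earliest row of each run should be kept instead, as in the requested output, one possible variant is to compare against the previous row with lag rather than the next row with lead. The sketch below is an assumption-laden illustration, not the answer's query: it runs on Python's built-in sqlite3 (SQLite 3.25+ for window functions), stores dates as ISO text, and omits the A/B/C sequence check for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer_code INTEGER, activity TEXT, start_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (100, "A", "2017-05-01"), (100, "A", "2017-07-19"), (100, "B", "2017-09-18"),
        (100, "C", "2017-12-07"), (101, "A", "2018-02-11"), (101, "B", "2018-04-02"),
        (101, "B", "2018-06-14"), (100, "A", "2018-07-13"), (100, "B", "2018-08-14"),
    ],
)

rows = conn.execute(
    """
    SELECT customer_code, activity, start_date,
           -- evaluated after the WHERE filter, so it sees only the kept rows
           LEAD(start_date) OVER (PARTITION BY customer_code
                                  ORDER BY start_date) AS end_date
    FROM (SELECT t.*,
                 LAG(activity) OVER (PARTITION BY customer_code
                                     ORDER BY start_date) AS prev_activity
          FROM mytable t) s
    -- keep only the first row of each run of identical activities
    WHERE prev_activity IS NULL OR activity <> prev_activity
    ORDER BY customer_code, start_date
    """
).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps 2017-05-01 as the start of customer 100's first A step and 2018-04-02 as the start of customer 101's B step, and leaves end_date empty for each customer's last step.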

Finding MAX date aggregated by order - Oracle SQL

I have order data that looks like this:
| Order | Step | Step Complete Date |
|:-----:|:----:|:------------------:|
| A | 1 | 11/1/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/3/2019 |
| | 5 | 11/3/2019 |
| | 6 | 11/5/2019 |
| | 7 | 11/5/2019 |
| B | 1 | 12/1/2019 |
| | 2 | 12/2/2019 |
| | 3 | |
| C | 1 | 10/21/2019 |
| | 2 | 10/23/2019 |
| | 3 | 10/25/2019 |
| | 4 | 10/25/2019 |
| | 5 | 10/25/2019 |
| | 6 | |
| | 7 | 10/27/2019 |
| | 8 | 10/28/2019 |
| | 9 | 10/29/2019 |
| | 10 | 10/30/2019 |
| D | 1 | 10/30/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/2/2019 |
| | 5 | 11/2/2019 |
What I need to accomplish is the following:
For each order, assign the 'Order_Completion_Date' field as the most recent 'Step_Complete_Date'. If ANY 'Step_Complete_Date' is NULL, then the value for 'Order_Completion_Date' should be NULL.
I set up a SQL FIDDLE with this data and my attempt, below:
SELECT
OrderNum,
MAX(Step_Complete_Date)
FROM
OrderNums
WHERE
Step_Complete_Date IS NOT NULL
GROUP BY
OrderNum
This is yielding:
ORDERNUM MAX(STEP_COMPLETE_DATE)
D 11/2/2019
A 11/5/2019
B 12/2/2019
C 10/30/2019
How can I achieve:
| OrderNum | Order_Completed_Date |
|:--------:|:--------------------:|
| A | 11/5/2019 |
| B | NULL |
| C | NULL |
| D | 11/2/2019 |

An aggregate function with KEEP can handle this:
select ordernum,
max(step_complete_date)
keep (DENSE_RANK FIRST ORDER BY step_complete_date desc nulls first) res
FROM
OrderNums
GROUP BY
OrderNum

You can use a CASE expression to first count if there are any NULL values and if not then find the maximum value:
Query 1:
SELECT OrderNum,
CASE
WHEN COUNT( CASE WHEN Step_Complete_Date IS NULL THEN 1 END ) > 0
THEN NULL
ELSE MAX(Step_Complete_Date)
END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum
Results:
| ORDERNUM | ORDER_COMPLETION_DATE |
|----------|-----------------------|
| D | 11/2/2019 |
| A | 11/5/2019 |
| B | (null) |
| C | (null) |
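
The CASE approach is portable, so it can be tried outside Oracle as well. The following sketch uses Python's built-in sqlite3 with a shortened version of the sample data and ISO-formatted date strings (both are assumptions made for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderNums (OrderNum TEXT, Step INTEGER, Step_Complete_Date TEXT)")
conn.executemany(
    "INSERT INTO OrderNums VALUES (?, ?, ?)",
    [
        ("A", 1, "2019-11-01"), ("A", 2, "2019-11-05"),
        ("B", 1, "2019-12-01"), ("B", 2, "2019-12-02"), ("B", 3, None),
        ("C", 1, "2019-10-21"), ("C", 2, None), ("C", 3, "2019-10-30"),
        ("D", 1, "2019-10-30"), ("D", 2, "2019-11-02"),
    ],
)

rows = conn.execute(
    """
    SELECT OrderNum,
           CASE
             -- COUNT ignores NULLs, so this counts only the rows with a missing date
             WHEN COUNT(CASE WHEN Step_Complete_Date IS NULL THEN 1 END) > 0 THEN NULL
             ELSE MAX(Step_Complete_Date)
           END AS Order_Completion_Date
    FROM OrderNums
    GROUP BY OrderNum
    ORDER BY OrderNum
    """
).fetchall()
print(rows)  # [('A', '2019-11-05'), ('B', None), ('C', None), ('D', '2019-11-02')]
```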

First, you are storing dates as varchars in mm/dd/yyyy format (at least in the fiddle). MAX compares such values as strings and can produce an incorrect result; try, for example, an order with the dates '11/10/2019' and '11/2/2019'.
Second, the simplest solution is, in my opinion, to substitute a fallback value for NULLs and return NULL when the fallback wins:
SELECT
OrderNum,
NULLIF(MAX(NVL(Step_Complete_Date,'~')),'~')
FROM
OrderNums
GROUP BY
OrderNum
(The example still assumes varchars, since a tilde sorts after any digit. For real dates, you could use 9999-12-31, for instance.)
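
Both points can be illustrated with a short sketch. The first part shows the string-comparison pitfall in plain Python; the second applies the fallback idea in Python's built-in sqlite3, with COALESCE standing in for Oracle's NVL and ISO date strings with a 9999-12-31 sentinel (the data and the sentinel choice are assumptions for the example).

```python
import sqlite3
from datetime import datetime

# 1) mm/dd/yyyy strings compare character by character, not chronologically.
dates = ["11/10/2019", "11/2/2019"]
string_max = max(dates)
date_max = max(dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))
print(string_max, date_max)  # 11/2/2019 11/10/2019

# 2) Fallback value: replace NULL with a sentinel that always wins MAX,
#    then turn the sentinel back into NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderNums (OrderNum TEXT, Step_Complete_Date TEXT)")
conn.executemany(
    "INSERT INTO OrderNums VALUES (?, ?)",
    [("A", "2019-11-05"), ("A", "2019-11-01"), ("B", "2019-12-02"), ("B", None)],
)
rows = conn.execute(
    """
    SELECT OrderNum,
           NULLIF(MAX(COALESCE(Step_Complete_Date, '9999-12-31')), '9999-12-31')
    FROM OrderNums
    GROUP BY OrderNum
    ORDER BY OrderNum
    """
).fetchall()
print(rows)  # [('A', '2019-11-05'), ('B', None)]
```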

What SQL query should I perform to get the result set as I expected?

I'm having trouble getting the result I need. These are my tables, on PostgreSQL:
table: logins table: users
+------------------------+ +------------------+
| iduser | date | | iduser | name |
|------------------------| |------------------|
| 1 |'2017-06-06'| | 1 | Joe |
|------------------------| |------------------|
| 1 |'2017-06-06'| | 2 | Jane |
|------------------------| |------------------|
| 2 |'2017-06-07'| | 3 | Mary |
|------------------------| +------------------+
| 3 |'2017-06-07'|
|------------------------|
| 3 |'2017-06-07'|
|------------------------|
| 3 |'2017-06-07'|
+------------------------+
I'm using this query:
SELECT name, date, count(*) FROM logins l
LEFT JOIN users u
ON u.iduser= l.iduser
GROUP BY
u.name,l.date
ORDER BY
l.date
This is what I got:
+-----------------------------------+
| name | date | count |
|-----------------------------------|
| Joe | '2017-06-06' | 2 |
|-----------------------------------|
| Jane | '2017-06-07' | 1 |
|-----------------------------------|
| Mary | '2017-06-07' | 3 |
+-----------------------------------+
But what I really need is this:
+-----------------------------------+
| name | date | count |
|-----------------------------------|
| Joe | '2017-06-06' | 2 |
|-----------------------------------|
| Jane | '2017-06-06' | 0 |
|-----------------------------------|
| Mary | '2017-06-06' | 0 |
|-----------------------------------|
| Joe | '2017-06-07' | 0 |
|-----------------------------------|
| Jane | '2017-06-07' | 1 |
|-----------------------------------|
| Mary | '2017-06-07' | 3 |
+-----------------------------------+
What should I do? Please help! Thanks a lot!

In SQL Server and Postgres:
Get all combinations of dates and users, then left join to logins:
select
d.date
, u.name
, count(l.iduser) as login_count
from (select distinct date from logins) d
cross join users u
left join logins l
on l.iduser=u.iduser
and l.date=d.date
group by d.date, u.name
rextester demo (sql server): http://rextester.com/THJE85313
rextester demo (postgres): http://rextester.com/BNHE97192
returns:
+---------------------+------+-------------+
| date | name | login_count |
+---------------------+------+-------------+
| 2017-06-06 00:00:00 | Jane | 0 |
| 2017-06-07 00:00:00 | Jane | 1 |
| 2017-06-06 00:00:00 | Joe | 2 |
| 2017-06-07 00:00:00 | Joe | 0 |
| 2017-06-06 00:00:00 | Mary | 0 |
| 2017-06-07 00:00:00 | Mary | 3 |
+---------------------+------+-------------+
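
The same query shape can be reproduced locally. The sketch below runs it on Python's built-in sqlite3 with the sample data from the question; sorting by date and iduser is an assumption added so that the rows come out in the order the question asked for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (iduser INTEGER, name TEXT);
    INSERT INTO users VALUES (1, 'Joe'), (2, 'Jane'), (3, 'Mary');
    CREATE TABLE logins (iduser INTEGER, date TEXT);
    INSERT INTO logins VALUES
        (1, '2017-06-06'), (1, '2017-06-06'), (2, '2017-06-07'),
        (3, '2017-06-07'), (3, '2017-06-07'), (3, '2017-06-07');
    """
)

rows = conn.execute(
    """
    SELECT u.name, d.date, COUNT(l.iduser) AS login_count
    FROM (SELECT DISTINCT date FROM logins) d
    CROSS JOIN users u                 -- every (date, user) combination
    LEFT JOIN logins l
           ON l.iduser = u.iduser
          AND l.date = d.date          -- unmatched combinations count as 0
    GROUP BY d.date, u.iduser, u.name
    ORDER BY d.date, u.iduser
    """
).fetchall()
for row in rows:
    print(row)
```

COUNT(l.iduser) rather than COUNT(*) is what yields 0 for combinations with no login, because the unmatched rows carry NULL in l.iduser.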

How to check dates condition from one table to another in SQL

How can we check and compare dates from one table against another?
Table : inc
+--------+---------+-----------+-----------+-------------+
| inc_id | cust_id | item_id | serv_time | inc_date |
+--------+---------+-----------+-----------+-------------+
| 1 | john | HP | 40 | 17-Apr-2015 |
| 2 | John | HP | 60 | 10-Jan-2016 |
| 3 | Nick | Cisco | 120 | 11-Jan-2016 |
| 4 | samanta | EMC | 180 | 12-Jan-2016 |
| 5 | Kerlee | Oracle | 40 | 13-Jan-2016 |
| 6 | Amir | Microsoft | 300 | 14-Jan-2016 |
| 7 | John | HP | 120 | 15-Jan-2016 |
| 8 | samanta | EMC | 20 | 16-Jan-2016 |
| 9 | Kerlee | Oracle | 10 | 2-Feb-2017 |
+--------+---------+-----------+-----------+-------------+
Table: Contract:
+-----------+---------+----------+------------+
| item_id | con_id | Start | End |
+-----------+---------+----------+------------+
| Dell | DE2015 | 1/1/2015 | 12/31/2015 |
| HP | HP2015 | 1/1/2015 | 12/31/2015 |
| Cisco | CIS2016 | 1/1/2016 | 12/31/2016 |
| EMC | EMC2016 | 1/1/2016 | 12/31/2016 |
| HP | HP2016 | 1/1/2016 | 12/31/2016 |
| Oracle | OR2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2017 | 1/1/2017 | 12/31/2017 |
+-----------+---------+----------+------------+
Result:
+-------+---------+---------+--------------+
| Calls | Cust_id | Con_id | Tot_Ser_Time |
+-------+---------+---------+--------------+
| 2 | John | HP2016 | 180 |
| 2 | samanta | EMC2016 | 200 |
| 1 | Nick | CIS2016 | 120 |
| 1 | Amir | MS2016 | 300 |
| 1 | Kerlee | OR2016 | 40 |
+-------+---------+---------+--------------+
My query:
select count(inc_id) as Calls, inc.cust_id, contract.con_id,
sum(inc.serv_time) as tot_serv_time
from inc inner join contract on inc.item_id = contract.item_id
where inc.inc_date between '2016-01-01' and '2016-12-31'
group by inc.cust_id, contract.con_id
The result should come from the inc table, filtered to incidents between 1-Jan-2016 and 31-Dec-2016, with a count of inc_id where each incident is matched to the contract for its item based on the contract's start and end dates.

If I understand your problem correctly, this query will return the desired result:
select
count(*) as Calls,
inc.cust_id,
contract.con_id,
sum(inc.serv_time) as tot_serv_time
from
inc inner join contract
on inc.item_id = contract.item_id
and inc.inc_date between contract.start and contract.end
where
inc.inc_date between '2016-01-01' and '2016-12-31'
group by
inc.cust_id,
contract.con_id
The question is a little vague, so you might need to adjust this query.
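
To see the effect of the extra join condition, here is a minimal sketch on Python's built-in sqlite3 with the sample data. The contract columns are renamed start_date and end_date (an assumption, since END is a keyword in SQLite), dates are stored as ISO text, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inc (inc_id INTEGER, cust_id TEXT, item_id TEXT,
                      serv_time INTEGER, inc_date TEXT);
    INSERT INTO inc VALUES
        (1, 'john', 'HP', 40, '2015-04-17'),
        (2, 'John', 'HP', 60, '2016-01-10'),
        (3, 'Nick', 'Cisco', 120, '2016-01-11'),
        (4, 'samanta', 'EMC', 180, '2016-01-12'),
        (5, 'Kerlee', 'Oracle', 40, '2016-01-13'),
        (6, 'Amir', 'Microsoft', 300, '2016-01-14'),
        (7, 'John', 'HP', 120, '2016-01-15'),
        (8, 'samanta', 'EMC', 20, '2016-01-16'),
        (9, 'Kerlee', 'Oracle', 10, '2017-02-02');
    CREATE TABLE contract (item_id TEXT, con_id TEXT, start_date TEXT, end_date TEXT);
    INSERT INTO contract VALUES
        ('Dell', 'DE2015', '2015-01-01', '2015-12-31'),
        ('HP', 'HP2015', '2015-01-01', '2015-12-31'),
        ('Cisco', 'CIS2016', '2016-01-01', '2016-12-31'),
        ('EMC', 'EMC2016', '2016-01-01', '2016-12-31'),
        ('HP', 'HP2016', '2016-01-01', '2016-12-31'),
        ('Oracle', 'OR2016', '2016-01-01', '2016-12-31'),
        ('Microsoft', 'MS2016', '2016-01-01', '2016-12-31'),
        ('Microsoft', 'MS2017', '2017-01-01', '2017-12-31');
    """
)

rows = conn.execute(
    """
    SELECT COUNT(*) AS calls, inc.cust_id, contract.con_id,
           SUM(inc.serv_time) AS tot_serv_time
    FROM inc
    INNER JOIN contract
            ON inc.item_id = contract.item_id
           -- match each incident only to the contract that was active on its date
           AND inc.inc_date BETWEEN contract.start_date AND contract.end_date
    WHERE inc.inc_date BETWEEN '2016-01-01' AND '2016-12-31'
    GROUP BY inc.cust_id, contract.con_id
    ORDER BY inc.cust_id, contract.con_id
    """
).fetchall()
for row in rows:
    print(row)
```

Without the date condition in the join, each HP and Microsoft incident would also pair with that item's other contract and appear under both con_ids.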

select
Calls = count(*)
, Cust = i.Cust_id
, Contract = c.con_id
, Serv_Time = sum(Serv_Time)
from inc as i
inner join contract as c
on i.item_id = c.item_id
and i.inc_date >= c.[start]
and i.inc_date <= c.[end]
where c.[start]>='20160101'
group by i.Cust_id, c.con_id
order by i.Cust_Id, c.con_id
returns:
+-------+---------+----------+-----------+
| Calls | Cust | Contract | Serv_Time |
+-------+---------+----------+-----------+
| 1 | Amir | MS2016 | 300 |
| 2 | John | HP2016 | 180 |
| 1 | Kerlee | OR2016 | 40 |
| 1 | Nick | CIS2016 | 120 |
| 2 | samanta | EMC2016 | 200 |
+-------+---------+----------+-----------+
test setup: http://rextester.com/WSYDL43321
create table inc(
inc_id int
, cust_id varchar(16)
, item_id varchar(16)
, serv_time int
, inc_date date
);
insert into inc values
(1,'john','HP', 40 ,'17-Apr-2015')
,(2,'John','HP', 60 ,'10-Jan-2016')
,(3,'Nick','Cisco', 120 ,'11-Jan-2016')
,(4,'samanta','EMC', 180 ,'12-Jan-2016')
,(5,'Kerlee','Oracle', 40 ,'13-Jan-2016')
,(6,'Amir','Microsoft', 300 ,'14-Jan-2016')
,(7,'John','HP', 120 ,'15-Jan-2016')
,(8,'samanta','EMC', 20 ,'16-Jan-2016')
,(9,'Kerlee','Oracle', 10 ,'02-Feb-2017');
create table contract (
item_id varchar(16)
, con_id varchar(16)
, [Start] date
, [End] date
);
insert into contract values
('Dell','DE2015','20150101','20151231')
,('HP','HP2015','20150101','20151231')
,('Cisco','CIS2016','20160101','20161231')
,('EMC','EMC2016','20160101','20161231')
,('HP','HP2016','20160101','20161231')
,('Oracle','OR2016','20160101','20161231')
,('Microsoft','MS2016','20160101','20161231')
,('Microsoft','MS2017','20170101','20171231');
