Distinct ids for grouped values - sql

I want to count the distinct ids in each NUMB and store the count in a column.
I tried this:
WITH T AS (
    SELECT
        MAX(CASE WHEN LOGS LIKE 'CAR%' THEN REPLACE(LOGS, 'CAR-', '') END) AS CAR,
        MAX(CASE WHEN LOGS LIKE 'MOT%' THEN REPLACE(LOGS, 'MOT-', '') END) AS MOTO,
        MAX(CASE WHEN LOGS LIKE 'BICYCLE%' THEN REPLACE(LOGS, 'BICYCLE-', '') END) AS BICYCLE,
        MAX(CASE WHEN LOGS LIKE 'SHIP%' THEN REPLACE(LOGS, 'SHIP-', '') END) AS SHIP,
        ID,
        ORIG,
        DATE_ID,
        NUMB,
        STEPS
    FROM dbo.test
    GROUP BY ORIG, DATE_ID, ID, NUMB, STEPS
)
SELECT ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP,
       (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG = '4567') AS COUNTER
FROM T
WHERE ORIG = '4567'
  AND NUMB IN ('1515', '1921', '2121')
GROUP BY ID, ORIG, NUMB, STEPS, DATE_ID, CAR, MOTO, BICYCLE, SHIP
I receive this output:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG | NUMB | STEPS | DATE_ID | CAR | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 1 | 4567 | 1515 | 2 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 1 | 4567 | 1515 | 3 | 20201010 | HONDA | NULL | NULL | NULL | 3 |
| 2 | 4567 | 1921 | 1 | 20201111 | NULL | KTM | NULL | NULL | 3 |
| 3 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 2 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 3 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
| 3 | 4567 | 2121 | 4 | 20201231 | NULL | NULL | NULL | BOAT | 3 |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
As you can see, the COUNTER column holds the count of distinct ids across all NUMB values, not per NUMB.
I want this output instead:
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| ID | ORIG | NUMB | STEPS | DATE_ID | CAR | MOTO | BICYCLE | SHIP | COUNTER |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
| 1 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 1 | 4567 | 1515 | 2 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 2 | 4567 | 1515 | 1 | 20201010 | HONDA | NULL | NULL | NULL | 2 |
| 2 | 4567 | 1921 | 1 | 20201111 | NULL | KTM | NULL | NULL | 1 |
| 3 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 3 | 4567 | 2121 | 2 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 3 | 4567 | 2121 | 3 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
| 1 | 4567 | 2121 | 1 | 20201231 | NULL | NULL | NULL | BOAT | 2 |
+----+--------------+--------+-------+----------+-------+------+---------+------+---------+
1515 has 2 ids
1921 has 1 id
2121 has 2 ids
I also tried placing a GROUP BY NUMB inside (SELECT COUNT(DISTINCT ID) FROM dbo.test tp WHERE ORIG= '4567'), but it didn't work.

What you seem to want is:
count(distinct id) over (partition by orig, numb)
Alas, SQL Server doesn't support count(distinct) as a window function.
Happily, there is an easy workaround (which raises the question of why the above syntax is not supported), provided id is never NULL:
(dense_rank() over (partition by orig, numb order by id asc) +
 dense_rank() over (partition by orig, numb order by id desc) - 1
) as counter
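To see why the two dense_rank() terms add up to a distinct count, here is a minimal sketch that applies the trick to the id column, which is what the expected output in the question counts. It runs on SQLite through Python's sqlite3 module purely as a stand-in for SQL Server (an assumption for the sake of a runnable example; it needs SQLite 3.25 or later for window functions), and the table holds only the columns needed for the count.

```python
import sqlite3

# Rows (id, orig, numb, steps) copied from the question's expected output.
rows = [
    (1, "4567", "1515", 1),
    (1, "4567", "1515", 2),
    (2, "4567", "1515", 1),
    (2, "4567", "1921", 1),
    (3, "4567", "2121", 1),
    (3, "4567", "2121", 2),
    (3, "4567", "2121", 3),
    (1, "4567", "2121", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, orig TEXT, numb TEXT, steps INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", rows)

# The ascending rank plus the descending rank of any row is always
# (number of distinct ids in the partition) + 1, hence the "- 1".
query = """
SELECT id, numb, steps,
       DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id ASC)
     + DENSE_RANK() OVER (PARTITION BY orig, numb ORDER BY id DESC)
     - 1 AS counter
FROM test
WHERE orig = '4567'
ORDER BY numb, id, steps
"""
result = conn.execute(query).fetchall()

# Every row of a given numb carries the same counter value.
counter_by_numb = {numb: counter for _id, numb, _steps, counter in result}
print(counter_by_numb)
```

The same expression can be dropped into the outer SELECT in place of the scalar subquery, so no extra GROUP BY is needed for the counter.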

Related

How can I add a count to rank null values in SQL Hive?

This is what I have right now:
| time | car_id | order | in_order |
|-------|--------|-------|----------|
| 12:31 | 32 | null | 0 |
| 12:33 | 32 | null | 0 |
| 12:35 | 32 | null | 0 |
| 12:37 | 32 | 123 | 1 |
| 12:38 | 32 | 123 | 1 |
| 12:39 | 32 | 123 | 1 |
| 12:41 | 32 | 123 | 1 |
| 12:43 | 32 | 123 | 1 |
| 12:45 | 32 | null | 0 |
| 12:47 | 32 | null | 0 |
| 12:49 | 32 | 321 | 1 |
| 12:51 | 32 | 321 | 1 |
I'm trying to rank orders, including those that have null values, in this case by car_id.
This is the result I'm looking for:
| time | car_id | order | in_order | row |
|-------|--------|-------|----------|-----|
| 12:31 | 32 | null | 0 | 1 |
| 12:33 | 32 | null | 0 | 1 |
| 12:35 | 32 | null | 0 | 1 |
| 12:37 | 32 | 123 | 1 | 2 |
| 12:38 | 32 | 123 | 1 | 2 |
| 12:39 | 32 | 123 | 1 | 2 |
| 12:41 | 32 | 123 | 1 | 2 |
| 12:43 | 32 | 123 | 1 | 2 |
| 12:45 | 32 | null | 0 | 3 |
| 12:47 | 32 | null | 0 | 3 |
| 12:49 | 32 | 321 | 1 | 4 |
| 12:51 | 32 | 321 | 1 | 4 |
I just don't know how to manage a count for the null values.
Thanks!
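You can flag each row where the order differs from the previous row (using the NULL-safe comparison <=>) and then take a running sum of the flags:
select t.*,
       sum(is_start) over (partition by car_id order by `time`) as `row`
from (select t.*,
             case when row_number() over (partition by car_id order by `time`) > 1
                   and lag(`order`) over (partition by car_id order by `time`) <=> `order`
                  then 0 else 1
             end as is_start
      from t
     ) t;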
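One way to check the numbering against the expected output is a flag-and-running-sum query: mark every row whose order differs from the previous row, then sum the marks. The sketch below runs that idea on SQLite through Python's sqlite3 module as a stand-in for Hive (an assumption; SQLite's IS operator plays the role of a NULL-safe comparison). The column names order_id and row_no are hypothetical replacements for order and row, chosen to avoid reserved words.

```python
import sqlite3

# (time, car_id, order_id) rows from the question; None stands for null.
rows = [
    ("12:31", 32, None), ("12:33", 32, None), ("12:35", 32, None),
    ("12:37", 32, 123), ("12:38", 32, 123), ("12:39", 32, 123),
    ("12:41", 32, 123), ("12:43", 32, 123),
    ("12:45", 32, None), ("12:47", 32, None),
    ("12:49", 32, 321), ("12:51", 32, 321),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (time TEXT, car_id INTEGER, order_id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# is_start is 1 on the first row of each car and on every row whose
# order_id differs from the previous row (NULL compared as equal to NULL).
# The running sum of is_start then numbers the consecutive groups.
query = """
SELECT time, order_id,
       SUM(is_start) OVER (PARTITION BY car_id ORDER BY time) AS row_no
FROM (
    SELECT t.*,
           CASE
               WHEN ROW_NUMBER() OVER (PARTITION BY car_id ORDER BY time) > 1
                AND (LAG(order_id) OVER (PARTITION BY car_id ORDER BY time) IS order_id)
               THEN 0 ELSE 1
           END AS is_start
    FROM t
) AS flagged
ORDER BY car_id, time
"""
result = conn.execute(query).fetchall()
row_numbers = [row_no for _time, _order_id, row_no in result]
print(row_numbers)
```

Because the flag compares neighbouring rows rather than counting values, two different orders that follow each other without a null gap still get separate numbers.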

How to remove duplicate values from oracle join?

I want to create a view that shows the results without the duplicated values. I have 3 tables in an Oracle database:
The first table contains general information about a person.
+-----------+-------+-------------+
| ID | Name | Birtday_date|
+-----------+-------+-------------+
| 1 | Byron | 12/10/1998 |
| 2 | Peter | 01/11/1973 |
| 4 | Jose | 05/02/2008 |
+-----------+-------+-------------+
The second table contains the telephone numbers of the people in the first table.
+-------+----------+----------+----------+
| ID |ID_Person |CELL_TYPE | NUMBER |
+-------+----------+----------+----------+
| 1221 | 1 | 3 | 099141021|
| 2221 | 1 | 2 | 099091925|
| 3222 | 1 | 1 | 098041013|
| 4321 | 2 | 1 | 088043153|
| 4561 | 2 | 2 | 090044313|
| 5678 | 4 | 1 | 092049013|
| 8990 | 4 | 2 | 098090233|
+-------+----------+----------+----------+
The third table contains the email addresses of the people in the first table.
+------+----------+----------+---------------+
| ID |ID_Person |MAIL_TYPE | Email |
+------+----------+----------+---------------+
| 221 | 1 | 1 |jdoe#aol.com |
| 222 | 1 | 2 |jdoe1#aol.com |
| 421 | 2 | 1 |xx12#yahoo.com |
| 451 | 2 | 2 |dsdsa#gmail.com|
| 578 | 4 | 1 |sasaw1#sdas.com|
| 899 | 4 | 2 |cvcvsd#wew.es |
+------+----------+----------+---------------+
If I do an inner join with these tables, the result looks something like this:
+-----+-------+-------------+----------+----------+----------+----------------+
| ID | Name | Birtday_date| CELL_TYPE| NUMBER |MAIL_TYPE|Email |
+-----+-------+-------------+----------+----------+----------+----------------+
| 1 | Byron | 12/10/1998 | 3 | 099141021|1 |jdoe#aol.com |
| 1 | Byron | 12/10/1998 | 3 | 099141021|2 |jdoe1#aol.com |
| 1 | Byron | 12/10/1998 | 2 | 099091925|1 |jdoe#aol.com |
| 1 | Byron | 12/10/1998 | 2 | 099091925|2 |jdoe1#aol.com |
| 1 | Byron | 12/10/1998 | 1 | 098041013|1 |jdoe#aol.com |
| 1 | Byron | 12/10/1998 | 1 | 098041013|2 |jdoe1#aol.com |
| 2 | Peter | 01/11/1973 | 1 | 088043153|1 |xx12#yahoo.com |
| 2 | Peter | 01/11/1973 | 1 | 088043153|2 |dsdsa#gmail.com |
| 2 | Peter | 01/11/1973 | 2 | 090044313|1 |xx12#yahoo.com |
| 2 | Peter | 01/11/1973 | 2 | 090044313|2 |dsdsa#gmail.com |
| 4 | Jose | 05/02/2008 | 1 | 092049013|1 |sasaw1#sdas.com |
| 4 | Jose | 05/02/2008 | 1 | 092049013|2 |cvcvsd#wew.es |
| 4 | Jose | 05/02/2008 | 2 | 098090233|1 |sasaw1#sdas.com |
| 4 | Jose | 05/02/2008 | 2 | 098090233|2 |cvcvsd#wew.es |
+-----+-------+-------------+----------+----------+----------+----------------+
The result that I want to present in the view is the following:
+-----+-------+-------------+----------+----------+----------+----------------+
| ID | Name | Birtday_date| CELL_TYPE| NUMBER |MAIL_TYPE|Email |
+-----+-------+-------------+----------+----------+----------+----------------+
| 1 | Byron | 12/10/1998 | 3 | 099141021|1 |jdoe#aol.com |
| 1 | Byron | 12/10/1998 | | |2 |jdoe1#aol.com |
| 1 | Byron | 12/10/1998 | 2 | 099091925| | |
| 1 | Byron | 12/10/1998 | 1 | 098041013| | |
| 2 | Peter | 01/11/1973 | 1 | 088043153|1 |xx12#yahoo.com |
| 2 | Peter | 01/11/1973 | | |2 |dsdsa#gmail.com |
| 2 | Peter | 01/11/1973 | 2 | 090044313| | |
| 4 | Jose | 05/02/2008 | 1 | 092049013|1 |sasaw1#sdas.com |
| 4 | Jose | 05/02/2008 | | |2 |cvcvsd#wew.es |
| 4 | Jose | 05/02/2008 | 2 | 098090233| | |
+-----+-------+-------------+----------+----------+----------+----------------+
I tried to achieve a similar output using
case
when row_number() over (partition by table1.id order by table2.type) = 1
then table1.value
end
as "VALUE"
But the result is nothing like what I expect, and some rows are repeated.
What you need to do is enumerate the rows in each list and then combine the lists on those enumerations. This is tricky with joins, because you don't know how many rows are in each list. Another method stacks both lists with union all and then aggregates by person and sequence number:
select p.id, p.name, p.Birtday_date,
       max(pe.cell_type) as cell_type, max(pe.number) as number,
       max(pe.mail_type) as mail_type, max(pe.email) as email
from person p left join
     ((select id_person, cell_type, number,
              null as mail_type, null as email,
              row_number() over (partition by id_person order by number) as seqnum
       from phones
      ) union all
      (select id_person, null as cell_type, null as number,
              mail_type, email,
              row_number() over (partition by id_person order by email) as seqnum
       from emails
      )
     ) pe
     on pe.id_person = p.id
group by p.id, p.name, p.Birtday_date, pe.seqnum
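To make the union all approach concrete, here is a minimal sketch on the sample data, run on SQLite through Python's sqlite3 module as a stand-in for Oracle (an assumption made only so the example is runnable). The column name phone_number is a hypothetical replacement for NUMBER, the birthday column is left out for brevity, and the phone numbers are stored as text so the leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER, name TEXT);
CREATE TABLE phones (id_person INTEGER, cell_type INTEGER, phone_number TEXT);
CREATE TABLE emails (id_person INTEGER, mail_type INTEGER, email TEXT);
INSERT INTO person VALUES (1, 'Byron'), (2, 'Peter'), (4, 'Jose');
INSERT INTO phones VALUES
  (1, 3, '099141021'), (1, 2, '099091925'), (1, 1, '098041013'),
  (2, 1, '088043153'), (2, 2, '090044313'),
  (4, 1, '092049013'), (4, 2, '098090233');
INSERT INTO emails VALUES
  (1, 1, 'jdoe#aol.com'), (1, 2, 'jdoe1#aol.com'),
  (2, 1, 'xx12#yahoo.com'), (2, 2, 'dsdsa#gmail.com'),
  (4, 1, 'sasaw1#sdas.com'), (4, 2, 'cvcvsd#wew.es');
""")

# Each list is numbered per person, the two lists are stacked, and the
# GROUP BY folds the n-th phone and the n-th email into one output row.
query = """
SELECT p.id, p.name,
       MAX(pe.cell_type) AS cell_type, MAX(pe.phone_number) AS phone_number,
       MAX(pe.mail_type) AS mail_type, MAX(pe.email) AS email
FROM person p
LEFT JOIN (
    SELECT id_person, cell_type, phone_number,
           NULL AS mail_type, NULL AS email,
           ROW_NUMBER() OVER (PARTITION BY id_person ORDER BY phone_number) AS seqnum
    FROM phones
    UNION ALL
    SELECT id_person, NULL AS cell_type, NULL AS phone_number,
           mail_type, email,
           ROW_NUMBER() OVER (PARTITION BY id_person ORDER BY email) AS seqnum
    FROM emails
) pe ON pe.id_person = p.id
GROUP BY p.id, p.name, pe.seqnum
ORDER BY p.id, pe.seqnum
"""
result = conn.execute(query).fetchall()

# Count output rows per person: the longer of the two lists decides.
rows_per_person = {}
for person_id, *_rest in result:
    rows_per_person[person_id] = rows_per_person.get(person_id, 0) + 1
print(rows_per_person)
```

Each person ends up with as many rows as their longer list, so a phone without a matching email simply shows empty email columns. Note that this pairs the n-th phone with the n-th email, which is a slightly different layout from the one in the question.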
Hope this helps.
Create table person(ID int ,Name varchar(20), Birtday_date date)
Insert into person values
(1,'Byron' ,'12/10/1998'),
(2,'Peter' ,'01/11/1973'),
(4,'Jose ' ,'05/02/2008')
Create table phones (ID int,ID_Person int,CELL_TYPE int,NUMBER float)
Insert into phones values
(1221, 1 , 3,099141021),
(2221, 1 , 2,099091925),
(3222, 1 , 1,098041013),
(4321, 2 , 1,088043153),
(4561, 2 , 2,090044313),
(5678, 4 , 1,092049013),
(8990, 4 , 2,098090233)
Create table emails(ID int,ID_Person int, MAIL_TYPE int, Email varchar(100))
Insert into emails values
(221, 1 , 1, 'jdoe#aol.com '),
(222, 1 , 2, 'jdoe1#aol.com '),
(421, 2 , 1, 'xx12#yahoo.com '),
(451, 2 , 2, 'dsdsa#gmail.com'),
(578, 4 , 1, 'sasaw1#sdas.com'),
(899, 4 , 2, 'cvcvsd#wew.es ')
select p.id, p.name, p.Birtday_date,
case when Lag(number) over(partition by p.id order by p.id,pe.id) = number then null else cell_type end as cell_type,
case when Lag(number) over(partition by p.id order by p.id,pe.id) = number then null else number end as number,
mail_type as mail_type, email as email
from person p left join
(select pp.ID_Person, cell_type, number, mail_type, email,pp.id from
(select ID_Person, cell_type, number,id,
row_number() over (partition by ID_Person order by id ) as seqnum
from phones
) pp left join
(select ID_Person,
mail_type, email, 1 as seqnum
from emails
)e on pp.ID_Person = e.ID_Person and pp.seqnum = e.seqnum
) pe
on pe.ID_Person = p.Id
order by p.id, pe.id

SQL How count subset with condition

I have the following table:
+--------+------------+----------------+
| saleId | saleDate | contractId |
+--------+------------+----------------+
| 1 | 01.07.2016 | 1001 |
| 2 | 02.07.2016 | 1001 |
| 3 | 03.07.2016 | 1002 |
| 4 | 04.07.2016 | 1002 |
| 5 | 05.07.2016 | 1001 |
| 6 | 06.07.2016 | 1001 |
+--------+------------+----------------+
I want to count the number of previous sales by contract for each sale (each row):
+--------+------------+------------+------------------------+
| saleId | saleDate | contractId | SalesCountPerContract |
+--------+------------+------------+------------------------+
| 1 | 01.07.2016 | 1001 | 0 |
| 2 | 02.07.2016 | 1001 | 1 |
| 3 | 03.07.2016 | 1002 | 0 |
| 4 | 04.07.2016 | 1002 | 1 |
| 5 | 05.07.2016 | 1001 | 2 |
| 6 | 06.07.2016 | 1001 | 3 |
+--------+------------+------------+------------------------+
select t.*,
       row_number() over (partition by contractId order by saleDate) - 1 as SalesCountPerContract
from mytable t
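As a quick check of the row_number() - 1 idea, here is a minimal sketch on the sample rows, run on SQLite through Python's sqlite3 module (an assumption; the question does not name a database). The dates are stored in ISO format, which is also an assumption, so that ordering by saleDate sorts chronologically.

```python
import sqlite3

# (saleId, saleDate, contractId) from the question, dates rewritten as ISO text.
rows = [
    (1, "2016-07-01", 1001),
    (2, "2016-07-02", 1001),
    (3, "2016-07-03", 1002),
    (4, "2016-07-04", 1002),
    (5, "2016-07-05", 1001),
    (6, "2016-07-06", 1001),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (saleId INTEGER, saleDate TEXT, contractId INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# Within each contract, the n-th sale by date has n - 1 earlier sales.
query = """
SELECT t.saleId,
       ROW_NUMBER() OVER (PARTITION BY contractId ORDER BY saleDate) - 1
           AS SalesCountPerContract
FROM mytable t
ORDER BY t.saleId
"""
result = conn.execute(query).fetchall()
counts = [count for _sale_id, count in result]
print(counts)
```

If two sales of the same contract can share a date, row_number() breaks the tie arbitrarily, so a second ordering column such as saleId may be worth adding.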

Considering values from one table as column header in another

I have a base table where I need to calculate the difference between two dates based on the type of the entry.
tblA
+----------+------------+---------------+--------------+
| TypeCode | Log_Date | Complete_Date | Pending_Date |
+----------+------------+---------------+--------------+
| 1 | 18/04/2016 | 19/04/2016 | |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 |
| 3 | 12/04/2016 | 19/04/2016 | |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 |
| 5 | 16/04/2016 | 21/04/2016 | |
| 1 | 19/04/2016 | 20/04/2016 | |
| 2 | 20/03/2016 | 31/03/2015 | |
| 3 | 25/03/2016 | 28/03/2016 | |
| 4 | 26/03/2016 | 27/03/2016 | |
| 5 | 27/03/2016 | 30/03/2016 | |
+----------+------------+---------------+--------------+
I have another look up table which has the column names to be considered based on the TypeCode.
tblB
+----------+----------+---------------+
| TypeCode | DateCol1 | DateCol2 |
+----------+----------+---------------+
| 1 | Log_Date | Complete_Date |
| 2 | Log_Date | Pending_Date |
| 3 | Log_Date | Complete_Date |
| 4 | Log_Date | Pending_Date |
| 5 | Log_Date | Complete_Date |
+----------+----------+---------------+
I am doing a simple DATEDIFF between two dates for my calculation. However I want to lookup which columns to consider for this calculation from tblB and apply it on tblA based on the TypeCode.
For example, when the TypeCode is 2 or 4 the calculation should be DATEDIFF(d, Log_Date, Pending_Date); otherwise it should be DATEDIFF(d, Log_Date, Complete_Date).
Resulting table:
+----------+------------+---------------+--------------+----------+
| TypeCode | Log_Date | Complete_Date | Pending_Date | Cal_Days |
+----------+------------+---------------+--------------+----------+
| 1 | 18/04/2016 | 19/04/2016 | | 1 |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 | 5 |
| 3 | 12/04/2016 | 19/04/2016 | | 7 |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 | 1 |
| 5 | 16/04/2016 | 21/04/2016 | | 5 |
| 1 | 19/04/2016 | 20/04/2016 | | 1 |
| 2 | 20/03/2016 | 31/03/2015 | | |
| 3 | 25/03/2016 | 28/03/2016 | | 3 |
| 4 | 26/03/2016 | 27/03/2016 | | |
| 5 | 27/03/2016 | 30/03/2016 | | 3 |
+----------+------------+---------------+--------------+----------+
Any help would be appreciated. Thanks.
Use JOIN with CASE expression:
SELECT
    a.*,
    Cal_Days =
        DATEDIFF(
            DAY,
            CASE
                WHEN b.DateCol1 = 'Log_Date' THEN a.Log_Date
                WHEN b.DateCol1 = 'Complete_Date' THEN a.Complete_Date
                ELSE a.Pending_Date
            END,
            CASE
                WHEN b.DateCol2 = 'Log_Date' THEN a.Log_Date
                WHEN b.DateCol2 = 'Complete_Date' THEN a.Complete_Date
                ELSE a.Pending_Date
            END
        )
FROM TblA a
INNER JOIN TblB b
    ON b.TypeCode = a.TypeCode
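The lookup-driven CASE can be tried out with a small sketch on a few of the sample rows. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption); SQLite has no DATEDIFF, so the day difference is computed with julianday() instead, and the dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblA (TypeCode INTEGER, Log_Date TEXT, Complete_Date TEXT, Pending_Date TEXT);
CREATE TABLE tblB (TypeCode INTEGER, DateCol1 TEXT, DateCol2 TEXT);
INSERT INTO tblA VALUES
  (1, '2016-04-18', '2016-04-19', NULL),
  (2, '2016-04-10', '2016-04-18', '2016-04-15'),
  (4, '2016-04-15', '2016-04-17', '2016-04-16'),
  (2, '2016-03-20', '2015-03-31', NULL);
INSERT INTO tblB VALUES
  (1, 'Log_Date', 'Complete_Date'),
  (2, 'Log_Date', 'Pending_Date'),
  (3, 'Log_Date', 'Complete_Date'),
  (4, 'Log_Date', 'Pending_Date'),
  (5, 'Log_Date', 'Complete_Date');
""")

# The CASE expressions translate the column NAME stored in tblB into the
# matching column VALUE of tblA; julianday() replaces DATEDIFF here.
query = """
SELECT a.TypeCode,
       CAST(
           julianday(CASE b.DateCol2
                         WHEN 'Log_Date' THEN a.Log_Date
                         WHEN 'Complete_Date' THEN a.Complete_Date
                         ELSE a.Pending_Date
                     END)
         - julianday(CASE b.DateCol1
                         WHEN 'Log_Date' THEN a.Log_Date
                         WHEN 'Complete_Date' THEN a.Complete_Date
                         ELSE a.Pending_Date
                     END)
       AS INTEGER) AS Cal_Days
FROM tblA a
INNER JOIN tblB b ON b.TypeCode = a.TypeCode
ORDER BY a.rowid
"""
result = conn.execute(query).fetchall()
print(result)
```

A row whose selected date column is empty yields NULL for Cal_Days, which matches the blank cells in the expected table.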

how to write a query to get multilevel data

I have four tables as below:
tblAccount
Id is primary key.
+----+-----------------+
| Id | AccName |
+----+-----------------+
| 1 | AccountA |
| 2 | AccountB |
+----+-----------------+
tblLocation
Id is primary key.
+----+---------------+
| Id | LocName |
+----+---------------+
| 1 | LocationA |
| 2 | LocationB |
| 3 | LocationC |
+----+---------------+
tblAccountwiseLocation
Id is primary key. LocId and AccId are foreign keys.
+----+---------------+---------------+
| Id | LocId | AccId |
+----+---------------+---------------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 1 | 2 |
| 5 | 2 | 2 |
| 6 | 3 | 2 |
+----+---------------+---------------+
tblRSCMaster
Id is primary key. LocId and AccId are foreign keys.
+----+---------------+---------------+----------------+------------------+
| Id | LocId | AccId | RSCNo | DateOfAddition |
+----+---------------+---------------+----------------+------------------+
| 1 | 1 | 1 | Acc1_Loc1_1_14 | 15/01/2014 |
| 2 | 2 | 1 | Acc1_Loc2_1_14 | 15/01/2014 |
| 3 | 3 | 1 | Acc1_Loc3_1_14 | 15/01/2014 |
| 4 | 1 | 2 | Acc2_Loc1_1_14 | 15/01/2014 |
| 5 | 2 | 2 | Acc2_Loc2_1_14 | 15/01/2014 |
| 6 | 3 | 2 | Acc2_Loc3_1_14 | 15/01/2014 |
| 7 | 1 | 1 | Acc1_Loc1_2_14 | 15/02/2014 |
| 8 | 2 | 1 | Acc1_Loc2_2_14 | 15/02/2014 |
| 9 | 3 | 1 | Acc1_Loc3_2_14 | 15/02/2014 |
| 10 | 1 | 2 | Acc2_Loc1_2_14 | 15/02/2014 |
| 11 | 2 | 2 | Acc2_Loc2_2_14 | 15/02/2014 |
| 12 | 3 | 2 | Acc2_Loc3_2_14 | 15/02/2014 |
| 13 | 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 14 | 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 15 | 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 16 | 1 | 2 | Acc2_Loc1_3_14 | 15/03/2014 |
| 17 | 2 | 2 | Acc2_Loc2_3_14 | 15/03/2014 |
| 18 | 3 | 2 | Acc2_Loc3_3_14 | 15/03/2014 |
| 19 | 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 20 | 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 21 | 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 22 | 1 | 2 | Acc2_Loc1_4_14 | 15/04/2014 |
| 23 | 2 | 2 | Acc2_Loc2_4_14 | 15/04/2014 |
| 24 | 3 | 2 | Acc2_Loc3_4_14 | 15/04/2014 |
| 25 | 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 26 | 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 27 | 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
| 28 | 1 | 2 | Acc2_Loc1_5_14 | 15/05/2014 |
| 29 | 2 | 2 | Acc2_Loc2_5_14 | 15/05/2014 |
| 30 | 3 | 2 | Acc2_Loc3_5_14 | 15/05/2014 |
+----+---------------+---------------+----------------+------------------+
Acc1_Loc1_1_14 represents the RSC for LocationA of AccountA for Jan 2014.
I need to get the output below from tblRSCMaster.
+---------------+---------------+----------------+------------------+
| LocId | AccId | RSCNo | DateOfAddition |
+---------------+---------------+----------------+------------------+
| 1 | 1 | Acc1_Loc1_3_14 | 15/03/2014 |
| 1 | 1 | Acc1_Loc1_4_14 | 15/04/2014 |
| 1 | 1 | Acc1_Loc1_5_14 | 15/05/2014 |
| 2 | 1 | Acc1_Loc2_3_14 | 15/03/2014 |
| 2 | 1 | Acc1_Loc2_4_14 | 15/04/2014 |
| 2 | 1 | Acc1_Loc2_5_14 | 15/05/2014 |
| 3 | 1 | Acc1_Loc3_3_14 | 15/03/2014 |
| 3 | 1 | Acc1_Loc3_4_14 | 15/04/2014 |
| 3 | 1 | Acc1_Loc3_5_14 | 15/05/2014 |
+---------------+---------------+----------------+------------------+
Each account has multiple locations and each location has multiple RSCs.
I need to get last three RSCs for each location for AccountA.
I have tried the below query:
SELECT tblAccountwiseLocation.LocId,tblAccountwiseLocation.AccId,tblRSCMaster.RSCNo,tblRSCMaster.DateOfAddition FROM tblAccountwiseLocation
INNER JOIN tblRSCMaster ON tblAccountwiseLocation.LocId= tblRSCMaster.LocId
where tblRSCMaster.AccId=1
But I am not getting the proper output.
Please help me out.
Thank you all in advance.
You can wrap the existing query inside a common table expression, and use ROW_NUMBER() to get only the last 3 (by tblRSCMaster.DateOfAddition) entries per tblAccountwiseLocation.LocId.
WITH cte AS (
    SELECT tblAccountwiseLocation.LocId,
           tblAccountwiseLocation.AccId,
           tblRSCMaster.RSCNo,
           tblRSCMaster.DateOfAddition,
           ROW_NUMBER() OVER (PARTITION BY tblAccountwiseLocation.LocId
                              ORDER BY tblRSCMaster.DateOfAddition DESC) rn
    FROM tblAccountwiseLocation
    INNER JOIN tblRSCMaster
        ON tblAccountwiseLocation.LocId = tblRSCMaster.LocId
       AND tblAccountwiseLocation.AccId = tblRSCMaster.AccId
    WHERE tblRSCMaster.AccId = 1
)
SELECT LocId, AccId, RSCNo, DateOfAddition
FROM cte
WHERE rn <= 3
ORDER BY LocId, AccId, DateOfAddition
An SQLfiddle to test with.
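For a self-contained way to try the "last three per location" idea, here is a minimal sketch on SQLite through Python's sqlite3 module (an assumption; the question does not name a database). It regenerates the sample rows of tblRSCMaster with ISO dates and, to stay short, skips the join to tblAccountwiseLocation, since every (LocId, AccId) pair in the sample exists there anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblRSCMaster (LocId INTEGER, AccId INTEGER, RSCNo TEXT, DateOfAddition TEXT)"
)

# Rebuild the sample data: 5 months x 2 accounts x 3 locations.
rows = [
    (loc, acc, f"Acc{acc}_Loc{loc}_{month}_14", f"2014-{month:02d}-15")
    for month in range(1, 6)
    for acc in (1, 2)
    for loc in (1, 2, 3)
]
conn.executemany("INSERT INTO tblRSCMaster VALUES (?, ?, ?, ?)", rows)

# Number the rows of each location from newest to oldest, keep the first 3.
query = """
WITH cte AS (
    SELECT LocId, AccId, RSCNo, DateOfAddition,
           ROW_NUMBER() OVER (PARTITION BY LocId
                              ORDER BY DateOfAddition DESC) AS rn
    FROM tblRSCMaster
    WHERE AccId = 1
)
SELECT LocId, AccId, RSCNo, DateOfAddition
FROM cte
WHERE rn <= 3
ORDER BY LocId, DateOfAddition
"""
result = conn.execute(query).fetchall()
rsc_numbers = [rsc_no for _loc, _acc, rsc_no, _date in result]
print(rsc_numbers)
```

The filter on AccId sits inside the CTE so the numbering only sees AccountA's rows; moving it to the outer query would number both accounts together and return the wrong three.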
Is this what you need?
select m.*
from (select m.*,
             row_number() over (partition by AccId
                                order by DateOfAddition desc) as seqnum
      from tblRSCMaster m
      where m.LocId = 1
     ) m
where seqnum <= 3
order by AccId, DateOfAddition;
I think you need to filter on the locid rather than on the AccId to get what you want.