Query to group all the rows with same column value - sql

userid tenantid
null a001
null a002
null a002
null a002
null a001
null a003
null a002
null a003
null a001
null a002
I want to set the userid to "distinct_user_#" (abbreviated d_u_# below) for rows that share the same tenantid. I can't set the userid manually because the tenantids are generated randomly.
So the output would be something like:
userid tenantid
d_u_1 a001
d_u_2 a002
d_u_2 a002
d_u_3 a003
d_u_1 a001
d_u_3 a003
d_u_2 a002
d_u_3 a003
d_u_1 a001
d_u_2 a002
Any help with this?

We can use DENSE_RANK() here:
SELECT 'd_u_' + CAST(DENSE_RANK() OVER (ORDER BY tenantid) AS varchar(12)) AS userid,
tenantid
FROM yourTable;
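As a quick check of the DENSE_RANK() approach, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite only stands in for the real server here, so concatenation uses || instead of +, and window functions need SQLite 3.25 or newer. The id column and the label_tenants helper are hypothetical additions, used only to keep the rows in insertion order.

```python
import sqlite3

def label_tenants(tenant_ids):
    """Return (userid, tenantid) pairs; rows sharing a tenantid share a label."""
    conn = sqlite3.connect(":memory:")
    # 'id' is a hypothetical surrogate key, only there to preserve input order.
    conn.execute(
        "CREATE TABLE yourTable (id INTEGER PRIMARY KEY, userid TEXT, tenantid TEXT)"
    )
    conn.executemany(
        "INSERT INTO yourTable (tenantid) VALUES (?)", [(t,) for t in tenant_ids]
    )
    rows = conn.execute(
        """
        SELECT 'd_u_' || DENSE_RANK() OVER (ORDER BY tenantid) AS userid,
               tenantid
        FROM yourTable
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for userid, tenantid in label_tenants(["a001", "a002", "a002", "a003", "a001"]):
        print(userid, tenantid)
```

Because the window has no PARTITION BY, the rank runs over all rows, and equal tenantid values tie for the same rank.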

You can use the dense_rank function to generate this id (ordered by tenantid, without a PARTITION BY, so that each distinct tenantid gets its own number):
SELECT 'd_u_' + CAST(DENSE_RANK() OVER (ORDER BY tenantid) AS varchar(12)) AS userid,
tenantid
FROM mytable

Related

Month Difference between rows by group in PostgreSQL

I have data that looks like this using PostgreSQL
customer name order_id order_date
John A001 1-Jan-2017
John A002 1-Feb-2017
John A003 1-Apr-2017
Smith A004 1-Dec-2016
Smith A005 1-Feb-2017
Jane A006 1-Mar-2017
Dave A007 1-Feb-2017
Dave A008 1-Feb-2017
Dave A009 1-Feb-2017
I'm trying to get the difference between month of repurchases in another column. So something like this.
customer name order_id order_date month_diff
John A001 1-Jan-2017 null
John A002 1-Feb-2017 1
John A003 1-Apr-2017 2
Smith A004 1-Dec-2016 null
Smith A005 1-Feb-2017 2
Jane A006 1-Mar-2017 null
Dave A007 1-Feb-2017 null
Dave A008 1-Feb-2017 0
Dave A009 1-Feb-2017 0
Any suggestion would be greatly appreciated. I'm new to PostgreSQL. Thank you in advance.
This can easily be done using window functions (assuming that order_date is properly defined as DATE)
select customer_name,
order_id,
order_date,
order_date - lag(order_date) over (partition by customer_name order by order_date) as diff
from order_table
order by customer_name, order_date;
Note that the result of the diff is in days if order_date is a date.
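Since the LAG() query above yields days rather than months, here is a minimal sketch of one way to get whole calendar months instead: turn each date into a running month number (year * 12 + month) and subtract the previous order's number. It runs on an in-memory SQLite database from Python purely for illustration, so strftime() takes the place of PostgreSQL's extract(); the month_diffs helper and the ISO date strings are assumptions, not part of the question.

```python
import sqlite3

def month_diffs(orders):
    """orders: (customer_name, order_id, order_date 'YYYY-MM-DD') tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_table (customer_name TEXT, order_id TEXT, order_date TEXT)"
    )
    conn.executemany("INSERT INTO order_table VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        """
        WITH m AS (
            SELECT customer_name, order_id, order_date,
                   CAST(strftime('%Y', order_date) AS INTEGER) * 12
                     + CAST(strftime('%m', order_date) AS INTEGER) AS month_no
            FROM order_table
        )
        SELECT customer_name, order_id,
               month_no - LAG(month_no) OVER (
                   PARTITION BY customer_name
                   ORDER BY order_date, order_id
               ) AS month_diff
        FROM m
        ORDER BY order_id
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = [
        ("John", "A001", "2017-01-01"),
        ("John", "A002", "2017-02-01"),
        ("Smith", "A004", "2016-12-01"),
        ("Smith", "A005", "2017-02-01"),
    ]
    for row in month_diffs(sample):
        print(row)
```

Using year * 12 + month keeps the difference correct across a year boundary, which a plain difference of month numbers would not.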
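try this:
select
t.customer,
t.order_id,
t.order_date,
(select (extract(YEAR from t.order_date) - extract(YEAR from tt.order_date)) * 12
+ extract(MONTH from t.order_date) - extract(MONTH from tt.order_date)
from your_table tt
where tt.customer = t.customer -- compare only with the same customer's orders
and tt.order_id < t.order_id
order by tt.order_id desc limit 1
) as month_diff
from your_table t
order by t.order_id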
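try with first_value() (note that this gives the difference in days from the customer's first order, not from the previous one):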
select
customer_name, order_id, order_date
, order_date - first_value(order_date) over (partition by customer_name order by order_id) as month_diff
from tname;

SQL Query to Count total rows grouping different columns

I have a database that has RMA return data. I want to write a query to return the total number of times a unit has been returned (each return has a unique RMA Number). I also need to return the number of times a unit has returned multiple times, and the number of times it returned for the same symptom. A record is created each time the unit goes to a station (RMA, symptom, and date returned is propagated for each station record).
The data looks like this:
ID SN RMA SYMPTOM Station Date_Returned
21567 A001 84704 POWER FAULT DockRecv 01/01/2015
21568 A001 84704 POWER FAULT Repair 01/01/2015
21569 A001 84704 POWER FAULT Ship 01/01/2015
10235 A002 83494 NO DISPLAY DockRecv 02/20/2015
10236 A002 83494 NO DISPLAY Repair 02/20/2015
10237 A002 83494 NO DISPLAY Ship 02/20/2015
36548 A002 84283 ABNORMAL NOISE DockRecv 10/05/2015
36549 A002 84283 ABNORMAL NOISE Repair 10/05/2015
36550 A002 84283 ABNORMAL NOISE Ship 10/05/2015
38790 A003 83432 HDD FAULT DockRecv 09/15/2015
38791 A003 83432 HDD FAULT Repair 09/15/2015
38792 A003 83432 HDD FAULT Ship 09/15/2015
69613 A003 84276 HDD FAULT DockRecv 01/30/2016
69614 A003 84276 HDD FAULT Repair 01/30/2016
69615 A003 84276 HDD FAULT Ship 01/30/2016
56732 A004 82011 NFF DockRecv 12/01/2015
56733 A004 82011 NFF Repair 12/01/2015
56734 A004 82011 NFF Ship 12/01/2015
My Output needs to look like this:
Total_Returns Repeat_Return Same_Symptom_Return
6 2 1
A001(RMA 84704) is a single return.
A002 is a multiple return: RMA 83494 is the first return. After repair the unit is shipped out, and after some time in the field it is returned again under RMA 84283. When a unit is returned, it goes through 3 stations, and we create a record for each station (propagating the RMA, symptom, and date returned to each station record).
I can get Total_Returns with the code:
Select count(*) as totalcount
From
(
SELECT
[SN]
,[RMA]
FROM [dbo].[test]
Group by [SN],[RMA]
)as a
There are 3 quite different methods needed to arrive at the counts, so I have used 3 separate sub-queries. see this working at sqlfiddle (but not on MS SQL Server) here: http://sqlfiddle.com/#!5/9df16/1
Result:
| Total_Count | Repeat_Return | Same_Symptom_Return |
|-------------|---------------|---------------------|
| 6 | 2 | 1 |
Query:
select
(select count(*) from (
select distinct SN, RMA
from table1) t
) as Total_Count
, (select count(*) from (
SELECT SN
FROM table1
Group by SN
having count(distinct RMA) > 1) r
) as Repeat_Return
, (select count(*) from (
SELECT SN, SYMPTOM
FROM table1
Group by SN, SYMPTOM
having count(distinct RMA) > 1) s
) as Same_Symptom_Return
Note: you should include "sql server" as a tag on your question (well, I presume it is that because of the [dbo].[test]).
I got it to work... I'm sure there is a more streamlined way to write it...
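To make the three counts easy to verify, here is a minimal sketch of a variant that counts distinct RMA numbers instead of relying on three station rows per return. It runs on an in-memory SQLite database from Python as a stand-in for SQL Server; the return_counts helper is hypothetical, and table1 holds only the three columns the counts depend on.

```python
import sqlite3

def return_counts(rows):
    """rows: (SN, RMA, SYMPTOM) tuples, one per station record.

    Returns (Total_Count, Repeat_Return, Same_Symptom_Return).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (SN TEXT, RMA INTEGER, SYMPTOM TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
          -- every distinct (unit, RMA) pair is one return
          (SELECT COUNT(*) FROM (SELECT DISTINCT SN, RMA FROM table1)),
          -- units that came back under more than one RMA
          (SELECT COUNT(*) FROM (SELECT SN FROM table1
                                 GROUP BY SN
                                 HAVING COUNT(DISTINCT RMA) > 1)),
          -- units that came back more than once with the same symptom
          (SELECT COUNT(*) FROM (SELECT SN, SYMPTOM FROM table1
                                 GROUP BY SN, SYMPTOM
                                 HAVING COUNT(DISTINCT RMA) > 1))
        """
    ).fetchone()
    conn.close()
    return result

if __name__ == "__main__":
    returns = [
        ("A001", 84704, "POWER FAULT"),
        ("A002", 83494, "NO DISPLAY"),
        ("A002", 84283, "ABNORMAL NOISE"),
        ("A003", 83432, "HDD FAULT"),
        ("A003", 84276, "HDD FAULT"),
        ("A004", 82011, "NFF"),
    ]
    # Three station records (DockRecv, Repair, Ship) per return, as in the question.
    station_rows = [r for r in returns for _ in range(3)]
    print(return_counts(station_rows))
```

Counting distinct RMA values keeps the result independent of how many station records each return produces.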
SELECT
-- Get Total_Returned Count
(Select distinct
count(*) as 'Total_Returned'
From
( SELECT
[SN]
,[RMA]
FROM [dbo].[test]
Group by [SN],[RMA]
)a) AS 'Total_Returned'
-- Get Repeat_Return Count
,(Select distinct
[Repeat_Return] - COUNT(*) OVER() AS [Repeat_Return]
From
( SELECT
COUNT(*) OVER() AS [Repeat_Return]
,[SN]
,[RMA]
FROM [dbo].[test]
Group by [SN],[RMA]
)a Group by [SN],[Repeat_Return]) AS 'Repeat_Return'
-- Get Same_Symptom_Return Count
,(Select distinct
[Same_Symptom_Return] - COUNT(*) OVER() AS [Same_Symptom_Return]
From
( SELECT
COUNT(*) OVER() AS [Same_Symptom_Return]
,[SN]
,[RMA]
,SYMPTOM
FROM [dbo].[test]
Group by SN, SYMPTOM, RMA
)a Group by [SN], SYMPTOM, [Same_Symptom_Return]) AS 'Same_Symptom_Return'
Result:
|Total_Returned | Repeat_Return | Same_Symptom_Return |
|---------------|---------------|---------------------|
| 6 | 2 | 1 |

SQL Server: aggregate to single result

I have this query
SELECT Client.ClientNo,
Client.ContactName,
Deal.Currency,
MAX(Deal.DealDate)
FROM Deal
JOIN Client ON Deal.ClientNo = Client.ClientNo
GROUP BY Client.ClientNo, Client.ContactName, Deal.Currency;
which gives me a result
1 John Smith EUR 2014-10-07
1 John Smith GBP 2014-11-12
2 Jane Doe GBP 2014-09-17
2 Jane Doe USD 2014-12-23
1 John Smith USD 2013-11-13
2 Jane Doe EUR 2012-09-06
Problem is, I need an aggregated result with the latest date per client, like this:
1 John Smith GBP 2014-11-12
2 Jane Doe USD 2014-12-23
How can I change my query to achieve this?
UPDATE Thanks to jarlh for the answer, however I have missed something - if there is a duplicate row - it will remain in the result, looking like this:
1 John Smith GBP 2014-11-12
1 John Smith GBP 2014-11-12
2 Jane Doe USD 2014-12-23
Any way to make that work?
You could do something like this:
Test data:
DECLARE @Deal TABLE(ClientNo INT,Currency VARCHAR(10),DealDate DATETIME)
DECLARE @Client TABLE(ClientNo INT,ContactName VARCHAR(100))
INSERT INTO @Deal
VALUES (1,'EUR','2014-10-07'),(1,'GBP','2014-11-12'),(2,'GBP','2014-09-17'),
(2,'USD','2014-12-23'),(1,'USD','2013-11-13'),(2,'EUR','2012-09-06')
INSERT INTO @Client
VALUES (1,'John Smith'),(2,'Jane Doe')
Query:
;WITH latestDeals
AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY ClientNo ORDER BY DealDate DESC) AS RowNbr,
Deal.*
FROM
@Deal AS Deal
)
SELECT
client.ClientNo,
client.ContactName,
latestDeals.Currency,
latestDeals.DealDate
FROM
@Client AS client
JOIN latestDeals
ON client.ClientNo=latestDeals.ClientNo
AND latestDeals.RowNbr=1
Update:
If you want to use a conventional query, you could do something like this:
SELECT
client.ClientNo,
client.ContactName,
Latestdeal.maxDealDate as DealDate,
deal.Currency
FROM
@Client AS client
JOIN
(
SELECT
MAX(Deal.DealDate) AS maxDealDate,
Deal.ClientNo
FROM
@Deal AS Deal
GROUP BY
Deal.ClientNo
) AS Latestdeal
ON client.ClientNo=Latestdeal.ClientNo
JOIN @Deal as deal
ON client.ClientNo=deal.ClientNo
AND deal.DealDate=Latestdeal.maxDealDate
This will result in the same output
Result:
1 John Smith GBP 2014-11-12 00:00:00.000
2 Jane Doe USD 2014-12-23 00:00:00.000
Untested, but should work. Will return several rows for a client if the client has two (or more) deals on the same, latest day.
SELECT Client.ClientNo,
Client.ContactName,
Deal.Currency,
Deal.DealDate
FROM Deal
JOIN Client ON Deal.ClientNo = Client.ClientNo
WHERE Deal.DealDate = (select max(DealDate) from Deal
where ClientNo = Client.ClientNo)
Try this,
Test Data:
CREATE TABLE #YourTable
(
CLIENT_NO INT,
CONTACT_NAME VARCHAR(20),
CURRENCY VARCHAR(10),
[DEAL_DATE] DATE
)
INSERT INTO #YourTable VALUES
(1,'John Smith','EUR','2014-10-07'),
(1,'John Smith','GBP','2014-11-12'),
(2,'Jane Doe','GBP','2014-09-17'),
(2,'Jane Doe','USD','2014-12-23'),
(1,'John Smith','USD','2013-11-13'),
(2,'Jane Doe','EUR','2012-09-06')
Query:
SELECT CLIENT_NO,CONTACT_NAME,CURRENCY,[DEAL_DATE]
FROM (SELECT *,
Row_Number()
OVER (
PARTITION BY CLIENT_NO
ORDER BY [DEAL_DATE] DESC) AS RN
FROM #YourTable)A
WHERE RN = 1
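The duplicate-row case from the question's update can be checked with a minimal sketch of the ROW_NUMBER() approach, run here on an in-memory SQLite database from Python as a stand-in for SQL Server. The latest_deals helper is hypothetical; the table and column names follow the original query.

```python
import sqlite3

def latest_deals(clients, deals):
    """Return one (ClientNo, ContactName, Currency, DealDate) row per client."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Client (ClientNo INTEGER, ContactName TEXT)")
    conn.execute("CREATE TABLE Deal (ClientNo INTEGER, Currency TEXT, DealDate TEXT)")
    conn.executemany("INSERT INTO Client VALUES (?, ?)", clients)
    conn.executemany("INSERT INTO Deal VALUES (?, ?, ?)", deals)
    rows = conn.execute(
        """
        WITH ranked AS (
            SELECT ClientNo, Currency, DealDate,
                   ROW_NUMBER() OVER (
                       PARTITION BY ClientNo ORDER BY DealDate DESC
                   ) AS rn
            FROM Deal
        )
        SELECT c.ClientNo, c.ContactName, r.Currency, r.DealDate
        FROM Client AS c
        JOIN ranked AS r
          ON r.ClientNo = c.ClientNo AND r.rn = 1
        ORDER BY c.ClientNo
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    clients = [(1, "John Smith"), (2, "Jane Doe")]
    deals = [
        (1, "EUR", "2014-10-07"),
        (1, "GBP", "2014-11-12"),
        (1, "GBP", "2014-11-12"),  # duplicate of the latest deal
        (2, "GBP", "2014-09-17"),
        (2, "USD", "2014-12-23"),
    ]
    for row in latest_deals(clients, deals):
        print(row)
```

ROW_NUMBER() assigns 1 to exactly one row per client, so an exact duplicate of the latest deal does not produce a second output row. If two different deals share the latest date, which one gets row number 1 is arbitrary unless the ORDER BY includes a tie-breaker.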

PowerPivot DAX Set the Maximum Value Per Group

I am working on a report and need to report hours per employee.
However, some people worked longer than their Max Hours, so a simple sum will not work in the following case...
I tried to use the MIN function but it only works at column level...
I saw that the CALCULATE function should work but I am not sure how to write it... Below is the example:
Staff ID Date Work Hours
A001 5-Jan-2015 8
A001 6-Jan-2015 8
A001 7-Jan-2015 8
A001 8-Jan-2015 8
A001 9-Jan-2015 8
A002 5-Jan-2015 7
A002 6-Jan-2015 7
A002 7-Jan-2015 6
A002 8-Jan-2015 7
A002 9-Jan-2015 6
Staff ID Staff Name Max Hours Per Week
A001 Person A 35
A002 Person B 35
Output:
Staff ID Hours
A001 35 (instead of 40)
A002 33 (7+7+6+7+6)
Thanks a lot for your help!
Start with a measure called [Hours] that simply sums the column:
=SUM(Table1[Work Hours])
This measure then uses that sum and does an IF() to check whether the person is over the limit and returns the appropriate number. The SUMX() iterates over each person in order to give you a correct total.
=
SUMX (
VALUES ( Table1[Staff ID] ),
IF (
[Hours] > VALUES ( Table2[Max Hours Per Week] ),
VALUES ( Table2[Max Hours Per Week] ),
[Hours]
)
)
This assumes you have two tables, Table1 and Table2, that are related on Staff ID.
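To see what the measure is meant to compute, here is a minimal sketch of the same logic in plain Python rather than DAX: sum the hours per staff member, cap each sum at that person's weekly maximum, then add up the capped values. The capped_hours function and the dictionary layout are hypothetical; the numbers come from the question.

```python
from collections import defaultdict

def capped_hours(work_rows, max_hours):
    """work_rows: (staff_id, hours) pairs; max_hours: staff_id -> weekly cap.

    Returns (per-staff capped totals, grand total of the capped values).
    """
    totals = defaultdict(int)
    for staff_id, hours in work_rows:
        totals[staff_id] += hours  # the plain per-person sum, like [Hours]
    # Cap each person's sum, mirroring the IF() inside the SUMX() iteration.
    capped = {sid: min(total, max_hours[sid]) for sid, total in totals.items()}
    return capped, sum(capped.values())

if __name__ == "__main__":
    work = [("A001", 8)] * 5 + [("A002", h) for h in (7, 7, 6, 7, 6)]
    caps = {"A001": 35, "A002": 35}
    per_staff, grand_total = capped_hours(work, caps)
    print(per_staff, grand_total)
```

The cap has to be applied per person before totalling; capping the grand total instead would give a different result whenever one person is over the limit and another is under it.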

How to extract the latest rows

I have a table like this:
Table A
Date Time ID Ref
110217 91703 A001 A1100056
110217 91703 A001 A1100057
110217 91703 A001 A1100058
110217 91703 A001 A1100059
110217 132440 A001 A1100057
110217 132440 A001 A1100058
110217 132440 A001 A1100060
110217 91703 B001 B1100048
110217 91703 B001 B1100049
110217 132440 B001 B1100049
110217 132440 B001 B1100050
I wish to have the latest data only & the final result should look like this using SQL:
Date Time ID Ref
110217 132440 A001 A1100057
110217 132440 A001 A1100058
110217 132440 A001 A1100060
110217 132440 B001 B1100049
110217 132440 B001 B1100050
(5 records, all with the same "latest" time)
The database updates itself at certain times. The problem is that I do not know the exact time, hence I do not know which records are the latest.
This works in SQL Server:
SELECT TOP 1 WITH TIES *
FROM TableA
ORDER BY Date DESC, Time DESC
And this solution is probably server-independent:
SELECT a.*
FROM TableA a
JOIN (
SELECT d.MaxDate, MAX(t.Time) AS MaxTime
FROM TableA t
JOIN (
SELECT MAX(Date) AS MaxDate
FROM TableA
) d
ON t.Date = d.MaxDate
GROUP BY d.MaxDate
) m
ON a.Date = m.MaxDate AND a.Time = m.MaxTime
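The join-based idea can be tried out with a minimal sketch on an in-memory SQLite database from Python. It assumes, which the question does not confirm, that Date and Time are stored as numbers whose ordering matches chronological order; if Time were text, '91703' would sort after '132440' and MAX() would pick the wrong value. The latest_rows helper is hypothetical.

```python
import sqlite3

def latest_rows(rows):
    """rows: (Date, Time, ID, Ref) tuples; returns all rows with the latest Date/Time."""
    conn = sqlite3.connect(":memory:")
    # Assumption: numeric columns, so MAX() compares values rather than strings.
    conn.execute("CREATE TABLE TableA (Date INTEGER, Time INTEGER, ID TEXT, Ref TEXT)")
    conn.executemany("INSERT INTO TableA VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT a.Date, a.Time, a.ID, a.Ref
        FROM TableA a
        JOIN (
            SELECT Date AS MaxDate, MAX(Time) AS MaxTime
            FROM TableA
            WHERE Date = (SELECT MAX(Date) FROM TableA)
            GROUP BY Date
        ) m
          ON a.Date = m.MaxDate AND a.Time = m.MaxTime
        ORDER BY a.ID, a.Ref
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    sample = [
        (110217, 91703, "A001", "A1100056"),
        (110217, 132440, "A001", "A1100057"),
        (110217, 91703, "B001", "B1100048"),
        (110217, 132440, "B001", "B1100049"),
    ]
    for row in latest_rows(sample):
        print(row)
```

The inner query first narrows to the latest date and only then takes the maximum time, so a late time on an earlier date cannot be picked by mistake.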
SELECT * FROM TableA ORDER BY Date DESC, Time DESC LIMIT 1;
This will give you the latest row in MySQL. Note that it returns only a single row, even when several rows share the latest date and time.
Which database are you using? You can actually concatenate the two columns after converting them to a date and time format, and then order by the combined value. All of this can be done in the query.