I have a problem that I cannot solve in DAX. I have a table in report view.
The real table has other values and contains 10,000+ records, but below is an example of what I would like to achieve
(in the real table the difference between dates is not always 1 day):
user | username | salary | date
1    | x        | 123    | 14-10-2022
2    | y        | 455    | 11-10-2022
3    | z        | 333    | 13-10-2022
4    | t        | 222    | 12-10-2022
5    | h        | 111    | 10-10-2022
Desired output:
user | username | salary | date       | salary (date-1 day) | salary (date-3 days)
1    | x        | 123    | 14-10-2022 | 333                 | 455
2    | y        | 455    | 11-10-2022 | 111                 |
3    | z        | 333    | 13-10-2022 | 222                 | 111
4    | t        | 222    | 12-10-2022 | 455                 |
5    | h        | 111    | 10-10-2022 |                     |
I know that one way would be a self join such as
on table1.user = table2.user and table1.date = table2.date - 1 and table1.date = table2.date - 3
but is there any other way to achieve the desired table without doing many joins?
Thank you very much in advance.
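For illustration only, here is a minimal Python sketch of the lookup idea behind the desired output: index the salaries by date once, then look up each row's date minus 1 and minus 3 days, so no self join is needed. It is not DAX. The names `rows`, `add_offset_salaries` and `offsets` are hypothetical, and the sketch assumes each date appears at most once in the table.

```python
from datetime import date, timedelta

# Sample rows from the question: (user, username, salary, date)
rows = [
    (1, "x", 123, date(2022, 10, 14)),
    (2, "y", 455, date(2022, 10, 11)),
    (3, "z", 333, date(2022, 10, 13)),
    (4, "t", 222, date(2022, 10, 12)),
    (5, "h", 111, date(2022, 10, 10)),
]


def add_offset_salaries(rows, offsets=(1, 3)):
    """Append, for each offset n, the salary recorded n days before the row's date."""
    # Assumption: every date occurs at most once, so a plain dict is enough.
    salary_by_date = {d: salary for _, _, salary, d in rows}
    result = []
    for user, username, salary, d in rows:
        # dict.get returns None when no row exists for the shifted date.
        looked_up = tuple(salary_by_date.get(d - timedelta(days=n)) for n in offsets)
        result.append((user, username, salary, d) + looked_up)
    return result


result = add_offset_salaries(rows)
for row in result:
    print(row)
```

A missing earlier date simply yields `None`, which corresponds to the blank cells in the desired output.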
I would like to convert this SQL query into ANSI SQL. I am having trouble wrapping my head around the logic of this query.
I use Snowflake Data Warehouse, but it does not understand this query because of the 'delete' statement right before join, so I am trying to break it down. From my understanding the row number column is giving me the order from 1 to N based on timestamp and placing it in C. Then C is joined against itself on the rows other than the first row (based on id) and placed in C1. Then C1 is deleted from the overall data, which leaves only the first row.
I may be understanding the logic incorrectly, but I am not used to seeing the 'delete' statement right before a join. Let me know if I got the logic right, or point me in the right direction.
This query was copied from a Stack Overflow question that describes the exact situation I am trying to solve, but I need it on a much larger scale.
with C as
(
  select ID,
         row_number() over(order by DT) as rn
  from YourTable
)
delete C1
from C as C1
  inner join C as C2
    on C1.rn = C2.rn-1 and
       C1.ID = C2.ID
The specific problem I am trying to solve is this. Let's assume I have this table. I need to partition the rows by primary key combinations (primKey 1 & 2) while maintaining timestamp order.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
101 3 6 506 236 2005-10-25
100 1 2 302 423 2002-08-15
101 3 6 506 236 2008-12-05
101 3 6 300 100 2010-06-10
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
100 1 2 302 423 2003-07-24
Once the rows are partitioned and ordered by timestamp within each partition, I need to delete the rows that repeat the previous checkVar combination (checkVar 1 & 2) until the next change, leaving only the earliest row of each run. The rows marked with asterisks are the ones that need to be removed since they are duplicates.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
*100 1 2 302 423 2002-08-15
*100 1 2 302 423 2003-07-24
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
101 3 6 506 236 2005-10-25
*101 3 6 506 236 2008-12-05
101 3 6 300 100 2010-06-10
This is the final result. As you can see for ID=100, even though the 1st and 3rd record are the same, the checkVar combination changed in between, which is fine. I am only removing the duplicates until the values change.
ID primKey1 primKey2 checkVar1 checkVar2 theTimestamp
100 1 2 302 423 2001-07-13
100 1 2 407 309 2005-09-05
100 1 2 302 423 2012-05-09
101 3 6 506 236 2005-10-25
101 3 6 300 100 2010-06-10
If you want to keep the earliest row for each id, then you can use:
delete from yourtable yt
    where yt.dt > (select min(yt2.dt)
                   from yourtable yt2
                   where yt2.id = yt.id
                  );
Your query would not do this, if that is your intent.
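If the goal is instead the one described in the question (drop a row only when it repeats the checkVar combination of the previous row in the same primKey partition), one possible approach is to compare each row with its predecessor using LAG. The sketch below is only an illustration: it runs against an in-memory SQLite database (3.25 or later, for window functions) rather than Snowflake, it selects the rows to keep instead of deleting, and it assumes checkVar1 and checkVar2 are never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (ID INTEGER, primKey1 INTEGER, primKey2 INTEGER, "
    "checkVar1 INTEGER, checkVar2 INTEGER, theTimestamp TEXT)"
)
# Sample rows from the question, in their original (unordered) form.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 1, 2, 302, 423, "2001-07-13"),
        (101, 3, 6, 506, 236, "2005-10-25"),
        (100, 1, 2, 302, 423, "2002-08-15"),
        (101, 3, 6, 506, 236, "2008-12-05"),
        (101, 3, 6, 300, 100, "2010-06-10"),
        (100, 1, 2, 407, 309, "2005-09-05"),
        (100, 1, 2, 302, 423, "2012-05-09"),
        (100, 1, 2, 302, 423, "2003-07-24"),
    ],
)

query = """
WITH flagged AS (
    SELECT *,
           LAG(checkVar1) OVER (PARTITION BY primKey1, primKey2
                                ORDER BY theTimestamp) AS prevVar1,
           LAG(checkVar2) OVER (PARTITION BY primKey1, primKey2
                                ORDER BY theTimestamp) AS prevVar2
    FROM YourTable
)
SELECT ID, primKey1, primKey2, checkVar1, checkVar2, theTimestamp
FROM flagged
WHERE prevVar1 IS NULL          -- first row of a partition has no predecessor
   OR prevVar1 <> checkVar1     -- otherwise keep only rows where something changed
   OR prevVar2 <> checkVar2
ORDER BY primKey1, primKey2, theTimestamp
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

On the sample data this keeps the five rows shown in the final result table of the question.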
I have a list of codes by area and type. I need to get the unique codes for each type, which I can do with a simple SELECT query with a GROUP BY. I now need to know which areas do not have one of the codes. So how do I run a query that groups by the unique values and tells me which areas do not have one of the values?
ID Area Type Code
1 10 A 123
2 10 A 456
3 10 B 789
4 10 B 987
5 10 C 654
6 10 C 321
7 20 A 123
8 20 B 789
9 20 B 987
10 20 C 654
11 20 C 321
12 30 A 137
13 30 A 456
14 30 B 579
15 30 B 789
16 30 B 987
17 30 C 654
18 30 C 321
I can run this query to group them by type and get the unique codes:
SELECT tblExample.Type, tblExample.Code
FROM tblExample
GROUP BY tblExample.Type, tblExample.Code
This gives me this:
Type Code
A 123
A 137
A 456
B 579
B 789
B 987
C 321
C 654
Now I need to know which areas do not have a given code. For example, code 123 does not appear for area 30 and code 137 does not appear for areas 10 and 20. How do I write a query that tells me which areas are missing a code? The format of the output doesn't matter, I just need to get the results. I'm thinking the results could be in one column or spread out over multiple columns:
Type Code Missing Areas   or   Missing1 Missing2
A    123  30                   30
A    137  10, 20               10       20
A    456  20                   20
B    579  10, 20               10       20
B    789
B    987
C    321
C    654
You can get a list of the missing code/areas by first generating all combinations and then filtering out the ones that exist:
select tc.type, tc.code, a.area
from (select distinct area from tblExample) a cross join
     (select distinct type, code from tblExample) tc left join
     tblExample e
     on e.area = a.area and e.type = tc.type and e.code = tc.code
where e.area is null;
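As a rough check of the "generate all combinations, then filter" idea, the sketch below pairs every distinct area with every distinct type/code pair and keeps the pairs that have no matching row. It uses an in-memory SQLite database purely so the example is runnable (the question does not name its database), and the grouping into a `missing` dictionary is a hypothetical convenience for display.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblExample (ID INTEGER, Area INTEGER, Type TEXT, Code INTEGER)")
# Sample data from the question.
conn.executemany(
    "INSERT INTO tblExample VALUES (?, ?, ?, ?)",
    [
        (1, 10, "A", 123), (2, 10, "A", 456), (3, 10, "B", 789),
        (4, 10, "B", 987), (5, 10, "C", 654), (6, 10, "C", 321),
        (7, 20, "A", 123), (8, 20, "B", 789), (9, 20, "B", 987),
        (10, 20, "C", 654), (11, 20, "C", 321), (12, 30, "A", 137),
        (13, 30, "A", 456), (14, 30, "B", 579), (15, 30, "B", 789),
        (16, 30, "B", 987), (17, 30, "C", 654), (18, 30, "C", 321),
    ],
)

query = """
SELECT tc.Type, tc.Code, a.Area
FROM (SELECT DISTINCT Area FROM tblExample) a
CROSS JOIN (SELECT DISTINCT Type, Code FROM tblExample) tc
LEFT JOIN tblExample e
       ON e.Area = a.Area AND e.Type = tc.Type AND e.Code = tc.Code
WHERE e.Area IS NULL            -- no row exists for this area/type/code
ORDER BY tc.Type, tc.Code, a.Area
"""

# Collect the missing areas per (type, code) pair.
grouped = defaultdict(list)
for type_, code, area in conn.execute(query):
    grouped[(type_, code)].append(area)
missing = dict(grouped)

for key, areas in missing.items():
    print(key, areas)
```

Codes that appear in every area (789, 987, 321, 654) produce no rows, so they are simply absent from `missing`.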
I have a table with an ID and multiple informative columns. Sometimes however, I can have multiple data for an ID, so I added a column called "Sequence". Here is a shortened example:
ID Sequence Name Tel Date Amount
124 1 Bob 873-4356 2001-02-03 10
124 2 Bob 873-4356 2002-03-12 7
124 3 Bob 873-4351 2006-07-08 24
125 1 John 983-4568 2007-02-01 3
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
So, I would like to obtain only these lines:
124 3 Bob 873-4351 2006-07-08 24
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
Could anyone give me a hand building a SQL query to do this?
Thanks!
You can calculate the maximum sequence using group by. Then you can use join to get only the maximum in the original data.
Assuming your table is called t:
select t.*
from t join
(select id, MAX(sequence) as maxs
from t
group by id
) tmax
on t.id = tmax.id and
t.sequence = tmax.maxs
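To see the join-on-maximum query at work on the sample data, here is a small sketch using an in-memory SQLite database. SQLite is only a stand-in, since the question does not say which database is used, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, sequence INTEGER, name TEXT, "
    "tel TEXT, date TEXT, amount INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (124, 1, "Bob", "873-4356", "2001-02-03", 10),
        (124, 2, "Bob", "873-4356", "2002-03-12", 7),
        (124, 3, "Bob", "873-4351", "2006-07-08", 24),
        (125, 1, "John", "983-4568", "2007-02-01", 3),
        (125, 2, "John", "983-4568", "2008-02-08", 13),
        (126, 1, "Eric", "345-9845", "2010-01-01", 18),
    ],
)

query = """
SELECT t.*
FROM t
JOIN (SELECT id, MAX(sequence) AS maxs   -- highest sequence per id
      FROM t
      GROUP BY id) tmax
  ON t.id = tmax.id AND t.sequence = tmax.maxs
ORDER BY t.id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```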
I have 3 tables that I want to merge, each with a different column of interest. I also have an id variable that I want to do separate merges "within" id. The idea is that I want to merge X, Y, and Z by date (within ID), and have missing values if that date does not exist for a particular variable.
Table X:
ID Date X
1 2012-01-01 101
1 2012-01-02 102
1 2012-01-03 103
1 2012-01-04 104
1 2012-01-05 105
2 2012-01-01 150
Table Y:
ID Date Y
1 2012-01-01 301
1 2012-01-02 302
1 2012-01-03 303
1 2012-01-11 311
2 2012-01-01 350
Table Z:
ID Date Z
1 2012-01-01 401
1 2012-01-03 403
1 2012-01-04 404
1 2012-01-11 411
1 2012-01-21 421
2 2012-01-01 450
Desired Result Table:
ID Date X Y Z
1 2012-01-01 101 301 401
1 2012-01-02 102 302 .
1 2012-01-03 103 303 403
1 2012-01-04 104 . 404
1 2012-01-05 105 . .
1 2012-01-11 . 311 411
1 2012-01-21 . . 421
2 2012-01-01 150 350 450
Any ideas how to write this SQL statement? I've tried messing around with "full joins" and where statements for cross products, but I keep getting duplicate values for some of my ID-date combinations, or sometimes no ID.
Any help would be appreciated.
Joins can be tricky things. My usual approach is to form the set of Keys first, and then use those keys to get what I want.
SELECT source.ID, source.Date, x.X, y.Y, z.Z
FROM
(
SELECT ID, Date
FROM TableX
UNION
SELECT ID, Date
FROM TableY
UNION
SELECT ID, Date
FROM TableZ
) as source
LEFT JOIN TableX x ON source.ID = x.ID AND source.Date = x.Date
LEFT JOIN TableY y ON source.ID = y.ID AND source.Date = y.Date
LEFT JOIN TableZ z ON source.ID = z.ID AND source.Date = z.Date
ORDER BY source.ID, source.Date
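The keys-first query can be tried against the sample tables with the sketch below. It uses an in-memory SQLite database only as a stand-in for whatever engine is actually in use; missing values come back as `None` where the desired result shows a dot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data from the question: table name, value column, rows of (ID, Date, value).
tables = [
    ("TableX", "X", [(1, "2012-01-01", 101), (1, "2012-01-02", 102),
                     (1, "2012-01-03", 103), (1, "2012-01-04", 104),
                     (1, "2012-01-05", 105), (2, "2012-01-01", 150)]),
    ("TableY", "Y", [(1, "2012-01-01", 301), (1, "2012-01-02", 302),
                     (1, "2012-01-03", 303), (1, "2012-01-11", 311),
                     (2, "2012-01-01", 350)]),
    ("TableZ", "Z", [(1, "2012-01-01", 401), (1, "2012-01-03", 403),
                     (1, "2012-01-04", 404), (1, "2012-01-11", 411),
                     (1, "2012-01-21", 421), (2, "2012-01-01", 450)]),
]
for name, column, data in tables:
    conn.execute(f"CREATE TABLE {name} (ID INTEGER, Date TEXT, {column} INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", data)

query = """
SELECT source.ID, source.Date, x.X, y.Y, z.Z
FROM (
    SELECT ID, Date FROM TableX
    UNION
    SELECT ID, Date FROM TableY
    UNION
    SELECT ID, Date FROM TableZ
) AS source                      -- every ID/Date key that occurs anywhere
LEFT JOIN TableX x ON source.ID = x.ID AND source.Date = x.Date
LEFT JOIN TableY y ON source.ID = y.ID AND source.Date = y.Date
LEFT JOIN TableZ z ON source.ID = z.ID AND source.Date = z.Date
ORDER BY source.ID, source.Date
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

Because UNION removes duplicate keys before the joins, each ID/Date pair appears once, provided each input table itself has at most one row per ID and date.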
I have a table similar to the following:
employee_id | totalWorkHours | projectID
1 20 123
1 20 321
2 15 222
2 25 333
3 10 434
3 12 343
Is it possible to combine rows based on employee_id, but add totalWorkHours into an actual total for an employee and present in a result set without modifying the table?
So the results would be something like:
employee_id | actualTotalWorkHours
1 40
2 40
3 22
Or is this something better done with the raw result set?
Any help is much appreciated.
Select employee_id, Sum(totalWorkHours) As actualTotalWorkHours
From YourTableName
Group By employee_id
Order By employee_id
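A quick way to confirm that the grouped query returns the totals asked for, without modifying the table, is to run it on the sample rows. The sketch below does that with an in-memory SQLite database, used here only as a stand-in for the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTableName (employee_id INTEGER, totalWorkHours INTEGER, projectID INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO YourTableName VALUES (?, ?, ?)",
    [(1, 20, 123), (1, 20, 321), (2, 15, 222), (2, 25, 333), (3, 10, 434), (3, 12, 343)],
)

# A SELECT with GROUP BY only reads the table; the stored rows are unchanged.
totals = conn.execute(
    "SELECT employee_id, SUM(totalWorkHours) AS actualTotalWorkHours "
    "FROM YourTableName GROUP BY employee_id ORDER BY employee_id"
).fetchall()
print(totals)
```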