I have a scenario in one of the implementations to be done on Hive for reporting. I have a table structure that currently looks as below -
+------+------+----------+----------+-------+-------+--------+--------+
| Col1 | Col2 | M1_Today | M2_Today | M1_LW | M2_LW | M1_L2W | M2_L2W |
+------+------+----------+----------+-------+-------+--------+--------+
| A | A1 | 10 | 200 | 9 | 190 | 11 | 210 |
| A | A2 | 12 | 210 | 11 | 200 | 13 | 220 |
| B | B1 | 15 | 300 | 14 | 290 | 16 | 310 |
| B | B2 | 18 | 310 | 17 | 300 | 19 | 320 |
+------+------+----------+----------+-------+-------+--------+--------+
The columns in the table need to be transformed to appear as below -
+------+------+-------+----+-----+
| Col1 | Col2 | Col3 | M1 | M2 |
+------+------+-------+----+-----+
| A | A1 | Today | 10 | 200 |
| A | A1 | LW | 9 | 190 |
| A | A1 | L2W | 11 | 210 |
| A | A2 | Today | 12 | 210 |
| A | A2 | LW | 11 | 200 |
| A | A2 | L2W | 13 | 220 |
| B | B1 | Today | 15 | 300 |
| B | B1 | LW | 16 | 310 |
| B | B1 | L2W | 14 | 290 |
| B | B2 | Today | 18 | 310 |
| B | B2 | LW | 17 | 300 |
| B | B2 | L2W | 19 | 320 |
+------+------+-------+----+-----+
How can this be achieved via SQL. I am using HIVE as my datastore. Any help is much appreciated
You could use this:
SELECT Col1, Col2, 'Today' AS Col3 , M1_Today AS M1, M2_Today AS M2
FROM table_name
UNION ALL
SELECT Col1, Col2, 'LW' AS Col3 , M1_LW AS M1, M2_LW AS M2
FROM table_name
UNION ALL
SELECT Col1, Col2, 'L2W' AS Col3 , M1_L2W AS M1, M2_L2W AS M2
FROM table_name
ORDER BY Col1, Col2, Col3 DESC;
Related
I have an Oracle DB with 2 tables, Table A and Table B.
Table A has better data quality but only for a limited set of entries. Table A has also multiple entries (because of history) and I only need the last one per number.
So I need to get all entries from Table A and then the rest of the entries which are not in Table A from Table B.
I also need to get some data from Table B into the result of Table A because the info does not exist in Table A (val1, val2, val3).
So probably some sort of JOIN + GROUP BY?
Table A:
number | valid_from | valid_to | pos1 | pos2 | pos3 | factor | loc
100 | 2020-03-01 | 2020-03-10 | 7 | 80 | 18 | 19 | 1
100 | 2020-03-10 | 2020-03-13 | 7 | 80 | 18 | 19 | 1
100 | 2020-03-13 | 2020-03-16 | 8 | 80 | 18 | 20 | 1
200 | 2020-03-02 | 2020-03-03 | 6 | 90 | 19 | 30 | 1
200 | 2020-03-03 | 2020-03-04 | 6 | 90 | 19 | 29 | 1
200 | 2020-03-04 | 2020-03-10 | 9 | 90 | 19 | 30 | 1
300 | 2020-03-10 | 2020-03-12 | 13 | 100 | 10 | 41 | 2
300 | 2020-03-12 | 2020-03-14 | 13 | 100 | 10 | 40 | 2
300 | 2020-03-14 | 2020-03-20 | 10 | 100 | 10 | 40 | 2
Table B:
number | pos1 | pos2 | pos3 | val1 | val2 | val3 | top
100 | 7 | 70 | 18 | a | aa | aaa | 3
200 | 6 | 60 | 19 | b | bb | bbb | 4
300 | 5 | 50 | 10 | c | cc | ccc | 5
400 | 2 | 20 | 2 | d | dd | ddd | 16
500 | 3 | 30 | 3 | e | ee | eee | 28
End result should be:
number | pos1 | pos2 | pos3 | factor | loc | val1 | val2 | val3 | top
100 | 8 | 80 | 18 | 20 | 1 | a | aa | aaa | 3
200 | 9 | 90 | 19 | 30 | 1 | b | bb | bbb | 4
300 | 10 | 100 | 10 | 40 | 2 | c | cc | ccc | 5
400 | 2 | 20 | 2 | NULL | NULL | d | dd | ddd | 16
500 | 3 | 30 | 3 | NULL | NULL | e | ee | eee | 28
How can I achieve this? Do I need a FULL LEFT JOIN and GROUP BY by number? Not sure what to take or how to get the latest entries from Table A.
select b.number, b.pos1, b.pos2, b.pos3,
a.factor, a.loc, b.val1, b.val2, b.val3, b.top
from tableb b
left outer join tablea a
on b.number = a.number
you can also use NVL(b.pos1, a.pos1) which means if b.pos1 is null take a.pos1
I would write this as a left join with filtering:
select b.number, b.pos1, b.pos2, b.pos3,
a.factor, a.loc,
b.val1, b.val2, b.val3, b.top
from b
left join (
select a.*, row_number() over(partition by number order by valid_from desc) rn
from a
) a on a.number = b.number and a.rn = 1
Given the following SQL tables: https://imgur.com/a/NI8VrC7. For each specific ID_t I need to return the MAX() and MIN() value of Cena_c(total price) column of a given ID_t.
| ID_t | Nazwa |
| ---- | ----- |
| 1 | T1 |
| 2 | T2 |
| 3 | T3 |
| 4 | T4 |
| 5 | T5 |
| 6 | T6 |
| 7 | T7 |
| ID | ID_t | Ilosc | Cena_j | Cena_c | ID_p |
| ---- | ---- | ----- | ------ | ------ | ---- |
| 100 | 1 | 1 | 10 | 10 | 1 |
| 101 | 2 | 3 | 20 | 60 | 2 |
| 102 | 4 | 5 | 10 | 50 | 7 |
| 103 | 2 | 2 | 20 | 40 | 5 |
| 104 | 5 | 1 | 30 | 30 | 5 |
| 105 | 7 | 6 | 80 | 480 | 1 |
| 106 | 6 | 7 | 15 | 105 | 2 |
| 107 | 6 | 5 | 15 | 75 | 1 |
| 108 | 3 | 3 | 25 | 75 | 7 |
| 109 | 7 | 1 | 80 | 80 | 5 |
| 110 | 4 | 1 | 10 | 10 | 2 |
| 111 | 2 | 9 | 20 | 180 | 2 |
Based on provided tables the correct result should look like this:
| ID_t | Cena_c_max | Cena_c_min |
| ----- | ---------- | ---------- |
| T1 | 10 | 10 |
| T2 | 180 | 60 |
| T3 | 75 | 75 |
| T4 | 50 | 10 |
| T5 | 30 | 30 |
| T6 | 105 | 75 |
| T7 | 480 | 80 |
Is this even possible?
I haven't found anything yet that I could use to implement my solution.
SELECT concat('T',ID_t), max(Cena_c) as Cena_c_max, min(Cena_c) as Cena_c_min
FROM table
GROUP BY ID_t
Better is to solve it with joins of tables, because it will be avoided in the future if the prefix T is changed to another letter.
Hardcoding should be avoided.
select b.nazva as "Nazva", max(a.cena.c) as "Cena_c_max", min(a.cena.c) as "Cena_c_min"
from table1 as a
left join table2 as b on (
a.id_t = b.id_t
)
group by id_t
Consider the following simplified example:
Table JobTitles
| PersonID | JobTitle | StartDate | EndDate |
|----------|----------|-----------|---------|
| A | A1 | 1 | 5 |
| A | A2 | 6 | 10 |
| A | A3 | 11 | 15 |
| B | B1 | 2 | 4 |
| B | B2 | 5 | 7 |
| B | B3 | 8 | 11 |
| C | C1 | 5 | 12 |
| C | C2 | 13 | 14 |
| C | C3 | 15 | 18 |
Table Transactions:
| PersonID | TransDate | Amt |
|----------|-----------|-----|
| A | 2 | 5 |
| A | 3 | 10 |
| A | 12 | 5 |
| A | 12 | 10 |
| B | 3 | 5 |
| B | 3 | 10 |
| B | 10 | 5 |
| C | 16 | 10 |
| C | 17 | 5 |
| C | 17 | 10 |
| C | 17 | 5 |
Desired Output:
| PersonID | JobTitle | StartDate | EndDate | Amt |
|----------|----------|-----------|---------|-----|
| A | A1 | 1 | 5 | 15 |
| A | A2 | 6 | 10 | 0 |
| A | A3 | 11 | 15 | 15 |
| B | B1 | 2 | 4 | 15 |
| B | B2 | 5 | 7 | 0 |
| B | B3 | 8 | 11 | 5 |
| C | C1 | 5 | 12 | 0 |
| C | C2 | 13 | 14 | 0 |
| C | C3 | 15 | 18 | 30 |
To me this is JobTitles LEFT OUTER JOIN Transactions with some type of moving criteria for the TransDate -- that is, I want to SUM Transaction.Amt if Transactions.TransDate is between JobTitles.StartDate and JobTitles.EndDate per each PersonID.
Feels like some type of partition or window function, but my SQL skills are not strong enough to create an elegant solution. In Excel, this equates to:
SUMIFS(Transaction[Amt], JobTitles[PersonID], Results[#[PersonID]], Transactions[TransDate], ">" & Results[#[StartDate]], Transactions[TransDate], "<=" & Results[#[EndDate]])
Moreover, I want to be able to perform this same logic over several flavors of Transaction tables.
The basic query is:
select jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate, coalesce(sum(amt), 0) as amt
from JobTitles jt left join
Transactions t
on jt.PersonId = t.PersonId and
t.TransDate between jt.StartDate and jt.EndDate
group by jt.PersonID, jt.JobTitle, jt.StartDate, jt.EndDate;
I have the following table named foo:
ID | D1 | D2 | D3 |
---------------------
1 | 47 | 3 | 71 |
2 | 47 | 98 | 82 |
3 | 0 | 99 | 3 |
4 | 3 | 100 | 6 |
5 | 48 | 10 | 3 |
6 | 49 | 12 | 4 |
I want to run a select query and have the results show like this
ID | D1 | D2 | D3 | Result |
------------------------------
1 | 47 | 3 | 71 | D3 |
2 | 47 | 98 | 82 | D2 |
3 | 0 | 99 | 3 | D2 |
4 | 3 | 100 | 6 | D2 |
5 | 48 | 10 | 3 | D1 |
6 | 49 | 12 | 4 | D1 |
So, basically I want to get Maximum value between D1, D2, D3 column divided by id.
As You may seen , ID 1 have D3 in the Result column since Maximum value between
D1 : D2 : D3
That Means 4 : 3 : 71 , Max value is 71. Thats Why The Result show 'D3'
Is there a way to do this in a sql query ?
Thanks!
For Oracle please try this one
select foo.*, case when greatest(d1, d2, d3) = d1 then 'D1'
when greatest(d1, d2, d3) = d2 then 'D2'
when greatest(d1, d2, d3) = d3 then 'D3'
end result
from foo
Consider the following - a normalized approach...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL
,d INT NOT NULL
,val INT NOT NULL
,PRIMARY KEY(id,d)
);
INSERT INTO my_table VALUES
(1,1,47),
(2,1,47),
(3,1,0),
(4,1,3),
(5,1,48),
(6,1,49),
(1,2,3),
(2,2,98),
(3,2,99),
(4,2,100),
(5,2,10),
(6,2,12),
(1,3,71),
(2,3,82),
(3,3,3),
(4,3,6),
(5,3,3),
(6,3,4);
SELECT * FROM my_table;
+----+---+-----+
| id | d | val |
+----+---+-----+
| 1 | 1 | 47 |
| 1 | 2 | 3 |
| 1 | 3 | 71 |
| 2 | 1 | 47 |
| 2 | 2 | 98 |
| 2 | 3 | 82 |
| 3 | 1 | 0 |
| 3 | 2 | 99 |
| 3 | 3 | 3 |
| 4 | 1 | 3 |
| 4 | 2 | 100 |
| 4 | 3 | 6 |
| 5 | 1 | 48 |
| 5 | 2 | 10 |
| 5 | 3 | 3 |
| 6 | 1 | 49 |
| 6 | 2 | 12 |
| 6 | 3 | 4 |
+----+---+-----+
SELECT x.*
FROM my_table x
JOIN
( SELECT id,MAX(val) max_val FROM my_table GROUP BY id) y
ON y.id = x.id
AND y.max_val = x.val;
+----+---+-----+
| id | d | val |
+----+---+-----+
| 1 | 3 | 71 |
| 2 | 2 | 98 |
| 3 | 2 | 99 |
| 4 | 2 | 100 |
| 5 | 1 | 48 |
| 6 | 1 | 49 |
+----+---+-----+
(This is intended as a MySQL solution - I'm not familiar with ORACLE syntax, so apologies if this doesn't port)
Does this answer your comment?
SELECT x.* , y.max_val
FROM my_table x
JOIN
( SELECT id,MAX(val) max_val FROM my_table GROUP BY id) y
ON y.id = x.id ;
+----+---+-----+---------+
| id | d | val | max_val |
+----+---+-----+---------+
| 1 | 1 | 47 | 71 |
| 1 | 2 | 3 | 71 |
| 1 | 3 | 71 | 71 |
| 2 | 1 | 47 | 98 |
| 2 | 2 | 98 | 98 |
| 2 | 3 | 82 | 98 |
| 3 | 1 | 0 | 99 |
| 3 | 2 | 99 | 99 |
| 3 | 3 | 3 | 99 |
| 4 | 1 | 3 | 100 |
| 4 | 2 | 100 | 100 |
| 4 | 3 | 6 | 100 |
| 5 | 1 | 48 | 48 |
| 5 | 2 | 10 | 48 |
| 5 | 3 | 3 | 48 |
| 6 | 1 | 49 | 49 |
| 6 | 2 | 12 | 49 |
| 6 | 3 | 4 | 49 |
+----+---+-----+---------+
I need data grid in my UI of the following form
| Col1 | Col2 | Col3 | Col4 |
--------------------------------------
| | | | |
Row1 | 10 | 20 | 30 | 40 |
| | | | |
Row2 | 50 | 60 | 70 | 80 |
| | | | |
Row3 | 90 | 100 | 110 | 120 |
| | | | |
Row4 | 130 | 140 | 150 | 160 |
The schema of the above DataGrid is stored in the DB as
Table T1
ID | Description | DimensionType
-------------------------------------------
101 | Row1 | 1
102 | Row2 | 1
103 | Row3 | 1
104 | Row4 | 1
105 | Col1 | 2
106 | Col2 | 2
107 | Col3 | 2
108 | Col4 | 2
In the above table DimensionType denotes whether the description is row or column. DimensionType = 1 means row and DimensionType = 2 means column.
The values stored in DB are as follows
Table T2
ID | T1R | T1C | Value
----------------------------------
1001 | 101 | 105 | 10
1002 | 101 | 106 | 20
1003 | 101 | 107 | 30
1004 | 101 | 108 | 40
1005 | 102 | 105 | 50
1006 | 102 | 106 | 60
1007 | 102 | 107 | 70
1008 | 102 | 108 | 80
.
.
.
an so on.
I wish to retrieve the data in the following form.
Row | C1 | Value | C2 | Value | C3 | Value | C4 | Value |
--------------------------------------------------------------------------------------------
| | | | | | | | |
Row1 | Col1 | 10 | Col2 | 20 | Col3 | 30 | Col4 | 40 |
Row2 | Col1 | 50 | Col2 | 60 | Col3 | 70 | Col4 | 80 |
Row3 | Col1 | 90 | Col2 | 100 | Col3 | 110 | Col4 | 120 |
Row4 | Col1 | 130 | Col2 | 140 | Col3 | 150 | Col4 | 160 |
| | | | | | | | |
Need to write a query that can print the data in above format (in MSSQL). If the retrieval can be made further optimized it would be even more helpful, i.e, of the form
Row | Col1 | Col2 | Col3 | Col4 |
--------------------------------------------------
| | | | |
Row1 | 10 | 20 | 30 | 40 |
Row2 | 50 | 60 | 70 | 80 |
Row3 | 90 | 100 | 110 | 120 |
Row4 | 130 | 140 | 150 | 160 |
| | | | |
Thanks in advance!!
Untested, but try this out (I have no MS SQL in front of me and had not worked in it for at least a year, so this code was written blindly):
select t2id, t2t1r, t2t1c, rdesc,
STUFF((SELECT ', ' + Value from t ttemp where ttemp.t2t1r = t.t2t1r FOR XML PATH('')), 1, 2, '') as ConcatenatedValues,
STUFF((SELECT ', ' + cdesc from t ttemp where ttemp.t2t1r = t.t2t1r FOR XML PATH('')), 1, 2, '') as cdescs,
from
(select T2.ID as t2id, T2.T1R as t2t1r, T2.T1C as t2t1c, Value, row.Description as rdesc, col.Description as cdesc
from T1 join T2 row
on T1.ID = row.T1R and row.DimensionType = 1
join T2 col
on T1.ID = col.T1C and col.DimensionType = 2) t