SQL HIVE | Duplicate lines in Table


I have a table like this, where the key columns are [C1, C2, C3]. I want to remove the duplicates from my table.
Input :
C1 C2 C3 C4 C5
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A4 D1 V2 X1 F3
A2 D1 V1 X1 F3
Output :
C1 C2 C3 C4 C5
A1 D1 V1 X1 F3
A2 D1 V1 X2 F2
A4 D1 V2 X1 F3
Regards,

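Try the following. Note that DISTINCT * only removes rows that are identical in every column, so rows that share C1, C2, C3 but differ in C4 or C5 are all kept:
insert overwrite table yourtable select distinct * from yourtable;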

You can select the distinct rows with
SELECT DISTINCT * FROM Table
Store that result first (for example in a temporary table), then truncate the table and insert the stored result back into it.

You can use ROW_NUMBER() window function:
select t.c1, t.c2, t.c3, t.c4, t.c5
from (
    select *, row_number() over (partition by c1, c2, c3 order by c4, c5) rn
    from tablename
) t
where t.rn = 1
You can remove order by c4, c5 if you do not care which row of each group is kept.
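To see what the ROW_NUMBER() approach returns for the sample data in the question, here is a minimal sketch that runs the same query in an in-memory SQLite database from Python. SQLite is only a stand-in for Hive here (an assumption for the sake of a runnable example; window functions need SQLite 3.25 or newer), and the outer ORDER BY is added just to make the printed order stable. Note that with order by c4, c5 the key A2/D1/V1 keeps the row X1/F3, not the X2/F2 row shown in the question's expected output; changing the ORDER BY inside the window changes which row survives.

```python
import sqlite3

# In-memory SQLite stands in for Hive (assumption); the table name follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "D1", "V1", "X1", "F3"),
        ("A2", "D1", "V1", "X2", "F2"),
        ("A1", "D1", "V1", "X1", "F3"),
        ("A2", "D1", "V1", "X2", "F2"),
        ("A4", "D1", "V2", "X1", "F3"),
        ("A2", "D1", "V1", "X1", "F3"),
    ],
)

# Keep exactly one row per (c1, c2, c3): the first one when sorted by c4, c5.
deduped = conn.execute(
    """
    SELECT t.c1, t.c2, t.c3, t.c4, t.c5
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY c1, c2, c3 ORDER BY c4, c5) AS rn
        FROM tablename
    ) t
    WHERE t.rn = 1
    ORDER BY t.c1
    """
).fetchall()

for row in deduped:
    print(row)
```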

Does aggregation do what you want?
select c1, c2, c3, max(c4), max(c5)
from t
group by c1, c2, c3;
This does not guarantee that c4 and c5 come from the same row, but it does guarantee that the triple c1/c2/c3 appears only once.
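The caveat about c4 and c5 not coming from the same row can be seen on the question's own sample data. The sketch below again uses an in-memory SQLite database from Python as a stand-in for Hive (an assumption); the variable names are only illustrative. For the key A2/D1/V1 the aggregation combines X2 from one row with F3 from another, producing a row that never existed in the input.

```python
import sqlite3

source = [
    ("A1", "D1", "V1", "X1", "F3"),
    ("A2", "D1", "V1", "X2", "F2"),
    ("A1", "D1", "V1", "X1", "F3"),
    ("A2", "D1", "V1", "X2", "F2"),
    ("A4", "D1", "V2", "X1", "F3"),
    ("A2", "D1", "V1", "X1", "F3"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT, c5 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", source)

# One row per key, but MAX() is evaluated per column, not per row.
aggregated = conn.execute(
    """
    SELECT c1, c2, c3, MAX(c4), MAX(c5)
    FROM t
    GROUP BY c1, c2, c3
    ORDER BY c1
    """
).fetchall()

# Rows in the result that do not appear anywhere in the input.
mixed = [row for row in aggregated if row not in source]
print(aggregated)
print(mixed)
```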

Related

toad formatter starts next line under trailing comments

I find that the Toad formatter starts the next line of code under trailing comments.
Is there a way to change this?
This happens regardless of the wrapping/stacking options.
e.g.
Original SQL:
SELECT C1, C2, C3, C4, C5 FROM DUAL
Now I add a trailing comment after C3:
SELECT C1, C2, C3 -- this is a test
, C4, C5 FROM DUAL
Result of formatting - C4 starts below the trailing comment:
SELECT C1, C2, C3 -- this is a test
, C4, C5 FROM DUAL
Preferred solution - Stack the columns:
SELECT C1
, C2
, C3 -- this is a test
, C4
, C5
FROM DUAL
Is there a setting I can use to change this?
Kind regards
fe
Decided to add another example where the comment is in the WHERE clause:
Original SQL:
SELECT C1
, C2
, C3
FROM DUAL
WHERE C1 = 1 AND C2 = 2 AND C3 = 3
Adding comment and formatting:
SELECT C1
, C2
, C3
FROM DUAL
WHERE C1 = 1 AND C2 = 2 -- this is a test
AND C3 = 3
Preferred solution :
SELECT C1
, C2
, C3
FROM DUAL
WHERE C1 = 1
AND C2 = 2 -- this is a test
AND C3 = 3
fe

Show records from two different tables alternately, one by one?

Table 1
ABC DEF
GS PM
BS PK
Table 2
ABC DEF
YZ TT
UG KK
Need output
ABC DEF
GS PM
YZ TT
BS PK
UG KK
Please help me write the SQL query for this.

table1:
Azbuka Def
A1 D1
A2 D2
A3 D3
A4 D4
table2:
Azbuka Def
F1 H1
F2 H2
F3 H3
F4 H4
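DECLARE @max INT;
SELECT @max = COUNT(*) FROM table1;
-- Recursive CTE that generates the numbers 1..@max
;WITH CTE AS (
    SELECT 1 num
    UNION ALL
    SELECT num + 1
    FROM CTE
    WHERE num < @max
)
SELECT t1.Azbuka, t1.Def
FROM CTE CC
INNER JOIN (
    -- Number the rows of each table; src records which table a row came from
    SELECT ROW_NUMBER() OVER (ORDER BY id ASC) AS RNum, 1 AS src, Azbuka, Def
    FROM table1
    UNION ALL
    SELECT ROW_NUMBER() OVER (ORDER BY id ASC) AS RNum, 2 AS src, Azbuka, Def
    FROM table2
) t1 ON t1.RNum = CC.num
-- Without an ORDER BY the interleaved order is not guaranteed
ORDER BY t1.RNum, t1.src;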
Result:
Azbuka Def
A1 D1
F1 H1
A2 D2
F2 H2
A3 D3
F3 H3
A4 D4
F4 H4

A simpler example, without recursion:
Select main.Azbuka, main.Def from (
    Select
        (ROW_NUMBER() OVER(ORDER BY id ASC))*2 AS RNum,
        Azbuka,
        Def
    from TEST_DB.dbo.table1
    union all
    Select
        ((ROW_NUMBER() OVER(ORDER BY id ASC))*2 + 1) AS RNum,
        Azbuka,
        Def
    from TEST_DB.dbo.table2
) main
order by main.RNum
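The even/odd numbering trick can be tried out on the two small tables from the question. This is a minimal sketch using an in-memory SQLite database from Python as a stand-in for SQL Server (an assumption, as is the id column, which the answers rely on but the question does not show). Rows from table1 get the numbers 2, 4, ... and rows from table2 get 3, 5, ..., so sorting by that number alternates between the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The id column is assumed; it defines the order of rows inside each table.
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, Azbuka TEXT, Def TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY, Azbuka TEXT, Def TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [(1, "GS", "PM"), (2, "BS", "PK")])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [(1, "YZ", "TT"), (2, "UG", "KK")])

interleaved = conn.execute(
    """
    SELECT main.Azbuka, main.Def
    FROM (
        -- table1 rows get even numbers: 2, 4, ...
        SELECT ROW_NUMBER() OVER (ORDER BY id) * 2 AS RNum, Azbuka, Def
        FROM table1
        UNION ALL
        -- table2 rows get odd numbers: 3, 5, ...
        SELECT ROW_NUMBER() OVER (ORDER BY id) * 2 + 1 AS RNum, Azbuka, Def
        FROM table2
    ) main
    ORDER BY main.RNum
    """
).fetchall()

for row in interleaved:
    print(row)
```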

Adding column in table with the values in another if matched

I have two tables:
table1
A B C
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
A5 B5 C5
A6 B6 C6
table2
A D
A1 D1
A3 D3
A5 D5
A6 D6
I would like to update table1 with a column D that holds the value of column D from table2, matched on A. Is altering table1 to add column D and then merging both tables with an update when matched the way to go, or is there a better approach?

You can just join the value in when you need it:
select t1.*, t2.d
from table1 t1 left join
table2 t2
on t1.a = t2.a;
If that is not sufficient, you can add the column:
alter table table1 add d <type>;
Then you can update it:
update table1 t1
set d = (select t2.d from table2 t2 where t2.a = t1.a)
where exists (select t2.d from table2 t2 where t2.a = t1.a);
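Both options can be compared on the sample tables. The sketch below uses an in-memory SQLite database from Python as a stand-in for the asker's database (an assumption; TEXT is an assumed column type, and the UPDATE refers to table1 by name rather than by alias to stay within syntax SQLite accepts). The join computes D on the fly, while ALTER TABLE plus UPDATE stores it; rows without a match in table2 end up with NULL either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a TEXT, b TEXT, c TEXT)")
conn.execute("CREATE TABLE table2 (a TEXT, d TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(f"A{i}", f"B{i}", f"C{i}") for i in range(1, 7)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("A1", "D1"), ("A3", "D3"), ("A5", "D5"), ("A6", "D6")],
)

# Option 1: join the value in when it is needed.
joined = conn.execute(
    """
    SELECT t1.a, t1.b, t1.c, t2.d
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.a = t2.a
    ORDER BY t1.a
    """
).fetchall()

# Option 2: add the column and fill it from table2 where a match exists.
conn.execute("ALTER TABLE table1 ADD COLUMN d TEXT")
conn.execute(
    """
    UPDATE table1
    SET d = (SELECT t2.d FROM table2 t2 WHERE t2.a = table1.a)
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.a = table1.a)
    """
)
updated = conn.execute("SELECT a, b, c, d FROM table1 ORDER BY a").fetchall()

print(joined)
print(updated)
```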

SQL combine basic order by and custom order by

I am using an Oracle database and I am trying to combine a basic ORDER BY with a custom one in one of my queries.
Here's my table :
table1
-----------------
C1 | C2 | C3 | C4
I am trying to order it like this:
SELECT C1,C2,C3,C4 FROM table1
ORDER BY C1, C2, C3, (
CASE C4
WHEN C4 = 'value1' THEN 1
WHEN C4 = 'value2' THEN 2
WHEN C4 = 'value3' THEN 3
END
)
But I'm getting "Missing keyword" and I can't find which one, any ideas?

Your query mixes the two forms of CASE (CASE C4 followed by WHEN C4 = ...). You can use the searched form:
SELECT C1,C2,C3,C4 FROM table1
ORDER BY C1, C2, C3, (
CASE
WHEN C4 = 'value1' THEN 1
WHEN C4 = 'value2' THEN 2
WHEN C4 = 'value3' THEN 3
END
)
or the simple form:
SELECT C1,C2,C3,C4 FROM table1
ORDER BY C1, C2, C3, (
CASE C4
WHEN 'value1' THEN 1
WHEN 'value2' THEN 2
WHEN 'value3' THEN 3
END
)
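To check that the two CASE forms sort identically, here is a minimal sketch using an in-memory SQLite database from Python as a stand-in for Oracle (an assumption). The values 'low', 'medium' and 'high' are hypothetical replacements for value1 to value3, chosen so that the custom order differs from plain alphabetical order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [("a", "b", "c", "high"), ("a", "b", "c", "low"), ("a", "b", "c", "medium")],
)

# Searched CASE: every WHEN carries a full condition.
searched = conn.execute(
    """
    SELECT c4 FROM table1
    ORDER BY c1, c2, c3,
        CASE
            WHEN c4 = 'low' THEN 1
            WHEN c4 = 'medium' THEN 2
            WHEN c4 = 'high' THEN 3
        END
    """
).fetchall()

# Simple CASE: the expression is named once, every WHEN carries only a value.
simple = conn.execute(
    """
    SELECT c4 FROM table1
    ORDER BY c1, c2, c3,
        CASE c4
            WHEN 'low' THEN 1
            WHEN 'medium' THEN 2
            WHEN 'high' THEN 3
        END
    """
).fetchall()

print(searched)
print(simple)
```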

Intersecting N width_buckets

I'm trying to take subsets from bucketed columns, and then take the intersection.
This will select other columns from the original table.
I'm also open to filtering in series.
The code below reports that col1 doesn't exist, and I'm not sure it's the correct approach anyway.
WITH ranges AS (
SELECT
min(col1) AS c1min,
max(col1) AS c1max,
min(col2) AS c2min,
max(col2) AS c2max
FROM csv_test
),
f1 AS (
SELECT width_bucket(col1,c1min,c1max,12) AS b1
FROM csv_test, ranges
ORDER BY b1 ASC
),
f2 AS (
SELECT width_bucket(col2,c2min,c2max,12) AS b2
FROM csv_test, ranges
ORDER BY b2 ASC
)
SELECT b1, b2, c3, c4, c18
FROM csv_test
WHERE
b1 BETWEEN 0 AND 5
AND
b2 BETWEEN 3 AND 7;

You could use a LATERAL join:
SELECT t.*, s2.*
FROM csv_test t
,LATERAL (SELECT
min(col1) AS c1min,
max(col1) AS c1max,
min(col2) AS c2min,
max(col2) AS c2max
FROM csv_test) AS s
,LATERAL (SELECT width_bucket(col1,c1min,c1max,12) AS b1,
width_bucket(col2,c2min,c2max,12) AS b2) AS s2
WHERE b1 BETWEEN 0 AND 5
AND b2 BETWEEN 3 AND 7;
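The key point is that b1 and b2 must be computed in a derived table (or LATERAL subquery) before they can be filtered on. As a rough illustration of that idea, the sketch below uses plain CTEs instead of LATERAL and runs in an in-memory SQLite database from Python. Everything here is an assumption made for the example: SQLite has no width_bucket, so a small Python function emulating PostgreSQL's behaviour for an ascending range is registered under that name, the sample rows are invented, and 4 buckets are used instead of 12 to keep the numbers easy to follow.

```python
import sqlite3

def width_bucket(x, low, high, count):
    """Rough emulation of PostgreSQL width_bucket for low < high (assumption)."""
    if x < low:
        return 0
    if x >= high:
        return count + 1
    return int((x - low) * count / (high - low)) + 1

conn = sqlite3.connect(":memory:")
conn.create_function("width_bucket", 4, width_bucket)
conn.execute("CREATE TABLE csv_test (col1 INTEGER, col2 INTEGER, c3 TEXT)")
# Invented sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO csv_test VALUES (?, ?, ?)",
    [(0, 0, "a"), (2, 8, "b"), (4, 4, "c"), (6, 2, "d"), (8, 6, "e")],
)

rows = conn.execute(
    """
    WITH ranges AS (
        SELECT min(col1) AS c1min, max(col1) AS c1max,
               min(col2) AS c2min, max(col2) AS c2max
        FROM csv_test
    ),
    bucketed AS (
        -- Compute both buckets per row so the outer query can filter on them.
        SELECT t.c3,
               width_bucket(t.col1, r.c1min, r.c1max, 4) AS b1,
               width_bucket(t.col2, r.c2min, r.c2max, 4) AS b2
        FROM csv_test t CROSS JOIN ranges r
    )
    SELECT c3, b1, b2
    FROM bucketed
    WHERE b1 BETWEEN 1 AND 3
      AND b2 BETWEEN 3 AND 5
    ORDER BY c3
    """
).fetchall()

print(rows)
```

Because the row holding the maximum value is equal to the upper bound, it lands in bucket count + 1 rather than in the last regular bucket, which is worth keeping in mind when choosing the BETWEEN limits.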
