How to update a Column from a Bunch of other columns - sql

I have a table A with Column 1, Column 2, Column 3, Column 4 and Column 5.
Columns 1-4 already have data, and we need to update Column 5 based on that data and on priority.
Column 1 has priority 5, Column 2 has priority 4, Column 3 has priority 3 and Column 4 has priority 2.
So if a particular row has data in all the columns, the query should pick Column 1, since it has the highest priority, and write its value to Column 5.
If a record has data only in Columns 3 and 4, it should pick Column 3 and write that to Column 5, since Column 3 has higher priority than Column 4.
If there is no data in Columns 1-4, Column 5 should be NULL.
I have 24k records in my table and need to run this for all rows.
Any pointers for this query would be highly appreciated.

I think you want coalesce() -- assuming that the columns with no values have NULL:
update t
set col5 = coalesce(col1, col2, col3, col4);
You can also put the coalesce() in a select, if you don't want to actually change the data.
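A quick way to sanity-check the priority behaviour, sketched here with Python's built-in sqlite3 as a stand-in database (COALESCE returns the first non-NULL argument in most SQL dialects; the table and column names follow the answer above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1, col2, col3, col4, col5)")
conn.executemany(
    "INSERT INTO t (col1, col2, col3, col4) VALUES (?, ?, ?, ?)",
    [
        ("a", "b", "c", "d"),       # all columns filled -> col1 wins
        (None, None, "c", "d"),     # only cols 3 and 4 -> col3 wins
        (None, None, None, None),   # nothing -> col5 stays NULL
    ],
)
# COALESCE picks the first non-NULL argument, which matches the
# priority order col1 > col2 > col3 > col4.
conn.execute("UPDATE t SET col5 = COALESCE(col1, col2, col3, col4)")
print([row[0] for row in conn.execute("SELECT col5 FROM t")])
# -> ['a', 'c', None]
```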

Related

Constraint on Column Based on Another Column Value

I'm writing a SQL constraint on column values based on some conditions, in an Oracle database. My table is like the one below (assume id is auto-incremented; the 'alpha' and 'beta' columns are numbers):
id alpha beta
--------------------------
1 1 0
2 1 1
3 0 0
4 0 0
5 2 3
6 4 1
If the alpha values in two rows are the same, only one of those rows can have a beta value of 1. In other words, I shouldn't be able to insert a row with values (1, 1), because there is already a row with alpha = 1 and beta = 1 (see the row with id = 2). Any beta value besides 1 can be inserted freely. I need a reliable way to enforce this.
You can use a function-based index:
create unique index xxx on t(alpha, case when beta = 1 then -1 else id end)
Here is a db<>fiddle.
You can use a unique index on a conditional expression as follows:
create unique index u123
on your_table (alpha, case when beta = 1 then beta else id end)
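SQLite also supports indexes on expressions, so the trick can be sketched with Python's sqlite3; the index definition mirrors the Oracle one above, and -1 is a sentinel value that assumes real ids are positive:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, alpha INTEGER, beta INTEGER)")
# Rows with beta = 1 collapse to the sentinel -1 in the index key, so at
# most one beta = 1 row can exist per alpha value; all other rows index
# on their unique id and are never in conflict.
conn.execute(
    "CREATE UNIQUE INDEX xxx ON t (alpha, CASE WHEN beta = 1 THEN -1 ELSE id END)"
)
conn.execute("INSERT INTO t VALUES (1, 1, 0)")
conn.execute("INSERT INTO t VALUES (2, 1, 1)")      # first beta = 1 for alpha = 1: OK
rejected = False
try:
    conn.execute("INSERT INTO t VALUES (3, 1, 1)")  # second beta = 1 for alpha = 1
except sqlite3.IntegrityError:
    rejected = True                                 # blocked by the unique index
conn.execute("INSERT INTO t VALUES (4, 2, 1)")      # different alpha: OK
print(rejected)  # -> True
```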

How to count all rows in raw data file using Hive?

I am reading some raw input which looks something like this:
20 abc def
21 ghi jkl
mno pqr
23 stu
Note the first two rows are "good" rows and the last two are "bad" rows, since they are missing some data.
Here is the snippet of my Hive query that reads this raw data into a read-only external table:
DROP TABLE IF EXISTS readonly_s3;
CREATE EXTERNAL TABLE readonly_s3 (id string, name string, data string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
I need to get the count of ALL the rows, both "good" and "bad". The problem is that some of the data is missing, and if I do SELECT count(id) AS total_rows, for example, that doesn't work, since not all the rows have an id.
Any suggestions on how I can count ALL the rows in this raw data file?
Hmmm . . . You can use:
select sum(case when id is not null and name is not null and data is not null then 1 else 0 end) as num_good,
       sum(case when id is null or name is null or data is null then 1 else 0 end) as num_bad
from readonly_s3;
(The CASE expressions test the table's actual columns - id, name and data; num_good + num_bad is the total row count.)
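Note also that plain count(*) counts every row regardless of NULLs, while count(id) skips rows with a NULL id. A sketch with Python's sqlite3, using one plausible parse of the four sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readonly_s3 (id TEXT, name TEXT, data TEXT)")
conn.executemany("INSERT INTO readonly_s3 VALUES (?, ?, ?)", [
    ("20", "abc", "def"),
    ("21", "ghi", "jkl"),
    (None, "mno", "pqr"),   # bad row: missing id
    ("23", "stu", None),    # bad row: missing data
])
# count(*) counts every row; count(id) only counts rows with a non-NULL id.
total, with_id = conn.execute(
    "SELECT count(*), count(id) FROM readonly_s3").fetchone()
good, bad = conn.execute("""
    SELECT sum(CASE WHEN id IS NOT NULL AND name IS NOT NULL
                     AND data IS NOT NULL THEN 1 ELSE 0 END),
           sum(CASE WHEN id IS NULL OR name IS NULL
                     OR data IS NULL THEN 1 ELSE 0 END)
    FROM readonly_s3""").fetchone()
print(total, with_id, good, bad)  # -> 4 3 2 2
```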

kdb+ conditional insert: only insert when column value doesn't exist

What is the best way to insert a row into a table, but only if a given column value of that row doesn't already exist in the table?
E.g.:
q)table:([] col1:(); col2:(); col3:());
q)`table insert (1;2;3);
q)conditionalInsert:{if[(first where table.col1=x[0])~0N;`table insert x]};
Now when doing the following:
q)conditionalInsert[(1;2;3)];
q)conditionalInsert[(7;8;9)];
The result yields:
q)table
col1 col2 col3
--------------
1 2 3
7 8 9
This can probably be accomplished more easily. My question: what is the easiest/best way?
To be clear: the column may be a non-keyed one.
Or in other words: the table is either keyed or unkeyed, and the target column is not a key (or part of the compound key columns).
Use a keyed table?
q)table
col1| col2 col3
----| ---------
1 | 2 3
q)`table insert (1;2;4)
'insert
q)`table insert (2;2;4)
,1
q)table
col1| col2 col3
----| ---------
1 | 2 3
2 | 2 4
You can always use protected evaluation to silence the error:
q).[insert;(`table;(1;2;4));{`already_there}]
`already_there
q).[insert;(`table;(3;2;4));{`already_there}]
,2
q)table
col1| col2 col3
----| ---------
1 | 2 3
2 | 2 4
3 | 2 4
First, make sure the target column has the proper attribute (sorted or grouped), which will make the lookup faster.
Now there are 2 scenarios I can think of:
a) The table is keyed and the target column is the key column: in this case a normal insert will already behave like your conditional insert.
b) The table is either keyed or unkeyed and the target column is not a key (or part of the compound key columns):
q)conditionalInsert: {if[not x[0] in table.col1;`table insert x]}
It's better to use 'exec' in place of 'table.col1', as dot notation doesn't work for keyed tables:
q)conditionalInsert: {if[not x[0] in exec col1 from table;`table insert x]}
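This isn't q, but the same "insert only if absent" guard translates naturally to SQL, which may help readers comparing approaches. A sketch with Python's sqlite3, reusing the question's table shape:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1, col2, col3)")
conn.execute("INSERT INTO t VALUES (1, 2, 3)")

def conditional_insert(row):
    # INSERT ... SELECT with a NOT EXISTS guard: the SELECT produces the
    # candidate row only when no row with that col1 value exists yet.
    conn.execute(
        "INSERT INTO t SELECT ?, ?, ? "
        "WHERE NOT EXISTS (SELECT 1 FROM t WHERE col1 = ?)",
        (*row, row[0]),
    )

conditional_insert((1, 2, 3))  # col1 = 1 already present: skipped
conditional_insert((7, 8, 9))  # new col1 value: inserted
print(conn.execute("SELECT * FROM t").fetchall())
# -> [(1, 2, 3), (7, 8, 9)]
```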

SQL query to return matrix

I have a set of rows with one column of actual data. The goal is to display this data in matrix format. The number of columns will remain the same; the number of rows may vary.
For example:
If I have 20 records and 5 columns, the number of rows would be 4.
If I have 24 records and 5 columns, the number of rows would be 5, with the 5th column of the 5th row empty.
If I have 18 records and 5 columns, the number of rows would be 4, with the 4th and 5th columns of the 4th row empty.
I was thinking of generating a column value for each row, repeating after every 5 rows. But I ran into the error "A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations".
Not sure how it can be achieved.
Any advice will be helpful.
Further addition - I have managed to generate, for each row value, the column name it should go to. Example -
Name1 Col01
Name2 Col02
Name3 Col03
Name4 Col01
Name5 Col02
You can use ROW_NUMBER to assign a sequential integer from 0 up. Then group by the result of integer division whilst pivoting on the remainder.
WITH T AS
(
SELECT number,
ROW_NUMBER() OVER (ORDER BY number) -1 AS RN
FROM master..spt_values
)
SELECT MAX(CASE WHEN RN%5 = 0 THEN number END) AS Col1,
MAX(CASE WHEN RN%5 = 1 THEN number END) AS Col2,
MAX(CASE WHEN RN%5 = 2 THEN number END) AS Col3,
MAX(CASE WHEN RN%5 = 3 THEN number END) AS Col4,
MAX(CASE WHEN RN%5 = 4 THEN number END) AS Col5
FROM T
GROUP BY RN/5
ORDER BY RN/5
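The same query runs on any database with window functions. A sketch with Python's sqlite3 (the src table and its 18-record data set are made up to reproduce the third example from the question; the last row gets padded with NULLs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (number INTEGER)")
conn.executemany("INSERT INTO src VALUES (?)", [(n,) for n in range(1, 19)])

# Number the rows 0..17, then: rn / 5 picks the output row (GROUP BY),
# rn % 5 picks the output column (the CASE pivot).
rows = conn.execute("""
    WITH T AS (
        SELECT number,
               ROW_NUMBER() OVER (ORDER BY number) - 1 AS rn
        FROM src
    )
    SELECT MAX(CASE WHEN rn % 5 = 0 THEN number END) AS col1,
           MAX(CASE WHEN rn % 5 = 1 THEN number END) AS col2,
           MAX(CASE WHEN rn % 5 = 2 THEN number END) AS col3,
           MAX(CASE WHEN rn % 5 = 3 THEN number END) AS col4,
           MAX(CASE WHEN rn % 5 = 4 THEN number END) AS col5
    FROM T
    GROUP BY rn / 5
    ORDER BY rn / 5
""").fetchall()
for r in rows:
    print(r)
# last row: (16, 17, 18, None, None) -- 18 records, columns 4 and 5 empty
```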
In general:
SQL is for retrieving data - that is, all your X records in one column.
Making a nice display of your data is usually the job of the software that queries SQL, e.g. your web/desktop application.
However, if you really want to build the display output in SQL, you could use a WHILE loop in combination with TOP (or LIMIT) and PIVOT. You would select the first 5 records, then the next 5, and so on until finished.
Here is an example of how to use WHILE: http://msdn.microsoft.com/de-de/library/ms178642.aspx

Will multiple columns concatenate in the same order if using STUFF and For Xml Path

Please see http://www.sqlfiddle.com/#!3/fb107/3 for an example schema and query I want to run.
I want to use the STUFF and FOR XML PATH('') solution to concatenate columns having grouped by another column.
If I use this method to concatenate multiple columns into a CSV list, am I guaranteed that the order will be the same in each concatenated string? So if the table was:
ID Col1 Col2 Col3
1 1 1 1
1 2 2 2
1 3 3 3
2 4 4 4
2 5 5 5
2 6 6 6
Am I certain that if Col1 is concatenated such that the result is:
ID Col1Concatenated
1 1,2,3
2 4,5,6
That Col2Concatenated will also be in the same order ("1,2,3", "4,5,6") as opposed to ("2,3,1", "5,6,4") for example?
This solution will only work for me if the index of each row's value is the same in each of the concatenated values. i.e. first row is first in each csv list, second row is second in each csv list etc.
You can add an ORDER BY clause to the query inside your STUFF function. Without an explicit ORDER BY, the concatenation order is not guaranteed; use the same ORDER BY in each per-column subquery and the positions will line up.
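To see why a shared sort key is what matters, here is a sketch with Python's sqlite3 (the table contents follow the example above, with made-up col2 values; the concatenation is done in Python rather than in SQL, since group_concat ordering guarantees vary by database version). As long as every per-column pass sorts by the same key, the n-th element of each list comes from the same row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 1, 10), (1, 2, 20), (1, 3, 30),
    (2, 4, 40), (2, 5, 50), (2, 6, 60),
])

def concat(col):
    # One ordered subquery per column, mirroring one STUFF(... FOR XML PATH)
    # per column; the shared ORDER BY col1 is what keeps them aligned.
    # (f-string column interpolation is fine for a sketch, not for user input.)
    return {
        gid: ",".join(str(v) for (v,) in conn.execute(
            f"SELECT {col} FROM t WHERE id = ? ORDER BY col1", (gid,)))
        for (gid,) in conn.execute("SELECT DISTINCT id FROM t")
    }

print(concat("col1"))  # -> {1: '1,2,3', 2: '4,5,6'}
print(concat("col2"))  # -> {1: '10,20,30', 2: '40,50,60'}
```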