SQL - Select null as FIELDNAME?

I have come across some code that says
select null as UnitCost,
null as MarkUp
What exactly is this doing? Is it getting the field names unitcost and markup?
Why would you use "select null as..."?
New to this sorry.
Thanks.

That code is aliasing the value null and calling it UnitCost/MarkUp. It is not selecting that column from any available table.
You would usually see this when a statement requires matching column sets, e.g. union all.
select id, col1, col2, null as UnitCost, null as MarkUp
from table_1
union all
select id, null as col1, null as col2, UnitCost, MarkUp
from table_2
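To see the padding in action, here is a minimal sketch that runs the same kind of union through Python's built-in sqlite3 module. The table layouts and sample values are assumptions made up for illustration; only the query shape comes from the answer above.

```python
import sqlite3

# Hypothetical tables: table_1 lacks the cost columns, table_2 lacks col1/col2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, col1 TEXT, col2 TEXT);
    CREATE TABLE table_2 (id INTEGER, UnitCost REAL, MarkUp REAL);
    INSERT INTO table_1 VALUES (1, 'a', 'b');
    INSERT INTO table_2 VALUES (2, 9.5, 0.2);
""")

# Each branch pads the columns it does not have with NULL so both
# branches expose the same five columns in the same order.
cur = conn.execute("""
    SELECT id, col1, col2, NULL AS UnitCost, NULL AS MarkUp FROM table_1
    UNION ALL
    SELECT id, NULL AS col1, NULL AS col2, UnitCost, MarkUp FROM table_2
    ORDER BY id
""")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

The result column names come from the first branch, which is why the aliases there matter most.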

It simply fills those selected columns with NULL.
This is useful with INSERT ... SELECT: the table you are inserting into may have more columns than the table you select from, so for convenience you can use select null as col_name to pad the select list until its column count matches the target table's.
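A small sketch of that INSERT ... SELECT case, again using Python's sqlite3 module. The source and target tables are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: target has two more columns than source.
    CREATE TABLE source (id INTEGER, name TEXT);
    CREATE TABLE target (id INTEGER, name TEXT, UnitCost REAL, MarkUp REAL);
    INSERT INTO source VALUES (1, 'widget'), (2, 'gadget');

    -- Pad the select list with NULLs so it has as many columns as target.
    INSERT INTO target
    SELECT id, name, NULL AS UnitCost, NULL AS MarkUp FROM source;
""")
rows = conn.execute("SELECT * FROM target ORDER BY id").fetchall()
print(rows)
```

Listing the target columns explicitly in the INSERT is the other common way to handle the mismatch; unlisted columns then receive their defaults.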

Related

Save value in local variable HANA SQL Script

I'm trying to take the value from a non-empty row and write it into the subsequent rows until another non-empty row appears, then write that value into the rows after it. Coming from an ABAP background, I'm not sure how to accomplish this in HANA SQL Script. Here's a picture to show what the data looks like.
Basically 'Doe, John' should be overwritten into all the empty rows until 'Doe, Jane' appears and then 'Doe, Jane' should be overwritten into empty rows until another name appears.
My idea is to store the non-empty row in a local variable, but I haven't had much success so far. Here's my code:
tempTab1 = SELECT
CASE WHEN EMPLOYEE <> ''
THEN lv_emp = EMPLOYEE
ELSE EMPLOYEE
END AS EMPLOYEE,
FROM :tempTab;
In general, the rows of a dataset are unordered unless you explicitly specify an ORDER BY clause. Any order you observe may be a side effect and can vary. So first of all you have to explicitly create a row number column (assume its name is RECORD).
Then proceed as follows:
1. Select only the rows with non-empty data in the column.
2. Use LEAD(RECORD) over(order by RECORD) to identify the next non-empty record number.
3. Join your source dataset to the dataset from step 2 with a range condition on the RECORD field.
with a as (
select 1 as record, 'Val1' as field1 from dummy union
select 2 as record, '' as field1 from dummy union
select 3 as record, '' as field1 from dummy union
select 4 as record, 'Val2' as field1 from dummy union
select 5 as record, '' as field1 from dummy union
select 6 as record, '' from dummy union
select 7 as record, '' from dummy union
select 8 as record, 'Val3' as field1 from dummy
)
, fill_base as (
select field1, record, lead(record, 1, record) over(order by record asc) as next_record
from a
where field1 <> '' and field1 is not null
)
select
a.record
, case
when a.field1 = '' or a.field1 is null
then f.field1
else a.field1
end as field1
, a.field1 as field1_original
from a
left join fill_base as f
on a.record > f.record
and a.record < f.next_record
Performance in HANA may be poor in some cases, since it handles window functions badly.
Here is another, more elegant solution with two nested window functions that does not force you to write multiple selects for each column: How to make LAG() ignore NULLS in SQL Server?
You can use window aggregate function LAST_VALUE to achieve the imputation of missing values.
Sample Data
CREATE TABLE sample (id integer, sort integer, value varchar(10));
INSERT INTO sample VALUES (4711, 1, 'Hello');
INSERT INTO sample VALUES (4712, 2, null);
INSERT INTO sample VALUES (4713, 3, null);
INSERT INTO sample VALUES (4714, 4, 'World');
INSERT INTO sample VALUES (4715, 5, null);
INSERT INTO sample VALUES (4716, 6, '!');
Generate a new column with imputed values
SELECT base.*, LAST_VALUE(fill.value ORDER BY fill.sort) AS value_imputed
FROM sample base
LEFT JOIN sample fill ON fill.sort <= base.sort AND fill.value IS NOT NULL
GROUP BY base.id, base.sort, base.value
ORDER BY base.id, base.sort
Result
Note that sort could be anything determining the order (e.g. a timestamp).
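For readers without a HANA system at hand, the same imputation can be tried out in SQLite from Python. This is only a sketch: SQLite has no LAST_VALUE aggregate with an inline ORDER BY, so a correlated subquery is swapped in to pick the latest non-null value at or before each row. The sample data is the one from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample (id INTEGER, sort INTEGER, value TEXT);
    INSERT INTO sample VALUES (4711, 1, 'Hello');
    INSERT INTO sample VALUES (4712, 2, NULL);
    INSERT INTO sample VALUES (4713, 3, NULL);
    INSERT INTO sample VALUES (4714, 4, 'World');
    INSERT INTO sample VALUES (4715, 5, NULL);
    INSERT INTO sample VALUES (4716, 6, '!');
""")

# For each row, look back for the most recent row (by sort) whose value
# is not NULL. This stands in for the LAST_VALUE aggregate used on HANA.
rows = conn.execute("""
    SELECT base.id, base.sort, base.value,
           (SELECT fill.value
              FROM sample AS fill
             WHERE fill.sort <= base.sort
               AND fill.value IS NOT NULL
             ORDER BY fill.sort DESC
             LIMIT 1) AS value_imputed
    FROM sample AS base
    ORDER BY base.sort
""").fetchall()
for row in rows:
    print(row)
```

Rows that come before the first non-null value would keep NULL in value_imputed, since there is nothing to carry forward.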

how to sql a record from a file that has non null values in their fields

Each record has 100 fields; only a few of them have values and many are NULL. If I want to display only the records that have non-NULL values in some fields, how do I do that?
Example:
Table1 has 100 fields. One record in the table has 5 fields with non-NULL values and 95 NULLs. I want to display that record.
Another record has NULLs in all 100 fields. I don't want to display that record.
You want records where not all columns are null. You will need to enumerate the column names. The simplest solution is a lengthy where clause, like:
select *
from mytable
where col1 is not null or col2 is not null or ... or colN is not null;
An alternative is a lateral join:
select t.*
from mytable t
cross apply (
select count(col) cnt
from (values (col1), (col2), ... (colN)) as x(col)
) x
where x.cnt > 0
Note that the second solution requires all columns to have the same datatype; otherwise, additional casting might be needed.
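With 100 columns, typing the where clause by hand is tedious, so it can help to generate it. The sketch below does this in SQLite from Python, reading the column names with PRAGMA table_info; that pragma is SQLite-specific (other databases typically expose a catalog such as information_schema), and the three-column table with an id key is an assumption standing in for the real 100-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the wide table; id is assumed to be a key.
    CREATE TABLE mytable (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT);
    INSERT INTO mytable VALUES
        (1, NULL, 'x', NULL),
        (2, NULL, NULL, NULL),
        (3, 'a', 'b', 'c');
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(mytable)")
           if row[1] != "id"]

# Build "col1 IS NOT NULL OR col2 IS NOT NULL OR ..." from the names.
predicate = " OR ".join(f'"{name}" IS NOT NULL' for name in columns)
rows = conn.execute(
    f"SELECT id FROM mytable WHERE {predicate} ORDER BY id"
).fetchall()
print(predicate)
print(rows)
```

The column names come from the database catalog rather than from user input, which is what makes interpolating them into the statement acceptable here.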

How can I merge 2 partially overlapping strings using Apache Hive?

I have a field which holds a short list of ids of a fixed length.
e.g. aab:aac:ada:afg
The field is intended to hold at most 5 ids, growing gradually. I update it by adding from a similarly constructed field that may partially overlap with my existing set, e.g. ada:afg:fda:kfc.
The field expands when joined to an "update" table, as in the following example.
Here, id_list is the aforementioned list I want to "merge", and table_update is a table with new values I want to "merge" into table1.
insert overwrite table table1
select
id,
field1,
field2,
case
when (some condition) then a.id_list
else merge(a.id_list, b.id_list)
end as id_list
from table1 a
left join
table_update b
on a.id = b.id;
I'd like to produce a combined field with the following value:
aab:aac:ada:afg:fda.
The challenge is that I don't know whether or how much overlap the strings have until execution, and I cannot run any external code, or create UDFs.
Any suggestions how I could approach this?
Split the strings to get arrays, explode them, select existing union all new, and aggregate with collect_set, which produces an array of unique values. Then concatenate the array into a string using concat_ws(). Not tested:
select concat_ws(':',collect_set(id))
from
(
select explode(split('aab:aac:ada:afg',':')) as id --existing
union all
select explode(split('ada:afg:fda:kfc',':')) as id --new
) s;
You can use UNION instead of UNION ALL to get distinct values before aggregating into an array. Or you can join new and existing and concatenate the strings into one, then do the same:
select concat_ws(':',collect_set(id))
from
(
select explode(split(concat('aab:aac:ada:afg',':','ada:afg:fda:kfc'),':')) as id --existing+new
) s;
Most probably you will need to use lateral view with explode in the real query. See this answer about lateral view usage
Update:
insert overwrite table table1
select
id,
field1,
field2,
concat_ws(':',collect_set(a.idl)) as id_list -- last, matching the column order of the original insert
from
(
select
a.id,
a.field1,
a.field2,
split(
case
when (some condition) then a.id_list
when b.id_list is null then a.id_list
else concat(a.id_list,':',b.id_list)
end,':') as id_list_array
from table1 a
left join table_update b on a.id = b.id
)s
LATERAL VIEW OUTER explode(id_list_array ) a AS idl
group by
id,
field1,
field2
;
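Since the question rules out external code, the following Python is not a replacement for the Hive query. It is only a small reference sketch of the result the query is aiming for, handy for checking expected values by hand. The function name merge_id_lists and the limit parameter are made up for this illustration; the cap of 5 ids comes from the question, and the Hive query above does not enforce it. Note also that this sketch keeps first-seen order, which collect_set is not documented to guarantee.

```python
def merge_id_lists(existing, new, limit=5, sep=":"):
    """Merge two separator-delimited id lists, dropping duplicates.

    Hypothetical helper: keeps ids in first-seen order and caps the
    result at `limit` ids (the question states a maximum of 5).
    """
    if not new:
        # Mirrors the "b.id_list is null" branch of the query.
        return existing
    combined = existing.split(sep) + new.split(sep)
    # dict.fromkeys removes duplicates while preserving insertion order.
    unique = list(dict.fromkeys(combined))
    return sep.join(unique[:limit])


merged = merge_id_lists("aab:aac:ada:afg", "ada:afg:fda:kfc")
print(merged)
```

If the cap matters in Hive as well, it would have to be added to the query separately, for example by limiting the exploded ids per group before aggregating.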

"ORA-00984: column not allowed here" when inserting with select statement

I would like to insert some data into a table. One field I would like to get from another table, so I'm using select statement inside. This is the code:
INSERT INTO t.table1 (
id,
id_t2,
date_of_change
)
VALUES (
t.table1_seq.nextval,
SELECT s.id_id_t2 from t.table2 s where s.something='something',
TO_DATE('02/05/2017 13:43:34','DD/MM/YYYY HH24:MI:SS')
)
Although select statement is always returning only 1 field (1 row), I presume this is why I'm getting the error.
How can I write INSERT statement with SELECT statement for just 1 field? Can it be done? If not, is there any other solution for this problem? Thank you.
You can translate your whole insert statement into the form of
insert into table1 (fields)
select fields from table2
This will allow you to specify in your select some values from the source table and some constant values. Your resulting query would be
INSERT INTO t.table1 (
id,
id_t2,
date_of_change
)
SELECT t.table1_seq.nextval,
s.id_id_t2,
TO_DATE('02/05/2017 13:43:34','DD/MM/YYYY HH24:MI:SS')
FROM t.table2 s
WHERE s.something='something'
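The two shapes can be compared side by side in a quick SQLite session from Python. This is a sketch with stand-ins: SQLite has no sequences or TO_DATE, so an AUTOINCREMENT key replaces t.table1_seq.nextval and a plain text literal replaces the date conversion. The second statement shows the other common fix, keeping VALUES but wrapping the single-value select in parentheses so it is treated as a scalar subquery (Oracle also accepts a parenthesized scalar subquery there, as far as I know).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (id_id_t2 INTEGER, something TEXT);
    -- AUTOINCREMENT stands in for the Oracle sequence.
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        id_t2 INTEGER,
        date_of_change TEXT
    );
    INSERT INTO table2 VALUES (42, 'something'), (7, 'other');

    -- Form 1: INSERT ... SELECT, mixing a source column with a constant.
    INSERT INTO table1 (id_t2, date_of_change)
    SELECT s.id_id_t2, '2017-05-02 13:43:34'
    FROM table2 s
    WHERE s.something = 'something';

    -- Form 2: VALUES with the select wrapped in parentheses.
    INSERT INTO table1 (id_t2, date_of_change)
    VALUES (
        (SELECT s.id_id_t2 FROM table2 s WHERE s.something = 'something'),
        '2017-05-02 13:43:34'
    );
""")
rows = conn.execute("SELECT * FROM table1 ORDER BY id").fetchall()
print(rows)
```

The forms differ when the select matches several rows: INSERT ... SELECT inserts one row per match, while a scalar subquery is expected to yield a single value.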

I am trying to return a certain values in each row which depend on whether different values in that row are already in a different table

I'm still a n00b at SQL and am running into a snag. What I have is an initial selection of certain IDs into a temp table based upon certain conditions:
SELECT DISTINCT ID
INTO #TEMPTABLE
FROM ICC
WHERE ICC_Code = 1 AND ICC_State = 'CA'
Later in the query I SELECT a different and much longer listing of IDs along with other data from other tables. That SELECT is about 20 columns wide and is my result set. What I would like to be able to do is add an extra column to that result set with each value of that column either TRUE or FALSE. If the ID in the row is in #TEMPTABLE the value of the additional column should read TRUE. If not, FALSE. This way the added column will read TRUE or FALSE on each row, depending on whether the ID in that row is in #TEMPTABLE.
The second SELECT would be something like:
SELECT ID,
ColumnA,
ColumnB,
...
NEWCOLUMN
FROM ...
NEWCOLUMN's value for each row would depend on whether the ID in that row returned is in #TEMPTABLE.
Does anyone have any advice here?
Thank you,
Matt
If you left join to the #TEMPTABLE you'll get a NULL where the IDs don't exist
SELECT ID,
ColumnA,
ColumnB,
...
CASE WHEN T.ID IS NOT NULL THEN 1 ELSE 0 END AS NEWCOLUMN -- 1 when the ID is in #TEMPTABLE, 0 otherwise
FROM ... X
LEFT JOIN #TEMPTABLE T
ON T.ID = X.ID -- define how the two rows can be related uniquely
You need to LEFT JOIN your results query to #TEMPTABLE ON ID; this will give you the ID if there is one and NULL if there isn't. If you want 1 or 0, this would do it (for SQL Server): CASE WHEN #TEMPTABLE.ID IS NOT NULL THEN 1 ELSE 0 END.
A few notes on coding for performance:
By definition an ID column is unique, so the DISTINCT is redundant and causes unnecessary processing (unless it is an ID from another table)
Why would you store this to a temporary table rather than just using it in the query directly?
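The LEFT JOIN approach can be tried end to end in SQLite from Python. This is a sketch under assumptions: the results table, its ColumnA, and all sample rows are invented for illustration, and a SQLite temp table named TEMPTABLE stands in for SQL Server's #TEMPTABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ICC (ID INTEGER, ICC_Code INTEGER, ICC_State TEXT);
    CREATE TABLE results (ID INTEGER, ColumnA TEXT);  -- hypothetical result set
    INSERT INTO ICC VALUES (1, 1, 'CA'), (1, 1, 'CA'), (2, 2, 'CA'), (3, 1, 'NY');
    INSERT INTO results VALUES (1, 'a'), (2, 'b'), (3, 'c');

    -- Stand-in for SELECT DISTINCT ID INTO #TEMPTABLE ...
    CREATE TEMP TABLE TEMPTABLE AS
    SELECT DISTINCT ID FROM ICC WHERE ICC_Code = 1 AND ICC_State = 'CA';
""")

# Unmatched rows get NULL for T.ID, which the CASE turns into 'FALSE'.
rows = conn.execute("""
    SELECT X.ID,
           X.ColumnA,
           CASE WHEN T.ID IS NOT NULL THEN 'TRUE' ELSE 'FALSE' END AS NEWCOLUMN
    FROM results X
    LEFT JOIN TEMPTABLE T ON T.ID = X.ID
    ORDER BY X.ID
""").fetchall()
print(rows)
```

In this sample the ICC table repeats ID 1, so the DISTINCT is what keeps the join from duplicating that result row; if the IDs in the source were already unique it could be dropped.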
You could use a union and a subquery.
Select . . . . , 'TRUE'
From . . .
Where ID in
(Select id FROM #temptable)
UNION
SELECT . . . , 'FALSE'
FROM . . .
WHERE ID NOT in
(Select id FROM #temptable)
So the top part, SELECT ... FROM ... WHERE ID IN (Subquery), does a SELECT if the ID is in your temptable.
The bottom part does a SELECT if the ID is not in the temptable.
The UNION operator joins the two results nicely, since both SELECT statements will return the same number of columns.
To expand on what someone else was saying with Union, just do something like so
SELECT id, TRUE AS myColumn FROM `table1`
UNION
SELECT id, FALSE AS myColumn FROM `table2`