Spark SQL parser: append an additional parameter to a UDF call

I accept SQL expressions as input from users, something like "CASE WHEN CALL_UDF("12G", 2) < 0 THEN 4 ELSE 5 END".
I would like to add an additional parameter to the UDF call in this string.
Expected result: "CASE WHEN CALL_UDF("12G", 2, additional_parameter) < 0 THEN 4 ELSE 5 END".
To achieve this I am trying to use SparkSqlParser, but I ran into problems implementing the replacement. Has anyone implemented a similar solution? Thanks.
What I have already tried:
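import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.analysis.UnresolvedFunction
// Triple quotes keep the inner double quotes from ending the Scala string literal.
val expression = parser.parseExpression("""CASE WHEN CALL_UDF("12G", 2) < 0 THEN 3 ELSE 4 END""")
  .transformDown {
    // Assumes the Spark 2.x signature UnresolvedFunction(name, children, isDistinct).
    case f: UnresolvedFunction if f.name.funcName == "CALL_UDF" =>
      UnresolvedFunction(FunctionIdentifier("CALL_UDF"), f.children :+ parser.parseExpression("4"), false)
  }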
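For comparison, the same rewrite can be sketched at the string level, without Spark's parser. This is a plain-Python sketch under stated assumptions, not a Catalyst solution: the helper names (append_argument, _skip_quoted, _matching_paren) are made up for illustration, string literals are assumed to use single or double quotes with backslash escapes, and the function name is assumed to be followed directly by "(" with no whitespace.

```python
def _skip_quoted(sql, start):
    """Return the index just past the quoted literal that starts at sql[start]."""
    quote = sql[start]
    j = start + 1
    while j < len(sql):
        if sql[j] == "\\":          # backslash escape: skip the escaped character
            j += 2
        elif sql[j] == quote:
            return j + 1
        else:
            j += 1
    return len(sql)                 # unterminated literal: treat the rest as quoted


def _matching_paren(sql, open_idx):
    """Return the index of the ')' that closes the '(' at sql[open_idx]."""
    depth = 0
    j = open_idx
    while j < len(sql):
        if sql[j] in "'\"":
            j = _skip_quoted(sql, j)
            continue
        if sql[j] == "(":
            depth += 1
        elif sql[j] == ")":
            depth -= 1
            if depth == 0:
                return j
        j += 1
    raise ValueError("unbalanced parentheses")


def append_argument(sql, func_name, extra_arg):
    """Append extra_arg to every call of func_name, including nested calls."""
    needle = func_name.upper() + "("
    out = []
    i = 0
    while i < len(sql):
        if sql[i] in "'\"":
            # Copy string literals verbatim so their contents are never rewritten.
            end = _skip_quoted(sql, i)
            out.append(sql[i:end])
            i = end
            continue
        prev = sql[i - 1] if i > 0 else ""
        is_call = (sql[i:i + len(needle)].upper() == needle
                   and not (prev.isalnum() or prev == "_"))
        if is_call:
            open_idx = i + len(needle) - 1
            close_idx = _matching_paren(sql, open_idx)
            inner = append_argument(sql[open_idx + 1:close_idx], func_name, extra_arg)
            sep = ", " if inner.strip() else ""
            out.append(sql[i:open_idx + 1] + inner + sep + extra_arg + ")")
            i = close_idx + 1
        else:
            out.append(sql[i])
            i += 1
    return "".join(out)
```

A parser-based rewrite is more robust once it works, because it cannot be fooled by comments or unusual spacing; the string version only covers the simple shapes described above.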

Related

Laravel query builder. Column if not null = 1 else = 0 in select method

I am using Laravel and its \Illuminate\Database\Eloquent\Builder. I would like to select all columns from "table_1" and add a custom column "is_table_2_present" whose value will be 1 or 0 depending on if(table_1_id != null).
So I would like to do something like this:
$queryBuilder->leftJoin("table_2"....)
$queryBuilder->select([
"table_1.*",
"is_table_2_present" = (table_2_id != null) ? 1 : 0,
]);
I searched for an answer but without much success, so I would like to ask whether something like this is possible.
The reason I cannot use an Eloquent relationship is that I would need a relationship with a parameter, and that is not possible in Laravel 5.2, right?
public function table_2($userId)
{
return $this->hasOne(Table_2::class....)->where("user_id", "=", $userId);
}
Conceptually, each table is associated with a model. Try an Eloquent relationship between the models of the two tables you are trying to query.
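You can use selectRaw(). Note that comparing a column against NULL with != never evaluates to true in SQL, so the test has to use IS NOT NULL. I think it will be:
$queryBuilder->selectRaw("
table_1.*,
IF(table_2_id IS NOT NULL, 1, 0) AS is_table_2_present
");
This expression contains no user input, so no bindings are needed. If you do need to pass values in, use the bindings array (the second argument of selectRaw()) to avoid SQL injection.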
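The left join plus NULL test can be checked end to end with a small sketch. It uses Python's sqlite3 module as a stand-in for the real database, and the schema (table_one, table_two, table_one_id, user_id) and the helper name rows_with_presence_flag are assumptions made for illustration. The per-user filter is placed in the ON clause, which is one way to get the "relationship with a parameter" effect without dropping unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_two (id INTEGER PRIMARY KEY, table_one_id INTEGER, user_id INTEGER);
    INSERT INTO table_one VALUES (1, 'a'), (2, 'b');
    INSERT INTO table_two VALUES (10, 1, 7);
""")


def rows_with_presence_flag(user_id):
    # The user filter lives in the ON clause so unmatched table_one rows survive the join.
    return conn.execute("""
        SELECT table_one.id,
               CASE WHEN table_two.id IS NOT NULL THEN 1 ELSE 0 END AS tableTwoPresent
        FROM table_one
        LEFT JOIN table_two
               ON table_two.table_one_id = table_one.id
              AND table_two.user_id = ?
        ORDER BY table_one.id
    """, (user_id,)).fetchall()
```

Calling rows_with_presence_flag(7) flags only the first row, while a user with no table_two rows gets 0 for every row instead of an empty result.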
Since this question kind of died with no response for a while, I solved it with selectRaw() for now. I am still looking for a neater solution.
$queryBuilder->selectRaw("
table_one.*,
CASE WHEN table_two.id IS NOT NULL THEN 1 ELSE 0 END AS tableTwoPresent
");

Decode with AND clause oracle

I want to use a parameter in my query but I can't get it to work.
I have 3 big selects for a report and I just want to use a parameter for the part of the code that depends on the user's choice.
I have 3 different WHERE conditions:
1st
..WHERE A.CANCELLED = 'FALSE' AND a.open_amount!=0 AND A.IDENTITY = '&client_id'..
2nd
...WHERE A.CANCELLED = 'FALSE' AND A.IDENTITY = '&client_id' ...
3rd
WHERE A.CANCELLED = 'FALSE' AND a.invoice_amount != a.open_amount AND A.IDENTITY = '&client_id'
I tried DECODE, which might work if the 2nd case had a value, but it doesn't, and I can't decode into SQL fragments like this:
WHERE decode(xxx,x1,'AND a.open_amount!= 0',x2,'',x3, 'AND a.invoice_amount != a.open_amount')
How should I solve this problem? Any tips?
Do you mean, if the first "where condition" OR the second OR the third is/are TRUE, you want the overall to be TRUE (select the row), and you are looking for a simplified way to write it? That is, without simply combining them with OR?
To achieve that, you don't need CASE and nested CASE statements or DECODE. You could do it like this:
WHERE A.CANCELLED = 'FALSE'
AND A.IDENTITY = '&client_id'
AND ( (xxx = x1 AND a.open_amount != 0) OR (xxx = x2) OR
(xxx = x3 AND a.invoice_amount != a.open_amount) )
This is more readable, the intent is clear, it will be easier to modify if needed, ...
You can try something like -
WHERE A.CANCELLED = 'FALSE'
AND A.IDENTITY = '&client_id'
AND a.open_amount <>
(CASE xxx
WHEN x1 THEN 0
WHEN x2 THEN a.open_amount + 1 -- This needs to be something that makes the comparison always TRUE, to nullify the condition
WHEN x3 THEN a.invoice_amount
END);
Edit: This assumes that a.open_amount is a non-NULL NUMBER and uses a quick hack to create an always-TRUE condition of the form x <> x + 1. You should probably change this to whatever suits your data better.
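The OR-combined predicate from the first answer can be exercised with bind parameters instead of substitution variables. The sketch below uses Python's sqlite3 module as a stand-in for Oracle; the invoices table, its sample rows, the mode values ('open', 'all', 'partial') and the helper name invoice_ids are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER, cancelled TEXT, identity TEXT,
                           invoice_amount REAL, open_amount REAL);
    INSERT INTO invoices VALUES
        (1, 'FALSE', 'c1', 100, 0),
        (2, 'FALSE', 'c1', 100, 100),
        (3, 'FALSE', 'c1', 100, 40),
        (4, 'TRUE',  'c1', 100, 40);
""")


def invoice_ids(mode, client_id):
    # Exactly one of the three OR branches can be active for a given mode value.
    rows = conn.execute("""
        SELECT id FROM invoices a
        WHERE a.cancelled = 'FALSE'
          AND a.identity = :client_id
          AND ( (:mode = 'open'    AND a.open_amount != 0)
             OR (:mode = 'all')
             OR (:mode = 'partial' AND a.invoice_amount != a.open_amount) )
        ORDER BY id
    """, {"mode": mode, "client_id": client_id})
    return [row[0] for row in rows]
```

With the sample rows, 'open' returns invoices 2 and 3, 'all' returns 1, 2 and 3, and 'partial' returns 1 and 3; the cancelled invoice never appears.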

What's the best way to write if/else if/else if/else in HIVE?

Hive uses IF(condition, expression, expression), so when I want to do if / else if / else if / else, I have to do:
IF(a, 1, IF(b, 2, IF(c, 3, 4)))
Is there a better way to do this that's more readable?
Looking for something similar to the standard
if (a) {
1
} else if (b) {
2
} else if (c) {
3
} else {
4
}
You can use Hive's conditional CASE WHEN expression for if/else scenarios. CASE gives you better readability with the same functionality.
CASE
WHEN (condition1) THEN result1
WHEN (condition2) THEN result2
WHEN (condition3) THEN result3
WHEN (condition4) THEN result4
ELSE result_default
END AS attribute_name
You can easily achieve this using CASE WHEN statements.
CASE
WHEN a THEN 1
WHEN b THEN 2
WHEN c THEN 3
ELSE 4
END AS attribute_name
For more information, refer to the official documentation at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-ConditionalFunctions
The best way to handle if/else would be to write a custom UDF for the particular column.
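CASE WHEN evaluates its branches in order and returns the first match, which is what makes it equivalent to an if / else if / else chain. A small sketch of that behaviour, run against SQLite through Python's sqlite3 module rather than Hive (the helper name classify is hypothetical):

```python
import sqlite3


def classify(a, b, c):
    """Evaluate CASE WHEN a THEN 1 WHEN b THEN 2 WHEN c THEN 3 ELSE 4 END."""
    conn = sqlite3.connect(":memory:")
    try:
        (value,) = conn.execute(
            "SELECT CASE WHEN ? THEN 1 WHEN ? THEN 2 WHEN ? THEN 3 ELSE 4 END",
            (int(a), int(b), int(c)),  # booleans are bound as 0 / 1
        ).fetchone()
        return value
    finally:
        conn.close()
```

When both a and b are true the result is 1, just as the nested IF(a, 1, IF(b, 2, ...)) form would give.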

Efficient way to update multiple records with independent values?

I have the following (overly db expensive) method:
def reorder_area_routes_by_demographics!
self.area_routes.joins(:route).order(self.demo_criteria, :proximity_rank).readonly(false).each_with_index do |area_route, i|
area_route.update_attributes(match_rank: i)
end
end
But this results in an UPDATE query for each area_route. Is there a way to do this in one query?
--Edit--
Final solution, per coreyward's suggestion:
def reorder_area_routes_by_demographics!
sorted_ids = area_routes.joins(:route).order(self.demo_criteria, :proximity_rank).pluck(:'area_routes.id')
AreaRoute.update_all [efficient_sort_sql(sorted_ids), *sorted_ids], {id: sorted_ids}
end
def efficient_sort_sql(sorted_ids, offset=0)
offset.upto(offset + sorted_ids.count - 1).inject('match_rank = CASE id ') do |sql, i|
sql << "WHEN ? THEN #{i} "
end << 'END'
end
I use the following to do a similar task: updating the sort positions of a bevy of records according to their order in params. You might need to refactor or incorporate this differently to accommodate the scopes you're applying, but I think this will send you in the right direction.
def efficient_sort_sql(sortable_ids, offset = 1)
offset.upto(offset + sortable_ids.count - 1).reduce('position = CASE id ') do |sql, i|
sql << "WHEN ? THEN #{i} "
end << 'END'
end
Model.update_all [efficient_sort_sql(sortable_ids, offset), *sortable_ids], { id: sortable_ids }
sortable_ids is an array of integers representing the ids of each object. The resulting SQL looks something like this:
UPDATE pancakes SET position = CASE id WHEN 5 THEN 1 WHEN 3 THEN 2 WHEN 4 THEN 3 WHEN 1 THEN 4 WHEN 2 THEN 5 END WHERE id IN (5,3,4,1,2);
This is, ugliness aside, a pretty performant query and (at least in PostgreSQL) will either fully succeed or fully fail.
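The single-statement update can be tried out in isolation. The sketch below rebuilds the same CASE id ... END statement in Python against an in-memory SQLite database; the function name efficient_sort is hypothetical, and unlike the Ruby version it binds the positions as parameters too instead of interpolating them.

```python
import sqlite3


def efficient_sort(conn, sorted_ids, offset=1):
    """Set position = offset, offset + 1, ... following the order of sorted_ids."""
    if not sorted_ids:
        return  # a CASE with no WHEN branches would be a syntax error
    whens = " ".join("WHEN ? THEN ?" for _ in sorted_ids)
    placeholders = ", ".join("?" for _ in sorted_ids)
    sql = (f"UPDATE pancakes SET position = CASE id {whens} END "
           f"WHERE id IN ({placeholders})")
    params = []
    for position, record_id in enumerate(sorted_ids, start=offset):
        params += [record_id, position]
    # One round trip: CASE parameters first, then the ids for the IN list.
    conn.execute(sql, params + list(sorted_ids))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pancakes (id INTEGER PRIMARY KEY, position INTEGER)")
conn.executemany("INSERT INTO pancakes (id) VALUES (?)", [(i,) for i in range(1, 6)])
efficient_sort(conn, [5, 3, 4, 1, 2])
ordered = conn.execute("SELECT id, position FROM pancakes ORDER BY position").fetchall()
```

The statement uses three bind parameters per id, so very long id lists may need to be split into batches to stay under the database's parameter limit.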

SQL check for NULLs in WHERE clause (ternary operator?)

What would the SQL equivalent to this C# statement be?
bool isInPast = (x != null ? x < DateTime.Now : true);
I need to construct a WHERE clause that checks that x < NOW() only if x IS NOT NULL. x is a datetime that needs to be null sometimes, and not null other times, and I only want the WHERE clause to consider non-null values, and consider null values to be true.
Right now the clause is:
dbo.assignments.[end] < { fn NOW() }
Which works for the non-NULL cases, but NULL values always seem to make the expression evaluate to false. I tried:
dbo.assignments.[end] IS NOT NULL AND dbo.assignments.[end] < { fn NOW() }
And that seems to have no effect.
For use in a WHERE clause, you have to test separately
where dbo.assignments.[end] is null or dbo.assignments.[end] < GetDate()
or you can turn the nulls into a date (that will always be true)
where isnull(dbo.assignments.[end],0) < GetDate()
or you can test against the bit flag derived below
where case when dbo.assignments.[end] >= GetDate() then 0 else 1 end = 1
Below is the explanation, and how you would derive isInPast for a SELECT clause.
bool isInPast = (x != null ? x < DateTime.Now : true)
A bool can only have one of two results, true or false.
Looking closely at your criteria, the ONLY condition for false is when
x != null && x >= now
Given that fact, it becomes an easy translation: in SQL, x >= now can only evaluate to true when x is not null, so only one condition is needed
isInPast = case when dbo.assignments.[end] >= { fn NOW() } then 0 else 1 end
(1 being true and 0 being false)
Not sure what { fn NOW() } represents here, but if you want SQL Server to provide the current time, use either GETDATE() or, if you are working with UTC data, GETUTCDATE()
isInPast = case when dbo.assignments.[end] >= GetDate() then 0 else 1 end
The one you are looking for is probably the CASE statement
You need something like
WHERE X IS NULL
OR X < NOW()
Use two separate queries, one for when x is null and one for when it is not. Trying to mix the two distinct conditions is a sure way to get a bad plan. Remember that the generated plan must work for all values of x, so any optimization based on it (such as a range scan on an index) is no longer possible.
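The reason the plain comparison drops NULL rows is SQL's three-valued logic: NULL < anything is unknown, and WHERE keeps only rows where the predicate is true. A minimal sketch of the difference, using Python's sqlite3 module instead of SQL Server; the sample rows, the fixed NOW value and the helper name ids are assumptions made so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assignments (id INTEGER PRIMARY KEY, "end" TEXT);
    INSERT INTO assignments VALUES (1, '2000-01-01 00:00:00'),
                                   (2, '2999-01-01 00:00:00'),
                                   (3, NULL);
""")

# Fixed stand-in for the current time; ISO-8601 strings compare chronologically.
NOW = "2020-06-15 12:00:00"


def ids(where):
    """Return the ids matching a WHERE fragment that uses the :now parameter."""
    sql = f"SELECT id FROM assignments WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, {"now": NOW})]


plain = ids('"end" < :now')                       # the NULL row is filtered out
null_ok = ids('"end" IS NULL OR "end" < :now')    # the NULL row counts as true
```

Here plain contains only the past assignment, while null_ok also includes the row whose end is NULL, matching the C# expression's behaviour.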