Is there a way to query a ranged Expression in DB? - sql

Our application runs on a mainframe: an IBM iSeries with a DB2 database. Some of our table values contain a range.
Ex: 100;105;108;110:160;180
-- UPDATE --
The above data is from a single row (a single column, to be precise). Multiple rows hold values in the same format.
In this case, individual values are delimited by ";" but 110:160 is a range that includes all the values from 110 to 160. For the individual values we were using LIKE statements, for example when I have to query for 105.
The challenge is querying for 125, which is technically not present in the database but which should logically retrieve that record.
The system (application) is somehow able to accomplish this; I am not sure how. I am not a mainframe developer; I just have to query the database to retrieve a specific record for some of the automation that we work on.
As a workaround, I could think of two things:
Expand the ranges and store them in a temp database programmatically.
Ex: 110:160 would be expanded to 110;111;112..160 (Yes, it's tedious)
Reduce the number of records by filtering on certain unique columns (the ones without ranges), then programmatically apply logic to identify the right record.
As both are workarounds, I am curious how the system does it. (I reached out to the devs of the app; so far, no luck.) Is there a direct approach to achieve this? Could it be a stored procedure?

If I got your question right, your example values are not in a single row but in multiple rows; otherwise some preprocessing has to be done.
I would split the combined value into its components with SQL, like:
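with temp(id, text, value1, value2) as (
  -- split 'low:high' into its two bounds; a single value is its own range
  select id, text
        ,case when posstr(id, ':') > 0
              then substr(id, 1, posstr(id, ':') - 1)
              else id
         end as value1
        -- two-argument substr reads to the end of the string, so the
        -- requested length can never run past it
        ,case when posstr(id, ':') > 0
              then substr(id, posstr(id, ':') + 1)
              else id
         end as value2
  from testrange
)
select * from temp
-- cast explicitly so the comparison is numeric, not string-based
where 125 between int(value1) and int(value2)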
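If the whole expression (for example 100;105;108;110:160;180) sits in one column, the "apply logic programmatically" workaround from the question could look roughly like the following Python sketch. The function name and the sample rows are made up for illustration, and it assumes every token is either an integer or a low:high pair of integers.

```python
def matches_range_expression(expression: str, value: int) -> bool:
    """Return True if value is listed in expression or falls inside one of its ranges."""
    for token in expression.split(";"):
        token = token.strip()
        if not token:
            continue  # tolerate empty tokens such as a trailing ';'
        if ":" in token:
            # 'low:high' is an inclusive range
            low, high = (int(part) for part in token.split(":", 1))
            if low <= value <= high:
                return True
        elif int(token) == value:
            return True
    return False


# Hypothetical rows: row id -> stored range expression
rows = {1: "100;105;108;110:160;180", 2: "200;210:220"}
matching_ids = [row_id for row_id, expr in rows.items()
                if matches_range_expression(expr, 125)]
print(matching_ids)
```

With the sample rows above, only row 1 matches 125, because 125 lies inside 110:160.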

Related

Add column with substring of other column in SQL (Snowflake)

I feel like this should be simple but I'm relatively unskilled in SQL and I can't seem to figure it out. I'm used to wrangling data in python (pandas) or Spark (usually pyspark) and this would be a one-liner in either of those. Specifically, I'm using Snowflake SQL, but I think this is probably relevant to a lot of flavors of SQL.
Essentially I just want to trim the first character off of a specific column. More generally, what I'm trying to do is replace a column with a substring of the same column. I would even settle for creating a new column that's a substring of an existing column. I can't figure out how to do any of these things.
One obvious solution would be to create a temporary table with something like
CREATE TEMPORARY TABLE tmp_sub AS
SELECT id_col, substr(id_col, 2, 10) AS id_col_sub FROM table1
and then join it back and write a new table
CREATE TABLE table2 AS
SELECT
b.id_col_sub as id_col,
a.some_col1, a.some_col2, ...
FROM table1 a
JOIN tmp_sub b
ON a.id_col = b.id_col
My tables have roughly a billion rows though, and this feels extremely inefficient. Maybe I'm wrong? Maybe this is just the right way to do it? I guess I could replace the CREATE TABLE table2 AS... with INSERT OVERWRITE INTO table1 ... and at least that wouldn't store an extra copy of the whole thing.
Any thoughts and ideas are most welcome. I come at this humbly from the perspective of someone who is baffled by a language that so many people seem to have mastery over.
I'm not sure of the exact syntax/functions in Snowflake, but generally speaking there are a few different ways of achieving this.
The general approach that should work almost universally is the SUBSTRING function, which most databases provide under that or a similar name.
Assuming you have a table called Table1 with the following data:
+-------+-----------------------------------------+
| Code  | Desc                                    |
+-------+-----------------------------------------+
| 0001  | 1First Character Will be Removed        |
| 0002  | xCharacter to be Removed                |
+-------+-----------------------------------------+
The SQL code to remove the first character would be:
select SUBSTRING(Desc,2,len(desc)) from Table1
Please note that the name of the "SUBSTRING" function varies between databases. In Oracle, for example, the function is "SUBSTR". You just need to find the Snowflake equivalent.
Another approach, which works at least in SQL Server and MySQL, is the "RIGHT" function
select RIGHT(Desc,len(Desc) - 1) from Table1
Based on your question I assume you actually want to update the actual data within the table. In that case you can use the same function above in an update statement.
update Table1 set Desc = SUBSTRING(Desc,2,len(desc))
You didn't try this?
UPDATE tableX
SET columnY = substr(columnY, 2, 10 ) ;
-Paul-
There is no need to specify the length, as is evidenced from the following simple test harness:
SELECT $1
,SUBSTR($1, 2)
,RIGHT($1, -2)
FROM VALUES
('abcde')
,('bcd')
,('cdef')
,('defghi')
,('e')
,('fg')
,('')
;
Both expressions here - SUBSTR(<col>, 2) and RIGHT(<col>, -2) - effectively remove the first character of the <col> column value.
As for the strategy of using UPDATE versus INSERT OVERWRITE, I do not believe that there will be any difference in performance or outcome, so I might opt for the UPDATE since it is simpler. So, in conclusion, I would use:
UPDATE tableX
SET columnY = SUBSTR(columnY, 2)
;
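To see the two-argument SUBSTR form applied in an UPDATE without touching real data, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. SQLite is not Snowflake, so this only illustrates the shape of the statement; the table and column names are the placeholders from the answer above.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (columnY TEXT)")
conn.executemany("INSERT INTO tableX VALUES (?)", [("abcde",), ("e",), ("",)])

# Drop the first character in place; no length argument is needed
conn.execute("UPDATE tableX SET columnY = SUBSTR(columnY, 2)")

trimmed = [row[0] for row in conn.execute("SELECT columnY FROM tableX ORDER BY rowid")]
print(trimmed)
```

In this SQLite sketch, single-character and empty strings both end up as empty strings rather than raising an error.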

SQL server query : how to check value within range

In the application I am working on, I take input from the user through Excel, load it into a temporary SQL table first, and then move it from the temporary table to the final target table.
My query fails while moving the data from the temporary table to the target table, because some values in the temporary table are out of range for the columns in the target table.
How can I check whether the values in the temporary table are within the range of the target table's columns?
I have to check something like this
20 < len(temporary_table.column1) < 50
or is there a better way?
If you are using SQL Server, you can use the condition below to check the data.
temporary_table.column1 between 20 and 50
If you want to check against the column's maximum length, for example when the column has datatype varchar(100), you can use a condition like this
where len(temporary_table.column1)<=100
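Extending the above answer, you can use COL_LENGTH instead of hard-coding the length of the target column. This makes the check more automated and less prone to mistakes (such as mistyping a value). Note that COL_LENGTH returns the defined length in bytes, so for nvarchar columns it is twice the number of characters.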
where len(temporary_table.column1) <= COL_LENGTH ( 'target_table' , 'column1' )

Create column name based on value without execute

I need to build a column name based on the values of other columns. I need to return a value from a column, but the specific column name depends on the values inserted in another table.
For instance:
Table A
Column1 | Column2
1 2
Based on those values, I need to go to table B and read the column "VE12".
I need this dynamically, so execute(#query) is my last option, and I would like to avoid CASE WHEN statements because I have more than 50 options.
My query will be something like:
select case when fn.tab=8 and fo.pais=3 then cp.ve83 end
FROM fn
INNER JOIN fo ON fo.stamp = fn.stamp
INNER JOIN cp
If the value in the column tab is 8 and the value in column pais is 3 I should return the value in column ve83.
Thanks for all the help!
The only sensible option is to go back to the business meaning of the data and redesign the database according to that, instead of according to "technique-oriented abstractions" such as these that SQL was never intended to support.
The main reason for this is that SQL was founded on FIRST order logic, and this precludes supporting stuff like varying domains. Which you are doing (or at least seeking to do) because ve12 could be a DATETIME and ve83 could be a VARCHAR and ve56 could be a BLOB etc. etc. So there is just no way for you [or anyone else] to determine the data type of the results in your query, and it is even more impossible to attach meaning to what comes out of your desired query precisely because of this varying-domain and varying-source characteristic.

Bigquery with thousand plus of case statement

I have a dataset with around 500 million records and a requirement to derive two columns based on sequential processing of case statements, something like
Select Field1,
Field2,
Case when (expression1a and expression2c and expression 3d)
Then 'abc'
when (expression1b and (expression 2f or expression 3))
Then 'def'
when (expression1x and expression 2f and expression 3)
Then 'ghi'
when (expression1 and expression 2n and expression 3)
Then 'nop'
....
.....
......
.....
Else 'unp' end as field3
From table
With such a long query I am also running into the 250k character limit. Is there any better way to handle this scenario on Google Cloud?
The only way I know how to solve your problem would be to create a table and populate a column where you could list all these variables. Something like:
SELECT field1 AS tmp
FROM humongoustable
WHERE field1 IN (SELECT words FROM smaller_table)
You would do this for every variable you needed and hopefully would be able to complete the query under the limit.
Also something else you may want to look into, is creating a new column in the table based on values that you are looking for and populate them as True/False and perform filters and joins based on these new columns. These columns could be in other tables or in the same table.
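One way to picture the lookup-table idea is a rules table with a priority column, where the lowest-priority match wins, mirroring the first-match behaviour of a CASE expression. The sketch below uses Python's sqlite3 module as a stand-in, not BigQuery; the table layout, the NULL-means-any convention, and the sample data are all assumptions for illustration, and the query would need adapting to BigQuery's dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (id INTEGER, field1 TEXT, field2 TEXT);
CREATE TABLE rules (priority INTEGER, field1 TEXT, field2 TEXT, label TEXT);
INSERT INTO records VALUES (1, 'a', 'c'), (2, 'b', 'f'), (3, 'z', 'z');
-- NULL in a rule column means "any value" (a convention assumed here)
INSERT INTO rules VALUES (1, 'a', 'c', 'abc'), (2, 'b', NULL, 'def'), (3, 'a', NULL, 'ghi');
""")

# For each record take the label of the matching rule with the lowest
# priority number; fall back to 'unp' when no rule matches.
query = """
SELECT r.id,
       COALESCE(
           (SELECT ru.label
              FROM rules ru
             WHERE (ru.field1 IS NULL OR ru.field1 = r.field1)
               AND (ru.field2 IS NULL OR ru.field2 = r.field2)
             ORDER BY ru.priority
             LIMIT 1),
           'unp') AS field3
  FROM records r
 ORDER BY r.id
"""
labels = conn.execute(query).fetchall()
print(labels)
```

Record 1 matches both rule 1 and rule 3 but receives 'abc' because rule 1 has the lower priority number. Adding a rule then becomes an INSERT into the rules table rather than a longer query string.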

Transferring several similar named tables in SSIS

I want to create an interface between 2 databases on SQL Server 2008+ to copy several similar named tables into one.
I have n tables that all have the same naming convention, for example:
SalesInvoicePlanning2014ver1
SalesInvoicePlanning2015ver1
SalesInvoicePlanning2015ver2
etc.
The numbers can vary and do not have a set start (or end), but are always of the "int"-Datatype.
I also have a table "tabledir" that contains all table names as list. (one field) There are a total of 30-40 entries in that list with (for me) undesired entries. In the above example I would need 3 of the 30 tables.
The plan is to use a loop container to
select Top 1([name]) from [tabledir] where name like 'SalesinvoicePlanning%'
and then use the result as variable in the following SSIS Data transfer task:
Select * from [variable]
However, I'm stuck with the SQL statement to give me the desired tablename on each iteration.
Performance is not really an issue. Any advice? Am I wrong trying to use a loop-container?
You can follow below steps -
Step 1 - First create an SQL task that uses your query to load all the table names into one variable of type Object (recordset), say TableNames.
e.g. select ([name]) as TableName from [tabledir] where name like 'SalesinvoicePlanning%'
Step 2 - Add a foreach loop container that iterates over TableNames, putting one table name at a time into a new variable, current_table, and add a data flow inside the container to import the data into the destination table. Your source query will be an expression like -
Select column_names from current_table
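Outside SSIS, the same two steps (filter the directory of table names, then build one source query per name) can be sketched in Python as follows. The function name is hypothetical, and the case-insensitive prefix match is an assumption meant to mimic LIKE under a case-insensitive collation.

```python
def build_source_queries(table_names, prefix="SalesInvoicePlanning"):
    """Return one SELECT statement per table whose name starts with prefix."""
    wanted = [name for name in table_names
              if name.lower().startswith(prefix.lower())]
    # Square brackets quote the identifier; a name containing ']' would
    # need extra escaping, which this sketch does not handle.
    return [f"SELECT * FROM [{name}]" for name in wanted]


tabledir = ["SalesInvoicePlanning2014ver1", "Customers", "SalesInvoicePlanning2015ver2"]
queries = build_source_queries(tabledir)
for query in queries:
    print(query)
```

Each generated statement corresponds to one iteration of the foreach loop container described above.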