SQL search with like - sql

I'm trying to write a query against a table that has the following columns:
from, to, range, with values like
1, 100, A
101, 200, B
201, 300, C
The from and to columns are integers.
A user supplies a number and I have to find which range it falls in. Say the user sends 105; with a query I can find that it is in range B. The problem is that sometimes users do not know the complete number. Say they only know the first two digits, something like 10. I have to return all the possibilities that could start with 10, for example 10, 101, 1001 or 10001. If I use LIKE I will not get all the values, because the individual numbers are not stored in a column.
Any ideas how I can do this?

For your specific case (the first two digits are given), you can use substr to take the first two digits of FROM and the corresponding leading digits of TO, whose count is length(to||'')-length(from||'')+2. Then compare both with the user input to find the matching ranges.
This query returns every range that contains a number starting with what the user sent.
Taking a user input of 12 as an example, the result is every range that contains a number like '12%':
select from, to, range from
(select Integer(substr(from,1,2)) substr_from,
Integer(substr(to,1,length(to||'')-length(from||'')+2)) substr_to,
from, to, range
from your_table)
where (substr_from<=12 and substr_to>=12)
or (from<=12 and to>=12)
The table below shows the values of from, to, substr_from and substr_to:
from    to      substr_from  substr_to
1       100     1            100
101     200     10           20
201     300     20           30
...................
901     1000    90           100
1001    1100    10           11
1101    1200    11           12
...................
9901    10000   99           100
10001   10100   10           10
10101   10200   10           10
...................
Since the input is 12, these ranges are returned: 1-100, 101-200, 1101-1200, 1201-1300, ...
Of course, this won't work if the given digits are in the middle of the number.
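The same prefix idea can be sketched outside SQL. The Python below is only an illustration and uses a different technique from the substr comparison above: for a prefix such as 10 it builds the blocks 10, 100-109, 1000-1009 and so on, then tests each block for overlap with a range. The function name, the tuple layout and the max_digits limit are assumptions made for the example.

```python
def ranges_matching_prefix(ranges, prefix, max_digits=5):
    """Return labels of ranges holding at least one number that starts with prefix.

    ranges     -- iterable of (low, high, label) tuples (hypothetical layout)
    prefix     -- the known leading digits, as a string such as "10"
    max_digits -- assumed upper bound on the length of a full number
    """
    prefix_value = int(prefix)
    matches = []
    for low, high, label in ranges:
        # extra = how many unknown digits follow the prefix
        for extra in range(0, max_digits - len(prefix) + 1):
            block_low = prefix_value * 10 ** extra
            block_high = block_low + 10 ** extra - 1
            # two closed intervals overlap when each starts before the other ends
            if block_low <= high and block_high >= low:
                matches.append(label)
                break
    return matches


sample = [(1, 100, "A"), (101, 200, "B"), (201, 300, "C")]
print(ranges_matching_prefix(sample, "10"))
```

With the three sample rows from the question, the prefix 10 matches A (because of 10) and B (because of 101-109), but not C.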

Related

How to regex match multiple items

I have a reviews table as follows:
r_id | comment
--------------------------------------------------
1    | Weight cannot exceed 40 kg
2    | You must not make the weight go over 31 k.g
3    | Don't excel above 94kg
4    | Optimal weight is 45 kg
5    | Don't excel above 62 kg
6    | Weight cannot exceed 7000g
What I want to select is the maximum weight that each r_id says cannot be exceeded. So my desired output is:
r_id | max weight
-----------------
1    | 40
2    | 31
3    | 94
5    | 62
As you can see, r_id 4 wasn't included since it wasn't talking about the maximum weight, and 6 wasn't included because it is in grams. I am struggling with two things.
There are multiple phrases. How can I do an OR check in my regex?
Sometimes the kg number is written like 40kg, 40 KG, 40 k.g or 40kilos. These are all kilograms, but kg is written in different ways. How can I extract only the number, while ensuring the kg is written in one of the above ways, so I don't accidentally extract something like 4000g?
SELECT
r_id,
REGEX_SUBSTR(REGEX_SUBSTR('cannot exceed [0-9]+ kg'), '[0-9]+ kg')) as "max weight"
FROM reviews;
My statement only checks for one particular type of sentence and doesn't check if the number is in kilograms.
You could just extract the number from the string, since there only appears to be one, and then check whether the string matches certain patterns:
select regexp_substr(comm, '[0-9]+')
from reviews
where regexp_like(comm, '(exceed|go over|above).*[0-9]+ ?(kg|k\.g)');
Here is a db<>fiddle.
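You can use a more robust regular expression to extract the number.
I don't have an Oracle DB to test on, but try something like:
SELECT
r_id,
REGEXP_SUBSTR(comment, '([0-9]+) ?(k\.?g\.?|kilos)', 1, 1, 'i', 1) as "max weight"
FROM reviews;
The final argument, 1, makes the function return only the first capture group (the number) rather than the whole match.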
You can see this regex matching the given string in action at https://regex101.com/r/07Rstk/1. This also explains what the regex means.
We also turn on the case insensitive flag in order to properly handle any capitalization. https://docs.oracle.com/cd/E18283_01/olap.112/e17122/dml_functions_2069.htm
Edit: To also check for exceed, go over, etc., put those phrases in a leading group. Note that the subexpression argument (the sixth) is set to 2, since the number is now in the second capture group.
SELECT
r_id,
REGEXP_SUBSTR(comment, '(exceed|go over|above)\s*([0-9]+) ?(k\.?g\.?|kilos)', 1, 1, 'i', 2) as "max weight"
FROM reviews;
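To check the pattern without a database, here is a rough Python sketch that applies the same regular expression to the sample comments from the question. The names MAX_WEIGHT and max_weight_kg are made up for the example, and Python's re module stands in for the database regex engine, so small dialect differences are possible.

```python
import re

# Same idea as the SQL pattern: a "limit" phrase, the number, then a kilogram unit.
MAX_WEIGHT = re.compile(
    r"(?:exceed|go over|above)\s*([0-9]+) ?(?:k\.?g\.?|kilos)",
    re.IGNORECASE,
)


def max_weight_kg(comment):
    """Return the kilogram limit mentioned in comment, or None if there is none."""
    match = MAX_WEIGHT.search(comment)
    return int(match.group(1)) if match else None


reviews = {
    1: "Weight cannot exceed 40 kg",
    2: "You must not make the weight go over 31 k.g",
    3: "Don't excel above 94kg",
    4: "Optimal weight is 45 kg",
    5: "Don't excel above 62 kg",
    6: "Weight cannot exceed 7000g",
}

# Keep only the rows where a kilogram limit was found.
limits = {}
for r_id, text in reviews.items():
    weight = max_weight_kg(text)
    if weight is not None:
        limits[r_id] = weight
print(limits)
```

Rows 4 and 6 drop out: row 4 has no limit phrase, and in row 6 the number is followed by g rather than a kilogram spelling.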

Calculated query with parameters in HANA/CrystalReports

I'm having trouble explaining my requirement, so I'll describe the scenario.
Scenario:
Product A has a maximum production of 125KG at a time.
The operator received a production order of 1027,5KG of product A.
The operator has to calculate how many rounds he'll have to manufacture and adjust the component quantities for each round.
We want to create a report where these calculations are already done. We believe the first step, based on the values of this scenario, is to return something like this:
ROUND QUANTITY(KG)
1 125
2 125
3 125
4 125
5 125
6 125
7 125
8 125
9 27,5
After that, the recalculation of the components could be done with simple operations.
The problem is that we couldn't think of a way to get the desired return and we also couldn't think of a different way of achieving the said report.
All we could do is get the integer part of the division
SELECT FLOOR(1027.5/125) AS "TEST" FROM DUMMY
and the remainder
SELECT MOD(1027.5,125) AS "TEST" FROM DUMMY
We are using:
SAP HANA SQL
Crystal Reports
SAP B1
Any help would be appreciated
Thanks in advance!
There are several ways to achieve what you described.
One way is to translate the requirement into a function that takes the two input parameter values and returns the table of production rounds.
This can look like this:
create or replace function production_rounds(
      IN max_production_volume_per_round decimal (10, 2)
    , IN production_order_volume decimal (10, 2)
    )
returns table (
      production_round integer
    , production_volume decimal (10, 2))
as
begin
    declare rounds_to_produce integer;
    declare remainder_production_volume decimal (10, 2);

    rounds_to_produce := floor( :production_order_volume / :max_production_volume_per_round);
    remainder_production_volume := mod(:production_order_volume, :max_production_volume_per_round);

    return
        select /* generate rows for all "max" rounds */
              s.element_number as production_round
            , :max_production_volume_per_round as production_volume
        from
            series_generate_integer
                (1, 1, :rounds_to_produce + 1) s
        UNION ALL
        select /* generate a row for the final row with the remainder */
              :rounds_to_produce + 1 as production_round
            , :remainder_production_volume as production_volume
        from
            dummy
        where
            :remainder_production_volume > 0.0;
end;
You can use this function just like any table - but with parameters:
select * from production_rounds (125 , 1027.5) ;
PRODUCTION_ROUND PRODUCTION_VOLUME
1 125
2 125
3 125
4 125
5 125
6 125
7 125
8 125
9 27.5
The bit that probably needs explanation is SERIES_GENERATE_INTEGER. This is a HANA-specific built-in function that returns a number of records from a "series". A series here is a sequence of periods between a minimum and a maximum limit, with a fixed step size between two adjacent periods.
More on how this works can be found in the HANA reference documentation; for now, suffice it to say that this is the fastest way to generate a result set with X rows.
This series generator is used to create all "full" production rounds.
The second part of the UNION ALL then creates just a single row by selecting from the built-in table DUMMY (DUAL in Oracle), which is guaranteed to have exactly one record.
Finally, this second part needs to be "disabled" if there is no remainder, which is done by the WHERE clause.
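For readers who want to check the arithmetic outside HANA, the same split into full rounds plus an optional remainder can be sketched in Python. This is only an illustration of the logic, not a replacement for the SQL function. The Python function name simply mirrors the SQL one, and Decimal is used so that 1027.5 is handled exactly.

```python
from decimal import Decimal


def production_rounds(max_per_round, order_volume):
    """Return a list of (round_number, volume) tuples for one production order."""
    max_per_round = Decimal(str(max_per_round))
    order_volume = Decimal(str(order_volume))

    full_rounds = int(order_volume // max_per_round)  # like FLOOR(order / max)
    remainder = order_volume % max_per_round          # like MOD(order, max)

    # one row per full round, numbered from 1
    rounds = [(n, max_per_round) for n in range(1, full_rounds + 1)]
    # the extra row is only added when something is left over
    if remainder > 0:
        rounds.append((full_rounds + 1, remainder))
    return rounds


for number, volume in production_rounds(125, "1027.5"):
    print(number, volume)
```

For the scenario in the question this yields eight rounds of 125 followed by a ninth round of 27.5, and an order that divides evenly produces no ninth row.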

How to pull duplicates in transactional data based on date and other fields

I am looking at transactional data such as my credit card statement. I want to ensure that I am not getting my card swiped twice. The fields that I have are card number (I have multiple), amount of transaction, transaction date, merchant code, merchant name, and transaction code.
To know if it is a true duplicate transaction, I want to know if the merchant code, merchant name, and transaction amount appear more than once. I also want to make sure that the transactions were within 5 days of each other if all else matches.
I am doing the work in SAS code, but I can also do it in PROC SQL. So far in SAS I've sorted the data and then pulled a table that only holds duplicates, but because of how I've sorted the data, it will only call something a duplicate if the dates are exactly the same, instead of applying the 5-day rule mentioned above.
I did a simple PROC SORT.
PROC SORT DATA=WORK.TRANSACTIONS
OUT=WORK.TRANSACTIONS1
DUPOUT=WORK.SORTSORTEDDUPS
NODUPKEY;
BY CARD_NUMBER TRANSACTION_AMOUNT TRANSACTION_DATE MERCHANT_CODE MERCHANT_NAME TRANSACTION_CODE;
RUN;
What do I need to incorporate to add my rule of transaction within 5 days?
You can do it with an additional pass, retaining (and comparing to) the last transaction date as per the below. Note the change in the sort BY statement (you'll need to update the proc sort also).
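data duplicates;
set work.transactions1;
/* TRANSACTION_DATE is the last BY variable, so matching transactions are adjacent and in date order */
by CARD_NUMBER TRANSACTION_AMOUNT MERCHANT_CODE MERCHANT_NAME TRANSACTION_CODE TRANSACTION_DATE;
retain datecheck 0;
/* first row of a group: nothing to compare against yet */
if first.TRANSACTION_CODE then datecheck=0;
/* otherwise output the row if it is within 5 days of the previous one */
else if TRANSACTION_DATE-datecheck le 5 then output;
datecheck=TRANSACTION_DATE;
run;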
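The sort-then-compare-with-previous pass can also be sketched in Python to make the logic easier to follow. This is a simplified illustration, not the SAS step itself: the dictionary keys and the choice of card, merchant code and amount as the matching key are assumptions for the example.

```python
from datetime import date
from itertools import groupby


def match_key(txn):
    """Fields that must be equal for two transactions to be candidates."""
    return (txn["card"], txn["merchant_code"], txn["amount"])


def find_near_duplicates(transactions, max_gap_days=5):
    """Return transactions that repeat the previous matching one within max_gap_days."""
    # Sort by key, then date, so candidates are adjacent and in date order.
    ordered = sorted(transactions, key=lambda t: (match_key(t), t["date"]))
    duplicates = []
    for _, group in groupby(ordered, key=match_key):
        previous = None  # reset at the start of each group, like first.variable
        for current in group:
            if previous is not None:
                gap = (current["date"] - previous["date"]).days
                if gap <= max_gap_days:
                    duplicates.append(current)
            previous = current  # plays the role of the retained datecheck
    return duplicates


sample = [
    {"id": "a", "card": 1, "merchant_code": 8, "amount": 200, "date": date(1990, 1, 1)},
    {"id": "b", "card": 1, "merchant_code": 8, "amount": 200, "date": date(1990, 1, 4)},
    {"id": "c", "card": 1, "merchant_code": 8, "amount": 200, "date": date(1990, 1, 20)},
    {"id": "d", "card": 2, "merchant_code": 8, "amount": 200, "date": date(1990, 1, 2)},
]
print([t["id"] for t in find_near_duplicates(sample)])
```

Only transaction b is flagged: c is 16 days after b, and d is on a different card.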
Let's create our practice data source:
DATA MY_CREDIT_CARDS;
INPUT
C_NUMBER
TRANC_AMOUNT
TRANSC_DATE :DATE10.
TRANSC_CODE
MERCH_CODE
MERCH_NAME $10.;
FORMAT TRANSC_DATE DDMMYY10.;
CARDS;
1 100 17JAN1990 1 1 AMAZON
2 200 01JAN1990 2 8 WALLMART
4 100 04JAN1990 3 5 CRUSTYKRAB
2 200 07JAN1990 4 7 NETFLIX
1 300 01JAN1990 5 2 GOOGLEPLAY
3 200 17JAN1990 6 8 WALLMART
5 100 18JAN1990 7 2 GOOG.PLAY
5 300 19JAN1990 8 2 GOOGLEPLAY
2 200 22JAN1990 9 8 WALLMART
4 200 20JAN1990 10 2 GOOGLEPLAY
1 100 03JAN1990 11 2 GOOG.PLAY
1 100 17JAN1990 12 1 AMZN
;
RUN;
First of all, I recommend not using descriptive fields such as names (merchant name in this case) as keys, because descriptive fields can vary a lot: someone can register AMAZON as AMZN or AMAZN, or any other variation you can imagine. Use ID fields instead. Assuming merchant code is a unique ID, I think it is enough to identify the merchant.
Given the above, with PROC SQL you could do something like this to find duplicates based on the rule you describe, without needing any extra step:
PROC SQL;
/*The following assumes each record is unique
(identified by 'transaction code' in this case);
otherwise you must handle duplicate records properly.*/
SELECT
DISTINCT A.*,
CASE WHEN
B.TRANSC_CODE IS NOT NULL
THEN 1 ELSE 0 END AS DUPLICATED
FROM MY_CREDIT_CARDS AS A
LEFT JOIN MY_CREDIT_CARDS AS B
ON
A.MERCH_CODE = B.MERCH_CODE AND
A.TRANC_AMOUNT = B.TRANC_AMOUNT AND
A.TRANSC_CODE ^= B.TRANSC_CODE AND
A.TRANSC_DATE >= INTNX('day',B.TRANSC_DATE,-5) AND
A.TRANSC_DATE <= INTNX('day',B.TRANSC_DATE,5)
;
/*You could use an ORDER BY clause to sort the
results as you want.*/
QUIT;
The result has a new column named "DUPLICATED", which shows 1 if a matching duplicate was found for the transaction and 0 if not.
Hope it helps.

Excel Powerpivot measure conundrum- Average (of average?)

I have a powerpivot table that shows work_tickets and timestamps for each step taken towards resolution:
Ticket | Step | Time | TicketDuration
--------------------------------------
1      | 1    | 5:30 | 15
1      | 2    | 5:33 | 15
1      | 3    | 5:45 | 15
2      | 1    | 6:00 | 10
2      | 2    | 6:05 | 10
2      | 3    | 6:10 | 10
[TicketDuration] is a calculated column I added on my own. Now I'm trying to create a measure for [AverageTicketDuration] so that it returns 12.5 minutes for the table above, i.e. (15+10)/2. I haven't got a clue how to use DAX to produce this result. Please help!
What you are looking for is the AVERAGEX function, which has the following definition: AVERAGEX(<table>, <expression>)
The idea is that it iterates through each row of the given table, applies your expression, and then averages the results.
In the example below, I use Table1 as the table name.
To iterate over tickets, we use VALUES(Table1[Ticket]), which returns the unique values in the Ticket column.
Then, assuming the ticket duration is always the same within a ticket ID, the aggregation used in the expression would be AVERAGE(Table1[Ticket Duration]), since for ticket 1, for example, (15 + 15 + 15)/3 = 15. It is wrapped in CALCULATE so that the ticket currently being iterated filters the rows that are averaged.
Put together, the measure looks like this:
measure:=AVERAGEX( VALUES( Table1[Ticket]), CALCULATE( AVERAGE(Table1[Ticket Duration])))
The result when dropped into a pivot using your sample data.
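The "average of per-ticket averages" that the measure computes can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not DAX; the function name and the (ticket, duration) row layout are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_ticket_duration(rows):
    """Average the per-ticket averages; rows are (ticket, duration) pairs."""
    per_ticket = defaultdict(list)
    for ticket, duration in rows:
        per_ticket[ticket].append(duration)
    # inner mean: one value per ticket; outer mean: across tickets
    return mean(mean(durations) for durations in per_ticket.values())


rows = [(1, 15), (1, 15), (1, 15), (2, 10), (2, 10), (2, 10)]
print(average_ticket_duration(rows))

# With an uneven number of steps per ticket, a plain row average would differ.
uneven = [(1, 15), (1, 15), (1, 15), (2, 10)]
print(average_ticket_duration(uneven), mean(d for _, d in uneven))
```

The second example shows why iterating over distinct tickets matters: averaging all rows directly would weight ticket 1 three times as heavily as ticket 2.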

In Crystal Reports, I would like to count the number of occurrences for a set of values

We have a particular drug that comes in different strengths. In my Crystal report I'd like to display the number of times a given strength occurs in a dataset (planned treatments for patients). For example:
Strength Occurrences
500 2
600 5
700 0
800 7
How could I easily do this?
You would need a formula to do the count.
Create a formula like the one below:
if {mytable.field} = 'xxx' then
{mytable.field};
Then count it:
count({@formula});
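As a quick way to see the expected numbers outside Crystal Reports, the per-strength count, including strengths that never occur, can be sketched in Python. The function name and the sample data are invented for the illustration; they only reproduce the counts from the question.

```python
from collections import Counter


def count_strengths(treatment_strengths, known_strengths):
    """Count how often each known strength occurs, reporting 0 for absent ones."""
    counts = Counter(treatment_strengths)
    # Counter returns 0 for missing keys, so 700 still shows up in the result
    return {strength: counts[strength] for strength in known_strengths}


# Made-up treatment data matching the example table in the question.
treatments = [500] * 2 + [600] * 5 + [800] * 7
print(count_strengths(treatments, [500, 600, 700, 800]))
```

Listing the known strengths explicitly is what makes the zero row for 700 appear, which a plain group-by on the data alone would not produce.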