Extract serial from string in SQL (Netezza)

I need a query that extracts the serial number from LogMsg. The hard part is that the messages do not follow a specific pattern: there is no fixed delimiter, and the serial length is not always the same.
|LogMsg |
|------------------------------------------------------------------------------------------|
|Customer Receive CPE Indoor. serial 21530369847SKA011094, user:ahmed.o.haraz |
|Customer Receive CPE Indoor as change. serial :21530369847SK9078291, user:Abdullah.M160275|
|Customer Receive CPE Indoor as change. serial :T5D7S18802909825, user:ahmed.o.haraz |
|Customer Receive CPE Indoor as change. serial :T5D7S18802909830, user:ahmed.o.haraz |
|Customer Receive CPE Indoor. serial ZTERRTHJ9303771, user:Mohamed.E176246 |
|Customer Returned CPE. serial :21530369847SKA011094, user:ahmed.o.haraz |
The expected result is:
|Serial |
|--------------------|
|21530369847SKA011094|
|21530369847SK9078291|
|T5D7S18802909825 |
|T5D7S18802909830 |
|ZTERRTHJ9303771 |
|21530369847SKA011094|

One method is regexp_extract(), but the text that follows "serial" comes in different formats (with or without a colon), so the leftover prefix and colon have to be stripped as well. So:
select replace(replace(regexp_extract(logmsg, 'serial [^,]+'), 'serial ', ''), ':', '')

This is answered in https://stackoverflow.com/a/64254966/14311638
Use a combination of regexp_extract_all_sp and get_value_varchar along with the right regex pattern
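To check the pattern outside Netezza, the same idea can be sketched in Python. This is only an illustration: `extract_serial` is a hypothetical helper, and the regex assumes the serial is the run of characters after the word "serial" and an optional colon, ending at the next comma or whitespace. That holds for the sample rows above but may not hold for every log message.

```python
import re

# Assumption: the serial follows the word "serial", optionally a colon,
# and runs until the next comma or whitespace character.
SERIAL_RE = re.compile(r"serial\s*:?\s*([^,\s]+)")

def extract_serial(logmsg):
    """Return the serial found in logmsg, or None if there is none."""
    match = SERIAL_RE.search(logmsg)
    return match.group(1) if match else None

rows = [
    "Customer Receive CPE Indoor. serial 21530369847SKA011094, user:ahmed.o.haraz",
    "Customer Receive CPE Indoor as change. serial :T5D7S18802909825, user:ahmed.o.haraz",
    "Customer Returned CPE. serial :21530369847SKA011094, user:ahmed.o.haraz",
]
serials = [extract_serial(row) for row in rows]
```

Capturing the serial in a group avoids the two nested replace() calls, since the "serial" prefix and the colon are never part of the captured text.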

Related

Handling multiple childs for the same element generated in XML from SQL using "for XML clause"

I want to generate a XML file using a specific query. The main issue is that when I generate the XML, the output would look like this:
<nsSAFT:Account xmlns:nsSAFT="uri">
<nsSAFT:Produs>
<nsSAFT:CodProdus>0200943</nsSAFT:CodProdus>
<nsSAFT:Denumire>SPRAY SPECIAL EFECT 151 SILVER METAL</nsSAFT:Denumire>
<nsSAFT:Miscari>
<nsSAFT:Cantitate> 1.00</nsSAFT:Cantitate>
</nsSAFT:Miscari>
</nsSAFT:Produs>
</nsSAFT:Account>
<nsSAFT:Account xmlns:nsSAFT="uri">
<nsSAFT:Produs>
<nsSAFT:CodProdus>0200943</nsSAFT:CodProdus>
<nsSAFT:Denumire>SPRAY SPECIAL EFECT 151 SILVER METAL</nsSAFT:Denumire>
<nsSAFT:Miscari>
<nsSAFT:Cantitate> 2.00</nsSAFT:Cantitate>
</nsSAFT:Miscari>
</nsSAFT:Produs>
</nsSAFT:Account>
The main problem is that I want to have multiple children on the same product. My expected output would look like this:
<nsSAFT:Account xmlns:nsSAFT="uri">
<nsSAFT:Produs>
<nsSAFT:CodProdus>0200943</nsSAFT:CodProdus>
<nsSAFT:Denumire>SPRAY SPECIAL EFECT 151 SILVER METAL</nsSAFT:Denumire>
<nsSAFT:Miscari>
<nsSAFT:Cantitate> 1.00</nsSAFT:Cantitate>
</nsSAFT:Miscari>
<nsSAFT:Miscari>
<nsSAFT:Cantitate> 2.00</nsSAFT:Cantitate>
</nsSAFT:Miscari>
</nsSAFT:Produs>
</nsSAFT:Account>
The SQL query I used for generating the first output mentioned by me looks like this:
WITH XMLNAMESPACES ('uri' as nsSAFT)
SELECT
RTRIM(P.codProdus) AS 'nsSAFT:Produs/nsSAFT:CodProdus',
RTRIM(P.Denumire) AS 'nsSAFT:Produs/nsSAFT:Denumire',
STR(M.Cantitate, 18, 2) AS 'nsSAFT:Produs/nsSAFT:Miscari/nsSAFT:Cantitate'
FROM
Miscari M
INNER JOIN
ProdusGestiune PG ON M.idProdusGestiune = PG.idProdusGestiune
INNER JOIN
Produs P ON PG.idProdus = P.idProdus
FOR XML PATH ('nsSAFT:Account'), ELEMENTS ;
The data sample would look like this:
| CodProdus | Denumire | Cantitate |
|:---- |:------:| ----:|
| 0200943 | SPRAY SPECIAL EFECT 151 SILVER METAL | 1.00 |
| 0200943 | SPRAY SPECIAL EFECT 151 SILVER METAL | 2.00 |
| 0200943 | SPRAY SPECIAL EFECT 151 SILVER METAL | 5.00 |
| 0200947 | SPRAY SPECIAL USE 230 PENETRATING OIL | 6.00 |
I use the following tables:
"Produs":
| CodProdus | Denumire |
|:---- |:------:|
| 0200943 | SPRAY SPECIAL EFECT 151 SILVER METAL |
| 0200954 | SPRAY ACRILIC MAT 9005 400ML |
| 0200955 | SPRAY ACRILIC MAT 9016 400ML |
| 0200960 | SPRAY ACRILIC RAL 3000 400ML |
"Miscari":
| Cantitate |
|:---- |
| 14.000000 |
| 12.000000 |
| 5.000000 |
I tried to use "select distinct", but SSMS returns an error. I also tried several queries using "union all" and ran into errors as well.
You're probably wanting a subquery to generate correlated Cantitate subelements, such as with the following:
WITH XMLNAMESPACES ('uri' as nsSAFT)
SELECT
RTRIM(P.codProdus) AS [nsSAFT:CodProdus],
RTRIM(P.Denumire) AS [nsSAFT:Denumire],
(
SELECT
STR(M.Cantitate, 18, 2) AS [nsSAFT:Cantitate]
FROM
ProdusGestiune PG
INNER JOIN
Miscari M ON M.idProdusGestiune = PG.idProdusGestiune
WHERE
PG.idProdus = P.idProdus
FOR XML PATH('nsSAFT:Miscari'), TYPE
)
FROM
Produs P
--WHERE codProdus='0200943'
FOR XML PATH('nsSAFT:Produs'), ROOT('nsSAFT:Account'), ELEMENTS;
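The nesting that the correlated subquery produces can also be pictured outside SQL Server. Below is a minimal Python sketch of the same grouping, assuming the flat rows from the data sample and the placeholder namespace "uri" from the question. The `tag` helper is hypothetical, and unlike STR(M.Cantitate, 18, 2) the quantities are not left-padded.

```python
import xml.etree.ElementTree as ET
from collections import OrderedDict

NS = "uri"  # placeholder namespace URI, as in the question
ET.register_namespace("nsSAFT", NS)

def tag(name):
    """Build a namespace-qualified tag name (hypothetical helper)."""
    return "{%s}%s" % (NS, name)

# (CodProdus, Denumire, Cantitate) rows, as the flat join returns them
rows = [
    ("0200943", "SPRAY SPECIAL EFECT 151 SILVER METAL", 1.0),
    ("0200943", "SPRAY SPECIAL EFECT 151 SILVER METAL", 2.0),
    ("0200943", "SPRAY SPECIAL EFECT 151 SILVER METAL", 5.0),
    ("0200947", "SPRAY SPECIAL USE 230 PENETRATING OIL", 6.0),
]

# Group the quantities by product first, so each product is emitted once.
grouped = OrderedDict()
for cod, name, qty in rows:
    grouped.setdefault((cod, name), []).append(qty)

account = ET.Element(tag("Account"))
for (cod, name), quantities in grouped.items():
    produs = ET.SubElement(account, tag("Produs"))
    ET.SubElement(produs, tag("CodProdus")).text = cod
    ET.SubElement(produs, tag("Denumire")).text = name
    # One Miscari child per movement, all under the same Produs element.
    for qty in quantities:
        miscari = ET.SubElement(produs, tag("Miscari"))
        ET.SubElement(miscari, tag("Cantitate")).text = "%.2f" % qty

xml_text = ET.tostring(account, encoding="unicode")
```

The point is the same as in the query: the outer level iterates over products, and the inner level iterates over the movements that belong to the current product.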

How to join between table DurationDetails and Table cost per program

How should I design a database for a tourism company so that I can calculate the flight and hotel costs of each tour program, based on the date?
What I have so far:
Table - program
+-----------+-------------+
| ProgramID | ProgramName |
+-----------+-------------+
| 1 | Alexia |
| 2 | Amon |
| 3 | Sfinx |
+-----------+-------------+
Every program comes in exactly two durations: 8 days or 15 days.
So I created a program duration table with a one-to-many relationship to the program table.
Table - ProgramDuration
+------------+-----------+---------------+
| DurationNo | programID | Duration |
+------------+-----------+---------------+
| 1 | 1 | 8 for Alexia |
| 2 | 1 | 15 for Alexia |
+------------+-----------+---------------+
The same applies to the Amon and Sfinx programs (8 and 15 days).
Every 8-day or 15-day program has fixed details for each day, as follows:
Table - DurationDetails
+------+--------+--------------------+-------------------+
| Days | Hotel | Flight | transfers |
+------+--------+--------------------+-------------------+
| Day1 | Hilton | amsterdam to luxor | airport to hotel |
| Day2 | Hilton | | AbuSimple musuem |
| Day3 | Hilton | | |
| Day4 | Hilton | | |
| Day5 | Hilton | Luxor to amsterdam | |
+------+--------+--------------------+-------------------+
Every program starts on its flight date, so
if the flight date is 25/06/2017 for the 8-day Alexia program, the costs will be as follows:
+------------+-------+--------+----------+
| Date | Hotel | Flight | Transfer |
+------------+-------+--------+----------+
| 25/06/2017 | 25 | 500 | 20 |
| 26/06/2017 | 25 | | 55 |
| 27/06/2017 | 25 | | |
| 28/06/2017 | 25 | | |
| 29/06/2017 | 25 | 500 | |
+------------+-------+--------+----------+
This is what I actually need: how do I set up the relationships to join the costs with the program,
for flight and hotel costs as above?
For the 5 days shown, the total cost is 1200:
25 is the cost per day for the Hilton hotel
500 is the cost of each flight
20 and 55 are the costs of the transfers
[Image: relation between duration and cost]
Truthfully, I don't fully understand exactly what you're trying to accomplish. Your description is not clear, your tables seem to be missing information / contain information that should not be in your tables, and the way that I'm understanding your description doesn't really make sense based on the UI screenshot that you shared.
It looks like you're working on an application for a travel agency which will allow agents to create an itinerary for a trip. They can give this trip a name (so if a particular package is a hit with customers, they can just offer the "Alexia" package), and the utility will calculate the total estimated cost of the trip. If I understand correctly, the trips will be either 8 or 15 days long.
Personally, I would delete the "ProgramDuration" table altogether. If there are two versions of the Alexia trip at index 1, then you're going to run into all manner of issues. I can get into the details of why this is a bad idea, but unless you're really hung up on having this ProgramDuration table, it's not worth the time. You should add a "duration" field to your "program" table, and assign a new ProgramID for each different duration version of the "Alexia" program.
Your table "Duration details" also misses the mark. Your fields in this table will make it harder to add new features to your application down the line. You should have a field "ProgramID," which we will use to join this table against the program table later. You should have a field "Day" which obviously indicates the day in the itinerary. You should have only one more field "ItemID." We're going to use the "ItemID" field to join your itinerary against a new items table we're going to create.
Your items table is where you define all of the items that can possibly appear in an itinerary. Your current itinerary table has three possible "types" of expenses, flights, hotels, and transfers. What if your travel agents want to start adding meal expenditures into their itineraries / budgets? What about activities that cost money? What about currency exchange fees? What about items that your clientele will need before their trip (wall adapters, luggage, etc.)? In your items table, you will have fields for an ItemID, ItemName, ItemUnitPrice, and ItemType. A possible item is as follows:
ItemID: 1, ItemName: Night At The Hilton, ItemUnitPrice: 300, ItemType: Lodging
Using the "SELECT [Column] AS [Alias]" syntax with some CTEs or subqueries and the JOIN operator, we can easily reconstitute a table that looks like your "Program Duration Details" table, but we will be afforded considerably more flexibility to add or remove things later down the line.
In the interests of security and programmability, I would also add a table called "ItemTypeTable" with a single field "TypeName." You can use this table to prevent unauthorized users from defining new item types, and you can use this table to create drop down menus, navigation, and all manner of other useful features. There might be cleaner implementations, but this shouldn't represent a serious performance or size hit.
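To make the suggested layout more concrete, here is a rough sketch using Python's sqlite3 module as a stand-in database. The table names Program, Item and Itinerary, the item names, and the date arithmetic are all assumptions for illustration. Only the prices (25, 500, 20, 55) and the five days come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Program  (ProgramID INTEGER PRIMARY KEY, ProgramName TEXT, Duration INTEGER);
CREATE TABLE Item     (ItemID INTEGER PRIMARY KEY, ItemName TEXT,
                       ItemUnitPrice REAL, ItemType TEXT);
CREATE TABLE Itinerary(ProgramID INTEGER REFERENCES Program(ProgramID),
                       Day INTEGER,
                       ItemID INTEGER REFERENCES Item(ItemID));

INSERT INTO Program VALUES (1, 'Alexia', 8);
INSERT INTO Item VALUES
  (1, 'Night at the Hilton', 25, 'Lodging'),
  (2, 'Amsterdam to Luxor', 500, 'Flight'),
  (3, 'Luxor to Amsterdam', 500, 'Flight'),
  (4, 'Airport to hotel', 20, 'Transfer'),
  (5, 'Transfer to museum', 55, 'Transfer');
-- Only the five days shown in the question are modelled here.
INSERT INTO Itinerary VALUES
  (1, 1, 1), (1, 1, 2), (1, 1, 4),
  (1, 2, 1), (1, 2, 5),
  (1, 3, 1),
  (1, 4, 1),
  (1, 5, 1), (1, 5, 3);
""")

# Day 1 falls on the flight date, so calendar date = start + (Day - 1) days.
per_day = conn.execute("""
    SELECT date(:start, '+' || (it.Day - 1) || ' days') AS TripDate,
           SUM(i.ItemUnitPrice) AS DayCost
    FROM Itinerary it
    JOIN Item i ON i.ItemID = it.ItemID
    WHERE it.ProgramID = :program
    GROUP BY it.Day
    ORDER BY it.Day
""", {"start": "2017-06-25", "program": 1}).fetchall()

total = sum(cost for _, cost in per_day)
```

With the flight date passed in as a parameter, the same itinerary rows can be priced for any departure, and adding a new kind of expense means adding a row to Item rather than a column to the itinerary.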
All in all, at the risk of being somewhat rude, it seems like you're trying to take on a rather large, sophisticated task with a very rudimentary understanding of basic relational database design and implementation. If you are doing this in a professional context, I would strongly encourage you to consider consulting with another professional that may be more experienced in this area.

Subtract two aggregated values in Bar Chart

My data is like -
+-----------+------------------+-----------------+-------------+
| Issue Num | Created On | Closed at | Issue Owner |
+-----------+------------------+-----------------+-------------+
| 1 | 12/21/2016 15:26 | 1/13/2017 9:48 | Name 1 |
| 2 | 1/10/2017 7:38 | 1/13/2017 9:08 | Name 2 |
| 3 | 1/13/2017 8:57 | 1/13/2017 8:58 | Name 2 |
| 4 | 12/20/2016 20:30 | 1/13/2017 5:46 | Name 2 |
| 5 | 12/21/2016 19:30 | 1/13/2017 1:14 | Name 1 |
| 6 | 12/20/2016 20:30 | 1/12/2017 9:11 | Name 1 |
| 7 | 1/9/2017 17:44 | 1/12/2017 1:52 | Name 1 |
| 8 | 12/21/2016 19:36 | 1/11/2017 16:59 | Name 1 |
| 9 | 12/20/2016 19:54 | 1/11/2017 15:45 | Name 1 |
+-----------+------------------+-----------------+-------------+
What I am trying to achieve is
Number of issues created per week
Number of issues closed per week
Net number of issues remaining per week
I am able to resolve the top two points but unable to approach the last.
My attempt -
This gives me number of issues created every week.
Similarly I have done for Closed per week.
For Net number of issues (Created-Closed) -
I tried adding Closed At column along with Created On but I can't see second bar in the chart along with Created On either.
Something like this
I tried doing the same in excel -
I want something of this sort but with another column as the difference of
number of issues created that week - number of issues closed that week.
In this case, 8-6=2.
You could use a calculated field(Analysis->Create Calculated Field). Something like this:
{FIXED [Create Date]:Count(if DATEPART('year',[Create Date]) = 2016 then [Number of Records] end)} - {FIXED [Closed Date]:Count(if DATEPART('year',[Closed Date]) = 2016 then [Number of Records] end)}
This calculation uses LOD expressions to pull back both sets of values. It restricts each date field to 2016 and then subtracts the closed count from the created count.
For more on LODs, see here:
https://www.tableau.com/about/blog/LOD-expressions
Use this as your measure and pull in one of your date fields as the dimension.
The normal way to solve this problem is to reshape the data so you have one row per status change instead of one row per issue, with a column named [Date] and a column named [Action]. The action can be submit or close (or, in a more complex world, include approve, reject, whatever, to track the history).
You can do the reshaping without modifying your source data by using a UNION to get two copies of each row, with appropriate calculated fields to make the visible columns make sense. For example, create a calculated field called Date that returns the submission date or the closing date depending on whether the row comes from the first or the second half of the union, with a similar one called Action whose value depends on that as well. Filter out Close actions that have a null date.
Or you can preprocess the data to reshape it.
Or you can use data blending with two sources that point to the same data, customizing the linking fields to line up the submit and close dates (e.g., duplicate the data connection and rename both date fields to have the same name). But in this case, you probably want to create a scaffolding source that has every date, but no other data, and use it as the primary data source to avoid filtering out data from the secondary for dates that don't appear in the primary. The blending approach can be brittle.
Assuming you used the UNION approach instead of Data Blending, then you can count the number of submissions and closures within a certain date range, or compute a running total of the difference to see the backlog size over time.
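The reshaping idea is independent of Tableau, so here is a small Python sketch of it on the sample data from the question. It assumes weeks start on Monday (Tableau's default week start may differ by locale), and `week_start` is a hypothetical helper. Each issue becomes a Created event plus, if it has a closing date, a Closed event.

```python
from collections import Counter
from datetime import datetime, timedelta

# (Created On, Closed at) per issue, taken from the sample data above
issues = [
    ("12/21/2016 15:26", "1/13/2017 9:48"),
    ("1/10/2017 7:38", "1/13/2017 9:08"),
    ("1/13/2017 8:57", "1/13/2017 8:58"),
    ("12/20/2016 20:30", "1/13/2017 5:46"),
    ("12/21/2016 19:30", "1/13/2017 1:14"),
    ("12/20/2016 20:30", "1/12/2017 9:11"),
    ("1/9/2017 17:44", "1/12/2017 1:52"),
    ("12/21/2016 19:36", "1/11/2017 16:59"),
    ("12/20/2016 19:54", "1/11/2017 15:45"),
]

def week_start(text):
    """Parse a timestamp and return the Monday of its week."""
    day = datetime.strptime(text, "%m/%d/%Y %H:%M").date()
    return day - timedelta(days=day.weekday())

# Reshape: one (week, action) event per status change, not one row per issue.
events = []
for created_on, closed_at in issues:
    events.append((week_start(created_on), "Created"))
    if closed_at:  # issues that are still open produce no Closed event
        events.append((week_start(closed_at), "Closed"))

counts = Counter(events)
weeks = sorted({week for week, _ in events})

backlog = 0
summary = []  # (week, created, closed, net, running backlog)
for week in weeks:
    n_created = counts[(week, "Created")]
    n_closed = counts[(week, "Closed")]
    backlog += n_created - n_closed
    summary.append((week.isoformat(), n_created, n_closed,
                    n_created - n_closed, backlog))
```

Once the events share a single date column, the net value per week is just created minus closed, and the running total gives the backlog size over time.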

Access Query: get difference of dates with a twist

I'm going to do my best to explain this so I apologize in advance if my explanation is a little awkward. If I am foggy somewhere, please tell me what would help you out.
I have a table filled with circuits and dates. Each circuit gets trimmed on a time cycle of about 36 months or 48 months. I have a column that gives me this info. I have one record for every time a circuit's trim cycle has been completed. I am attempting to link a known circuit outage list, to a table with their outage data, to a table with the circuit's trim history. The twist is the following:
I only want to get back circuits that have exceeded their trim cycles by 6 months. So I would need to take all records for a circuit, look at each individual record, find the most recent previous record relative to the record currently being examined (I will need every record examined individually), calculate the difference between the two records in months, then return only the records that exceeded 6 months of difference between any two entries for a given feeder.
Here is an example of the data:
+----+--------+----------+-------+
| ID | feeder | comp | cycle |
| 1 | 123456 | 1/1/2001 | 36 |
| 2 | 123456 | 1/1/2004 | 36 |
| 3 | 123456 | 7/1/2007 | 36 |
| 4 | 123456 | 3/1/2011 | 36 |
| 5 | 123456 | 1/1/2014 | 36 |
+----+--------+----------+-------+
Here is an example of the result set I would want (please note: cycle can vary by circuit, so the value in the cycle column needs to be in the calculation to determine if I exceeded the cycle by 6 months between trimmings):
+----+--------+----------+-------+
| ID | feeder | comp | cycle |
| 3 | 123456 | 7/1/2007 | 36 |
| 4 | 123456 | 3/1/2011 | 36 |
+----+--------+----------+-------+
This is the query I started but I'm failing really hard at determining how to make the date calculations correctly:
SELECT temp_feederList.Feeder, Temp_outagesInfo.causeType, Temp_outagesInfo.StormNameThunder, Temp_outagesInfo.deviceGroup, Temp_outagesInfo.beginTime, tbl_Trim_History.COMP, tbl_Trim_History.CYCLE
FROM (temp_feederList
LEFT JOIN Temp_outagesInfo ON temp_feederList.Feeder = Temp_outagesInfo.Feeder)
LEFT JOIN tbl_Trim_History ON Temp_outagesInfo.Feeder = tbl_Trim_History.CIRCUIT_ID;
I wasn't really able to figure out where I need to go from here to get that most recent entry and perform the mathematical comparison. I've never been asked to do SQL this complex before, so I want to thank all of you for your patience and any assistance you're willing to lend.
I'm making some assumptions, but this uses a correlated subquery to return the rows in the trim history whose completion date is later than the previous completion date plus the number of months given by the cycle:
SELECT tbl_Trim_History.ID, tbl_Trim_History.feeder,
tbl_Trim_History.comp, tbl_Trim_History.cycle
FROM tbl_Trim_History
WHERE tbl_Trim_History.comp>
(SELECT Max(DateAdd("m", tbl_Trim_History.cycle, comp))
FROM tbl_Trim_History T2
WHERE T2.feeder = tbl_Trim_History.feeder AND
T2.comp < tbl_Trim_History.comp)
If you need to check for a gap longer than the cycle itself (say, the extra six months you mention), you can add that number to the months passed to the DateAdd function.
Also, I don't know whether the value of cycle specifies the number of months since the prior cycle or the number of months until the next one. If the latter, I would change tbl_Trim_History.cycle in the DateAdd function to just cycle.
SELECT tbl_trim_history.ID, tbl_trim_history.Feeder,
tbl_trim_history.Comp, tbl_trim_history.Cycle,
(select max(comp) from tbl_trim_history T
where T.feeder=tbl_trim_history.feeder and
t.comp<tbl_trim_history.comp) AS PriorComp,
IIf(DateDiff("m",[priorcomp],[comp])>[Cycle],"x") AS [Select]
FROM tbl_trim_history;
This query identifies (with an X in the last column) the records from tbl_trim_history that exceed the cycle time - but as noted in the comments I'm not entirely sure if this is what you need or not, or how to incorporate the other 2 tables. Once you see what it is doing you can modify it to only keep the records you need.
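For anyone who wants to check the logic by hand, the "compare each record with the previous one for the same feeder" step can be sketched in Python. This assumes the sample rows from the question. `months_between` and `overdue_records` are hypothetical helpers, and the month difference ignores the day of the month, roughly like DateDiff("m", ...).

```python
from datetime import date

# (ID, feeder, comp, cycle) rows from the sample data above
history = [
    (1, 123456, date(2001, 1, 1), 36),
    (2, 123456, date(2004, 1, 1), 36),
    (3, 123456, date(2007, 7, 1), 36),
    (4, 123456, date(2011, 3, 1), 36),
    (5, 123456, date(2014, 1, 1), 36),
]

def months_between(earlier, later):
    """Calendar months between two dates, ignoring the day of the month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def overdue_records(rows, extra_months=0):
    """Return rows completed more than cycle + extra_months months after
    the previous completion for the same feeder."""
    result = []
    last_comp = {}  # feeder -> most recent earlier completion date
    for rec in sorted(rows, key=lambda r: (r[1], r[2])):
        _id, feeder, comp, cycle = rec
        prior = last_comp.get(feeder)
        if prior is not None and months_between(prior, comp) > cycle + extra_months:
            result.append(rec)
        last_comp[feeder] = comp
    return result

overdue_ids = [rec[0] for rec in overdue_records(history)]
```

With the default `extra_months=0` this returns IDs 3 and 4, which matches the expected result in the question. Note that ID 3 was completed exactly 42 months after ID 2, so a strict "more than cycle + 6" test (`extra_months=6`) would keep only ID 4.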

Match similar zip codes

Background
To replace invalid zip codes.
Sample Data
Consider the following data set:
Typo | City | ST | Zip5
-------+------------+----+------
33967 | Fort Myers | FL | 33902
33967 | Fort Myers | FL | 33965
33967 | Fort Myers | FL | 33911
33967 | Fort Myers | FL | 33901
33967 | Fort Myers | FL | 33907
33967 | Fort Myers | FL | 33994
34115 |Marco Island| FL | 34145
34115 |Marco Island| FL | 34146
86405 | Kingman | FL | 86404
86405 | Kingman | FL | 86406
33967 closely matches 33965, although 33907 could also be correct. (In this case, 33967 is a valid zip code, but not in our zip code database.)
34115 closely matches 34145 (off by one digit, with a difference of 3 for that digit).
86405 closely matches both 86404 and 86406.
Sometimes digits are simply reversed (e.g., 89 instead of 98).
Question
How would you write a SQL statement that finds the "minimum distance" between multiple numbers that have the same number of digits, returning at most one result no matter what?
Ideas
Subtract the digits.
Use LIMIT 1.
Conditions
PostgreSQL 8.3
This sounds like a case for Levenshtein distance.
"The Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character."
PostgreSQL provides it through the fuzzystrmatch contrib module (so the module has to be installed in the database first):
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
levenshtein
-------------
2
(1 row)
http://www.postgresql.org/docs/8.3/static/fuzzystrmatch.html
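As a rough illustration of how the distance could be combined with "return at most one result", here is a Python sketch. The `levenshtein` function is a plain dynamic-programming implementation, not the PostgreSQL one. The tie-breaking rule in `closest_zip` (smallest numeric difference, then smallest zip) is my own assumption, since the question does not say how ties should be resolved.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def closest_zip(typo, candidates):
    """Pick exactly one candidate, or None if there are no candidates.

    Ties on edit distance are broken by numeric difference, then by the
    zip code itself (an assumed rule, chosen only to make the result unique).
    """
    if not candidates:
        return None
    return min(candidates,
               key=lambda z: (levenshtein(typo, z), abs(int(typo) - int(z)), z))

best = closest_zip("33967", ["33902", "33965", "33911", "33901", "33907", "33994"])
```

One caveat: plain Levenshtein counts two reversed digits (89 instead of 98) as two edits, so transpositions are not favored over two unrelated digit errors.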
Redfilter answered the question that was asked, but I just wanted to clarify that the requested solution will not resolve what appears to be the real problem.
The real problem here seems to be that you have a database which was keyed in by hand, and some numbers were mistranscribed, leaving garbage data.
The ONLY way to solve this problem is to validate the full address against a database like the USPS, MapQuest, or another provider. I know the first two have APIs available for doing this.
The example I gave in a comment above was to consider a zip of 75084 and a city value of Richardson. Richardson has the zip codes 75080, 75081, 75082, 75083, and 75085. Each of them is one edit away from 75084, so which one is correct?
An equally serious problem: suppose the entered zip code was 75083 for Richardson. That is a valid zip code for that city, but what if the address actually lies in 75082?
The only way to get that is to have the full address validated.