Converting input fields from text to columns in FoxPro or SQL

I have a set of input data in FoxPro. One of the fields, the grp field, is a concatenated string, the
individual portions of which are delimited by the pipe symbol, "|". Here are some examples of the values it can take:
grp:
ddd|1999|O|%
bce|%
aaa|2009|GON|Fixed|big|MAE|1
bbb|PAL|N|Fixed|MAE|1
aaa|SMK|O|Fixed|MAE|1|1
ddd|ERT|O|%
eef|%|N|%
afd|2000|O|%
afd|200907|O|%
swq|%|O|%
%
I would like to write a query that splits the data above into separate fields, using the pipe symbol as the separator, and outputs them to another SQL table. Taking the first two rows as an example, the output should read
Record 1:
Field1 = ddd Field2 = 1999 Field3 = O Field4 = %
Record 2:
Field1 = bce Field2 = % Field3 holds no value Field4 holds no value
It will not be known in advance what the greatest number of pipe symbols in the data will be. In the example above, it is 6, in records 3 and 5.
Is it actually possible to do this?

You can create a cursor and append the data into it using APPEND FROM. (Another way would be to call ALINES() twice, once to split the text into rows and once to split each row into columns.) For example, using your data as a text variable:
Local lcSample, lcTemp, lnFields, ix
TEXT to m.lcSample noshow
ddd|1999|O|%
bce|%
aaa|2009|GON|Fixed|big|MAE|1
bbb|PAL|N|Fixed|MAE|1
aaa|SMK|O|Fixed|MAE|1|1
ddd|ERT|O|%
eef|%|N|%
afd|2000|O|%
afd|200907|O|%
swq|%|O|%
%
ENDTEXT
lnFields = 0
Local Array laText[1]
For ix=1 To Alines(laText, m.lcSample)
m.lnFields = Max(m.lnFields, Occurs('|', m.laText[m.ix]) + 1) && fields = pipes + 1
Endfor
#Define MAXCHARS 20 && max field width expected
Local Array laField[m.lnFields,4]
For ix = 1 To m.lnFields
m.laField[m.ix,1] = 'F' + Ltrim(Str(m.ix))
m.laField[m.ix,2] = 'C'
m.laField[m.ix,3] = MAXCHARS
m.laField[m.ix,4] = 0
Endfor
lcTemp = Forcepath(Sys(2015)+'.txt', Sys(2023))
Strtofile(m.lcSample, m.lcTemp)
Create Cursor myData From Array laField
Append From (m.lcTemp) Delimited With "" With Character "|"
Erase (m.lcTemp)
Browse
However, in the real world this doesn't sound very realistic. You should know something about the data ahead of time.
You could also use FoxyClasses' import facilities to get the data. They let you choose the delimiters, map the columns, etc., but require some developer intervention for the final processing of the data.

The ALINES() function makes parsing easy. You could apply it to each line in turn. @Cetin has already shown you how to find out how many fields you need.
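For readers who want to see the line-by-line idea outside FoxPro, here is a minimal Python sketch of the same approach: split each line on the pipe, find the widest row, and pad the shorter rows with empty strings. The function name `split_rows` and the shortened sample list are illustrative assumptions, not part of either answer.

```python
def split_rows(lines, sep="|"):
    """Split each line on sep and pad short rows with empty strings."""
    rows = [line.split(sep) for line in lines]
    # The number of fields in a row is its number of separators plus one.
    width = max(len(row) for row in rows)
    return [row + [""] * (width - len(row)) for row in rows]


# A few of the grp values from the question.
sample = ["ddd|1999|O|%", "bce|%", "aaa|2009|GON|Fixed|big|MAE|1", "%"]
table = split_rows(sample)
for row in table:
    print(row)
```

Every row comes back with the same number of columns, so the result can be loaded into a fixed-width table.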

I had to do something very similar with some client data. They provided a field that was a space separated list of numbers that needed to be pulled out into a single column of numbers to match to an offer. Initially I dumped it to a text file, and imported it back into a new table. Something like the following:
create table groups ;
(group1 c(5), group2 c(5), group3 c(5), group4 c(5), group5 c(5))
select grp from infile to file grps.tmp noconsole plain
select groups
append from grps.tmp delimited with "" with character "|"

Related

How to retrieve the required string in SQL having a variable length parameter

Here is my problem statement:
I have a single-column table with data like this:
ROW-1>> 7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX
ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX
Here I want to split these values on '-' and load them into a new table. There are 11 segments in this string separated by '-', therefore 11 columns. The problem is:
A. The lengths of these values vary; however, I have to keep each value either at the length given by the standard format or at whatever length it actually has.
E.g. the first segment, 7302, should have four characters; if the value is shorter than that, e.g. 73, then it should be loaded as 73.
Therefore I have to split the string as well as maintain its integrity. The code I am writing is:
select
SUBSTR(PROFILE_ID,1,(case when length(instr(PROFILE_ID,'-')<>4) THEN (instr(PROFILE_ID,'-') else SUBSTR(PROFILE_ID,1,4) end)
)AS [RQUIRED_COLUMN_NAME]
from [TABLE_NAME];
I am getting a right parenthesis error.
Please help.
I used the regexp_substr SQL function to solve the above issue. Here is an example:
select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX', '[^-]+', 1, 1);
Output is: 7302 -- which is the 1st segment of the string
Similarly, the second segment of the string can be obtained by replacing the final 1 with 2 in the query above.
Example: select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX', '[^-]+', 1, 2);
Output: 2210177000, which is the 2nd segment of the string
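The same "n-th run of non-separator characters" idea can be sketched in Python with the standard `re` module. This is only an illustration of the pattern `[^-]+`; the helper name `nth_segment` is an assumption made for the example.

```python
import re


def nth_segment(text, n):
    """Return the n-th (1-based) run of non-hyphen characters, or None."""
    segments = re.findall(r"[^-]+", text)
    if 1 <= n <= len(segments):
        return segments[n - 1]
    return None


profile = "7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX"
print(nth_segment(profile, 1))
print(nth_segment(profile, 2))
```

Because each segment is returned at whatever length it has, a short value such as 73 comes back as 73 without padding.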

Select specified text from JSON column

I have a table in PostgreSQL, called writing for example. This table has a column json (text type). It contains text like this:
writing:[{"variableName":variableValue ...}]
The variableValues are of different types, including text, bigint and date.
I want to get all rows from writing where variableName has the value 2.
I'm using this select:
select * from writing where json::json->>'variableName' = '2' limit 5
This select returns 0 rows, but there is a lot of data in this table that should satisfy the condition. Any idea what is wrong, or do you have a better statement?
I'm using limit 5 because I need just 5 rows.
You'll have to prepend a { and append a } to make it a JSON object as you intend. As it is, it will become a single JSON string.
Then you'll have to access the attribute of the first array element (JSON array indexes are zero-based) as
('{' || json || '}')::json->'writing'->0->>'variableName'
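To make the wrapping step concrete, here is a small Python sketch using the standard `json` module. It assumes the stored text has a quoted key, i.e. `"writing":[{...}]`; if the key is unquoted in the real data, the text is not valid JSON even after wrapping. The helper name and sample rows are hypothetical.

```python
import json


def first_variable_value(fragment, key="variableName"):
    """Wrap a stored fragment in braces and read key from the first element."""
    document = json.loads("{" + fragment + "}")
    # Arrays are zero-based, so [0] is the first element.
    return document["writing"][0][key]


stored = ['"writing":[{"variableName":2}]', '"writing":[{"variableName":"abc"}]']
# Compare as text, mirroring how ->> returns text in the SQL version.
matches = [s for s in stored if str(first_variable_value(s)) == "2"]
print(matches)
```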

Check subset using either string or array in Impala

I have a table like this
col
-----
A,B
The col could be string with comma or array. I have flexibility on the storage.
How can I check if col is a subset of either another string or array variable? For example:
B,A --> TRUE (order doesn't matter)
A,D,B --> TRUE (other item in between)
A,D,C --> FALSE (missing B)
I have flexibility on the type. The variable is something I cannot store in a table.
Please let me know if you have any suggestion for Impala only (no Hive).
Thanks
Not a pretty method, but perhaps a starting point...
Assuming a table with a unique identifier column id and an array<string> column col, and a string variable with ',' as a separator (and no occurrences of escaped '\,')...
SELECT
yourTable.id
FROM
yourTable,
yourTable.col
GROUP BY
yourTable.id
HAVING
COUNT(DISTINCT CASE WHEN find_in_set(col.item, ${VAR:yourString}) > 0 THEN col.item END)
=
COUNT(DISTINCT col.item)
Basically...
Expand the arrays in your table, to one row per array item.
Check if each item exists in your string.
Aggregate back up to count how many of the distinct items were found in the string.
Check that the number of items found is the same as the number of distinct items in the array, i.e. that every item in col is also in the string.
The COUNT(DISTINCT <CASE>) copes with arrays like {'a', 'a', 'b', 'b'}.
Because both sides count array items, the string never needs to be expanded to an array or table (which I don't know how to do), and extra or repeated items in the string do no harm.
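As a cross-check of the logic the question asks for (every item in col must appear in the variable, in any order), here is a minimal Python sketch using sets. It is not Impala code; the function name `is_subset` is an assumption for illustration.

```python
def is_subset(col_items, variable, sep=","):
    """True if every item of col_items appears in the separated string."""
    wanted = set(col_items)              # duplicates in the array collapse
    available = set(variable.split(sep))
    return wanted <= available


col = ["A", "B"]
for candidate in ("B,A", "A,D,B", "A,D,C"):
    print(candidate, is_subset(col, candidate))
```

Running the three examples from the question through this check is a quick way to confirm which direction of the subset test a SQL version implements.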

Finding number of occurrences for specific value

I'm trying to create a field (calculation result) in FileMaker Pro 13 that will return the number of times a specific value is selected in a specific field.
For Example:
Say you have Table 1. Table 1 only has 1 field named Field 1. Field 1 is a drop down list field with the options "A","B", & "C". The following data is from the records of Table 1 using the field, Field 1:
Record 1: Table 1::Field 1 = "A"
Record 2: Table 1::Field 1 = "A"
Record 3: Table 1::Field 1 = "B"
Record 4: Table 1::Field 1 = "C"
What I want is a counter that searches across the records for table one and finds how many times a certain option is selected. For example, I want to know how many times "A" was selected in Field 1 and it would return "2".
What I have tried to do so far is the following but it hasn't worked out so hot (returns "?"):
ExecuteSQL(
"SELECT Field 1
FROM Table 1
WHERE Field 1 = 'A'"
;"";"")
Any suggestions for a correct SQL script?
The correct version of your ExecuteSQL call is:
ExecuteSQL(
"SELECT Count(\"Field 1\")
FROM \"Table 1\"
WHERE \"Field 1\" = ?"
;"";"";"A")
When you use ExecuteSQL, you're passing a string into FileMaker's function and then behind the scenes FileMaker uses that string and the various other pieces you give it to perform the action.
If you have a space in your field or table name, e.g. Field 1, FileMaker thinks you mean "select a field named Field and a field named 1". You need to quote the field name if it contains spaces or special characters, but you can't use just regular double quotes because that would end the string.
The way to fix it is what I did above: escape the double quotes around the field or table name.
Also, the ? and the "A" at the bottom allow you to pass data into the query, i.e. to parameterize it. This means you could run a loop and pass in a different value on each iteration where I have "A". E.g. you could do this:
ExecuteSQL(
"SELECT Count(\"Field 1\")
FROM \"Table 1\"
WHERE \"Field 1\" = ?"
;"";""; Table 1::Search Field)
or
ExecuteSQL(
"SELECT Count(\"Field 1\")
FROM \"Table 1\"
WHERE \"Field 1\" = ?"
;"";"";$searchValue)
Be careful though: ExecuteSQL doesn't cache the records it pulls in a server/client environment, so this calculation could get pretty sluggish if you have a lot of records in the table, if you're going over the WAN, or both. I would suggest trying to get the count a different way.
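The two ideas in this answer, quoting identifiers that contain spaces and passing the search value as a `?` parameter, are not specific to FileMaker. The following sketch shows them with Python's built-in `sqlite3` module instead; it is an analogy under that assumption, not FileMaker code, and the helper name `count_value` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Identifiers with spaces must be wrapped in double quotes.
conn.execute('CREATE TABLE "Table 1" ("Field 1" TEXT)')
conn.executemany(
    'INSERT INTO "Table 1" ("Field 1") VALUES (?)',
    [("A",), ("A",), ("B",), ("C",)],
)


def count_value(connection, value):
    """Count rows whose "Field 1" equals value, passed as a parameter."""
    sql = 'SELECT COUNT("Field 1") FROM "Table 1" WHERE "Field 1" = ?'
    return connection.execute(sql, (value,)).fetchone()[0]


# Loop over several search values, reusing the same parameterized query.
counts = {value: count_value(conn, value) for value in ("A", "B", "C")}
print(counts)
```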
Select count(*) from Table1 where Field1='A'

How to Replace Multiple Characters in Access SQL?

I'm a novice at SQL, so hopefully someone can spell this out for me. I tried following the "Replace Multiple Strings in SQL Query" posting, but I got stuck.
I'm trying to do the same thing as the originator of the above posting but with a different table and different fields. Let's say that the following field "ShiptoPlant" in table "BTST" has three records (my table actually has thousands of records)...
Table Name: BTST
---------------
| ShiptoPlant |
| ----------- |
| Plant #1 |
| Plant - 2 |
| Plant/3 |
---------------
Here's what I'm trying to type in the SQL screen:
SELECT CASE WHEN ShipToPlant IN ("#", "-", "/") Then ""
ELSE ShipToPlant END FROM BTST;
I keep getting the message (Error 3075)...
"Syntax error (missing operator) in query expression
'CASE WHEN ShiptoPlant IN (";","/"," ") Then "" ELSE ShipToPlant END'."
I want to do this operation for every character on the keyboard, with exception of "*" since it is a wildcard.
Any help you could provide would be greatly appreciated!
EDIT: Background Information added from the comments
I have collected line-item invoice-level data from each of our 14 suppliers for the 2008 calendar year. I am trying to normalize the plant names that are given to us by our suppliers.
Each supplier can call a plant by a different name e.g.
Signode Service on our master list could be called by suppliers
Signode Service
Signode - Service.
SignodeSvc
SignodeService
I'm trying to strip non-alphanumeric chars so that I can try to identify the plant using our master listing by creating a series of links that look at the first 10 char, if no match, 8 char, 6, 4...
My basic hang-up is that I don't know how to strip the non-alphanumeric characters from the table. I'll be doing this operation on several columns, but I planned on creating separate queries to edit the other columns.
Perhaps I need to do a mass update query that strips all the non-alphanumerics. I'm still unclear on how to write it. Here's what I started out with to take out all the spaces. It worked great, but failed when I tried to nest the replace:
UPDATE BTST SET ShipToPlant = replace(ShipToPlant," ","");
EDIT 2: Further Information taken from Comments
Every month, up to 100 new permutations of our plant names appear in our line item invoice data- this could represent thousands of invoice records. I'm trying to construct a quick and dirty way to assign a master_id of the definitive name to each plant name permutation. The best way I can see to do so is to look at the plant, address, city and state fields, but the problem with this is that these fields have various permutations as well, for example,
128 Brookview Drive
128 Brookview Lane
By taking out the non-alphanumerics and doing
LEFT(PlantName,#chars) & _
LEFT(Address,#chars) & _
LEFT(City,#chars) & _
LEFT(State,#chars)
and by changing the number of characters until a match is found between the invoice data and the Master Plant Listing (both tables contain the Plant, Address, City and State fields), you can eventually find a match. Of course, when you start dwindling down the number of characters you are LEFTing, the accuracy becomes compromised. I've done this in excel and had decent yield. Can anyone recommend a better solution?
You may wish to consider a User Defined Function (UDF)
SELECT ShiptoPlant, CleanString([ShiptoPlant]) AS Clean
FROM Table
Function CleanString(strText)
Dim objRegEx As Object
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
objRegEx.Pattern = "[^a-z0-9]"
CleanString = objRegEx.Replace(strText, "")
End Function
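For comparison, the same regular-expression idea (drop everything that is not a letter or digit, ignoring case) can be sketched in Python with the standard `re` module. The function name `clean_string` simply mirrors the VBA one above and is not part of the original answer.

```python
import re


def clean_string(text):
    """Remove every character that is not a-z or 0-9, ignoring case."""
    return re.sub(r"[^a-z0-9]", "", text, flags=re.IGNORECASE)


for raw in ("Plant #1", "Plant - 2", "Plant/3", "Signode - Service."):
    print(raw, "->", clean_string(raw))
```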
You could use the built in Replace function within Access
SELECT
Replace(Replace(Replace(ShipToPlant, "#", ""), "-", ""), "/", "") AS ShipToPlant
FROM
BTST
As others have said, within Access you can write your own functions in VBA and use them in your queries.
EDIT:
Here's a way to handle the nested Replace limit by wrapping the Replace function within our own function. It feels dirty but it works. Put this in a module within Access:
Public Function SuperReplace(ByRef field As String, ByVal ReplaceString As String) As String
' Size this as big as you need... it is zero-based by default'
Dim ReplaceArray(3) As String
'Fill each element with the character you need to replace'
ReplaceArray(0) = "#"
ReplaceArray(1) = "-"
ReplaceArray(2) = "/"
ReplaceArray(3) = " "
Dim i As Integer
For i = LBound(ReplaceArray) To UBound(ReplaceArray)
field = Replace(field, ReplaceArray(i), ReplaceString)
Next i
SuperReplace = field
End Function
Then test it with this query
SELECT
SuperReplace(ShipToPlant,"") AS ShipToPlant
FROM
BTST
You might want to take this and expand it so that you can pass in an array of strings instead of hard-coding them into the function.
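A sketch of that extension, written in Python rather than VBA to keep it short: the strings to remove are passed in as a list instead of being hard-coded. The name `super_replace` mirrors the VBA function, and the parameter names are assumptions for illustration.

```python
def super_replace(text, targets, replacement=""):
    """Replace each string in targets, one after another."""
    for target in targets:
        text = text.replace(target, replacement)
    return text


unwanted = ["#", "-", "/", " "]
for raw in ("Plant #1", "Plant - 2", "Plant/3"):
    print(super_replace(raw, unwanted))
```

Since the replacements run in list order, a longer string such as " - " should be listed before its single-character parts if it needs special handling.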
EDIT 2:
In response to the additional information in the comments on the question, here's a suggestion for how you might want to handle the situation differently. The advantage of this approach is that once you have mapped a plant name permutation, you won't need to perform a string replace on data in future years; you only add new plant names and permutations to the map.
Start with creating another table, let's call it plant_map
CREATE TABLE plant_map (id AUTOINCREMENT PRIMARY KEY, name TEXT, master_id LONG)
Add all of the permutations of plant names to plant_map. In the master_id field of each row, put the id of the name you wish to use to refer to that group of permutations. From your comments, I'll use Signode Service:
INSERT INTO plant_map(name, master_id) VALUES ("Signode Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode Svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode - Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("SignodeService", 1);
Now when you query BTST table, you can get data for Signode Service using
SELECT
field1,
field2
FROM
BTST source
INNER JOIN
(
plant_map map1
INNER JOIN
plant_map map2
ON map1.master_id = map2.id
)
ON source.ShipToPlant = map1.name
WHERE
map2.name = "Signode Service"
Data within table BTST can remain unchanged.
Essentially, this is joining on the plant name in BTST to the name in plant_map then, using master_id, self joining on id within plant_map so that you need only pass in one "common" name. I would advise putting an index on each of the columns name and master_id in plant_map as both fields will be used in joins.
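The mapping-table join can be tried out end to end with Python's built-in `sqlite3` module. This is a sketch in SQLite rather than Access, so the DDL differs slightly, and the `amount` column and the sample BTST rows are invented purely for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plant_map (
    id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, master_id INTEGER);
CREATE TABLE BTST (ShipToPlant TEXT, amount REAL);
""")
# "Signode Service" is inserted first, so it receives id 1 (the master id).
names = ["Signode Service", "Signode Svc", "Signode - Service", "SignodeService"]
conn.executemany(
    "INSERT INTO plant_map (name, master_id) VALUES (?, 1)",
    [(name,) for name in names],
)
conn.executemany(
    "INSERT INTO BTST VALUES (?, ?)",
    [("Signode Svc", 10.0), ("SignodeService", 5.0), ("Other Plant", 7.0)],
)
# Join BTST to its permutation, then self-join to the master name.
rows = conn.execute("""
SELECT source.ShipToPlant, source.amount
FROM BTST AS source
JOIN plant_map AS map1 ON source.ShipToPlant = map1.name
JOIN plant_map AS map2 ON map1.master_id = map2.id
WHERE map2.name = ?
ORDER BY source.ShipToPlant
""", ("Signode Service",)).fetchall()
print(rows)
```

Only the rows whose ShipToPlant is a known permutation of the master name come back; BTST itself is never modified.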
Don't think Access supports the CASE statement. Consider using iif:
iif ( condition, value_if_true, value_if_false )
For this case you can use the REPLACE function:
SELECT
REPLACE(REPLACE(REPLACE(yourfield, '#', ''), '-', ''), '/', '')
as FieldName
FROM
....
Create a public function in a Code module.
Public Function StripChars(ByVal pStringtoStrip As Variant, ByVal pCharsToKeep As String) As String
Dim sChar As String
Dim sTemp As String
Dim iCtr As Integer
sTemp = ""
For iCtr = 1 To Len(pStringtoStrip)
sChar = Mid(pStringtoStrip, iCtr, 1)
If InStr(pCharsToKeep, sChar) > 0 Then
sTemp = sTemp & sChar
End If
Next
StripChars = sTemp
End Function
Then in your query
SELECT
StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789") AS ShipToPlantDisplay
FROM
BTST
Notes: this will be slow for lots of records. If you want this to be permanent, then create an update query using the same function.
EDIT: to do an Update:
UPDATE BTST
SET ShipToPlant = StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789")
OK, your question has changed, so the solution will too. Here are two ways to do it. The quick and dirty way will only partially solve your issue because it won't be able to account for the more odd permutations like missing spaces or misspelled words. The quick and dirty way:
Create a new table - let's call it tChar.
Put a text field in it - the char(s) you want to replace - we'll call it char for this example.
Put all the chars or char combinations that you want removed in this table.
Create and run the query below. Note that it will only remove one item at a time, but you can also put different versions of the same replacement in it too, like ' -' or '-'.
For this example I created a table called tPlant with a field called ShipToPlant.
SELECT tPlant.ShipToPlant, Replace([ShipToPlant],
(SELECT top 1 char
FROM tChar
WHERE instr(ShipToPlant,char)<>0 ORDER BY len(char) Desc),""
) AS New
FROM tPlant;
The better (but much more complex) way. This explanation is going to be general because it would be next to impossible to put the whole thing in here. If you want to contact me directly use my user name at gmail.:
Create a table of Qualifiers - mistakes that people enter, like svc instead of service. Here you would enter every weird permutation you get.
Create a table with QualifierID and Plant ID. Here you would say which qualifier goes to which plant.
Create a query that joins the two and your table with mistaken plant names in it. Use instr to say what is in the fields.
Create a second query that aggregates the first. Count the instr field and use it as a score. The entry with the highest score is the plant.
You will have to hand enter the ones it can't find, but pretty soon that will be next to none as you have more and more entries in the table.
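The scoring idea in the steps above can be sketched in a few lines of Python: count how many known qualifiers occur in a raw plant name and pick the plant with the highest count. The qualifier texts, plant ids, and function name are all hypothetical examples, not data from the original post.

```python
# Hypothetical qualifier table: qualifier text -> plant id.
qualifiers = {"signode": 1, "svc": 1, "service": 1, "acme": 2, "mfg": 2}


def best_plant(raw_name, qualifier_map):
    """Return the plant id whose qualifiers match raw_name most often."""
    lowered = raw_name.lower()
    scores = {}
    for text, plant_id in qualifier_map.items():
        if text in lowered:  # plays the role of instr(...) <> 0
            scores[plant_id] = scores.get(plant_id, 0) + 1
    if not scores:
        return None  # no match: this name has to be entered by hand
    return max(scores, key=scores.get)


for raw in ("Signode - Svc", "ACME Mfg Co", "Unknown Plant"):
    print(raw, "->", best_plant(raw, qualifiers))
```

Ties are resolved arbitrarily here (the first plant to reach the top score wins), so a real implementation would need a rule for them.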
ughh
You have a couple different choices. In Access there is no CASE in sql, you need to use IIF. It's not quite as elegant as the solutions in the more robust db engines and needs to be nested for this instance, but it will get the job done for you.
SELECT
iif(instr(ShipToPlant,"#")<>0,"",
iif(instr(ShipToPlant,"-")<>0,"",
iif(instr(ShipToPlant,"/")<>0,"",ShipToPlant ))) AS FieldName
FROM BTST;
You could also do it using the sql to limit your data.
SELECT YourID, nz(aBTST.ShipToPlant,"") AS ShipToPlant
FROM BTST LEFT JOIN (
SELECT YourID, ShipToPlant
FROM BTST
WHERE ShipToPlant NOT IN("#", "-", "/")
) as aBTST ON BTST.YourID=aBTST.YourID
If you know VB you can also create your own functions and put them in the queries...but that is another post. :)
HTH
SELECT
IIF
(
Instr(1,ShipToPlant , "#") > 0
OR Instr(1,ShipToPlant , "/") > 0
OR Instr(1,ShipToPlant , "-") > 0, ""
, ShipToPlant
)
FROM BTST
All - I wound up nesting the REPLACE() function in two separate queries. Since there's upwards of 35 non-alphanumeric characters that I needed to replace and Access limits the complexity of the query to somewhere around 20 nested functions, I merely split it into two processes. Somewhat clunky, but it worked. Should have followed the KISS principle in this case. Thanks for your help!
I know this is a really old question, but I stumbled over it whilst looking for a solution to this problem, but ended up using a different approach.
The table that I wish to update is called 'Customers'. There are 20-odd accented characters in the 'CustName' field from which I wish to remove the diacritics - so (for example) ã > a.
I created a new table 'recodes' with 2 fields, 'char' and 'recode', and one row for each of these characters. 'char' contains the character I wish to remove, and 'recode' houses the replacement.
Then for the replace I used a cross join (Cartesian product) of the two tables inside the update statement
UPDATE Customers, Recodes SET Customers.CustName = Replace([CustName],[char],[recode]);
This has the same effect as nesting all of the replace statements, and is a lot easier to manage.
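The recode-table idea maps naturally onto a lookup table in code. Here is a minimal Python sketch using `str.maketrans` and `str.translate`; the three sample pairs stand in for the 'char' and 'recode' columns and are assumptions, as is the function name.

```python
# Stand-in for the recodes table: char -> recode (sample rows only).
recodes = {"ã": "a", "é": "e", "ü": "u"}


def apply_recodes(name, recode_map):
    """Replace every mapped character in name with its recode."""
    return name.translate(str.maketrans(recode_map))


print(apply_recodes("João Müller", recodes))
```

As with the table-driven update, adding a new character only means adding a row to the lookup, not another nested Replace.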
This expression takes the first 3 characters and, unless they are numeric, replaces them with an empty string.
Example: BO-1234
Output: 1234
BO: IIf(IsNumeric(Left([sMessageDetails],3)),[sMessageDetails],Replace([sMessageDetails],Left([sMessageDetails],3),""))
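To make the behaviour of that expression easier to follow, here is a rough Python equivalent. It is an approximation: `str.isdigit` is stricter than Access's IsNumeric (which also accepts values such as "1.5"), and the function name `strip_prefix` is made up for this sketch.

```python
def strip_prefix(value, width=3):
    """Remove the first width characters unless they are all digits."""
    prefix = value[:width]
    if prefix.isdigit():
        return value
    # Like Replace() in the Access expression, this removes every
    # occurrence of the prefix, not only the leading one.
    return value.replace(prefix, "")


for raw in ("BO-1234", "1234567", "AB-AB-1"):
    print(raw, "->", strip_prefix(raw))
```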