SSIS - Script Component, Split single row to multiple rows (Parent Child Variation) - sql

Thanks in advance for your help. I need help writing an SSIS script component that splits a single delimited row into multiple rows. I looked at several helpful blog posts, listed below:
http://beyondrelational.com/ask/public/questions/1324/ssis-script-component-split-single-row-to-multiple-rows-parent-child-variation.aspx
http://bi-polar23.blogspot.com/2008/06/splitting-delimited-column-in-ssis.html
However, I need a little extra help on coding to complete the project. Basically here's what I want to do.
Input data
ID Item Name
1 Apple01,02,Banana01,02,03
2 Spoon1,2,Fork1,2,3,4
Output data
ParentID ChildID Item Name
1 1 Apple01
1 2 Apple02
1 3 Banana01
1 4 Banana02
1 5 Banana03
2 1 Spoon1
2 2 Spoon2
2 3 Fork1
2 4 Fork2
2 5 Fork3
2 6 Fork4
Below is my attempt at the code, but feel free to revise the whole thing if it's illogical. The SSIS output is set to asynchronous.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim posID As Integer, childID As Integer
Dim delimiter As String = ","
Dim txtHolder As String, suffixHolder As String
Dim itemName As String = Row.ItemName
Dim keyField As Integer = Row.ID
If Not (String.IsNullOrEmpty(itemList)) Then
Dim inputListArray() As String = _
itemList.Split(New String() {delimiter}, _
StringSplitOptions.RemoveEmptyEntries)
For Each item As String In inputListArray
Output0Buffer.AddRow()
Output0Buffer.ParentID = keyField
If item.Length >= 3 Then
txtHolder = Trim(item)
Output0Buffer.ItemName = txtHolder
'when item length is less than 3, it's suffix
Else
suffixHolder = Trim(item)
txtHolder = Left(txtHolder.ToString(), Len(txtHolder) _
- Len(suffixHolder)) & suffixHolder.ToString()
Output0Buffer.ItemName = txtHolder
End If
Next
End If
End Sub
The current code produces the following output
ID Item Name
1 Apple01
1 02
1 Banana01
1 02
1 03
2 Spoon1
2 2
2 Fork1
2 2
2 3
2 4

If I come across as pedantic in this response, it is not my intention. Based on the comment "I'm new at coding and having a problem troubleshooting" I wanted to walk through my observations and how I came to them.
Problem analysis
The desire is to split a single row into multiple output rows based on a delimited field associated to the row.
The code as it stands generates the appropriate number of rows, so the asynchronous (split) part of the script is working, which is a plus. What still needs to happen: 1) populate the Child ID column, and 2) apply the item prefix to all subsequent rows when generating the child items.
I treat most every problem like that. What am I trying to accomplish? What is working? What isn't working? What needs to be done to make it work? Decomposing problems into smaller and smaller problems will eventually result in something you can do.
Code observations
Pasting in the supplied code resulted in an error that itemList was not declared. Based on usage, it seems that it was intended to be itemName.
After fixing that, you should notice the IDE indicating you have 2 unused variables (posID, childID) and that the variable txtHolder is used before it has been assigned a value. A null reference exception could result at runtime. My coworker often remarks that warnings are errors that haven't grown up yet, so my advice to you as a fledgling developer is to pay attention to warnings unless you explicitly expect the compiler to warn you about said scenario.
Getting started
With a choice between solving the Child ID situation and the name prefix/suffix stuff, I'd start with the easy one, the child ID.
Generating a surrogate key
That's the fancy title phrase that, if you searched on it, would give you plenty of hits to ssistalk or sqlis or any of a number of fabulously smart bloggers. The devil, of course, is knowing what to search on. Nowhere do you ever compute or assign the child ID value to the stream, which of course is why it isn't showing up there.
We simply need to generate a monotonically increasing number which resets each time the source id changes. I am making an assumption that the inbound ID is unique in the incoming data like a sales invoice number would be unique and we are splitting out the items purchased. However if those IDs were repeated in the dataset, perhaps instead of representing invoice numbers they are salesperson id. Sales Person 1 could have another row in the batch selling vegetables. That's a more complex scenario and we can revisit if that better describes your source data.
There are two parts to generating our surrogate key (again, break problems down into smaller pieces). The first thing to do is make a thing that counts up from 1 to N. You have defined a childId variable to serve this. Initialize this variable (1) and then increment it inside your foreach loop.
Now that we are counting, we need to push that value onto the output stream. Putting those two steps together would look like
childID = 1
For Each item As String In inputListArray
    Output0Buffer.AddRow()
    Output0Buffer.ParentId = keyField
    Output0Buffer.ChildId = childID
    ' VB has no ++ operator; childID += 1 is the shorthand
    childID = childID + 1
Next
Run the package and success! Scratch the generate surrogate key off the list.
String mashing
I don't know of a fancy term for what needs to be done in the other half of the problem but I needed some title for this section. Given the source data, this one might be harder to get right. You've supplied values of Apple01, Banana01, Spoon1, Fork1. It looks like there's a pattern there (name concatenated with a code) but what is it? Your code indicates that if an item is shorter than 3 characters, it's a suffix, but how do you know what the base is? The first row uses a leading 0 and is two digits long while the second row does not use a leading zero. This is where you need to understand your data. What is the rule for identifying the "code" part of the first row? Some possible algorithms:
Force your upstream data providers to provide consistent length codes (I think this has worked once in my 13 years but it never hurts to push back against the source)
Assuming code is always digits, evaluate each character in reverse order testing whether it can be cast to an integer (Handles variable length codes)
Assume the second element in the split array will provide the length of the code. This is the approach you are taking with your code and it actually works.
I made no changes to make the generated item name work beyond fixing the local variables ItemName/itemList. Final code eliminates the warnings by removing PosID and initializing txtHolder to an empty string.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim childID As Integer
    Dim delimiter As String = ","
    Dim txtHolder As String = String.Empty, suffixHolder As String
    Dim itemName As String = Row.ItemName
    Dim keyField As Integer = Row.ID
    If Not (String.IsNullOrEmpty(itemName)) Then
        Dim inputListArray() As String = _
            itemName.Split(New String() {delimiter}, _
            StringSplitOptions.RemoveEmptyEntries)
        ' The inputListArray (our split out field)
        ' needs to generate values from 1 to N
        childID = 1
        For Each item As String In inputListArray
            Output0Buffer.AddRow()
            Output0Buffer.ParentId = keyField
            Output0Buffer.ChildId = childID
            ' VB has no ++ operator; childID += 1 is the shorthand
            childID = childID + 1
            If item.Length >= 3 Then
                txtHolder = Trim(item)
                Output0Buffer.ItemName = txtHolder
            Else
                'when item length is less than 3, it's suffix
                suffixHolder = Trim(item)
                txtHolder = Left(txtHolder.ToString(), Len(txtHolder) _
                    - Len(suffixHolder)) & suffixHolder.ToString()
                Output0Buffer.ItemName = txtHolder
            End If
        Next
    End If
End Sub
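For comparison, here is a minimal sketch of the second option listed above (scanning backwards for trailing digits), written as plain VB.NET outside the SSIS buffer classes. The module and function names (SuffixExpansion, TrailingDigitCount, ExpandItems) are made up for illustration, and the sketch assumes the code part is always digits and that an all-digit item belongs to the most recent named item.

```vb
Imports System
Imports System.Collections.Generic

Module SuffixExpansion
    ' Counts how many characters at the end of value are digits.
    Function TrailingDigitCount(value As String) As Integer
        Dim count As Integer = 0
        For i As Integer = value.Length - 1 To 0 Step -1
            If Not Char.IsDigit(value(i)) Then Exit For
            count += 1
        Next
        Return count
    End Function

    ' Expands a list such as "Apple01,02,Banana01,02,03" into full item names.
    ' Assumption: an item made only of digits is a code for the last named item.
    Function ExpandItems(itemList As String) As List(Of String)
        Dim result As New List(Of String)
        Dim baseName As String = String.Empty
        For Each raw As String In itemList.Split(New String() {","}, StringSplitOptions.RemoveEmptyEntries)
            Dim item As String = raw.Trim()
            If item.Length = 0 Then Continue For
            Dim digits As Integer = TrailingDigitCount(item)
            If digits < item.Length Then
                ' The item carries its own name: remember the part before the digits.
                baseName = item.Substring(0, item.Length - digits)
                result.Add(item)
            Else
                ' The item is only a code: prepend the remembered name.
                result.Add(baseName & item)
            End If
        Next
        Return result
    End Function

    Sub Main()
        For Each name As String In ExpandItems("Spoon1,2,Fork1,2,3,4")
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

Inside the script component, the body of the For Each loop would assign each expanded name to Output0Buffer.ItemName instead of adding it to a list. Unlike the length-based rule, this does not depend on codes being shorter than 3 characters.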

Related

Refined list sorting by substring integer after alphabetical sorting

I have some information in a list (called listLines). Each line below is in a List(Of String).
1|This is just a header
3|This is just a footer
2|3456789|0000000|12312312313|BLUE|1|35.00
2|7891230|0000000|45645645655|BLUE|1|22.00
2|7891230|0000000|45645645658|RED|2|13.00
2|3456789|0000000|12312312316|RED|2|45.00
2|3456789|0000000|12312312317|YELLOW|5|-9.00
2|3456789|0000000|12312312315|ORANGE|3|15.00
2|7891230|0000000|45645645659|YELLOW|5|32.00
2|3456789|0000000|12312312314|GREEN|4|-20.00
2|7891230|0000000|45645645656|GREEN|4|39.00
2|7891230|0000000|45645645657|ORANGE|3|-18.50
I'm doing a listLines.sort() on the list to sort it alphabetically. Below is what I get after the .sort().
1|This is just a header
2|3456789|0000000|12312312313|BLUE|1|35.00
2|3456789|0000000|12312312314|GREEN|4|-20.00
2|3456789|0000000|12312312315|ORANGE|3|15.00
2|3456789|0000000|12312312316|RED|2|45.00
2|3456789|0000000|12312312317|YELLOW|5|-9.00
2|7891230|0000000|45645645655|BLUE|1|22.00
2|7891230|0000000|45645645656|GREEN|4|39.00
2|7891230|0000000|45645645657|ORANGE|3|-18.50
2|7891230|0000000|45645645658|RED|2|13.00
2|7891230|0000000|45645645659|YELLOW|5|32.00
3|This is just a footer
With that said, I need to output this information to a file. I'm able to do this ok. I still have a problem though. There is a sequence number in the above data at position 5 just after the listed colors (RED, BLUE, ETC..) that you can see. It's just before the last value which is a decimal type.
I need to further sort this list, keeping it in alphabetical order since position 2 is an account number and I want to keep the account numbers grouped together. I just want them to be resorted in sequential order based on the sequence number.
I was looking at another thread trying to figure out how I can do this. I found a piece of code like listLines.OrderBy(Function(q) q.Substring(35)).ToArray. I think this would probably help me if this was a fixed length file, it isn't however. I was thinking I can do some kind of .split() to get the 5th piece of information and sort it but then it's going to unalphabetize and mix the lines back up because I don't know how to specify to still keep it alphabetical.
Right now I'm outputting my alphabetical list like below so I can format it with commas and double quotes.
For Each listLine As String In listLines
strPosition = Split(listLine, "|")
Dim i As Integer = 1
Dim iBound As Integer = UBound(strPosition)
Do While (i <= iBound)
strOutputText = strOutputText & Chr(34) & strPosition(i) & Chr(34) & ","
i += 1
Loop
My main question is how do I re-sort after .sort() to then get each account (position1) in sequential order (position 5)? OR EVEN BETTER, how can I do both at the same time?
The List(Of T) class has an overload of the Sort method that takes a Comparison(Of T) delegate. I would suggest that you use that. It allows you to write a method or lambda expression that will take two items and compare them any way you want. In this case, you could do that like this:
Dim items = New List(Of String) From {"1|This Is just a header",
                                      "3|This Is just a footer",
                                      "2|3456789|0000000|12312312313|BLUE|1|35.00",
                                      "2|7891230|0000000|45645645655|BLUE|1|22.00",
                                      "2|7891230|0000000|45645645658|RED|2|13.00",
                                      "2|3456789|0000000|12312312316|RED|2|45.00",
                                      "2|3456789|0000000|12312312317|YELLOW|5|-9.00",
                                      "2|3456789|0000000|12312312315|ORANGE|3|15.00",
                                      "2|7891230|0000000|45645645659|YELLOW|5|32.00",
                                      "2|3456789|0000000|12312312314|GREEN|4|-20.00",
                                      "2|7891230|0000000|45645645656|GREEN|4|39.00",
                                      "2|7891230|0000000|45645645657|ORANGE|3|-18.50"}
items.Sort(Function(x, y)
               Dim xParts = x.Split("|"c)
               Dim yParts = y.Split("|"c)
               'Compare by the first column first.
               Dim result = xParts(0).CompareTo(yParts(0))
               If result = 0 Then
                   'Compare by the second column next.
                   result = xParts(1).CompareTo(yParts(1))
               End If
               If result = 0 Then
                   'Compare by the sixth column last.
                   result = xParts(5).CompareTo(yParts(5))
               End If
               Return result
           End Function)
For Each item In items
    Console.WriteLine(item)
Next
If you prefer a named method then do this:
Private Function CompareItems(x As String, y As String) As Integer
    Dim xParts = x.Split("|"c)
    Dim yParts = y.Split("|"c)
    'Compare by the first column first.
    Dim result = xParts(0).CompareTo(yParts(0))
    If result = 0 Then
        'Compare by the second column next.
        result = xParts(1).CompareTo(yParts(1))
    End If
    If result = 0 Then
        'Compare by the sixth column last.
        result = xParts(5).CompareTo(yParts(5))
    End If
    Return result
End Function
and this:
items.Sort(AddressOf CompareItems)
Just note that this is rather inefficient because it splits both items on each comparison. That's not a big deal for a small list but, if there were a lot of items, it would be better to split each item once and then sort based on those results.
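One way to split each line only once is to project the lines into their parts, order those, and join them back afterwards. The following is a rough sketch using LINQ, not the only way to do it; the names SplitOnceSort, FieldOrEmpty, SequenceNumber and SortLines are invented here. It also parses the sequence column as an integer, on the assumption that it is numeric, because comparing it as text would place "10" before "2".

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SplitOnceSort
    ' Returns the field at index, or an empty string for short lines
    ' such as the header and footer.
    Function FieldOrEmpty(parts As String(), index As Integer) As String
        If parts.Length > index Then Return parts(index)
        Return String.Empty
    End Function

    ' Parses the sixth column as a number; lines without one sort as 0.
    Function SequenceNumber(parts As String()) As Integer
        Dim n As Integer
        If parts.Length > 5 AndAlso Integer.TryParse(parts(5), n) Then Return n
        Return 0
    End Function

    ' Splits every line once, orders by record type, account, then sequence,
    ' and rebuilds each line with the same delimiter.
    Function SortLines(lines As IEnumerable(Of String)) As List(Of String)
        Return lines.
            Select(Function(line) line.Split("|"c)).
            OrderBy(Function(parts) parts(0), StringComparer.Ordinal).
            ThenBy(Function(parts) FieldOrEmpty(parts, 1), StringComparer.Ordinal).
            ThenBy(Function(parts) SequenceNumber(parts)).
            Select(Function(parts) String.Join("|", parts)).
            ToList()
    End Function

    Sub Main()
        Dim lines As New List(Of String) From {
            "3|This is just a footer",
            "2|3456789|0000000|12312312316|RED|2|45.00",
            "2|3456789|0000000|12312312313|BLUE|1|35.00",
            "1|This is just a header"}
        For Each line As String In SortLines(lines)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Because the header and footer lines have fewer than six fields, the helper functions guard the array index rather than assuming every line has the same shape.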

Generating Custom Patient IDs in Access Database using VBA UDF

First, let me say this. I am new to using access and VBA functions.
My overall goal is to add functionality to my database as described below:
This database consists of patients enrolled in a Clinical Trial, these patients have a unique identifier in the format GKID-XXXXX where the XXXXX is an alphanumeric base 35 counting system.
Eg. the numbering goes like this GKID-00000, GKID-00001, GKID-00002, GKID-00003,... , GKID-0000Z. Base 35 because it exclude the letter O.
Previously, we would generate these IDs and type them in manually. However, in the future, we would like these to be automatically created when a new patient is added to the database. However,
we want to retain the ability to add IDs in manually without changing any existing IDs, delete records without changing the assigned IDs, and the IDs created cannot be already used.
I have tried many things and the naive strategy I have made progress with is as follows.
Take the existing "Working Table" that contains all of the existing IDs in a field. This field would be left blank for newly added patients who we want to automatically generate an ID for.
Using this working table, create a new table with a query. This would be the table with the IDs. It would exactly match the existing table except the ID column from the first table would be replaced with one that generates IDs with a custom VBA function. The function takes the Working Table ID field in as a variable and returns the generated ID. If the field is occupied, it simply returns the ID, if not, it generates a new one. Below is the progress I have made in accomplishing this.
Option Compare Database
Function GavFun2(EIID As String) As String
strAlphabet = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
These are the characters in the base 35 counting system.
N = Mid(EIID, 6, 5)
This simply extracts the 5 alphanumeric digits of the Working Table ID
Dec = (InStr(strAlphabet, Mid(N, 5, 1)) - 1) * 35 ^ 0 + (InStr(strAlphabet, Mid(N, 4, 1)) - 1) * 35 ^ 1 + (InStr(strAlphabet, Mid(N, 3, 1)) - 1) * 35 ^ 2 + (InStr(strAlphabet, Mid(N, 2, 1)) - 1) * 35 ^ 3 + (InStr(strAlphabet, Mid(N, 1, 1)) - 1) * 35 ^ 4
This Decodes this back into a base 10 system
GavFun2 = GavFun(CInt(Dec))
This converts the number back into the base 35 system and returns the ID in its full string form (function included below).
If EIID = Empty Then
End If
End Function
This If statement is where I am running into a wall. I want to find the maximum value of CInt(Dec), then simply return GavFun2 = GavFun(Max(CInt(Dec))+1). I feel like this would be a good start, but there would be a number of problems even if I were able to get this to work.
A. If there were multiple blank records, they would all have the same ID (maybe replace with a For loop that runs through each blank consecutively and starts the counter at Max(CInt(Dec))+1, but I don't know how to do this.)
B. If I were to add a new patient with a custom ID (or delete one), this could potentially change all of the generated IDs.
Any thoughts on my general approach or advice on how to proceed would be very much appreciated. Thank you so much for your help.
Option Compare Database
Function GavFun(IDD As Integer) As String
strAlphabet = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
If IDD = 0 Then
GavFun = "0"
Exit Function
End If
GavFun = vbNullString
Do While IDD <> 0
GavFun = Mid(strAlphabet, IDD Mod 35 + 1, 1) & GavFun
IDD = IDD \ 35
Loop
ZZ = Array("0", "00", "000", "0000", "00000")
L = Len(CStr(GavFun))
MM = ZZ(4 - L)
GavFun = "GKID-" & MM & GavFun
End Function
Ok, the way to approach this?
Well, we often need all kinds of special numbering systems: invoice numbers, PO numbers, maybe a badge number, employee numbers, etc. Now in these cases? Such numbers have VERY little to do with relational databases. I mean, just because you might not yet have some invoice number, should your whole software package come crashing down?
So, such external numbers? They don't have anything to do with how you build your relationships between tables. For that you use the PK (ID - autonumber), and then the FK (foreign key) in the child table. Things like city, invoice number, name, etc.? That's just data you store - ZERO to do with relationships.
Ok, so now that we've got the above out of the way.
First up:
You don't want to hit the whole database to find some max number or some such. While this might work for a single user, it's not really a good idea.
So, for badge numbers, incrementing product numbers, invoice numbers, and hey, maybe even a clinical trial number?
You create a table with a SINGLE row in it. That way you can even go and change/set what the starting number is to be. So yes, you ARE on the right track - you want to build your own custom numbering system, and then use that for a column in your table. (As I stated, this is just a number, or value - no different than city, first name, or say some invoice number - NOTHING to do with the PK, and NOTHING to do with relational database stuff.)
Ok, so, now we need some things here.
First, we need that table that keeps track of this number for us.
Next, we need a routine to get this "value" and then increment the value for the next time.
And then we need to set up a nice way to have that number automatically entered into the form for us, and of course make sure that if the record already has a number, we don't overwrite it.
Thus we can break this problem down into separate parts.
First, create our handy dandy maintenance table for the incrementing number (the code below opens it as NextPatientNum and reads its NextNumber and Prefix fields).
So that's our table. Nice part is we can edit the starting number, and even the prefix - so if you do another run/study, we can easily start a new number set. We DON'T care about the existing data - build a number system - it just runs and works like a good water pump.
Ok, next we need our routine to go get the next number.
Our code can work like this:
Public Function GetNextPID() As String
    Const Alpha = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
    ' 4 digits gives IDs like GKID-0007; the question's GKID-XXXXX format needs 5
    Const MaxDigts = 4
    Const Base = 35
    Dim OnePart As Long
    Dim OneDigit As Long
    Dim rstNext As DAO.Recordset
    Dim OurNum As Long
    Dim i As Integer
    Dim strResult As String
    Dim strPrefix As String
    ' read the current number and prefix, then store the incremented number
    Set rstNext = CurrentDb.OpenRecordset("NextPatientNum")
    OurNum = rstNext!NextNumber
    strPrefix = rstNext!Prefix
    rstNext.Edit
    rstNext!NextNumber = OurNum + 1
    rstNext.Update
    rstNext.Close
    ' convert our number to base
    Dim s As String
    For i = MaxDigts - 1 To 0 Step -1
        OnePart = Int(OurNum / (Base ^ i))
        s = Mid(Alpha, OnePart + 1, 1)
        strResult = strResult & s
        OurNum = OurNum - Int(OnePart * (Base ^ i))
    Next i
    GetNextPID = strPrefix & "-" & strResult
End Function
So, place the above code function in a standard code module (not in a form).
Now, hit ctrl-g, and we can test it like this from the debug (immediate window)
? GetNextPID
Output: GKID-0007
So, now you can just edit that "one row" maintenance table to start at whatever number you like. Once you set that number, each time you call that routine it will get you the next number based on our new number system.
Now, all we have to do is in that form? Well, we could just write code to check if the text box/field is not yet set, but there is ALSO an event on the form that fires ONLY when you add/insert a record in the form. It's not 100% clear when/how your form works, but we can do something like this:
Private Sub Form_BeforeInsert(Cancel As Integer)
Me.PatientNum = GetNextPID
End Sub
(of course you don't type in the event stub part).
Note closely: if the record is blank - not dirty - then nothing happens. You can even close the form. However, the instant you start typing anything into that form with a new record, then like magic that code will run and shove the new number into that text box on the form, and the new patient number will appear.
As noted, we can even go edit that maintenance table and give it a new prefix and new number for a whole new study.
Edit: feel free to use your existing logic - either way, don't try to hit the database table with existing data to generate the new ID - use your function idea, but have it simply return the next new number you need each time you call that function.
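The base 35 conversion that both the question and the answer rely on can be sketched on its own, away from the Access table logic. This is an illustrative VB.NET version rather than Access VBA; the module name Base35Id and the Encode and Decode functions are hypothetical, and the digit count is passed in so that the five-character GKID-XXXXX format can be produced.

```vb
Imports System

Module Base35Id
    ' Digits 0-9 followed by A-Z without the letter O: 35 symbols in total.
    Const Alphabet As String = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"

    ' Converts a non-negative number to a fixed-width base 35 string.
    Function Encode(value As Long, digits As Integer) As String
        If value < 0 Then Throw New ArgumentOutOfRangeException("value")
        Dim chars(digits - 1) As Char
        ' Fill from the rightmost position; unused positions become "0".
        For i As Integer = digits - 1 To 0 Step -1
            chars(i) = Alphabet(CInt(value Mod 35))
            value \= 35
        Next
        If value <> 0 Then Throw New OverflowException("Value needs more digits.")
        Return New String(chars)
    End Function

    ' Converts a base 35 string (without the prefix) back to a number.
    Function Decode(encoded As String) As Long
        Dim result As Long = 0
        For Each c As Char In encoded.ToUpperInvariant()
            Dim index As Integer = Alphabet.IndexOf(c)
            If index < 0 Then Throw New FormatException("Invalid character: " & c)
            result = result * 35 + index
        Next
        Return result
    End Function

    Sub Main()
        Console.WriteLine("GKID-" & Encode(35, 5))
        Console.WriteLine(Decode("0000Z"))
    End Sub
End Module
```

Rejecting characters outside the alphabet (including the letter O) in Decode is a design choice of this sketch; it makes a manually typed ID with a typo fail loudly instead of silently mapping to another patient number.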

How do I add a string input to an array?

Basically I need to add a name to an array of candidates for an election. The user enters the candidates' names, and I want to store them in an array. So far I have this:
Dim CandidateNames(candidates) As String
Dim x As Integer
'entering the names of each candidate so students can vote for them.
For x = 1 To candidates
Console.WriteLine("Enter a candidates name:")
CandidateNames(candidates) = Console.ReadLine()
Next
For x = 0 To candidates - 1
Console.WriteLine(CandidateNames(candidates) & " is candidate " & x)
Next
I then want to output all the names, which is what the second For loop is meant to do, but it only outputs the last entered name.
I'm in my second GCSE year of computer science, having never done any programming before, so go easy on me please.
Here is the working code:
Dim candidates As Integer
candidates = 5
' The declaration gives the upper bound, so this array holds indexes 0 to candidates - 1
Dim CandidateNames(candidates - 1) As String
Dim x As Integer
For x = 0 To candidates - 1
    Console.WriteLine("Enter a candidate's name:")
    CandidateNames(x) = Console.ReadLine()
Next
For x = 0 To candidates - 1
    Console.WriteLine(CandidateNames(x) & " is candidate " & (x + 1))
Next
In your code, you are not storing and accessing the strings correctly.
There are two issues with your code. The first is with your loop bounds: the first loop is this:
For x = 1 to candidates
And the second loop is this:
For x = 0 to candidates - 1
In .NET, the lower bound for one-dimensional arrays is always 0 (insert rant about .NET array indexing design choices here), so you should be starting from 0 as in the second loop. In VB the number in an array declaration is the upper bound rather than the count, so Dim CandidateNames(candidates) actually allocates candidates + 1 elements; conceptually, though, the second loop has the right final index: if you want an array of n items, it will be indexed from 0 to n - 1.
The second issue is that inside each loop, you are referring to CandidateNames(candidates) instead of CandidateNames(x). Instead of moving through each item in the loop in turn, you are only operating on the last item in the array.
Unless this is for an assignment requiring to use arrays, I'd suggest you consider using List(Of String) instead. Arrays make sense for a more limited set of uses cases, and I don't think this is one of them. Usually, the number of candidates for an election will be variable; with a list, you can have the user enter candidates until they're done, and the list will automatically expand as you go. Then, you can use a For Each loop to write out the contents of the list (though note you could use a For Each loop with an array as well). A list can still be accessed by index like an array.
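The list-based suggestion could look roughly like the sketch below. It assumes a blank line means the user is done entering names, and the names CandidateList and ReadCandidates are invented for the example; the reader is passed in so the same function works with Console.In or with any other text source.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module CandidateList
    ' Reads one name per line until a blank line or the end of the input.
    Function ReadCandidates(reader As TextReader) As List(Of String)
        Dim names As New List(Of String)
        Dim line As String = reader.ReadLine()
        While Not String.IsNullOrWhiteSpace(line)
            names.Add(line.Trim())
            line = reader.ReadLine()
        End While
        Return names
    End Function

    Sub Main()
        Console.WriteLine("Enter candidate names, one per line (blank line to finish):")
        Dim names As List(Of String) = ReadCandidates(Console.In)
        ' The list is still indexed from 0, like an array.
        For i As Integer = 0 To names.Count - 1
            Console.WriteLine(names(i) & " is candidate " & (i + 1))
        Next
    End Sub
End Module
```

With a list there is no need to know the number of candidates up front, which removes the upper-bound question entirely.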

Determine if an integer is a member of a list of integers

I need to determine if a particular integer does not exist in a DataGridView column. I assume I should create an array of the integers from the DGV column, and then check whether the integer exists in the array. However, there is perhaps an easier or simpler way.
I have looked at many articles but none of them resolve my task. Some of the Stack Overflow articles show similar solutions but I can't quite determine what to do.
For a = 0 To Dgv1.RowCount - 1
If Not Dgv1(1, a).Value = Dgv0(1, m).Value Then
Dgv0(1, Dgv0.RowCount - 1).Value = Dgv0(1, m).Value
End If
Next
I hope to compare an integer with a column of integers in a DataGridView; if it is present, do nothing, but if it is not present, add it to the DataGridView.
Are you using WPF? If yes, create a model.
Provide a checking mechanism in the setter, use an ObservableCollection or List, then bind it to the DataGridView.
Get the row and column of the DataGridView,
then compare (with a conditional statement) against the variable you want to check.
Of course this should be inside a loop, with the loop count equal to the number of rows in the DataGridView.
Here's an example code:
Dim column As String = "YourColumnNameHere"
' Assuming 2 is the number you wanna compare
Dim value As Integer = 2
For row As Integer = 0 To dataGridView.RowCount - 1
    If dataGridView.Rows(row).Cells(column).Value = value Then
        ' Do something here
    Else
        ' Do something here
    End If
Next
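One thing to watch with a loop like this: the decision to add the value can only be made after every row has been checked, not inside the Else branch of each comparison. A small sketch of that idea follows, using a plain List(Of Integer) to stand in for the values collected from the grid column; the names MembershipCheck and AddIfMissing are hypothetical.

```vb
Imports System
Imports System.Collections.Generic

Module MembershipCheck
    ' Adds candidate to values only when no existing entry matches it.
    ' Returns True when the value was added.
    Function AddIfMissing(values As List(Of Integer), candidate As Integer) As Boolean
        ' Contains scans the whole list before we decide anything.
        If values.Contains(candidate) Then Return False
        values.Add(candidate)
        Return True
    End Function

    Sub Main()
        ' Stand-in for the integers read from one DataGridView column.
        Dim columnValues As New List(Of Integer) From {1, 2, 3}
        Console.WriteLine(AddIfMissing(columnValues, 2))
        Console.WriteLine(AddIfMissing(columnValues, 7))
        Console.WriteLine(columnValues.Count)
    End Sub
End Module
```

In the grid version, the equivalent would be to set a Boolean flag when a matching cell is found, exit the loop, and add the new row only if the flag is still False afterwards.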

MS Access Extract Multiple Matching Text Strings from Long Text Field compared to Table List

Issue: Query is not able to pull all of the restricted words found in a Long Text Field. It is getting the restricted words from a Table Column of ~100 values.
Sample Data
Table: RecipeTable with Long Text Field: RecipeText
Example Contents of RecipeText Field: Add the rutabaga, leeks, carrots and cabbage to the Instant Pot®. Seal and cook on high pressure for 4 minutes. Quick release the steam. Thinly slice the brisket across the grain and transfer to a serving platter. Arrange the vegetables around the meat, sprinkle with the parsley and serve with the sour cream, horseradish and mustard on the side.
Desired Result:
Want to Compare RecipeText Field against every value in this Short Text Field RestrictedItem in Table: RestrictedTable.
RestrictedTable.RestrictedItem contains 100 values. Let's say it contains 6 for this exercise: milk, bake, spoon, carrots, mustard and steam.
Query would find these matched words in no particular order for a single record: carrots mustard steam
I've tried this: How to find words in a memo field with microsoft access
Result: Finds only 1 of many matches within the Long Text field.
Desired Result: Find ALL matched words extracted within the Long Text string. Duplicates & wildcards are fine. Case sensitive is bad.
Example Tried:
SELECT a.Adjectives, b.Content
FROM A, B
WHERE b.Content Like "*" & a.[adjectives] & "*"
LIKE and after is where I believe the issue is. I've tried using %, parentheses, spaces, etc to no avail.
Mine became this:
SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM RecipeTable, RestrictedTable
WHERE RecipeTable.RecipeText LIKE "*" & RestrictedTable.RestrictedItem & "*";
Notes:
I can find lots of advice to find single words, but not comparing whole table columns to one field.
And, lots of advice to find the first substring or nth position, but I want all of the substrings that match. Not the position & I'm afraid that applying trimming, etc, will slow things down on searching 100 words & trimming for each one.
I am fine making this a calculated field on my form that holds the RecipeText field.
Also fine with making a button that would launch a query to compare the RecipeText field with the RestrictedTable.RestrictedItem List & fill in an empty field RestrictedFound on the same form.
Below are two approaches to finding all restricted words that appear in a memo field. While this could all be done programmatically without staging/work tables, I would recommend using a temporary or permanent table to hold the words extracted from the memo field via the Split function in VBA (after accounting for punctuation and other data scrubbing).
After splitting the words from the memo field into an array they could then be inserted into a separate table with a foreign key reference to RecipeTable. This could be a temporary table or permanent if needed and could be part of the workflow process. A field like PendingReview could be added to RecipeTable for processing new records then marked as false afterwards so they won't be processed again.
After the words were added to the other table it could be joined to RecipeTable
by foreign key and you should have all matches of restricted words.
Once you have the information you could store the stats and discard the work record from your temporary table or delete the work records until the process is run again.
You could do it all in VBA with a dictionary lookup of the restricted words, i.e., query restricted words table, add to a dictionary then loop through matching each word in the memo field with lower case or case insensitive comparison, but it may take a while.
First Code Snippet Below
(If you want compile time checks then you must Reference the Microsoft Scripting Runtime my path is C:\Windows\SysWOW64\scrrun.dll)
Dim dic As Dictionary
Dim memoField As String
Dim words() As String
Dim matchCnt As Integer
'Other variables I didn't declare
Set dic = New Dictionary
'Code to populate dictionary
'Do Until rstRestricted.EOF
'    dic.Add LCase$(rstRestricted("restrictedWord")), 0
'    rstRestricted.MoveNext
'Loop
'rstRestricted.Close
'Set rstRestricted = Nothing
Set rst = New ADODB.Recordset
rst.Open "SELECT [MemoField] FROM RecipeTable;", CurrentProject.Connection
'Loop until EOF instead of relying on RecordCount
Do Until rst.EOF
    memoField = LCase$(Nz(rst("MemoField")))
    'Replace punctuation like commas, periods
    'memoField = Replace(memoField, ",","")
    'Now split after data scrubbed
    words = Split(memoField, " ")
    intWordCnt = UBound(words)
    For z = 0 To intWordCnt
        If LenB(words(z)) <> 0 Then
            If dic.Exists(words(z)) Then
                matchCnt = dic(words(z))
                dic(words(z)) = matchCnt + 1
            End If
        End If
    Next z
    rst.MoveNext
Loop
Dim WordKeys() As Variant
Dim y As Integer
Dim restrictedWord As String
Dim wordCnt As Integer
WordKeys = dic.Keys
For y = 0 To UBound(WordKeys)
    restrictedWord = CStr(WordKeys(y))
    'Look the count up in the dictionary, not in the key array
    wordCnt = CInt(dic(restrictedWord))
    'code to save or display stats
Next y
rst.Close
Set rst = Nothing
Set conn = Nothing
I would just do the split of all words into a working table with the word field indexed then do an aggregate with counts of restricted words.
Second Code Snippet
'Option Explicit
Dim sql As String
Dim memoDelimitedData() As String
'Other variables declared
'Code to open Recordset table for recipe and also code to open
'Work table with adOpenDynamic (SELECT * from WorkTable)
'loop through records to be processed
'Split Field (May need variant instead of array. My Access VBA is rusty)
words = Split(memoField, " ")
intWordCnt = UBound(words)
For x = 0 To intWordCnt
    With rstWorkTable
        .AddNew
        !Word = words(x)
        !ForeignKeyIdToRecipeTable = intForeignKeyId
        .Update
    End With
Next x
Then when you have the work table records added you can join to the RecipeTable and the RestrictedTable.
So build a WorkTable of delimited Words from the memo field. Have the foreign key reference to the recipe table then join the RestrictedTable to the WorkTable by the RestrictedItem.
If needed this could be a query for a make table or a staging table permanent table. etc.
So something like this would then give you matches, of any words in your restricted table:
SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM (RecipeTable
INNER JOIN WorkTable ON
RecipeTable.Id = WorkTable.ForeignKeyIdToRecipeTable)
INNER JOIN RestrictedTable ON
WorkTable.Word = RestrictedTable.RestrictedItem
MS Access Split Function
At that point you could do counts, sums, and other data.
I'm sorry I thought I had example code, but I couldn't find it. I had to do something like this in college many moons ago using VBA and Access (Word Count/Ranking assignment), but I can't find it. Nowadays I'd do this kind of stuff with SQL Server with numbers tables, XML/JSON functionality or the Full Text Searching capability.
Hopefully this may help point you in the right direction if you need to limit your work inside MS Access.
If you're not comfortable with working with ADODB or DAO recordsets you could build a CSV delimited file with the foreign key and the word then import that file into a work table.
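To make the dictionary-lookup idea concrete, here is a small sketch of the matching step in VB.NET rather than Access VBA. It assumes whole-word, case-insensitive matching on runs of letters, which is stricter than the wildcard LIKE in the question; the names RestrictedWords and FindRestricted are made up for the example.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RestrictedWords
    ' Returns every restricted word that occurs in recipeText, ignoring case.
    ' Each match is reported once, in lower case and alphabetical order.
    Function FindRestricted(recipeText As String, restricted As IEnumerable(Of String)) As SortedSet(Of String)
        Dim lookup As New HashSet(Of String)(restricted, StringComparer.OrdinalIgnoreCase)
        Dim found As New SortedSet(Of String)(StringComparer.OrdinalIgnoreCase)
        ' Pulling out runs of letters also drops punctuation such as commas.
        For Each m As Match In Regex.Matches(recipeText, "[A-Za-z]+")
            If lookup.Contains(m.Value) Then found.Add(m.Value.ToLowerInvariant())
        Next
        Return found
    End Function

    Sub Main()
        Dim restricted As String() = {"milk", "bake", "spoon", "carrots", "mustard", "steam"}
        Dim text As String = "Add the rutabaga, leeks, carrots and cabbage. Quick release the Steam. Serve with mustard."
        Console.WriteLine(String.Join(" ", FindRestricted(text, restricted)))
    End Sub
End Module
```

Each word costs one hash lookup, so the work grows with the length of the text rather than with the number of restricted words. The trade-off is that a restricted item such as "steam" will not match inside a longer word such as "steamed" unless the word list or the tokenising rule is extended.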