Generating Custom Patient IDs in Access Database using VBA UDF - sql

First, let me say this. I am new to using access and VBA functions.
My overall goal is to add functionality to my database as described below:
This database consists of patients enrolled in a clinical trial. Each patient has a unique identifier in the format GKID-XXXXX, where XXXXX is an alphanumeric base 35 counter.
E.g., the numbering goes GKID-00000, GKID-00001, GKID-00002, GKID-00003, ..., GKID-0000Z. It is base 35 because it excludes the letter O.
Previously, we would generate these IDs and type them in manually. In the future, we would like them to be created automatically when a new patient is added to the database.
However, we want to retain the ability to add IDs manually without changing any existing IDs and to delete records without changing the assigned IDs, and a generated ID must never duplicate one already in use.
I have tried many things and the naive strategy I have made progress with is as follows.
Take the existing "Working Table" that contains all of the existing IDs in a field. This field would be left blank for newly added patients who we want to automatically generate an ID for.
Using this working table, create a new table with a query. This would be the table with the IDs. It would exactly match the existing table except the ID column from the first table would be replaced with one that generates IDs with a custom VBA function. The function takes the Working Table ID field in as a variable and returns the generated ID. If the field is occupied, it simply returns the ID, if not, it generates a new one. Below is the progress I have made in accomplishing this.
Option Compare Database
Function GavFun2(EIID As String) As String
    ' These are the characters in the base 35 counting system.
    strAlphabet = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
    ' This simply extracts the 5 alphanumeric digits of the Working Table ID.
    N = Mid(EIID, 6, 5)
    ' This decodes them back into a base 10 number.
    Dec = (InStr(strAlphabet, Mid(N, 5, 1)) - 1) * 35 ^ 0 + (InStr(strAlphabet, Mid(N, 4, 1)) - 1) * 35 ^ 1 + (InStr(strAlphabet, Mid(N, 3, 1)) - 1) * 35 ^ 2 + (InStr(strAlphabet, Mid(N, 2, 1)) - 1) * 35 ^ 3 + (InStr(strAlphabet, Mid(N, 1, 1)) - 1) * 35 ^ 4
    ' This converts the number back into the base 35 system and returns the ID
    ' in its full string form (function included below).
    GavFun2 = GavFun(CInt(Dec))
    If EIID = Empty Then
    End If
End Function
This If statement is where I am running into a wall. I want to find the maximum value of CInt(Dec), then simply return GavFun2 = GavFun(Max(CInt(Dec)) + 1). I feel like this would be a good start, but there would be a number of problems even if I were able to get this to work.
A. If there were multiple blank records, they would all get the same ID (maybe replace this with a For loop that runs through each blank consecutively and starts the counter at Max(CInt(Dec)) + 1, but I don't know how to do this).
B. If I were to add a new patient with a custom ID (or delete one), this could potentially change all of the generated IDs.
Any thoughts on my general approach or advice on how to proceed would be very much appreciated. Thank you so much for your help.
Option Compare Database
Function GavFun(IDD As Integer) As String
    strAlphabet = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
    If IDD = 0 Then
        GavFun = "0"
        Exit Function
    End If
    GavFun = vbNullString
    Do While IDD <> 0
        GavFun = Mid(strAlphabet, IDD Mod 35 + 1, 1) & GavFun
        IDD = IDD \ 35
    Loop
    ZZ = Array("0", "00", "000", "0000", "00000")
    L = Len(CStr(GavFun))
    MM = ZZ(4 - L)
    GavFun = "GKID-" & MM & GavFun
End Function

Ok, the way to approach this?
Well, we often need all kinds of special numbering systems: invoice numbers, PO numbers, maybe a badge number, employee numbers, etc. In these cases, such numbers have VERY little to do with relational databases. I mean, just because you might not yet have some silly invoice number, should your whole software package come crashing down?
So, such external numbers don't have anything to do with how you build your relationships between tables. For that you use the PK (ID - autonumber), and then the FK (foreign key) in the child table. Things like city, invoice number, name, etc. are just data you store - ZERO to do with relationships.
Ok, so now that we got the above out of the way.
First up:
You don't want to hit the whole database to find some max number or some such. While this might work for a single user, it's not really a good idea.
So, for badge numbers, incrementing product numbers, invoice numbers, and hey, maybe even a clinical trial number?
You create a table with a SINGLE row in it. That way you can even go and change/set what the starting number/point is to be. So yes, you ARE on the right track - you want to build your own custom numbering system, and then use that for a column in your table (as I stated, this is just a number, or value - no different than city, first name, or say some invoice number - NOTHING to do with the PK, and NOTHING to do with relational database stuff).
Ok, so, now we need some things here.
First, we need that table that keeps track of this number for us.
Next, we need a routine to get this "value" and then increment the value for the next time.
And then we need to set up a nice way to get that number and have it automatically entered into the form for us, and of course make sure that if the record already has a number, we don't overwrite it.
Thus we can break this problem down into separate parts.
First, create our handy-dandy number-increment maintenance table: a table named NextPatientNum with a single row and two fields, NextNumber and Prefix (the names the code below reads).
So that's our table. The nice part is we can edit the starting number, and even the prefix - so if you do another run/study, we can easily start a new number set. We DON'T care about the existing data - build a number system - it just runs and works like a good water pump.
Ok, next we need our routine to go get the next number.
Our code can work like this:
Public Function GetNextPID() As String
    Const Alpha = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"
    Const MaxDigts = 5      ' five characters after the prefix, to match GKID-XXXXX
    Const Base = 35
    Dim OnePart As Long
    Dim OneDigit As Long
    Dim rstNext As DAO.Recordset
    Dim OurNum As Long
    Dim i As Integer
    Dim strResult As String
    Dim strPrefix As String
    ' read the current number and prefix, then store the incremented number
    Set rstNext = CurrentDb.OpenRecordset("NextPatientNum")
    OurNum = rstNext!NextNumber
    strPrefix = rstNext!Prefix
    rstNext.Edit
    rstNext!NextNumber = OurNum + 1
    rstNext.Update
    rstNext.Close
    ' convert our number to base 35, most significant digit first
    Dim s As String
    For i = MaxDigts - 1 To 0 Step -1
        OnePart = Int(OurNum / (Base ^ i))
        s = Mid(Alpha, OnePart + 1, 1)
        strResult = strResult & s
        OurNum = OurNum - Int(OnePart * (Base ^ i))
    Next i
    GetNextPID = strPrefix & "-" & strResult
End Function
So, place the above function in a standard code module (not in a form).
Now, hit ctrl-g, and we can test it like this from the debug (immediate) window:
? GetNextPID
Output: GKID-00007
So, now you can just edit that "one row" maintenance table to start at whatever number you like. Once you set that number, each time you call that routine, it will go get you the next number based on our new number system.
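The base 35 conversion is easy to check outside Access. Here is a minimal sketch in Python (not VBA) of the encoding and its inverse. It assumes five characters after the prefix, as in the question's GKID-XXXXX format; the names encode_id and decode_id are made up for illustration and are not part of the code above.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ"  # 35 symbols, letter O omitted
BASE = len(ALPHABET)
WIDTH = 5  # assumption: five characters after the prefix (GKID-XXXXX)


def encode_id(number, prefix="GKID"):
    """Render a non-negative counter as PREFIX-XXXXX in base 35."""
    if not 0 <= number < BASE ** WIDTH:
        raise ValueError("counter does not fit in the ID width")
    digits = []
    for _ in range(WIDTH):
        # peel off the least significant digit each time
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return prefix + "-" + "".join(reversed(digits))


def decode_id(patient_id):
    """Recover the counter from an ID such as GKID-0000Z."""
    value = 0
    for char in patient_id.split("-", 1)[1]:
        value = value * BASE + ALPHABET.index(char)
    return value
```

Decoding is handy when you seed the one-row table: decode the highest ID already typed in by hand, add 1, and store that as the next number.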
Now, all that's left is the form. We could just write code to check whether the text box/field is not yet set, but there is ALSO an event on the form that fires ONLY when you add/insert a record in the form. It's not 100% clear when/how your form works, but we can do something like this:
Private Sub Form_BeforeInsert(Cancel As Integer)
Me.PatientNum = GetNextPID
End Sub
(of course you don't type in the event stub part).
Note closely: if the record is blank - not dirty - then nothing happens. You can even close the form. However, the instant you start typing anything into that form with a new record, then like magic that code will run and shove the new number into that text box on the form, and the new patient number will appear.
As noted, we can even go edit that maintenance table and give it a new prefix and new number for a whole new study.
Edit: feel free to use your existing logic - either way, don't try to hit/use the database table with existing data to generate the new ID - use your function idea, but have it simply generate and return the next new number you need each time you call it.

Related

MS Access Extract Multiple Matching Text Strings from Long Text Field compared to Table List

Issue: Query is not able to pull all of the restricted words found in a Long Text Field. It is getting the restricted words from a Table Column of ~100 values.
Sample Data
Table: RecipeTable with Long Text Field: RecipeText
Example Contents of RecipeText Field: Add the rutabaga, leeks, carrots and cabbage to the Instant Pot®. Seal and cook on high pressure for 4 minutes. Quick release the steam. Thinly slice the brisket across the grain and transfer to a serving platter. Arrange the vegetables around the meat, sprinkle with the parsley and serve with the sour cream, horseradish and mustard on the side.
Desired Result:
Want to Compare RecipeText Field against every value in this Short Text Field RestrictedItem in Table: RestrictedTable.
RestrictedTable.RestrictedItem contains 100 values. Let's say it contains 6 for this exercise: milk, bake, spoon, carrots, mustard and steam.
Query would find these matched words in no particular order for a single record: carrots mustard steam
I've tried this: How to find words in a memo field with microsoft access
Result: Finds only 1 of many matches within the Long Text field.
Desired Result: Find ALL matched words extracted within the Long Text string. Duplicates & wildcards are fine. Case sensitive is bad.
Example Tried:
SELECT a.Adjectives, b.Content
FROM A, B
WHERE b.Content Like "*" & a.[adjectives] & "*"
LIKE and after is where I believe the issue is. I've tried using %, parentheses, spaces, etc to no avail.
Mine became this:
SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM RecipeTable, RestrictedTable
WHERE RecipeTable.RecipeText LIKE "*" & RestrictedTable.RestrictedItem & "*";
Notes:
I can find lots of advice to find single words, but not comparing whole table columns to one field.
And, lots of advice to find the first substring or nth position, but I want all of the substrings that match. Not the position & I'm afraid that applying trimming, etc, will slow things down on searching 100 words & trimming for each one.
I am fine making this a calculated field on my form that holds the RecipeText field.
Also fine with making a button that would launch a query to compare the RecipeText field with the RestrictedTable.RestrictedItem List & fill in an empty field RestrictedFound on the same form.
The code below shows two approaches to finding all restricted words in a memo field. While this could all be done programmatically without staging/work tables, I would recommend using a temporary or permanent table to hold the words extracted from the memo field via the Split function in VBA (after accounting for punctuation and other data scrubbing).
After splitting the words from the memo field into an array they could then be inserted into a separate table with a foreign key reference to RecipeTable. This could be a temporary table or permanent if needed and could be part of the workflow process. A field like PendingReview could be added to RecipeTable for processing new records then marked as false afterwards so they won't be processed again.
After the words were added to the other table it could be joined to RecipeTable
by foreign key and you should have all matches of restricted words.
Once you have the information you could store the stats and discard the work record from your temporary table or delete the work records until the process is run again.
You could do it all in VBA with a dictionary lookup of the restricted words, i.e., query restricted words table, add to a dictionary then loop through matching each word in the memo field with lower case or case insensitive comparison, but it may take a while.
First Code Snippet Below
(If you want compile-time checks, you must add a reference to the Microsoft Scripting Runtime; my path is C:\Windows\SysWOW64\scrrun.dll.)
Dim dic As Dictionary
Dim rst As ADODB.Recordset
Dim memoField As String
Dim words() As String
Dim matchCnt As Integer
Dim z As Long
'Other variables I didn't declare
Set dic = New Dictionary
'Code to populate dictionary (rstRestricted is a recordset over the restricted words)
'Do Until rstRestricted.EOF
'    dic.Add LCase$(rstRestricted("restrictedWord")), 0
'    rstRestricted.MoveNext
'Loop
'rstRestricted.Close
'Set rstRestricted = Nothing
Set rst = New ADODB.Recordset
rst.Open "SELECT [MemoField] FROM RecipeTable;", CurrentProject.Connection
'Walk the recordset with EOF/MoveNext; RecordCount is not reliable on a default cursor
Do Until rst.EOF
    memoField = LCase$(Nz(rst("MemoField"), ""))
    'Replace punctuation like commas, periods
    'memoField = Replace(memoField, ",", "")
    'Now split after data scrubbed
    words = Split(memoField, " ")
    For z = 0 To UBound(words)
        If LenB(words(z)) <> 0 Then
            If dic.Exists(words(z)) Then
                matchCnt = dic(words(z))
                dic(words(z)) = matchCnt + 1
            End If
        End If
    Next z
    rst.MoveNext
Loop
Dim WordKeys As Variant
Dim y As Integer
Dim restrictedWord As String
Dim wordCnt As Integer
WordKeys = dic.Keys
For y = 0 To UBound(WordKeys)
    restrictedWord = CStr(WordKeys(y))
    'Look the count up in the dictionary, not in the key array
    wordCnt = CInt(dic(restrictedWord))
    'code to save or display stats
Next y
rst.Close
Set rst = Nothing
I would just do the split of all words into a working table with the word field indexed then do an aggregate with counts of restricted words.
Second Code Snippet
'Option Explicit
Dim sql as String
Dim memoDelimitedData() as String
'Other variables declared
'Code to open Recordset table for recipe and also code to open
'Work table with adOpenDynamic (SELECT * from WorkTable)
'loop through records to be processed
'Split Field (May need variant instead of array. My Access VBA is rusty)
words = Split(memoField, " ")
intWordCnt = UBound(words)
For x = 0 to intWordCnt
With rstWorkTable
.AddNew
!Word = words(x)
!ForeignKeyIdToRecipeTable = intForeignKeyId
.Update
End With
Next x
Then when you have the work table records added you can join to the RecipeTable and the RestrictedTable.
So build a WorkTable of delimited Words from the memo field. Have the foreign key reference to the recipe table then join the RestrictedTable to the WorkTable by the RestrictedItem.
If needed, this could be a make-table query or a permanent staging table, etc.
So something like this would then give you matches, of any words in your restricted table:
SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM (RecipeTable
INNER JOIN WorkTable ON
RecipeTable.Id = WorkTable.ForeignKeyIdToRecipeTable)
INNER JOIN RestrictedTable ON
WorkTable.Word = RestrictedTable.RestrictedItem
MS Access Split Function
At that point you could do counts, sums, and other data.
I'm sorry I thought I had example code, but I couldn't find it. I had to do something like this in college many moons ago using VBA and Access (Word Count/Ranking assignment), but I can't find it. Nowadays I'd do this kind of stuff with SQL Server with numbers tables, XML/JSON functionality or the Full Text Searching capability.
Hopefully this may help point you in the right direction if you need to limit your work inside MS Access.
If you're not comfortable with working with ADODB or DAO recordsets you could build a CSV delimited file with the foreign key and the word then import that file into a work table.
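To make the dictionary-lookup idea concrete, here is a minimal sketch in Python rather than VBA. The function name find_restricted and the sample data are illustrative only. It assumes whole-word, case-insensitive matching, which differs from LIKE "*word*": the LIKE pattern also matches substrings, so "bake" would match "baked".

```python
import re


def find_restricted(text, restricted):
    """Return every restricted word that occurs as a whole word in text."""
    # Lower-case the text and keep only runs of letters, digits and apostrophes,
    # which strips commas, periods and other punctuation in one pass.
    words_in_text = set(re.findall(r"[a-z0-9']+", text.lower()))
    return [item for item in restricted if item.lower() in words_in_text]


recipe = (
    "Add the rutabaga, leeks, carrots and cabbage. Quick release the steam. "
    "Serve with the sour cream, horseradish and mustard on the side."
)
restricted_items = ["milk", "bake", "spoon", "carrots", "mustard", "steam"]
print(find_restricted(recipe, restricted_items))
```

Building the set of words once per recipe keeps the cost at one pass over the text plus one lookup per restricted word, so 100 restricted items is not a concern.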

The do while loop structure

In the do while loop structure there is usually a part where you set a variable to a number (in this case i) and a second part where you increment it (i = i + 1). I've made this example in VBA, but the same structure appears in many programming languages, like the for loop in PHP when you're getting data from a database. What I would like to understand better is the relation between those two statements, that is, i = some number and i = i + 1. Wouldn't this cause a problem of interpretation, since you set the variable to something and then assign it a different value right after? Is the i in the second statement, i = i + 1, a new variable that refers to the previous one, or are both i's the same? I think explaining the scope of both variables would help my understanding. Thanks!
Sub DoWhile()
Dim x, i, sum
x = 10
i = 1
sum = 0
Do While i < x
sum = sum + i
i = i + 1
Loop
MsgBox "Sum = " & sum
End Sub
A variable is really just a location in memory. That location can have any value. By setting i=i+1, you're really saying "take the value at position i, add 1 to it, and store it at position i". No new variable is created. There's no problem with the computer interpreting this- what it cares about is the location of i, which isn't changing. It still knows where to find i, regardless of how many times you change the value there.
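As a rough illustration, here is the same loop transcribed to Python. Python variables are name bindings rather than fixed memory slots, so the analogy to a memory location is loose, but the order of operations in i = i + 1 is the same: the right-hand side is evaluated with the current value first, and only then is the result stored under the same name.

```python
x = 10
i = 1
total = 0
while i < x:
    total = total + i  # right-hand side uses the current value of i
    i = i + 1          # evaluate i + 1 first, then store it back under the name i
print("Sum =", total)
```

The loop body runs for i = 1 through 9, so the sum is 45, and i holds 10 when the condition finally fails.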
Since you have declared the variable i once, inside the Sub, any reference or modification to i in the sub will be on the same variable. That being said:
Dim i As Integer
i = 1
Do while i < 11
MsgBox("The value of i is: " & i)
i = i + 1
Loop
would display 10 messageboxes showing the value of i being between 1 and 10.
When the program encounters i = i + 1, the computer 'sees' this as take the value of i, add one to it, and store the result in the variable i.
Hope that helps.

Comparing dates for overlap - not avoiding

I'm working on a timetabling piece of code. I am using a system of university modules and events associated to those modules, ie
Module CSC3039
Event1 - Lecture
Event2 - Lecture
Event3 - Practical etc
I need to check the times of each event in the module against each other and compare for clashes. The clashes do not need to be rectified, just highlighted. The table I will use is Events containing Event_ID (PK), Module_code (FK), Start_Date_Time, End_Date_Time plus other fields that don't matter here. I have figured out that I need to implement a For Each statement, ultimately resulting in an if statement such as:
if (startTime1 <= endTime2 or endTime1 >= startTime2) CLASH
My problem is trying to figure out the actual for loop here. I don't know what to write to declare my start times and end times. I presume it is a case of taking event1 and getting its start and end and then checking if event 2, 3 or 4 fit the above if statement. I'm trying to get this but could really use some guidance.
EDIT... Based on suggestions below I have implemented the following code:
'return all relevant tables from the Modules database, based on the module code entered by the user.
Dim eventTime = (From mods In db.Modules
Join evnt In db.Events On mods.Module_code Equals evnt.Module_code
Join rm In db.Rooms On rm.Room_ID Equals evnt.Room_ID
Join build In db.Buildings On build.Building_code Equals rm.Building_code
Where ((mods.Module_code = initialModCode) And (evnt.Room_ID = rm.Room_ID))
Select evnt.Event_ID, evnt.Module_code, evnt.Event_type, evnt.Start_Date_Time, evnt.End_Date_Time, build.Building_code, rm.Room_Number)
'use the gridview to display the result returned by the above query
gdvEventsTable.DataSource = eventTime
gdvEventsTable.DataBind()
Dim listClashes As New List(Of Array)
For i As Integer = 0 To eventTime.Count - 1
For j As Integer = i + 1 To eventTime.Count - 1
If (eventTime.ToList(i).Start_Date_Time < eventTime.ToList(j).End_Date_Time) And (eventTime.ToList(i).End_Date_Time > eventTime.ToList(j).Start_Date_Time) Then
MsgBox("Clash", MsgBoxStyle.MsgBoxSetForeground, "")
listClashes.Add(eventTime)
Else
MsgBox("No Clash", MsgBoxStyle.MsgBoxSetForeground, "")
End If
Next
Next
When trying to add an event to my array list I have noticed, in debug, that no events are sent to the list.
If you want to compare all the pairs of events that are in an array or some kind of a collection, you can use a loop like:
Dim ModuleEventArray() As ModuleEvent
'...
For i As Integer = 0 To ModuleEventArray.Length - 1
For j As Integer = i + 1 To ModuleEventArray.Length - 1
'test if ModuleEventArray(i) overlaps with ModuleEventArray(j)
Next
Next
ModuleEvent here would be another class or structure that has fields startTime and endTime. The test
if (startTime1 <= endTime2 or endTime1 >= startTime2)
is not enough to test for overlap, but maybe you can figure out the correct test yourself :)
EDIT:
Since I see you use some sort of collection, not array, the code you need should be something like:
For i As Integer = 0 To eventTime.Count - 1
For j As Integer = i + 1 To eventTime.Count - 1
If (eventTime.Item(i).Start_Date_Time < eventTime.Item(j).End_Date_Time) And (eventTime.Item(i).End_Date_Time > eventTime.Item(j).Start_Date_Time) Then
MsgBox("Clash")
Else
MsgBox("No Clash")
End If
Next
Next
Before you write your code, you need to first decide what your algorithm is going to be. For example, if you use the naive method you presume, the code is indeed straightforward (basically 2 nested loops) but the complexity is O(n²).
Depending on how much data you have, whether it is in a database, how likely you expect clashes to be, whether you always have the full list of events at the start or you need to find clashes incrementally, etc., different solutions might be preferred. One consideration is whether you need to partition the list into non-clashing sets of events or just produce a yes/no answer (one for each event) stating whether there is a clash.
You might consider doing something different instead, like sorting the list by start time before you start comparing. That will allow you to walk the list only once.
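Here is a minimal sketch of the sort-first idea in Python (not VB.NET). The event names and times are made-up sample data, and the sketch assumes every event ends after it starts. Two intervals clash when each one starts before the other ends. After sorting by start time, the inner loop can stop at the first later event that starts at or after the current one ends.

```python
from datetime import datetime


def overlaps(start1, end1, start2, end2):
    """Two intervals clash when each one starts before the other ends."""
    return start1 < end2 and start2 < end1


def find_clashes(events):
    """events: list of (event_id, start, end). Returns the clashing ID pairs."""
    ordered = sorted(events, key=lambda e: e[1])  # sort by start time
    clashes = []
    for i, (id1, start1, end1) in enumerate(ordered):
        for id2, start2, end2 in ordered[i + 1:]:
            if start2 >= end1:
                break  # every later event starts even later, so none can clash
            clashes.append((id1, id2))
    return clashes


# Hypothetical sample data for one module
events = [
    ("Lecture1", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),
    ("Practical", datetime(2024, 1, 8, 9, 30), datetime(2024, 1, 8, 11, 0)),
    ("Lecture2", datetime(2024, 1, 8, 11, 0), datetime(2024, 1, 8, 12, 0)),
]
print(find_clashes(events))
```

Each pair is reported once, and back-to-back events (one ending exactly when the next begins) are not counted as a clash because the comparisons are strict.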
My comparisons are coming from the database. Prior to the code below I have a query which returns all the records from my Events table, based on the user input of a Module_Code. This code will show the clashes, through a msgbox. I will be changing it to populate a list. It's not the prettiest and will probably lead to a lot of duplication but it achieves my main objective.
For Each evnt In eventTime
Dim startTime1 = evnt.Start_Date_Time
Dim endTime1 = evnt.End_Date_Time
For Each evat In eventTime
Dim startTime2 = evat.Start_Date_Time
Dim endTime2 = evat.End_Date_Time
If (startTime1 < endTime2) And (endTime1 > startTime2) Then
MsgBox("Clash")
Else
MsgBox("No Clash")
End If
Next
Next

VBA Macro: Trying to code "if two cells are the same, then nothing, else shift rows down"

My Goal: To get all data about the same subject from multiple reports (already in the same spreadsheet) in the same row.
Rambling Backstory: Every month I get a new datadump Excel spreadsheet with several reports of variable lengths side-by-side (across columns). Most of these reports have overlapping subjects, but not entirely. Fortunately, when they are talking about the same subject, it is noted by a number. This number tag is always the first column at the beginning of each report. However, because of the variable lengths of reports, the same subjects are not in the same rows. The columns with the numbers never shift (report1's numbers are always column A, report2's are always column G, etc) and numbers are always in ascending order.
My Goal Solution: Since the columns with the ascending numbers do not change, I've been trying to write VBA code for a macro that compares (for example) the number in column A of the active data row with the number in column G. If the numbers are the same, do nothing; otherwise move all the data in that row (and under it) in columns G:J down a line. Then move on to the next data row.
I've tried: I've written several "For Each"s and a few loops using DataRow + 1 and calling what I thought would make the comparisons, but they've all failed miserably. I can't tell if I'm just getting the syntax wrong or if it's a faulty concept. Also, none of my searches have turned up this problem or even parts of it I can maraud and cobble together. Although that may be more of a reflection of my googling skill :)
Any and all help would be appreciated!
Note: In case it's important, the columns have headers. I've just been using DataRow = Found.Row + 1 to circumvent. Additionally, I'm very new at this and self-taught, so please feel free to explain in great detail
I think I understand your objective and this should work. It doesn't use any of the methodology you were using as reading your explanation I had a good idea how to proceed. If it isn't what you are looking for my apologies.
It starts at a predefined row (see the FIRST_ROW constant) and goes row by row comparing the two cells (MAIN_COLUMN & CHILD_COLUMN). If MAIN_COLUMN < CHILD_COLUMN it pushes everything between SHIFT_START & SHIFT_END down one row. It continues until it hits an empty row.
Sub AlignData()
    Const FIRST_ROW As Long = 2 ' So you can skip a header row, or multiple rows
    Const MAIN_COLUMN As Long = 1 ' this is your primary ID field
    Const CHILD_COLUMN As Long = 7 ' this is your alternate ID field (the one we want to push down)
    Const SHIFT_START As String = "G" ' the first column to push
    Const SHIFT_END As String = "O" ' the last column to push
    Dim row As Long
    row = FIRST_ROW
    Dim xs As Worksheet
    Set xs = ActiveSheet
    Dim im_done As Boolean
    im_done = False
    Do Until im_done
        If WorksheetFunction.CountA(xs.Rows(row)) = 0 Then
            im_done = True
        Else
            If xs.Cells(row, MAIN_COLUMN).Value < xs.Cells(row, CHILD_COLUMN).Value Then
                ' Cells is qualified with xs so the range always refers to the same sheet
                xs.Range(xs.Cells(row, SHIFT_START), xs.Cells(row, SHIFT_END)).Insert Shift:=xlDown
                Debug.Print "Pushed row: " & row & " down!"
            End If
            row = row + 1
        End If
    Loop
End Sub
I modified the code to work as a macro. You should be able to create it right from the macro dialog and run it from there also. Just paste the code right in and make sure the Sub and End Sub lines don't get duplicated. It no longer accepts a worksheet name but instead runs against the currently active worksheet.
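To see what the macro does to the ID columns, here is a small sketch of the same row-by-row logic in Python, with plain lists standing in for column A and column G. The name align_child and the sample numbers are illustrative. One difference: the sketch stops at the end of the lists rather than at an empty worksheet row.

```python
def align_child(main_ids, child_ids):
    """Insert blanks (None) into child_ids so equal IDs share a row with main_ids."""
    aligned = list(child_ids)
    row = 0
    while row < len(main_ids) and row < len(aligned):
        if main_ids[row] < aligned[row]:
            # Same effect as Insert Shift:=xlDown on the child columns
            aligned.insert(row, None)
        row += 1
    return aligned


print(align_child([1, 2, 3, 5], [2, 5]))
```

With main IDs 1, 2, 3, 5 and child IDs 2, 5, the child column becomes blank, 2, blank, 5. Like the macro, this only pushes the child column down, so it relies on both columns being in ascending order and on every child ID also appearing in the main column.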

SSIS - Script Component, Split single row to multiple rows (Parent Child Variation)

Thanks in advance for your help. I need help writing an SSIS script component that splits a single delimited row into multiple rows. There were many helpful blogs and posts; I looked at the ones below:
http://beyondrelational.com/ask/public/questions/1324/ssis-script-component-split-single-row-to-multiple-rows-parent-child-variation.aspx
http://bi-polar23.blogspot.com/2008/06/splitting-delimited-column-in-ssis.html
However, I need a little extra help on coding to complete the project. Basically here's what I want to do.
Input data
ID Item Name
1 Apple01,02,Banana01,02,03
2 Spoon1,2,Fork1,2,3,4
Output data
ParentID ChildID Item Name
1 1 Apple01
1 2 Apple02
1 3 Banana01
1 4 Banana02
1 5 Banana03
2 1 Spoon1
2 2 Spoon2
2 3 Fork1
2 4 Fork2
2 5 Fork3
2 6 Fork4
Below is my attempt at the code, but feel free to revise the whole thing if it's illogical. The SSIS output is set to asynchronous.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim posID As Integer, childID As Integer
Dim delimiter As String = ","
Dim txtHolder As String, suffixHolder As String
Dim itemName As String = Row.ItemName
Dim keyField As Integer = Row.ID
If Not (String.IsNullOrEmpty(itemList)) Then
Dim inputListArray() As String = _
itemList.Split(New String() {delimiter}, _
StringSplitOptions.RemoveEmptyEntries)
For Each item As String In inputListArray
Output0Buffer.AddRow()
Output0Buffer.ParentID = keyField
If item.Length >= 3 Then
txtHolder = Trim(item)
Output0Buffer.ItemName = txtHolder
'when item length is less than 3, it's suffix
Else
suffixHolder = Trim(item)
txtHolder = Left(txtHolder.ToString(), Len(txtHolder) _
- Len(suffixHolder)) & suffixHolder.ToString()
Output0Buffer.ItemName = txtHolder
End If
Next
End If
End Sub
The current code produces the following output
ID Item Name
1 Apple01
1 02
1 Banana01
1 02
1 03
2 Spoon1
2 2
2 Fork1
2 2
2 3
2 4
If I come across as pedantic in this response, it is not my intention. Based on the comment "I'm new at coding and having a problem troubleshooting" I wanted to walk through my observations and how I came to them.
Problem analysis
The desire is to split a single row into multiple output rows based on a delimited field associated to the row.
The code as it stands now is generating the appropriate number of rows so you do have the asynchronous part (split) of the script working so that's a plus. What needs to happen is we need to 1) Populate the Child ID column 2) Apply the item prefix to all subsequent row when generating the child items.
I treat most every problem like that. What am I trying to accomplish? What is working? What isn't working? What needs to be done to make it work. Decomposing problems into smaller and smaller problems will eventually result in something you can do.
Code observations
Pasting in the supplied code resulted in an error that itemList was not declared. Based on usage, it seems that it was intended to be itemName.
After fixing that, you should notice the IDE indicating you have 2 unused variables (posID, childID) and that the variable txtHolder is used before it's been assigned a value. A null reference exception could result at runtime. My coworker often remarks that warnings are errors that haven't grown up yet, so my advice to you as a fledgling developer is to pay attention to warnings unless you explicitly expect the compiler to warn you about said scenario.
Getting started
With a choice between solving the Child ID situation versus the name prefix/suffix stuff, I'd start with an easy one, the child id
Generating a surrogate key
That's the fancy title phrase that, if you searched on it, would give you plenty of hits to ssistalk or sqlis or any of a number of fabulously smart bloggers. The devil of course is knowing what to search on. Nowhere do you ever compute or assign the child id value to the stream, which of course is why it isn't showing up there.
We simply need to generate a monotonically increasing number which resets each time the source id changes. I am making an assumption that the inbound ID is unique in the incoming data like a sales invoice number would be unique and we are splitting out the items purchased. However if those IDs were repeated in the dataset, perhaps instead of representing invoice numbers they are salesperson id. Sales Person 1 could have another row in the batch selling vegetables. That's a more complex scenario and we can revisit if that better describes your source data.
There are two parts to generating our surrogate key (again, break problems down into smaller pieces). The first thing to do is make a thing that counts up from 1 to N. You have defined a childId variable to serve this. Initialize this variable (1) and then increment it inside your foreach loop.
Now that we are counting, we need to push that value onto the output stream. Putting those two steps together would look like
childID = 1
For Each item As String In inputListArray
    Output0Buffer.AddRow()
    Output0Buffer.ParentId = keyField
    Output0Buffer.ChildId = childID
    ' VB.NET shorthand for this is childID += 1
    childID = childID + 1
Next
Run the package and success! Scratch the generate surrogate key off the list.
String mashing
I don't know of a fancy term for what needs to be done in the other half of the problem but I needed some title for this section. Given the source data, this one might be harder to get right. You've supplied values of Apple01, Banana01, Spoon1, Fork1. It looks like there's a pattern there (name concatenated with a code) but what is it? Your code indicates that if it's less than 3 characters, it's a suffix, but how do you know what the base is? The first row uses a leading 0 and is two digits long while the second row does not use a leading zero. This is where you need to understand your data. What is the rule for identifying the "code" part of the first row? Some possible algorithms:
Force your upstream data providers to provide consistent length codes (I think this has worked once in my 13 years but it never hurts to push back against the source)
Assuming code is always digits, evaluate each character in reverse order testing whether it can be cast to an integer (Handles variable length codes)
Assume the second element in the split array will provide the length of the code. This is the approach you are taking with your code and it actually works.
I made no changes to make the generated item name work beyond fixing the local variables ItemName/itemList. Final code eliminates the warnings by removing PosID and initializing txtHolder to an empty string.
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim childID As Integer
    Dim delimiter As String = ","
    Dim txtHolder As String = String.Empty, suffixHolder As String
    Dim itemName As String = Row.ItemName
    Dim keyField As Integer = Row.ID
    If Not (String.IsNullOrEmpty(itemName)) Then
        Dim inputListArray() As String = _
            itemName.Split(New String() {delimiter}, _
            StringSplitOptions.RemoveEmptyEntries)
        ' The inputListArray (our split out field)
        ' needs to generate values from 1 to N
        childID = 1
        For Each item As String In inputListArray
            Output0Buffer.AddRow()
            Output0Buffer.ParentId = keyField
            Output0Buffer.ChildId = childID
            ' There might be VB shorthand for ++
            childID = childID + 1
            If item.Length >= 3 Then
                txtHolder = Trim(item)
                Output0Buffer.ItemName = txtHolder
            Else
                'when item length is less than 3, it's suffix
                suffixHolder = Trim(item)
                txtHolder = Left(txtHolder.ToString(), Len(txtHolder) _
                    - Len(suffixHolder)) & suffixHolder.ToString()
                Output0Buffer.ItemName = txtHolder
            End If
        Next
    End If
End Sub
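The prefix/suffix rule can be tried outside SSIS. Here is a minimal sketch in Python (not VB.NET) of the same logic, assuming as the script does that any piece shorter than 3 characters is a suffix that replaces the tail of the previous full item. The function name expand_items is made up for illustration.

```python
def expand_items(item_list, delimiter=","):
    """Return (child_id, item_name) pairs for one delimited input row."""
    parts = [p.strip() for p in item_list.split(delimiter) if p.strip()]
    rows = []
    current = ""
    for child_id, part in enumerate(parts, start=1):
        if len(part) >= 3:
            current = part  # a full item such as Apple01
        else:
            # a bare suffix: swap it in for the same number of trailing characters
            current = current[: len(current) - len(part)] + part
        rows.append((child_id, current))
    return rows


print(expand_items("Apple01,02,Banana01,02,03"))
print(expand_items("Spoon1,2,Fork1,2,3,4"))
```

Calling this once per input row restarts the child ID at 1 for every parent, which matches the desired output table in the question.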