Parse a comma delimited field into seperate fields (MS ACCESS VBA 2003) - vba

I inherited a database where user input fields are stored as a comma delimited string. I know. Lame. I want a way to parse these fields in a SELECT query where there are three segments of varying number of characters. Counter to all the recommendations that I insert fields into a new table or create a stored procedure to do this, this is what I came up with. I'm wondering if anyone sees any flaw in doing this as a select query (where I can easily convert from string to parsed and back again as need be).
Field_A
5,25,89
So to get the left segment, which is the most straightforward:
Field_1: Left$([Field_A],InStr([Field_A],",")-1)
To get the right-most segment:
Field_3: Right$([Field_A],Len([Field_A])-InStrRev([Field_A],","))
Middle segment was the trickiest:
Field_2: Mid([Field_A],InStr([Field_A],",")+1,InStrRev([Field_A],",")-InStr([Field_A],",")-1)
So the result is:
Field_1 Field_2 Field_3
5 25 89
Any consenting opinions?

Well, if you insist on going down this road......
This might be easier and more adaptable. Create a function in a module:
Public Function GetValueFromDelimString(sPackedValue As String, nPos As Long,
Optional sDelim As String = ",")
Dim sElements() As String
sElements() = Split(sPackedValue, sDelim)
If UBound(sElements) < nPos Then
GetValueFromDelimString = ""
Else
GetValueFromDelimString = sElements(nPos)
End If
End Function
Now in your query you can get any field in the string like this:
GetValueFromDelimString([MultiValueField],0) AS FirstElement, GetValueFromDelimString([MultiValueField],1) AS SecondElement, etc.
I feel like I am buying beer for a minor, encouraging this type of behavior :)

It sounds like you're not asking for information on how to parse a comma-delimited field into different fields, but rather looking for people to support you in your decision to do so, yes?
The fact, as you've already discovered, is that you can indeed do this with skillful application of functions in your SQL field definitions. But that doesn't mean that you should.
In the short run, it's an easy way to achieve your goals as data manager, I'll grant you that. But as a long-term solution it's just adding another layer of complexity to what seems like a poorly-designed database (I know that the latter is not your fault -- I too have inherited my share of "lame" databases).
So I applaud you on "getting the job done" for now, but would advise you to listen to "all the recommendations that you insert fields into a new table" -- they're good recommendations. It'll take more planning and effort, but in the long run you'll have a better database. And that will make everything you do with it easier, faster, and more reliable.

This is an old thread, but someone might search it. You can also do the same strategy as an update query. That way you can keep the original CSV and have 3 new destination fields that can be calculate and recalculated depending on your application purposes.

Related

Finding multiple exact substrings in an SQL query (Crystal Report)

I'm writing a Command for a Crystal Report that queries an SQL Database. The Command will use parameters/inputs that are generated from a different program. I've put parameters directly in Commands before, but this one has to be handled differently.
Said input will be a string that is numbers with an & in between such as this: "6&12&15", order is irrelevant in this case. For understanding purposes, we'll say that the numbers are product ID's and are unique. When a user wants to search for multiple products in this database, the string above will be how it looks.
I have used the following code in the past for non-number based strings and it works well because of how other fields are set up:
CASE WHEN '{?WearhouseState}' = '' THEN 1
WHEN CHARINDEX(Products.WearhouseState,'{?WearhouseState}',0)>0 THEN 1
ELSE 0
END = 1
That code will search for the field's value as a substring essentially anywhere in the given input parameter, which works for things like a state because "Texas" is never going to be a substring of any other state. However, this doesn't work so well with numbers. For example, if a product has an ID of 3, then the search will return that record if the parameter is '31', which I do not clearly want (it would also return product 1 as well).
For the mean time, I have been splitting the string up with a delimiter in Crystal Reports which works fine, but slows down the overall time to create the document. Most of the parameters I use I tend to put right in the query and it drastically improves the speed. The Crystal code is as follows:
{?ProductID}="" or {Command.ProductID} in split({?ProductID},"&")
This works exactly as intended but again, time is of the essence. Any additional information can be provided. It is technically InterSystems SQL so keep that in mind because I know the commands/clauses can vary between SQL.
I'd do the split string operation in SQL Server instead of CR. See e.g. T-SQL split string for a working code sample. Note that this logic does not need to run as a function, but you could also include it directly in your CR command.

Retrieve more than 255 characters from a query field

I have a query that displays more than 255 characters in a field, and I want to put that data into a variable that I can process. Unfortunately, MS Access truncates the field value return to 255 characters.
(At least when using this method:)
MyVar = Nz(rst.Fields("myfield").Value)
Most of the workarounds I've found online suggest to create a table, modify the desired field setting to Long Text, and then migrate the data from the query to the table, but I'm getting the same results. The Long Text field is still truncated during the DoCmd execution.
(At least when I do it this way:)
CurrentDB.Execute "Insert Into target_table Select myquery.* From myquery"
Other suggestions mention to change the field to group by "First", but the field reverts to "Expression" when run because the field definition includes a function that runs during the query to manipulate the results.
The query also isn't mine and is rather complicated, using other field expressions that call other functions across tables. I would like to avoid reverse engineering the entire thing just to update a table if at all possible. The data is already on my screen, looking at me - I just want to be able to use it.
It's a very Microsoft-ish solution for an MS product to display some data and then tell you it can't find the stuff it just gave you (I'm looking at you, file explorer), but I'm hoping someone here might have a viable suggestion. Perhaps some other query-to-table methods that don't truncate? Some other query setting, or field retrieval method?
Try this:
Dim MyVar As String
Dim Value As Variant
Value = rst.Fields("myfield").Value
MyVar = Nz(Value)
In any case, this works:
MyVar = Nz(DLookup("myfield", "myquery", "Id = " & someId & ""))
However, most likely your myquery is the limiting factor. It must be a straight select query to not truncate memo-fields.

In vb.net, I need to split a string multiple ways to create a list of commands

Lets say I have a string that contains stored procedure names, their parameters, and values, like this:
Dim sprocs As String = "Procedure: NameOfProcedure, Parameters: #Param1(int), #Param2(varchar) Values: Something, SomethingElse ; Procedure: NameOfProcedure, Parameters: #Param1(int), #Param2(varchar) Values: Something, Something Else"
My point of the above is I need to be able to take a user defined string that can store multiple procedures separated by... something. In this case I used a ; semicolon.
What I need to be able to do is a FOR EACH on these in vb.net.
I know how to used stored procedures and set parameters... I'm just not sure of the best way to go about this as far as how to split up that string to match parameters and values, etc.
So a bad example:
Dim ConnectionString As String = "String...removed"
Dim conn As New System.Data.SqlClient.SqlConnection(ConnectionString)
' So I need a for each here
Dim cmd As New System.Data.SqlClient.SqlCommand(ProcedureName)
With cmd
.Connection = conn
.CommandType = CommandType.StoredProcedure
' and a for each here on the parameters, datatypes, and their values
.Parameters.Add(ParamName, DataType).Value = TheValue
.Connection.Open()
.BeginExecuteReader()
.Connection.Close()
End With
So the splitting and parsing out the string is my biggest roadblock on this. I'm not sure how to separate it out at this level, match parameters, datatypes and their values all together. If there's a better way to have the end-user organize the string to make parsing and splitting easier, please share.
Any suggestions or examples are greatly appreciated.
Thank you very much.
There's a temptation here to use String.Split() or a regular expression. Neither solution is actually very good for this. To understand what I mean, take a quick moment to search Google for the many thousands of posts of programmers working with CSV running into edge cases where their regex didn't quite cut it. For this purpose, your semi-colon delimiter is not significantly different than a fancy comma.
Depending on the regex engine, it may actually be possible to build an expression that does this, especially if you can put certain constraints on the data. However, such expressions tend to be very tough to build and maintain, and they tend to require backtracking, which hurts performance.
The next alternative to examine is a dedicated parser. For CSV data, this is the panacea. There a number of good ones available for most any platform that you can just plug in, and they're typically pretty easy to work with. However, what you have, while similar, goes a bit beyond the level of what these parsers are written to handle.
One related solution is write your own csv-like parser. This may be your best option, and if you choose this route all I can do is recommend that you build a state machine to go character by character through the input string.
The next rung up the ladder is to use a generic parser/interpretter that relies on something like ANTLR. I'm haven't personally dug very deep into that well, but I know the tools are out there.
One related option, that I think is probably your best option here overall, is a Domain Specific Language (DSL). This is something that may be a little easier to get started with than a tool like ANTLR, especially because it's already built into Visual Studio. However, this option requires that you be able to control your input format.
If all this sounds like more work than it should be, you're right. The good news is that is possible to skip all this. The trick is that you must first impose artificial constraints on your data. For example, a constraint that parameter values are not allowed to contain ; characters means you can get back to a String.Split() solution to divide out each separate procedure call. Now your function is no longer generally applicable, but then maybe is doesn't need to be. When you create these constraints, make sure they are documented, so that when something violates them and code explodes you'll have clear definitions for why.

Script to compare two tables in database, from user input

I am very new to VBA and SQL and am trying to learn. I have a MS Access project that requires a VBA script that prompts the user to input two table names and numerous field names and create a SQL query utilizing those the names.
The specific SQL query I'm trying to use is below.
SELECT
A.user_index, A.input1, B.input1, A.input2, B.input2, A.input3, B.input3, B.input4,
A.input4, A.input5, B.input5
FROM
table1 AS A
LEFT JOIN
table2 AS B ON A.user_index = B.user_index
WHERE
(((A.input1) <> [B].[input1)) OR
(((A.input2) <> [B].[input2])) or
(((A.input4) <> [B].[input4]));
The overall purpose of this is to have a script that will be able to list fields for comparison that is applicable with any database. I know this is probably a relatively easy solution. However, I have no idea where to start.
My first instinct is to say "What have you tried so far?", but as you said, you don't know where to start.
It sounds like you need to first prompt the user for several field and table names, then build a query based on those values. I recommend first outlining exactly what you want your script to do. Maybe something like:
Declare variables to hold the values.
Prompt the user for each of the values and store them in the variables.
2a. After the user enters a value, make sure it is valid. If not, do something accordingly.
Declare a variable to hold your SQL query.
Construct the query.
Run the query.
This is obviously just an example. Break down each step into "baby steps" as much as possible.
It's a good idea to ask yourself how unique these baby steps are to your particular situation (hint: they almost certainly are not unique). If they aren't, then they have probably been solved tens of thousands of times already, so you have a very good chance of googling your questions.
If you still can't find an answer to how to do a particular step, feel free to ask here. Just remember to include your code even if it is broken :)

Is there some way to inject SQL even if the ' character is deleted?

If I remove all the ' characters from a SQL query, is there some other way to do a SQL injection attack on the database?
How can it be done? Can anyone give me examples?
Yes, there is. An excerpt from Wikipedia
"SELECT * FROM data WHERE id = " + a_variable + ";"
It is clear from this statement that the author intended a_variable to be a number correlating to the "id" field. However, if it is in fact a string then the end user may manipulate the statement as they choose, thereby bypassing the need for escape characters. For example, setting a_variable to
1;DROP TABLE users
will drop (delete) the "users" table from the database, since the SQL would be rendered as follows:
SELECT * FROM DATA WHERE id=1;DROP TABLE users;
SQL injection is not a simple attack to fight. I would do very careful research if I were you.
Yes, depending on the statement you are using. You are better off protecting yourself either by using Stored Procedures, or at least parameterised queries.
See Wikipedia for prevention samples.
I suggest you pass the variables as parameters, and not build your own SQL. Otherwise there will allways be a way to do a SQL injection, in manners that we currently are unaware off.
The code you create is then something like:
' Not Tested
var sql = "SELECT * FROM data WHERE id = #id";
var cmd = new SqlCommand(sql, myConnection);
cmd.Parameters.AddWithValue("#id", request.getParameter("id"));
If you have a name like mine with an ' in it. It is very annoying that all '-characters are removed or marked as invalid.
You also might want to look at this Stackoverflow question about SQL Injections.
Yes, it is definitely possible.
If you have a form where you expect an integer to make your next SELECT statement, then you can enter anything similar:
SELECT * FROM thingy WHERE attributeID=
5 (good answer, no problem)
5; DROP table users; (bad, bad, bad...)
The following website details further classical SQL injection technics: SQL Injection cheat sheet.
Using parametrized queries or stored procedures is not any better. These are just pre-made queries using the passed parameters, which can be source of injection just as well. It is also described on this page: Attacking Stored Procedures in SQL.
Now, if you supress the simple quote, you prevent only a given set of attack. But not all of them.
As always, do not trust data coming from the outside. Filter them at these 3 levels:
Interface level for obvious stuff (a drop down select list is better than a free text field)
Logical level for checks related to data nature (int, string, length), permissions (can this type of data be used by this user at this page)...
Database access level (escape simple quote...).
Have fun and don't forget to check Wikipedia for answers.
Parameterized inline SQL or parameterized stored procedures is the best way to protect yourself. As others have pointed out, simply stripping/escaping the single quote character is not enough.
You will notice that I specifically talk about "parameterized" stored procedures. Simply using a stored procedure is not enough either if you revert to concatenating the procedure's passed parameters together. In other words, wrapping the exact same vulnerable SQL statement in a stored procedure does not make it any safer. You need to use parameters in your stored procedure just like you would with inline SQL.
Also- even if you do just look for the apostrophe, you don't want to remove it. You want to escape it. You do that by replacing every apostrophe with two apostrophes.
But parameterized queries/stored procedures are so much better.
Since this a relatively older question, I wont bother writing up a complete and comprehensive answer, since most aspects of that answer have been mentioned here by one poster or another.
I do find it necessary, however, to bring up another issue that was not touched on by anyone here - SQL Smuggling. In certain situations, it is possible to "smuggle" the quote character ' into your query even if you tried to remove it. In fact, this may be possible even if you used proper commands, parameters, Stored Procedures, etc.
Check out the full research paper at http://www.comsecglobal.com/FrameWork/Upload/SQL_Smuggling.pdf (disclosure, I was the primary researcher on this) or just google "SQL Smuggling".
. . . uh about 50000000 other ways
maybe somthing like 5; drop table employees; --
resulting sql may be something like:
select * from somewhere where number = 5; drop table employees; -- and sadfsf
(-- starts a comment)
Yes, absolutely: depending on your SQL dialect and such, there are many ways to achieve injection that do not use the apostrophe.
The only reliable defense against SQL injection attacks is using the parameterized SQL statement support offered by your database interface.
Rather that trying to figure out which characters to filter out, I'd stick to parametrized queries instead, and remove the problem entirely.
It depends on how you put together the query, but in essence yes.
For example, in Java if you were to do this (deliberately egregious example):
String query = "SELECT name_ from Customer WHERE ID = " + request.getParameter("id");
then there's a good chance you are opening yourself up to an injection attack.
Java has some useful tools to protect against these, such as PreparedStatements (where you pass in a string like "SELECT name_ from Customer WHERE ID = ?" and the JDBC layer handles escapes while replacing the ? tokens for you), but some other languages are not so helpful for this.
Thing is apostrophe's maybe genuine input and you have to escape them by doubling them up when you are using inline SQL in your code. What you are looking for is a regex pattern like:
\;.*--\
A semi colon used to prematurely end the genuine statement, some injected SQL followed by a double hyphen to comment out the trailing SQL from the original genuine statement. The hyphens may be omitted in the attack.
Therefore the answer is: No, simply removing apostrophes does not gaurantee you safety from SQL Injection.
I can only repeat what others have said. Parametrized SQL is the way to go. Sure, it is a bit of a pain in the butt coding it - but once you have done it once, then it isn't difficult to cut and paste that code, and making the modifications you need. We have a lot of .Net applications that allow web site visitors specify a whole range of search criteria, and the code builds the SQL Select statement on the fly - but everything that could have been entered by a user goes into a parameter.
When you are expecting a numeric parameter, you should always be validating the input to make sure it's numeric. Beyond helping to protect against injection, the validation step will make the app more user friendly.
If you ever receive id = "hello" when you expected id = 1044, it's always better to return a useful error to the user instead of letting the database return an error.