Excel macro / script to cut/paste text between columns - vba
I work as a business developer, but problems in our support department sometimes require me to work on clients' CSV files, i.e. Excel spreadsheets containing product catalogs.
Right now I have a CSV export from an online store, where the first column is formatted in the following way:
Name_of_the_product - {store category}
e.g. Sony NEX-6 - {digital camera}
I need a macro/script to cut the words between the braces {} and paste them into the column immediately to the right, without the braces. That way I will have a clean product name, i.e. Sony NEX-6, as well as the store category for each product, e.g. digital camera (which I need to push the product feed via a third-party solution).
Any help would be greatly appreciated!
I have managed to get it mostly working using the following function:
=MID(A2;FIND("{";A2)+1;FIND("}";A2)-FIND("{";A2)-1)
The only thing left to do is to delete what was copied from column A (including the braces). Does anyone have an idea for the macro? I hope I can get it working using the search/replace functionality of a feed management solution, but I am checking with our support about that.
Thank you all!
I assume your data is in column A and has 100 rows.

Sub SplitProductCategory()
    Dim WrdArray() As String
    Dim text_string As String
    Dim LeftSide As String
    Dim RightSide As String
    Dim i As Long

    For i = 1 To 100
        text_string = Range("A" & i).Value
        WrdArray() = Split(text_string, "{")
        LeftSide = WrdArray(LBound(WrdArray))
        RightSide = WrdArray(UBound(WrdArray))
        'Drop the trailing " - " separator from the product name
        LeftSide = Trim(LeftSide)
        If Right$(LeftSide, 1) = "-" Then LeftSide = Trim(Left$(LeftSide, Len(LeftSide) - 1))
        Range("A" & i).Value = LeftSide
        'RightSide still carries the closing brace; strip it
        Range("B" & i).Value = Replace(RightSide, "}", "")
    Next i
End Sub
And if you want to understand the Split function, you can refer to this blog post:
http://www.excelvbasolutions.com/2015/06/split-function.html
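For readers more comfortable outside VBA, the same split-and-clean logic can be sketched in Python (the sample value is taken from the question; this is only an illustration of the approach, not part of the original answer):

```python
def split_category(cell):
    """Split 'Name - {category}' into (name, category), mirroring the VBA Split approach."""
    # Split on the opening brace, as the VBA code does
    left, _, right = cell.partition("{")
    name = left.rstrip(" -")           # drop the trailing " - " separator
    category = right.replace("}", "")  # strip the closing brace
    return name, category

print(split_category("Sony NEX-6 - {digital camera}"))  # -> ('Sony NEX-6', 'digital camera')
```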
Related
Microsoft Access 2016 OLE Object & Binary export
I'm trying to write some VBA code to generate a text file containing SQL INSERT statements for all records in a table in an Access database (accdb). The table has an OLE Object field and a Binary field. I can't seem to get them written to the text file properly; I mostly get question marks (?). I've searched for solutions and found some possible ideas, but none worked. If anyone has suggestions, I will be very appreciative of any help that you can provide. Miguel
I actually found a solution after some more searching:

Function ByteArrayToHex(B() As Byte) As String
    Dim n As Long, I As Long
    ByteArrayToHex = Space$(2 * (UBound(B) - LBound(B)) + 2)
    n = 1
    For I = LBound(B) To UBound(B)
        Mid$(ByteArrayToHex, n, 2) = Right$("00" & Hex$(B(I)), 2)
        n = n + 2
    Next
    ByteArrayToHex = "0x" & ByteArrayToHex
End Function

Michael
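For comparison, the same byte-array-to-hex conversion can be sketched in Python (illustrative only; the VBA function above is the actual answer):

```python
def byte_array_to_hex(data: bytes) -> str:
    """Render a byte array as a 0x-prefixed uppercase hex string,
    matching what the VBA ByteArrayToHex function produces."""
    return "0x" + "".join(f"{b:02X}" for b in data)

print(byte_array_to_hex(b"\x01\xab"))  # -> 0x01AB
```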
To export OLE objects in a Microsoft Access file, open the file with the Access application and create a corresponding form with all relevant fields. Use the VBA code provided in the link, and you should be able to export some of the most common file types in an automated fashion. Good luck. https://medium.com/#haggenso/export-ole-fields-in-microsoft-access-c67d535c958d
Applying a formula to Get From Web URL with VBA Macro
I'm trying to find a way to apply a formula to a Get Data From Web URL. The URL will look like:
http://www.thisurl.com/formula/restOfTheUrl
I'm new to this, so I tried recording the macro using the Advanced Get From Web function, breaking out the elements of the URL from there and substituting the formula in. The relevant section of the code looked like this:
Source = Xml.Tables(Web.Contents("http://www.thisurl.com/" & text(0.123456,2) & "/restOfTheUrl"))
Had it worked, the number 2 would have been substituted in place of text(0.123456,2). Is there another way to do this?
Try:
Source = Xml.Tables(Web.Contents("http://www.thisurl.com/" & Text.Middle("0.123456", 3, 1) & "/restOfTheUrl"))
'possibly (your sample contradicts your code):
Source = Xml.Tables(Web.Contents("http://www.thisurl.com/" & Text.Middle("0.123456", 3, 1) & "restOfTheUrl"))
(Power Query M has no mid function; Text.Middle takes a zero-based start position, so position 3 of "0.123456" returns the "2".)
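The string manipulation itself is just a substring taken from the number's text form and spliced into the URL. A Python sketch of that idea (the URL and index are the made-up values from the question):

```python
def build_url(value: float, digit_index: int) -> str:
    """Pick one character out of a number's string form and splice it into a URL."""
    digit = str(value)[digit_index]  # str(0.123456) -> "0.123456"; index 3 is "2"
    return "http://www.thisurl.com/" + digit + "/restOfTheUrl"

print(build_url(0.123456, 3))  # -> http://www.thisurl.com/2/restOfTheUrl
```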
Dynamic Label Break line
I have created a form using VB.NET which extracts data from a database, some of which is displayed to the user via labels which are dynamically created. However, I need to be able to make the label break a line after, say, n letters. I haven't been able to find a way to do this; any help will be appreciated. Thanks.
If you just want to add a line break after n characters, you can use something like:
Dim myStr As String = "Hello123"
Label1.Text = myStr.Substring(0, 5) & vbNewLine & myStr.Substring(5)
Output:
Hello
123
Hope this helps.
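Generalized to a break after every n characters, the idea looks like this (a Python sketch of the logic, not VB.NET):

```python
def break_every(text: str, n: int) -> str:
    """Insert a line break after every n characters."""
    return "\n".join(text[i:i + n] for i in range(0, len(text), n))

print(break_every("Hello123", 5))  # -> Hello\n123
```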
Validate a csv file
This is my sample file:
#%cty_id1,#%ccy_id2,#%cty_src,#%cty_cd3,#%cty_nm4,#%cty_reg5,#%cty_natnl6,#%cty_bus7,#%cty_data8
690,ALL2,,AL,ALBALODMNIA,,,,
90,ALL2,,,AQ,AKNTARLDKCTICA,,,
161,IDR2,,AZ,AZLKFMERBALFKIJAN,,,,
252,LTL2,,BJ,BENLFMIN,,,,
206,CVE2,,BL,SAILFKNT BAFSDRTHLEMY,,,,
360,,,BW2,BOPSLFTSWLSOANA,,,,
The problem is that #%cty_cd3 is a standard column (NOT NULL) with a length of exactly 2 letters, but in SQL Server the record shifts into the next column (due to an extra comma in between). How do I validate the CSV file to make sure the 2-character code always sits in the 4th column? There are around 10,000 records.
Set of rules defined:
Each row should have a standard set of delimiters; if not, check the NOT NULL columns for null values. If a null is found, remove the extra delimiter at that position (the 3 commas ,,, are not to be replaced with 2 commas ,,).
UPDATE: Can this be done using a script? I only need a function that operates on records like
90,ALL2,,,AQ,AKNTARLDKCTICA,,,
corrects them using a regex or any other method, and puts them back into the source file.
Your best bet here may be to use the tSchemaComplianceCheck component in Talend. If you read the file in with a tFileInputDelimited component and then check it with tSchemaComplianceCheck, where you set cty_cd to not nullable, it will reject your Antarctica row simply for the null where you expect no nulls. From there you can use a tMap and simply map the fields to the one above. You should be able to tweak this as necessary, potentially with further tSchemaComplianceChecks down the reject lines and mapping to suit. This method is much more self-explanatory, and you don't have to deal with complicated regexes that need careful management when you want to accommodate different variations of your file structure, with the benefit that you will always capture all of the well-formatted rows.
You could try to delete the empty field in column 4, if column 4 is not a two-character field, as follows:
awk 'BEGIN {FS=OFS=","}
{
  for (i=1; i<=NF; i++) {
    if (!(i==4 && length($4)!=4))
      printf "%s%s", $i, (i<NF) ? OFS : ORS
  }
}' file.csv
Output:
"id","cty_ccy_id","cty_src","cty_nm","cty_region","cty_natnl","cty_bus_load","cty_data_load"
6,"ALL",,"AL","ALBANIA",,,,
9,"ALL",,"AQ","ANTARCTICA",,,
16,"IDR",,"AZ","AZERBAIJAN",,,,
25,"LTL",,"BJ","BENIN",,,,
26,"CVE",,"BL","SAINT BARTHÉLEMY",,,,
36,,,"BW","BOTSWANA",,,,
41,"BNS",,"CF","CENTRAL AFRICAN REPUBLIC",,,,
47,"CVE",,"CL","CHILE",,,,
50,"IDR",,"CO","COLOMBIA",,,,
61,"BNS",,"DK","DENMARK",,,,
Note: we test length($4)!=4 since we assume two characters in column 4, plus two extra characters for the double quotes.
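The same fix, dropping the empty fourth field when the two-letter code has shifted into the fifth, can be sketched in Python (the sample row and the 2-letter rule come from the question; this is an illustration, not a replacement for the awk or Talend solutions):

```python
import re

def fix_row(line: str) -> str:
    """Drop the 4th field when it is empty and the 5th field holds the 2-letter code."""
    fields = line.split(",")
    if len(fields) > 4 and fields[3] == "" and re.fullmatch(r"[A-Z]{2}", fields[4]):
        del fields[3]  # remove the spurious empty field
    return ",".join(fields)

print(fix_row("90,ALL2,,,AQ,AKNTARLDKCTICA,,,"))  # -> 90,ALL2,,AQ,AKNTARLDKCTICA,,,
```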
The solution is to use a look-ahead regex, as suggested before. To reproduce your issue I used this:
"\\,\\,\\,(?=\\\"[A-Z]{2}\\\")"
which matches three commas followed by two quoted uppercase letters, without including them in the match. Of course you may need to adjust it a bit for your needs (e.g. an arbitrary number of commas rather than exactly three). But you cannot use it in Talend directly without tons of errors. Here's how to design your job: you need to read the file line by line, with no fields yet. Then, inside the tMap, do the match & replace, like:
row1.line.replaceAll("\\,\\,\\,(?=\\\"[A-Z]{2}\\\")", ",,")
and finally tokenize the line using "," as the separator to get your final schema. You will probably need to manually trim out the quotes here and there, since tExtractDelimitedFields won't. Here's an output example (it needs some cleaning, of course). You don't need to enter the schema for tExtractDelimitedFields by hand. Use the wizard to record a delimited-file schema into the metadata repository, as you probably already did. You can use this schema as a Generic Schema too, fitting it to the outgoing connection of tExtractDelimitedFields. Not something the purists would approve of, but it works and saves time.
About your UI problems: they are often related to file encodings and locale settings. Don't worry too much; they (usually) won't affect the job execution.
EDIT: here's a sample TOS job which shows the solution; just import it into your project: TOS job archive
EDIT2: added some screenshots
Coming to the party late with a VBA-based approach. An alternative to regexes is to parse the file and remove a comma when the 4th field is empty. This can be achieved with the Microsoft Scripting Runtime: the code opens the file, then reads each line, copying it to a new temporary file; if the 4th element is empty, it writes the line with the extra comma removed. The cleaned data is then copied back to the original file and the temporary file is deleted. It seems a bit of a long way round, but when I tested it on a file of 14,000 rows based on your sample it took under 2 seconds to complete.

Sub Remove4thFieldIfEmpty()
    Const iNUMBER_OF_FIELDS As Integer = 9
    Dim str As String
    Dim fileHandleInput As Scripting.TextStream
    Dim fileHandleCleaned As Scripting.TextStream
    Dim fsoObject As Scripting.FileSystemObject
    Dim sPath As String
    Dim sFilenameCleaned As String
    Dim sFilenameInput As String
    Dim vFields As Variant
    Dim iCounter As Integer
    Dim sNewString As String

    sFilenameInput = "Regex.CSV"
    sFilenameCleaned = "Cleaned.CSV"
    Set fsoObject = New FileSystemObject
    sPath = ThisWorkbook.Path & "\"
    Set fileHandleInput = fsoObject.OpenTextFile(sPath & sFilenameInput)
    If fsoObject.FileExists(sPath & sFilenameCleaned) Then
        Set fileHandleCleaned = fsoObject.OpenTextFile(sPath & sFilenameCleaned, ForWriting)
    Else
        Set fileHandleCleaned = fsoObject.CreateTextFile((sPath & sFilenameCleaned), True)
    End If

    'Copy each line to the temporary file, dropping the empty 4th field
    Do While Not fileHandleInput.AtEndOfStream
        str = fileHandleInput.ReadLine
        vFields = Split(str, ",")
        If vFields(3) = "" Then
            sNewString = vFields(0)
            For iCounter = 1 To UBound(vFields)
                If iCounter <> 3 Then sNewString = sNewString & "," & vFields(iCounter)
            Next iCounter
            str = sNewString
        End If
        fileHandleCleaned.WriteLine (str)
    Loop
    fileHandleInput.Close
    fileHandleCleaned.Close

    'Copy the cleaned data back over the original file
    Set fileHandleInput = fsoObject.OpenTextFile(sPath & sFilenameInput, ForWriting)
    Set fileHandleCleaned = fsoObject.OpenTextFile(sPath & sFilenameCleaned)
    Do While Not fileHandleCleaned.AtEndOfStream
        fileHandleInput.WriteLine (fileHandleCleaned.ReadLine)
    Loop
    fileHandleInput.Close
    fileHandleCleaned.Close
    Set fileHandleCleaned = Nothing
    Set fileHandleInput = Nothing
    Kill sPath & sFilenameCleaned   'delete the temporary file
    Set fsoObject = Nothing
End Sub
If that's the only problem (and if you never have a comma in the field bt_cty_ccy_id), then you could remove such an extra comma by loading your file into an editor that supports regexes and have it replace ^([^,]*,[^,]*,[^,]*,),(?="[A-Z]{2}") with \1.
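As a quick illustration of that capture-group-plus-lookahead replacement, here is the same pattern applied with Python's re module (the sample row is adapted from the question's quoted data):

```python
import re

# Keep the first three fields (captured group), drop the extra comma when a
# quoted two-letter code follows (the lookahead is not consumed by the match).
pattern = r'^([^,]*,[^,]*,[^,]*,),(?="[A-Z]{2}")'
row = '9,"ALL",,,"AQ","ANTARCTICA",,,'
print(re.sub(pattern, r"\1", row))  # -> 9,"ALL",,"AQ","ANTARCTICA",,,
```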
I would question the source system that is sending you this file: why the extra comma in between for some rows? I guess you are using a comma as the delimiter for importing this .csv file into Talend. (Another suggestion would be to ask for a semicolon as the column separator in the input file.) Then
9,"ALL",,,"AQ","ANTARCTICA",,,,
would become
9;"ALL";,;"AQ";"ANTARCTICA";;;;
How can I manually define column types in OLEDB/JET DataTable?
I am importing data from an Excel spreadsheet to a VB.NET DataTable. This Excel spreadsheet has a lot of garbage data in the first 18 rows, including a lot of empty cells. I ultimately remove these rows in post-processing, but I need to access the Excel file as is, without modifying it by hand at all. I realize that setting IMEX=1 instructs the Jet engine to assume all columns are text. However, I have an issue with setting it to another value (explained more below), so the default Jet engine column type scan wouldn't work particularly well. I'd like to either:
Manually define column types before the import
Force Excel to scan many more rows (I believe the default is 8) to determine the column type
However, I do have an issue with idea #2. I do not have administrative rights to open regedit.exe, so I can't modify the registry using that method. I did circumvent this before by importing a key somehow, but I can't remember how I did it. So #1 would be an ideal solution, unless someone can help me carry out idea #2. Is this possible? Currently, I'm using the following method:
If _
    SetDBConnect( _
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & filepath & _
        ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""", True) Then

    dtSchema = _dh.GetOleDbSchemaTable()
    If _dh.Errors <> "" Then
        Throw New Exception("::LoadFileToBuffer.GetOleDbSchemaTable::" & _dh.Errors())
    End If

    For Each sheetRow In dtSchema.Rows
        If sheetRow("TABLE_NAME").ToString() = "TOTAL_DOLLARS$" Then
            totalDollars = sheetRow("TABLE_NAME").ToString()
        ElseIf sheetRow("TABLE_NAME").ToString() = "TOTAL_UNITS$" Then
            totalUnits = sheetRow("TABLE_NAME").ToString()
        End If
    Next

    'Get total dollars table
    sql.Append("SELECT * FROM [" & totalDollars & "]")
    dtDollars = _dh.GetTable(sql.ToString())
End If
Thank you!
You should be able to say:
sql.Append("SELECT * FROM [" & totalDollars & "$A18:X95]")
where totalDollars is a sheet name and X95 marks the last valid cell. You will not be able to include headers unless they are available at row 18.