So let's say I have this list of strings in an Excel file:
33000
33100
33010
33110
45050
45150
45250
45350
45360
45370
55360
55370
I also have a SQL table that contains these strings, among others, and I want to write a SELECT statement that returns only the rows matching this list.
I could write a brute-force statement like SELECT * FROM Table WHERE field = '33100' OR field = '33010' .... However, I could make the WHERE clause shorter by using LIKE patterns.
I'm trying to keep the number of LIKE patterns as small as possible, so I need to generate the fewest SQL patterns that identify the whole list. For the list above, the fewest patterns would be:
33[01][01]0
45[0123]50
[45]53[67]0
How could I generate a list of patterns like this dynamically where the input is the list of strings?
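As a quick sanity check on the three patterns above (not a way of generating them), here is a minimal Python sketch that expands each bracket pattern and confirms the union is exactly the original list. The helper name expand_pattern is made up for illustration, and it only understands literal characters and simple [..] sets, not ranges like [0-3] or the % and _ wildcards.

```python
import itertools
import re

def expand_pattern(pattern):
    """Expand a pattern made only of literal characters and [..] sets."""
    # Each token is either a bracketed set such as [01] or one literal character.
    tokens = re.findall(r"\[([^\]]+)\]|(.)", pattern)
    choices = [charset if charset else literal for charset, literal in tokens]
    return {"".join(combo) for combo in itertools.product(*choices)}

wanted = {
    "33000", "33100", "33010", "33110",
    "45050", "45150", "45250", "45350",
    "45360", "45370", "55360", "55370",
}
patterns = ["33[01][01]0", "45[0123]50", "[45]53[67]0"]

covered = set()
for p in patterns:
    covered |= expand_pattern(p)

print(covered == wanted)  # True: the patterns match the list and nothing else
```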
An alternative approach might be more "elegant", but it will not be faster. Your strings start with different characters, so the first part of a LIKE pattern would be a wildcard or character range -- effectively precluding the use of an index.
A simple IN expression, on the other hand, can use an index:
where col in ('33100', '33010', '33110', '45050', ...)
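If the list comes from a program rather than being typed by hand, the IN list can be built with one placeholder per value. The following is only a sketch: it uses Python's built-in sqlite3 as a stand-in for the real database, and the table name codes, the column name field and the function select_matching are all assumptions for the example.

```python
import sqlite3

def select_matching(conn, table, column, values):
    """Return the values of `column` that appear in `values`, using an IN list."""
    # One "?" per value; the values themselves are passed as parameters.
    placeholders = ", ".join("?" for _ in values)
    # table/column are interpolated directly, so they must come from trusted code.
    sql = f"SELECT {column} FROM {table} WHERE {column} IN ({placeholders})"
    return [row[0] for row in conn.execute(sql, list(values))]

# Hypothetical in-memory table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (field TEXT)")
conn.executemany(
    "INSERT INTO codes VALUES (?)",
    [("33000",), ("33100",), ("99999",), ("45050",)],
)

found = select_matching(conn, "codes", "field", ["33000", "33100", "45050", "55370"])
print(sorted(found))
```

Databases limit how many parameters one statement may carry, so a very long list may need to be split into batches or loaded into a temporary table instead.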
Okay, let's say your data is in Excel, starting at cell A2.
In cell C1, enter: create table ##TEMP(STRS varchar(20))
In cell C2, enter: ="insert into ##TEMP"&" values"&" ('"&A2&"' )"&","
In cell C3, enter: =" ('"&A3&"')"&","
Now copy the formula in cell C3 and paste it into the range C4:C13.
Column C now holds one line of the SQL script per row.
Copy the range C1:C13, open SQL Server Management Studio, and paste it into a query window. Delete the last comma (the line from cell C13 ends with a comma, which must be removed for the script to run), then run it. You now have the ##TEMP table.
INNER JOIN it with your table, like this:
SELECT * FROM MYTABLE M INNER JOIN ##TEMP AS T ON T.STRS = M.COLUMN_NAME_STR
You should get the data you need. Hope it helps.
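The same script can also be produced outside Excel, which avoids the manual step of deleting the trailing comma. This is a rough sketch only: build_temp_table_script is a made-up helper, and the default table and column names simply mirror the ones used above.

```python
def build_temp_table_script(values, table="##TEMP", column="STRS"):
    """Build the create-table and multi-row insert script as one string."""
    # Double any single quote so each value stays a valid SQL string literal.
    rows = [" ('" + str(v).replace("'", "''") + "')" for v in values]
    return (
        f"create table {table}({column} varchar(20))\n"
        f"insert into {table} values\n"
        # Joining with commas means there is no trailing comma to remove.
        + ",\n".join(rows)
    )

script = build_temp_table_script(["33000", "33100", "33010"])
print(script)
```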
I have been working on this for months. I just cannot get natural (true alphanumeric) sort results. I am shocked, because I have been able to get them in RPG with EBCDIC since 1992.
I am looking for any solution in SQL, VBS, or plain Excel or Access. Here is the data I have:
299-8,
3410L-87,
3410L-88,
420-A20,
420-A21,
420A-40,
4357-3,
AN3H10A,
K117GM-8,
K129-1,
K129-15,
K271B-200L,
K271B-38L,
K271D-200EL,
KD1051,
KD1062,
KD1092,
KD1108,
KD1108,
M8000-3,
MS24665-1,
SK271B-200L,
SAYA4008
The order I am looking for is the true alpha-numeric order as below:
AN3H10A,
KD1051,
KD1062,
KD1092,
KD1108,
KD1108,
K117GM-8,
K129-1,
K129-15,
MS24665-1,
M8000-3,
SAYA4008,
SK271B-200L
The inventory is 7800 records so I have had some problems with processing power as well.
Any help would be appreciated.
Jeff
In native Excel, you can add helper columns that return the ASCII code of each character, adding a large number (e.g. 1000) to the code when the character is a digit.
Then select the whole table, including the first column, and sort on the helper columns in order. The first column is carried along but is not itself a sort key.
The formula:
=IFERROR(CODE(MID($A1,COLUMNS($A:A),1))+AND(CODE(MID($A1,COLUMNS($A:A),1))>=48,CODE(MID($A1,COLUMNS($A:A),1))<=57)*1000,"")
The Sort dialog:
The results:
You can implement a similar algorithm using VBA, and probably SQL also. I dunno about VBS or Access.
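For what it's worth, here is a small Python sketch of the same idea: one number per character, with digits pushed up by 1000 so that letters sort ahead of them. The function name letters_first_key and the sample list are only for illustration. Python compares the key lists element by element and puts a shorter prefix first (K129-1 before K129-15); how the spreadsheet orders the blank helper cells is not something this sketch models.

```python
def letters_first_key(text):
    """One number per character; digits are shifted up by 1000, as in the helper columns."""
    return [ord(ch) + (1000 if "0" <= ch <= "9" else 0) for ch in text]

# A few part numbers taken from the question's data.
parts = [
    "K129-15", "KD1051", "299-8", "K117GM-8",
    "M8000-3", "MS24665-1", "AN3H10A", "K129-1",
]

ordered = sorted(parts, key=letters_first_key)
print(ordered)
# ['AN3H10A', 'KD1051', 'K117GM-8', 'K129-1', 'K129-15', 'MS24665-1', 'M8000-3', '299-8']
```

Building the key once per record is linear in the string length, so a few thousand records should not be a problem.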
You could try using Format to left-pad the string in the ORDER BY:
select column
from my_table
order by Format(column, "0000000000")
Add a sorting column:
, iif (left(fieldname, 1) between '0' and '9', 1, 0) sortField
etc
order by sortField, FieldName
Let's say your data is in column A. If you put this formula in column B: =IFERROR(IF(LEFT(A1,1)+1>0,"ZZZZZZZ "&A1,A1),A1), it will add "ZZZZZZZ " in front of every value that starts with a digit, so those values naturally appear after all the alphabetical values when you sort A-Z. Afterwards you can find and replace the ZZZZZZZ prefix.
There are a number of approaches, but likely the least work is to build two columns that split the value on the delimiter (- in this case).
You then pad the results (with spaces or 0s), right-justified, and then sort on the two columns.
So in the query builder we have this:
SELECT Field1,
Format(
Mid(field1,1,IIf(InStr(field1,"-")=0,50,InStr(field1,"-")-1)),
">##########") AS Expr1,
Format(
Mid(field1,IIf(InStr(field1,"-")=0,99,InStr(field1,"-")+1)),
">##########") AS Expr2
FROM Data
When we run the above raw query we get this:
So now in the query builder, simply sort on the first derived column, and then sort on the 2nd derived column.
Eg this:
Run the query, and we get this result:
Edit:
Looking at your desired results, it looks like the above sort is wrong. We have to pad on the right with 0's.
So this is the 2nd try:
SELECT Field1,
Left(Mid(field1,1,IIf(InStr(field1,"-")=0,30,InStr(field1,"-")-1))
& String(30,"0"),30) AS Expr1,
Left(Mid(field1,IIf(InStr(field1,"-")=0,99,InStr(field1,"-")+1))
& String(30,"0"),30) AS Expr2
FROM Data
The results are thus this:
Given your small table size, the above query should perform quite well.
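To make the second query easier to follow, here is a rough Python equivalent of its two derived columns: split on the first "-", then pad each half on the right with zeros to 30 characters. The name padded_sort_key is made up, and this only mirrors what the query computes; it is not a claim about which final order it produces.

```python
def padded_sort_key(part, width=30, delimiter="-"):
    """Return (Expr1, Expr2): text before and after the first delimiter, zero-padded on the right."""
    # partition leaves tail empty when there is no delimiter,
    # like the Mid(...) call that starts past the end of the string.
    head, _, tail = part.partition(delimiter)

    def pad(text):
        return (text + "0" * width)[:width]

    return pad(head), pad(tail)

print(padded_sort_key("K129-15"))
print(padded_sort_key("KD1051"))
```

Sorting with sorted(values, key=padded_sort_key) then corresponds to sorting on Expr1 followed by Expr2.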
I am working with data from the hospital, and when I add the .csv extension to my text files they output in the following way:
It would be much easier to manage if the numbers in the first column appeared only once, transposed as column headers, with the first ten values of the second column transposed underneath them, then the next ten, and so on. The final product would look like this:
I have tried transposing them manually, but since there are millions of files, the CSVs are quite extensive. I have looked for a way to do it in Excel, but I have found nothing.
Could someone help me with a macro for this?
An Excel formula could be used, if the numbers repeat exactly.
If the data is in Columns A & B, the following formula could be placed in C2:
=INDEX($B:$B,(ROW(C1)-1)*10+COLUMN(A$1))
And then copied to the right and down as far as needed.
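The index arithmetic in that formula is just "row offset times 10, plus column offset", i.e. cutting column B into consecutive rows of ten. A minimal Python sketch of the same reshaping, with reshape_rows as a made-up name and plain integers standing in for the real values:

```python
def reshape_rows(values, width=10):
    """Cut a flat list into consecutive rows of `width` items (the last row may be shorter)."""
    return [values[i:i + width] for i in range(0, len(values), width)]

column_b = list(range(1, 26))  # stand-in for the values in column B
table = reshape_rows(column_b)
for row in table:
    print(row)
```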
You didn't mention whether the sequence of row numbers (1,90,100,120...) is always the same for each "row". From your sample, I will assume that the numbers repeat the same way, ad infinitum.
First, import the CSV into Microsoft Access. Let's assume your first column is called RowID, and your second is called Description. RowID is an integer, and Description is a memo field.
Add a third column, an Integer, and call it "Ord" (no quotes).
In Access's VBA editor add a new module with this GroupIncrement function:
Function GroupIncrement(ByVal sGroup)
Static Num, LastGrp
If (StrComp(LastGrp, Nz(sGroup, "")) <> 0) Then
Num = 0
End If
Num = Num + 1
GroupIncrement = Num
LastGrp = sGroup
End Function
Create a new query, replacing MyTable with the name of your Access table containing the CSV data:
UPDATE (SELECT * FROM [MyTable]
ORDER BY [RowID]) SET [Ord]=GroupIncrement([RowID])
Create a second query:
TRANSFORM First([Description])
SELECT [Ord]
FROM [MyTable]
GROUP BY [Ord]
PIVOT [RowID]
This should put the data into the format you want (with an extra column on the left, Ord).
In Access, highlight that query and choose External Data, and in the Export section, choose Excel. Export the query to Excel.
Open the file in Excel and delete the Ord column.
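If Access is not an option, the same numbering-and-pivot idea can be sketched in Python. This is only an illustration under the same assumption as above (the RowID sequence repeats the same way each time); pivot_by_occurrence is a made-up name. Unlike GroupIncrement, it keeps one counter per RowID, so the input does not have to be sorted first.

```python
from collections import defaultdict

def pivot_by_occurrence(records):
    """Turn (row_id, description) pairs into headers plus one output row per repetition."""
    seen = defaultdict(int)    # how many times each row_id has appeared so far
    table = defaultdict(dict)  # occurrence number -> {row_id: description}
    for row_id, description in records:
        seen[row_id] += 1      # plays the role of the Ord column
        table[seen[row_id]][row_id] = description
    headers = sorted(seen)
    rows = [[table[o].get(h) for h in headers] for o in sorted(table)]
    return headers, rows

records = [(1, "a"), (90, "b"), (100, "c"), (1, "d"), (90, "e"), (100, "f")]
headers, rows = pivot_by_occurrence(records)
print(headers)
print(rows)
```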
I have a google spreadsheet sheet with several columns:
A: date
B: string
C: number
...
G: string (could be empty)
H: string (could be empty)
I would like to have a small table with the following:
Get the sum of C, where the rows are the values of G (a substring of it: the values are formatted as CATEGORY:ITEM, and I need to group by CATEGORY only) and the columns are months.
So far I've only got partial solutions with QUERY (for example, group by month(toDate(A)), etc.). I seem to be unable to use substring, charindex, or left to strip the item part after the category, or to show the categories as the first columns in the resulting rows...
Edit: just to clear things up a bit: I need to alter the value in G for each row so that I can group by the altered value. I know this is possible with dates (in my example, group by month(toDate(A)) gives me access to each value in column A, so the result is grouped correctly for each separate month). But it seems string manipulation is not allowed?
How do I do that, for starters?
Thanks
Based on your desired output, can you try this on sheet 2:
=ArrayFormula(query({day(Sheet1!A2:A10)&text(month(Sheet1!A2:A10), " (mmm)"), Sheet1!B2:F10, regexextract(Sheet1!G2:G10, "(.+):")}, "select Col7, sum(Col3) group by Col7 pivot Col1"))
and see if this gets you somewhere?
Or in case you prefer open ended ranges:
=ArrayFormula(query({day(Sheet1!A2:A)&text(month(Sheet1!A2:A), " (mmm)"), Sheet1!B2:F, iferror(regexextract(Sheet1!G2:G, "(.+):"))}, "select Col7, sum(Col3) where Col7 <>'' group by Col7 pivot Col1"))
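To spell out the grouping those formulas are aiming for, here is a small Python sketch: take the part of G before the last colon as the category (the same thing the greedy "(.+):" capture keeps), skip rows without one, and sum C per category and month. The function name and the sample rows are invented for the example, and it groups by month only, as the question asks, rather than by the day-and-month label used in the formulas.

```python
from collections import defaultdict
from datetime import date

def sum_by_category_and_month(rows):
    """rows: (date, amount, label) tuples; returns {category: {month: total}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for when, amount, label in rows:
        if ":" not in label:
            continue  # empty or unlabelled rows are skipped
        category = label.rsplit(":", 1)[0]
        if not category:
            continue  # nothing before the colon, so no category to group by
        totals[category][when.month] += amount
    return {cat: dict(months) for cat, months in totals.items()}

rows = [
    (date(2020, 1, 5), 10, "FOOD:apple"),
    (date(2020, 1, 9), 5, "FOOD:pear"),
    (date(2020, 2, 1), 7, "CAR:fuel"),
    (date(2020, 2, 3), 1, ""),
]
result = sum_by_category_and_month(rows)
print(result)
```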
I've been searching all over for a similar solution :-)
Just in case you would like an alternative solution...
This works a treat on my sheet H, with data in columns A to F:
=transpose(split(textjoin("|",1,transpose({H!A1:F100})),"|"))
See... https://webapps.stackexchange.com/questions/90629/concatenate-several-columns-into-one-in-google-sheets
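As I read that formula, it joins the non-empty cells column by column (all of A, then all of B, and so on) and splits them back into a single column. A rough Python sketch of that reading, with flatten_columns as a made-up name and rows given as lists of cell values:

```python
def flatten_columns(rows):
    """Flatten a grid into one list, column by column, skipping empty cells."""
    width = max((len(r) for r in rows), default=0)
    return [
        r[c]
        for c in range(width)        # walk the columns left to right
        for r in rows                # and each row top to bottom within a column
        if c < len(r) and r[c] not in ("", None)
    ]

grid = [["a", "b"], ["c", ""], ["e", "f"]]
flat = flatten_columns(grid)
print(flat)
```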
I have a dataset, dataset1:
I am trying to do:
select col1+col2, col3 from mytable
This does not work; however, this is no problem:
select col1, col2, col3 from mytable
How can I concatenate fields inside my dataset?
Please note that col1, col2, and col3 are all VARCHARs. I've tried the & operator as well.
Please see the image below. I get the DEFINE QUERY PARAMETERS dialog when it doesn't like my query:
Double quotes are not appropriate string delimiters in MS SQL. Try:
SELECT ppl.FirstName + ' ' + ppl.LastName AS ReferredBy
Putting your actual query in the question may help you get a faster response next time, since the concatenation is actually a red herring here.