execute variable values Google BigQuery - google-bigquery

I am trying to execute the value of the variable, but I can't find documentation about it in Google BigQuery.
DECLARE SQL STRING;
SELECT
SQL =
CONCAT(
"CREATE TABLE IF NOT EXISTS first.rdds_",
REPLACE(CAST(T.actime AS STRING), " 00:00:00+00", ""),
" PARTITION BY actime ",
" CLUSTER BY id ",
" OPTIONS( ",
" partition_expiration_days=365 ",
" ) ",
" AS ",
"SELECT * ",
"FROM first.rdds AS rd ",
"WHERE rd.actime = ",
"'", CAST(T.actime AS STRING), "'",
" AND ",
"EXISTS ( ",
"SELECT 1 ",
"FROM first.rdds_load AS rd_load ",
"WHERE rd_load.id= rd.id ",
")"
) AS SQ
FROM (
SELECT DISTINCT actime
FROM first.rdds AS rd
WHERE EXISTS (
SELECT 1
FROM first.rdds_load AS rd_load
WHERE rd_load.id= rd.id
)
) T;
My variable will have many rows with scripted for create tables and I need to execute this variable.
In SQL Server for to execute variable is:
EXEC(#variable);
How to I execute SQL variable in Google BigQuery?
EDIT:
I did new test with version beta:
Using array, all rows in one result (ARRAY_AGG):
DECLARE SQL ARRAY<STRING>;
SET SQL = (
SELECT
CONCAT(
"CREATE TABLE IF NOT EXISTS first.rdds_",
REPLACE(CAST(T.actime AS STRING), " 00:00:00+00", ""),
" PARTITION BY actime ",
" CLUSTER BY id ",
" OPTIONS( ",
" partition_expiration_days=365 ",
" ) ",
" AS ",
"SELECT * ",
"FROM first.rdds AS rd ",
"WHERE rd.actime = ",
"'", CAST(T.actime AS STRING), "'",
" AND ",
"EXISTS ( ",
"SELECT 1 ",
"FROM first.rdds_load AS rd_load ",
"WHERE rd_load.id= rd.id ",
")"
)
) AS SQ
FROM (
SELECT DISTINCT actime
FROM first.rdds AS rd
WHERE EXISTS (
SELECT 1
FROM first.rdds_load AS rd_load
WHERE rd_load.id= rd.id
)
) T
);
My result:
One row with all instructions. But I can't running this with all instructions

Update: as of 5/20/2020, BigQuery released dynamic SQL feature for you to achieve the goal.
Dynamic SQL is now available as a beta release in all BigQuery regions. Dynamic SQL lets you generate and execute SQL statements dynamically at runtime. For more information, see EXECUTE IMMEDIATE.
x
================
BigQuery does not support this (Dynamic SQL) in pure SQL, but you can implement this in any client of your choice

While Mikhail is correct that this historically hasn't been supported in BigQuery, the very new beta release of BigQuery Scripting should let you accomplish similar results:
https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
In this case, you would need to use SET to assign the value of the variable, and there isn't an EXEC statement at this time, but there is support for conditionals, loops, variables, etc.
To recreate your example, you could store the results of a query against either your first.rdds_load table, then use WHILE to loop over those results. Within that loop, you can run a normal CREATE TABLE if it doesn't already exist. I'm thinking something along these lines based on your example . . .
DECLARE results ARRAY<STRING>;
DECLARE i INT64 DEFAULT 1;
DECLARE cnt INT64 DEFAULT 0;
SET results = ARRAY(
SELECT
DISTINCT AS VALUE
CAST(actime AS STRING)
FROM
first.rdds AS rd
WHERE
EXISTS (
SELECT
1
FROM
first.rdds_load AS rd_load
WHERE
rd_load.id = rd.id
)
);
SET cnt = ARRAY_LENGTH(results);
WHILE i <= cnt DO
/* Body of CREATE TABLE goes here; you can access the rows from the query above using results[ORDINAL(i)] as you loop through*/
END WHILE;
There's also support for stored procedures, which can be executed via CALL with passed arguments, which may work in your case as well (if you need to abstract the creation logic used by many scripts).
(I would argue that this scripting support is superior to building and executing strings, since you'll still get SQL validation and such for your query.)
As always with beta features, use with caution in production—but for what it's worth, thus far my experience has been incredibly stable.

Related

How to Ignore or Replace a Row if some columns are the same?

I have a table like this ID|MONTH|YEAR|AMOUNT|PAID|ISPAID|INDEX_ID where INDEX_ID is an auto filled integer acting as PK and AI and PAID|ISPAID are auto filled with 0.0
The Problem I have is when I fill a transaction with the same ID|MONTH|YEAR|AMOUNT I would like to either replace or ignore it. Preferably ignore it. But I would still like to know both.
Is there a way to do this in sqlite3?
this is my current SQL String:
SQLStr := "INSERT OR REPLACE INTO ClientRecord(id, month, year, amount)
VALUES(" ir.ID "," ir.MONTH "," ir.YEAR "," ir.AMOUNT ");"
You can prevent the duplicate insertion through use of NOT EXISTS within your INSERT query string :
SQLStr := "INSERT OR REPLACE INTO ClientRecord( id, month, year, amount)
SELECT " ir.ID "," ir.MONTH "," ir.YEAR "," ir.AMOUNT "
WHERE NOT EXISTS ( SELECT 1
FROM ClientRecord
WHERE id = " ir.ID "
AND month = " ir.MONTH "
AND year = " ir.YEAR "
AND amount = " ir.AMOUNT ");"
Demo
As a side note : Prefer rewriting the above query by using dynamic query logic by replacing those variables by question marks rather than variable concatenetion in order to make the query more secure against injection

R - Dataframe values inside mysql query in R

In R - Is it possible to make a mysql query with a IN, where the values in that IN are from a DataFrame column?
EXAMPLE of what I'm trying to do:
Directory = read.csv("worlddirectory.csv", header = TRUE, sep = ",",stringsAsFactors=FALSE)
Active_Customers = Directory[(Directory$Status == "Active"),]
PhoneNumbers = dbGetQuery(DBConnection,
"
Select
db.phonenumbers,
db.names
from
database db
where
db.country IN
(
Active_Customers$Country
);"
As we can see here, the expected statement looks like:
WHERE column_name IN (value1, value2, ...);
We can use paste with the argument collapse=", " to obtain the desired format. I think this should work:
PhoneNumbers = dbGetQuery(DBConnection,
paste0("SELECT db.phonenumbers, db.names ",
"FROM database db ",
"WHERE db.country IN (",
paste(Active_Customers$Country,collapse=", "),");"))
Example:
Active_Customers <- data.frame(Country=c("NL","BE","US"))
paste0("SELECT db.phonenumbers, db.names ",
"FROM database db ",
"WHERE db.country IN (",
paste(Active_Customers$Country,collapse=", "),");")
Output:
[1] "SELECT db.phonenumbers, db.names FROM database db WHERE db.country IN (NL, BE, US);"
Hope this helps!

HSQLDB - updatable statements don't work with "SELECT TOP" or "ORDER BY"

I'm using HSQLDB and preparedStatements just fine, but if I include either "SELECT TOP" or "ORDER BY" in my SQL statement, when I call updateBoolean (or UpdateInt, etc), I hit an exception:
java.sql.SQLException: attempt to assign to non-updatable column
This sample code works fine:
preparedStatement = connection.prepareUpdatable(
"SELECT " + MyTable.COL_ID + ", " +
MyTable.COL_READ +
" FROM " + MyTable.NAME +
" WHERE " + MyTable.COL_LOCAL +
" =? AND " + MyTable.COL_REMOTE +
" =?",
ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_UPDATABLE);
preparedStatement.setString(1, localAddress);
preparedStatement.setString(2, remoteAddress);
ResultSet rs = connection.query(preparedStatement);
if (rs.next())
{
rs.updateBoolean(MyTable.COL_READ, isRead);
rs.updateRow();
}
I get the exception if I change "SELECT" to "SELECT TOP". Or if I append this to the SQL statement:
" ORDER BY " + MyTable.COL_RECEIVED_TIMESTAMP + " DESC"
Thanks for any help.
NickB
An updatable SELECT statement cannot have TOP n, LIMIT or ORDER BY. This restriction is imposed by the SQL standard. Your SELECT becomes not-updatable when you add one of those keywords.
It is possible to use a subquery in a WITH clause with the above keywords and the SELECT is updatable.
CREATE TABLE t (a int, b int, PRIMARY KEY(a));
WITH SUBQ(COL) AS (SELECT TOP 1 a FROM t)
SELECT * FROM t WHERE a IN (SELECT * FROM SUBQ)

Using IN with sets of tuples in SQL (SQLite3)

I have the following table in a SQLite3 database:
CREATE TABLE overlap_results (
neighbors_of_annotation varchar(20),
other_annotation varchar(20),
set1_size INTEGER,
set2_size INTEGER,
jaccard REAL,
p_value REAL,
bh_corrected_p_value REAL,
PRIMARY KEY (neighbors_of_annotation, other_annotation)
);
I would like to perform the following query:
SELECT * FROM overlap_results WHERE
(neighbors_of_annotation, other_annotation)
IN (('16070', '8150'), ('16070', '44697'));
That is, I have a couple of tuples of annotation IDs, and I'd like to fetch
records for each of those tuples. The sqlite3 prompt gives me the following
error:
SQL error: near ",": syntax error
How do I properly express this as a SQL statement?
EDIT I realize I did not explain well what I am really after. Let me try another crack at this.
If a person gives me an arbitrary list of terms in neighbors_of_annotation that they're interested in, I can write a SQL statement like the following:
SELECT * FROM overlap_results WHERE
neighbors_of_annotation
IN (TERM_1, TERM_2, ..., TERM_N);
But now suppose that person wants to give me pairs of terms if the form (TERM_1,1, TERM_1,2), (TERM_2,1, TERM_2,2), ..., (TERM_N,1, TERM_N,2), where TERM_i,1 is in neighbors_of_annotation and TERM_i,2 is in other_annotation. Does the SQL language provide an equally elegant way to formulate the query for pairs (tuples) of interest?
The simplest solution seems to be to create a new table, just for these pairs,
and then join that table with the table to be queried, and select only the
rows where the first terms and the second terms match. Creating tons of AND /
OR statements looks scary and error prone.
I've never seen SQL like that. If it exists, I would suspect it's a non-standard extension. Try:
SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND other_annotation = '8150'
UNION ALL SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND other_annotation = '44697';
In other words, build the dynamic query from your tuples but as a series of unions instead, or as a series of ANDs within ORs:
SELECT * FROM overlap_results
WHERE (neighbors_of_annotation = '16070' AND other_annotation = '8150')
OR (neighbors_of_annotation = '16070' AND other_annotation = '44697');
So, instead of code (pseudo-code, tested only in my head so debugging is your responsibility) such as:
query = "SELECT * FROM overlap_results"
query += " WHERE (neighbors_of_annotation, other_annotation) IN ("
sep = ""
for element in list:
query += sep + "('" + element.noa + "','" + element.oa + "')"
sep = ","
query += ");"
you would instead have something like:
query = "SELECT * FROM overlap_results "
sep = "WHERE "
for element in list:
query += sep + "(neighbors_of_annotation = '" + element.noa + "'"
query += " AND other_annotation = '" + element.oa + "')"
sep = "OR "
query += ";"
I'm not aware of any SQL dialects that support tuples inside IN clauses. I think you're stuck with:
SELECT * FROM overlap_results WHERE (neighbors_of_annotation = '16070' and other_annotation = '8150') or (neighbors_of_annotation = '16070' and other_annotation = '44697')
Of course, this particular query can be simplified to something like:
SELECT * FROM overlap_results WHERE neighbors_of_annotation = '16070' and (other_annotation = '8150' or other_annotation = '44697')
Generally SQL WHERE-clause predicates only allow filtering on a single-column.

SQL for parsing multi-line data?

I have the unfortunate task of having to import data from excel into a database on a regular basis. The table looks something like this:
IssueID References
1234 DocID1<cr>DocID2<cr>DocID3
1235 DocID1
1236 DocID2
1237 DocID2<cr>DocID3
References is a multi-line text field. What I'm trying to do is create a Docs table with one-to-many relationship to the Issue table, rather than having these multi-line references.
I have the following tables defined:
Issue: IssueKey, IssueID, IssueFields
Doc: DocKey, DocID, DocRev, DocOwner, etc
DocLink: LinkKey, DocKey, IssueKey
Since this will be run repeatedly, the Doc table will already exist with the DocIDs defined. So, what I want to do is have a query or VBA code search for each DocID in the References column and add a link based on IssueID if one does not already exist.
Simple, Right?
Jeff
Clarifications:
1) I had a third column called "Val1" to show that there were other columns, but that seemed to confuse the issue. There are actually many (way to many, most ignored) columns in the source table, but I only care about the two above.
2) I don't have to parse for a delimiter or anything too paranoid: References contains one or more uniquely defined document reference numbers (stored as text). So, a LIKE filter will turn up the list of IssueIDs on a case by case basis.
3) Here is an example of acceptable output:
IssueID References
1234 DocID1
1234 DocID2
1234 DocID3
1235 DocID1
1236 DocID2
1237 DocID2
1237 DocID3
The ideal solution would take the original excel table (top) and these two tables:
IssueKey IssueID
1 1234
2 1235
3 1236
4 1237
DocKey DocID
1 DocID1
2 DocID2
3 DocID3
And populate/update the link table:
LinkKey IssueKey DocKey
1 1 1
2 1 2
3 1 3
4 2 1
5 3 2
6 3 3
4) Here is an example of what I expected for a solution (creates #3 above). Unfortunately it crashes Access, so I can't tell if the syntax is correct (edited to reflect field names above).
SELECT Q1.IssueID, D1.DocID
FROM Docs AS D1, Issues AS Q1
WHERE Q1.IssueID IN
((SELECT Q2.IssueID from Issues AS Q2 where (Q2.References) Like D1.DocID));
5) Giving up on Access for the moment, I've got the following working in MySQL:
SELECT Q1.IssueID, D1.DocID
FROM Docs AS D1, Issues AS Q1
WHERE Q1.IssueID IN
((SELECT Q2.IssueID from Issues AS Q2 where (Q2.References) Like '%DocID1%'));
This works as I'd expect - I get every IssueID with a Reference to DocID1, repeated for every Doc in the table. With the above data it would look like:
IssueID References
1234 DocID1
1234 DocID2
1234 DocID3
1235 DocID1
1235 DocID2
1235 DocID3
Now I just want to replace the '%DocID1%' with '%'+D1.DocID+'%' - limiting the results to those document IDs which actually have a match. For some reason I'm getting zero records when I do this - I think I have the syntax for putting wildcards on the correlated field wrong.
6) The following works to provide #3 above in MySQL, but the same query translated to access crashes it:
SELECT Q1.IssueID, D1.DocID
FROM Docs AS D1, Issues AS Q1
WHERE Q1.IssueID IN
((SELECT Q2.IssueID from Issues AS Q2 where (Q2.References) Like
CONCAT('%',D1.DocID,'%')));
[in access it becomes ('' & D1.DocID & '')]
Conclusion: Access sucks
This has been chosen as the answer:
Q2.References LIKE ("*" & D1.DocID & "*"));
However, I don't think this is safe.
Consider if one of the value for the column named 'References' contained this data:
DocID1<cr>DocID999<cr>DocID3
and a value DocID = 9 existed in the other table.
The problem here is that
"DocID1<cr>DocID999<cr>DocID3" LIKE "*" & "DocID9" & "*"
will evaluate to TRUE, which is probably undesirable.
To address this problem, I think the values in the search/join condition should be made safe by surrounding the values using the delimiter character e.g.
(CHR(13) & Q2.References & CHR(13)) LIKE ("*" & CHR(13) & D1.DocID & CHR(13) & "*"));
Since this is to run repeatedly, I would ask (strongly suggest) they provide me a proper file where the issueID and valid appear on every line. This is much easier to process. You need to know for sure what the values for these fields are to properly import to your system.
Based on the comments: IN SQL Server you can would build a function to split the data based on the charindex for commas. If you search Google for fn_split, you will find a sample of this. Not sure how you would do this in Access but it would probably be an interative process where you look for the last comma and move everything past it to a holding table and then get rid of the command, then do again until there are no more commas. It iseasiest to do imports like this to staging tables where you can manipulate the data the way you need it and then put the final result into your real tables.
My first choice would be to put together a quick application in C# or VB.Net to handle this.
If that wasn't viable, I'd have an "Import" table which took everything as is. Then I would use a cursor to iterate the records in the table. Inside the cursor I'd keep track of the IssueId and Val1 and parse the References column to create my child records. This part I'd package into a stored procedure.
I suggest that you research SQL Server Integration Services (SSIS). This tool was created to do this kind of data import/export as quickly as possible with as little code.
Read about it. Do some hands on labs to see if any examples are close to what your trying to do.
http://en.wikipedia.org/wiki/SQL_Server_Integration_Services
http://www.microsoft.com/downloads/details.aspx?familyid=b1145e7a-a4e3-4d14-b1e7-d1d823b6a447&displaylang=en
Do you mean (typed, not tested):
Dim rs As DAO.Recordset
Dim rsIn As DAO.Recordset ''Or ADO if you link directly to Excel
Set rs=CurrentDB.OpenRecordset( _
"SELECT * FROM DocLinks dl INNER JOIN Docs d ON dl.DocKey=d.DocKey")
Do While Not rsIn.EOF
astrDocs=Split(rsIn!References, vbCrLf)
For Each strDoc In astrDocs
rs.FindFirst "DocID='" & strDoc & "'"
If rs.NoMatch Then
strSQL="INSERT INTO DocLinks (DocID, IssueID) " _
& "VALUES ('" strDoc & "'," & rsIn!IssueID & ")"
CurrentDB.Execute strSQL, dbFailOnError
End If
Next
rsIn.MoveNext
Loop
EDIT re COMMENTS
If the DocIDs are of a fixed length, you could consider something on these lines:
SELECT Sequence.Seq
, ImportTable.IssueID
, Mid(Replace([References],"<cr>",""),[seq],6) AS Docs
FROM Sequence, ImportTable
WHERE ([seq]+5) Mod 6=0)
AND Mid(Replace([References],"<cr>",""),[seq],6))<>""
AND Mid(Replace([References],"<cr>",""),[seq],6))
Not In (SELECT DocID FROM Docs)
You will need a sequence table with integers from 1 to at least max length of Reference.
This can easily be done in SQL. I have written a TVF (table-valued function) specifically for line-splitting text that demonstrates how:
ALTER function [dbo].[fnSplit3](
#parameter varchar(Max) -- the string to split
, #Seperator Varchar(64) -- the string to use as a seperator
)
RETURNS #Items TABLE(
ID INT -- the element number
, item VARCHAR(8000) -- the split-out string element
, OffSet int -- the original offest
--( not entirley accurate if LEN(#Seperator) > 1 because of the Replace() )
)
AS
BEGIN
/*
"Monster" Split in SQL Server 2005
From Jeff Moden, 2008/05/22
BYoung, 2008/06/18: Modified to be a Table-Valued Function
And to handle CL/LF or LF-only line breaks
Test: (scripts all procs & views in master)
Select Lines.Item
From Master.sys.syscomments C
CROSS APPLY dbo.fnSplit3(C.text, char(13)+char(10)) Lines
Order by C.ID, Lines.ID
Test2: (scripts all triggers in your database)
Select Lines.Item
From sys.sql_modules M
Join sys.objects O on O.object_id = M.object_id
CROSS APPLY dbo.fnSplit3(M.definition, char(13)+char(10)) Lines
Where O.Type = 'TR'
Order by O.create_date, Lines.ID
*/
Declare #Sep char(1)
Set #Sep = char(10) --our seperator character (convenient, doesnt affect performance)
--NOTE: we make the #Sep character LF so that we will automatically
-- parse out rogue LF-only line breaks.
--===== Add start and end seprators to the Parameter so we can handle
-- all the elements the same way
-- Also change the seperator expressions to our seperator
-- character to keep all offsets = 1
SET #Parameter = #Sep+ Replace(#Parameter,#Seperator,#Sep) +#Sep
-- This reduces run-time about 10%
;WITH cteTally AS
(--==== Create a Tally CTE from 1 to whatever the length
-- of the parameter is
SELECT TOP (LEN(#Parameter))
ROW_NUMBER() OVER (ORDER BY t1.object_id) AS N
FROM Master.sys.system_Columns t1
CROSS JOIN Master.sys.system_Columns t2
)
INSERT into #Items
SELECT ROW_NUMBER() OVER (ORDER BY N) AS Number,
SUBSTRING(#Parameter, N+1, CHARINDEX(#Sep, #Parameter, N+1)-N-1) AS Value
, N+1
FROM cteTally
WHERE N < LEN(#Parameter)
AND SUBSTRING(#Parameter, N, 1) = #Sep --Notice how we find the seperator
Return
END
In order to use this with your current table & data do this:
SELECT Issues.IssueID, Lines.Item as Reference
From Issues
Cross Apply dbo.fnSplit3(Issues.Reference, char(13)) Lines
Order By IssueID, Reference
I'm having problems coming up with a set-based SQL solution here. I've done this kind of thing before, I had to refresh my memory somewhat, but I'm running into a problem. I think it's an issue (feature/bug?) with the engine but I could be doing something daft. Perhaps someone intimate with Jet/ACE and who can read VBA can take a look at the code at the end of this answer and hopefully take this forward...?
The basic approach is to use a Sequence table of integers with the MID() expression to parse the data column (which I've renamed to MyReferences because REFERENCES is a SQL keyword).
Here's some MS Access VBA to recreate the test tables/data using SQL DDL/DML. Notice the first SELECT query returns sub-strings and star- and end delimiters; obviously, we're looking for the row(s) where both delimiters are the delimiting character, CHR(13) in this case. The second SELECT query merely adds search conditions for the desired delimiters but errors with 'Invalid procedure call'; this happens when the MID() expression is called using invalid parameter values e.g.
SELECT MID('A', 0, 0)
I guess what is happening is the optimizer is not using the subquery as a 'shortcut' and instead is evaluating the MID() expression before the search conditions from Sequence table. If so it's a bit dumb and I can't think of a way of forcing the order of evaluation.
So, is is my or the engine at fault here?
Sub main()
Dim sql As String
sql = _
"DROP TABLE ImportTable;"
On Error Resume Next ' Table may not exist
CurrentProject.Connection.Execute sql
On Error GoTo 0
sql = _
"DROP TABLE Sequence;"
On Error Resume Next ' Table may not exist
CurrentProject.Connection.Execute sql
On Error GoTo 0
sql = _
"CREATE TABLE ImportTable ( " & _
"IssueID INTEGER NOT NULL UNIQUE, MyReferences VARCHAR(90) NOT NULL);"
CurrentProject.Connection.Execute sql
sql = _
"INSERT INTO ImportTable VALUES (1234, 'DocID1' & Chr(13) & 'DocID22' & Chr(13) & 'DocID3');"
CurrentProject.Connection.Execute sql
sql = _
"CREATE TABLE Sequence (seq INTEGER NOT NULL UNIQUE);"
CurrentProject.Connection.Execute sql
sql = _
"INSERT INTO Sequence VALUES (-1);"
CurrentProject.Connection.Execute sql
sql = _
"INSERT INTO [Sequence] (seq) SELECT Units.nbr + Tens.nbr" & _
" FROM ( SELECT" & _
" nbr FROM ( SELECT 0 AS nbr FROM [Sequence] UNION" & _
" ALL SELECT 1 FROM [Sequence] UNION ALL SELECT 2 FROM" & _
" [Sequence] UNION ALL SELECT 3 FROM [Sequence] UNION" & _
" ALL SELECT 4 FROM [Sequence] UNION ALL SELECT 5 FROM" & _
" [Sequence] UNION ALL SELECT 6 FROM [Sequence] UNION" & _
" ALL SELECT 7 FROM [Sequence] UNION ALL SELECT 8 FROM" & _
" [Sequence] UNION ALL SELECT 9 FROM [Sequence] ) AS" & _
" Digits ) AS Units, ( SELECT nbr * 10 AS nbr FROM" & _
" ( SELECT 0 AS nbr FROM [Sequence] UNION ALL SELECT" & _
" 1 FROM [Sequence] UNION ALL SELECT 2 FROM [Sequence]" & _
" UNION ALL SELECT 3 FROM [Sequence] UNION ALL SELECT" & _
" 4 FROM [Sequence] UNION ALL SELECT 5 FROM [Sequence]" & _
" UNION ALL SELECT 6 FROM [Sequence] UNION ALL SELECT" & _
" 7 FROM [Sequence] UNION ALL SELECT 8 FROM [Sequence]" & _
" UNION ALL SELECT 9 FROM [Sequence] ) AS Digits )" & _
" AS Tens;"
CurrentProject.Connection.Execute sql
sql = _
"SELECT DT1.IssueID, DT1.parsed_text, DT1.delimiter_1, DT1.delimiter_2 " & _
"FROM ( " & _
"SELECT I1.IssueID, MID(I1.MyReferences, S1.seq, S2.seq - S1.seq - LEN(CHR(13))) AS parsed_text, " & _
" MID(CHR(13) & I1.MyReferences & CHR(13), S1.seq, LEN(CHR(13))) AS delimiter_1, " & _
" MID(CHR(13) & I1.MyReferences & CHR(13), S2.seq, LEN(CHR(13))) AS delimiter_2 " & _
"FROM ImportTable AS I1, Sequence AS S1, Sequence AS S2 " & _
"WHERE S1.seq < S2.seq " & _
"AND S2.seq - S1.seq - LEN(CHR(13)) > 0 " & _
"AND S1.seq BETWEEN 1 AND LEN(CHR(13)) + LEN(I1.MyReferences) + LEN(CHR(13)) " & _
"AND S2.seq BETWEEN 1 AND LEN(CHR(13)) + LEN(I1.MyReferences) + LEN(CHR(13)) " & _
") AS DT1;"
Dim rs As ADODB.Recordset
Set rs = CurrentProject.Connection.Execute(sql)
MsgBox rs.GetString
sql = _
"SELECT DT1.IssueID, DT1.parsed_text, DT1.delimiter_1, DT1.delimiter_2 " & _
"FROM ( " & _
"SELECT I1.IssueID, MID(I1.MyReferences, S1.seq, S2.seq - S1.seq - LEN(CHR(13))) AS parsed_text, " & _
" MID(CHR(13) & I1.MyReferences & CHR(13), S1.seq, LEN(CHR(13))) AS delimiter_1, " & _
" MID(CHR(13) & I1.MyReferences & CHR(13), S2.seq, LEN(CHR(13))) AS delimiter_2 " & _
"FROM ImportTable AS I1, Sequence AS S1, Sequence AS S2 " & _
"WHERE S1.seq < S2.seq " & _
"AND S2.seq - S1.seq - LEN(CHR(13)) > 0 " & _
"AND S1.seq BETWEEN 1 AND LEN(CHR(13)) + LEN(I1.MyReferences) + LEN(CHR(13)) " & _
"AND S2.seq BETWEEN 1 AND LEN(CHR(13)) + LEN(I1.MyReferences) + LEN(CHR(13)) " & _
") AS DT1 " & _
"WHERE DT1.delimiter_1 = CHR(13) " & _
"AND DT1.delimiter_2 = CHR(13);"
Set rs = CurrentProject.Connection.Execute(sql)
MsgBox rs.GetString
End Sub
FWIW here's a PROCEDURE I wrote years ago for parsing a delimited list into a table. It seems to work OK for values up to 255 characters; any more and you get a very nasty ACE/Jet engine error. Again, I don't see what the problem is other than the engine can't cope! Anyhow, my point is that this works (for small values) and I can't figure out why I can't adapt it to the problem at hand:
CREATE PROCEDURE ListToTable
(
delimted_text MEMO,
delimiter VARCHAR(4) = ','
)
AS
SELECT MID(I1.input_string, S1.seq, MIN(S2.seq) - S1.seq - LEN(delimiter)) AS param
FROM
(
SELECT DISTINCT delimted_text AS input_string
FROM Sequence AS S3
WHERE S3.seq BETWEEN 1 AND LEN(delimted_text)
) AS I1, Sequence AS S1, Sequence AS S2
WHERE MID(delimiter & I1.input_string & delimiter, S1.seq, LEN(delimiter)) = delimiter
AND MID(delimiter & I1.input_string & delimiter, S2.seq, LEN(delimiter)) = delimiter
AND S1.seq < S2.seq
AND S1.seq BETWEEN 1 AND LEN(delimiter) + LEN(delimted_text) + LEN(delimiter)
AND S2.seq BETWEEN 1 AND LEN(delimiter) + LEN(delimted_text) + LEN(delimiter)
GROUP
BY I1.input_string, S1.seq
HAVING LEN(MID(I1.input_string, S1.seq, MAX(S2.seq) - S1.seq - LEN(delimiter))) > 0;
I think using the word "parse" in the title has confused the crap out of everyone. The bug in Access was that a correlated query performed on a query (instead of a table) causes a hang. So instead, I created a temporary table that ads the References column (with the multi-line text) to the Issues table so I have access to the other fields. The final query creates the link table described above, along with the DocID and IssueID for reference:
SELECT Q1.IssueID, Q1.IssueKey, D1.DocKey, D1.DocID
FROM Issues AS Q1, Documents AS D1
WHERE Q1.IssueID in
(SELECT Q2.IssueID FROM Issues AS Q2 WHERE Q2.References LIKE ("*" & D1.DocID & "*"));
The inner select pulls the list of issues which has a given document in the references column. The outer select performs this for each document, resulting in the aggregate list.