Import multiple queries written in SQL into Access

I've got to write 50 relatively simple queries that all use the same basic form, but each successive query depends on the one before it to run.
I can quickly and easily write the queries in SQL in a text editor, e.g. Word, but I don't know how to import the text back into Access. Nor do I know how to specify the name of the query in the SQL code, or how to mark where a query ends.
Here is a sample of 4 queries. The first line is the name of each query, and two consecutive hard returns mark the end of each query.
'Ring2Q1
SELECT RINGS.Parent, RINGS_1.Child, 2 AS Ring
FROM RINGS INNER JOIN RINGS AS RINGS_1 ON RINGS.Child = RINGS_1.Parent;
'Ring2Q2
SELECT Ring2Q1.Parent, Ring2Q1.Child, Max(Ring2Q1.Ring) AS Ring
FROM Ring2Q1
GROUP BY Ring2Q1.Parent, Ring2Q1.Child;
'Ring3Q1
SELECT RINGS.Parent, Ring2Q2.Child, 3 AS Ring
FROM RINGS INNER JOIN Ring2Q2 ON RINGS.Child = Ring2Q2.Parent;
'Ring3Q2
SELECT Ring3Q1.Parent, Ring3Q1.Child, Max(Ring3Q1.Ring) AS Ring
FROM Ring3Q1
GROUP BY Ring3Q1.Parent, Ring3Q1.Child;

Go into Access. Create a new query. Select SQL View. You can copy and paste the text of the query in there. Save it as the name the next query needs, then repeat. You will obviously need a starting table for the first query to call. I would also look at why you need a cascading set of 50 queries; on any sizeable amount of data this is going to take a long time to run.

Related

Redshift SQL result set 100s of rows wide efficiency (long to wide)

Scenario: Medical records reporting to state government which requires a pipe delimited text file as input.
Challenge: Select hundreds of values from a fact table and produce a wide result set to be (Redshift) UNLOADed to disk.
What I have tried so far is a SQL query that I want to make into a VIEW.
;WITH
CTE_patient_record AS
(
SELECT
record_id
FROM fact_patient_record
WHERE update_date = <yesterday>
)
,CTE_patient_record_item AS
(
SELECT
record_id
,record_item_name
,record_item_value
FROM fact_patient_record_item fpri
INNER JOIN CTE_patient_record cpr ON fpri.record_id = cpr.record_id
)
Note that fact_patient_record has 87M rows and fact_patient_record_item has 97M rows.
The above code runs in 2 seconds for 2 test records and the CTE_patient_record_item CTE has about 200 rows per record for a total of about 400.
Now, produce the result set:
,CTE_result AS
(
SELECT
cpr.record_id
,cpri002.record_item_value AS diagnosis_1
,cpri003.record_item_value AS diagnosis_2
,cpri004.record_item_value AS medication_1
...
FROM CTE_patient_record cpr
INNER JOIN CTE_patient_record_item cpri002 ON cpr.record_id = cpri002.record_id
AND cpri002.record_item_name = 'diagnosis_1'
INNER JOIN CTE_patient_record_item cpri003 ON cpr.record_id = cpri003.record_id
AND cpri003.record_item_name = 'diagnosis_2'
INNER JOIN CTE_patient_record_item cpri004 ON cpr.record_id = cpri004.record_id
AND cpri004.record_item_name = 'medication_1'
...
) SELECT * FROM CTE_result
Result set looks like this:
record_id diagnosis_1 diagnosis_2 medication_1 ...
100001 09 9B 88X ...
...and then I use the Redshift UNLOAD command to write the pipe-delimited file to disk.
I am testing this on a full production sized environment but only for 2 test records.
Those 2 test records have about 200 items each.
Processing output is 2 rows 200 columns wide.
It takes 30 to 40 minutes to process just the 2 records.
You might ask why I am joining on the item name, which is a string. Basically there is no item id, no integer, to join on. Long story.
I am looking for suggestions on how to improve performance. With only 2 records, 30 to 40 minutes is unacceptable. What will happen when I have 1000s of records?
I have also tried making the VIEW a MATERIALIZED VIEW; however, not surprisingly, it also takes 30 to 40 minutes to build the materialized view.
I am not sure which route to take from here.
Stored procedure? I have experience with stored procs.
Create new tables so I can create integer id's to join on and indexes? However, my managers are "new table" averse.
?
I could just stop with the first two CTEs, pull the data down to Python, and process it using a pandas dataframe, which I've done before successfully, but it would be nice if I could have an efficient query, just use Redshift UNLOAD, and be done with it.
Any help would be appreciated.
UPDATE: Many thanks to Paul Coulson and Bill Weiner for pointing me in the right direction! (Paul I am unable to upvote your answer as I am too new here).
Using (pseudo code):
MAX(CASE WHEN t1.name = 'somename' THEN t1.value END ) AS name
...
FROM table1 t1
reduced execution time from 30 minutes to 30 seconds.
The EXPLAIN plan for the original solution is 2700 lines long; for the new solution using conditional aggregation it is 40 lines long.
Thanks guys.
Without some more information it is impossible to know for sure what is going on, but what you are doing is likely not ideal. An explain plan and the execution time per step would help a bunch.
What I suspect is hurting you is that you are reading a 97M row table 200 times. This will slow things down but shouldn't take 40 min. So I also suspect that record_item_name is not unique per value of record_id. This will lead to row replication and could be expanding the data set many-fold. Also, is record_id unique in fact_patient_record? If not, this will cause row replication too. If all of this is large enough to cause significant spill and significant network broadcasting, your 40 min execution time is very plausible.
There is no need to join when all the data is in a single copy of the table. @PhilCoulson is correct that some sort of conditional aggregation could be applied, and the decode() syntax could save you space if you don't like CASE. Several of the issues above that might be affecting your joins would also make this aggregation complicated. What are you looking for if there are several values of record_item_value for each record_id and record_item_name pair? I expect you have some discovery of what your data holds in your future.
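For concreteness, here is a minimal sketch of that conditional-aggregation rewrite, reusing the table, CTE, and column names from the question (<yesterday> is the question's placeholder; the real query would have one conditional aggregate per output column):
;WITH
CTE_patient_record AS
(
SELECT
record_id
FROM fact_patient_record
WHERE update_date = <yesterday>
)
SELECT
fpri.record_id
,MAX(CASE WHEN fpri.record_item_name = 'diagnosis_1' THEN fpri.record_item_value END) AS diagnosis_1
,MAX(CASE WHEN fpri.record_item_name = 'diagnosis_2' THEN fpri.record_item_value END) AS diagnosis_2
,MAX(CASE WHEN fpri.record_item_name = 'medication_1' THEN fpri.record_item_value END) AS medication_1
-- ...one MAX(CASE ...) line per output column
FROM fact_patient_record_item fpri
INNER JOIN CTE_patient_record cpr ON fpri.record_id = cpr.record_id
GROUP BY fpri.record_id;
This scans fact_patient_record_item once instead of roughly 200 times, which is consistent with the 2700-line vs 40-line explain plans reported in the update.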

Alternative to IN in SQL to execute it in less time

I wrote a query, but I want to know whether its execution time will be slow or fast. Can we use any alternative to IN, since it is used 4 times in such a small query?
Is there a better way to write this query? Moreover, I am confused because someone wants to query it with a large number of ep_et_id values, and the way I wrote it there is nowhere to supply that list of ep_et_id.
The query below fetches 74133 results in 14.55 secs in a direct Postgres call, but I believe it will take more time if I call it from the webpage.
select lp, ob_id
from ob_in
where ob_id in (select ob_id
                from ob_for_e
                where ep_et_id in (select ep_et_id
                                   from rrts rep
                                   left join rrts_inf rinf on rep.rrts_id = rinf.rrts_id
                                   where rrts_type in ('FR', 'IN')
                                     and rinf.status in ('V')))
The tables I am using here are ob_in, ob_for_e, rrts, and rrts_inf.
The location points (lp) are in table ob_in, and I put in the multiple INs to fetch IDs from the other tables so that the final condition on rrts/rrts_inf selects type FR or IN with status V.
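For example, I understand the same filter can be written with EXISTS instead of nested INs; a sketch, assuming ep_et_id and rrts_type come from rrts as the original query implies:
select oi.lp, oi.ob_id
from ob_in oi
where exists (select 1
              from ob_for_e ofe
              join rrts rep on rep.ep_et_id = ofe.ep_et_id
              join rrts_inf rinf on rinf.rrts_id = rep.rrts_id
              where ofe.ob_id = oi.ob_id
                and rep.rrts_type in ('FR', 'IN')
                and rinf.status = 'V');
Note that the LEFT JOIN in the original behaves as an INNER JOIN anyway, because the WHERE clause filters on rinf.status.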

How to query only old and duplicate data from a database in SQL

I'm trying to query my database to pull only duplicate/old data to write to a scratch section in Excel (using a macro passing SQL to the DB).
For now, I'm currently testing in Access alone to only filter out the old data.
First, I'm trying to filter my database by a specified WorkOrder, RunNumber, and Row.
The code below only filters by WorkOrder, RunNumber, and Row, but SQL doesn't like it when I tack on a second AND clause, so this currently isn't working.
SELECT *
FROM DataPoints
WHERE (((DataPoints.[WorkOrder])=[WO2]) AND ((DataPoints.[RunNumber])=6) AND ((DataPoints.[Row]=1)
Once I figure that portion out....
Then, if there is only 1 entry with the specified WorkOrder, RunNumber, and Row, I want to filter it out. (It's not needed in the scratch section, because its data is already written to the main section of my report.)
If there are 2 or more entries with said criteria (WO, RN, and Row), then I want to filter out the newest entry based on RunDate and RunTime, and keep only the older entries.
For instance, in my example data, the only item remaining in the filtered query would be the top entry, with the timestamp 11:47:00 AM.
Are there any recommended commands to complete this problem? Any ideas are helpful. Thank you.
I would suggest something along the lines of the following:
select t.*
from datapoints t
where t.workorder = [WO2]
  and t.runnumber = 6
  and t.row = 1
  and exists
      (
      select 1
      from datapoints u
      where u.workorder = t.workorder
        and u.runnumber = t.runnumber
        and u.row = t.row
        and (u.rundate > t.rundate or (u.rundate = t.rundate and u.runtime > t.runtime))
      )
Here, if the correlated subquery within the where clause finds a record with the same workorder, runnumber and row, but with either a later rundate or the same rundate and a later runtime, then the record is returned by the main query.
You need two more )'s at the end of your code snippet, or you can delete the parentheses completely in this example; MS Access will add them back in as it deems necessary.
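For reference, here is the same filter with the parentheses simply dropped:
SELECT *
FROM DataPoints
WHERE DataPoints.[WorkOrder] = [WO2]
  AND DataPoints.[RunNumber] = 6
  AND DataPoints.[Row] = 1;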
MS Access SQL can be tricky, as it is not standards compliant and either doesn't allow for super complex queries or needs an ugly workaround, like a parentheses nesting nightmare when trying to join more than two tables.
For these reasons, I suggest using multiple Access queries to produce your results.

How to run a sql query multiple times and combine the results into single output?

I have a list of 2500 obj numbers stored in Excel for which I need to run the below SQL:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
a.objno = 19551;
Is there any way I can write a loop over the above SQL, with objno fed from a list or from a different table? I also want to store/produce the results from all the loop runs as a single output.
I considered the option to upload the numbers into a new table and add a where condition:
a.objno=(SELECT newtab.objectno FROM newtab);
However, the logic I'll be writing in the query would exclude certain objectno results. Let's say the associated objectno has a certain queue_comment as of a certain date; I do not want to pull that record. This condition matches some objectno values and not others, and with it in place, running the query against all the objectno at once returns 0 results. I couldn't share the original logic, as it would reveal certain business rules and violate policy.
So, I need to run the query on each objectno separately and combine the results.
I'm totally new to SQL and got this task assigned. I'm aware of the regular FOR loop in SQL, but I don't think I can apply it in this situation.
Any guidance or reference links to helpful topics are much appreciated as well.
Thanks in advance for the help.
One option is to upload the object numbers from the Excel sheet to a table in the database and run the query as follows, assuming newtab is the table where the objectno values are uploaded.
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a JOIN AQ$_QUEUES b on a.objno = b.table_objno
WHERE
a.objno IN (SELECT newtab.objectno FROM newtab);
I have used a subquery here; a join to the aq$ tables can work as well.
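If you'd rather join, something like this sketch should work, with the caveat that objectno must be unique in newtab, or the join will duplicate rows where the IN form would not:
SELECT
    a.objno,
    a.table_comment,
    b.queue_comment
FROM
    aq$_queue_tables a
    JOIN AQ$_QUEUES b ON a.objno = b.table_objno
    JOIN newtab n ON n.objectno = a.objno;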
Reading the comments, I think you need to enhance your Excel sheet with 2 additional columns and load it into a new table.
IN can be used in the following way too:
SELECT
a.objno,
a.table_comment,
b.queue_comment
FROM
aq$_queue_tables a
JOIN
AQ$_QUEUES b ON a.objno = b.table_objno
WHERE
(a.objno, a.table_comment, b.queue_comment) IN ((19551, 'something', 'something'));
so with the new table it will be:
WHERE
(a.objno,a.table_comment,b.queue_comment) IN
(select n.objno, n.table_comment, n.queue_comment from new_table n)

Concatenation and proportion in Access

A little background:
I have two tables imported from Excel. One is 300k+ rows, so when I do updates to it in Excel it just runs too slowly and often doesn't finish on my computer. Anyway, I used an 'outer' left join to bring the two together.
Now when I run the query I get the result, which works fine, but I need to add some fields to these results.
I am hoping to mimic what I've done in Excel, so I can create my summary pivots in the same manner.
First, I need a field that just concatenates two others after the join.
Then I need to add a field that is the equivalent of:
1/Countif($T$2:$T$3330,T2) from Excel in Access. However, the range does not need to be fixed. I will arrange it so that all the text entries are at the top of the field, so in theory I need the equivalent of Sheets("").Range("T2").End(xlDown). This proportion is used to eliminate double counting when I do pivot tables.
I am probably making this much more complicated than it has to be but I am new to Access as well, so please try to explain some things in explanations.
Thanks
Edit: I currently have:
Select [Table1].*, [Table2].PlaySk, [Table2].Service
From [Table1] Left Join [Table2] On [Table1].Play + [Table1].Skill
= [Table2].PlaySk
And in the general case, what I am trying to solve is something that gets ColAB and ColProportion:
ColA  ColB  ColAB  ColProportion
a     1     a1     .5
b     1     b1     1
a     1     a1     .5
b     2     b2     .3333333
b     2     b2     .3333333
b     2     b2     .3333333
Sounds to me like you'll need to make a couple of queries in sequence to do everything you need.
The first part (concatenate) is relatively easy though -- just take the two field names you wish to concatenate together, say [Play] and [Skill], and, in design view, make a new field like "PlaySk: [Play] & [Skill]".
If you want to put a character between them (I often do when I concatenate, just to keep things straight), like a semicolon for example, you can do "PlaySk: [Play] & ';' & [Skill]".
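In SQL view, that same calculated field looks like this (using [Table1] from your edit above):
SELECT [Play] & ';' & [Skill] AS PlaySk
FROM [Table1];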
As for the second part, I think you'll want to build a "Group By" query on top of the other one. In your original query, make another field in design view like this: "T2_Counter: IIf([the field you're checking, i.e. whatever column T is] = 'whatever value you're checking for, i.e. whatever T2 is', 1, 0)". This will result in a column that's 1 when the check is true and 0 otherwise.
Then bring this query into a new one, click "Totals" at the top in the Design tab, then bring the fields you want to group by down. Then create a field in design view like this: "MagicField: 1/Sum(T2_Counter)".
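If you'd rather do it in one statement in SQL view, here is a sketch along those lines, assuming your data is in a table named Table1 with fields ColA and ColB; it counts each ColAB group in a subquery instead of summing the IIf counter, which amounts to the same thing:
SELECT t.ColA, t.ColB, t.ColA & t.ColB AS ColAB, 1/c.Cnt AS ColProportion
FROM Table1 AS t
INNER JOIN (SELECT ColA & ColB AS GrpAB, Count(*) AS Cnt
            FROM Table1
            GROUP BY ColA & ColB) AS c
ON (t.ColA & t.ColB) = c.GrpAB;
Access's design view can't display a join on an expression like this, but it should run from SQL view; if that bothers you, save the grouped subquery as its own query and join to that instead.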
Hopefully this helps get you started at least.