Scaling concurrent exports of BigQuery tables to Google Cloud Storage
I'm trying to run a query in BigQuery and store the results in Cloud Storage. This is rather straightforward to do using BigQuery's API.
An issue comes up when I try to do this with multiple queries concurrently: "extracting" the result table to Cloud Storage slows down significantly the more tables I try to extract. Here's a summary of an experiment I ran with 20 concurrent jobs. Times are in seconds.
job 013 done. Query: 012.0930221081. Extract: 009.8582818508. Signed URL: 000.3398022652
job 000 done. Query: 012.1677722931. Extract: 010.7060177326. Signed URL: 000.3358650208
job 002 done. Query: 009.5634860992. Extract: 014.2841088772. Signed URL: 000.3027939796
job 004 done. Query: 011.7068181038. Extract: 012.5938670635. Signed URL: 000.2734949589
job 020 done. Query: 009.8888399601. Extract: 015.4054799080. Signed URL: 000.3903510571
job 022 done. Query: 012.9012901783. Extract: 013.9143507481. Signed URL: 000.3490731716
job 014 done. Query: 012.8500978947. Extract: 015.0055649281. Signed URL: 000.2981300354
job 006 done. Query: 011.6835210323. Extract: 016.2601530552. Signed URL: 000.2789318562
job 001 done. Query: 013.4435272217. Extract: 015.2819819450. Signed URL: 000.2984759808
job 005 done. Query: 012.0956349373. Extract: 018.9619371891. Signed URL: 000.3134548664
job 018 done. Query: 013.6754779816. Extract: 020.0537509918. Signed URL: 000.3496448994
job 011 done. Query: 011.9627509117. Extract: 025.1803772449. Signed URL: 000.3009829521
job 008 done. Query: 015.7373569012. Extract: 136.8249070644. Signed URL: 000.3158171177
job 023 done. Query: 013.7817242146. Extract: 148.2014479637. Signed URL: 000.4145238400
job 012 done. Query: 014.5390141010. Extract: 151.3171939850. Signed URL: 000.3226230145
job 007 done. Query: 014.1386809349. Extract: 160.1254091263. Signed URL: 000.2966897488
job 021 done. Query: 013.6751790047. Extract: 162.8383400440. Signed URL: 000.3162341118
job 019 done. Query: 013.5642910004. Extract: 163.2161693573. Signed URL: 000.2765989304
job 003 done. Query: 013.8807480335. Extract: 165.1014308929. Signed URL: 000.3309218884
job 024 done. Query: 013.5861997604. Extract: 182.0707099438. Signed URL: 000.3331830502
job 009 done. Query: 013.5025639534. Extract: 199.4397711754. Signed URL: 000.4156360626
job 015 done. Query: 013.7611100674. Extract: 230.2218120098. Signed URL: 000.2913899422
job 016 done. Query: 013.4659759998. Extract: 285.7284781933. Signed URL: 000.3109869957
job 017 done. Query: 019.2001299858. Extract: 322.5298812389. Signed URL: 000.2890429497
job 010 done. Query: 014.7132742405. Extract: 363.8596160412. Signed URL: 000.6748869419
Each job does three things (a simplified sketch of one job is shown below):
1. Submits a query to BigQuery
2. Extracts the result table to Cloud Storage
3. Generates a signed URL for the blob in Cloud Storage
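Roughly, each job is implemented like this (a simplified sketch using the google-cloud-bigquery and google-cloud-storage Python clients; run_job and its parameters are illustrative, not my exact code):

    import datetime
    from google.cloud import bigquery, storage

    bq = bigquery.Client()
    gcs = storage.Client()

    def run_job(sql, bucket_name, blob_name):
        # 1. Submit the query and wait for it to finish.
        query_job = bq.query(sql)
        query_job.result()  # blocks until the query completes

        # 2. Extract the (temporary) result table to Cloud Storage.
        dest_uri = "gs://{}/{}".format(bucket_name, blob_name)
        extract_job = bq.extract_table(query_job.destination, dest_uri)
        extract_job.result()  # blocks until the extract completes

        # 3. Generate a signed URL for the exported blob.
        blob = gcs.bucket(bucket_name).blob(blob_name)
        return blob.generate_signed_url(
            expiration=datetime.timedelta(hours=1))

The timings in the logs above are simply wall-clock measurements taken around each of these three steps.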
As the results show, the first group of extracts takes 9-25 seconds; after that, extracts start taking much longer.
Any ideas on why this is happening? Could it be the request-rate ramp-up described here: https://cloud.google.com/storage/docs/request-rate ?
Is there any way of fixing this?
EDIT: Here's some additional information I discovered.
| job | Local extract time (s) | Google's extract time (s) | Google's extract started | Google's extract ended | Local extract started | Local extract ended |
| --- | ---------------------- | ------------------------- | ------------------------ | ---------------------- | --------------------- | ------------------- |
| 026 | 009.26328 | 008.84300 | 13:39:00.441000 | 13:39:09.284000 | 07:39:00.235970 | 07:39:09.498784 |
| 009 | 011.52299 | 008.04000 | 13:39:00.441000 | 13:39:08.481000 | 07:39:00.234297 | 07:39:11.756788 |
| 004 | 010.35730 | 008.66700 | 13:39:03.436000 | 13:39:12.103000 | 07:39:03.240466 | 07:39:13.597328 |
| 011 | 011.86404 | 009.29900 | 13:39:03.055000 | 13:39:12.354000 | 07:39:02.893600 | 07:39:14.756887 |
| 006 | 012.50416 | 011.75400 | 13:39:02.854000 | 13:39:14.608000 | 07:39:02.623032 | 07:39:15.126790 |
| 000 | 013.30535 | 008.77000 | 13:39:02.056000 | 13:39:10.826000 | 07:39:01.863548 | 07:39:15.168434 |
| 002 | 011.47199 | 008.53700 | 13:39:04.443000 | 13:39:12.980000 | 07:39:04.236455 | 07:39:15.708005 |
| 032 | 015.68229 | 009.69200 | 13:39:02.915000 | 13:39:12.607000 | 07:39:02.768185 | 07:39:18.450160 |
| 001 | 017.46480 | 009.35800 | 13:39:01.313000 | 13:39:10.671000 | 07:39:01.071540 | 07:39:18.535896 |
| 012 | 019.02242 | 008.65700 | 13:39:00.903000 | 13:39:09.560000 | 07:39:00.727101 | 07:39:19.749070 |
| 018 | 016.95632 | 009.75800 | 13:39:03.259000 | 13:39:13.017000 | 07:39:03.080580 | 07:39:20.036199 |
| 019 | 017.24428 | 008.51100 | 13:39:03.773000 | 13:39:12.284000 | 07:39:03.575118 | 07:39:20.819042 |
| 008 | 019.55018 | 009.83600 | 13:39:02.110000 | 13:39:11.946000 | 07:39:01.905548 | 07:39:21.455273 |
| 023 | 016.64131 | 008.94500 | 13:39:05.282000 | 13:39:14.227000 | 07:39:05.041235 | 07:39:21.682086 |
| 017 | 019.39104 | 007.12700 | 13:39:03.118000 | 13:39:10.245000 | 07:39:02.896256 | 07:39:22.286485 |
| 020 | 019.96283 | 010.05000 | 13:39:03.115000 | 13:39:13.165000 | 07:39:02.942562 | 07:39:22.904864 |
| 036 | 022.05831 | 010.51200 | 13:39:02.626000 | 13:39:13.138000 | 07:39:02.461061 | 07:39:24.518903 |
| 024 | 028.39538 | 008.79600 | 13:39:05.151000 | 13:39:13.947000 | 07:39:04.916194 | 07:39:33.311248 |
| 007 | 107.36010 | 010.68900 | 13:40:31.555000 | 13:40:42.244000 | 07:39:03.050049 | 07:40:50.409359 |
| 028 | 120.63134 | 009.52400 | 13:40:49.915000 | 13:40:59.439000 | 07:39:02.941202 | 07:41:03.572094 |
| 033 | 120.78268 | 009.54200 | 13:40:27.147000 | 13:40:36.689000 | 07:39:04.152378 | 07:41:04.934602 |
| 037 | 122.64949 | 008.80400 | 13:40:33.298000 | 13:40:42.102000 | 07:39:06.500587 | 07:41:09.149629 |
| 035 | 125.35254 | 009.13200 | 13:40:27.600000 | 13:40:36.732000 | 07:39:04.295941 | 07:41:09.647836 |
| 015 | 139.13287 | 011.17800 | 13:40:27.116000 | 13:40:38.294000 | 07:39:03.406321 | 07:41:22.538701 |
| 029 | 141.21037 | 008.23700 | 13:40:24.271000 | 13:40:32.508000 | 07:39:03.816588 | 07:41:25.026438 |
| 013 | 145.94239 | 009.19400 | 13:40:33.809000 | 13:40:43.003000 | 07:39:03.375451 | 07:41:29.317454 |
| 039 | 149.92807 | 009.72300 | 13:40:33.090000 | 13:40:42.813000 | 07:39:03.635156 | 07:41:33.562607 |
| 016 | 166.26505 | 010.12000 | 13:40:39.999000 | 13:40:50.119000 | 07:39:03.383215 | 07:41:49.647907 |
| 010 | 210.61908 | 011.37900 | 13:42:20.287000 | 13:42:31.666000 | 07:39:03.702486 | 07:42:34.321079 |
| 027 | 227.83011 | 010.00900 | 13:42:25.845000 | 13:42:35.854000 | 07:39:02.953435 | 07:42:50.783106 |
| 025 | 228.48326 | 009.71000 | 13:42:20.845000 | 13:42:30.555000 | 07:39:03.673122 | 07:42:52.155934 |
| 022 | 244.57685 | 010.06900 | 13:42:53.712000 | 13:43:03.781000 | 07:39:03.963936 | 07:43:08.540307 |
| 021 | 263.74717 | 009.81400 | 13:42:40.211000 | 13:42:50.025000 | 07:39:04.505016 | 07:43:28.251864 |
| 031 | 273.96990 | 008.55100 | 13:43:18.645000 | 13:43:27.196000 | 07:39:03.618419 | 07:43:37.587862 |
| 034 | 280.96174 | 010.53300 | 13:42:58.364000 | 13:43:08.897000 | 07:39:04.313498 | 07:43:45.274962 |
| 030 | 281.76029 | 008.27100 | 13:42:49.448000 | 13:42:57.719000 | 07:39:03.832644 | 07:43:45.592592 |
| 005 | 288.15577 | 009.85300 | 13:43:04.825000 | 13:43:14.678000 | 07:39:04.006553 | 07:43:52.161888 |
| 003 | 296.52279 | 009.65300 | 13:43:24.041000 | 13:43:33.694000 | 07:39:03.831264 | 07:44:00.353715 |
| 038 | 380.01783 | 008.45000 | 13:44:57.326000 | 13:45:05.776000 | 07:39:03.055733 | 07:45:23.073209 |
| 014 | 397.05841 | 008.99800 | 13:44:48.577000 | 13:44:57.575000 | 07:39:03.132323 | 07:45:40.190302 |
The table compares how long I have to wait locally for each job with how long Google reports the job actually took (the server and local timestamps are in different timezones, six hours apart). The extract itself is quick once it starts, but the jobs are not all run at the same time, so some extracts sit queued for several minutes before they begin.
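The server-side timestamps in the table come from the statistics BigQuery records for every job. Something like this sketch can pull them after the fact (the job ID is a placeholder):

    from google.cloud import bigquery

    bq = bigquery.Client()

    # "my-extract-job-id" is a placeholder; use the ID of a finished extract job.
    job = bq.get_job("my-extract-job-id")

    # created/started/ended mirror the job's statistics timestamps.
    print(job.created, job.started, job.ended)
    print("server-side duration:",
          (job.ended - job.started).total_seconds(), "s")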
You're correct: there is currently an internal limit on how fast export jobs are processed. It was originally put in to protect the system from too many long, expensive exports running in parallel. However, as you noted, this limit doesn't help in your case, where you have many export jobs that each complete within a minute.
We have an open (internal) bug to make the situation better for smaller exports like yours. In the meantime, if you think you're blocked by this, file a bug or let me know your project ID, and we can raise the limit for your project.
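Until the limit is raised, one possible client-side mitigation (a suggestion, not an official fix) is to cap how many extract jobs you have in flight, so they queue in your process rather than on the server. A sketch; the cap of 4 is arbitrary:

    import threading

    extract_slots = threading.BoundedSemaphore(4)  # arbitrary cap on concurrent extracts

    def extract_with_throttle(bq_client, source_table, dest_uri):
        # Queries can still run fully in parallel; only the extract step
        # waits here for one of the available slots.
        with extract_slots:
            job = bq_client.extract_table(source_table, dest_uri)
            job.result()  # hold the slot until the extract finishes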