I just registered and want to ask a question.
I haven't been learning SQL for long, and I ran into trouble when I decided to move a table to another database. I read a few articles about building long subqueries, but they didn't help me.
Everything worked perfectly before I did this.
I just moved the table, and then spent the whole day trying to rewrite the query.
update [dbo].Full
set [salary] = 1000
where [dbo].Full.id in (
    select distinct k1.id
    from (
        select id, Topic, User
        from Full
        where User not in (select distinct topic_name from [DB_1].dbo.S_School)
    ) k1
    where k1.id not in (
        select distinct k2.id
        from (
            select id, Topic, User
            from Full
            where User not in (select distinct topic_name from [DB_1].dbo.S_School)
        ) k2,
        List_School t3
        where charindex(t3.NameApp, k2.Topic) > 5
    )
)
I moved the table List_School to database [DB_1], and now I can't make the query work with it.
I can't just write [DB_1].dbo.List_School. Should I use one more subquery?
I even thought about creating a few temporary tables, but that could hurt execution speed.
SQL gurus, please invest some of your time in me. Thank you in advance.
I'll be grateful for any hint you can give me.
There appear to be a number of issues. You are comparing the User column to the topic_name column; the expected meaning of those column names suggests you are not comparing the correct columns. But that is a guess.
In the final subquery you have an old-style comma join on table List_School with no join condition, which means the join with k2 is a cartesian product (a.k.a. cross join); that is not what you want in most situations. Again, this is a guess, since no details of the actual problem data or error messages were provided.
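The cartesian-product effect described above is easy to reproduce. Here is a minimal sketch using Python's sqlite3 as a stand-in engine; the tables and rows are invented for illustration and are not taken from the original schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE k2 (id INTEGER, Topic TEXT);
    CREATE TABLE List_School (NameApp TEXT);
    INSERT INTO k2 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO List_School VALUES ('x'), ('y');
""")

# Old-style comma join with no join predicate: every k2 row is paired
# with every List_School row, so 3 x 2 = 6 rows come back.
rows = cur.execute("SELECT * FROM k2, List_School").fetchall()
print(len(rows))  # 6
```

If the cross join is intentional, the charindex filter in the WHERE clause is what trims it back down; if not, an explicit join condition is needed.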
I'm very inexperienced with SQL, and I've found myself needing to write a query. Hopefully you can help me understand how I'd go about this:
I have two tables.
"table_requests" contains all requests, some of which are batches
"table_pages" contains information for each page of a batch and connects to "table_requests" via the column "table_request_id"
In addition, "table_pages" has a numeric column "word_count" that lists a number for each page and a "table_request_id" column that can match to the PK of "table_requests".
For my query, I'd like to connect "table_requests" to "table_pages" on that matching column, and select everything from "table_requests" with an added column on the end that totals the "word_count" for each "table_request" (from all pages in "table_pages").
So far I have:
select tr.id, tr.creation_date, sum(tp.word_count) as total_wc
from table_requests tr
join table_pages tp on tp.table_request_id = tr.id
Thank you all, let me know if there is any more information I can provide!
I think that the simplest approach is a correlated subquery:
select
tr.*,
(
select sum(tp.word_count)
from table_pages tp
where tp.table_request_id = tr.id
) total_wc
from table_requests tr
For performance with this query, make sure that you have an index on table_pages(table_request_id).
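As a runnable sanity check, here is the same correlated-subquery shape against Python's sqlite3 with a small made-up dataset (the row values are illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE table_requests (id INTEGER PRIMARY KEY, creation_date TEXT);
    CREATE TABLE table_pages (table_request_id INTEGER, word_count INTEGER);
    CREATE INDEX ix_pages_req ON table_pages(table_request_id);
    INSERT INTO table_requests VALUES (1, '2020-01-01'), (2, '2020-01-02');
    INSERT INTO table_pages VALUES (1, 100), (1, 250), (2, 40);
""")

# Correlated subquery: one word-count total computed per request row.
rows = cur.execute("""
    SELECT tr.*,
           (SELECT SUM(tp.word_count)
            FROM table_pages tp
            WHERE tp.table_request_id = tr.id) AS total_wc
    FROM table_requests tr
    ORDER BY tr.id
""").fetchall()
print(rows)  # [(1, '2020-01-01', 350), (2, '2020-01-02', 40)]
```

A join plus GROUP BY over all of tr's columns would give the same result; the correlated form just avoids having to list every column in the GROUP BY.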
I have a Netezza query with a WHERE clause that includes several hundred potential strings. I'm surprised that it runs, but it takes time to complete and occasionally errors out ('transaction rolled back by client'). Here's a pseudo code version of my query.
SELECT
    TO_CHAR(X.I_TS, 'YYYY-MM-DD') AS DATE,
    X.I_SRC_NM AS CHANNEL,
    X.I_CD AS CODE,
    COUNT(DISTINCT CASE WHEN X.I_FLG = 1 THEN X.UID ELSE NULL END) AS WIDGETS
FROM
    (SELECT
        A.I_TS,
        A.I_SRC_NM,
        A.I_CD,
        B.UID,
        B.I_FLG
    FROM
        SCHEMA.DATABASE.TABLE_A A
        LEFT JOIN SCHEMA.DATABASE.TABLE_B B ON A.UID = B.UID
    WHERE
        A.I_TS BETWEEN '2017-01-01' AND '2017-01-15'
        AND B.TAB_CODE IN ('00AV', '00BX', '00C2', '00DJ'...
        ...
        ...
        ...
        ...
        ...
        ...
        ...)
    ) X
GROUP BY
    X.I_TS,
    X.I_SRC_NM,
    X.I_CD
;
In my query, I'm limiting the results on B.TAB_CODE to about 1,200 values (out of more than 10k). I'm honestly surprised that it works at all, but it does most of the time.
Is there a more efficient way to handle this?
If the IN clause becomes too cumbersome, you can build your query in multiple parts: create a temporary result set containing the TAB_CODE values, then use it in a JOIN.
WITH tab_codes(tab_code) AS (
SELECT '00AV'
UNION ALL
SELECT '00BX'
--- etc ---
)
SELECT
TO_CHAR(X.I_TS, 'YYYY-MM-DD') AS DATE,
X.I_SRC_NM AS CHANNEL,
--- etc ---
INNER JOIN tab_codes Q ON B.TAB_CODE = Q.tab_code
If you want to boost performance even more, consider using a real temporary table (CTAS).
We've seen situations where it's "cheaper" to CTAS the original table to another table, distributed on your primary condition, and then query that table instead.
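For a concrete (if toy) check that the two forms agree, here is the IN-list versus values-CTE rewrite in Python's sqlite3, standing in for Netezza; the codes and rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE b (uid INTEGER, tab_code TEXT);
    INSERT INTO b VALUES (1, '00AV'), (2, '00BX'), (3, 'ZZZZ');
""")

# The long IN list ...
in_rows = cur.execute(
    "SELECT uid FROM b WHERE tab_code IN ('00AV', '00BX') ORDER BY uid"
).fetchall()

# ... rewritten as a join against a values CTE, as in the answer above.
join_rows = cur.execute("""
    WITH tab_codes(tab_code) AS (
        SELECT '00AV' UNION ALL SELECT '00BX'
    )
    SELECT b.uid
    FROM b
    INNER JOIN tab_codes q ON b.tab_code = q.tab_code
    ORDER BY b.uid
""").fetchall()

print(in_rows == join_rows)  # True: both keep uids 1 and 2 only
```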
If I'm guessing correctly, X.I_TS is in fact a timestamp, and as such I expect it to contain many different values per day. Can you confirm that?
If I'm right, the query can probably benefit from changing 'GROUP BY X.I_TS, ...' to 'GROUP BY 1, ...'.
Furthermore, the 'COUNT(DISTINCT CASE...' can never return anything other than 0 or 1. Can you confirm that?
If I'm right on that, you can get rid of the expensive DISTINCT by changing it to 'MAX(CASE...'.
Can you follow me :)
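The GROUP BY 1 suggestion means grouping on the first select-list expression (the formatted date) rather than the raw timestamp. A small illustration in Python's sqlite3, where date() stands in for TO_CHAR and the timestamps are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE t (ts TEXT);
    INSERT INTO t VALUES
        ('2017-01-01 08:00:00'), ('2017-01-01 17:30:00'),
        ('2017-01-02 09:15:00');
""")

# GROUP BY 1 groups on the first output column (the truncated date),
# not the raw timestamp, so both Jan 1 rows collapse into one group.
rows = cur.execute("""
    SELECT date(ts) AS d, COUNT(*) AS n
    FROM t
    GROUP BY 1
    ORDER BY 1
""").fetchall()
print(rows)  # [('2017-01-01', 2), ('2017-01-02', 1)]
```

Grouping by the raw ts instead would produce three one-row groups, which is the behavior the comment is warning about.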
I wrote the query below as part of a larger query that creates a table. As I'm new to SQL, I built it in a very step-by-step manner so that I could easily understand each individual step and what each part was doing.
However, I've now been tasked with making the two parts below more efficient by joining them together, and this is where I'm struggling.
I feel like I should be creating a single table rather than two, and that the single table should contain all of the columns/values I require. However, I am not at all sure of the syntax required to make this happen, or the order in which I need to rewrite the query.
Is anyone able to offer any advice?
Many thanks
sys_type as (select nvl(dw_start_date,sysdate) date_updated, id, descr
from scd2.scd2_table_a
inner join year_month_period
on 1=1
WHERE batch_end_date BETWEEN dw_start_date and NVL(dw_end_date,sysdate)),
sys_type_2 as (select -1 as sys_typ_id,
'Unassigned' as sys_typ_desc,
sysdate as date_updated
from dual
union
select id as sys_typ_id, descr as sys_typ_desc, date_updated
from sys_type),
Assuming you are using an Oracle database, the queries above seem fine. I don't think you can make them more efficient just by 'joining' them ('joining' defined very loosely here). Is there a performance issue?
I think you can get better results by tuning your first inline query, 'sys_type'.
You have a cartesian product there. Do you need it? Why not move the condition from the WHERE clause into the join clause?
Basically
sys_type as (select nvl(dw_start_date,sysdate) date_updated, id, descr
from scd2.scd2_table_a
inner join year_month_period
on (batch_end_date BETWEEN dw_start_date and NVL(dw_end_date,sysdate)))
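To see that moving the predicate changes the shape but not the result, here is a toy equivalence check in Python's sqlite3; COALESCE and a sentinel date stand in for Oracle's NVL and SYSDATE, and all table contents are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE scd2_table_a (id INTEGER, descr TEXT,
                               dw_start_date TEXT, dw_end_date TEXT);
    CREATE TABLE year_month_period (batch_end_date TEXT);
    INSERT INTO scd2_table_a VALUES
        (1, 'current', '2020-01-01', NULL),
        (2, 'expired', '2019-01-01', '2019-12-31');
    INSERT INTO year_month_period VALUES ('2020-06-30');
""")

# Cross join (ON 1=1) with the filter applied afterwards in WHERE ...
rows_where = cur.execute("""
    SELECT a.id FROM scd2_table_a a
    INNER JOIN year_month_period p ON 1 = 1
    WHERE p.batch_end_date BETWEEN a.dw_start_date
          AND COALESCE(a.dw_end_date, '9999-12-31')
""").fetchall()

# ... versus the same predicate expressed as the join condition:
# identical rows, but the join intent is explicit.
rows_on = cur.execute("""
    SELECT a.id FROM scd2_table_a a
    INNER JOIN year_month_period p
       ON p.batch_end_date BETWEEN a.dw_start_date
          AND COALESCE(a.dw_end_date, '9999-12-31')
""").fetchall()

print(rows_where == rows_on)  # True
```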
Afternoon all, hope you can help an SQL newbie with what's probably a simple request. I'll jump straight in with the question/problem.
For table Property_Information, I'd like to retrieve either a complete record, or even specified fields if possible where the below criteria are met.
The table has a column PLCODE, which is not unique. It also has a column PCODE, which is unique, and there are multiple PCODEs per PLCODE (if that makes sense).
What I need to do is request the lowest PCODE record, for each unique PLCODE.
E.g., there are 6,500 records in this table and 255 unique PLCODEs, so I'd expect a result set of 255 individual PLCODEs, each with the lowest PCODE record attached.
As I'm here, and already feel like a burden to the community, perhaps someone might suggest a good resource for developing existing (but basic) SQL skills?
Many thanks in advance
P.S. Query will be performed on MSSQLSMS 2012 on a 2005 DB if that's of any relevance
select PLCODE, min(PCODE) from Property_Information group by PLCODE
You can Google any ANSI SQL site or find SQL tutorials.
Something like this will give you all columns for your grouped rows.
WITH CTE AS
(
SELECT
PLCODE
, MIN(PCODE) AS PCODE
FROM Property_Information
GROUP BY PLCODE
)
SELECT p.* FROM CTE c
LEFT JOIN Property_Information p
ON c.PLCODE = p.PLCODE AND c.PCODE = p.PCODE
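Here is that CTE approach end to end in Python's sqlite3, with a made-up three-column version of Property_Information (an inner join suffices here, since every grouped row has a matching source row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE Property_Information (PLCODE INTEGER, PCODE INTEGER, Addr TEXT);
    INSERT INTO Property_Information VALUES
        (10, 101, 'a'), (10, 102, 'b'),
        (20, 201, 'c'), (20, 205, 'd'), (20, 210, 'e');
""")

# One full row per PLCODE: the row holding that PLCODE's minimum PCODE.
rows = cur.execute("""
    WITH cte AS (
        SELECT PLCODE, MIN(PCODE) AS PCODE
        FROM Property_Information
        GROUP BY PLCODE
    )
    SELECT p.*
    FROM cte c
    JOIN Property_Information p
      ON c.PLCODE = p.PLCODE AND c.PCODE = p.PCODE
    ORDER BY p.PLCODE
""").fetchall()
print(rows)  # [(10, 101, 'a'), (20, 201, 'c')]
```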
SELECT *, MIN(PCODE)
FROM Property_Information
GROUP BY PLCODE
Note that this form relies on nonstandard GROUP BY handling (e.g. MySQL's) and will be rejected by SQL Server, which the asker is using.
Say I have a select statement that goes...
select * from animals
That gives a query result of all the columns in the table.
Now, say the 42nd column of the table animals is is_parent, and I want to return it in my results just after gender, so I can see it more easily; but I also want all the other columns.
select is_parent, * from animals
This returns ORA-00936: missing expression.
The same statement will work fine in Sybase, and I know that you need to add a table alias to the animals table to get it to work (select is_parent, ani.* from animals ani), but why does Oracle need a table alias to be able to work out the select?
Actually, it's easy to solve the original problem. You just have to qualify the *.
select is_parent, animals.* from animals;
should work just fine. Aliases for the table names also work.
There is no merit in doing this in production code. We should explicitly name the columns we want rather than using the SELECT * construct.
As for ad hoc querying, get yourself an IDE - SQL Developer, TOAD, PL/SQL Developer, etc - which allows us to manipulate queries and result sets without needing extensions to SQL.
Good question, I've often wondered this myself but have then accepted it as one of those things...
A similar problem is this:
sql>select geometrie.SDO_GTYPE from ngg_basiscomponent
ORA-00904: "GEOMETRIE"."SDO_GTYPE": invalid identifier
where geometrie is a column of type mdsys.sdo_geometry.
Add an alias and the thing works.
sql>select a.geometrie.SDO_GTYPE from ngg_basiscomponent a;
Lots of good answers so far on why select * shouldn't be used, and they're all perfectly correct. However, I don't think any of them answers the original question of why this particular syntax fails.
Sadly, I think the reason is... "because it doesn't".
I don't think it's anything to do with single-table vs. multi-table queries:
This works fine:
select *
from
person p inner join user u on u.person_id = p.person_id
But this fails:
select p.person_id, *
from
person p inner join user u on u.person_id = p.person_id
While this works:
select p.person_id, p.*, u.*
from
person p inner join user u on u.person_id = p.person_id
It might be some historical compatibility thing with 20-year old legacy code.
Another for the "but why!!!" bucket, along with: why can't you group by an alias?
The use case for the alias.* format is as follows:
select parent.*, child.col
from parent join child on parent.parent_id = child.parent_id
That is, selecting all the columns from one table in a join, plus (optionally) one or more columns from other tables.
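A quick demonstration of that use case in Python's sqlite3 (table contents invented): alias.* expands to all columns of one table while the other table contributes only the named column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE parent (parent_id INTEGER, name TEXT);
    CREATE TABLE child (parent_id INTEGER, col TEXT);
    INSERT INTO parent VALUES (1, 'p1');
    INSERT INTO child VALUES (1, 'c1');
""")

# parent.* expands to all of parent's columns only; child contributes
# just the single named column.
rows = cur.execute("""
    SELECT parent.*, child.col
    FROM parent
    JOIN child ON parent.parent_id = child.parent_id
""").fetchall()
print(rows)  # [(1, 'p1', 'c1')]
```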
The fact that you can use it to select the same column twice is just a side-effect. There is no real point in selecting the same column twice, and I don't think laziness is a real justification.
select * in the real world is only dangerous when referring to columns by index number after retrieval rather than by name; the bigger problem is inefficiency when not all columns are required in the result set (network traffic, CPU, and memory load).
Of course, if you're adding columns from other tables (as is the case in this example), it can be dangerous: those tables may over time gain columns with matching names, and select *, x would then fail if a column x is added to a table that previously didn't have it.
why must Oracle need a table alias to be able to work out the select
Teradata requires the same. As both are quite old (maybe better to call them mature :-) DBMSes, this might be for historical reasons.
My usual explanation is: an unqualified * means everything/all columns, and the parser/optimizer is simply confused because you are requesting more than everything.