Update table with combined field from other tables - sql

I have a table with actions that are being due in the future. I have a second table that holds all the cases, including the due date of the case. And I have a third table that holds numbers.
The problems is as follows. Our system automatically populates our table with future actions. For some clients however, we need to change these dates. I wanted to create an update query for this, and have this run through our scheduler. However, I am kind of stuck at the moment.
What I have on code so far is this:
UPDATE proxima_gestion p
SET fecha = (SELECT To_char(d.f_ult_vencim + c.hrem01, 'yyyyMMdd')
FROM deuda d,
c4u_activity_dates c,
proxima_gestion p
WHERE d.codigo_cliente = c.codigo_cliente
AND p.n_expediente = d.n_expediente
AND d.saldo > 1000
AND p.tipo_gestion_id = 914
AND p.codigo_oficina = 33
AND d.f_ult_vencim > sysdate)
WHERE EXISTS (SELECT *
FROM proxima_gestion p,
deuda d
WHERE p.n_expediente = d.n_expediente
AND d.saldo > 1000
AND p.tipo_gestion_id = 914
AND p.codigo_oficina = 33
AND d.f_ult_vencim > sysdate)
The field fecha is the current action date. Unfortunately, this is saved as a char instead of date. That is why I need to convert the date back to a char. F_ult_vencim is the due date, and hrem01 is the number of days the actions should be placed away from the due date. (for example, this could be 10, making the new date 10 days after the due date)
Apart from that, there are a few more criteria when we need to change the date (certain creditors, certain departments, only for future cases and starting from a certain amount, only for a certain action type.)
However, when I try and run this query, I get the error message
ORA-01427: single-row subquery returns more than one row
If I run both subqueries seperately, I get 2 results from both. What I am trying to accomplish, is that it connects these 2 queries, and updates the field to the new value. This value will be different for every case, as every due date will be different.
Is this even possible? And if so, how?

You're getting the error because the first SELECT is returning more than one row for each row in the table being updated.
The first thing I see is that the alias for the table in UPDATE is the same as the alias in both SELECTs (p). So all of the references to p in the subqueries are referencing proxima_gestion in the subquery rather than the outer query. That is, the subquery is not dependent on the outer query, which is required for an UPDATE.
Try removing "proxima_gestion p" from FROM in both subqueries. The references to p, then, will be to the outer UPDATE query.

Related

How to resolve a Left Join sub-query error?

In MS-Access I'm trying to join three tables. The third table is created from a sub-query designed to aggregate dates because I don't want multiple records per day when aligned with the first table.
When I entered the left join sub-query, I got this error:
The field is too small to accept the amount of data you attempted to
add. Try inserting or pasting less data.
I've run the sub-query separately and it returns about 19,000 records. Which is quite smaller than the actual table. If I use the actual table, the query works just fine, but it includes the duplicate records when there is more than one entry per day on the third table.
SELECT
SUM([ACD Calls]),
(SUM([Avg ACD Time]*[ACD Calls])/SUM([ACD Calls]))/86400,
(SUM([Avg ACW Time]*[ACD Calls])/SUM([ACD Calls]))/86400,
((SUM([Hold Time])/SUM([ACD Calls])))/86400,
((SUM([Avg ACD Time]*[ACD Calls])
+ SUM([Avg ACW Time]*[ACD Calls]))/SUM([ACD Calls]))/86400,
SUM([Time Adhering])/SUM([Total Time Scheduled]),
SUM([SS])/SUM([SO])
FROM
(
(
[GroupSumDaily]
LEFT JOIN Adherence_WKLY ON (GroupSumDaily.[Day] = Adherence_WKLY.[Day])
AND (GroupSumDaily.Agent = Adherence_WKLY.Agent)
)
LEFT JOIN
(
SELECT Evaluation_List.[Agent],
Evaluation_List.Recording_Date,
SUM(Evaluation_List.[Score]) as SS,
SUM(Evaluation_List.[Out of]) as SO
From Evaluation_List
Group By Evaluation_List.[Recording_Date],
Evaluation_list.[Agent]
)
as Evals ON (GroupSumDaily.[Day] = Evals.[Recording_Date])
AND (GroupSumDaily.Agent = Evals.Agent)
)
WHERE
(
[GroupSumDaily].[Agent] = "LastName FirstName"
AND Month([GroupSumDaily].[Day]) =1
AND Year([GroupSumDaily].[Day]) =2018
AND [GroupSumDaily].[Day] > #2/23/2015#
)
It looks like you don't have a "main" table to query from.
I'd try removing the first two open brackets after the FROM statement (and their equivalent closing brackets.)
If that doesn't fix it, try moving the whole sub-query into a separate query and selecting from the results...
It turns out the subquery fields are automatically limited to 50 characters and this was the root of the problem. When I limited the return to LEFT([Agent], 50), the error disappeared. Is there a way to set character length is a subquery field?
The other odd things is, none of my fields were actually over 50 characters... when I ran Select [Agent] Where LEN([Agent]) >= 50, it returned only 1 records, and it was the "NEW" blank record from the bottom. I confirmed that it completely blank, with no spaces or tabs. Very confusing.

MS Access - SQL Inner Join two conditions and Len-Function

I have a database with a table which I use as master and which is being updated and extended on a daily basis by a table with the same layout. Before I update almost the whole master with daily data, I want to test if the values from a specific column changed during the daily update. Usually this column only contains Null or an "X".
As a prototype I only compared the specific column of Table A and Table B and if there is a difference, set a value with more than one characters into the column (here yesterday's date).
This is the code which worked as a prototype:
UPDATE ReiseMaster
INNER JOIN Update_Import
ON(ReiseMaster.Col3 <> Update_Import.Col3
SET ReiseMaster.Col3 = Date() - 1
Now, the column contains Null, "X" or a date in the master. For the next update I now have to make sure that this previously updated column values which are containing a date as a string will be excluded (otherwise ReiseMaster.Col3 <> Update_Import.Col3 will always be true for them in the future and the date will always be updated which is not intended to happen).
My approach was to exclude all datasets from the master table where the length of the values in the column is longer than 1.
Now here is my problem:
Running the SQL code makes MS Access not responding anymore, the whole program crashes. Can somebody advise me what could be wrong with the following code?
UPDATE ReiseMaster
INNER JOIN ReiseMaster_Import
ON(ReiseMaster.`Attachment Indicator` <> ReiseMaster_Import.`Attachment Indicator` AND LEN(ReiseMaster.`Attachment Indicator`) <= 1)
SET ReiseMaster.`Attachment Indicator` = Date() - 1
Additional info: I use Access VBA to run a code and during that also the SQL-statements which are being saved in a string. About the reason I add a date once I observe a change, I want to use the dates as a reference when the value has been changed for the first time to do further analysis with them in a later stage.
Avoid using complex joins in update queries! Since the entire recordset needs to be updateable, Access tends to have problems with it.
Instead, use a WHERE clause:
UPDATE ReiseMaster
INNER JOIN ReiseMaster_Import
ON(ReiseMaster.[Attachment Indicator] <> ReiseMaster_Import.[Attachment Indicator])
SET ReiseMaster.[Attachment Indicator] = Date() - 1
WHERE LEN(ReiseMaster.[Attachment Indicator]) <= 1
Also, Access uses brackets to escape spaces in column names.
Note that if you're not using any information from the joined table, and just use it to select records, you should use an Exists clause instead:
UPDATE ReiseMaster
SET ReiseMaster.[Attachment Indicator] = Date() - 1
WHERE EXISTS(SELECT 1 FROM ReiseMaster_Import WHERE ReiseMaster.[Attachment Indicator] <> ReiseMaster_Import.[Attachment Indicator])
AND LEN(ReiseMaster.[Attachment Indicator]) <= 1

Duplicate Results in Oracle SQL Plus

When I run the script below, I keep getting duplicate results, even when using distinct.
select distinct
a.SDT, a.fNo, b.IDType, b.pNo, b.pfName, b.plName, b.PDoB, b.Street, b.City, c.Phone
from Scheduled_Flight a, Passenger b, pass_Phone c
where fNo = '0000021'
and
a.SDT = '08-sep-2017 17:30';
I am new to SQL and any help would be much appreciated into solving this issue.
"I keep getting duplicate results, even when using distinct"
You are not getting duplicates in your result set. Rather you have a Cartesian product which is a combination of ONE flight, THREE passengers and THREE phone numbers. Each record in the set is unique so distinct doesn't have any affect.
The problem is you have no join conditions in your from clause. There should be a column on passenger which is the foreign key on flight, and a column on pass_phone which is the foreign key on passenger.
It is easy to fix: you just need to join the tables. Assuming your data model is consistent, your query should look like this (and you don't need DISTINCT):
select a.SDT, a.fNo, b.IDType, b.pNo, b.pfName, b.plName, b.PDoB, b.Street, b.City,c.Phone
from Scheduled_Flight a
join Passenger b on b.fNo = a.fNo
join pass_Phone c on c.pNo = b.bNo
where a.fNo = '0000021'
and a.SDT = '08-sep-2017 17:30';
However, I notice that in your version of the query you didn't prefix fNo. That makes me think you don't have a column of that name on passenger (otherwise the query would have failed on ORA-00918: column ambiguously defined). So, either the foreign key columns are named differently or you haven't got them.
"Is it possible to specify only the date without the time?"
Yep. Use an ANSI date literal e.g. date '2017-09-08'
"Is it possible to specify only the date without the time to still produce results from the database?"
That depends on the how the data is stored. Oracle dates are stored with a time element. If no time is specified (or the time element is truncated) then the time element defaults to midnight. This often catches beginners out, for instance because the pseudo-column sysdate returns the current date and time, not just the current date.
So, if you know the dates are stored in your table without a time element you can do this:
where a.sdt = date '2017-09-08'
But if you don't know that, you can truncate ...
where trunc(a.sdt) = date '2017-09-08'
or test for a range
where a.sdt >= date '2017-09-08'
and a.sdt < date '2017-09-09'
"How come the following code is still producing duplicate results?
select distinct r.sNo, r.tCode, s.fNo, s.SDT
from Airplane r, Scheduled_Flight s
where SDT >= SYSDATE -1;
The airplane attribute cannot have the s.SDT attribute."
Without seeing the output I can't be sure but I would bet that this query does not produce duplicate records either. What you have is a product combining all your AIRPLANE records with all your FLIGHT records matching the sdt filter.
This is another data modelling problem. Of course aeroplanes don't have a flight time: one aeroplane makes many flights. But it makes perfect sense for a flight to be assigned to a plane. In fact that's crucial to ensuring that you don't have more flights than you have planes to fly them, and that one plane isn't planned to take off from London for Madrid at a time when it's planned to be half-way to Hong Kong.
You really should use the ANSI 92 syntax, as I showed in my answer to your previous posted code. The explicit joins not only make it easier to understand the query but they prevent mistakes like this. The fact that you apparently don't have any candidate columns to make the join immediately highlights the flaw in the data model.
select distinct r.sNo, r.tCode, s.fNo, s.SDT
from Airplane r
INNER JOIN Scheduled_Flight s ON ????
where SDT >= SYSDATE -1;
i don't see any rows which are duplicated, if you compare every column of each row, each row is uniquely identified, since you are doing cartesian product you are getting multiple records. but each rows are unique to each other.

SQL query seems to work for 'AND T1.email_address_ IN (subquery)', but returns 0 rows for 'AND T1.email_address_ NOT IN (subquery)'

Good morning. I'm working in Responsys Interact, which is an Oracle-based email campaign management type SAAS product. I'm creating a query to basically filter a target list for an email campaign designed to target a specific sub-set of our master email contact list. Here's the query I created a few weeks ago that appears to work:
/*
Table Symbolic Name
CONTACTS_LIST $A$
Engaged $B$
TRANSACTIONS_RAW $C$
TRANSACTION_LINES_RAW $D$
-- A Responsys Filter (Engaged) will return only an RIID_, nothing else, according to John # Responsys....so,....let's join on that to contact list...
*/
SELECT
DISTINCT $A$.EMAIL_ADDRESS_,
$A$.RIID_,
$A$.FIRST_NAME,
$A$.LAST_NAME,
$A$.EMAIL_PERMISSION_STATUS_
FROM
$A$
JOIN $B$ ON $B$.RIID_ = $A$.RIID_
LEFT JOIN $C$ ON $C$.EMAIL_ADDRESS_ = $A$.EMAIL_ADDRESS_
LEFT JOIN $D$ ON $D$.TRANSACTION_ID = $C$.TRANSACTION_ID
WHERE
$A$.EMAIL_DOMAIN_ NOT IN ('none.com', 'noemail.com', 'mailinator.com', 'nomail.com') AND
/* don't include hp customers */
$A$.HP_PLAN_START_DATE IS NULL AND
$A$.EMAIL_ADDRESS_ NOT IN
(
SELECT
$C$.EMAIL_ADDRESS_
FROM
$C$
JOIN $D$ ON $D$.TRANSACTION_ID = $C$.TRANSACTION_ID
WHERE
/* Get only purchase transactions for certain item_id's/SKU's */
($D$.ITEM_FAMILY_ID IN (3,4,5,8,14,15) OR $D$.ITEM_ID IN (704,769,1893,2808,3013) ) AND
/* .... within last 60 days (i.e. 2 months) */
$A$.TRANDATE > ADD_MONTHS(CURRENT_TIMESTAMP, -2)
)
;
This seems to work, in that if I run the query without the sub-query, we get 720K rows; and if I add back the 'AND NOT IN...' subquery, we get about 700K rows, which appears correct based on what my user knows about her data. What I'm (supposedly) doing with the NOT IN subquery is filtering out any email addresses where the customer has purchased certain items from us in the last 60 days.
So, now I need to add in another constraint. We still don't want customers who made certain purchases in the last 60 days as above, but now also we want to exclude customers who have purchased another particular item, but now within the last 12 months. So, I thought I would add another subquery, as shown below. Now, this has introduced several problems:
Performance - the query, which took a couple minutes to run before, now takes quite a few more minutes to run - in fact it seems to time out....
So, I wondered if there's an issue having two subqueries, but before I went to think about alternatives to this, I decided to test my new subquery by temporarily deleting the first subquery, so that I had just one subquery similar to above, but with the new item = 11 and within the last 12 months logic. And so with this, the query finally returned after a few minutes now, but with zero rows.
Trying to figure out why, I tried simply changing the AND NOT IN (subquery) to AND IN (subquery), and that worked, in that it returned a few thousand rows, as expected.
So why would the same SQL when using AND IN (subquery) "work", but the exact same SQL simply changed to AND NOT IN (subquery) return zero rows, instead of what I would expect which would be my 700 something thousdand plus rows, less the couple thousand encapsulated by the subquery result?
Also, what is the best i.e. most performant way to accomplish what I'm trying to do, which is filter by some purchases made within one date range, AND by some other purchases made within a different date range?
Here's the modified version:
SELECT
DISTINCT $A$.EMAIL_ADDRESS_,
$A$.RIID_,
$A$.FIRST_NAME,
$A$.LAST_NAME,
$A$.EMAIL_PERMISSION_STATUS_
FROM
$A$
JOIN $B$ ON $B$.RIID_ = $A$.RIID_
LEFT JOIN $C$ ON $C$.EMAIL_ADDRESS_ = $A$.EMAIL_ADDRESS_
LEFT JOIN $D$ ON $D$.TRANSACTION_ID = $C$.TRANSACTION_ID
WHERE
$A$.EMAIL_DOMAIN_ NOT IN ('none.com', 'noemail.com', 'mailinator.com', 'nomail.com') AND
/* don't include hp customers */
$A$.HP_PLAN_START_DATE IS NULL AND
$A$.EMAIL_ADDRESS_ NOT IN
(
SELECT
$C$.EMAIL_ADDRESS_
FROM
$C$
JOIN $D$ ON $D$.TRANSACTION_ID = $C$.TRANSACTION_ID
WHERE
/* Get only purchase transactions for certain item_id's/SKU's */
($D$.ITEM_FAMILY_ID IN (3,4,5,8,14,15) OR $D$.ITEM_ID IN (704,769,1893,2808,3013) ) AND
/* .... within last 60 days (i.e. 2 months) */
$C$.TRANDATE > ADD_MONTHS(CURRENT_TIMESTAMP, -2)
)
AND
$A$.EMAIL_ADDRESS_ NOT IN
(
/* get purchase transactions for another type of item within last year */
SELECT
$C$.EMAIL_ADDRESS_
FROM
$C$
JOIN $D$ ON $D$.TRANSACTION_ID = $C$.TRANSACTION_ID
WHERE
$D$.ITEM_FAMILY_ID = 11 AND $C$.TRANDATE > ADD_MONTHS(CURRENT_TIMESTAMP, -12)
)
;
Thanks for any ideas/insights. I may be missing or mis-remembering some basic SQL concept here - if so please help me out! Also, Responsys Interact runs on top of Oracle - it's an Oracle product - but I don't know off hand what version/flavor. Thanks!
Looks like my problem with the new subquery was due to poor performance due to lack of indexes. Thanks to Alex Poole's comments, I looked in Responsys and there is a facility to get an 'explain' type analysis, and it was throwing warnings, and suggesting I build some indexes. Found the way to do that on the data sources, went back to the explain, and it said, "The query should run without placing an unnecessary burden on the system". And while it still ran for quite a few minutes, it did finally come back with close to the expected number of rows.
Now, I'm on to tackle the other half of the issue, which is to now incorporate this second sub-query in addition to the first, original subquery....
Ok, upon further testing/analysis and refining my stackoverflow search critieria, the answer to the main part of my question dealing with the IN vs. NOT IN can be found here: SQL "select where not in subquery" returns no results
My performance was helped by using Responsys's explain-like feature and adding some indexes, but when I did that, I also happened to add in a little extra SQL in my sub-query's WHERE clause.... when I removed that, even after indexes built, I was back to zero rows returned. That's because as it turned out at least one of the transactions rows for the item family id I was interested in for this additional sub-query had a null value for email address. And as further explained in the link above, when using NOT IN, as soon as you have a null value involved, SQL can't definitively say it's NOT IN, since you can't really compare to null, so as soon as you have a null, the sub-query's going to evaluate 'false', thus zero rows. When using IN, even though there are nulls present, if you get one positive match, well, that's a match, so the sub-query returns 'true', so that's why you'll get rows with IN, but not with NOT IN. I hadn't realized that some of our transaction data may have null email addresses - now I know, so I just added a where not null to the where clause for the email address, and now all's good.

CASE Clause on select clause throwing 'SQLCODE=-811, SQLSTATE=21000' Error

This query is very well working in Oracle. But it is not working in DB2. It is throwing
DB2 SQL Error: SQLCODE=-811, SQLSTATE=21000, SQLERRMC=null, DRIVER=3.61.65
error when the sub query under THEN clause is returning 2 rows.
However, my question is why would it execute in the first place as my WHEN clause turns to be false always.
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
The SQL standard does not require short cut evaluation (ie evaluation order of the parts of the CASE statement). Oracle chooses to specify shortcut evaluation, however DB2 seems to not do that.
Rewriting your query a little for DB2 (8.1+ only for FETCH in subqueries) should allow it to run (unsure if you need the added ORDER BY and don't have DB2 to test on at the moment)
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
ORDER BY ST.SHIPMENT_ID
FETCH FIRST 1 ROWS ONLY
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
Hmm... you're running the same query twice. I get the feeling you're not thinking in sets (how SQL operates), but in a more procedural form (ie, how most common programming languages work). You probably want to rewrite this to take advantage of how RDBMSs are supposed to work:
SELECT Current_Stop.facility_alias_id
FROM SYSIBM/SYSDUMMY1
LEFT JOIN (SELECT MAX(Stop.facility_alias_id) AS facility_alias_id
FROM Stop
JOIN Facility
ON Facility.facility_id = Stop.facility_id
AND Facility.is_dock_sched_fac = 1
WHERE Stop.shipment_id = 2779
HAVING COUNT(*) = 1) Current_Stop
ON 1 = 1
(no sample data, so not tested. There's a couple of other ways to write this based on other needs)
This should work on all RDBMSs.
So what's going on here, why does this work? (And why did I remove the reference to Shipment?)
First, let's look at your query again:
CASE WHEN (SELECT COUNT(1)
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779) = 1
THEN (SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779)
ELSE NULL END
(First off, stop using the implicit-join syntax - that is, comma-separated FROM clauses - always explicitly qualify your joins. For one thing, it's way too easy to miss a condition you should be joining on)
...from this it's obvious that your statement is the 'same' in both queries, and shows what you're attempting - if the dataset has one row, return it, otherwise the result should be null.
Enter the HAVING clause:
HAVING COUNT(*) = 1
This is essentially a WHERE clause for aggregates (functions like MAX(...), or here, COUNT(...)). This is useful when you want to make sure some aspect of the entire set matches a given criteria. Here, we want to make sure there's just one row, so using COUNT(*) = 1 as the condition is appropriate; if there's more (or less! could be 0 rows!) the set will be discarded/ignored.
Of course, using HAVING means we're using an aggregate, the usual rules apply: all columns must either be in a GROUP BY (which is actually an option in this case), or an aggregate function. Because we only want/expect one row, we can cheat a little, and just specify a simple MAX(...) to satisfy the parser.
At this point, the new subquery returns one row (containing one column) if there was only one row in the initial data, and no rows otherwise (this part is important). However, we actually need to return a row regardless.
FROM SYSIBM/SYSDUMMY1
This is a handy dummy table on all DB2 installations. It has one row, with a single column containing '1' (character '1', not numeric 1). We're actually interested in the fact that it has only one row...
LEFT JOIN (SELECT ... )
ON 1 = 1
A LEFT JOIN takes every row in the preceding set (all joined rows from the preceding tables), and multiplies it by every row in the next table reference, multiplying by 1 in the case that the set on the right (the new reference, our subquery) has no rows. (This is different from how a regular (INNER) JOIN works, which multiplies by 0 in the case that there is no row) Of course, we only maybe have 1 row, so there's only going to be a maximum of one result row. We're required to have an ON ... clause, but there's no data to actually correlate between the references, so a simple always-true condition is used.
To get our data, we just need to get the relevant column:
SELECT Current_Stop.facility_alias_id
... if there's the one row of data, it's returned. In the case that there is some other count of rows, the HAVING clause throws out the set, and the LEFT JOIN causes the column to be filled in with a null (no data) value.
So why did I remove the reference to Shipment? First off, you weren't using any data from the table - the only column in the result set was from the subquery. I also have good reason to believe that there would only be one row returned in this case - you're specifying a single shipment_id value (which implies you know it exists). If we don't need anything from the table (including the number of rows in that table), it's usually best to remove it from the statement: doing so can simplify the work the db needs to do.