So, I have 2 tables. For simplicity's sake I'll show two columns for one and only one for the other, but they have a lot more data.
The tables look like this:
+---------------------+
|       table 1       |
+----------+----------+
| gas_deu  | est_pag  |
+----------+----------+
| 56857    | (null)   |
| 60857    | (null)   |
| 80857    | (null)   |
+----------+----------+

+----------+
| table 2  |
+----------+
| gas_pag  |
+----------+
| 56857    |
| 21000    |
| 75857    |
+----------+
Table 1 and table 2 can be joined using id_edi and nr_dep (same names in both tables).
What's happening here is basically the following:
In table 1, gas_deu is an amount owed by someone.
In table 2, gas_pag is how much has been paid, so gas_deu - gas_pag should give 0 (or a negative number) if gas_deu was paid in full, or a positive number if it was only partially paid.
Also, some rows from table 1 are not in table 2, meaning gas_deu has not been paid at all.
What I need to do is update est_pag in table 1 as follows:
if gas_deu - gas_pag <= 0 (debt paid), set est_pag to 1
if gas_deu - gas_pag > 0 (partially paid), set est_pag to 2
if a row is in table 1 but not in table 2, it has not been paid at all, so set est_pag to 3
I have tried a lot, and I mean A LOT, of code in the update for this. The tables have a lot of columns, so I won't post everything I've tried because it would only get more confusing.
I mostly tried using SELECT subqueries in the SET of the update, and in the WHERE of the update. All of them give me the single-row error; I know this happens because the subquery returns more than one value, but then how would I do this? I don't see a way to get only one value from the query and still update only the rows that match each debt status.
Using CASE was the logical choice for me, but it always seems to return more than one row. I tried (case when gas_deu-gas_pag<=0 then 1 else est_pag end), since if I could get at least one value in there it would be a start, but I get the same more-than-one-row problem.
Any help or suggestions are highly appreciated. I have already tried everything I could think of, and a lot of answers from here on Stack Overflow, but I still can't get it to work.
Edit: adding what table 1 should look like after updating
+---------------------+
|       table 1       |
+----------+----------+
| gas_deu  | est_pag  |
+----------+----------+
| 56857    | 1        |
| 60857    | 2        |
| 80857    | 2        |
+----------+----------+
Update 2:
update table1
set est_pag=(select (case when (select min((case when gasto.gas_deu-
pago.gas_pag<=0 then 0 else null end))
from table2 pago full outer join gasto_comun_pruebaestpago gasto on
pago.nr_dep=gasto.nr_dep
where gasto.nr_dep=pago.nr_dep and gasto.id_edi=pago.id_edi)=0 then 1 else
null end)
from table2 pago full outer join gasto_comun_pruebaestpago gasto on
pago.nr_dep=gasto.nr_dep
where gasto.nr_dep=pago.nr_dep and gasto.id_edi=pago.id_edi)
where est_pag is null;
This is one of many pieces of code I tried. This one changes all values to 1 because of the min() in there, which outputs just one row with a 0; that then gets checked by the case, 0=0, so everything goes to 1. The problem is that, without the min(), the select does what I need, but the update throws the 'single-row subquery returns more than one row' error again.
You can update through a left outer join. It is possible the second table has more than one payment for the same account (customers may make partial payments), so I aggregate the second table by ID first. You didn't provide input data for testing, so in my small test I assume the tables are matched by id (primary key in the first table, foreign key in the second table - even though I didn't write the constraints into the table definitions).
The way I wrote the update, a 3 will be assigned if an account is not present in the second table, but also if it is present but with NULL in the gas_pag column. (It would be best if that column was declared NOT NULL from the outset!) If a different handling is desired in that case, you didn't say; it can be accommodated easily though.
So, here goes.
TABLE CREATION
create table t1 ( id number, gas_deu number, est_pag number );
insert into t1
select 101, 56857, null from dual union all
select 104, 60857, null from dual union all
select 108, 80857, null from dual
;
create table t2 ( id number, gas_pag number ) ;
insert into t2
select 101, 56857 from dual union all
select 104, 60000 from dual
;
select * from t1;
ID GAS_DEU EST_PAG
---------- ---------- ----------
101 56857
104 60857
108 80857
select * from t2;
ID GAS_PAG
---------- ----------
101 56857
104 60000
UPDATE STATEMENT
update
( select est_pag, gas_deu, gas_pag
from t1 left join
( select id, sum(gas_pag) as gas_pag from t2 group by id ) x
on t1.id = x.id
)
set est_pag = case when gas_deu <= gas_pag then 1
when gas_deu > gas_pag then 2
else 3
end
;
select * from t1;
ID GAS_DEU EST_PAG
---------- ---------- ----------
101 56857 1
104 60857 2
108 80857 3
CLEAN-UP
drop table t1 purge;
drop table t2 purge;
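The update above relies on Oracle's ability to update an inline join view (which requires the join to be key-preserved). As a cross-check of the same classification logic, here is a minimal sketch using Python's sqlite3 — SQLite cannot update a join view, so correlated subqueries stand in for the join; the t1/t2 names and data follow the demo above:

```python
import sqlite3

# Sketch of the same debt-status classification in SQLite. SQLite has no
# updatable join views, so correlated subqueries replace the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, gas_deu INTEGER, est_pag INTEGER);
    CREATE TABLE t2 (id INTEGER, gas_pag INTEGER);
    INSERT INTO t1 VALUES (101, 56857, NULL), (104, 60857, NULL), (108, 80857, NULL);
    INSERT INTO t2 VALUES (101, 56857), (104, 60000);
""")
conn.execute("""
    UPDATE t1
    SET est_pag = CASE
        -- SUM() collapses multiple payments per id into a single value,
        -- so the subquery can never return more than one row
        WHEN gas_deu <= (SELECT SUM(gas_pag) FROM t2 WHERE t2.id = t1.id) THEN 1
        WHEN gas_deu >  (SELECT SUM(gas_pag) FROM t2 WHERE t2.id = t1.id) THEN 2
        ELSE 3  -- no matching row (or NULL payments): not paid at all
    END
""")
rows = conn.execute("SELECT id, est_pag FROM t1 ORDER BY id").fetchall()
print(rows)  # [(101, 1), (104, 2), (108, 3)]
```

The aggregation inside the subquery plays the same role as the grouped inline view x above: it guarantees one value per id, which is exactly what the single-row-subquery error was complaining about.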
Related
I have the following table called table_persons in Hive:
+--------+------+------------+
| people | type | date |
+--------+------+------------+
| lisa | bot | 19-04-2022 |
| wayne | per | 19-04-2022 |
+--------+------+------------+
If type is "bot", I have to add two rows to the table d1_info; else if type is "per", I only have to add one row, so the result is the following:
+---------+------+------------+
| db_type | info | date |
+---------+------+------------+
| x_bot | x | 19-04-2022 |
| x_bnt | x | 19-04-2022 |
| x_per | b | 19-04-2022 |
+---------+------+------------+
How can I add two rows if this condition is met?
with a Case When maybe?
You may try using a union to duplicate the rows with bot. The following example unions a first query, which selects all records, with a second query that selects only those with bot.
Edit
In response to the edited question, I have added an additional parity column (storing 1 or 0) named original to differentiate the duplicated entries.
SELECT
p1.*,
1 as original
FROM
table_persons p1
UNION ALL
SELECT
p1.*,
0 as original
FROM
table_persons p1
WHERE p1.type='bot'
You may then insert this into your other table d1_info, using the above query as a subquery or CTE with the desired CASE-expression transformations, e.g.
INSERT INTO d1_info
(`db_type`, `info`, `date`)
WITH merged_data AS (
SELECT
p1.*,
1 as original
FROM
table_persons p1
UNION ALL
SELECT
p1.*,
0 as original
FROM
table_persons p1
WHERE p1.type='bot'
)
SELECT
CONCAT('x_',CASE
WHEN m1.type='per' THEN m1.type
WHEN m1.original=1 AND m1.type='bot' THEN m1.type
ELSE 'bnt'
END) as db_type,
CASE
WHEN m1.type='per' THEN 'b'
ELSE 'x'
END as info,
m1.date
FROM
merged_data m1
ORDER BY m1.people,m1.date;
See working demo db fiddle here
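Since the fiddle link may not survive, here is the same UNION ALL duplication sketched with Python's sqlite3 — note this is SQLite, not Hive, so Hive's CONCAT is replaced by SQLite's || operator; the table name and data follow the question:

```python
import sqlite3

# Sketch of the UNION ALL duplication: every 'bot' row appears twice in the
# CTE, tagged original=1 or original=0, and the CASE maps each copy.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_persons (people TEXT, type TEXT, date TEXT);
    INSERT INTO table_persons VALUES
        ('lisa',  'bot', '19-04-2022'),
        ('wayne', 'per', '19-04-2022');
""")
rows = conn.execute("""
    WITH merged_data AS (
        SELECT p1.*, 1 AS original FROM table_persons p1
        UNION ALL
        SELECT p1.*, 0 AS original FROM table_persons p1 WHERE p1.type = 'bot'
    )
    SELECT 'x_' || CASE
               WHEN m1.type = 'per' THEN m1.type
               WHEN m1.original = 1 AND m1.type = 'bot' THEN m1.type
               ELSE 'bnt'   -- the duplicated bot row becomes x_bnt
           END AS db_type,
           CASE WHEN m1.type = 'per' THEN 'b' ELSE 'x' END AS info,
           m1.date
    FROM merged_data m1
    ORDER BY db_type
""").fetchall()
print(rows)
```

The output has the three expected rows: x_bnt/x, x_bot/x, and x_per/b, each with the original date.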
I think what you want is to create a new table that captures your logic. This would simplify your query and make it so you could easily add new types without having to edit logic of a case statement. It may also make it cleaner to view your logic later.
CREATE TABLE table_persons (
`people` VARCHAR(5),
`type` VARCHAR(3),
`date` VARCHAR(10)
);
INSERT INTO table_persons
VALUES
('lisa', 'bot', '19-04-2022'),
('wayne', 'per', '19-04-2022');
CREATE TABLE info (
`type` VARCHAR(5),
`db_type` VARCHAR(5),
`info` VARCHAR(1)
);
insert into info
values
('bot', 'x_bot', 'x'),
('bot', 'x_bnt', 'x'),
('per','x_per','b');
and then you can easily do a join:
select
info.db_type,
info.info,
persons.date date
from
table_persons persons inner join info
on
info.type = persons.type
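The nice property of this mapping-table design is that the join itself produces the duplication: 'bot' maps to two info rows, so each bot person yields two output rows with no CASE logic at all. A runnable sketch with Python's sqlite3 (SQLite here rather than the question's Hive, but the join semantics are the same):

```python
import sqlite3

# Mapping-table approach: 'bot' has two rows in info, so joining fans each
# bot person out to two result rows automatically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_persons (people TEXT, type TEXT, date TEXT);
    INSERT INTO table_persons VALUES
        ('lisa',  'bot', '19-04-2022'),
        ('wayne', 'per', '19-04-2022');
    CREATE TABLE info (type TEXT, db_type TEXT, info TEXT);
    INSERT INTO info VALUES
        ('bot', 'x_bot', 'x'),
        ('bot', 'x_bnt', 'x'),
        ('per', 'x_per', 'b');
""")
rows = conn.execute("""
    SELECT info.db_type, info.info, persons.date
    FROM table_persons persons
    JOIN info ON info.type = persons.type
    ORDER BY info.db_type
""").fetchall()
print(rows)
```

Adding a new type later means inserting rows into info, not editing a CASE expression.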
2 tables
table_1 rows: NOTE: id 2 has two rows
-----------------------
| id | counts | track |
-----------------------
| 1 | 10 | 1 |
| 2 | 10 | 2 |
| 2 | 10 | 3 |
-----------------------
table_2 rows
---------------
| id | counts |
---------------
| 1 | 0 |
| 2 | 0 |
---------------
Query:
with t1_rows as (
select id, sum(counts) as counts, track
from table_1
group by id, track
)
update table_2 set counts = (coalesce(table_2.counts, 0) + t1.counts)::float
from t1_rows t1
where table_2.id = t1.id;
select * from table_2;
When I ran the above query, I got this table_2 output:
---------------
| id | counts |
---------------
| 1 | 10 |
| 2 | 10 | (expected counts as 20 but got 10)
---------------
I noticed that the above update query considers only the first match and skips the rest.
I can make it work by changing the query as below. Now table_2 updates as expected, since there are no duplicate rows coming from table_1.
But I would like to know why my previous query is not working. Is there anything wrong with it?
with t1_rows as (
select id, sum(counts) as counts, array_agg(track) as track
from table_1
group by id
)
update table_2 set counts = (coalesce(table_2.counts, 0) + t1.counts)::float
from t1_rows t1
where table_2.id = t1.id;
Schema
CREATE TABLE IF NOT EXISTS table_1(
id varchar not null,
counts integer not null,
track integer not null
);
CREATE TABLE IF NOT EXISTS table_2(
id varchar not null,
counts integer not null
);
insert into table_1(id, counts, track) values('1', 10, 1), ('2', 10, 2), ('2', 10, 3);
insert into table_2(id, counts) values('1', 0), ('2', 0);
The problem is that an UPDATE in PostgreSQL creates a new version of the row rather than changing the row in place, but the new row version is not visible in the snapshot of the current query. So from the point of view of the query, the row “vanishes” when it is updated the first time.
The documentation says:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
So if I read your question correctly, you expect rows 2 and 3 from table_1 to be added together? If so, the reason your first approach didn't work is that it grouped by id, track.
Since rows 2 and 3 have different values in the track column, they didn't get added together by the group by clause.
Your second approach worked because it grouped by id alone.
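The same fix can be sketched with Python's sqlite3 — SQLite rather than PostgreSQL, and a correlated subquery instead of UPDATE ... FROM for portability, but the principle is identical: aggregate table_1 down to one value per id before the update, so each target row sees exactly one source value:

```python
import sqlite3

# Pre-aggregating the source table means each target row matches exactly
# one value, so no match can be silently skipped.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id TEXT, counts INTEGER, track INTEGER);
    CREATE TABLE table_2 (id TEXT, counts INTEGER);
    INSERT INTO table_1 VALUES ('1', 10, 1), ('2', 10, 2), ('2', 10, 3);
    INSERT INTO table_2 VALUES ('1', 0), ('2', 0);
""")
conn.execute("""
    UPDATE table_2
    SET counts = COALESCE(table_2.counts, 0) +
                 (SELECT SUM(t1.counts) FROM table_1 t1 WHERE t1.id = table_2.id)
""")
rows = conn.execute("SELECT id, counts FROM table_2 ORDER BY id").fetchall()
print(rows)  # [('1', 10), ('2', 20)]
```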
I am performing some queries using PostgreSQL's SELECT DISTINCT ON syntax. I would like the query to return the total number of rows alongside every result row.
Assume I have a table my_table like the following:
CREATE TABLE my_table(
id int,
my_field text,
id_reference bigint
);
I then have a couple of values:
id | my_field | id_reference
----+----------+--------------
1 | a | 1
1 | b | 2
2 | a | 3
2 | c | 4
3 | x | 5
Basically my_table contains some versioned data. The id_reference is a reference to a global version of the database. Every change to the database will increase the global version number and changes will always add new rows to the tables (instead of updating/deleting values) and they will insert the new version number.
My goal is to perform a query that will only retrieve the latest values in the table, alongside with the total number of rows.
For example, in the above case I would like to retrieve the following output:
| total | id | my_field | id_reference |
+-------+----+----------+--------------+
| 3 | 1 | b | 2 |
+-------+----+----------+--------------+
| 3 | 2 | c | 4 |
+-------+----+----------+--------------+
| 3 | 3 | x | 5 |
+-------+----+----------+--------------+
My attempt is the following:
select distinct on (id)
count(*) over () as total,
*
from my_table
order by id, id_reference desc
This returns almost the correct output, except that total is the number of rows in my_table instead of being the number of rows of the resulting query:
total | id | my_field | id_reference
-------+----+----------+--------------
5 | 1 | b | 2
5 | 2 | c | 4
5 | 3 | x | 5
(3 rows)
As you can see it has 5 instead of the expected 3.
I can fix this by using a subquery and count as an aggregate function:
with my_values as (
select distinct on (id)
*
from my_table
order by id, id_reference desc
)
select count(*) over (), * from my_values
Which produces my expected output.
My question: is there a way to avoid using this subquery and have something similar to count(*) over () return the result I want?
You are looking at my_table 3 ways:
to find the latest id_reference for each id
to find my_field for the latest id_reference for each id
to count the distinct number of ids in the table
I therefore prefer this solution:
select
c.id_count as total,
a.id,
a.my_field,
b.max_id_reference
from
my_table a
join
(
select
id,
max(id_reference) as max_id_reference
from
my_table
group by
id
) b
on
a.id = b.id and
a.id_reference = b.max_id_reference
join
(
select
count(distinct id) as id_count
from
my_table
) c
on true;
This is a bit longer (especially the long thin way I write SQL) but it makes it clear what is happening. If you come back to it in a few months time (somebody usually does) then it will take less time to understand what is going on.
The "on true" at the end is a deliberate cartesian product because there can only ever be exactly one result from the subquery "c" and you do want a cartesian product with that.
There is nothing necessarily wrong with subqueries.
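The structure of this answer can be checked with a quick Python sqlite3 sketch — SQLite has no DISTINCT ON, which incidentally shows the approach is portable: subquery b finds the latest id_reference per id, subquery c counts the distinct ids, and the deliberate cartesian join attaches that single count to every row:

```python
import sqlite3

# Three looks at my_table: latest reference per id (b), distinct-id count
# (c), and the base rows (a) to recover my_field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER, my_field TEXT, id_reference INTEGER);
    INSERT INTO my_table VALUES (1,'a',1), (1,'b',2), (2,'a',3), (2,'c',4), (3,'x',5);
""")
rows = conn.execute("""
    SELECT c.id_count AS total, a.id, a.my_field, b.max_id_reference
    FROM my_table a
    JOIN (SELECT id, MAX(id_reference) AS max_id_reference
          FROM my_table GROUP BY id) b
      ON a.id = b.id AND a.id_reference = b.max_id_reference
    JOIN (SELECT COUNT(DISTINCT id) AS id_count FROM my_table) c
      ON 1 = 1   -- deliberate cartesian product with the one-row count
    ORDER BY a.id
""").fetchall()
print(rows)  # [(3, 1, 'b', 2), (3, 2, 'c', 4), (3, 3, 'x', 5)]
```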
I have an SQL table with one bit column. Only one bit value (in my case 1) occurs in the rows of the table. How can I write a SELECT statement that shows the occurrence of both bit values, even when one of them does not occur? This is the result I'm trying to achieve:
+----------+--------+
| IsItTrue | AMOUNT |
+----------+--------+
| 1 | 12 |
| 0 | NULL |
+----------+--------+
I already tried to google the answer but without success, as English is not my native language and I am not that familiar with SQL jargon.
select IsItTrue, count(id) as Amount from
(select IsItTrue, id from table
union
select 1 as IsItTrue, null as id
union
select 0 as IsItTrue, null as id) t
group by IsItTrue
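A runnable sketch of this with Python's sqlite3 (the table name flags and the three sample rows are made up for the demo, since the question names neither). One detail: COUNT(id) over the padded rows yields 0 rather than NULL for the missing bit value, so NULLIF is added here to reproduce the NULL shown in the desired output:

```python
import sqlite3

# Padding the data with one NULL-id row per bit value guarantees both
# groups exist; COUNT(id) ignores the padding NULLs when counting.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flags (id INTEGER, IsItTrue INTEGER);
    INSERT INTO flags VALUES (1, 1), (2, 1), (3, 1);
""")
rows = conn.execute("""
    SELECT IsItTrue, NULLIF(COUNT(id), 0) AS Amount
    FROM (SELECT IsItTrue, id FROM flags
          UNION SELECT 1, NULL
          UNION SELECT 0, NULL) t
    GROUP BY IsItTrue
    ORDER BY IsItTrue DESC
""").fetchall()
print(rows)  # [(1, 3), (0, None)]
```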
OK. So this is a hard one to explain, but I am replacing the type of a foreign key in a database. To do this I need to update the values in a table that references it. That is all fine and good, and nice and easy to do.
I'm inserting this stuff into a temporary table which will replace the original table, but the insert query isn't at all difficult, it's the select that I get the values from.
However, I also want to keep any entries where the original reference was NULL. Also not hard; I could use a left outer join for that.
But we're not done yet: I don't want the entries for which there is no match in the second table. I've been dinking around with this for 2 hours now, and am no closer to figuring this out than I am to the moon.
Let me give you an example data set:
____________________________
| Inventory || Customer |
|============||============|
| ID Cust || ID Name |
|------------||------------|
| 1 A || 1 A |
| 2 B || 2 B |
| 3 E || 3 C |
| 4 NULL || 4 D |
|____________||____________|
Let's say the database used to use the Customer.Name field as its Primary Key, and I need to change it to a standard int identity(1,1) not null ID. I've added the field with no issues in the Customer table, and kept the Name because I need it for other stuff. I have had no trouble with this in all the tables that do not allow NULLs, but since the "Inventory" table allows something to be associated with No customer, I'm running into troubles.
If I did a left outer join, my results would be:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
| 3 NULL |
| 4 NULL |
|____________|
However, Inventory #3 was referencing a customer which does not exist. I want that to be filtered out.
This database is my development database, where I hack, slash, and destroy things with wanton disregard for validity. So a lot of links in these tables are no longer valid.
The next step is replicating this process in the beta-testing environment, where bad records shouldn't exist, but I can't guarantee that. So I'd like to keep the filter, if possible.
The query I have right now is using a sub-query to find all rows in Inventory whose CustID either exists in Customers, or is null. It then tries to only grab the value from those rows which the subquery found. Here's the translated query:
insert into results
(
ID,
Cust
)
select
inv.ID, cust.ID
from Inventory inv, Customer cust
where inv.ID in
(
select inv.ID from Inventory inv, Customer cust
where inv.Cust is null
or cust.Name = inv.Cust
)
and cust.Name = inv.Cust
But, as I'm sure you can see, this query isn't right. I've tried using 2, 3 subqueries, inner joins, left joins, bleh. The results of this query, and many others I've tried (that weren't horribly, horribly wrong) are:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
|____________|
Which is essentially an inner-join. Considering my actual data has around 1100 records which have NULL values in that field, I don't think truncating them is the answer.
The answer I'm looking for is:
______________
| Results |
|============|
| ID Cust |
|------------|
| 1 1 |
| 2 2 |
| 4 NULL |
|____________|
The trickiest part of this insert into select is the fact that I'm looking to insert either a value from another table, or essentially a value from this table or the literal NULL. That just isn't something I know how to do; I'm still getting the hang of SQL.
Since I'm inserting the results of this query into a table, I've considered doing the insert using a select which leaves out the NULL values and un-matched records, then going back through and adding in all the NULL records, but I really want to learn how to do the more advanced queries like this.
So do any of yous folks have any ideas? 'Cause I'm lost.
How about a union?
Select all records where Inventory.Cust matches a Customer.Name, and union that with all records where Inventory.Cust is null.
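That suggestion can be sketched with Python's sqlite3, using the example data from the question: the inner join keeps only valid references (so Inventory 3, whose 'E' matches no customer, is filtered out), and the unioned second branch re-appends the no-customer rows with a literal NULL:

```python
import sqlite3

# Union approach: matched rows from an inner join, plus the NULL-customer
# rows appended explicitly. Bad references (Inventory 3) drop out.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (ID INTEGER, Cust TEXT);
    CREATE TABLE Customer  (ID INTEGER, Name TEXT);
    INSERT INTO Inventory VALUES (1,'A'), (2,'B'), (3,'E'), (4,NULL);
    INSERT INTO Customer  VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D');
""")
rows = conn.execute("""
    SELECT inv.ID, cust.ID AS Cust
    FROM Inventory inv
    JOIN Customer cust ON cust.Name = inv.Cust
    UNION ALL
    SELECT inv.ID, NULL
    FROM Inventory inv
    WHERE inv.Cust IS NULL
    ORDER BY ID
""").fetchall()
print(rows)  # [(1, 1), (2, 2), (4, None)]
```

This matches the asker's desired Results table: rows 1, 2, and 4, with row 3 filtered out.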