Execute an SQL UPDATE using GROUP BY and COUNT - sql

I am working with SQL in an SQLite database. I have a table that looks something like this:
STORAGE
------------------------------
REC_ID  SEQ_NO  NAME
------------------------------
100     1       plastic jar
100     2       glass cup
100             fiber rug
101     1       steel fork
101             wool scarf
102     1       leather boots
102     2       paintbox
102     3       cast iron pan
102             toolbox
Keep in mind that this is a very small number of records compared to what I actually have in the table. What I need to do is update the table so that every record with a null SEQ_NO is given the number it should have in sequence within the group of records that share its REC_ID.
Here is what I want the table to look like after the update:
STORAGE
------------------------------
REC_ID  SEQ_NO  NAME
------------------------------
100     1       plastic jar
100     2       glass cup
100     3       fiber rug
101     1       steel fork
101     2       wool scarf
102     1       leather boots
102     2       paintbox
102     3       cast iron pan
102     4       toolbox
So, for example, the toolbox record with REC_ID 102 should have a SEQ_NO of 4, because it is the fourth record with REC_ID 102.
If I do:
SELECT REC_ID, COUNT(*) FROM STORAGE GROUP BY REC_ID;
This returns each REC_ID together with the number (count) of records that have that ID, which is also the number I want to assign to the record in that group with a null SEQ_NO.
Now how would I go about actually updating all of these records with their count values?

This should work:
UPDATE storage
SET seq_no = (SELECT COUNT(*) FROM storage s2 WHERE storage.rec_id = s2.rec_id)
WHERE seq_no IS NULL
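To try the statement on the sample data, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types and the helper name fill_missing_seq_no are assumptions made for illustration. Note that the count-based update only gives the right number when each REC_ID has at most one row with a null SEQ_NO, as in the example; several null rows in the same group would all receive the same count.

```python
import sqlite3

# Sample rows from the question; None stands for a null SEQ_NO.
SAMPLE_ROWS = [
    (100, 1, "plastic jar"),
    (100, 2, "glass cup"),
    (100, None, "fiber rug"),
    (101, 1, "steel fork"),
    (101, None, "wool scarf"),
    (102, 1, "leather boots"),
    (102, 2, "paintbox"),
    (102, 3, "cast iron pan"),
    (102, None, "toolbox"),
]


def fill_missing_seq_no(rows):
    """Load rows into an in-memory table, run the UPDATE, return all rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not show the schema.
    conn.execute(
        "CREATE TABLE storage (rec_id INTEGER, seq_no INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO storage VALUES (?, ?, ?)", rows)
    # For each row with a null seq_no, count every row sharing its rec_id.
    conn.execute(
        """
        UPDATE storage
        SET seq_no = (SELECT COUNT(*) FROM storage AS s2
                      WHERE s2.rec_id = storage.rec_id)
        WHERE seq_no IS NULL
        """
    )
    result = conn.execute(
        "SELECT rec_id, seq_no, name FROM storage ORDER BY rec_id, seq_no"
    ).fetchall()
    conn.close()
    return result
```

Calling fill_missing_seq_no(SAMPLE_ROWS) returns the nine rows with fiber rug, wool scarf and toolbox numbered 3, 2 and 4 respectively.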

Related

hive - Duplicate counts check associated from one to another column

I have a table and am trying to count the distinct values in one column by comparing them against another column. The data runs from millions to billions of rows for each value of the partition column TMKEY.
ID      TNUM  TMKEY
23455   ABCD  1001
23456   ABCD  1001
23455   ABCD  1001
112233  BCDE  1001
113322  BCDE  1001
9009    DDEE  1001
9009    DDEE  1001
1009    FFGG  1001
The desired output is:
total_distinct_tNUM_count  count_of_TNUM_which_has_more_than_distinct_ID  TMKEY
4                          2                                              1001
Here, when TNUM is DDEE, the ID is 9009 in both rows. Because that is one ID repeated, DDEE should not be picked up when counting the TNUM values that have more than one distinct ID. All I'm looking for is these grouped counts. Any suggestions, please? Because I have more than 3 to 4 billion rows, my approach is completely different and I am stuck:
select a.tnum, a.group_id, a.time_week
from (
  SELECT time_week, tnum, count(*) as num_of_rows,
         concat_ws('|', collect_set(id)) as group_id
  from source_table_test1
  where time_week=1001
  group by tnum, time_week
) as a
where length(a.group_id)>16 and num_of_rows>1
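One possible way to get the two counts is to first count distinct IDs per TNUM and then aggregate those per-TNUM counts. The sketch below is not Hive: it runs the query in SQLite through Python's sqlite3 module purely to check the logic on the sample rows. The column names follow the sample table (id, tnum, tmkey), the function name tnum_counts is hypothetical, and nothing here says how the query would perform on billions of rows.

```python
import sqlite3

# Sample rows from the question: (ID, TNUM, TMKEY).
SAMPLE_ROWS = [
    (23455, "ABCD", 1001),
    (23456, "ABCD", 1001),
    (23455, "ABCD", 1001),
    (112233, "BCDE", 1001),
    (113322, "BCDE", 1001),
    (9009, "DDEE", 1001),
    (9009, "DDEE", 1001),
    (1009, "FFGG", 1001),
]


def tnum_counts(rows):
    """Return (distinct TNUMs, TNUMs with more than one distinct ID, TMKEY)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_table_test1 (id INTEGER, tnum TEXT, tmkey INTEGER)"
    )
    conn.executemany("INSERT INTO source_table_test1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT COUNT(*) AS total_distinct_tnum_count,
               SUM(CASE WHEN distinct_ids > 1 THEN 1 ELSE 0 END)
                   AS tnum_with_multiple_ids,
               tmkey
        FROM (
            -- Inner step: one row per TNUM with its number of distinct IDs.
            SELECT tmkey, tnum, COUNT(DISTINCT id) AS distinct_ids
            FROM source_table_test1
            GROUP BY tmkey, tnum
        ) AS per_tnum
        GROUP BY tmkey
        ORDER BY tmkey
        """
    ).fetchall()
    conn.close()
    return result
```

On the sample rows, DDEE has a single distinct ID (9009 twice), so only ABCD and BCDE are counted in the second column.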

Access/SQL - Delete all instances of duplicate records with an if clause

I'm working on an Access query and have hit a dead end. I want to delete all duplicate rows in a table that have the same value in the columns Brand, SerialNr, Seats and LastRepair and that have the value "2013" in the Year column.
I'm trying to delete all rows that have duplicates in those columns and the year 2013, so that not a single one is left. (Not just delete the duplicates so that only one is left, but delete all instances so that none is left.)
The original table looks like this:
Brand       SerialNr  Seats  Color   LastRepair  Year
Ford        145       4      Blue    01.01.2020  2010
Ford        145       4      Red     01.01.2020  2010
Ford        145       4      Red     01.01.2020  2013
Ford        145       4      Green   01.01.2020  2013
Porsche     146       2      White   01.01.2022  2013
Ferrari     146       2      White   01.01.2022  2013
Volkswagen  147       4      Blue    01.01.2021  2017
Volkswagen  147       4      Red     01.01.2021  2013
Volkswagen  147       4      Orange  01.01.2021  2013
And the outcome table should look like this:
Brand       SerialNr  Seats  Color   LastRepair  Year
Ford        145       4      Blue    01.01.2020  2010
Ford        145       4      Red     01.01.2020  2010
Porsche     146       2      White   01.01.2022  2013
Ferrari     146       2      White   01.01.2022  2013
Volkswagen  147       4      Blue    01.01.2021  2017
I tried doing it with this question, but I need the rows deleted if they have a duplicated value in those columns, so that not a single one with the same year is left.
I also tried a "find duplicates" query with an outer join, but so far I have been unsuccessful in achieving the desired outcome. I'm thankful for any help.
DELETE Exists (SELECT 1
FROM carTable As t2
WHERE t1.Brand = t2.Brand AND t1.SerialNr = t2.SerialNr AND t1.Seats = t2.Seats AND t1.LastRepair = t2.LastRepair
HAVING Count(*) > 1
), t1.[FilNr], *
FROM carTable AS t1, carTable
WHERE (((Exists (SELECT 1
FROM carTable As t2
WHERE t1.Brand = t2.Brand AND t1.SerialNr = t2.SerialNr AND t1.Seats = t2.Seats AND t1.LastRepair = t2.LastRepair
HAVING Count(*) > 1
))<>False) AND ((t1.[year])=2013));
You can use an EXISTS subquery to identify duplicated rows and delete them.
In the subquery, we just select based on the columns you want to identify duplicates by, then check if the count is greater than 1 (since Count is an aggregate, it's in the HAVING clause).
DELETE * FROM t AS t1
WHERE EXISTS(
SELECT 1
FROM t As t2
WHERE t1.Brand = t2.Brand AND t1.SerialNr = t2.SerialNr AND t1.Seats = t2.Seats AND t1.LastRepair = t2.LastRepair
HAVING Count(*) > 1
)
AND Year = 2013
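To check that logic against the sample table, here is a rough sketch that runs an equivalent DELETE in SQLite through Python's sqlite3 module rather than in Access. The table name car_table, the snake_case column names and the function name are assumptions for illustration. The EXISTS ... HAVING Count(*) > 1 test is written as a scalar COUNT(*) comparison, which expresses the same condition without relying on HAVING without GROUP BY.

```python
import sqlite3

# Sample rows from the question:
# (Brand, SerialNr, Seats, Color, LastRepair, Year).
SAMPLE_CARS = [
    ("Ford", 145, 4, "Blue", "01.01.2020", 2010),
    ("Ford", 145, 4, "Red", "01.01.2020", 2010),
    ("Ford", 145, 4, "Red", "01.01.2020", 2013),
    ("Ford", 145, 4, "Green", "01.01.2020", 2013),
    ("Porsche", 146, 2, "White", "01.01.2022", 2013),
    ("Ferrari", 146, 2, "White", "01.01.2022", 2013),
    ("Volkswagen", 147, 4, "Blue", "01.01.2021", 2017),
    ("Volkswagen", 147, 4, "Red", "01.01.2021", 2013),
    ("Volkswagen", 147, 4, "Orange", "01.01.2021", 2013),
]


def delete_duplicated_2013(rows):
    """Delete every 2013 row whose key columns occur more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE car_table (brand TEXT, serial_nr INTEGER, seats INTEGER,"
        " color TEXT, last_repair TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO car_table VALUES (?, ?, ?, ?, ?, ?)", rows)
    # A row counts as duplicated when more than one row (itself included)
    # shares its brand, serial number, seats and last-repair date.
    conn.execute(
        """
        DELETE FROM car_table
        WHERE year = 2013
          AND (SELECT COUNT(*) FROM car_table AS t2
               WHERE t2.brand = car_table.brand
                 AND t2.serial_nr = car_table.serial_nr
                 AND t2.seats = car_table.seats
                 AND t2.last_repair = car_table.last_repair) > 1
        """
    )
    remaining = conn.execute(
        "SELECT brand, color, year FROM car_table ORDER BY rowid"
    ).fetchall()
    conn.close()
    return remaining
```

On the sample data this leaves the two 2010 Ford rows, the Porsche, the Ferrari and the 2017 Volkswagen, which matches the outcome table in the question.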
If your goal is to never have duplicate information in the "Brand" column, that can be accomplished in the table design itself. It's much more efficient to set up the table so that it limits what the user can input in certain circumstances. There are a couple of ways to do this. You can set the primary key to the Brand column, or change the "Indexed" property of that column to "Yes (No Duplicates)". If you are using an auto-number as the ID field and plan on relating a table by that ID, then the index is your best bet.

merge two rows into one sql

This is my table schema; the total_hours column is the result of a sum function.
Id  name  client     total_hours
1   John  company 1  100
1   John  company 2  200
2   Jack  company 3  350
2   Jack  company 2  150
I want to merge the rows with the same Id into one row, looking like this:
Id  name  client_a   total_hours_a  client_b   total_hours_b
1   John  company 1  100            company 2  200
2   Jack  company 3  350            company 2  150
I tried to use pivot, but this function does not seem to exist in DBeaver. Here is my query:
SELECT
client
,name
,sum(hours) AS total_hours
FROM pojects
GROUP BY client, name;
Thanks in advance if anyone could be of any help.

Return product from order data from multiple record to columns

I have a SQL Server database which contains survey data and is very close to this question How to return ordered data from multiple records into one record in MySQL?
The data is almost identical. Again copied from the above question but with addition of millisecond and datetime2 column.
SURVEY_TAKER_ID | QUESTION_NUMBER | RESPONSE
----------------+-----------------+-----------
101             | 1               | Apple
102             | 1               | Orange
103             | 1               | Banana
101             | 2               | Morning
102             | 2               | Evening
103             | 2               | Afternoon
101             | 3               | Red
102             | 3               | Blue
103             | 3               | Yellow
I am trying to use GROUP BY, but instead of combining the responses into one row per survey taker, it returns each response on its own row.
select
s.survey_taker_ID, AVG(s.Millisecond)Duration,
(case when s.Question_Number = 1 then s.Answer end Product1,
(case when s.Question_Number = 2 then s.Answer end Product2
from
survey as s
group by
s.survey_taker_ID, s.Question_Number,s.Answer
Output:
Survey_Taker_ID | Duration | Product1 | Product2
----------------+----------+-----------+----------
101 | 11125 | Apple | Morning
102 | 12545 | Orange | Evening
The sad part is that I have done this before but cannot seem to achieve it now. I know I am making some stupid mistake. Any sample code will help.
I think you want aggregation:
select s.survey_taker_ID, AVG(s.Millisecond) as Duration,
max(case when s.Question_Number = 1 then s.Answer end) as Product1,
max(case when s.Question_Number = 2 then s.Answer end) as Product2
from survey as s
group by s.survey_taker_ID;
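As an illustration of why this works, here is a small sketch that runs the same conditional-aggregation pattern in SQLite through Python's sqlite3 module, using the sample rows from the question. The column name answer follows the queries above (the sample table labels it RESPONSE), the Millisecond column is left out because the question gives no values for it, and the function name is hypothetical.

```python
import sqlite3

# Sample rows from the question:
# (survey_taker_id, question_number, answer).
SAMPLE_ROWS = [
    (101, 1, "Apple"), (102, 1, "Orange"), (103, 1, "Banana"),
    (101, 2, "Morning"), (102, 2, "Evening"), (103, 2, "Afternoon"),
    (101, 3, "Red"), (102, 3, "Blue"), (103, 3, "Yellow"),
]


def responses_as_columns(rows):
    """Return one row per survey taker with answers 1 and 2 as columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE survey (survey_taker_id INTEGER,"
        " question_number INTEGER, answer TEXT)"
    )
    conn.executemany("INSERT INTO survey VALUES (?, ?, ?)", rows)
    # Grouping only by survey_taker_id collapses each taker to one row;
    # MAX picks the single non-null value each CASE produces in the group.
    result = conn.execute(
        """
        SELECT survey_taker_id,
               MAX(CASE WHEN question_number = 1 THEN answer END) AS product1,
               MAX(CASE WHEN question_number = 2 THEN answer END) AS product2
        FROM survey
        GROUP BY survey_taker_id
        ORDER BY survey_taker_id
        """
    ).fetchall()
    conn.close()
    return result
```

Grouping by Question_Number and Answer as well, as in the original attempt, keeps one group per response, which is why the rows were not merged.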

Combining two tables in a query and creating new columns from that

I'm having issues with a query that I'm not ENTIRELY sure can be done with the way the database is set up. Basically, I'll be using two different tables in my query, let's say Transactions and Ticket Prices. They look like this (With some sample data):
TRANSACTIONS
Transaction ID | Ticket Quantity | Total Price | Salesperson | Ticket Price ID
5489           | 250             | 250         | Jim         | 8765
5465           | 50              | 150         | Jim         | 1258
7898           | 36              | 45          | Ann         | 4774
Ticket Prices
Ticket Price ID | Quantity | Price | Bundle Name
8765            | 1        | 1     | 1 ticket, $1
4774            | 12       | 15    | 5 tickets, $10
1258            | 1        | 3     | 1 ticket, $3
What I'm aiming for is a report, that breaks down each salesperson's sales by bundle type. The resulting table should be something like this:
Sales Volume/Salesperson
Name | Bundle A | Bundle B | Bundle C | Total
Jim  | 250      | 0        | 50       | 300
Ann  | 0        | 36       | 0        | 36
I've been searching the web, and it seems the best way of getting it like this is using various subqueries, which works well as far as getting the column titles properly displayed, but it doesn't work as far as the actual numerical totals. It basically combines the data, giving each salesperson a total readout (In this example, both Jim and Ann would have 250 sales in Bundle A, 36 in Bundle B, etc). Is there any way I can write a query that will give me the proper results? Or even something at least close to it? Thanks for any input.
You can use the PIVOT statement in Oracle to do this. A query might look something like this:
WITH pivot_data AS (
SELECT t.salesperson,p.bundle_name,t.ticket_quantity
FROM ticket_prices p, transactions t
where t.ticket_price_id = p.ticket_price_id
)
SELECT *
FROM pivot_data
PIVOT (
sum(ticket_quantity) --<-- pivot_clause
FOR bundle_name --<-- pivot_for_clause
IN ('1 ticket, $1','5 tickets, $10', '1 ticket, $3' ) --<-- pivot_in_clause
);
which would give you results like this: