I have two tables - 1st table gas_emissions, 2nd table - regiony_avg.
Table gas_emissions has columns region, region_id, data_val, year.
Table regiony_avg has columns region_id, avg_region.
There are multiple values for each region because they're calculated every year. I need to calculate AVG for each region and insert it into regiony_avg.
There are over 10 regions. What I've done so far is:
SELECT AVG(data_val) AS AKL
FROM gas_emissions
WHERE region_id = 'AKL'
and then
UPDATE regiony_avg
SET avg_region = 1999.64771428571
WHERE region_id = 'AKL'
I did this for each region. However, I can't see how to do it if there are, for example, 1000 regions. Is there any way to get the AVG for all unique regions at once and then insert the results into regiony_avg in one go?
I think you just want insert . . . select:
insert into regiony_avg (region_id, avg_region)
select region_id, avg(data_val)
from gas_emissions
group by region_id;
Note: I see little reason to store this information in a table when it can easily be calculated using an aggregation query. In fact, you can add the average to each row of the original table using window functions:
select ge.*,
avg(data_val) over (partition by region_id) as region_avg
from gas_emissions ge;
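A minimal sketch of the insert . . . select approach, run from Python against an in-memory SQLite database so it can be tried without a server. Only the table and column names come from the question; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gas_emissions (region TEXT, region_id TEXT, data_val REAL, year INTEGER);
CREATE TABLE regiony_avg (region_id TEXT, avg_region REAL);
-- Hypothetical sample rows, not the asker's real data
INSERT INTO gas_emissions VALUES
    ('Auckland',   'AKL', 10.0, 2019),
    ('Auckland',   'AKL', 20.0, 2020),
    ('Wellington', 'WLG', 30.0, 2020);
""")

# One statement computes the average per region and stores it
conn.execute("""
    INSERT INTO regiony_avg (region_id, avg_region)
    SELECT region_id, AVG(data_val)
    FROM gas_emissions
    GROUP BY region_id
""")

region_averages = conn.execute(
    "SELECT region_id, avg_region FROM regiony_avg ORDER BY region_id"
).fetchall()
print(region_averages)  # [('AKL', 15.0), ('WLG', 30.0)]
```

If regiony_avg already holds one row per region, as the UPDATE in the question suggests, this insert would add duplicates; in that case an UPDATE with a correlated subquery on region_id would be the counterpart.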
I have a table and want to get the average score for each student.
To be more specific, scoremonth1 takes priority over scoremonth2, which takes priority over scoremonth3, and so on (1>2>3>4>5>6), and no more than 3 monthly scores from the table should go into the average.
For instance, the average score for Tom should be (80+90)/2 since there are only 2 scores available. As for Marry, the average score should be (90+70+80)/3 since those are the three monthly scores with more weight. And again, for Anna, the average score should be (90+100+70)/3
In my case, there would be over 100 students. Apart from listing every possible case (CASE WHEN scoremonth1 is not null and scoremonth2 is NULL, etc.) to calculate the average, what other method could do the calculation dynamically?
I know there is a SQL function coalesce to return the first not null value, but how could I get the second and third not null values? And is there a way to track which monthlyscores are added up? I really appreciate your help!
Stu mentioned your underlying issue. To normalize your data without changing table design you can use cross apply...
select student, sum(score)
from table
cross apply (
values(1,scoremonth1),(2,scoremonth2),(3,scoremonth3)) as scores(month,score)
group by student
I strongly suggest you redesign so you don't have to manage this query when adding months by creating a new table called studentScores.
create table studentscores
(
student varchar(200)
,scoremonth int
,score decimal(5,2)
)
And then populate it like this...
insert into studentScores(student,scoremonth,score)
select ca.ca1, ca.ca2, ca.ca3
from table
cross apply
(values
(student,1,scoremonth1)
,(student,2,scoremonth2)
,(student,3,scoremonth3)
,(student,4,scoremonth4)
,(student,5,scoremonth5)
,(student,6,scoremonth6)
) ca(ca1,ca2,ca3)
where ca3 is not null
And finally, use it like this...
select ss.student, sum(score), count(*) NumOfScores, sum(score)/Count(*) avg
from table
join studentscores ss on ss.student=table.student
where ss.scoremonth between 1 and 3
group by ss.student
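The query above averages months 1 to 3 regardless of gaps, whereas the question asks for the first three non-null scores. A minimal sketch of that rule on the normalized studentscores table, run from Python against in-memory SQLite: a row is kept only if fewer than three earlier months exist for the same student. The rows are hypothetical, chosen to match the Tom, Marry and Anna examples, and the correlated count is a portable stand-in; on SQL Server a ROW_NUMBER() per student would likely be the more usual choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studentscores (student TEXT, scoremonth INTEGER, score REAL);
-- Hypothetical rows; NULL months are simply absent after normalizing
INSERT INTO studentscores VALUES
    ('Tom', 1, 80), ('Tom', 2, 90),
    ('Marry', 1, 90), ('Marry', 3, 70), ('Marry', 4, 80), ('Marry', 6, 60),
    ('Anna', 1, 90), ('Anna', 2, 100), ('Anna', 3, 70), ('Anna', 4, 50);
""")

# Keep a score only if the student has fewer than 3 scores in earlier months
top_three_avg = conn.execute("""
    SELECT ss.student,
           COUNT(*)      AS num_scores,
           AVG(ss.score) AS avg_score
    FROM studentscores ss
    WHERE (SELECT COUNT(*)
           FROM studentscores earlier
           WHERE earlier.student = ss.student
             AND earlier.scoremonth < ss.scoremonth) < 3
    GROUP BY ss.student
    ORDER BY ss.student
""").fetchall()

for student, num_scores, avg_score in top_three_avg:
    print(student, num_scores, round(avg_score, 2))
```

Dropping the GROUP BY and selecting scoremonth as well would show which months were counted for each student.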
Working with SQL via a NOVA Oracle DB. Need to know how to query from multiple tables and arrange results based on being sorted by the highest values. Here are a few lines of code to reflect the three tables:
INSERT INTO VEHICLES
(vehicleVIN,vehicleType,vehicleMake,vehicleModel,vehicleWhereFrom,vehicleWholesaleCost,vehicleTradeID)
VALUES
('147258HHE91K3RT','compact','chevrolet','spark','Maryland',20583.00,NULL);
INSERT INTO VEHICLES
(vehicleVIN,vehicleType,vehicleMake,vehicleModel,vehicleWhereFrom,vehicleWholesaleCost,vehicleTradeID)
VALUES
('789456ERT0923RFB6','Midsize','ford','Taurus','washington, d.c.',25897.22,1);
INSERT INTO VEHICLES
(vehicleVIN,vehicleType,vehicleMake,vehicleModel,vehicleWhereFrom,vehicleWholesaleCost,vehicleTradeID)
VALUES
('1234567890QWERTYUIOP','fullsize','Lincoln','towncar','Virginia',44222.10,NULL);
AND
INSERT INTO SALES
(saleID,grossSalePrice,vehicleStatus,saleDate,saleMileage,customerID,salespersonID,vehicleVIN)
VALUES
(1,25987.28,'sold',date '2012-10-15',10,1,1,'147258HHE91K3RT');
INSERT INTO SALES
(saleID,grossSalePrice,vehicleStatus,saleDate,saleMileage,customerID,salespersonID,vehicleVIN)
VALUES
(2,29999.99,'sold',date '2012-10-17',50087,2,2,'789456ERT0923RFB6');
INSERT INTO SALES
(saleID,grossSalePrice,vehicleStatus,saleDate,saleMileage,customerID,salespersonID,vehicleVIN)
VALUES
(3,47490.88,'sold',date '2012-11-05',30,3,3,'1234567890QWERTYUIOP');
AND
INSERT INTO CUSTOMERS
(customerID,customerFirName,customerLasName,customerMiName,customerStreet,customerState,customerCity,customerZip)
VALUES
(1,'Regorna','Trasper','J','11111 Address Way','Maryland','Hollywood','20636');
INSERT INTO CUSTOMERS
(customerID,customerFirName,customerLasName,customerMiName,customerStreet,customerState,customerCity,customerZip)
VALUES
(2,'Bob','Seagram','A','22222 Seagram Lane','Texas','Houston','77001');
INSERT INTO CUSTOMERS
(customerID,customerFirName,customerLasName,customerMiName,customerStreet,customerState,customerCity,customerZip)
VALUES
(3,'Sally','Anderson','P','33333 Pheonix Drive','Arizona','Pheonix','85001');
Obviously there are other tables that come into play here (salesperson, etc.), however these are the only tables needed for the query. The query I want to pull needs to show the total count of sales for each model, sorted by the highest values, and the total count of sales for each zip code, sorted by the highest values. An example (using the data provided above) would look similar to this:
MODEL | NUMBER OF SALES | ZIP CODE | NUMBER OF SALES
spark | 1 | 20636 | 1
Taurus | 1 | 77001 | 1
towncar | 1 | 85001 | 1
The results need to be sorted by highest values, based on the number of sales. I'm also trying to accomplish this via a single SELECT query.
I've tried some ideas, but haven't been able to find anything that hits the home run yet. Thanks for the help!
See if this is what you're after:
SELECT DISTINCT v.VEHICLEMODEL, COUNT(*) OVER (PARTITION BY v.VEHICLEMODEL) "CAR_SALES"
, c.CUSTOMERZIP, COUNT(*) OVER (PARTITION BY c.CUSTOMERZIP) "TOTAL_SALES_AT_ZIP"
FROM SALES s, VEHICLES v, CUSTOMERS c
WHERE s.VEHICLEVIN = v.VEHICLEVIN
AND c.CUSTOMERID = s.CUSTOMERID
ORDER BY 2 DESC, 4 DESC
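For comparison, a minimal sketch that computes the two counts with plain JOIN and GROUP BY, run from Python against in-memory SQLite. Note that this uses two separate queries rather than the single SELECT the question asks for, and the tables are trimmed, hypothetical versions holding only the columns the counts need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed, hypothetical versions of the tables: only the columns the query needs
CREATE TABLE vehicles (vehicleVIN TEXT, vehicleModel TEXT);
CREATE TABLE customers (customerID INTEGER, customerZip TEXT);
CREATE TABLE sales (saleID INTEGER, customerID INTEGER, vehicleVIN TEXT);
INSERT INTO vehicles VALUES
    ('147258HHE91K3RT', 'spark'),
    ('789456ERT0923RFB6', 'Taurus'),
    ('1234567890QWERTYUIOP', 'towncar');
INSERT INTO customers VALUES (1, '20636'), (2, '77001'), (3, '85001');
INSERT INTO sales VALUES
    (1, 1, '147258HHE91K3RT'),
    (2, 2, '789456ERT0923RFB6'),
    (3, 3, '1234567890QWERTYUIOP');
""")

# Number of sales per model, highest first
sales_per_model = conn.execute("""
    SELECT v.vehicleModel, COUNT(*) AS num_sales
    FROM sales s
    JOIN vehicles v ON v.vehicleVIN = s.vehicleVIN
    GROUP BY v.vehicleModel
    ORDER BY num_sales DESC
""").fetchall()

# Number of sales per customer zip code, highest first
sales_per_zip = conn.execute("""
    SELECT c.customerZip, COUNT(*) AS num_sales
    FROM sales s
    JOIN customers c ON c.customerID = s.customerID
    GROUP BY c.customerZip
    ORDER BY num_sales DESC
""").fetchall()

print(sales_per_model)
print(sales_per_zip)
```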
Sorry for asking about this topic again, but I haven't been able to derive a solution to my problem from existing answers.
I have one table ("Data") from which I need to pull three columns ("PID", "Manager", "Customer"),
and only the "PID" has to be distinct. I don't care which records are pulled for the other columns ("Manager" / "Customer"); it could be the first entry or whatever.
SELECT Distinct PID, Manager, Customer
FROM Data;
This gives me all the rows where the combination of PID, Manager and Customer is distinct, so if there are two entries with the same PID but a different Manager, I get two records instead of one.
Thank you very much.
You can do this
Hope you will find this helpful
SELECT PID, max(Manager), max(Customer)
FROM Data
group by PID
Or
SELECT PID, min(Manager), min(Customer)
FROM Data
group by PID
EDIT
I will give you an example to explain the MAX and MIN functions.
Here is the sample table:
CREATE TABLE data(
PID int ,
Manager varchar(20) ,
Customer varchar(20)
) ;
insert into data
values
(1,'a','b'),
(1,'c','d'),
(3,'1','e'),
(3,'5','e'),
(3,'3','e')
Now, these are the three queries that return the respective outputs:
select * from data;
SELECT PID, max(Manager), max(Customer)
FROM Data
group by PID;
SELECT PID, min(Manager), min(Customer)
FROM Data
group by PID
Output for the above queries is
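Explanation:
MAX:
MAX returns c for PID 1 and 5 for PID 3 in the Manager column, because c is greater than a and likewise 5 is greater than 1 and 3. Manager is a varchar column, so the values are compared as strings.
The MIN function is the exact opposite of MAX and is self-explanatory.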
SELECT "PID", max("Manager"), max("Customer")
FROM "Data"
GROUP BY "PID";
This query returns unique "PID"s and max values of "Manager" and "Customer" for each "PID".
DISTINCT applies to all the columns in the select list, so you need GROUP BY plus an aggregate function (which returns one value for several rows).
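A minimal sketch of the GROUP BY approach using the sample rows from the answer above, run from Python against in-memory SQLite (the choice of SQLite is an assumption for the sake of a runnable example; the queries themselves are the ones shown earlier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (PID INTEGER, Manager TEXT, Customer TEXT);
INSERT INTO data VALUES
    (1, 'a', 'b'), (1, 'c', 'd'),
    (3, '1', 'e'), (3, '5', 'e'), (3, '3', 'e');
""")

# One row per PID, taking the largest Manager and Customer values
max_rows = conn.execute(
    "SELECT PID, MAX(Manager), MAX(Customer) FROM data GROUP BY PID ORDER BY PID"
).fetchall()

# One row per PID, taking the smallest Manager and Customer values
min_rows = conn.execute(
    "SELECT PID, MIN(Manager), MIN(Customer) FROM data GROUP BY PID ORDER BY PID"
).fetchall()

print(max_rows)  # [(1, 'c', 'd'), (3, '5', 'e')]
print(min_rows)  # [(1, 'a', 'b'), (3, '1', 'e')]
```

Each aggregate is computed independently, so the Manager and Customer returned for a PID are not guaranteed to come from the same original row.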
I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person Value
--------------
1 10
1 20
2 15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person Value
-------------
1 30
2 15
And I would like to do this without using a temp table. I think that I'll need to use OVER PARTITION BY but just not sure. Just trying to challenge myself in not doing it the temp table way. Working with SQL Server 2008 R2
Simply put, give me a concise stmt getting from my input to my output in the same table. So if my table name is People if I do a select * from People on it before the operation that I am asking in this question I get the first set above and then when I do a select * from People after the operation, I get the second set of data above.
Not sure why not using Temp table but here's one way to avoid it (tho imho this is an overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
First query updates every duplicate person to the summary value. Second query removes duplicate persons.
Demo: http://sqlfiddle.com/#!3/db7aa/11
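A minimal sketch of the same update-then-delete idea, adapted to SQLite so it can be run from Python. This is not the SQL Server statement above: SQLite cannot delete through a CTE, so the sketch swaps in SQLite's built-in rowid to pick one "keeper" row per person, and it writes the total only into that keeper row. The table name People is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (Person INTEGER, Value INTEGER);
INSERT INTO People VALUES (1, 10), (1, 20), (2, 15);
""")

# Step 1: write each person's total into one keeper row (the lowest rowid)
conn.execute("""
    UPDATE People
    SET Value = (SELECT SUM(p2.Value) FROM People p2 WHERE p2.Person = People.Person)
    WHERE rowid IN (SELECT MIN(rowid) FROM People GROUP BY Person)
""")

# Step 2: delete every row that is not a keeper
conn.execute("""
    DELETE FROM People
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM People GROUP BY Person)
""")

people_after = conn.execute(
    "SELECT Person, Value FROM People ORDER BY Person"
).fetchall()
print(people_after)  # [(1, 30), (2, 15)]
```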
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
I currently have a table called tempHouses that looks like:
avgprice | dates | city
dates are stored as yyyy-mm-dd
However I need to move the records from that table into a table called houses that looks like:
city | year2002 | year2003 | year2004 | year2005 | year2006
The information in tempHouses contains average house prices from 1995 - 2014.
I know I can use SUBSTRING to get the year from the dates:
SUBSTRING(dates, 1, 4)
So basically for each city in tempHouses.city I need to get the average house price from the above years into one record.
Any ideas on how I would go about doing this?
This is an SQL Server approach, and a PIVOT may be better, but here's one way:
SELECT City,
AVG(year2002) AS year2002,
AVG(year2003) AS year2003,
AVG(year2004) AS year2004
FROM (
SELECT City,
-- No ELSE branch: rows from other years become NULL, which AVG ignores
CASE WHEN Dates BETWEEN '2002-01-01T00:00:00' AND '2002-12-31T23:59:59' THEN avgprice
END AS year2002,
CASE WHEN Dates BETWEEN '2003-01-01T00:00:00' AND '2003-12-31T23:59:59' THEN avgprice
END AS year2003,
CASE WHEN Dates BETWEEN '2004-01-01T00:00:00' AND '2004-12-31T23:59:59' THEN avgprice
END AS year2004
-- Repeat for each year
FROM tempHouses
) AS yearly
GROUP BY City
The inner query gets the data into the correct format for each record (City, year2002, year2003, year2004), whilst the outer query gets the average for each City.
There may be many ways to do this, and performance may be the deciding factor on which one to choose.
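A minimal sketch of the same pivot written as a single-level conditional aggregation, run from Python against in-memory SQLite. The year is taken from the first four characters of the yyyy-mm-dd string, the CASE has no ELSE so other years contribute NULL rather than 0, and the city names and prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tempHouses (avgprice REAL, dates TEXT, city TEXT);
-- Hypothetical prices for illustration
INSERT INTO tempHouses VALUES
    (100.0, '2002-01-01', 'Leeds'),
    (200.0, '2002-07-01', 'Leeds'),
    (300.0, '2003-01-01', 'Leeds'),
    (400.0, '2002-01-01', 'York');
""")

# One output row per city, one averaged column per year
houses = conn.execute("""
    SELECT city,
           AVG(CASE WHEN substr(dates, 1, 4) = '2002' THEN avgprice END) AS year2002,
           AVG(CASE WHEN substr(dates, 1, 4) = '2003' THEN avgprice END) AS year2003
    FROM tempHouses
    GROUP BY city
    ORDER BY city
""").fetchall()

print(houses)  # [('Leeds', 150.0, 300.0), ('York', 400.0, None)]
```

A city with no rows for a given year gets NULL (None in Python) in that column instead of a misleading 0.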
The best way would be to use a script to perform the query execution for you because you will need to run it multiple times and you extract the data based on year. Make sure that the only required columns are city & row id:
http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
INSERT INTO <table> (city) SELECT DISTINCT `city` FROM <old_table>;
Then for each city extract the average values, insert them into a temporary table and then insert into the main table.
SELECT city, SUBSTRING(dates, 1, 4) AS year, AVG(avgprice) FROM <old_table> GROUP BY city, SUBSTRING(dates, 1, 4);
Otherwise you're looking at a combination query using joins and potentially unions to extrapolate the data. Because you're flattening the table into a single row per city it's going to be a little tough to do. You should create indexes first on the date column if you don't want the database query to fail with memory limits or just take a very long time to execute.