Writing an SQL query for two tables - sql

I have two tables. delayedFlights has the following attributes:
ID_of_Delayed_Flight
Month
DayofMonth
DayOfWeek
DepTime
ScheduledDepTime
ArrTime
ScheduledArrTime
UniqueCarrier
FlightNum
ActualFlightTime
scheduledFlightTime
AirTime
ArrDelay
DepDelay
Orig
Dest
Distance
and airport which has these attributes:
airportCode
airportName
City
State
I am trying to write a query that lists the top 5 distinct states in which a flight between different airports within the same state has been delayed (in either arrival or departure), in descending order of the number of such delays. I'm not sure how to query both tables at the same time. Could someone give me any pointers on how to do this?
This is what I've tried
SELECT state, COUNT(*) AS num_delays
FROM delayed_flights
WHERE state = state
GROUP BY state
ORDER BY num_delays DESC
LIMIT 5;
(But it obviously does not work.)

You just need to understand joins. You have to join to airport twice: once for the origin and once for the destination. Aliases make the query easier to read and show which table each selected field comes from. We can group by just the origin state (or just the destination state), since the WHERE clause already checks that they are the same.
An inner join requires a matching entry in both tables for a record to show. Left and right outer joins show all records from one table and only those that match from the other, whereas a full outer join shows all records from both tables even if no match is found. A cross join pairs every record in one table with every record in the other.
This assumes that Orig and Dest in the delayed flights table hold airport codes that match airportCode in the airport table.
SELECT o.state, COUNT(*) AS num_delays
FROM delayed_flights df
INNER JOIN airport o
on o.airportCode = df.orig
INNER JOIN airport d
on d.airportCode = df.dest
WHERE o.state = d.state
-- the question asks for flights between different airports
AND df.orig <> df.dest
GROUP BY o.state
ORDER BY num_delays DESC
LIMIT 5;
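To see the double join at work, here is a minimal sketch on made-up data. The table and column names follow the question, but the rows are invented for illustration, delayed_flights is cut down to the columns the query needs, and LIMIT assumes a dialect that supports it (MySQL, PostgreSQL, SQLite).

```sql
-- Hypothetical sample data; only the columns used by the query are included.
CREATE TABLE airport (
    airportCode VARCHAR(3) PRIMARY KEY,
    airportName VARCHAR(50),
    City        VARCHAR(50),
    State       VARCHAR(2)
);

CREATE TABLE delayed_flights (
    ID_of_Delayed_Flight INTEGER PRIMARY KEY,
    Orig VARCHAR(3),
    Dest VARCHAR(3)
);

INSERT INTO airport VALUES
    ('LAX', 'Los Angeles Intl', 'Los Angeles', 'CA'),
    ('SFO', 'San Francisco Intl', 'San Francisco', 'CA'),
    ('DFW', 'Dallas/Fort Worth Intl', 'Dallas', 'TX'),
    ('IAH', 'George Bush Intercontinental', 'Houston', 'TX');

INSERT INTO delayed_flights VALUES
    (1, 'LAX', 'SFO'),  -- CA to CA
    (2, 'SFO', 'LAX'),  -- CA to CA
    (3, 'DFW', 'IAH'),  -- TX to TX
    (4, 'LAX', 'DFW');  -- crosses a state line, so it is filtered out

-- airport is joined twice: alias o for the origin, alias d for the destination.
SELECT o.State, COUNT(*) AS num_delays
FROM delayed_flights df
INNER JOIN airport o ON o.airportCode = df.Orig
INNER JOIN airport d ON d.airportCode = df.Dest
WHERE o.State = d.State
  AND df.Orig <> df.Dest   -- "different airports" requirement from the question
GROUP BY o.State
ORDER BY num_delays DESC
LIMIT 5;
-- CA | 2
-- TX | 1
```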

Related

Record with latest date, where date comes from a joined table

I have tried every answer I could find on getting the last record, without success. I currently have a query that lists active trailers. I need it to show only a single row for each trailer, where that row is chosen by a date in a joined table.
I have tables
trailer, company, equipment_group, movement, stop
In order to connect trailer to stop (which is where the date is), I have to join it to equipment_group, which joins to movement, which then joins to stop.
I have tried using MAX and GROUP BY, and PARTITION BY, both of which error out.
I have tried many solutions here, as well as these
https://thoughtbot.com/blog/ordering-within-a-sql-group-by-clause
https://www.geeksengine.com/article/get-single-record-from-duplicates.html
It seems that all of these solutions have the date in the same table as the thing that they want to group by, which I do not.
SELECT
trailer.*,
company.name,
equipment_group.currentmovement_id,
equipment_group.company_id,
movement.dest_stop_id, stop.location_id,
stop.*
FROM trailer
LEFT OUTER JOIN company ON (company.id = trailer.company_id)
LEFT OUTER JOIN equipment_group ON (equipment_group.id =
trailer.currenteqpgrpid)
LEFT OUTER JOIN movement ON (movement.id =
equipment_group.currentmovement_id)
LEFT OUTER JOIN stop ON (stop.id = movement.dest_stop_id)
WHERE trailer.is_active = 'A'
Using MAX and GROUP BY gives the error "invalid in the select list... not contained in...aggregate function".
Well, I never did figure that out, but once I joined movement to equipment_group on two conditions, all was well. Each extra record came from a different company ID; company ID is in EVERY table.
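The post does not show the final join, so the following is only a sketch of what "joining on two conditions" might look like. The cut-down tables, the company_id column on movement, and the sample rows are all assumptions made for illustration.

```sql
-- Hypothetical, cut-down tables: the real schemas are not shown in the post.
CREATE TABLE equipment_group (
    id INTEGER,
    company_id VARCHAR(10),
    currentmovement_id INTEGER
);
CREATE TABLE movement (
    id INTEGER,
    company_id VARCHAR(10),
    dest_stop_id INTEGER
);

INSERT INTO equipment_group VALUES (10, 'ACME', 500);
INSERT INTO movement VALUES
    (500, 'ACME', 7001),
    (500, 'OTHER', 9002);  -- same movement id, but a different company

-- Joining on id alone matches both movement rows, so this returns 2 rows.
SELECT eg.id, m.company_id, m.dest_stop_id
FROM equipment_group eg
LEFT OUTER JOIN movement m
    ON m.id = eg.currentmovement_id;

-- Adding company_id to the join condition leaves only the ACME row (1 row).
SELECT eg.id, m.company_id, m.dest_stop_id
FROM equipment_group eg
LEFT OUTER JOIN movement m
    ON m.id = eg.currentmovement_id
   AND m.company_id = eg.company_id;
```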

Oracle SQL - join table in one-to-many relationship, but instead of duplicating rows show min/max

Let's say I have a TRADE table and a TRADE_EXECUTION table. The common link between them is TRADE_ID.
TRADE_EXECUTION has all the broker executions that are part of the final TRADE (so information about executed quantity, timestamp of execution, etc.). This means that for each entry in the TRADE table there might be multiple, one, or no corresponding rows in TRADE_EXECUTION.
So if I do a standard LEFT JOIN, rows from TRADE will be duplicated whenever there are multiple corresponding entries in TRADE_EXECUTION.
Instead of duplicating rows I want to show in my SELECT query columns TRADE_ID, MIN(EXECUTION_TIMESTAMP) and MAX(EXECUTION_TIMESTAMP) for each entry from TRADE table.
So if there are 5 executions pointing to one trade I want to show earliest and latest execution timestamp from them and put in one row instead of showing 5 rows.
I need to also keep in mind that TRADE_EXECUTION table is quite big (200k records added daily, around 50m in total so far).
How do I achieve that?
I suspect that you want:
select t.*, te.min_execution_timestamp, te.max_execution_timestamp
from trade t left join
     (select te.trade_id,
             min(te.execution_timestamp) as min_execution_timestamp,
             max(te.execution_timestamp) as max_execution_timestamp
      from trade_execution te
      group by te.trade_id
     ) te
     on te.trade_id = t.trade_id;
This allows you to select all the columns you want from trade, without fiddling with a group by clause.
A simple left join with aggregation should do, as in:
select
t.trade_id,
MIN(e.EXECUTION_TIMESTAMP) as min_ts,
MAX(e.EXECUTION_TIMESTAMP) as max_ts
from trade t
left join trade_execution e on e.trade_id = t.trade_id
group by t.trade_id
SELECT
TRADE.TRADE_ID,
MAX(TRADE_EXECUTION.EXECUTION_TIMESTAMP) MAX_TS,
MIN(TRADE_EXECUTION.EXECUTION_TIMESTAMP) MIN_TS
FROM
TRADE
LEFT OUTER JOIN TRADE_EXECUTION
ON TRADE.TRADE_ID = TRADE_EXECUTION.TRADE_ID
/* WHERE <put your conditions here, if there are any> */
GROUP BY TRADE.TRADE_ID;
Please note that since your table has a very high volume of records, you need to choose your WHERE condition wisely: filter your data using indexed columns.
WITH exec AS
(SELECT TRADE_ID, MIN(EXECUTION_TIMESTAMP) MIN_EXEC, MAX(EXECUTION_TIMESTAMP) MAX_EXEC
FROM TRADE_EXECUTION
GROUP BY TRADE_ID)
SELECT t.TRADE_ID, e.MIN_EXEC, e.MAX_EXEC
FROM TRADE t LEFT JOIN exec e ON t.TRADE_ID=e.TRADE_ID;
You can add as many TRADE columns as you need without grouping them.

SQL - subquery returning more than 1 value

What my issue is:
I am constantly returning multiple values when I don't expect to. I am attempting to get a specific climate, determined by the state, county, and country.
What I've tried:
The code given below. I am unsure as to what is wrong with it specifically. I do know that it is returning multiple values. But why? I specify that STATE_ABBREVIATION = PROV_TERR_STATE_LOC and with the inner joins that I do, shouldn't that create rows that are similar except for their different CLIMATE_IDs?
SELECT
...<code>...
(SELECT locations.CLIMATE_ID
FROM REF_CLIMATE_LOCATION locations, SED_BANK_TST.dbo.STATIONS stations
INNER JOIN REF_STATE states ON STATE_ID = states.STATE_ID
INNER JOIN REF_COUNTY counties ON COUNTY_ID = counties.COUNTY_ID
INNER JOIN REF_COUNTRY countries ON COUNTRY_ID = countries.COUNTRY_ID
WHERE STATE_ABBREVIATION = PROV_TERR_STATE_LOC) AS CLIMATE_ID
...<more code>...
FROM SED_BANK_TST.dbo.STATIONS stations
I've been at this for hours, looking up different questions on SO, but I cannot figure out how to make this subquery return a single value.
All those inner joins don't reduce the result set if the IDs you're testing exist in the REF tables. Apart from that, you're doing a Cartesian product between locations and stations (which may behave like an old-fashioned inner join because of the WHERE clause).
You'll only get a single row if exactly one row in the locations table matches exactly one row in the stations table under the condition STATE_ABBREVIATION = PROV_TERR_STATE_LOC.
Your JOINs show a hierarchy of locations: Country->State->County, but your WHERE clause only limits by the state abbreviation. By joining the county you'll get one record for every county in that state. You CAN limit your results by taking the TOP 1 of the results, but you need to be very careful that that's really what you want. If you're looking for a specific county, you'll need to include that in the WHERE clause. You get some control with the TOP 1 in that it will give the top 1 based on an ORDER BY clause. I.e., if you want the most recently added, use:
SELECT TOP 1 [whatever] FROM [table] ORDER BY [DateCreated] DESC;
For your subquery, you can do something like this:
SELECT TOP 1
locations.CLIMATE_ID
FROM REF_CLIMATE_LOCATION locations ,
SED_BANK_TST.dbo.STATIONS stations
INNER JOIN REF_STATE states ON STATE_ID = states.STATE_ID
INNER JOIN REF_COUNTY counties ON COUNTY_ID = counties.COUNTY_ID
INNER JOIN REF_COUNTRY countries ON COUNTRY_ID = countries.COUNTRY_ID
WHERE STATE_ABBREVIATION = PROV_TERR_STATE_LOC
Just be sure to either add an ORDER BY at the end or be okay with it choosing the TOP 1 based on the "natural order" on the tables.
If you expect a single value from your subquery, you probably need to use DISTINCT. The best way to check is to run the subquery on its own and look at the result. You can also add other columns from the tables involved to see what makes the result have multiple rows.
You can also use MAX(), MIN(), or TOP 1 to get a single value from the subquery, but which one is right depends on the logic you want for locations.CLIMATE_ID. You need to answer the question, "How is it related to the rest of the columns retrieved?"
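One way to picture the advice above is a scalar subquery that is tied to the outer row and collapsed with an aggregate. This is only a sketch: the two simplified tables, the station_id column, and the sample rows are hypothetical, since the real schemas are not shown in the question, and MIN() is just one of the possible choices mentioned above.

```sql
-- Hypothetical, simplified tables; the real schemas are not shown in the post.
CREATE TABLE stations (
    station_id INTEGER PRIMARY KEY,
    PROV_TERR_STATE_LOC VARCHAR(2)
);
CREATE TABLE climate_location (
    STATE_ABBREVIATION VARCHAR(2),
    CLIMATE_ID INTEGER
);

INSERT INTO stations VALUES (1, 'CA'), (2, 'TX');
INSERT INTO climate_location VALUES ('CA', 3), ('CA', 5), ('TX', 4);

SELECT s.station_id,
       (SELECT MIN(cl.CLIMATE_ID)   -- the aggregate guarantees a single value
        FROM climate_location cl
        -- this predicate ties the subquery to the current outer row
        WHERE cl.STATE_ABBREVIATION = s.PROV_TERR_STATE_LOC
       ) AS CLIMATE_ID
FROM stations s
ORDER BY s.station_id;
-- 1 | 3
-- 2 | 4
```

Note that the subquery reads only from climate_location and refers to the outer stations row through the alias s, rather than listing stations again in its own FROM clause.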

How to get information from database excluding some data

I have 3 tables in my database.
The first table is "Ordenes". It holds all the orders that users create on the website, with columns such as FechaFinal, FechaInicial, NoOrden, Cantidad, and Estado.
Another table, "Cables", stores all the cables that users register per order.
For example, if an order row has a Cantidad of 200, the user needs to register 200 cables for that order.
The Cables table has columns such as Serial, IdOrden, and IdOperation.
IdOperation is a foreign key to the Operations table, which stores the name of each operation and nothing else.
I want a SQL query that shows this information:
FechaInicial, FechaFinal, NoOrden, Material, and Cantidad (all from the order table), plus Performed (a count of all the cables in the Cables table for a given order).
I have that working, but I want the query to count ONLY the cables that are not in the operation called 'Scrap'.
This is my current query. It works, but it also counts the cables that are in the 'Scrap' operation.
SELECT o.FechaInicial, o.FechaFinal, o.NoOrden, o.Material, o.Cantidad, COUNT(c.IdCable) as 'Hechos', o.Estado
FROM Ordenes o
LEFT JOIN Cables c ON o.IdOrden = c.IdOrden
LEFT JOIN Operaciones op ON c.IdOperacion = op.IdOperacion AND op.Nombre NOT IN ('Scrap')
GROUP BY o.FechaInicial, o.Fechafinal, o.NoOrden, o.Material, o.Cantidad, o.Estado;
I want to show ALL orders, even if an order doesn't have any cables yet.
Change the count to COUNT(op.Nombre) or COUNT(op.idOperacion) if Nombre can be NULL.
The left join keeps all rows, even those that have scrap. The best solution is to count the primary key or join key from the last table. So, I would recommend:
SELECT o.FechaInicial, o.FechaFinal, o.NoOrden, o.Material, o.Cantidad,
COUNT(op.IdOperacion) as Hecho, o.Estado
FROM Ordenes o LEFT JOIN
Cables c
ON o.IdOrden = c.IdOrden LEFT JOIN
Operaciones op
ON c.IdOperacion = op.IdOperacion AND op.Nombre NOT IN ('Scrap')
GROUP BY o.FechaInicial, o.Fechafinal, o.NoOrden, o.Material, o.Cantidad, o.Estado;
This counts only "matching" rows in Operaciones.
Note: If you count a data column, you might be missing rows where that value is NULL in the original table. That might be a good thing, depending on the intention of the query. In general, it is safer to count something that cannot be NULL when a match occurs.
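The difference between counting a Cables column and counting an Operaciones column can be checked on a small data set. This is a sketch with invented rows and only the columns the counts need; the operation name 'Corte' and the alias TodosLosCables are made up for the example.

```sql
-- Hypothetical sample data, reduced to the columns used by the counts.
CREATE TABLE Ordenes (IdOrden INTEGER PRIMARY KEY, NoOrden VARCHAR(10), Cantidad INTEGER);
CREATE TABLE Operaciones (IdOperacion INTEGER PRIMARY KEY, Nombre VARCHAR(20));
CREATE TABLE Cables (IdCable INTEGER PRIMARY KEY, IdOrden INTEGER, IdOperacion INTEGER);

INSERT INTO Ordenes VALUES (1, 'A-100', 200), (2, 'A-101', 50);
INSERT INTO Operaciones VALUES (1, 'Corte'), (2, 'Scrap');
INSERT INTO Cables VALUES
    (1, 1, 1),   -- order A-100, operation Corte
    (2, 1, 1),   -- order A-100, operation Corte
    (3, 1, 2);   -- order A-100, operation Scrap

SELECT o.NoOrden,
       COUNT(c.IdCable)      AS TodosLosCables,  -- counts every cable, Scrap included
       COUNT(op.IdOperacion) AS Hechos           -- NULL for Scrap cables, so they are skipped
FROM Ordenes o
LEFT JOIN Cables c
       ON o.IdOrden = c.IdOrden
LEFT JOIN Operaciones op
       ON c.IdOperacion = op.IdOperacion AND op.Nombre NOT IN ('Scrap')
GROUP BY o.NoOrden
ORDER BY o.NoOrden;
-- A-100 | 3 | 2
-- A-101 | 0 | 0
```

Order A-101 has no cables yet and still appears, with both counts at zero.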

SQL: If there is No match on condition (row in table) return a default value and Rows which match from different tables

I have three tables: Clinic, Stock and StockLog.
I need to get all rows where Stock.stock < 5. I also need to show whether an order has been placed and for what amount, which is found in the StockLog table.
The issue is that a user can set their stock level in the Stock table without placing an order, which is what would create a row in StockLog.
I need a query that returns the rows in the Stock table along with the related order amounts from the StockLog table. If no order has been placed in StockLog, the order amount should be zero.
I have tried:
SELECT
Clinic.Name,
Stock.NameOfMedication, Stock.Stock,
StockLog.OrderAmount
FROM
Clinic
JOIN
Stock ON Stock.ClinicID = Clinic.ClinicID
JOIN
StockLog ON StockLog.StockID = Stock.StockID
WHERE
Stock.Stock <= 5
The issue with my query is that I lose the rows that are not found in StockLog.
Any help on how to write this would be appreciated.
Thank you.
I am thinking the query should look like this:
SELECT c.Name, s.NameOfMedication, s.Stock,
COALESCE(sl.OrderAmount, 0) as OrderAmount
FROM Stock s LEFT JOIN
Clinic c
ON s.ClinicID = c.ClinicID LEFT JOIN
StockLog sl
ON sl.StockID = s.StockID
WHERE s.Stock <= 5 ;
You want to keep all rows in Stock (subject to the WHERE condition). So think: "make Stock the first table in the FROM and use LEFT JOIN for all the other tables."
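Here is a small sketch of that pattern on invented data. The table and column names follow the question, except StockLogID, which is a hypothetical key added so the table has one; the rows are made up for illustration.

```sql
-- Hypothetical sample data; StockLogID is an assumed key column.
CREATE TABLE Clinic (ClinicID INTEGER PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE Stock (
    StockID INTEGER PRIMARY KEY,
    ClinicID INTEGER,
    NameOfMedication VARCHAR(50),
    Stock INTEGER
);
CREATE TABLE StockLog (StockLogID INTEGER PRIMARY KEY, StockID INTEGER, OrderAmount INTEGER);

INSERT INTO Clinic VALUES (1, 'North Clinic');
INSERT INTO Stock VALUES
    (1, 1, 'Ibuprofen', 3),     -- low stock, order placed
    (2, 1, 'Paracetamol', 2),   -- low stock, no order yet
    (3, 1, 'Aspirin', 40);      -- plenty in stock, filtered out
INSERT INTO StockLog VALUES (1, 1, 100);

SELECT c.Name, s.NameOfMedication, s.Stock,
       COALESCE(sl.OrderAmount, 0) AS OrderAmount  -- NULL from the LEFT JOIN becomes 0
FROM Stock s
LEFT JOIN Clinic c    ON s.ClinicID = c.ClinicID
LEFT JOIN StockLog sl ON sl.StockID = s.StockID
WHERE s.Stock <= 5
ORDER BY s.StockID;
-- North Clinic | Ibuprofen   | 3 | 100
-- North Clinic | Paracetamol | 2 | 0
```

If a stock item can have several StockLog rows, this query returns one row per order; summing OrderAmount per StockID in a subquery would be one way to get back to a single row each.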
If you want to keep all the rows that result from joining Clinic and Stock, then use a LEFT OUTER JOIN with StockLog. I don't know which SQL you're using (SQL Server, MySQL, PostgreSQL, Oracle), so I can't give you a precise example, but searching for "left outer join" in the relevant documentation should work.
See this Stack Overflow post for an explanation of the various kinds of joins.