SQL: select the record with the lowest value of the two

Despite my internet searching, I've not found a solution to what I think is a simple SQL problem.
I have a simple table as such:
zip   | location | transit
------|----------|--------
10001 | 1        | 5
10001 | 2        | 2
This table of course has a large number of zip codes, but I'd like to make a simple query by zip code that, instead of returning all rows with that zip, returns only a single row (with all 3 columns) containing the lowest transit value.
I've been playing with the aggregate function min(), but haven't gotten it right.
Using PostgreSQL 9.6
Thanks!

Use ORDER BY along with LIMIT:
SELECT t.*
FROM mytable t
WHERE t.zipcode = ?
ORDER BY t.transit
LIMIT 1

How about
select * from mytable where zip = '10001' order by transit limit 1

I would use distinct on:
select distinct on (zip) t.*
from t
order by zip, transit;
This is usually the most efficient method in Postgres, particularly with an index on (zip, transit).
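For reference, a minimal sketch of that index (the index name here is just an example):
-- supports DISTINCT ON (zip) ... ORDER BY zip, transit
CREATE INDEX t_zip_transit_idx ON t (zip, transit);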
Of course if you have only one zip code that you care about, then where/order by/limit is also totally reasonable.

Assuming that you also want to return the location value associated with the minimum transit value, then here is one possible solution using an inner join:
select t.*
from yourtable t
inner join (
    select u.zip, min(u.transit) as mt
    from yourtable u
    group by u.zip
) v on t.zip = v.zip and t.transit = v.mt
Change all references to yourtable to the name of your table.

Related

SQL SELECT WHERE IN another SELECT with GROUP_CONCAT

Good Day,
I have 3 tables: Ticket, Ticket Batch (multiple ticket rows to one batch), and Ticket Staff (multiple staff rows to one ticket). I ultimately wish to UPDATE the ticket_batch table with the COUNT of all staff working on tickets per ticket batch.
The tables, with the applicable columns, look as follows:
ticket:
| ticket_number | recon_number |
ticket_batch:
| recon_number |
ticket_staff:
| ticket_number |
So I have written the following SQL query, first to check whether I get the COUNT at all:
SELECT COUNT(*)
FROM ticket_staff
WHERE ticket_staff.ticket_number IN (SELECT GROUP_CONCAT(ticket.ticket_number) FROM ticket WHERE ticket.recon_number = 1);
This query just keeps running, but when I execute the two parts separately:
SELECT GROUP_CONCAT(ticket.ticket_number)
FROM ticket
WHERE ticket.recon_number = 1;
I get 5 ticket numbers within a split second, and if I paste that string into the other portion of the query:
SELECT COUNT(*)
FROM ticket_staff
WHERE ticket_staff.ticket_number IN (1451,1453,1968,4457,4458);
It returns the correct COUNT.
So ultimately, I guess: can I not feed a GROUP_CONCAT result into another SELECT ... WHERE IN? And how should I structure my query?
Thanks for reading :)
I prefer an inner join, as follows:
SELECT COUNT(*)
FROM ticket_staff ts
INNER JOIN ticket t
    ON ts.ticket_number = t.ticket_number
WHERE t.recon_number = 1;
GROUP_CONCAT() doesn't look right: it produces a single comma-separated string, and I suspect you are confusing that string with a list of values for IN. They are not the same thing.
In general, I would recommend EXISTS over IN anyway:
SELECT COUNT(*)
FROM ticket_staff ts
WHERE EXISTS (SELECT 1
              FROM ticket t
              WHERE ts.ticket_number = t.ticket_number AND
                    t.recon_number = 1
             );
For this query, you want an index on ticket(ticket_number, recon_number). However, I am guessing that ticket(ticket_number) is the primary key, which is enough of an index by itself.
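Since the ultimate goal is to UPDATE ticket_batch with these counts, here is a sketch of that statement (the staff_count target column is hypothetical; adjust it to your schema):
UPDATE ticket_batch tb
SET tb.staff_count = (SELECT COUNT(*)
                      FROM ticket t
                      INNER JOIN ticket_staff ts
                          ON ts.ticket_number = t.ticket_number
                      WHERE t.recon_number = tb.recon_number);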

How to get an incremental "RowId" column in SELECT using ROW_NUMBER()

I've been trying to update a query on the DataExplorer that we use on our Gaming SE site for keeping track of tags without excerpts, so that it includes an incremental row number in the results to make reading the returned values easier. There are a number of questions on here that discuss how to do this, such as this one and this one, which appear to have worked for those users, but I can't seem to get it to work for my situation.
To be clear, I would like something like this:
RowId | TagName | Count | Easy List Formatting
----------------------------------------------
1 | Tag1 | 6 | 1. [tag:tag1] (6)
2 | Tag2 | 6 | 1. [tag:tag2] (6)
3 | Tag3 | 5 | 1. [tag:tag3] (5)
4 | Tag4 | 5 | 1. [tag:tag4] (5)
What I've come up with so far is this:
SELECT ROW_NUMBER() OVER(PARTITION BY TagInfo.[Count] ORDER BY TagInfo.TagName ASC) AS RowId, *
FROM
(
    SELECT
        TagName,
        [Count],
        concat('1. [tag:', concat(TagName, concat('] (', concat([Count], ')')))) AS [Easy List Formatting]
    FROM Tags
    LEFT JOIN Posts pe ON pe.Id = Tags.ExcerptPostId
    LEFT JOIN TagSynonyms ON SourceTagName = Tags.TagName
    WHERE coalesce(len(pe.Body), 0) = 0 AND ApprovalDate IS NULL
) AS TagInfo
ORDER BY TagInfo.[Count] DESC, TagInfo.TagName
This yields something close to what I want, but not quite. The RowId column increments, but once the Count column changes, it resets (presumably because of the PARTITION BY). But, if I remove the PARTITION BY, the RowId column becomes what appear to be random numbers.
Is what I want to do achievable given the way the tables are structured? If so, what should the SQL be?
To access the forked query, you can use this link. The original query (before my changes) can be found here if it helps in any way.
Removing the PARTITION BY is exactly what is needed. The reason that your numbers look random is that the ORDER BY of the outer query is different from the ORDER BY of your ROW_NUMBER(). All you have to do is make those the same, and the output of the sequence project will have the monotonically increasing value you expect.
Specifically:
SELECT ROW_NUMBER() OVER (ORDER BY TagInfo.[Count] DESC, TagInfo.TagName) AS RowId, *
FROM
(
...
) AS TagInfo
ORDER BY TagInfo.[Count] DESC, TagInfo.TagName
Now you aren't partitioning, and the two ORDER BY clauses match, so you'll get your expected output.
For what it's worth, you technically don't really even care about having an ORDER BY in the ROW_NUMBER(), you just want the same order as the final result set. In that case, you can trick the query engine like so by providing a meaningless ORDER BY clause in the ROW_NUMBER():
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS RowId
Boom, done!
A little workaround I used: add a column to the original table that holds an identity value incrementing by 1, for example:
ALTER TABLE ur_table ADD id INT IDENTITY(1,1)
GO
After that, run your query ordering by the id column:
select * from ur_table order by id

Use of MIN and COUNT from 2 different tables in IBM DB2

How can I show which tutor teaches the fewest subjects?
This is my syntax, but I'm getting error code 42607:
select
    tut_id,
    min(count(session_code)) as subject_taught
from
    tutor,
    class
where
    tutor.tutor_id = class.tut_id
group by tut_id
Expected output:
tut_id | subject_taught
-------|---------------
id2    | 1
This is pretty simple:
WITH Subjects_Taught AS (SELECT tutor_id, COUNT(*) AS subjects_taught
                         FROM Class
                         GROUP BY tutor_id)
SELECT tutor_id, subjects_taught
FROM Subjects_Taught
WHERE subjects_taught = (SELECT MIN(subjects_taught)
                         FROM Subjects_Taught)
SQL Fiddle Example
So what's going on in the statement?
First, the Common Table Expression ->
WITH Subjects_Taught AS (SELECT tutor_id, COUNT(*) AS subjects_taught
                         FROM Class
                         GROUP BY tutor_id)
This defines an in-query view or temporary table. These are handy for abstracting certain details away, or when you end up referring to the same info twice in a statement (as we do here). Essentially, you end up with a table that looks like this:
tutor_id | subjects_taught
---------|----------------
id1      | 2
id2      | 1
id3      | 2
... so then the only thing left is to restrict ourselves to rows of this table that meet the minimum:
WHERE subjects_taught = (SELECT MIN(subjects_taught)
FROM Subjects_Taught)
... we reference our virtual table a second time, getting the minimum, as if it were a normal table.
I don't have DB2 available right now, but as far as I can see, you cannot nest aggregate functions in DB2:
... min(count(session_code)) ...
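If ties don't matter, here is a sketch of an alternative that avoids nesting aggregates entirely, using ORDER BY with FETCH FIRST (note it returns only one tutor even if several share the minimum):
-- count subjects per tutor, then keep only the smallest count
SELECT tut_id, COUNT(session_code) AS subject_taught
FROM class
GROUP BY tut_id
ORDER BY subject_taught ASC
FETCH FIRST 1 ROW ONLY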

Get last record of a table in Postgres

I'm using Postgres and cannot manage to get the last record of my table:
my_query = client.query("SELECT timestamp,value,card from my_table");
How can I do that, knowing that timestamp is a unique identifier of the record?
If under "last record" you mean the record which has the latest timestamp value, then try this:
my_query = client.query("
SELECT TIMESTAMP,
value,
card
FROM my_table
ORDER BY TIMESTAMP DESC
LIMIT 1
");
You can use
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
LIMIT 1
assuming you also want to sort by timestamp?
Easy way: ORDER BY in conjunction with LIMIT
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
LIMIT 1;
However, LIMIT is not standard, and as stated by Wikipedia, "The SQL standard's core functionality does not explicitly define a default sort order for Nulls." Also, only one row is returned when several records share the maximum timestamp.
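As a side note, here is a sketch for the ties case, assuming Postgres 13 or later: the standard FETCH FIRST clause can return all tied rows.
SELECT timestamp, value, card
FROM my_table
ORDER BY timestamp DESC
FETCH FIRST 1 ROW WITH TIES;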
Relational way:
The typical way of doing this is to check that no row has a higher timestamp than any row we retrieve.
SELECT timestamp, value, card
FROM my_table t1
WHERE NOT EXISTS (
    SELECT *
    FROM my_table t2
    WHERE t2.timestamp > t1.timestamp
);
It is my favorite solution, and the one I tend to use. The drawback is that our intent is not immediately clear when glancing at this query.
Instructive way: MAX
To circumvent this, one can use MAX in the subquery instead of the correlation.
SELECT timestamp, value, card
FROM my_table
WHERE timestamp = (
SELECT MAX(timestamp)
FROM my_table
);
But without an index, two passes on the data will be necessary, whereas the previous query can find the solution with only one scan. That said, we should not take performance into consideration when designing queries unless necessary, as we can expect optimizers to improve over time. However, this particular kind of query is quite common.
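For what it's worth, here is a sketch of an index that lets either form locate the latest row quickly (the index name is just an example):
-- a btree on timestamp serves both the MAX() subquery and the NOT EXISTS form
CREATE INDEX my_table_timestamp_idx ON my_table (timestamp);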
Show off way: Windowing functions
I don't recommend doing this, but maybe you can make a good impression on your boss or something ;-)
SELECT DISTINCT
first_value(timestamp) OVER w,
first_value(value) OVER w,
first_value(card) OVER w
FROM my_table
WINDOW w AS (ORDER BY timestamp DESC);
Actually this has the virtue of showing that a simple query can be expressed in a wide variety of ways (there are several others I can think of), and that picking one or the other form should be done according to several criteria such as:
portability (Relational/Instructive ways)
efficiency (Relational way)
expressiveness (Easy/Instructive way)
If your table has no id (such as an auto-incrementing integer) and no timestamp, you can still get the last row of a table with the following query.
select * from <tablename> offset ((select count(*) from <tablename>)-1)
For example, that could allow you to search through an updated flat file, find/confirm where the previous version ended, and copy the remaining lines to your table.
The last inserted record can be queried using this, assuming "id" is the primary key:
SELECT timestamp,value,card FROM my_table WHERE id=(select max(id) from my_table)
Assuming every new row inserted will use the highest integer value for the table's id.
If you accept a tip, create a serial id column in this table. The default of this field will be nextval('table_name_field_seq'::regclass). Then you can use a query to fetch the last inserted row. Using your example:
pg_query($connection, "SELECT currval('table_name_field_seq') AS id;");
I hope this tip helps you.
To get the last row,
Get Last row in the sorted order: In case the table has a column specifying time/primary key,
Using LIMIT clause
SELECT * FROM USERS ORDER BY CREATED_TIME DESC LIMIT 1;
Using FETCH clause - Reference
SELECT * FROM USERS ORDER BY CREATED_TIME DESC FETCH FIRST ROW ONLY;
Get Last row in the rows insertion order: In case the table has no columns specifying time/any unique identifiers
Using CTID system column, where ctid represents the physical location of the row in a table - Reference
SELECT * FROM USERS WHERE CTID = (SELECT MAX(CTID) FROM USERS);
Consider the following table:
userid | username | createdtime
-------|----------|--------------
1      | A        | 1535012279455
2      | B        | 1535042279423   // as per created time, this is the last row
3      | C        | 1535012279443
4      | D        | 1535012212311
5      | E        | 1535012254634   // as per insertion order, this is the last row
Queries 1 and 2 return:
userid | username | createdtime
-------|----------|--------------
2      | B        | 1535042279423
while query 3 returns:
userid | username | createdtime
-------|----------|--------------
5      | E        | 1535012254634
Note: when an old row is updated, Postgres removes the old row version and inserts the updated data as a new row in the table. So the CTID query (query 3) returns the tuple whose data was modified most recently.
Now, updating a row using
UPDATE USERS SET USERNAME = 'Z' WHERE USERID = '3';
the table becomes:
userid | username | createdtime
-------|----------|--------------
1      | A        | 1535012279455
2      | B        | 1535042279423
4      | D        | 1535012212311
5      | E        | 1535012254634
3      | Z        | 1535012279443
Now query 3 returns:
userid | username | createdtime
-------|----------|--------------
3      | Z        | 1535012279443
These are all good answers but if you want an aggregate function to do this to grab the last row in the result set generated by an arbitrary query, there's a standard way to do this (taken from the Postgres wiki, but should work in anything conforming reasonably to the SQL standard as of a decade or more ago):
-- Create a function that always returns the last non-NULL item
CREATE OR REPLACE FUNCTION public.last_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE SQL IMMUTABLE STRICT AS $$
SELECT $2;
$$;
-- And then wrap an aggregate around it
CREATE AGGREGATE public.LAST (
sfunc = public.last_agg,
basetype = anyelement,
stype = anyelement
);
It's usually preferable to do select ... limit 1 if you have a reasonable ordering, but this is useful if you need to do this within an aggregate and would prefer to avoid a subquery.
See also this question for a case where this is the natural answer.
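A quick usage sketch for that aggregate (assuming the my_table from the question; the ORDER BY inside the call supplies the ordering):
-- latest card value, using the custom LAST aggregate defined above
SELECT public.last(card ORDER BY timestamp) FROM my_table;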
The column name plays an important role in the descending order:
SELECT <COLUMN_NAME1>, <COLUMN_NAME2> FROM <TABLENAME> ORDER BY <COLUMN_NAME_THAT_MENTIONS_TIME> DESC LIMIT 1;
For example, the table below (user_details) has a created_at column that holds the timestamp for each row.
SELECT userid, username FROM user_details ORDER BY created_at DESC LIMIT 1;
In Oracle SQL,
select * from (select row_number() over (order by rowid desc) rn, emp.* from emp) where rn=1;
select * from table_name order by timestamp desc limit 1;

I DISTINCTly hate MySQL (help building a query)

This is straightforward, I believe:
I have a table with 30,000 rows. When I SELECT DISTINCT `location` FROM myTable it returns 21,000 rows, about what I'd expect, but it only returns that one column.
What I want is to move those to a new table, but the whole row for each match.
My best guess is something like SELECT * FROM (SELECT DISTINCT `location` FROM myTable), or something like that, but it gives me a vague syntax error.
Is there a good way to grab the rest of each DISTINCT row and move it to a new table all in one go?
SELECT * FROM myTable GROUP BY `location`
or if you want to move to another table
CREATE TABLE foo AS SELECT * FROM myTable GROUP BY `location`
DISTINCT applies to the entire returned row. So you can simply use
SELECT DISTINCT * FROM myTable GROUP BY `location`
Using DISTINCT on a single column doesn't make a lot of sense. Let's say I have the following simple set:
-id-  -location-
 1    store
 2    store
 3    home
If there were some sort of query that returned all columns, but was distinct on just location, which row would be returned? 1 or 2? Should it just pick one at random? Because of this, DISTINCT works on all columns in the returned result set.
Well, first you need to decide what you really want returned.
The problem is that, presumably, for some of the location values in your table there are different values in the other columns even when the location value is the same:
Location | OtherCol | StillOtherCol
---------|----------|--------------
Place1   | 1        | Fred
Place1   | 89       | Fred
Place1   | 1        | Joe
In that case, which of the three rows do you want to select? When you talk about a DISTINCT Location, you're condensing those three rows of different data into a single row, there's no meaning to moving the original rows from the original table into a new table since those original rows no longer exist in your DISTINCT result set. (If all the other columns are always the same for a given Location, your problem is easier: Just SELECT DISTINCT * FROM YourTable).
If you don't care which values come from the other columns you can use a (bad, IMHO) MySQL extension to SQL and do:
SELECT * FROM YourTable GROUP BY Location
which will give a result set with one row per location and values for the other columns derived from the original data in an undefined fashion.
Multiple rows with identical values in all columns make no sense. OK, the question might be a way to correct exactly that situation.
Considering this table, with id being the PK:
kram=# select * from foba;
id | no | name
----+----+---------------
2 | 1 | a
3 | 1 | b
4 | 2 | c
5 | 2 | a,b,c,d,e,f,g
you may extract a sample row for every single no (standing in for your location) by grouping over that column and selecting the row with the minimum PK (for example):
SELECT * FROM foba WHERE id IN (SELECT min (id) FROM foba GROUP BY no);
id | no | name
----+----+------
2 | 1 | a
4 | 2 | c
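To actually move those representative rows into a new table, as the original question asks, here is a sketch along the same lines (assuming an id primary key as above; myNewTable is a hypothetical name):
CREATE TABLE myNewTable AS
SELECT * FROM myTable
WHERE id IN (SELECT MIN(id) FROM myTable GROUP BY location);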