What does the SELECT function in SQL actually produce? Does it produce a new table by default? - sql

I am struggling to understand what the output of SELECT is meant to be in SQL (I am using MS ACCESS), and what sort of criteria this output needs to specify, if any. As a result, I don't understand why some queries work and others don't. So I know it retrieves data from a table, does calculations with it and displays it. But I don't understand the "inner" working of SELECT function. For instance, what is the name of data structure / entity it displays? Is it a "new" table?
And for example, suppose I have a table called "table_name", with 5 columns. One of the columns called "column_3", and there are 20 records.
SELECT column_3, COUNT(*) AS Count
FROM table_name;
Why does this query fail to run? By logic, I would expect it to display two columns: first column will be "column_3", containing 20 rows with relevant data, and second column will be "Count", containing just one non-empty row (displaying 20), and other 19 rows will be empty (or NULL maybe)?
Is it because SELECT is meant to produce equal number of rows for each column?

Your questions involve a basic understanding of SQL. SELECT statements do not create tables, but instead return virtual result sets. Nothing is persisted unless you change it to an INSERT.
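In your example, the query fails because a result set is rectangular: every row has a value for every column. COUNT(*) collapses the table into a single row, while column_3 still has 20 values, so you need to "tell" the SQL engine what you want a count "of". Because you added column_3, you need to write: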
SELECT column_3, COUNT(*) AS Count
FROM table_name
GROUP BY column_3
If you wanted a count of all the rows, simply:
SELECT COUNT(*) FROM table_name
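To see the difference between the two queries, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The names table_name and column_3 come from the question; the six sample values are made up for illustration.

```python
import sqlite3

# Hypothetical data: only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_3 TEXT)")
conn.executemany(
    "INSERT INTO table_name (column_3) VALUES (?)",
    [("a",), ("a",), ("b",), ("c",), ("c",), ("c",)],
)

# One result row per distinct column_3 value, each with its own count.
per_group = conn.execute(
    "SELECT column_3, COUNT(*) AS n FROM table_name "
    "GROUP BY column_3 ORDER BY column_3"
).fetchall()

# No grouping column: the whole table is one group, so one row comes back.
total = conn.execute("SELECT COUNT(*) FROM table_name").fetchall()

print(per_group)  # [('a', 2), ('b', 1), ('c', 3)]
print(total)      # [(6,)]
```

In both cases the result is a plain list of rows with the same number of columns in each row; nothing is written back to the database.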

Related

Get latest data for all people in a table and then filter based on some criteria

I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the size of expectation. Is there some step here that I am not getting correct?
You can first build a derived table holding max(timestamp) per person, then join it back to the original table in an outer select and filter there. The query follows:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", MAX("timestamp") AS max_ts FROM table GROUP BY "person") latest
ON latest."person" = t."person" AND latest.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
This should return no more than 1 row per table."person", with their associated maximum timestamp.
As an aside, I am surprised your query worked at all. Your HAVING clause referenced a column not in your query. From the documentation (and my experience):
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
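The distinction in that quote can be checked with a small sketch using Python's sqlite3 module. The readings table, its columns and its four rows are hypothetical stand-ins for the table in the question, and SQLite is used here only because it ships with Python.

```python
import sqlite3

# Hypothetical stand-in for the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (person TEXT, ts INTEGER, field TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("ann", 1, "xxABCDxx"),
        ("ann", 2, "other"),
        ("bob", 1, "other"),
        ("bob", 3, "ABCD"),
    ],
)

# WHERE runs first: rows without ABCD are dropped, then MAX is taken
# over whatever is left for each person.
where_first = conn.execute(
    "SELECT person, MAX(ts) FROM readings "
    "WHERE field LIKE '%ABCD%' GROUP BY person ORDER BY person"
).fetchall()

# HAVING runs after grouping: it tests group-level values such as MAX(ts).
having_after = conn.execute(
    "SELECT person, MAX(ts) FROM readings "
    "GROUP BY person HAVING MAX(ts) >= 3 ORDER BY person"
).fetchall()

# Latest row per person first, then filter that row on its own field.
latest_then_filter = conn.execute(
    "SELECT r.person, r.ts FROM readings r "
    "JOIN (SELECT person, MAX(ts) AS max_ts FROM readings GROUP BY person) m "
    "ON m.person = r.person AND m.max_ts = r.ts "
    "WHERE r.field LIKE '%ABCD%' ORDER BY r.person"
).fetchall()

print(where_first)         # [('ann', 1), ('bob', 3)]
print(having_after)        # [('bob', 3)]
print(latest_then_filter)  # [('bob', 3)]
```

Note that filtering in WHERE returns ann's older row (timestamp 1), because her latest row does not contain ABCD. Whether that is the wanted result depends on whether the filter is meant to apply before or after picking the latest entry.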

Splitting up same values in a query output

For starters I should say that I'm using openoffice.
I've got a query with the following columns.
Now when I get the results there are multiple delegations with the same value.
Let's say, for demonstration purposes, I get 6 delegations and 3 of them have the same value. In the results they are shown directly below one another, but I need a result where that doesn't happen: identical values must not appear right after each other.
So the results shouldn't look like this
but the delegations should be separated with at least one other delegation in between
I hope you got what I want to do.
You can use the following query:
SELECT DISTINCT column1, column2, ...
FROM table_name;
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values, and sometimes you only want to list the different (distinct) values.
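As a rough illustration of what DISTINCT does, here is a sketch using Python's sqlite3 module. The visits table and the delegation values are invented for the example, and SQLite stands in for OpenOffice Base.

```python
import sqlite3

# Hypothetical table: six delegations, three of which share a value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (delegation TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [("north",), ("north",), ("north",), ("south",), ("east",), ("west",)],
)

all_rows = conn.execute(
    "SELECT delegation FROM visits ORDER BY delegation"
).fetchall()

# DISTINCT collapses identical result rows into one.
distinct_rows = conn.execute(
    "SELECT DISTINCT delegation FROM visits ORDER BY delegation"
).fetchall()

print(len(all_rows))   # 6
print(distinct_rows)   # [('east',), ('north',), ('south',), ('west',)]
```

Note that this drops the repeated rows rather than spacing them apart, so it only helps if the duplicates are not needed in the output.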

SQL or statement vs multiple select queries

I have a table with an id and a name.
I get a list of ids and I need their names.
As far as I know, I have two options.
Create a for loop in my code which executes:
SELECT name from table where id=x
where x is always a number.
or write a single query like this:
SELECT name from table where id=1 OR id=2 OR id=3
The list of ids and names is enormous, so I think you wouldn't want that.
The problem is that an id is not always a number; it can be a randomly generated id containing numbers and characters. So talking about ranges is not a solution.
I'm asking this from a performance point of view.
What's a nice solution for this problem?
SQLite has limits on the size of a query, so if there is no known upper limit on the number of IDs, you cannot use a single query.
When you are reading multiple rows (note: IN (1, 2, 3) is easier than many ORs), you don't know to which ID a name belongs unless you also SELECT that, or sort the results by the ID.
There should be no noticeable difference in performance; SQLite is an embedded database without client/server communication overhead, and the query does not need to be parsed again if you use a prepared statement.
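A minimal sketch of the single-query variant, using Python's sqlite3 module with bound parameters. The items table, its rows and the wanted list are hypothetical. The id is selected alongside the name so each name can be matched back to its id, as noted above.

```python
import sqlite3

# Hypothetical table with non-numeric ids, as described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a1", "alpha"), ("b2", "beta"), ("c3", "gamma"), ("d4", "delta")],
)

wanted = ["c3", "a1", "zz"]  # "zz" is deliberately not in the table

# One "?" per id, so the values are bound rather than pasted into the SQL.
placeholders = ", ".join("?" for _ in wanted)
rows = conn.execute(
    f"SELECT id, name FROM items WHERE id IN ({placeholders})", wanted
).fetchall()

# Row order is not guaranteed, so key the names by id.
names_by_id = dict(rows)
print(names_by_id.get("c3"))  # gamma
print(names_by_id.get("zz"))  # None
```

SQLite also caps the number of bound parameters per statement, so a very long id list would need to be split into chunks; the chunk size to use depends on how the library was built.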
A "nice" solution is using the IN operator:
SELECT name from table where id in (1,2,3)
Also, the IN operator is syntactic sugar built for exactly this purpose.
SELECT name from table where id IN (1,2,3,4,5,6.....)
Assuming the list of IDs you need names for arrives as an input temp table, #InputIDTable:
SELECT name from table WHERE ID IN (SELECT id from #InputIDTable)

Returning an Access recordset with zeros instead of nulls

Here's the problem:
I have an Access query that feeds a report, which sometimes doesn't return any records for certain criteria. I would like to display zeros in the report instead of an empty line (an empty recordset is currently being returned).
Is there an SQL solution that (perhaps using some kind of union statement and/or nested SQL) always returns one record (with zeros) if there are no matching records from the initial query?
One possible solution would be to create a second table with the same primary key, and add just one record. In your query, choose as join type all records in the second table, including those with no matching records in the first one. Select as output all fields in the first table.
You can materialize a one-row table with zero for all columns. This is a slight pain to achieve in Access (ACE, Jet, whatever) because it doesn't support row constructors and the FROM must resolve to a base table. In other words, you'll need a table that is guaranteed to always contain at least one row.
This isn't a problem for me because my databases always include auxiliary tables, e.g. a calendar table, a sequence table of integers, etc. For example, to materialize a one-row, all-zeros table using my 3000-row Calendar table:
SELECT DISTINCT 0 AS c
FROM Calendar;
I can then UNION my query with my materialized table but include an antijoin to ensure the all-zeros row only appears in the resultset when my query is the empty set:
SELECT c
FROM T
UNION
SELECT 0
FROM Calendar
WHERE NOT EXISTS (
SELECT c
FROM T
);
Note the use of UNION allows me to remove the DISTINCT keyword and the AS clause ("column alias") from the materialized table.
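The UNION plus NOT EXISTS pattern can be tried out with Python's sqlite3 module. This is only a sketch: T and Calendar are named as in the answer, but their column definitions and rows are assumptions, and SQLite is used in place of Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (c INTEGER)")
# Stand-in for the auxiliary table that always has at least one row.
conn.execute("CREATE TABLE Calendar (d TEXT)")
conn.executemany(
    "INSERT INTO Calendar VALUES (?)",
    [("2024-01-01",), ("2024-01-02",), ("2024-01-03",)],
)

query = """
SELECT c FROM T
UNION
SELECT 0 FROM Calendar
WHERE NOT EXISTS (SELECT c FROM T)
"""

# T is empty: the second branch fires and UNION collapses the zeros to one row.
empty_case = conn.execute(query).fetchall()

# T has data: NOT EXISTS is false, so only the real row is returned.
conn.execute("INSERT INTO T VALUES (7)")
filled_case = conn.execute(query).fetchall()

print(empty_case)   # [(0,)]
print(filled_case)  # [(7,)]
```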

How do I query a SQL database for a lot of results that don't have any common criteria?

I have a MS SQL DB with about 2,600 records (each one information on a computer.) I need to write a SELECT statement that selects about 400 of those records.
What's the best way to do that when they don't have any common criteria? They're all just different random numbers so I can't use wildcards or anything like that. Will I just have to manually include all 400 numbers in the query?
If you need 400 specific rows whose column matches one of a set of numbers:
Yes, include all 400 numbers using an IN clause. In my experience (via code profiling), an IN clause is faster than using where column = A or column = B or ...
400 is really not a lot.
SELECT * FROM table WHERE column in (12, 13, 93, 4, ... )
If you need 400 random rows:
SELECT TOP 400 * FROM table
ORDER BY NEWID()
Rather than executing multiple queries or selecting the entire rowset and filtering it yourself, create either a temporary table or a permanent table where you can insert temporary rows for each ID. In your main query, just join on your temporary table.
For example, if your source table is...
person:
person_id
name
And you have 400 different person_id's you want, let's say we have a permanent table for our temporary rows, like this...
person_query:
query_id
person_id
You'd insert your rows into person_query, then execute your query like this:
select
*
from person p
join person_query pq on pq.person_id = p.person_id
where pq.query_id = #query_id
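Here is a sketch of that join approach using Python's sqlite3 module instead of MS SQL. The person and person_query tables follow the layout above; the sample names, the query_id value of 42 and the final cleanup step are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE person_query (query_id INTEGER, person_id INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee")],
)

query_id = 42        # hypothetical id tagging this batch of lookups
wanted_ids = [4, 2]  # the person_ids we want back

# Load the wanted ids as rows tagged with this query's id.
conn.executemany(
    "INSERT INTO person_query VALUES (?, ?)",
    [(query_id, pid) for pid in wanted_ids],
)

# The join does the filtering; the SQL text stays short however many ids there are.
rows = conn.execute(
    "SELECT p.person_id, p.name FROM person p "
    "JOIN person_query pq ON pq.person_id = p.person_id "
    "WHERE pq.query_id = ? ORDER BY p.person_id",
    (query_id,),
).fetchall()

# Remove the temporary rows once the query is done.
conn.execute("DELETE FROM person_query WHERE query_id = ?", (query_id,))

print(rows)  # [(2, 'Bob'), (4, 'Dee')]
```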
Maybe you have found a deficiency in the database design. That is, there is something common amongst the 400 records you want and what you need is another column in the database to indicate this commonality. You could then select against this new column.
As Brian Bondy said above, using the IN operator is probably the best way:
SELECT * FROM table WHERE column in (12, 13, 93, 4, ... )
One good trick is to paste the IDs in from a spreadsheet, if you have one ...
If the IDs of the rows you want are in a spreadsheet, you can add an extra column that uses CONCATENATE() to append a comma to each ID, so that the column in your spreadsheet looks like this:
12,
13,
93,
4,
then copy and paste this column of data into your query, so it looks like this:
SELECT * FROM table WHERE column in (
12,
13,
93,
4,
...
)
It doesn't look pretty, but it's a quick way of getting all the numbers in.
You could create an XML list or something of the sort which would keep track of what you need to query, and then you could write a query that would iterate through that list bringing all of them back.
Here is a website that has numerous examples of performing what you are looking for in a number of different methods (#4 is the XML method).
You can create a table with those 400+ random tokens, and select on those. e.g.,
SELECT * FROM inventory WHERE inventory_id IN (SELECT id FROM inventory_ids WHERE tag = 'foo')
You still have to maintain the other table, but at least you're not having one ginormous query.
I would build a separate table with your selection criteria and then join the tables together or something like that, assuming your criteria are static of course.
Just select the TOP n rows, and order by something random.
Below is a hypothetical example to return 10 random employee names:
SELECT TOP 10
EMP.FIRST_NAME
,EMP.LAST_NAME
FROM
Schema.dbo.Employees EMP
ORDER BY
NEWID()
For this specific situation (not necessarily a general solution) the fastest and simplest thing is probably to read the entire SQL table into memory and find your matches in your program's code rather than have the database parse a gigantic where clause.
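A minimal sketch of that in-memory approach, again with Python's sqlite3 module. The computers table, its columns and the wanted numbers are invented for illustration, and ten rows stand in for the roughly 2,600 in the question.

```python
import sqlite3

# Hypothetical stand-in for the computers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE computers (asset_no INTEGER PRIMARY KEY, hostname TEXT)")
conn.executemany(
    "INSERT INTO computers VALUES (?, ?)",
    [(n, f"pc-{n}") for n in range(1, 11)],
)

wanted = {3, 7, 12}  # a set gives fast membership tests; 12 is not in the table

# Pull every row once, then filter in application code.
all_rows = conn.execute(
    "SELECT asset_no, hostname FROM computers ORDER BY asset_no"
).fetchall()
matches = [row for row in all_rows if row[0] in wanted]

print(matches)  # [(3, 'pc-3'), (7, 'pc-7')]
```

This trades a larger transfer for a trivial query, which is likely reasonable at a few thousand rows but would not scale to a large table.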