Difference between SELECT * and SELECT tablename.* - sql

This is a basic question about SQL statements.
What is the difference between
SELECT * FROM "Users"
and
SELECT "Users".* FROM "Users"

[TableName].[column] is usually used to pinpoint the table you mean when two tables are present in a join or a complex statement and both have a column with the same name.
Its most common use is in a join, though; for a basic statement such as the one above there is no difference and the output will be the same.

In your case there is no difference. A difference emerges when you select from multiple tables: * takes the columns from all the tables, while TABLE_NAME.* takes all the columns from that one table. Suppose we have a database with 2 tables:
mysql> SELECT * FROM report;
+----+------------+
| id | date |
+----+------------+
| 1 | 2013-05-01 |
| 2 | 2013-06-02 |
+----+------------+
mysql> SELECT * FROM sites_to_report;
+---------+-----------+---------------------+------+
| site_id | report_id | last_run | rows |
+---------+-----------+---------------------+------+
| 1 | 1 | 2013-05-01 16:20:21 | 1 |
| 1 | 2 | 2013-05-03 16:20:21 | 1 |
| 2 | 2 | 2013-05-03 14:21:47 | 1 |
+---------+-----------+---------------------+------+
mysql> SELECT
-> *
-> FROM
-> report
-> INNER JOIN
-> sites_to_report
-> ON
-> sites_to_report.report_id=report.id;
+----+------------+---------+-----------+---------------------+------+
| id | date | site_id | report_id | last_run | rows |
+----+------------+---------+-----------+---------------------+------+
| 1 | 2013-05-01 | 1 | 1 | 2013-05-01 16:20:21 | 1 |
| 2 | 2013-06-02 | 1 | 2 | 2013-05-03 16:20:21 | 1 |
| 2 | 2013-06-02 | 2 | 2 | 2013-05-03 14:21:47 | 1 |
+----+------------+---------+-----------+---------------------+------+
mysql> SELECT
-> report.*
-> FROM
-> report
-> INNER JOIN
-> sites_to_report
-> ON
-> sites_to_report.report_id=report.id;
+----+------------+
| id | date |
+----+------------+
| 1 | 2013-05-01 |
| 2 | 2013-06-02 |
| 2 | 2013-06-02 |
+----+------------+

For the example you gave, there is no semantic difference between them. Any performance difference would be negligible: just parsing two strings of different lengths.
But that is only true for the example you gave. In queries that involve multiple tables, tableName.* identifies the table whose columns we want to select.
Example:
Suppose you have two tables, TableA and TableB, and both have a column named Name. To specify which table's Name column you want, use the table name as a qualifier:
`select TableA.Name, TableB.Name from TableA, TableB where TableA.age=TableB.age`
That's all I can say.
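To make the qualifier's role concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The sample rows and the function name `qualified_name_demo` are made up for illustration. It shows that an unqualified Name is rejected as ambiguous in a join, while the qualified form works.

```python
import sqlite3

def qualified_name_demo():
    """Return (unqualified_name_was_ambiguous, rows_from_qualified_query)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: both tables have a column called Name.
    conn.executescript("""
        CREATE TABLE TableA (Name TEXT, age INTEGER);
        CREATE TABLE TableB (Name TEXT, age INTEGER);
        INSERT INTO TableA VALUES ('Ann', 30);
        INSERT INTO TableB VALUES ('Bob', 30);
    """)
    try:
        # Unqualified Name could come from either table, so SQLite refuses it.
        conn.execute(
            "SELECT Name FROM TableA JOIN TableB ON TableA.age = TableB.age"
        )
        ambiguous = False
    except sqlite3.OperationalError:
        ambiguous = True
    # Qualifying each column with its table name removes the ambiguity.
    rows = conn.execute(
        "SELECT TableA.Name, TableB.Name "
        "FROM TableA JOIN TableB ON TableA.age = TableB.age"
    ).fetchall()
    conn.close()
    return ambiguous, rows
```

Other database products word the error differently, but an unqualified column name that exists in more than one joined table is generally rejected.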

The particular examples specified would return the same result and have the same performance. There would be no difference in that respect, therefore.
However, in some SQL products, the difference in how * and alias.* are interpreted affects, in particular, what else you can add to the query. More specifically, in Oracle, you can mix an alias.* with other expressions being returned as columns, i.e. this
SELECT "Users".*, SomeColumn * 2 AS DoubleValue FROM "Users"
would work. At the same time, * must stand on its own, meaning that the following
SELECT *, SomeColumn * 2 AS DoubleValue FROM "Users"
would be illegal.

For the examples you provided, the only difference is in syntax. What both of the queries share is that they are really bad. Select * is evil no matter how you write it and can get you into all kinds of trouble. Get into the habit of listing the columns you want to have included in your result set.

Related

Find sequence of choice in a column

There is a table where user_id identifies each test taker and choice holds that taker's answer to each of the three questions. I would like to get all the different sequences of choices that test takers made and count each sequence. Is there a way to write a SQL query to achieve this? Thanks
----------------------------------
| user_id | Choice |
----------------------------------
| 1 | a |
----------------------------------
| 1 | b |
----------------------------------
| 1 | c |
----------------------------------
| 2 | b |
----------------------------------
| 2 | c |
----------------------------------
| 2 | a |
----------------------------------
Desired answer:
----------------------------------
| choice | count |
----------------------------------
| a,b,c | 1 |
----------------------------------
| b,c,a | 1 |
-----------------------------------
In BigQuery, you can use aggregation functions:
select choices, count(*)
from (select string_agg(choice order by ?) as choices, user_id
from t
group by user_id
) t
group by choices;
The ? is for the column that specifies the ordering of the table. Remember: tables represent unordered sets, so without such a column the choices can be in any order.
You can do something similar in SQL Server 2017+ using string_agg(). In earlier versions, you have to use an XML method, which is rather unpleasant.
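For readers who want to try the idea without BigQuery or SQL Server, here is a rough Python equivalent using the built-in sqlite3 module. It assumes a hypothetical ordering column named `seq` (the question's table does not show one, which is exactly why the ? placeholder above is needed). The concatenation is done in Python rather than with an ordered aggregate, so this is a sketch of the same logic, not the query above.

```python
import sqlite3
from collections import Counter
from itertools import groupby

def build_sample_db():
    """Create the question's table t, plus a hypothetical ordering column seq."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (user_id INTEGER, seq INTEGER, choice TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [(1, 1, "a"), (1, 2, "b"), (1, 3, "c"),
         (2, 1, "b"), (2, 2, "c"), (2, 3, "a")],
    )
    return conn

def count_choice_sequences(conn):
    """Map each comma-joined choice sequence to the number of users who made it."""
    # The explicit ORDER BY is what makes each user's sequence well defined.
    rows = conn.execute("SELECT user_id, choice FROM t ORDER BY user_id, seq")
    sequences = (
        ",".join(choice for _, choice in group)
        for _, group in groupby(rows, key=lambda row: row[0])
    )
    return dict(Counter(sequences))
```

Calling `count_choice_sequences(build_sample_db())` groups the ordered rows per user, joins each user's choices, and counts identical sequences.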

Why is this Query not Updateable?

I was looking to provide an answer to this question in which the OP has two tables:
Table1
+--------+--------+
| testID | Status |
+--------+--------+
| 1 | |
| 2 | |
| 3 | |
+--------+--------+
Table2
+----+--------+--------+--------+
| ID | testID | stepID | status |
+----+--------+--------+--------+
| 1 | 1 | 1 | pass |
| 2 | 1 | 2 | fail |
| 3 | 1 | 3 | pass |
| 4 | 2 | 1 | pass |
| 5 | 2 | 2 | pass |
| 6 | 3 | 1 | fail |
+----+--------+--------+--------+
Here, the OP is looking to update the Status field for each testID in Table1 with pass if all stepID records associated with that testID in Table2 have a status of pass; otherwise Table1 should be updated with fail for that testID.
In this example, the result should be:
+--------+--------+
| testID | Status |
+--------+--------+
| 1 | fail |
| 2 | pass |
| 3 | fail |
+--------+--------+
I wrote the following SQL code in an effort to accomplish this:
update Table1 a inner join
(
select
b.testID,
iif(min(b.status)=max(b.status) and min(b.status)='pass','pass','fail') as v
from Table2 b
group by b.testID
) c on a.testID = c.testID
set a.testStatus = c.v
However, MS Access reports the all-too-familiar, 'operation must use an updateable query' response.
I know that a query is not updateable if there is a one-to-many relationship between the record being updated and the set of values, but in this case, the aggregated subquery would yield a one-to-one relationship between the two testID fields.
Which left me asking, why is this query not updateable?
You're joining in a query with an aggregate (Max).
Aggregates are not updateable. In Access, in an update query, every part of the query has to be updateable (with the exception of simple expressions, and subqueries in WHERE part of your query), which means your query is not updateable.
You can work around this by using domain aggregates (DMin and DMax) instead of real ones, but this query will take a large performance hit if you do.
You can also work around it by rewriting your aggregates to take place in an EXISTS or NOT EXISTS clause, since that's part of the WHERE clause and thus doesn't need to be updateable. That would likely affect performance only minimally, but it means you have to split this query in two: one query to set all the fields to "pass" that meet your condition, another to set them to "fail" if they don't.
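The two-query EXISTS / NOT EXISTS approach can be sketched as follows. This runs against SQLite through Python's sqlite3 module, not MS Access, so it only illustrates the shape of the two statements; Access syntax and behavior may differ. The function names are made up, and the column in Table1 is assumed to be called Status as in the table shown in the question.

```python
import sqlite3

def build_sample_db():
    """Recreate Table1 and Table2 from the question in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (testID INTEGER, Status TEXT);
        CREATE TABLE Table2 (ID INTEGER, testID INTEGER, stepID INTEGER, status TEXT);
        INSERT INTO Table1 (testID) VALUES (1), (2), (3);
        INSERT INTO Table2 VALUES
            (1, 1, 1, 'pass'), (2, 1, 2, 'fail'), (3, 1, 3, 'pass'),
            (4, 2, 1, 'pass'), (5, 2, 2, 'pass'), (6, 3, 1, 'fail');
    """)
    return conn

def update_statuses(conn):
    """Run the two updates and return Table1 as (testID, Status) rows."""
    # Query 1: 'pass' where no step of the test has a status other than 'pass'.
    # Note: a testID with no rows at all in Table2 would also become 'pass'.
    conn.execute("""
        UPDATE Table1 SET Status = 'pass'
        WHERE NOT EXISTS (SELECT 1 FROM Table2
                          WHERE Table2.testID = Table1.testID
                            AND Table2.status <> 'pass')
    """)
    # Query 2: 'fail' where at least one step has a status other than 'pass'.
    conn.execute("""
        UPDATE Table1 SET Status = 'fail'
        WHERE EXISTS (SELECT 1 FROM Table2
                      WHERE Table2.testID = Table1.testID
                        AND Table2.status <> 'pass')
    """)
    return conn.execute(
        "SELECT testID, Status FROM Table1 ORDER BY testID"
    ).fetchall()
```

With the sample data from the question, `update_statuses(build_sample_db())` marks test 2 as pass and tests 1 and 3 as fail.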

SQL / Oracle to Tableau - How to combine to sort based on two fields?

I have tables below as follows:
tbl_tasks
+---------+-------------+
| Task_ID | Assigned_ID |
+---------+-------------+
| 1 | 8 |
| 2 | 12 |
| 3 | 31 |
+---------+-------------+
tbl_resources
+---------+-----------+
| Task_ID | Source_ID |
+---------+-----------+
| 1 | 4 |
| 1 | 10 |
| 2 | 42 |
| 4 | 8 |
+---------+-----------+
A task is assigned to at least one person (denoted by the "assigned_ID") and then any number of people can be assigned as a source (denoted by "source_ID"). The ID numbers are all linked to names in another table. Though the ID numbers are named differently, they all return to the same table.
Would there be any way for me to combine the two tables based on ID so that I could search on someone's ID number? For example, if I filter with WHERE User_ID = 8 to see which tasks person 8 is involved in, I would get back Task 1 and Task 4.
Right now, by joining all the tables together, I can easily filter on "Assigned" but not "Source" due to all the multiple entries in the table.
Use union all:
select distinct task_id
from ((select task_id, assigned_id as id
from tbl_tasks
) union all
(select task_id, source_id
from tbl_resources
)
) ti
where id = ?;
Note that this uses select distinct in case someone is assigned to the same task in both tables. If not, remove the distinct.
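As a quick check of the query above against the question's sample data, here is a small sketch using Python's sqlite3 module in place of Oracle. The helper names are made up for illustration, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

def build_sample_db():
    """Recreate tbl_tasks and tbl_resources from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_tasks (Task_ID INTEGER, Assigned_ID INTEGER);
        CREATE TABLE tbl_resources (Task_ID INTEGER, Source_ID INTEGER);
        INSERT INTO tbl_tasks VALUES (1, 8), (2, 12), (3, 31);
        INSERT INTO tbl_resources VALUES (1, 4), (1, 10), (2, 42), (4, 8);
    """)
    return conn

def tasks_for_person(conn, person_id):
    """Return the task IDs where person_id is either assignee or source."""
    sql = """
        SELECT DISTINCT task_id
        FROM (SELECT task_id, assigned_id AS id FROM tbl_tasks
              UNION ALL
              SELECT task_id, source_id FROM tbl_resources) AS ti
        WHERE id = ?
        ORDER BY task_id
    """
    return [row[0] for row in conn.execute(sql, (person_id,))]
```

For person 8, this returns tasks 1 and 4, matching the result the question asks for.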

MS Access ComboBox source Query values issue

Here is my issue:
I have a combobox whose source must be the union of two tables.
One table is the local table AllUsers, which has only one record:
+----------+----------+
| IndexKey | UserName |
+----------+----------+
| -1 | ALL |
+----------+----------+
The second one, dbo_NGAC_USERINFO, is linked from MS SQL Server. I get only two fields from it:
+-----------+----------+
|IndexKey | Name |
+-----------+----------+
| 1 | Tedo |
+-----------+----------+
| 2 | Tornike |
+-----------+----------+
| 4 | John |
+-----------+----------+
So I want the union of these tables, which would look like this:
+-----------+----------+
|-1 | ALL |
+-----------+----------+
| 1 | Tedo |
+-----------+----------+
| 2 | Tornike |
+-----------+----------+
| 4 | John |
+-----------+----------+
But my problem is:
If I write the union query, it shows blank values for IndexKey and correct values for Name. But if I select from only the first or only the second table in the query, it shows correct results.
Here is my code that shows incorrect results:
SELECT *
FROM AllUsers
UNION ALL
SELECT dbo_NGAC_USERINFO.IndexKey, dbo_NGAC_USERINFO.Name
FROM dbo_NGAC_USERINFO
I tried: writing the values from the AllUsers table manually, using UNION instead of UNION ALL, swapping the order of the two tables, ordering, creating a subquery, and aliasing the field names, but all my attempts were unsuccessful.
Any help will be appreciated, thanks in advance.
Try to be more specific:
SELECT AllUsers.IndexKey, AllUsers.UserName
FROM AllUsers
UNION ALL
SELECT dbo_NGAC_USERINFO.IndexKey, dbo_NGAC_USERINFO.Name
FROM dbo_NGAC_USERINFO

1 to Many Query: Help Filtering Results

Problem: SQL Query that looks at the values in the "Many" relationship, and doesn't return values from the "1" relationship.
Tables Example: (this shows two different tables).
+---------------+----------------------------+-------+
| Unique Number | <-- Table 1 -- Table 2 --> | Roles |
+---------------+----------------------------+-------+
| 1 | | A |
| 2 | | B |
| 3 | | C |
| 4 | | D |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |
| 10 | | |
+---------------+----------------------------+-------+
When I run my query, I get multiple, unique numbers that show all of the roles associated to each number like so.
+---------------+-------+
| Unique Number | Roles |
+---------------+-------+
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 4 | C |
| 4 | A |
| 5 | B |
| 5 | C |
| 5 | D |
| 6 | D |
| 6 | A |
+---------------+-------+
I would like to be able to run my query and be able to say, "When the role of A is present, don't even show me the unique numbers that have the role of A".
Maybe if SQL could look at the roles and say, WHEN role A comes up, grab unique number and remove it from column 1.
Based on what I would "like" to happen (I put that in quotations as this might not even be possible) the following is what I would expect my query to return.
+---------------+-------+
| Unique Number | Roles |
+---------------+-------+
| 1 | C |
| 1 | D |
| 5 | B |
| 5 | C |
| 5 | D |
+---------------+-------+
UPDATE:
Query Example: I am querying 8 tables, but I condensed it to 4 for simplicity.
SELECT
c.UniqueNumber,
cp.pType,
p.pRole,
a.aRole
FROM c
JOIN cp ON cp.uniqueVal = c.uniqueVal
JOIN p ON p.uniqueVal = cp.uniqueVal
LEFT OUTER JOIN a ON a.uniqueVal = p.uniqueVal
WHERE
--I do some basic filtering to get to the relevant clients data but nothing more than that.
ORDER BY
c.uniqueNumber
Table sizes: these tables can have anywhere from 50,000 rows to 500,000+
Pretending the table name is t and the column names are alpha and numb:
SELECT t.numb, t.alpha
FROM t
LEFT JOIN t AS s ON t.numb = s.numb
AND s.alpha = 'A'
WHERE s.numb IS NULL;
You can also do a subselect:
SELECT numb, alpha
FROM t
WHERE numb NOT IN (SELECT numb FROM t WHERE alpha = 'A');
Or one of the following if the subselect is materializing more than once (pick the one that is faster, i.e., the one with the smaller subtable size):
SELECT t.numb, t.alpha
FROM t
JOIN (SELECT numb FROM t GROUP BY numb HAVING SUM(alpha = 'A') = 0) AS s USING (numb);
SELECT t.numb, t.alpha
FROM t
LEFT JOIN (SELECT numb FROM t GROUP BY numb HAVING SUM(alpha = 'A') > 0) AS s USING (numb)
WHERE s.numb IS NULL;
But the first one is probably faster and better[1]. Any of these methods can be folded into a larger query with multiple additional tables being joined in.
[1] Straight joins tend to be easier to read and faster to execute than queries involving subselects, and the exceptions are rare for self-referential joins because they require a large mismatch in the size of the tables. You might hit those exceptions, though, if the number of rows that reference the 'A' alpha value is exceptionally small and the column is indexed properly.
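To see that the LEFT JOIN form and the NOT IN form agree on the question's sample data, here is a minimal sketch using Python's sqlite3 module. The table name t and columns numb and alpha follow the answer above. The helper names are made up, and the ORDER BY clauses are added only so the two results can be compared directly.

```python
import sqlite3

# Anti-join: keep rows of t whose numb has no matching row with alpha = 'A'.
LEFT_JOIN_SQL = """
    SELECT t.numb, t.alpha
    FROM t
    LEFT JOIN t AS s ON t.numb = s.numb AND s.alpha = 'A'
    WHERE s.numb IS NULL
    ORDER BY t.numb, t.alpha
"""

# Same filter written as an uncorrelated subselect.
NOT_IN_SQL = """
    SELECT numb, alpha
    FROM t
    WHERE numb NOT IN (SELECT numb FROM t WHERE alpha = 'A')
    ORDER BY numb, alpha
"""

def build_sample_db():
    """Load the (Unique Number, Roles) pairs shown in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (numb INTEGER, alpha TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(1, "C"), (1, "D"), (2, "A"), (2, "B"), (3, "A"), (3, "B"),
         (4, "C"), (4, "A"), (5, "B"), (5, "C"), (5, "D"), (6, "D"), (6, "A")],
    )
    return conn

def rows_without_role_a(conn, sql):
    """Run one of the queries above and return its rows as a list of tuples."""
    return conn.execute(sql).fetchall()
```

Both queries return only the rows for numbers 1 and 5, the only numbers that never appear with role A. Which form is faster depends on the database, table sizes, and indexes, so it is worth measuring on real data.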
There are many ways to do it, and the trade-offs depend on factors such as the size of the tables involved and what indexes are available. On general principles, my first instinct is to avoid a correlated subquery such as another, now-deleted answer proposed, but if the relationship table is small then it probably doesn't matter.
This version instead uses an uncorrelated subquery in the where clause, in conjunction with the not in operator:
select num, role
from one_to_many
where num not in (select otm2.num from one_to_many otm2 where otm2.role = 'A')
That form might be particularly effective if there are many rows in one_to_many, but only a small proportion have role A. Of course you can add an order by clause if the order in which result rows are returned is important.
There are also alternatives involving joining inline views or CTEs, and some of those might have advantages under particular circumstances.