How do I query a SQL database for a lot of results that don't have any common criteria? - sql

I have an MS SQL database with about 2,600 records (each one holds information on a computer). I need to write a SELECT statement that selects about 400 of those records.
What's the best way to do that when they don't have any common criteria? They're all just different random numbers so I can't use wildcards or anything like that. Will I just have to manually include all 400 numbers in the query?

If you need 400 specific rows whose column matches one of a set of numbers:
Yes, include all 400 numbers using an IN clause. In my experience (via code profiling), an IN clause is faster than using where column = A or column = B or ...
400 is really not a lot.
SELECT * FROM table WHERE column in (12, 13, 93, 4, ... )
If you need 400 random rows:
SELECT TOP 400 * FROM table
ORDER BY NEWID()

Rather than executing multiple queries or selecting the entire rowset and filtering it yourself, create either a temporary table or a permanent table where you can insert temporary rows for each ID. In your main query, just join on your temporary table.
For example, if your source table is...
person:
person_id
name
And you have 400 different person_id's you want, let's say we have a permanent table for our temporary rows, like this...
person_query:
query_id
person_id
You'd insert your rows into person_query, then execute your query like this:
select
*
from person p
join person_query pq on pq.person_id = p.person_id
where pq.query_id = @query_id
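The lookup-table approach above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MS SQL Server. The table and column names follow the example above; the sample rows and the query_id value of 1 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person_query (query_id INTEGER, person_id INTEGER);
""")

# Hypothetical sample data: ten people, of whom we want three.
conn.executemany(
    "INSERT INTO person (person_id, name) VALUES (?, ?)",
    [(i, f"person-{i}") for i in range(1, 11)],
)

wanted_ids = [2, 5, 9]
query_id = 1  # assumed value; identifies this batch of IDs in person_query

# One bulk insert of the wanted IDs, tagged with the batch's query_id.
conn.executemany(
    "INSERT INTO person_query (query_id, person_id) VALUES (?, ?)",
    [(query_id, pid) for pid in wanted_ids],
)

# The main query joins on the lookup table instead of listing every ID.
rows = conn.execute(
    """
    SELECT p.person_id, p.name
    FROM person p
    JOIN person_query pq ON pq.person_id = p.person_id
    WHERE pq.query_id = ?
    ORDER BY p.person_id
    """,
    (query_id,),
).fetchall()
print(rows)

# Remove the batch afterwards so the permanent table does not keep growing.
conn.execute("DELETE FROM person_query WHERE query_id = ?", (query_id,))
```

With a permanent person_query table, the query_id is what keeps concurrent batches apart; a true temporary table would not need it.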

Maybe you have found a deficiency in the database design. That is, there is something common amongst the 400 records you want and what you need is another column in the database to indicate this commonality. You could then select against this new column.

As Brian Bondy said above, using the IN clause is probably the best way:
SELECT * FROM table WHERE column in (12, 13, 93, 4, ... )
One good trick is to paste the IDs in from a spreadsheet, if you have one ...
If the IDs of the rows you want are in a spreadsheet, you can add an extra column that uses CONCATENATE() to append a comma to each ID, so that the column in your spreadsheet looks like this:
12,
13,
93,
4,
then copy and paste this column of data into your query, so it looks like this:
SELECT * FROM table WHERE column in (
12,
13,
93,
4,
...
)
It doesn't look pretty, but it's a quick way of getting all the numbers in.

You could create an XML list or something of the sort which would keep track of what you need to query, and then you could write a query that would iterate through that list bringing all of them back.
Here is a website that has numerous examples of performing what you are looking for in a number of different methods (#4 is the XML method).

You can create a table with those 400+ random tokens, and select on those. e.g.,
SELECT * FROM inventory WHERE inventory_id IN (SELECT id FROM inventory_ids WHERE tag = 'foo')
You still have to maintain the other table, but at least you're not having one ginormous query.

I would build a separate table with your selection criteria and then join the tables together, assuming your criteria are static, of course.

Just select the TOP n rows, and order by something random.
Below is a hypothetical example to return 10 random employee names:
SELECT TOP 10
EMP.FIRST_NAME
,EMP.LAST_NAME
FROM
Schema.dbo.Employees EMP
ORDER BY
NEWID()

For this specific situation (not necessarily a general solution) the fastest and simplest thing is probably to read the entire SQL table into memory and find your matches in your program's code rather than have the database parse a gigantic where clause.

Related

What does the SELECT function in SQL actually produce? Does it produce a new table by default?

I am struggling to understand what the output of SELECT is meant to be in SQL (I am using MS ACCESS), and what sort of criteria this output needs to specify, if any. As a result, I don't understand why some queries work and others don't. So I know it retrieves data from a table, does calculations with it and displays it. But I don't understand the "inner" working of SELECT function. For instance, what is the name of data structure / entity it displays? Is it a "new" table?
And for example, suppose I have a table called "table_name", with 5 columns. One of the columns called "column_3", and there are 20 records.
SELECT column_3, COUNT(*) AS Count
FROM table_name;
Why does this query fail to run? By logic, I would expect it to display two columns: first column will be "column_3", containing 20 rows with relevant data, and second column will be "Count", containing just one non-empty row (displaying 20), and other 19 rows will be empty (or NULL maybe)?
Is it because SELECT is meant to produce equal number of rows for each column?
Your questions involve a basic understanding of SQL. SELECT statements do not create tables, but instead return virtual result sets. Nothing is persisted unless you change it to an INSERT.
In your example question, you will need to "tell" the SQL engine what you want a count "of". Because you added column_3, you need to write:
SELECT column_3, COUNT(*) AS Count
FROM table_name
GROUP BY column_3
If you wanted a count of all the rows, simply:
SELECT COUNT(*) FROM table_name
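The two working queries above can be compared side by side in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MS Access, with six made-up rows in place of the 20 from the question. The failing query is not reproduced because engines differ in how strictly they reject a non-grouped column next to an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_3 TEXT)")

# Hypothetical sample data: three distinct values, six rows in total.
conn.executemany(
    "INSERT INTO table_name (column_3) VALUES (?)",
    [("a",), ("a",), ("b",), ("c",), ("c",), ("c",)],
)

# One result row per distinct column_3 value, each with its own count.
per_value = conn.execute(
    "SELECT column_3, COUNT(*) AS cnt FROM table_name "
    "GROUP BY column_3 ORDER BY column_3"
).fetchall()

# A single result row holding the count of all rows.
total = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]

print(per_value)
print(total)
```

Either way, the result set is rectangular: every row has a value for every selected column, which is why a 20-row column cannot sit next to a 1-row aggregate.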

Inserting lots of rows that only have a single value change into table with lots of column

I have a table I need to insert about 260 rows into. The data will be exactly the same EXCEPT for the value of a single column, "project". If this were a small table I would just write it all out using UNION ALL, but there are 66 columns in the table and that is a LOT of repetitive typing. Is there a method of inserting nearly identical rows without having to repeat it all like this? If it makes any difference, it is on an MS SQL 2008 R2 server.
Assuming I'm understanding your requirements correctly, something like this could perhaps work with a subquery building the project values:
insert into yourtable
select 1, 'Another Value', ..., t.project
from (select 1 as project union all select 2 ... select 260) t
Depending on your table structure, you may need to supply the column names.
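As a rough illustration of the same idea, the sketch below loads the 260 project values into a helper table and lets one INSERT ... SELECT supply the constant columns. It runs on Python's built-in sqlite3 module rather than MS SQL, and the three-column yourtable, its column names, and the helper table projects are all hypothetical stand-ins for the real 66-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical three-column stand-in for the real 66-column table.
conn.execute("CREATE TABLE yourtable (status INTEGER, note TEXT, project INTEGER)")

# Helper table holding only the values that differ from row to row.
conn.execute("CREATE TEMP TABLE projects (project INTEGER)")
conn.executemany(
    "INSERT INTO projects (project) VALUES (?)",
    [(n,) for n in range(1, 261)],
)

# The constant columns are written once; only project comes from the helper table.
conn.execute("""
    INSERT INTO yourtable (status, note, project)
    SELECT 1, 'Another Value', t.project
    FROM projects t
""")

row_count = conn.execute("SELECT COUNT(*) FROM yourtable").fetchone()[0]
distinct_projects = conn.execute(
    "SELECT COUNT(DISTINCT project) FROM yourtable"
).fetchone()[0]
print(row_count, distinct_projects)
```

Naming the target columns explicitly, as here, avoids depending on the table's column order.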

SQL or statement vs multiple select queries

I have a table with an id and a name.
I'm getting a list of IDs and I need their names.
As far as I know, I have two options.
Create a for loop in my code which executes:
SELECT name from table where id=x
where x is always a number.
Or write a single query like this:
SELECT name from table where id=1 OR id=2 OR id=3
The list of IDs and names is enormous, so I think you wouldn't want that.
The problem with the IDs is that an ID is not always a number; it can be a randomly generated ID containing digits and letters. So querying by range is not a solution.
I'm asking this from a performance point of view.
What's a nice solution for this problem?
SQLite has limits on the size of a query, so if there is no known upper limit on the number of IDs, you cannot use a single query.
When you are reading multiple rows (note: IN (1, 2, 3) is easier than many ORs), you don't know to which ID a name belongs unless you also SELECT that, or sort the results by the ID.
There should be no noticeable difference in performance; SQLite is an embedded database without client/server communication overhead, and the query does not need to be parsed again if you use a prepared statement.
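Both options from the question can be sketched with Python's built-in sqlite3 module to show that they return the same data, provided the single query also selects the id. The table name items (chosen because table is a reserved word) and the sample IDs are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [("a1x", "alpha"), ("b7q", "beta"), ("c3z", "gamma"), ("d9k", "delta")],
)

def names_one_by_one(conn, ids):
    """Option 1: one parameterized query per ID."""
    result = {}
    for item_id in ids:
        row = conn.execute(
            "SELECT name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is not None:
            result[item_id] = row[0]
    return result

def names_with_in(conn, ids):
    """Option 2: one query with IN; selects id too so names can be matched up."""
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, name FROM items WHERE id IN ({placeholders})"
    return dict(conn.execute(sql, list(ids)).fetchall())

wanted = ["c3z", "a1x", "zzz"]  # "zzz" does not exist in the table
print(names_one_by_one(conn, wanted))
print(names_with_in(conn, wanted))
```

Returning a dictionary keyed by id sidesteps the ordering problem: the IN query makes no promise about the order of its rows.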
A "nice" solution is using the IN operator:
SELECT name from table where id in (1,2,3)
Also, the IN operator is syntactic sugar built for exactly this purpose:
SELECT name from table where id IN (1,2,3,4,5,6.....)
Assuming you receive the list of IDs to look up as an input temp table, #InputIDTable:
SELECT name from table WHERE ID IN (SELECT id from #InputIDTable)

How do you complex join a number table with an actual table with many clauses dependent on the data from the number table?

I have a table of numbers (PLSQL collection containing some_table_line_ids passed in from a website).
Then I have some_table, which also has the columns config_data and config_state.
I want to pull in all lines whose table_id matches any of the table_ids in the number table.
I also want to pull in all lines that have the same config_data as each record pulled in from the first part.
So it's a parent/child relationship. This can be done in two for loops by selecting a line by an id in a cursor, then using another for loop to select each line that matches the parent's config data. In each loop I am performing data manipulation on each line.
I would like to combine both these into a single cursor having all table ids that I need.
What would that look like?
You just want to do a complicated join on different factors. Something like:
select st2.*
from numbers n join
some_table st
on st.table_id = n.table_id join
some_table st2
on st2.config_data = st.config_data
Quite possibly, you actually want:
select distinct st2.*
since you might otherwise have duplicates. Or, you might want:
select n.table_id, st.config_data, st2.*
So you know which of the original values was responsible for bringing in the row.
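The effect of the double join, and why duplicates appear, can be seen in a small sketch. It uses Python's built-in sqlite3 module instead of Oracle, a plain numbers table in place of the PL/SQL collection, and four made-up rows in some_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE numbers (table_id INTEGER);
    CREATE TABLE some_table (
        table_id INTEGER PRIMARY KEY,
        config_data TEXT,
        config_state TEXT
    );
    -- Hypothetical data: rows 1-3 share cfg-A, row 4 stands alone.
    INSERT INTO numbers VALUES (1), (2);
    INSERT INTO some_table VALUES
        (1, 'cfg-A', 'new'),
        (2, 'cfg-A', 'new'),
        (3, 'cfg-A', 'old'),
        (4, 'cfg-B', 'new');
""")

# st finds the requested lines; st2 finds every line sharing their config_data.
join_sql = """
    FROM numbers n
    JOIN some_table st  ON st.table_id = n.table_id
    JOIN some_table st2 ON st2.config_data = st.config_data
"""

all_rows = conn.execute("SELECT st2.table_id " + join_sql).fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT st2.table_id " + join_sql + " ORDER BY st2.table_id"
).fetchall()

print(len(all_rows))
print(distinct_rows)
```

Both requested IDs share the same config_data here, so every matching line is reached twice without DISTINCT.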
You describe the array as a PL/SQL collection. If you employ a SQL type instead you could include it in the FROM clause by using the TABLE function.
create type some_table_line_id_nt as table of number;
Something like:
select s.*
from some_table s
join table(some_table_line_ids) t
on s.id = t.column_value
(I haven't offered a complete solution as you haven't given enough details of table structure and data.)
I solved the issue using start with and connect by prior.

Query select a bulk of IDs from a table - SQL

I have a table which holds ~1M rows. My application has a list of ~100K IDs which belong to that table (the list being generated by the application layer).
Is there a common method for querying all of these IDs? ~100K SELECT queries? Or a temporary table into which I insert the ~100K IDs, followed by a SELECT that joins it to the required table?
Thanks,
Doori Bar
You could do it in one query, something like
SELECT * FROM large_table WHERE id IN (...)
Insert a comma-separated list of IDs where I put the ...
Unfortunately, there is no easy way that I know of to parametrize this, so you need to be extra-super careful to avoid SQL injection vulnerabilities.
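One possible way to keep the IN list parameterized is to generate the placeholders instead of the values, and to send the IDs in chunks. The sketch below does this with Python's built-in sqlite3 module. The payload column, the function name fetch_by_ids, and the chunk size of 500 are assumptions; the chunk size is chosen to stay under the limit on bound parameters per statement that many drivers and engines impose (older SQLite builds default to 999).

```python
import sqlite3

def fetch_by_ids(conn, ids, chunk_size=500):
    """Fetch rows for the given IDs using bound parameters, chunk by chunk."""
    ids = list(ids)
    rows = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Only "?" markers are interpolated into the SQL text, never the values.
        placeholders = ", ".join("?" for _ in chunk)
        sql = f"SELECT id, payload FROM large_table WHERE id IN ({placeholders})"
        rows.extend(conn.execute(sql, chunk).fetchall())
    return rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large_table (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO large_table (id, payload) VALUES (?, ?)",
    ((i, f"row-{i}") for i in range(1, 5001)),
)

wanted = list(range(1, 5001, 4))  # 1,250 hypothetical IDs from the application
result = fetch_by_ids(conn, wanted)
print(len(result))
```

Because the values travel as parameters, the ID list never becomes part of the SQL text, which removes the injection risk mentioned above.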
A temporary table which holds the 100k IDs seems like a good solution. Don't insert them one by one, though; the INSERT ... VALUES syntax in MySQL accepts multiple rows.
By the way, where do you get your 100k IDs, if not from the database? If they come from a preceding query, I'd suggest having it fill the temporary table.
Edit: For a more portable way of inserting multiple rows:
INSERT INTO mytable (col1, col2) SELECT 'foo', 0 UNION SELECT 'bar', 1
Do those id's actually reference the table with 1M rows?
If so, you could use SELECT ids FROM <1M table>
where ids is the ID column and "1M table" is the name of the table which holds the 1M rows.
But I don't think I really understand your question...