SELECT statement to select from multiple tables referenced by ROWIDs

I have a small SQLite 3 database accessed by AutoIt. It all works great, but now I need a more complex statement, and maybe I now regret that I referenced tables using only the ROWID instead of dedicated ID fields...
This is the configuration:
Table 1 Person
Name (string)
Initials (string)
Table 2 Projekte
Description (string)
Person (containing the ROWID of table Person)
Table 3 Planungen
ProjID (contains ROWID of table Projekte)
PlID (numeric, main selection identifier)
(plus some other fields that do not matter)
Initially, I only needed to read all data from table 3 Planungen filtered by a specific PlID. I did that successfully by using:
SELECT ROWID,* FROM Planungen WHERE PlID=[FilterValue1] ORDER BY ROWID;
Works great.
Now I need to SELECT only a subset of these records: those where PlID=[FilterValue1] and where ProjID points to an entry in table 2 (Projekte) that satisfies Projekte.Person=[FilterValue2]. So I do not even need table 1 (Person), just tables 2 and 3.
I thought I could do it this way (now it becomes obvious that I am an SQL idiot):
SELECT ROWID,* FROM Planungen p, Projekte pj WHERE pj.Person=[FilterValue2] and p.ProjID=pj.ROWID and p.PlID=[FilterValue1] ORDER BY ROWID;
That runs into an SQLite error telling me that there is no such column ROWID. Oops! Really? How can that be? I can't use ROWID in the WHERE clause? Well, it probably won't do what I intend anyway.
Can someone please help me? Can this be done without changing the database structure and introducing ID fields?
It would be great if the output of the SELECT would be identical to the first, working SELECT command, just with the additional "filtering" applied.

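You really should add a proper INTEGER PRIMARY KEY column to your tables. (The implicit rowid might be changed by a VACUUM.)
Anyway, this query fails because the column name rowid is ambiguous: both joined tables have one. Qualify it with a table alias, e.g. p.rowid or pj.rowid, depending on which table's rowid you want. To get the same output as your first query, select p.rowid, p.* and order by p.rowid.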
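
A minimal sketch of the qualified query, using Python's built-in sqlite3 module in place of AutoIt. The table and column names come from the question; the sample rows, the filter values and the helper name filtered_planungen are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person    (Name TEXT, Initials TEXT);
    CREATE TABLE Projekte  (Description TEXT, Person INTEGER);
    CREATE TABLE Planungen (ProjID INTEGER, PlID INTEGER);

    -- Hypothetical sample data; rowids are assigned 1, 2, ... in insert order.
    INSERT INTO Person    VALUES ('Alice', 'A'), ('Bob', 'B');
    INSERT INTO Projekte  VALUES ('Bridge', 1), ('Tunnel', 2);
    INSERT INTO Planungen VALUES (1, 10), (2, 10), (1, 20), (1, 10);
""")

def filtered_planungen(conn, pl_id, person_rowid):
    """Return Planungen rows for one PlID whose project belongs to one person."""
    return conn.execute(
        """
        SELECT p.rowid, p.*
        FROM Planungen AS p
        JOIN Projekte AS pj ON p.ProjID = pj.rowid
        WHERE p.PlID = ? AND pj.Person = ?
        ORDER BY p.rowid
        """,
        (pl_id, person_rowid),
    ).fetchall()

rows = filtered_planungen(conn, 10, 1)
print(rows)
```

Because only p.rowid and p.* are selected, the result has the same columns as the original single-table query; Projekte is used for filtering only.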

Related

SQL or statement vs multiple select queries

I have a table with an id and a name.
I am given a list of ids and I need their names.
As far as I know, I have two options.
Create a for loop in my code which executes:
SELECT name from table where id=x
where x is always a number.
Or write a single query like this:
SELECT name from table where id=1 OR id=2 OR id=3
The list of ids and names is enormous, so I think you wouldn't want that.
The problem with the ids is that an id is not always a number; it can be a randomly generated id containing numbers and characters. So working with ranges is not a solution.
I'm asking this from a performance point of view.
What's a nice solution for this problem?

SQLite has limits on the size of a query, so if there is no known upper limit on the number of IDs, you cannot use a single query.
When you are reading multiple rows (note: IN (1, 2, 3) is easier than many ORs), you don't know to which ID a name belongs unless you also SELECT that, or sort the results by the ID.
There should be no noticeable difference in performance; SQLite is an embedded database without client/server communication overhead, and the query does not need to be parsed again if you use a prepared statement.
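
A rough sketch of both options with Python's sqlite3 module. The table name items and the sample ids are assumptions for illustration ("table" itself is a reserved word). The IN query also selects id so that each name can be matched to its id, and it builds one placeholder per id; SQLite caps the number of placeholders per statement (SQLITE_MAX_VARIABLE_NUMBER), so very long lists would have to be split into chunks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a1", "Anna"), ("b2", "Ben"), ("c3", "Cleo")],  # made-up rows
)

wanted = ["c3", "a1"]

# Option 1: one parameterized lookup per id. The SQL text never changes,
# so the sqlite3 module can reuse the prepared statement from its cache.
by_loop = {}
for item_id in wanted:
    row = conn.execute("SELECT name FROM items WHERE id = ?", (item_id,)).fetchone()
    if row is not None:
        by_loop[item_id] = row[0]

# Option 2: a single IN query with one "?" per id. Selecting id as well
# tells us which name belongs to which id.
placeholders = ", ".join("?" for _ in wanted)
by_in = dict(
    conn.execute(f"SELECT id, name FROM items WHERE id IN ({placeholders})", wanted)
)

print(by_loop, by_in)
```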

A "nice" solution is using the IN operator:
SELECT name from table where id in (1,2,3)

Also, the IN operator is syntactic sugar built for exactly this purpose:
SELECT name from table where id IN (1,2,3,4,5,6.....)

Assuming you receive the list of IDs to look up as an input temp table #InputIDTable:
SELECT name from table WHERE ID IN (SELECT id from #InputIDTable)
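
The #InputIDTable syntax is SQL Server's. A comparable sketch for SQLite, assuming a hypothetical items table and a temp table called input_ids, could look like this. Loading the ids into a temp table keeps the query text short no matter how many ids there are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("a1", "Anna"), ("b2", "Ben"), ("c3", "Cleo")],  # made-up rows
)

wanted = ["a1", "c3", "zz"]  # "zz" deliberately has no matching row

# The temp table lives only for this connection.
conn.execute("CREATE TEMP TABLE input_ids (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO input_ids VALUES (?)", [(i,) for i in wanted])

found = conn.execute(
    "SELECT id, name FROM items "
    "WHERE id IN (SELECT id FROM input_ids) "
    "ORDER BY id"
).fetchall()
print(found)
```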

How to construct an sqlite table that assign and returns IDs to any name?

I would like to have an sqlite table that maps names into unique IDs. I can create this table in the following way:
CREATE TABLE name_to_id (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)
With a select statement I can get the row containing a needed name and get from this row the corresponding ID.
The problem appears if I try to get ID for a name that is not yet in the table. The expected behavior in this case is that the new name will be added and its newly generated ID will be returned. I have two possible solutions/implementations of that.
The first solution is trivial:
We check if name is in the table.
If not we insert a row with the name.
We select the row with the name and read the needed ID from that row.
I do not like this solution because the following can happen: the first process checks whether the name is in the table and sees that it is not there; meanwhile another process adds the name to the table; then the first process tries to add the same name.
The second solution seems to be better:
For any name we use insert if not exist.
We select from the table the row containing the name and get its ID.
Is the second solution optimal, or are there better solutions?

The normal way to avoid duplicate entries in a table is to create a unique constraint. The database will then check for you whether the record is already there and fail the insert if so. That should be the best in terms of reliability and performance.
Next, the SQLite FAQ suggests using the function last_insert_rowid() to fetch the ID instead of running a second query. This is actually the very first question in the FAQ ;)

In pseudocode, the first solution looks like this:
cursor = db.execute("SELECT id FROM name_to_id WHERE name = ?", name)
if cursor.has_some_row:
    id = cursor["id"]
else:
    db.execute("INSERT INTO name_to_id(name) VALUES(?)", name)
    id = db.last_insert_rowid
and the second like this:
db.execute("INSERT OR IGNORE INTO name_to_id(name) VALUES(?)", name)
cursor = db.execute("SELECT id FROM name_to_id WHERE name = ?", name)
id = cursor["id"]
The first solution requires a transaction around both commands, but this would be a good idea for the second solution, too, to avoid the overhead of multiple implicit transactions.
The second solution requires a unique constraint on name, but this would be a good idea for the first solution, too, for correctness and to speed up the name lookups.
Both solutions use two SQL statements, and have similar speed.
(The second searches the row two times, but that data is cached.)
So there isn't anything obvious that makes one better than the other.
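
A runnable sketch of the second solution using Python's sqlite3 module. The function name get_or_create_id is hypothetical, and the UNIQUE constraint on name is added here because INSERT OR IGNORE depends on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE name_to_id ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL UNIQUE)"  # UNIQUE is what makes OR IGNORE skip duplicates
)

def get_or_create_id(conn, name):
    """Insert the name if it is missing, then return its id."""
    # "with conn" wraps both statements in one transaction:
    # commit on success, rollback if an exception is raised.
    with conn:
        conn.execute("INSERT OR IGNORE INTO name_to_id(name) VALUES (?)", (name,))
        row = conn.execute(
            "SELECT id FROM name_to_id WHERE name = ?", (name,)
        ).fetchone()
    return row[0]

first = get_or_create_id(conn, "alice")
second = get_or_create_id(conn, "bob")
again = get_or_create_id(conn, "alice")  # already present, so no new row
print(first, second, again)
```

The SELECT is kept rather than replaced by last_insert_rowid(), because an ignored insert does not create a row and therefore does not update that value.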

SQL query: have results into a table named the results name

I have a very large database I would like to split up into tables. I would like to make it so when I run a distinct, it will make a table for every distinct name. The name of the table will be the data in one of the fields.
EX:
A --------- Data 1
A --------- Data 2
B --------- Data 3
B --------- Data 4
would result in 2 tables, one named A and another named B. Then the entire row of data would be copied into that table.
select distinct [name] from [maintable]
-make table for each name
-select [name] from [maintable]
-copy into table name
-drop row from [maintable]
Any help would be great!

I would advise you against this.
One solution is to create indexes, so you can access the data quickly. If you have only a handful of names, though, this might not be particularly effective, because an index lookup for one name would still have to select a large share of the records.
Another solution is something called partitioning. The exact mechanism differs from database to database, but the underlying idea is the same. Different portions of the table (as defined by name in your case) would be stored in different places. When a query is looking only for values for a particular name, only that data gets read.
Generally, it is bad design to have multiple tables with exactly the same data columns. Here are some reasons:
Adding a column, changing a type, or adding an index has to be done once per table instead of just once.
It is very hard to enforce a primary key constraint on a column across the tables -- you lose the primary key.
Queries that touch more than one name become much more complicated.
Insertions and updates are more complex, because you have to first identify the right table. This often results in overuse of dynamic SQL for otherwise basic operations.
Although there may be some simplifications (security comes to mind), most databases have other mechanisms that are superior to splitting the data into separate tables.
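
As a small illustration of the index-based alternative, here is a sketch that keeps everything in one table and filters by name. It uses SQLite through Python's sqlite3 module as a stand-in for whatever database the question is about; the column name data, the index name and the helper rows_for are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maintable (name TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO maintable VALUES (?, ?)",
    [("A", "Data 1"), ("A", "Data 2"), ("B", "Data 3"), ("B", "Data 4")],
)

# One index on the single table, instead of one table per name.
conn.execute("CREATE INDEX idx_maintable_name ON maintable(name)")

def rows_for(conn, name):
    """Fetch the rows that would otherwise live in a per-name table."""
    return conn.execute(
        "SELECT name, data FROM maintable WHERE name = ? ORDER BY data",
        (name,),
    ).fetchall()

rows_a = rows_for(conn, "A")
rows_b = rows_for(conn, "B")
print(rows_a, rows_b)
```

The name is passed as an ordinary query parameter, so no dynamic SQL is needed to pick the "right table".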

What you want is:
CREATE TABLE new_table
AS (SELECT .... //the data that you want in this table);

query - select data by first inserted [duplicate]

Possible Duplicate:
select bottom rows in natural order
Imagine that I have this table:
persons
The columns of the table are NAME and ID
and I insert this:
insert into persons values ('name','id');
insert into persons values ('John','1');
insert into persons values ('Jack','3');
insert into persons values ('Alice','2');
How can I select this information ordered by insertion? My query result would look like:
NAME ID
name id
John 1
Jack 3
Alice 2
Is this possible without indexes (autoincrements)?

I'm pretty sure it's not. To my knowledge, SQL data is not stored sequentially with respect to insertion. The only idea I have is to store a timestamp along with each insertion and sort by that timestamp.

This is not possible without adding a column or table containing a timestamp. You could add a timestamp column, or create another table containing IDs and a timestamp and insert into that at the same time.

You cannot make any assumptions about how the DBMS will store and retrieve data unless you specify an ORDER BY clause. For example, PostgreSQL uses MVCC: if you update a row, a new physical copy of the row is created, possibly at the end of the table's data file. A plain SELECT makes PostgreSQL use a sequential scan, which means the most recently updated row may be returned last.

I have to agree with the other answers: without a specific field/column to do this, it's an unreliable approach. That said, I don't think I have ever actually had a table without an index before.
You will need something to index it by. You could go with many other approaches and methods; for example, you could use some form of concatenation/joining of strings and then split the query results later.
--EDIT--
For what reason do you not wish to use these methods (timestamp/autoincrement)?

Without storing some sort of order information during insert, the database does not automatically keep track of every record ever inserted and their order (this is probably a good thing ;) ). Autoincrement cannot be avoided: even timestamps can hold the same value for two rows.
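
A minimal sketch of the autoincrement approach the answers describe, using SQLite through Python's sqlite3 module. The extra column name seq is an assumption; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # records insertion order
    " name TEXT,"
    " id TEXT)"
)

for name, person_id in [("name", "id"), ("John", "1"), ("Jack", "3"), ("Alice", "2")]:
    conn.execute("INSERT INTO persons (name, id) VALUES (?, ?)", (name, person_id))

# Without ORDER BY the row order is unspecified; with it, it is guaranteed.
in_insert_order = conn.execute(
    "SELECT name, id FROM persons ORDER BY seq"
).fetchall()
print(in_insert_order)
```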

Filling the gaps in values of IDENTITY column

I have a table with an IDENTITY column
[Id] int IDENTITY(1, 1) NOT NULL
After some rows being added/removed I end up with gaps in Id values:
Id Name
---------
1 Tom
2 Bill
4 Kate
Is there an easy way to compress the values to
Id Name
---------
1 Tom
2 Bill
3 Kate
?

I would strongly recommend that you leave the identity values as they are.
If this ID column is used as a foreign key in another table, changing the values will get complicated very quickly.
If there is some business logic that requires them to be sequential, then add a new column ID_Display, which you can update using ROW_NUMBER() to keep the values pretty for the end user. I never let end users see and/or dictate how I create/populate/store the data, and if they are bothering you about the IDs, then show them some other data that looks like an ID but is not a FK or PK.
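
A sketch of the ROW_NUMBER() idea. As a variation on the answer, ID_Display is computed at query time here rather than stored in a column, and SQLite (via Python's sqlite3 module) stands in for SQL Server. This assumes an SQLite build with window-function support (3.25 or later); the table name people is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Tom"), (2, "Bill"), (4, "Kate")],  # Id 3 is the gap from the question
)

# The stored Id keeps its gap; ID_Display is gap-free and for display only.
display_rows = conn.execute(
    "SELECT ROW_NUMBER() OVER (ORDER BY Id) AS ID_Display, Id, Name "
    "FROM people ORDER BY Id"
).fetchall()
print(display_rows)
```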

I think it's pretty easy to create a 2nd table with the same schema, import all the data (except for the identity column of course; let the 2nd table start renumbering) from the first table, drop the first table and rename the 2nd to the original name.
How easy this really is may be in question if you have a ton of FK relationships with other tables to rebuild.

Well, as far as I know, the only way is to set the values manually with IDENTITY_INSERT turned on, but you should really avoid doing such a thing in the first place. Also, if you truncate the table, it will not have those gaps.

"I cannot control the part which requires ID columns to be in sequence."
This sounds like there is program logic which assumes there are no gaps--correct?
"I need this to keep two different databases in sync."
It's still not clear what you mean. If the actual values in the IDENTITY column are not meaningful (not used as foreign keys by other tables), you can just do this:
DELETE FROM db1.table
INSERT INTO db1.table (col1, col2, col3) /* leave out the IDENTITY column */
SELECT col1, col2, col3 FROM db2.table
