Please tell me a little about Oracle indexes, because I don't know how to ask Google about this situation. Let's pretend I have a table:
CREATE TABLE T (
    Id      NUMBER PRIMARY KEY,
    FKey1   INT,
    FKey2   INT,
    Date1   DATE,
    Date2   DATE,
    Name    VARCHAR2(50),
    Surname VARCHAR2(50)
)
And I have a composite index on all these fields except Id. I have 2 queries which use:
All the columns except Name and Surname;
All the columns except Surname, with a LIKE search on Name.
Is this index efficient? And if not, how can I improve it? The queries are generated by an ORM, so all I can change is the indexes :(
The real index column sequence is:
Name
Surname
FKey1
FKey2
Date1
Date2
It depends on the sequence of the columns declared when the index is created: it is important that the columns that are more selective, or that are used in all the queries, are placed before the others. In your case:
FKey1, FKey2, Date1, Date2, Name, Surname // columns used by both queries first, then Name (searched with LIKE by one query)
works better than the current order, which leads with
Name, Surname // used by at most one query
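For example, one possible order following that advice (the index name is made up; this is a sketch, and it assumes the LIKE pattern on Name has no leading wildcard, e.g. 'Smi%', so the index can still be used for it):

CREATE INDEX t_search_ix ON T (FKey1, FKey2, Date1, Date2, Name, Surname);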
This is my first time dealing with indexes and I would like to understand a few things.
I have tables with the following schemas:
Table1: Customer details

id | name | createdOn | username | phone    | address
---|------|-----------|----------|----------|--------
1  | xyz  | some date | xyz12    | 12345678 | abc
The id in the above table is unique. The id is not defined as PK in the table though. Would id + createdOn be a good complex index?
Table2: Tracked customer actions

customer id | name | timestamp | action type | cart value | address
------------|------|-----------|-------------|------------|--------
1           | xyz  | some date | click       | .          | abc
The above table does not have any column with unique values, and there can be a lot of sparse data. The actions table above is a sample; the real one can have nearly 18 columns, with new data being added frequently. Is an index on all the columns a good idea?
The queries on these tables could be both simple and complex as below:
select * from customerDetails
OR
with target_customers as (
    select id as customer_id
    from customerDetails
    where createdOn > {some date}
)
select avg(a.cart_value)
from actions a
inner join target_customers b on a.customer_id = b.customer_id
where a.action_type = 'cart updated'
These are sample queries; I believe I will have even more complex queries in the future, using different aggregations and joins with other tables, to gain insights while doing analytics.
I want to understand the best columns for indexes on the above tables.
"The id is not defined as PK in the table though."
That's unusual. Why is that?
Would id + createdOn be a good complex index?
No, you'd reverse it: createdOn, id. An index can be used by queries that reference only its first column. This allows you to use the index to order by createdOn, and also for createdOn between X and Y.
But you probably wouldn't include id in there at all. Make id a primary key and it is indexed.
In general, if you want to cover all possibilities for two keys, make two indexes...
columnA, columnB
columnB
columnA, columnB can cover queries which only reference columnA, and can also order by columnA. It can also cover queries which reference both columnA and columnB. But it can't cover a query which only references columnB, so we need a single-column index for columnB.
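As a concrete sketch of that two-index strategy (table and index names are placeholders):

CREATE INDEX idx_t_a_b ON t (columnA, columnB);  -- covers: columnA alone, columnA + columnB, ORDER BY columnA
CREATE INDEX idx_t_b   ON t (columnB);           -- covers: columnB alone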
"Is an index on all the columns a good idea?"
Maybe, it depends on your queries, but probably not.
You want to index foreign keys, as a matter of course, because that will speed up all joins.
You probably want to index timestamps that you're going to search or order by.
Any flags you often query by, such as where action_type = 'cart updated', you may want to index. Or you may want to partition the table by the action type.
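A sketch of those indexes against the sample actions table (index names are made up; column names follow the question's sample, written with underscores as in the sample queries):

CREATE INDEX idx_actions_customer ON actions (customer_id);   -- speeds up joins to customerDetails
CREATE INDEX idx_actions_ts       ON actions ("timestamp");   -- range filters and ORDER BY on time
CREATE INDEX idx_actions_type     ON actions (action_type);   -- frequent equality filter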
"The actions table above is a sample; the real one can have nearly 18 columns, with new data being added frequently."
This may be a good use case for a single jsonb column to store all the miscellaneous attributes. It lets you use a single index for the jsonb column. However, jsonb is not a panacea, and you will have to choose what to put in jsonb and what to make real columns.
For example, a timestamp such as createdOn should probably be a column. So should any foreign keys, and status flags such as action_type.
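A minimal sketch of that layout, assuming PostgreSQL (the column, index, and attribute names here are made up):

-- Keep stable, frequently filtered attributes as real columns, and push
-- the long tail of ~18 miscellaneous fields into one jsonb column:
ALTER TABLE actions ADD COLUMN attrs jsonb;
-- One GIN index serves containment queries against any key in attrs:
CREATE INDEX idx_actions_attrs ON actions USING gin (attrs);
-- e.g. find actions whose attrs contain a given (hypothetical) key/value pair:
SELECT * FROM actions WHERE attrs @> '{"device": "mobile"}';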
I have a table structure as below. For fetching data from the table I have the search criteria mentioned below, and I am writing a single SQL query for the requirement (a sample query is below). I need to create an index for the table that covers all the search criteria. It would be helpful if somebody could advise me.
Table structure(columns):
applicationid varchar(15),
trans_tms timestamp,
SSN varchar,
firstname varchar,
lastname varchar,
DOB date,
Zipcode smallint,
adddetais json
The search criteria come from an API and fall under 4 categories. All 4 categories are mandatory; for a single applicant I will always receive values for all 4 categories.
Search criteria:
ssn & lastname (lastname has to go through a function, i.e. soundex(lastname) = soundex('inputvalue'))
ssn & DOB
ssn & zipcode
firstname & lastname & DOB
Query:
The sample query I am trying to write is:
Select *
from table
where (ssn = 'aaa' and soundex(lastname) = soundex('xxx'))
   or (ssn = 'aaa' and dob = xxx)
   or (ssn = 'aaa' and zipcode = 'xxx')
   or (firstname = 'xxx' and lastname = 'xxx' and dob = xxxx);
For performance I need to create an index for the table, possibly a composite one. Any suggestion would be helpful.
Some Approaches I would follow:
Yes, you are correct: a composite/multicolumn index will benefit AND conditions on two columns. However, for the given conditions the composite indexes would overlap on columns (several of them would lead with ssn).
Documentation : https://www.postgresql.org/docs/10/indexes-multicolumn.html
You can use a UNION instead of OR.
Reference : https://www.cybertec-postgresql.com/en/avoid-or-for-better-performance/
If multiple conditions can be combined, e.g. ssn = 'aaa' appears in several branches, then rewriting the WHERE clause with fewer ORs is preferable.
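Putting those together, a sketch of the UNION rewrite with one matching composite index per branch (the table name mytable and the index names are made up; in PostgreSQL, soundex() comes from the fuzzystrmatch extension, and the expression index makes that branch indexable):

CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
CREATE INDEX idx_t_ssn_sndx ON mytable (ssn, soundex(lastname));
CREATE INDEX idx_t_ssn_dob  ON mytable (ssn, dob);
CREATE INDEX idx_t_ssn_zip  ON mytable (ssn, zipcode);
CREATE INDEX idx_t_name_dob ON mytable (firstname, lastname, dob);

-- Each branch can now use its own index; UNION also removes duplicates,
-- matching the semantics of the original ORed query:
SELECT * FROM mytable WHERE ssn = 'aaa' AND soundex(lastname) = soundex('xxx')
UNION
SELECT * FROM mytable WHERE ssn = 'aaa' AND dob = xxx
UNION
SELECT * FROM mytable WHERE ssn = 'aaa' AND zipcode = 'xxx'
UNION
SELECT * FROM mytable WHERE firstname = 'xxx' AND lastname = 'xxx' AND dob = xxxx;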
I have a small SQLite 3 database accessed by AutoIt. It all works great, but now I need a more complex statement, and maybe I now regret that I referenced tables using only the ROWID instead of dedicated ID fields...
This is the configuration:
Table 1 Person
Name (string)
Initials (string)
Table 2 Projekte
Description (string)
Person (containing the ROWID of table Person)
Table 3 Planungen
ProjID (contains ROWID of table Projekte)
PlID (numeric, main selection identifier)
(plus some other fields that do not matter)
Initially, I only needed to read all data from table 3 Planungen filtered by a specific PlID. I did that successfully by using:
SELECT ROWID,* FROM Planungen WHERE PlID=[FilterValue1] ORDER BY ROWID;
Works great.
Now, I need to SELECT only a subset of these records: those where PlID=[FilterValue1] and where ProjID points to a Projekte entry that satisfies Projekte.Person=[FilterValue2]. So I do not even need table 1 (Person), just 2 and 3.
I thought I could do it that way (now it becomes obvious that I am an SQL idiot):
SELECT ROWID,* FROM Planungen p, Projekte pj WHERE pj.Person=[FilterValue2] and p.ProjID=pj.ROWID and p.PlID=[FilterValue1] ORDER BY ROWID;
That runs into an SQLite error telling me that there is no such column ROWID. Oops! Really? How can that be? Can't I use ROWID in the WHERE clause?? Well, it probably wouldn't do what I intend anyway.
Can someone please help me? Can this be done without changing the database structure and introducing ID fields?
It would be great if the output of the SELECT were identical to the first, working SELECT command, just with the additional "filtering" applied.
You really should add a proper INTEGER PRIMARY KEY column to your tables. (The implicit rowid might be changed by a VACUUM.)
Anyway, this query fails because the column name rowid is ambiguous. Replace it with pj.rowid (or whatever table you want to access).
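For example, a sketch of the corrected statement, assuming you want Planungen's rowid in the output (so it matches your first, working SELECT):

SELECT p.ROWID, p.*
FROM Planungen p
JOIN Projekte pj ON p.ProjID = pj.ROWID
WHERE pj.Person = [FilterValue2]
  AND p.PlID = [FilterValue1]
ORDER BY p.ROWID;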
I've used SQL for years but have never truly harnessed its potential.
For this example let's say I have two tables:
CREATE TABLE messages (
MessageID INTEGER NOT NULL PRIMARY KEY,
UserID INTEGER,
Timestamp INTEGER,
Msg TEXT);
CREATE TABLE users (
UserID INTEGER NOT NULL PRIMARY KEY,
UserName TEXT,
Age INTEGER,
Gender INTEGER,
WebURL TEXT);
As I understand it, PRIMARY KEY basically indexes the field so that it can be used for rapid lookups later on: querying on exact values of the primary key returns results extremely quickly, even in huge tables. (It also enforces that the field must be unique in each record.)
In my current workflow, I'd do something like
SELECT * FROM messages;
and then in code, for each message, do:
SELECT * FROM users WHERE UserID = results['UserID'];
This obviously sounds very inefficient and I know it can be done a lot better.
What I want to end up with is a result set that contains all of the fields from messages, except that instead of the UserID field, it contains all of the fields from the users table that match that given UserID.
Could someone please give me a quick primer on how this sort of thing can be accomplished?
If it matters, I'm using SQLite3 as an SQL engine, but I also would possibly want to do this on MySQL.
Thank you!
Not sure about the requested order, but you can adapt it.
Just JOIN the tables on UserID
SELECT MESSAGES.*,
USERS.USERNAME,
USERS.AGE,
USERS.GENDER,
USERS.WEBURL
FROM MESSAGES
JOIN USERS
ON USERS.USERID = MESSAGES.USERID
ORDER BY MESSAGES.USERID,
MESSAGES.TIMESTAMP
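One caveat, in case some messages might carry a UserID with no matching row in USERS: the inner JOIN above drops those messages, while a LEFT JOIN keeps them, with NULLs in the USERS columns. A sketch (both SQLite3 and MySQL support it):

SELECT MESSAGES.*,
       USERS.USERNAME
FROM MESSAGES
LEFT JOIN USERS
  ON USERS.USERID = MESSAGES.USERID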
Presently I'm learning (MS) SQL and trying out various aggregate function samples. My question: is there any scenario (a sample query that uses an aggregate function) where having a unique-constraint column on a table helps when using an aggregate function?
Please note: I'm not trying to find a solution to a problem, but trying to see if such a scenario exist in real world SQL programming.
One immediate theoretical scenario comes to mind: the unique constraint is backed by a unique index, so if you were aggregating just that field, scanning the index would be narrower than scanning the table. But that only holds if the query references no other fields, so that the index is covering; otherwise the plan would tip out of the NC (nonclustered) index.
The index added to enforce the unique constraint automatically has the potential to help a query, but the scenario might be a bit contrived.
Put the unique constraint on the field if you need the field to be unique. If you need indexes to help query performance, consider them separately, or add a unique index on that field and INCLUDE other fields to make it covering (less useful than a purpose-built index, but more useful than a unique index on the single field alone).
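A sketch of that last suggestion in T-SQL (table, column, and index names are made up):

-- Enforces uniqueness of OrderNumber and, via INCLUDE, covers queries
-- that also read CustomerId and OrderTotal:
CREATE UNIQUE NONCLUSTERED INDEX UX_Orders_OrderNumber
    ON dbo.Orders (OrderNumber)
    INCLUDE (CustomerId, OrderTotal);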
Let's take the following two tables: one has records of subject names and subject ids, and the other records each student's marks in particular subjects.
Table1(SubjectId int unique, subject_name varchar, MaxMarks int)
Table2(Id int, StudentId int, SubjectId int, Marks int)
So if I need to find the average of the marks obtained in the Science subject (SubjectId = 2) by all students who attempted it, I would fire the following query:
SELECT AVG(Table2.Marks), Table1.MaxMarks
FROM Table1
JOIN Table2 ON Table2.SubjectId = Table1.SubjectId
WHERE Table1.SubjectId = 2
GROUP BY Table1.MaxMarks;