I have one table with 3 columns: a primary id, a uuid, and a date (last login). When users log in, I run a query against the database to check whether a user with that uuid exists. This table should work very fast for ~5 million users. How can I make queries on a table like that faster? Would it help if I added another column, e.g. country, and used it as an index?
If you want to check a particular uuid, then you want an index on that column:
create index idx_table_uuid on table(uuid);
This should be fast enough for your purposes.
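As a minimal sketch, assuming the table is called users with columns id, uuid and last_login (names are illustrative):
create index idx_users_uuid on users(uuid);
-- the login check then becomes an index lookup instead of a full scan:
select id, last_login
from users
where uuid = ?;
If every uuid is guaranteed unique, declare the index unique instead. An extra column such as country would only help queries that actually filter on it; it does nothing for a lookup by uuid.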
I have a transaction table and an inventory table that I would like to JOIN together. The tables need to JOIN on three key columns.
My question is: should I create a unique key (a concatenation of the three fields) and create an INDEX on that key, or should I just create a non-clustered INDEX on all three fields?
I'm currently using SQL Server 2014
I'm guessing the Transaction table is the bigger one and Inventory the smaller. A lot depends on what proportion of the data you expect the join to return: if it's most of it, a table scan will probably occur, so an index won't help much. If you're going to fetch a small subset of the data, then create an index on the 3 columns on both tables and create a foreign key from Transaction to Inventory on those 3 columns (SQL Server needs an index on the referenced columns as well as the FK), as sketched below.
Pick the most selective column as the first in your index, as this will encourage SQL Server's optimizer to use it.
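A minimal sketch of that setup, assuming the three shared key columns are item_id, loc_id and batch_id (all names here are illustrative; if Inventory already has its primary key on those columns, the unique index is redundant):
create unique index ix_inventory_keys
    on dbo.Inventory (item_id, loc_id, batch_id); -- referenced side must be unique
create index ix_transaction_keys
    on dbo.[Transaction] (item_id, loc_id, batch_id);
alter table dbo.[Transaction]
    add constraint fk_transaction_inventory
    foreign key (item_id, loc_id, batch_id)
    references dbo.Inventory (item_id, loc_id, batch_id);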
I have a table called users whose structure looks like this:
user_id
google_id
google_name
google_email
When I try to insert a record into this table I need to check both user_id and google_id for duplicate data. Taken one by one, either column may repeat: a record can contain a value for user_id or google_id that already exists in the table, but not the same values in both. I tried many things and still could not make this work. Can someone guide me? I am using phpMyAdmin to manage the DB.
Create a unique index on the two columns:
create unique index users_ids on users(user_id, google_id);
If someone tries to insert another row with the same user_id and google_id, the unique index will cause an error.
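To see the behaviour, a quick sketch (sample values made up; assuming the remaining columns are nullable or defaulted):
insert into users (user_id, google_id) values (1, 100); -- ok
insert into users (user_id, google_id) values (1, 200); -- ok: user_id repeats, the pair is new
insert into users (user_id, google_id) values (2, 100); -- ok: google_id repeats, the pair is new
insert into users (user_id, google_id) values (1, 100); -- error: duplicate (user_id, google_id) pair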
I have a huge existing Order Management Application.
Now, in the main ORDER table, I am adding a new column: IS_HISTORICAL. If its value is TRUE, the order is historical and should not show up in the application.
Now, I have to modify many SQL queries in my existing application so that they select only those orders whose IS_HISTORICAL is 'FALSE', i.e. add the following to the WHERE clause:
AND IS_HISTORICAL='FALSE'
Question: Is there an easier way, so that I do not have to modify so many application queries (to hide away historical orders)?
Essentially, all orders marked as IS_HISTORICAL='TRUE' should become invisible/unavailable for reads and updates!
Note: Right now the table sizes are not very large, but ultimately I intend to partition the table by IS_HISTORICAL true/false.
If you're only going to use the historical data for analysis then I prefer Florin's solution, as the amount of data you need to look at for each query remains smaller. It makes the analysis queries more difficult, as you need to UNION ALL, but everything else will run quicker (though it may not be noticeable).
If some applications/users require access to the historical data, the better solution would be to rename your table and create a view on top of it with the query that you need.
The problem with re-writing all your queries is that you're going to forget one or get one incorrect, either now or in the future. A view removes that problem for you as the query is static, every time you query the view the additional conditions you require are automatically added.
Something like:
rename orders to order_history;
create or replace view orders as
select *
from order_history
where is_historical = 'FALSE';
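With the view in place, the application's existing queries keep working untouched, e.g. (column names illustrative):
-- still written against "orders", now resolved through the view,
-- so rows with is_historical = 'TRUE' never appear
select order_id, order_date
from orders
where customer_id = 42;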
Two further points.
I wouldn't bother with TRUE / FALSE; if the table gets large, it's a lot of additional data to scan. Create your column as a VARCHAR2(1) and use T / F or Y / N: they are just as immediately obvious but smaller. Alternatively, use a NUMBER(1,0) and 1 / 0.
Don't forget to put a constraint on your table so that the IS_HISTORICAL column can only have the values you've chosen.
If you're only ever going to have the two values then you may want to consider a CHECK CONSTRAINT:
alter table order_history
add constraint chk_order_history_historical
check ( is_historical in ('T','F') );
Otherwise (and maybe you should do this anyway), use a FOREIGN KEY constraint. Define an extra table, ORDER_HISTORY_TYPES:
create table order_history_types (
id varchar2(1)
, description varchar2(4000)
, constraint pk_order_history_types primary key (id)
);
Fill it with your values and then add the foreign key:
alter table order_history
add constraint fk_order_history_historical
foreign key (is_historical)
references order_history_types (id);
You could look into using Virtual Private Database/row-level security. This can be used to automatically add the is_historical = 'FALSE' predicate when certain conditions are met (e.g. you're connected as the application user).
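A rough sketch of such a policy (schema, object and function names are all illustrative; note that VPD is an Enterprise Edition feature):
create or replace function hide_historical (
    p_schema varchar2,
    p_object varchar2
) return varchar2
as
begin
    -- predicate appended to every matching statement
    return 'is_historical = ''FALSE''';
end;
/
begin
    dbms_rls.add_policy(
        object_schema   => 'APP',
        object_name     => 'ORDERS',
        policy_name     => 'ORDERS_HIDE_HISTORICAL',
        function_schema => 'APP',
        policy_function => 'HIDE_HISTORICAL',
        statement_types => 'SELECT,UPDATE,DELETE'
    );
end;
/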
If users only need the non-historical records, an option is to create an ORDER_HIST table and move the historical records there (delete and insert).
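A sketch of that move, assuming ORDER_HIST has the same structure as ORDERS:
insert into order_hist
select * from orders
where is_historical = 'TRUE';
delete from orders
where is_historical = 'TRUE';
commit;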
If some users/applications need both types of record, then the partition approach is best.
In the database there are many identical schemas, cmp01..cmpa0.
each schema has a users table
each schema's users table's primary key has its own unique range
for example, in cmp01.users the usr_id is between 0x01000000 and 0x01ffffffff.
Is there any way I could define a VIEW global.users that is a union of each of the cmp*.users tables in such a way that, if querying by usr_id, the optimizer would head for the correct schema?
I was thinking something like:
create view global.users as
select * from cmp01.users where usr_id between 0x01000000 and 0x01ffffffff
union all
select * from cmp02.users where usr_id between 0x02000000 and 0x02ffffffff
....
Would this work? No: EXPLAIN ANALYZE shows every schema being scanned.
Is there an approach that might give good hints to the optimizer?
Why not create a table in a public schema that has all users in it, possibly with an extra column to store the source schema? Since the ids are globally unique, you could keep the id column unique:
create table all_users (
source_schema varchar(32),
usr_id int primary key,
-- other columns as per existing table(s)
);
Populate the table by inserting all rows:
insert into all_users
select 'cmp01', * from cmp01.users union
select 'cmp02', * from cmp02.users union ...; -- etc
Use triggers to keep the table up to date.
It's not that hard to set up, and it will perform very well.
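A sketch of such a trigger for one schema (repeat per cmpNN schema; the column list and names are illustrative):
create or replace function cmp01.sync_all_users() returns trigger as $$
begin
    if tg_op = 'INSERT' then
        insert into all_users (source_schema, usr_id) values ('cmp01', new.usr_id);
    elsif tg_op = 'UPDATE' then
        update all_users set usr_id = new.usr_id where usr_id = old.usr_id;
    elsif tg_op = 'DELETE' then
        delete from all_users where usr_id = old.usr_id;
    end if;
    return null; -- return value is ignored for AFTER row triggers
end;
$$ language plpgsql;
create trigger users_sync
after insert or update or delete on cmp01.users
for each row execute procedure cmp01.sync_all_users();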
What about creating a partitioned table? The master table would be created as global.users and it would be partitioned by the schema name.
That way you'd get the small user tables in each schema (including fast retrievals), provided you can create queries that PostgreSQL can optimize, i.e. queries that include the schema name in the where condition; a sketch follows below. You could also create a view in each schema that would hide the needed schema name to query the partitioned tables. I don't think it would work by specifying only the user_id. I fear that PostgreSQL's partitioning features are not smart enough for that.
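A sketch of that master table on PostgreSQL 10+, where list partitioning is declarative (on older versions the same shape is built with inheritance and CHECK constraints); names are illustrative:
create table global.users (
    source_schema text not null,
    usr_id        bigint not null,
    -- other columns as per the existing tables
    primary key (source_schema, usr_id) -- the partition key must be part of the PK
) partition by list (source_schema);
create table users_cmp01 partition of global.users for values in ('cmp01');
create table users_cmp02 partition of global.users for values in ('cmp02');
-- ... one partition per schema
-- pruning only happens when the partition key appears in the predicate:
select * from global.users where source_schema = 'cmp01' and usr_id = 16777217;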
Or use just one single table, and create views in each schema with an INSTEAD OF trigger, limiting the result to that schema's users.
Try something like:
create view global.users as
select *
from (select 'cmp01' sel_schema, 0x01000000 usr_id_start, 0x01ffffffff usr_id_end
union all
select 'cmp02' sel_schema, 0x02000000 usr_id_start, 0x02ffffffff usr_id_end) s
join (select u1.*, 'cmp01' schema from cmp01.users u1
union all
select u2.*, 'cmp02' schema from cmp02.users u2) u
on s.sel_schema = u.schema
and include a condition like specified_usr_id between usr_id_start and usr_id_end when querying the view by a specified user ID.
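For example, to look up a single user through that view (the literal stands in for the usr_id being searched):
select *
from global.users
where usr_id = 16777217
and 16777217 between usr_id_start and usr_id_end;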
I am looking to create a temporary table which is used as an intermediate table while compiling a report.
For a bit of background, I am porting a VB6 app to .NET.
To create the table I can use...
SELECT TOP 0 * INTO #temp_copy FROM temp;
This creates an empty copy of temp, but it doesn't create a primary key.
Is there a way to create a temp table plus the constraints?
Should I create the constraints afterwards?
Or am I better off just creating the table using CREATE TABLE? I didn't want to do this because there are 45 columns in the table and it would fill the procedure with a lot of unnecessary cruft.
The table is required because a lot of people may be generating reports at the same time, so I can't use a single intermediary table.
Do you actually need a primary key? If you are filtering and selecting only the data needed by the report, won't you have to visit every row in the temp table anyway?
By design, SELECT INTO does not carry over constraints (PK, FK, Unique), Defaults, Checks, etc. This is because a SELECT INTO can actually pull from numerous tables at once (via joins in the FROM clause). Since SELECT INTO creates a new table from the table(s) you specify, SQL really has no way of determining which constraints you want to keep, and which ones you don't want to keep.
You could write a procedure/script to create the constraint automatically, but it's probably too much effort for minimal gain.
You'd have to do one or the other:
add the PK/indexes afterwards
explicitly declare the temp table with constraints.
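A sketch of both routes, assuming the table has an id key column (names are illustrative):
-- route 1: create the empty copy (see the SELECT INTO below), then add the key;
-- leaving the constraint unnamed avoids constraint-name collisions in tempdb
-- when several sessions build the same temp table at once
alter table #temp_copy add primary key (id);
-- route 2: declare the table explicitly with its constraints in place
create table #temp_copy (
    id int not null primary key
    -- ...the remaining 44 columns
);
The unnamed constraint matters here because many people may generate reports concurrently, each creating its own copy of the temp table.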
I'd also do this rather than TOP 0:
SELECT * INTO #temp_copy FROM temp WHERE 1 = 0;