MS Access last() results are sometimes wrong - sql

I have an MS Access query that uses last() but sometimes it doesn't work as expected--which I know is what's expected lol. But I need to find a solution, either in Access or by converting the below query to MySQL. Any suggestions?
SELECT maindata.TrendShort, Last(maindata.Resistance) AS LastOfResistance, Last(maindata.Support) AS LastOfSupport, Count(maindata.ID) AS Days, Max(maindata.Datestamp) AS Datestamp, maindata.ProductID
FROM market_opinion AS maindata
WHERE (((Exists (select * from market_opinion action_count where maindata.ProductID = action_count.ProductID and maindata.Datestamp < action_count.Datestamp and maindata.TrendShort<> action_count.TrendShort))=False))
GROUP BY maindata.TrendShort, maindata.ProductID
ORDER BY Count(maindata.ID) DESC;
Only the LastOfResistence and LastOfSupport are occasionally wrong, the other fields are always correct.
CREATE TABLE `market_opinion` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`ProductID` int(11) DEFAULT NULL,
`Trend` varchar(11) DEFAULT NULL,
`TrendShort` varchar(7) DEFAULT NULL,
`Resistance` decimal(9,2) unsigned DEFAULT NULL,
`Support` decimal(9,2) unsigned DEFAULT NULL,
`Username` varchar(12) DEFAULT NULL,
`Datestamp` date DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `ProductID` (`ProductID`),
KEY `Datestamp` (`Datestamp`),
KEY `TrendShort` (`TrendShort`)
) ENGINE=InnoDB AUTO_INCREMENT=9536 DEFAULT CHARSET=utf8;

Without some feel for the data, this becomes something of a guess, but what I'm thinking is that the Last(Resistance) and the Last(Support) aren't necessarily pulling from the same record as Max(DateStamp). You might try breaking your query out into a 2 part query, such as:
Select maindata.TrendShort, Resistance, Support, COUNT(ID) as Days, ProductId
FROM market_opinion maindata
INNER JOIN (SELECT mo.TrendShort, MAX(mo.DateStamp) AS MaxDate
FROM market_opinion mo
WHERE (((EXISTS(SELECT ...))=FALSE))
GROUP BY mo.TrendShort) inner
WHERE maindata.TrendShort=inner.TrendSort AND maindata.DateStamp = inner.MaxDate
ORDER BY Days DESC;
I've left out the bulk of your query where the ellipsis (...) is, I wouldn't expect any changes there. You might consider looking at http://www.access-programmers.co.uk/forums/showthread.php?t=42291 for a discussion of first/last vs min/max. Let me know if this gets you any closer. If not, maybe some sample data where it's not working out. It might give some insight into what's going on.

First or Last just returns some (arbitrary) record that is not the last or first respectively.
In most cases, simply use Min for the first and Max for the last.

Related

Get all information about a person from when he or she was XX years old

I store a person's birthday as DATE in the database. If I go to the page /person/10/age/25, I want to get all data for this person that are from when he or she was 25 years old, based on his or hers birthday.
Currently, I'm using FLOOR(DATEDIFF(CURRENT_DATE, STR_TO_DATE(date_birth, '%Y-%m-%d')) / 365.25) to get the age of the person based on DATE. But I want to some how reverse this formula (correct word?) so it is getting all the information about the person, from when he or she was 25 years old.
SQL query:
SELECT *
FROM images AS i
JOIN people AS p
ON i.id_person = p.id
WHERE p.id = '10'
# new line
AND i.date_taken = DATE_ADD(p.date_birth, INTERVAL 25 YEAR)
# old line
# AND p.date_birth = FLOOR(DATEDIFF(CURRENT_DATE, STR_TO_DATE(date_birth, '%Y-%m-%d')) / 365.25)
Here's how the database looks like:
CREATE TABLE IF NOT EXISTS `images` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_person` int(11) NOT NULL,
`date_taken` date DEFAULT NULL,
UNIQUE KEY `id` (`id`)
);
CREATE TABLE IF NOT EXISTS `people` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data_name` tinytext NOT NULL,
`date_birth` date NOT NULL,
UNIQUE KEY `id` (`id`)
)
http://sqlfiddle.com/#!9/6d1cfc/2
How can I accomplish this?
First up: don't do date manipulation this way (especially using 365.25 as a year multiplier) Instead, take a look at the DateAdd function.
(Ugh, had a long post typed out, and only after posting noticed that you're in MariaDB instead of MSSQL)
Anyway, you're looking for something like:
DATEADD(dateOfBirthValue, INTERVAL 25 YEAR)
... this will add 25 years to a specific date value. Hope that helps - I can't really post a code snippet because I'm not familiar with the ins-and-outs of MariaDB (but I do know that you're going to want to use DateADD instead of manually calculating out days.)

How to make the Primary Key have X digits in PostgreSQL?

I am fairly new to SQL but have been working hard to learn. I am currently stuck on an issue with setting a primary key to have 8 digits no matter what.
I tried using INT(8) but that didn't work. Also AUTO_INCREMENT doesn't work in PostgreSQL but I saw there were a couple of data types that auto increment but I still have the issue of the keys not being long enough.
Basically I want to have numbers represent User IDs, starting at 10000000 and moving up. 00000001 and up would work too, it doesn't matter to me.
I saw an answer that was close to this, but it didn't apply to PostgreSQL unfortunately.
Hopefully my question makes sense, if not I'll try to clarify.
My code (which I am using from a website to try and make my own forum for a practice project) is:
CREATE Table users (
user_id INT(8) NOT NULL AUTO_INCREMENT,
user_name VARCHAR(30) NOT NULL,
user_pass VARCHAR(255) NOT NULL,
user_email VARCHAR(255) NOT NULL,
user_date DATETIME NOT NULL,
user_level INT(8) NOT NULL,
UNIQUE INDEX user_name_unique (user_name),
PRIMARY KEY (user_id)
) TYPE=INNODB;
It doesn't work in PostgreSQL (9.4 Windows x64 version). What do I do?
You are mixing two aspects:
the data type allowing certain values for your PK column
the format you chose for display
AUTO_INCREMENT is a non-standard concept of MySQL, SQL Server uses IDENTITY(1,1), etc.
Use a serial column in Postgres:
CREATE TABLE users (
user_id serial PRIMARY KEY
, ...
)
That's a pseudo-type implemented as integer data type with a column default drawing from an attached SEQUENCE. integer is easily big enough for your case (-2147483648 to +2147483647).
If you really need to enforce numbers with a maximum of 8 decimal digits, add a CHECK constraint:
CONSTRAINT id_max_8_digits CHECK (user_id BETWEEN 0 AND < 99999999)
To display the number in any fashion you desire - 0-padded to 8 digits, for your case, use to_char():
SELECT to_char(user_id, '00000000') AS user_id_8digit
FROM users;
That's very fast. Note that the output is text now, not integer.
SQL Fiddle.
A couple of other things are MySQL-specific in your code:
int(8): use int.
datetime: use timestamp.
TYPE=INNODB: just drop that.
You could make user_id a serial type column and set the seed of this sequence to 10000000.
Why?
int(8) in mysql doesn't actually only store 8 digits, it only displays 8 digits
Postgres supports check constraints. You could use something like this:
create table foo (
bar_id int primary key check ( 9999999 < bar_id and bar_id < 100000000 )
);
If this is for numbering important documents like invoices that shouldn't have gaps, then you shouldn't be using sequences / auto_increment

MySQL query slow when selecting VARCHAR

I have this table:
CREATE TABLE `search_engine_rankings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`keyword_id` int(11) DEFAULT NULL,
`search_engine_id` int(11) DEFAULT NULL,
`total_results` int(11) DEFAULT NULL,
`rank` int(11) DEFAULT NULL,
`url` varchar(255) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`indexed_at` date DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_ranking` (`keyword_id`,`search_engine_id`,`rank`,`indexed_at`),
KEY `search_engine_rankings_search_engine_id_fk` (`search_engine_id`),
CONSTRAINT `search_engine_rankings_keyword_id_fk` FOREIGN KEY (`keyword_id`) REFERENCES `keywords` (`id`) ON DELETE CASCADE,
CONSTRAINT `search_engine_rankings_search_engine_id_fk` FOREIGN KEY (`search_engine_id`) REFERENCES `search_engines` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=244454637 DEFAULT CHARSET=utf8
It has about 250M rows in production.
When I do:
select id,
rank
from search_engine_rankings
where keyword_id = 19
and search_engine_id = 11
and indexed_at = "2010-12-03";
...it runs very quickly.
When I add the url column (VARCHAR):
select id,
rank,
url
from search_engine_rankings
where keyword_id = 19
and search_engine_id = 11
and indexed_at = "2010-12-03";
...it runs very slowly.
Any ideas?
The first query can be satisfied by the index alone -- no need to read the base table to obtain the values in the Select clause. The second statement requires reads of the base table because the URL column is not part of the index.
UNIQUE KEY `unique_ranking` (`keyword_id`,`search_engine_id`,`rank`,`indexed_at`),
The rows in tbe base table are not in the same physical order as the rows in the index, and so the read of the base table can involve considerable disk-thrashing.
You can think of it as a kind of proof of optimization -- on the first query the disk-thrashing is avoided because the engine is smart enough to consult the index for the values requested in the select clause; it will already have read that index into RAM for the where clause, so it takes advantage of that fact.
Additionally to Tim's answer. An index in Mysql can only be used left-to-right. Which means it can use columns of your index in your WHERE clause only up to the point you use them.
Currently, your UNIQUE index is keyword_id,search_engine_id,rank,indexed_at. This will be able to filter the columns keyword_id and search_engine_id, still needing to scan over the remaining rows to filter for indexed_at
But if you change it to: keyword_id,search_engine_id,indexed_at,rank (just the order). This will be able to filter the columns keyword_id,search_engine_id and indexed_at
I believe it will be able to fully use that index to read the appropriate part of your table.
I know it's an old post but I was experiencing the same situation and I didn't found an answer.
This really happens in MySQL, when you have varchar columns it takes a lot of time processing. My query took about 20 sec to process 1.7M rows and now is about 1.9 sec.
Ok first of all, create a view from this query:
CREATE VIEW view_one AS
select id,rank
from search_engine_rankings
where keyword_id = 19000
and search_engine_id = 11
and indexed_at = "2010-12-03";
Second, same query but with an inner join:
select v.*, s.url
from view_one AS v
inner join search_engine_rankings s ON s.id=v.id;
TLDR: I solved this by running optimize on the table.
I experienced the same just now. Even lookups on primary key and selecting just some few rows was slow. Testing a bit, I found it not to be limited to the varchar column, selecting an int also took considerable amounts of time.
A query roughly looking like this took around 3s:
select someint from mytable where id in (1234, 12345, 123456).
While a query roughly looking like this took <10ms:
select count(*) from mytable where id in (1234, 12345, 123456).
The approved answer here is to just make an index spanning someint also, and it will be fast, as mysql can fetch all information it needs from the index and won't have to touch the table. That probably works in some settings, but I think it's a silly workaround - something is clearly wrong, it should not take three seconds to fetch three rows from a table! Besides, most applications just does a "select * from mytable", and doing changes at the application side is not always trivial.
After optimize table, both queries takes <10ms.

MySQL query optimisation

I have a database table that stores imported information. For simplicity, its something like:
CREATE TABLE `data_import` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`amount` DECIMAL(12,2) NULL DEFAULT NULL,
`payee` VARCHAR(50) NULL DEFAULT NULL,
`posted` TINYINT(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
INDEX `payee` (`payee`)
)
I also have a table that stores import rules:
CREATE TABLE `import_rules` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`search` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `search` (`search`)
)
The idea is that for each imported transaction, the query needs to try find a single matching rule - this match is done on the data_import.payee and import_rules.seach fields. Because these are both varchar fields, I have indexed them in the hope of making the query faster.
This is what I have come up with so far, which seems to work fine. Albeit slower than I hoped.
SELECT i.id, i.payee, i.amount, i.posted r.id, r.search
FROM import_data id
LEFT JOIN import_rules ir on REPLACE(i.payee, ' ', '') = REPLACE(ir.search, ' ', '')
One thing that the above query does not cater for, is that if import_data.posted = 1, then I dont need to find a rule for that line - is it possible to stop the query joining on that particular row? Similarly, if the payee is null, then it shouldn't try join either.
Are there any other ways that I can optimise this? I realise that doing text joins is not ideal...not sure if there are any better methods.
I highly recommend doing anything you can to get rid of the REPLACEs in that JOIN. Using REPLACE on both sides of the join totally eliminate the ability to use an index on either table.
Assuming you can get rid of the REPLACEs (by cleansing the existing data and/or new data):
If you need to join on text
columns, use a single byte per
character charset if you application
allows for it (for a smaller/faster index).
Make the N in VARCHAR(N) as small
as you can as it will affect the side
of the index (or arguably, use index
prefixes).
I imagine you want to make the
search index on import_rules
UNIQUE -- then you're sure to only
going to get 1 row result returned per row of
import_data
You can throw an AND into your WHERE clause if you'd like to enforce your 'don't join in this case' rule.
LEFT JOIN import_rules ir ON id.payee=ir.search AND id.posted != 1
The use of REPLACE() on the join is probably breaking the indexing, as it has an index of the values in the field, not the amended values after REPLACE().
As for not joining, you are already using a LEFT JOIN, so non-matching joins will result in NULLs for the import_rules fields; you should be able to add WHERE clauses to force that.

SQL Query always uses filesort in order by clause

I am trying to optimize a sql query which is using order by clause. When I use EXPLAIN the query always displays "using filesort". I am applying this query for a group discussion forum where there are tags attached to posts by users.
Here are the 3 tables I am using: users, user_tag, tags
user_tag is the association mapping table for users and their tags.
CREATE TABLE `usertable` (
`user_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_name` varchar(20) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`user_name`),
KEY `user_id` (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `user_tag` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned NOT NULL,
`tag_id` int(11) unsigned NOT NULL,
`usage_count` int(11) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `tag_id` (`tag_id`),
KEY `usage_count` (`usage_count`),
KEY `user_id` (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
I update the usage_count on server side using programming. Here is the query that's giving me problem. The query is to find out the tag_id and usage_count for a particular username, sorted by usage count in descending order
select user_tag.tag_id, user_tag.usage_count
from user_tag inner join usertable on usertable.user_id = user_tag.user_id
where user_name="abc" order by usage_count DESC;
Here is the explain output:
mysql> explain select
user_tag.tag_id,
user_tag.usage_count from user_tag
inner join usertable on
user_tag.user_id = usertable.user_id
where user_name="abc" order by
user_tag.usage_count desc;
Explain output here
What should I be changing to lose that "Using filesort"
I'm rather rusty with this, but here goes.
The key used to fetch the rows is not the same as the one used in the ORDER BY:
http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html
As mentioned by OMG Ponies, an index on user_id, usage_count may resolve the filesort.
KEY `user_id_usage_count` (`user_id`,`usage_count`)
"Using filesort" is not necessarily bad; in many cases it doesn't actually matter.
Also, its name is somewhat confusing. The filesort() function does not necessarily use temporary files to perform the sort. For small data sets, the data are sorted in memory which is pretty fast.
Unless you think it's a specific problem (for example, after profiling your application on production-grade hardware in the lab, removing the ORDER BY solves a specific performance issue), or your data set is large, you should probably not worry about it.